[Net 2000 Ltd. Home][DataBee Home][DataBee Manual][DataBee FAQ]

The DataBee Set Loader
Run Statistics Tab

The primary function of the Set Loader Run Statistics tab is to indicate the progress of the currently executing loader rules and also display aggregate statistics for the operational loader set. The current status of each running rule is presented in the top part of the page and statistics for the set as a whole are displayed towards the bottom.

The panel at the top of the Run Statistics tab lists the rules currently executing in the loader set. It is possible to execute up to eight rules simultaneously in independent processes called worker threads. However, since most loader rules are of the Rule Manager type, it is more typical to see all threads consumed by a single manager rule executing its many internal operations in parallel. For example, a single active Truncate Manager rule will use all available threads to perform in parallel the table truncations specified by its internal configuration.

Each worker thread is provided with a row in the display. Note that only information on actively executing rules will be present in this display. When a loader rule stops executing (whether through completion or by error) its information will be removed from the display. Historical statistics and status information on each rule, whether active or not, can be found on the Rule Statistics Tab.

Note that double-clicking any worker thread that is actively running a rule will launch a form which displays that rule's configuration.

There are six vertical columns in the loader rule display. These columns provide specific information on various components of each worker thread and the rule it is running.

What the columns in the Worker Thread Panel mean

Worker#
This column indicates the ID number of the worker running the rule. It is not significant other than as an identifier for diagnostics in the log file - all workers are identical in functionality.
Status
A text field which indicates if the worker is currently operational. DataBee will only run rules simultaneously if they are in the same rule block. It is possible to temporarily see workers listed as Idle if there are no rules remaining in the current rule block that are available to run.
Rule ID
A summary of the rule block, rule ID and the rule type. If the thread is running an operation on behalf of a Rule Manager type rule, the rule ID will have extra information appended to it to indicate the specific item ID within the manager rule. For example, a rule ID of 30-0005-12 means: rule block 30, rule number 0005 and sub item 12 within that manager rule. Double-clicking the rule ID field opens the rule so that the target on which the thread is operating can be identified.
Rows Loaded
This column indicates the number of rows loaded into a target table by a thread being used by a Load Manager rule. This field is unused by all other rule types and will be blank when not required.
Run Time (sec)
This is the total run time in seconds for the thread. Note that since a thread will be occupied by a single action from a Manager Rule (a table truncate, a foreign key disable, a table load etc.) this value reflects the time taken for that specific action rather than for the rule as a whole.
Rows/Sec
If the thread is currently being used by a Load Manager rule, this field indicates how many rows per second the rule is loading into that table. The value is derived from the other statistics via the formula: (Rows Loaded)/(Run Time in seconds). If the run time is less than 1 second then a value of 1 second is assumed. Since fractional seconds are not tracked, the calculated rows per second value can seem much lower than it really is for small amounts of data. This field is unused for threads which are not executing operations on behalf of a Load Manager rule.
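
As an illustration of how the Rule ID and Rows/Sec columns can be read, the following Python sketch splits a rule ID such as 30-0005-12 into its parts and applies the Rows/Sec formula with the one second minimum. The function names are hypothetical and are not part of DataBee. The sketch also assumes the ID consists only of hyphen-separated numeric parts, without the rule type text that the column also shows.

```python
from typing import NamedTuple, Optional


class RuleId(NamedTuple):
    rule_block: int
    rule_number: str          # kept as text to preserve leading zeros, e.g. "0005"
    item_id: Optional[int]    # only present for sub items of a Rule Manager rule


def parse_rule_id(text: str) -> RuleId:
    """Split 'block-rule' or 'block-rule-item' into its parts (hypothetical helper)."""
    parts = text.strip().split("-")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected rule ID format: {text!r}")
    item = int(parts[2]) if len(parts) == 3 else None
    return RuleId(int(parts[0]), parts[1], item)


def rows_per_second(rows_loaded: int, run_time_sec: int) -> float:
    """(Rows Loaded)/(Run Time), treating any run time under 1 second as 1 second."""
    return rows_loaded / max(run_time_sec, 1)


if __name__ == "__main__":
    print(parse_rule_id("30-0005-12"))
    # Sub-second load: the divisor is clamped to 1, so the rate cannot exceed
    # the row count even if the real load was much faster.
    print(rows_per_second(5000, 0))
```

The clamp to one second is why a small table loaded in a fraction of a second shows a rate equal to its row count rather than its true, higher rate.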

The Number of Rule Worker Threads Setting

The Number of Rule Worker Threads combo-box offers the option of increasing or decreasing the number of workers available for simultaneous operations. This value can be adjusted (up or down) while the set is running and the workers will enable or disable as required. Although the number of worker threads can be changed in the Set Loader application, this information cannot be saved back to the loader set (the Set Loader does not have that capability). The Set Designer application is used to permanently change the number of Rule Worker Threads and save it to the loader set.

The most effective number of workers to use depends on a number of factors: the speed and memory of the PC on which the Set Loader application is running, the speed of the network connection to the remote Oracle database, the speed of the Oracle database and the number of CPUs on the Oracle database platform. As a general rule, it is inefficient to set the number of workers too high since bottlenecks will occur that can cause the total execution time to be longer than if fewer workers were running. Since the correct setting depends on a number of variables, finding the most effective setting is something of a trial and error process. A typical setting would be to set the number of worker threads to be equal to the number of CPUs on the Oracle database server.

The Loader Set Statistics Panel

Various statistics for the currently running loader set are displayed at the bottom of the Run Statistics tab. The statistics displayed are for the loader set as a whole - if there are multiple Rule Controllers in the loader set, the values shown will be the aggregate values for all Rule Controllers.

Total run time
The total time the loader set has been operational.
Total tables
The total number of enabled tables eligible to be loaded.
Total tables loaded
The total number of enabled tables eligible to be loaded which have been loaded.
Total tables not loaded
The total number of enabled tables eligible to be loaded which have not been loaded.
Total loaded rows
The number of rows loaded into all tables. If threads loading tables on behalf of a Load Manager rule are currently executing, this display can be seen to update periodically.
Average loaded rows per second
The number of rows loaded per second by the loader set's Load Manager rule. If threads loading tables on behalf of a Load Manager rule are currently executing, this display can be seen to update periodically. Note that the elapsed time is only calculated to the nearest second, so for load operations that take small amounts of time this figure can seem lower than it really is.

Statistics for the Current Load Manager Rule

A loader set can be configured to load multiple schemas. This is done by configuring the loader set with more than one Rule Controller, each of which will contain its own Load Manager rule. If there are multiple Rule Controllers, then the statistics in the Loader Set Statistics Panel will be aggregate statistics for all Rule Controllers. Often it is desirable to be able to see the statistics for the currently operational Load Manager rule. The Statistics for the Current Load Manager Rule panel displays such summary statistics for the last operational Load Manager rule.

Extracted rows
The number of extracted rows. This information is obtained from the source schema temporary tables. The extracted row information in the source schema temporary tables is placed there by an extraction set which has been run in the Set Extractor application.
Percent complete
The number of loaded rows divided by the number of extracted rows expressed as a percentage.
Est. remaining time
The estimated amount of time remaining in the load. It is calculated from the elapsed time taken to load the rows loaded so far, which is then used to work out the time needed to load the remaining extracted rows.
Eligible tables loaded
The total number of tables marked as enabled in the Load Manager rule which have been loaded.
Eligible tables remaining to be loaded
The total number of tables marked as enabled in the Load Manager rule which have not yet been loaded.
Tables extracted, marked for load
The total number of tables which actually have had data extracted for them in the source schema which are also marked as enabled in the Load Manager rule.
Tables extracted, not marked for load
The total number of tables which actually have had data extracted for them in the source schema which are marked as disabled in the Load Manager rule. Such tables will never be populated with data even though data was extracted for them.
Tables marked for load, not extracted
The total number of tables which are marked as enabled in the Load Manager rule but which had no rows extracted in the source schema for them.
Tables extracted, unknown to set
The number of tables in the source schema which have had data extracted for them but which the Load Manager rule knows nothing about. Such tables will never be populated with data even though data was extracted for them.
Tables extracted, but with zero rows
The number of tables in the source schema on which extraction operations have been performed but which have had zero rows extracted for them. This usually just means that the rules and driver data in the extraction set did not require any rows from those tables to be present in the target schema.
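
The percentage, time estimate and table counts described above can be sketched in Python as follows. This is one plausible reading of the descriptions, not DataBee's actual implementation: the remaining time is assumed to be a simple linear extrapolation from the rows loaded so far, and the function names and inputs (a mapping of extracted row counts per table, plus the sets of tables the Load Manager rule has enabled or disabled) are hypothetical.

```python
from typing import Dict, Optional, Set


def percent_complete(loaded_rows: int, extracted_rows: int) -> float:
    """Loaded rows divided by extracted rows, as a percentage."""
    if extracted_rows <= 0:
        return 0.0  # assumption: nothing extracted is shown as 0%
    return 100.0 * loaded_rows / extracted_rows


def est_remaining_sec(loaded_rows: int, extracted_rows: int,
                      elapsed_sec: float) -> Optional[float]:
    """Assumed linear extrapolation from the load rate observed so far."""
    if loaded_rows <= 0:
        return None  # no rate available yet
    remaining_rows = max(extracted_rows - loaded_rows, 0)
    return elapsed_sec * remaining_rows / loaded_rows


def classify_tables(extracted: Dict[str, int], enabled: Set[str],
                    disabled: Set[str]) -> Dict[str, Set[str]]:
    """Group table names by extraction result and Load Manager rule setting.

    `extracted` maps each table on which extraction was performed to the
    number of rows extracted for it.
    """
    with_rows = {t for t, n in extracted.items() if n > 0}
    zero_rows = {t for t, n in extracted.items() if n == 0}
    known = enabled | disabled
    return {
        "extracted, marked for load": with_rows & enabled,
        "extracted, not marked for load": with_rows & disabled,
        "marked for load, not extracted": enabled - with_rows,
        "extracted, unknown to set": with_rows - known,
        "extracted, but with zero rows": zero_rows,
    }


if __name__ == "__main__":
    print(percent_complete(250, 1000))
    print(est_remaining_sec(250, 1000, 30.0))
    groups = classify_tables({"A": 10, "B": 5, "C": 0, "D": 7},
                             enabled={"A", "E"}, disabled={"B"})
    for label, tables in groups.items():
        print(label, sorted(tables))
```

Expressing the categories as set operations makes it easy to see why tables in the "not marked for load" and "unknown to set" groups are never populated: neither group intersects the set of enabled tables.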

What the buttons do

Show Error Summary
This button displays a summary of any errors which have occurred during the run of the loader set. Detailed information on specific errors is also available by pressing the Error button associated with each rule on the Set Loader Rule Statistics tab.

