
Workflow Resource Management

The workflow engine lets you limit the resources consumed by workflow tasks to prevent server overload. This is particularly useful in the following cases:

  • Assume there are two workflows on your server that, if run simultaneously, would consume more memory than available. You want to make sure the memory-intensive tasks in these workflows are not executed at the same time.

  • Assume you have many workflows with tasks that query a database, for example, Execute SQL, Read SQL Result. To prevent the overloading of the database server, you can limit the number of simultaneous database connections.

This article provides an overview of the available options for resource management.

Server resources

Defining server resources is a comprehensive method for controlling resource consumption automatically and making sure that ONE Runtime Server and other surrounding infrastructure, for example, database servers, are not overloaded.

The Workflow Component controls resource consumption by preventing tasks with a high demand for the same resource from running at the same time. This is the most complete and flexible way of managing workflow resources.

This feature does not measure or enforce actual resource consumption on your infrastructure. The definitions of server resources and of task resource requirements rely on the accuracy of your estimates of how much workflows on your server consume.

To set up server resource management:

  1. Define the resources available to the server. See Define server resources.

  2. Define resource requirements on tasks with high resource consumption. See Define task resource requirements.

  3. (Optionally) Define task priority.

Define server resources

You can define resources in Runtime Configuration. In the following example, the db-oracle resource is limited to 4 units and the memory resource is limited to 4096 units.

All resources are defined in "units", where the meaning of a unit is up to you. A unit can represent a database connection, a megabyte of RAM, an amount of storage space, and the like.

Changing server resource definitions requires restarting ONE Runtime Server.
Runtime configuration
<runtimeconfig>
    ...
    <resources>
        <resource id="db-oracle" units="4" name="DB Oracle (connections)"/>
        <resource id="memory" units="4096" name="Memory (MB)" />
    </resources>
    ...
</runtimeconfig>
Attribute   Required   Description
id          Yes        Unique code of the resource. Tasks refer to this ID when defining the resource request.
units       No         The number of units of the resource available in the system.
name        No         A human-readable name of the resource or description.

Once you define what a unit means for a resource, you must use that meaning consistently across the whole server. Since all tasks of all workflow configurations compete for the same resources, their requests must be expressed in the same units to be comparable.
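To review the configured limits in one place, the resource definitions can be read with a few lines of Python. This is a minimal sketch, not an Ataccama tool: the function name read_resources is hypothetical, the XML is the <resources> element from the example above without the elided parts, and a missing units attribute is reported as None because its default is not stated here.

```python
import xml.etree.ElementTree as ET

# The <resources> element from the runtime configuration example,
# copied without the elided ("...") parts so that it parses on its own.
RESOURCES_XML = """
<resources>
    <resource id="db-oracle" units="4" name="DB Oracle (connections)"/>
    <resource id="memory" units="4096" name="Memory (MB)" />
</resources>
"""


def read_resources(xml_text):
    """Return a dict of resource id -> units (None if units is not set)."""
    root = ET.fromstring(xml_text)
    resources = {}
    for element in root.findall("resource"):
        units = element.get("units")  # optional attribute
        resources[element.get("id")] = int(units) if units is not None else None
    return resources


print(read_resources(RESOURCES_XML))  # {'db-oracle': 4, 'memory': 4096}
```

A listing like this makes it easier to check that every task request uses the same unit meaning as the matching resource definition.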

When multiple tasks of either multiple workflows or the same workflow (parallel tasks) compete for resources, the following is taken into consideration:

  1. Task priority

  2. Task resource demand

Example configuration

Assume you have two workflows, both of which contain Run DQC tasks that launch complex plans. You have measured that Workflow 1 consumes 1.5 GB of memory and Workflow 2 consumes 1 GB of memory.

Assuming you have only 2 GB of memory allocated for the whole server, you need to make sure these two tasks do not run at the same time if the two workflows are running simultaneously. To prevent an OutOfMemory exception caused by simultaneous execution of the two workflows:

  1. Define a resource in the runtime configuration and limit it to 2048 units.

    <runtimeconfig>
        ...
        <resources>
            <resource id="memory" units="2048" name="Memory (MB)" />
        </resources>
        ...
    </runtimeconfig>
  2. Set resource requirements for the two Run DQC tasks as follows:

    Workflow     Task      Resource ID   Units
    Workflow 1   Run DQC   memory        1536
    Workflow 2   Run DQC   memory        1024

If the two workflows are executed simultaneously and both Run DQC tasks are queued to be executed at the same time, only one task is run at a time because the sum of 1536 and 1024 is greater than 2048.
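The arithmetic behind this example can be sketched in Python. This is only an illustrative model of the capacity check, not the workflow engine's code; the function name fits and the dictionary layout are assumptions made for the sketch.

```python
# Illustrative model only: not the actual resource manager implementation.
# Server capacity as defined in the runtime configuration of this example.
capacity = {"memory": 2048}

# Resource requirements of the two Run DQC tasks.
workflow_1_run_dqc = {"memory": 1536}
workflow_2_run_dqc = {"memory": 1024}


def fits(requests, capacity):
    """Return True if the combined requests stay within capacity for every resource."""
    totals = {}
    for request in requests:
        for resource_id, units in request.items():
            totals[resource_id] = totals.get(resource_id, 0) + units
    return all(units <= capacity.get(resource_id, 0)
               for resource_id, units in totals.items())


print(fits([workflow_1_run_dqc], capacity))                      # True
print(fits([workflow_2_run_dqc], capacity))                      # True
print(fits([workflow_1_run_dqc, workflow_2_run_dqc], capacity))  # False (2560 > 2048)
```

Each task fits on its own, but the two together do not, so one of them has to wait until the other releases its memory units.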

Define task resource requirements

Each task might request some number of units of each resource. If no resource requirements are defined on the task, the task is assumed to not need any resources (that is, it requires zero units of each resource). Such a task is always accepted by the resource manager and scheduled for processing (according to its priority).

You can define resource requests for the task on the Resource tab of the task properties. The following definition means that when multiple tasks are waiting to be run, this task is selected to run if:

  • All higher priority tasks of all workflows have been started.

  • The resource manager has enough resources for this task. In this case, it means that the following resources are available:

    • 1 unit of db-oracle.

    • 256 units of memory.

Resources - Task requirements

Define task priority

If two tasks cannot be executed at the same time because of limited resources, then the Priority setting in the global workflow properties determines which is run first.

Task priority is an integer (non-negative) value. The higher the number, the higher the task priority in the queue. The default value for all tasks is zero.

The workflow engine tries to maximize resource utilization, so if there are some tasks of lower priority that can be run (even though there are still some unassigned tasks of higher priority), these tasks are given resources instead.
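The interplay of priority and resource utilization can be sketched as a single selection pass. This is a simplified model under stated assumptions, not the engine's actual algorithm: the function select_tasks, the task names, and the tuple layout are all hypothetical.

```python
# Illustrative model only; names and data layout are assumptions.
def select_tasks(waiting, free):
    """Pick tasks to start, given the currently free resource units.

    waiting: list of (task_name, priority, request) tuples,
             where request maps resource id -> requested units
    free:    dict of resource id -> free units (not modified)
    Returns the names of the tasks that would be started.
    """
    free = dict(free)  # work on a copy
    started = []
    # Consider higher priority first; sorted() is stable, so tasks with
    # equal priority keep their original order.
    for name, priority, request in sorted(waiting, key=lambda task: -task[1]):
        if all(free.get(res, 0) >= units for res, units in request.items()):
            for res, units in request.items():
                free[res] = free.get(res, 0) - units
            started.append(name)
        # A task that does not fit is skipped, so lower-priority tasks
        # can still use the resources that are free right now.
    return started


waiting = [
    ("load_small", 0, {"memory": 256, "db-oracle": 1}),
    ("run_big_plan", 5, {"memory": 3072}),
    ("export", 1, {"memory": 512}),
]
free = {"memory": 1024, "db-oracle": 4}
print(select_tasks(waiting, free))  # ['export', 'load_small']
```

In this run, the highest-priority task needs more memory than is currently free, so the two lower-priority tasks are started instead of leaving the resources idle.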

Monitor server resources consumption

You can monitor how many resources the tasks consume in the ONE Runtime Server Admin, under Workflows > Resources.

The total resource consumption by currently running tasks is shown under Resources summary. The tasks that have been allocated resources and are running are listed under Currently allocated tasks. The tasks waiting to be run due to insufficient resources are listed under Tasks waiting for resources.

Server resources consumption

Limitations

  • The resource management mechanism does not prevent workflows from being started. The workflow is started, but its tasks might have to wait until the resource manager assigns them resources (in that case, the tasks remain in the IN_QUEUE state).

  • Asynchronously run tasks (such as OS commands run via Run Windows Command with waitFor=false) release their resources as soon as the task itself finishes, which is almost immediately, and not when the asynchronously started process quits.

  • Expressions and semi-expressions are not supported in any of the resource-related configurations described above. Resource definitions must be static values.

Multiplicity

The multiplicity setting defines the maximum number of simultaneously running instances of a workflow. If set to zero (or if the setting is not defined), the number of instances of a given workflow is not limited. The default value is zero.

Multiplicity is set on the General tab of global workflow properties. When launching or resuming the workflow instance, the engine checks the multiplicity constraint and:

  • If the number of currently running instances of this workflow is lower than the workflow multiplicity setting, it is launched normally.

  • If the number of currently running instances of this workflow is equal to or greater than the workflow multiplicity setting, the instance moves into the deferred workflow queue.

This is particularly useful when you need to:

  • Limit the number of running child workflow instances of a workflow containing the Iterate task with iteration type set to PARALLEL. SQL Row Iterator can result in a large number of rows returned (for example, 100), which results in 100 workflow instances being launched simultaneously.

  • Limit the number of running instances of a generic workflow used for launching a plan, with an input variable that determines the file to be processed by the plan. If several parties execute this workflow simultaneously, it can overload the server.

Deferred workflow queue

This queue contains all workflow instances that were triggered for execution but could not be started yet because of the multiplicity constraint of a given workflow. These instances are not active until some of the blocking instances quit.

Once it is possible to run the instance, it is automatically picked up from the deferred queue and run. Deferred instances are run in the same order in which they were put into the deferred queue (from the oldest to the newest). The deferred queue is preserved while the server is not running: if ONE Runtime Server is stopped before a deferred instance is run, the instance appears as deferred on the next server startup. After the server starts, the engine checks and starts all the deferred instances that can be run.

All information needed for resuming workflows is copied to the resume object in the deferred queue, so even if the related state does not exist anymore (for example, it has already been deleted because of the history depth limit), the resume instance from the deferred queue can still be run.

If the underlying workflow definition changes after the instance has been put into the deferred queue but before it is run, the deferred instance is considered invalid and is removed.
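The multiplicity check and the oldest-first behavior of the deferred queue can be modeled with a small Python class. This is a toy sketch, not the engine's implementation: the class name WorkflowLauncher, its methods, and the "RUNNING" and "DEFERRED" return values are assumptions, and it deliberately leaves out persistence across restarts and the invalidation of changed workflow definitions.

```python
from collections import deque


class WorkflowLauncher:
    """Toy model of one workflow's multiplicity limit and deferred queue."""

    def __init__(self, multiplicity=0):
        self.multiplicity = multiplicity  # 0 means the number of instances is not limited
        self.running = []
        self.deferred = deque()

    def _can_run(self):
        return self.multiplicity == 0 or len(self.running) < self.multiplicity

    def launch(self, instance_id):
        """Start the instance, or defer it if the multiplicity limit is reached."""
        if self._can_run():
            self.running.append(instance_id)
            return "RUNNING"
        self.deferred.append(instance_id)
        return "DEFERRED"

    def finish(self, instance_id):
        """Mark an instance as finished and start deferred ones, oldest first."""
        self.running.remove(instance_id)
        while self.deferred and self._can_run():
            self.running.append(self.deferred.popleft())


launcher = WorkflowLauncher(multiplicity=2)
states = [launcher.launch(i) for i in ("a", "b", "c", "d")]
print(states)                    # ['RUNNING', 'RUNNING', 'DEFERRED', 'DEFERRED']
launcher.finish("a")
print(launcher.running)          # ['b', 'c']
print(list(launcher.deferred))   # ['d']
```

With a multiplicity of 2, the third and fourth instances wait in the deferred queue, and the oldest deferred instance starts as soon as one running instance finishes.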

Manage deferred workflows

The state of the deferred queue can be monitored via the Deferred Workflows entry listed in the ONE Runtime Server Admin, under the Workflows section.

This screen displays all deferred workflow instances in the order in which they were put into the deferred queue. You can delete deferred instances from the queue.

Deferred workflows

Number of tasks running in parallel

A simple (but radical) means of managing resource consumption by workflows is setting the maximum number of tasks that can run in parallel per workflow. This setting is controlled by the EWFThreadSlots parameter. The default value is 3. The allowed values are 3 to 30.

To change this setting, pass it as a Java parameter when starting the server. See How to Start or Stop the Server.

Changing this parameter requires restarting ONE Runtime Server.
