Datapool

The Datapool is an Orchestrator module that enables us to efficiently manage batch item processing.

Through the Datapool, we can have control and granularity over the queue of items that need to be processed, making it an essential resource for automations that process large volumes of data.

Overview¶

Key Features¶

Parallel Item Processing: Allows consumption and processing of items in parallel.
Automatic Reprocessing: Automatically re-inserts items that were processed with error back into the queue for reprocessing.
Data Standardization: Possibility of creating a predefined structure (schema) for items, defining the expected fields for each added item.
Task Triggering: Automatic triggering of tasks for execution, based on the creation of new items in the queue.

Key Benefits¶

Complexity Abstraction: Eliminates the need to implement complex logic in code, as it has native treatments for concurrent consumption and reprocessing scenarios.
Simplified Integrations: Read from any database and transform data into Datapool items, adding them to the processing queue through APIs, SDKs and .csv files. Easily extract processing data and integrate with your data visualization platforms.
Centralized Management: Offers centralized management directly in the Orchestrator, facilitating control and monitoring, and is already integrated with other platform resources. Eliminates the need to depend on external databases and auxiliary tools.

Use Cases¶

Batch Item Processing: Ideal for automations that need to process large volumes of data (item batches), such as registrations, updates and queries.
Producer/Consumer Scenarios: Excellent alternative for scenarios where one automation generates demand (creates new items in the Datapool), and another automation consumes the queue and performs the processing flow of these items.