Process Flow Design
- Performance considerations based on different use cases
  - Files and Bulk Data processing
    - Use cases – Business data exchange with customers and partners (B2B), Data Integration (ETL), Self-service for business users
    - Technical needs – Many file transfers, Polling Events and Triggers, Secure Data Transfers, Complex data transformation rules, Data validations, Error notifications, Ability to fix errors and re-run files
    - Performance considerations – Ability to process very large files (larger than available memory), Apply complex rules efficiently, Ability to queue jobs, Efficient handling of events, Detailed logs for analysis, Recover failed jobs, Responsive UI
  - Real-time Transactions
    - Use cases – APIs and Web Services, Application Integration / ESB
    - Technical needs – Published Web Services, Real-time Events, Sub-second response times, Simpler data transformation rules, SLAs, Notifications
    - Performance considerations – Ability to process transactional data quickly, Handle a large volume of concurrent jobs; detailed logs and job recovery are typically not needed
Files and Bulk Data
- Attributes that impact performance and throughput per instance of Process Flow
  - Volume of Bulk Data
    - # of Records, # of Fields per Record, # of Megabytes
  - Complexity of Data Mapping Rules
    - Straight Maps vs. Complex Maps
    - # of Fields that have Complex Maps
  - Complexity of Data Validations
  - Calling External Programs Inside the Mapping Rules
  - # of Database Lookups
  - Whether source data is encrypted or compressed
- Design approaches for optimizing performance of files and bulk data
  - Use dynamic process flows – a single process flow can handle multiple, different sources, file types, etc.
  - Use fewer events to minimize polling, e.g. watch a parent folder with one event rather than one event per sub-folder (see the watcher sketch after this list)
  - Large File Data Ingestion (Streaming)
    - Design the process flow to use streaming and mark the data mapping to run in streaming mode (see the streaming sketch after this list)
  - File Splitting in Data Mapping
    - Divides bulk data into multiple blocks of X records each
    - Transformations/mappings of N blocks run in parallel; the resulting data sets are automatically concatenated into a single stream
    - Adeptia does this automatically; the user only specifies the values of X and N
    - For example: a 400MB text or XML file with half a million records can be divided into 100 blocks of 5,000 records (X) each, with ten blocks (N) processed in parallel at a time (see the block-splitting sketch after this list)
  - In mapping, cache DB lookup results (see the lookup-cache sketch after this list)
  - When loading millions of records into a target DB, generate a target file and use the database's bulk file-load utility in a custom plugin instead of the Advanced Database schema (see the bulk-load sketch after this list)
  - Set the PF log level to ERROR to reduce logging
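The event-consolidation point can be illustrated outside Adeptia with plain Java NIO: a single WatchService registered against a parent folder and all of its sub-folders replaces a separate poller per sub-folder. This is a minimal sketch; the path /data/inbound and the println hand-off are hypothetical stand-ins for a real trigger.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.*;
import java.util.stream.Stream;

import static java.nio.file.StandardWatchEventKinds.ENTRY_CREATE;

public class ParentFolderWatcher {
    public static void main(String[] args) throws IOException, InterruptedException {
        Path parent = Paths.get("/data/inbound"); // hypothetical inbound root
        WatchService watcher = FileSystems.getDefault().newWatchService();

        // Register the parent and every existing sub-folder with the SAME watcher,
        // so one event loop replaces a separate poller per sub-folder.
        try (Stream<Path> dirs = Files.walk(parent)) {
            dirs.filter(Files::isDirectory).forEach(dir -> {
                try {
                    dir.register(watcher, ENTRY_CREATE);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }

        while (true) {
            WatchKey key = watcher.take(); // blocks until a file arrives under a watched folder
            for (WatchEvent<?> event : key.pollEvents()) {
                Path created = ((Path) key.watchable()).resolve((Path) event.context());
                System.out.println("New file: " + created); // hand off to a process flow here
            }
            key.reset();
        }
    }
}
```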
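The streaming idea itself is generic and can be sketched in plain Java: read, transform, and write one record at a time, so memory use stays flat regardless of file size. The file paths and the transform() body are assumptions for illustration; inside Adeptia the equivalent is the streaming setting on the process flow and mapping activities.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

public class StreamingIngestion {
    public static void main(String[] args) throws IOException {
        try (BufferedReader in = new BufferedReader(
                 new FileReader("/data/inbound/orders.txt", StandardCharsets.UTF_8));
             BufferedWriter out = new BufferedWriter(
                 new FileWriter("/data/outbound/orders.txt", StandardCharsets.UTF_8))) {
            String record;
            while ((record = in.readLine()) != null) { // one record in memory at a time
                out.write(transform(record));
                out.newLine();
            }
        }
    }

    private static String transform(String record) {
        return record.toUpperCase(); // placeholder for real mapping rules
    }
}
```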
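Outside the product, the block-splitting pattern looks like the following plain-Java sketch: a fixed thread pool caps parallelism at N while records are mapped in blocks of X, and results are collected in submission order so the concatenated output matches the input order. The numbers mirror the example above (X = 5,000, N = 10); the record type and mapBlock() body are placeholders, not Adeptia internals.

```java
import java.util.*;
import java.util.concurrent.*;

public class BlockSplitter {
    static final int BLOCK_SIZE = 5_000; // X: records per block
    static final int PARALLELISM = 10;   // N: blocks processed at a time

    public static List<String> processAll(List<String> records) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(PARALLELISM);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (int i = 0; i < records.size(); i += BLOCK_SIZE) {
                List<String> block = records.subList(i, Math.min(i + BLOCK_SIZE, records.size()));
                futures.add(pool.submit(() -> mapBlock(block)));
            }
            List<String> output = new ArrayList<>(records.size());
            for (Future<List<String>> f : futures) {
                output.addAll(f.get()); // collected in submission order, preserving record order
            }
            return output;
        } finally {
            pool.shutdown();
        }
    }

    private static List<String> mapBlock(List<String> block) {
        List<String> mapped = new ArrayList<>(block.size());
        for (String record : block) {
            mapped.add(record.trim()); // placeholder for real mapping rules
        }
        return mapped;
    }
}
```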
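For the lookup-cache recommendation, the pattern is ordinary memoization, sketched here with JDBC: each distinct key hits the database once and is served from memory for the rest of the mapping run. The country table and column names are hypothetical.

```java
import java.sql.*;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CachedLookup {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Connection conn;

    public CachedLookup(Connection conn) { this.conn = conn; }

    public String countryName(String code) {
        // In a mapping over a million rows with a few hundred distinct codes,
        // the query below runs a few hundred times instead of a million.
        return cache.computeIfAbsent(code, this::queryDatabase);
    }

    private String queryDatabase(String code) {
        String sql = "SELECT name FROM country WHERE code = ?"; // hypothetical lookup table
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, code);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : "";
            }
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }
}
```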
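As an illustration of the bulk-load approach, the sketch below assumes a PostgreSQL target and the COPY interface of the pgjdbc driver; MySQL (LOAD DATA INFILE), Oracle (SQL*Loader), and SQL Server (BULK INSERT) offer equivalents. The connection details, table, and file path are hypothetical; the point is that one bulk call replaces millions of row-by-row INSERTs.

```java
import java.io.FileReader;
import java.sql.Connection;
import java.sql.DriverManager;
import org.postgresql.copy.CopyManager;
import org.postgresql.core.BaseConnection;

public class BulkLoadPlugin {
    public static void main(String[] args) throws Exception {
        // The mapping step has already written a flat CSV file; the plugin
        // streams it through COPY instead of inserting row by row.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:postgresql://localhost:5432/target", "user", "secret");
             FileReader csv = new FileReader("/data/staging/orders.csv")) {
            CopyManager copy = new CopyManager(conn.unwrap(BaseConnection.class));
            long rows = copy.copyIn(
                "COPY orders FROM STDIN WITH (FORMAT csv, HEADER true)", csv);
            System.out.println("Loaded " + rows + " rows");
        }
    }
}
```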
Real-time Transactions
- Design approaches for optimizing performance of APIs, Web Services and real-time transactions
  - Enable JDO Caching so that PF objects are not retrieved from the back-end DB for each instance of a PF run
  - Reduce the number of activities and steps in the flow by using custom plugins
  - Select the “Optimize for Real-Time” option in the PF properties to ensure minimal logging and disable PF recovery
  - Set the process flow Priority to IMMEDIATE in the PF properties so that, when triggered, it bypasses the queue
  - Ensure that DB connection pool settings are sized to support concurrent transaction volumes (see the pool-sizing sketch after this list)
  - For simpler data processing scenarios, use custom plugins for parsing and for data mapping/transformation instead of the Schema feature and Data Mapper; this avoids unnecessary overhead (see the plugin sketch after this list)
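A connection-pool configuration can be sketched with HikariCP, chosen here only as a familiar JDBC pool; any pool, including the one the platform ships with, exposes the same knobs. The numbers are assumptions meant to show the relationships, not recommended values.

```java
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class PoolSetup {
    public static HikariDataSource create() {
        HikariConfig cfg = new HikariConfig();
        cfg.setJdbcUrl("jdbc:postgresql://localhost:5432/backend"); // hypothetical URL
        cfg.setUsername("user");
        cfg.setPassword("secret");
        cfg.setMaximumPoolSize(50);      // >= expected peak concurrent PF instances hitting the DB
        cfg.setMinimumIdle(10);          // keep warm connections for bursty traffic
        cfg.setConnectionTimeout(2_000); // fail fast rather than silently blowing an SLA
        return new HikariDataSource(cfg);
    }
}
```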
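As a sketch of the last point: when the payload is trivially simple, the parse-and-map step of a custom plugin can be a few lines of plain Java, skipping the Schema parse and Data Mapper pass entirely. The pipe-delimited layout and field names below are assumptions for the example.

```java
public class LightweightTransform {
    // input:  "ORD123|49.99"   output: {"orderId":"ORD123","amount":49.99}
    public static String transform(String payload) {
        String[] fields = payload.split("\\|", 2);
        return "{\"orderId\":\"" + fields[0] + "\",\"amount\":"
                + Double.parseDouble(fields[1]) + "}";
    }

    public static void main(String[] args) {
        System.out.println(transform("ORD123|49.99"));
    }
}
```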