- Pipeline Coordination and Scheduling: Automatically manage parallel processes that run against data located on specific compute nodes, and handle late-arriving data.
- Reentry: Ability to restart a process at the point of failure.
- Data Life Cycle Management: Automatically archive aging data and remove expired data from the warehouse.
- Standard Import and Export: Standard handling of importing data from, and exporting data to, external landing locations, including the related error handling and protocols.
- Data Lineage and Traceability: Track the flow of data from its initial entry into the warehouse, through the pipelines, to its final destinations. This should also make it possible to determine downstream impacts when data is missing or processes fail.
- Data Service Level Agreement Management: Track the scheduled window of time in which a dataset should be available for downstream use, and which internal and external parties depend on the data.
- Data Replication Management: Automatic replication of specific data for backup purposes.
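Two of the items above, reentry and downstream impact analysis from lineage, can be sketched in a few lines. This is only an illustration, not how any particular framework does it: `run_pipeline`, `downstream_impacts`, the JSON checkpoint file, and the dictionary-based lineage graph are all assumptions made for the example.

```python
import json
from pathlib import Path


def run_pipeline(steps, checkpoint_path):
    """Run (name, callable) steps in order, skipping steps a previous run finished.

    Hypothetical helper: completed step names are persisted to a JSON file,
    so a rerun after a failure resumes at the step that failed.
    """
    path = Path(checkpoint_path)
    done = set(json.loads(path.read_text())) if path.exists() else set()
    for name, func in steps:
        if name in done:
            continue  # finished in an earlier run, do not repeat it
        func()  # if this raises, the checkpoint still holds the last success
        done.add(name)
        path.write_text(json.dumps(sorted(done)))
    return done


def downstream_impacts(lineage, failed):
    """Return every dataset reachable from `failed` in a lineage graph.

    `lineage` is assumed to map a dataset name to the datasets built from it.
    """
    impacted, stack = set(), [failed]
    while stack:
        for child in lineage.get(stack.pop(), []):
            if child not in impacted:
                impacted.add(child)
                stack.append(child)
    return impacted
```

For example, with `lineage = {"raw": ["staged"], "staged": ["fact", "dim"], "fact": ["report"]}`, a failure while building `staged` would flag `fact`, `dim`, and `report` as impacted. A real framework would keep both the checkpoint and the lineage in the warehouse's own metadata tables rather than in a local file.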
Check out http://pragmaticworks.com/Products/BI-xPress.aspx for SSIS-based data warehouse pipelines.
Also check out http://ssisetlframework.codeplex.com/ for SSIS-based data warehouse pipelines. It doesn't cover everything, but it's a great place to start and it's free.