Thursday, December 27, 2007

Free Code: Bulk Loading and Partitioning Tables



ZIP File: TablePartitioningTool.zip

This experimental toolset is a result of my desire to have a simple repeatable solution for a complex problem: Table Loading and Partitioning Management for time based event data.

1. Loading can be a sore spot in any ETL pipeline when it is too slow. This results in an ever-increasing backlog of import data.
2. Partitioning within SQL Server 2005 is an extremely powerful and essential feature for dealing with large tables, but as a result the partitioning logic can be very complex to write at times.
3. Each time you create a partitioning and loading process for a given table you may implement each one differently. This creates unpredictable behavior of your overall service, application, or warehouse thus making it more costly to maintain and improve.

By using a code-generating approach for your loading and partitioning needs, you can quickly create solutions throughout your company. This solution offers the added bonus of creating a general standard to handle this complex problem, thus lowering your company’s overall cost of ownership.

FEATURES:

1. Creates complex stored procedures which handle all the loading, partitioning, switching, and merging logic to the specifications of a given table template.

2. Provides table loading management
  • Ensures minimum transaction logging
  • Manages micro-loading*
  • Keeps the table “Online” while loading the table
3. Provides partitioning management
  • Manages explicit time-based partitioning at grains of Hour, Day, Week, Month, or Year for any given table
  • Manages rolling off partitions older than a configurable maximum
* NOTE: Micro-Loading in this context represents a process of loading a smaller range of data than the explicit partition of the target table. Example: You load files every five minutes that contain data with a time range of five minutes into your table that is explicitly partitioned by day.

No comments: