Tuesday, August 14, 2007

Free Code: Audit Tables and Change Control



The above link provides the following:
1. DBAuditTrackingTool.zip
Description:
This tool audits your database tables without changing the original table schema. It tracks what, when, and who changed a row within a table. Each audited table gets a corresponding history table and trigger (a minimal sketch of this pattern follows the requirements list below).

Feature List:
1. Tracks who, what, and when for every change made to any table.
2. Auto-generates the "Audit Trigger" for tables being audited.
3. Auto-generates the "Audit Table" for tables being audited.
4. Has a function called the "TableTardis" to reconstruct what the data looked like at any point in time, past or future.

Current Requirements:
1. A primary key (PK) must exist on each audited table.
2. Audit tables and sprocs are all in the same database as the tables being audited.
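
As a rough illustration of the pattern (the table, column, and trigger names below are hypothetical, not the tool's actual generated output), a history table and audit trigger for a made-up dbo.Customer table might look like this:

-- Hypothetical source table being audited:
--   CREATE TABLE dbo.Customer (CustomerID INT PRIMARY KEY, Name VARCHAR(100), Email VARCHAR(255));

-- The history table mirrors the source table and adds who/what/when columns.
CREATE TABLE dbo.Customer_History
(
    CustomerID  INT          NOT NULL,              -- PK value of the audited row
    Name        VARCHAR(100) NULL,
    Email       VARCHAR(255) NULL,
    AuditAction CHAR(1)      NOT NULL,              -- 'I'nsert, 'U'pdate, 'D'elete
    AuditDate   DATETIME     NOT NULL DEFAULT (GETDATE()),
    AuditUser   SYSNAME      NOT NULL DEFAULT (SUSER_SNAME())
);
GO

-- The trigger copies every affected row into the history table.
CREATE TRIGGER dbo.trg_Customer_Audit
ON dbo.Customer
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- New values for inserts and updates
    INSERT INTO dbo.Customer_History (CustomerID, Name, Email, AuditAction)
    SELECT i.CustomerID, i.Name, i.Email,
           CASE WHEN EXISTS (SELECT 1 FROM deleted) THEN 'U' ELSE 'I' END
    FROM inserted AS i;

    -- Old values for deletes (updates are already captured above)
    INSERT INTO dbo.Customer_History (CustomerID, Name, Email, AuditAction)
    SELECT d.CustomerID, d.Name, d.Email, 'D'
    FROM deleted AS d
    WHERE NOT EXISTS (SELECT 1 FROM inserted);
END;
GO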

2. DBChangeControlAndTrackingTool.zip
Description:
This tool creates a staging environment for change-tracking a database. It tracks what, when, and who changed a row within a table and groups those changes into a batch.

Technique:
Uses a minimal logging technique to save disk space. If 10% of 1 billion rows change over the course of a year, you will have 100 million rows of audit log. Batches are auto-managed, and a batch window is one full day starting at midnight.

Feature List:
1. Tracks who, what, and when for every change made to any table.
2. Auto-generates the "Audit Trigger" for tables being audited.
3. Auto-generates the "Audit Views" for each table to view current data or history.
4. Has a function called the "TableTardis" to reconstruct what the data looked like at any point in time, past or future.
5. Automatically tracks groups of changes as a batch. Batches can be controlled manually or automatically; auto batch tracking groups all changes for a day. This is useful for releasing changes to a production database. (A rough sketch of the point-in-time and batch queries follows the requirements list below.)

Current Requirements:
1. A primary key (PK) must exist on each audited table.
2. Audit tables, views, and sprocs are all in the same database as the tables being audited.
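
To make the "TableTardis" and batch ideas concrete, here is a rough hand-written sketch (not the tool's actual code; the table, column, and batch-key choices are assumptions) of a point-in-time reconstruction against a history table like the one sketched above, plus a day-based batch roll-up:

-- Point-in-time reconstruction ("TableTardis" style): each customer's latest
-- audited state as of a chosen moment.
DECLARE @AsOf DATETIME;
SET @AsOf = '20070801';

SELECT h.CustomerID, h.Name, h.Email
FROM dbo.Customer_History AS h
JOIN (
        SELECT CustomerID, MAX(AuditDate) AS LastChange
        FROM dbo.Customer_History
        WHERE AuditDate <= @AsOf
        GROUP BY CustomerID
     ) AS latest
  ON latest.CustomerID = h.CustomerID
 AND latest.LastChange = h.AuditDate
WHERE h.AuditAction <> 'D';   -- drop rows whose most recent action was a delete

-- One auto-managed batch per calendar day starting at midnight: here the batch
-- key is simply the change date formatted as 'YYYYMMDD'.
SELECT CONVERT(CHAR(8), h.AuditDate, 112) AS BatchID,
       COUNT(*)                           AS ChangesInBatch
FROM dbo.Customer_History AS h
GROUP BY CONVERT(CHAR(8), h.AuditDate, 112);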

3. ServiceControlCenter.zip
Description:
A set of free code that demonstrates how to create a Service Control Center for any application, service, or utility that needs a central database to control and monitor it. This is essential for managing complex systems that contain numerous services; the pipelines found in data warehouses are one example. It comes complete with a data model and stored procedures.
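
As a rough idea of what a minimal control-center schema can look like (the table, column, and procedure names here are illustrative, not the ones in the zip):

-- Central registry of every application, service, or utility being controlled.
CREATE TABLE dbo.Service
(
    ServiceID   INT IDENTITY(1,1) PRIMARY KEY,
    ServiceName VARCHAR(100) NOT NULL UNIQUE,
    IsEnabled   BIT          NOT NULL DEFAULT (1)   -- central on/off switch
);

-- Status reports (heartbeats) sent in by each service for monitoring.
CREATE TABLE dbo.ServiceHeartbeat
(
    HeartbeatID   INT IDENTITY(1,1) PRIMARY KEY,
    ServiceID     INT          NOT NULL REFERENCES dbo.Service (ServiceID),
    StatusCode    VARCHAR(20)  NOT NULL,            -- e.g. 'Running', 'Failed'
    StatusMessage VARCHAR(500) NULL,
    ReportedAt    DATETIME     NOT NULL DEFAULT (GETDATE())
);
GO

-- Each service calls a proc like this on a timer; operators query the tables
-- (or flip the Service.IsEnabled flag) to monitor and control the system.
CREATE PROCEDURE dbo.usp_ReportHeartbeat
    @ServiceName   VARCHAR(100),
    @StatusCode    VARCHAR(20),
    @StatusMessage VARCHAR(500) = NULL
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.ServiceHeartbeat (ServiceID, StatusCode, StatusMessage)
    SELECT ServiceID, @StatusCode, @StatusMessage
    FROM dbo.Service
    WHERE ServiceName = @ServiceName;
END;
GO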

Wednesday, August 01, 2007

How To: Simple Query-able Compression (No need to decompress to read file)

Do you wish to compress your data without having to decompress it to read it? Most people think of RAR or ZIP compression when they need to save space on the hard drive. That may save storage and I/O load, but the side effect of this approach is that you must decompress the file every time you need to access it. The following are ways to compress your data without requiring decompression to read it.

Normalization: Compression is a natural byproduct of normalizing your data (please see the articles below on normalization and modeling). By normalizing your data you remove redundant data. It's an effective, non-destructive means of compressing your data into a query-enabled format.
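
For example (the table and column names here are hypothetical), pulling a repeated country name out of a wide log table into a lookup table stores each distinct value only once while keeping the data fully query-able:

-- Before: every row repeats the full country name (redundant data).
--   PageHit(HitID, UserName, Country VARCHAR(50), HitDate)

-- After: each distinct country name is stored once and referenced by a small key.
CREATE TABLE dbo.Country
(
    CountryID   SMALLINT IDENTITY(1,1) PRIMARY KEY,
    CountryName VARCHAR(50) NOT NULL UNIQUE
);

CREATE TABLE dbo.PageHit
(
    HitID     INT IDENTITY(1,1) PRIMARY KEY,
    UserName  VARCHAR(50) NOT NULL,
    CountryID SMALLINT    NOT NULL REFERENCES dbo.Country (CountryID),  -- 2 bytes instead of up to 50
    HitDate   DATETIME    NOT NULL
);

-- The data stays fully query-able through a join; no decompression step needed.
SELECT ph.UserName, c.CountryName, ph.HitDate
FROM dbo.PageHit AS ph
JOIN dbo.Country AS c ON c.CountryID = ph.CountryID;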
Binary Conversion: Converting a string-formatted file into a binary-formatted file is another natural means of compressing your data. Reducing the string value "1002000032" to a 4-byte integer saves 6 bytes. A strongly data-typed binary file can also be trusted and read by other business processes without the need for string conversion.
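
A minimal illustration of the byte savings in T-SQL, assuming the string always holds a whole number that fits in a 32-bit integer:

-- A 10-character string costs 10 bytes; the same value as a 4-byte INT costs 4.
DECLARE @AsString CHAR(10);
DECLARE @AsBinary INT;

SET @AsString = '1002000032';
SET @AsBinary = CAST(@AsString AS INT);        -- strongly typed; 6 bytes saved per value

SELECT DATALENGTH(@AsString) AS StringBytes,   -- 10
       DATALENGTH(@AsBinary) AS IntBytes;      -- 4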
Hashing Long String Values: Hashing long string values into binary hash values, and placing each string and its corresponding hash value in a lookup table, is another natural means of compressing your data. URL links are common storage hogs; reducing a 255-byte URL string to a 64-bit hash can save a lot of space if that URL occurs multiple times within the file. (NOTE: Make sure you select the most appropriate hashing algorithm and hash bit length to reduce your odds of collisions.)
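
One way to sketch this in T-SQL (the 8-byte truncation of an MD5 hash and the table names are just illustrations; choose your own algorithm and hash length after weighing collision risk):

-- Lookup table maps each full URL to a compact 64-bit (8-byte) hash key.
CREATE TABLE dbo.UrlLookup
(
    UrlHash BINARY(8)    NOT NULL PRIMARY KEY,
    Url     VARCHAR(255) NOT NULL
);

-- The log stores only the 8-byte hash instead of the 255-byte string.
CREATE TABLE dbo.UrlHitLog
(
    HitID   INT IDENTITY(1,1) PRIMARY KEY,
    UrlHash BINARY(8) NOT NULL REFERENCES dbo.UrlLookup (UrlHash),
    HitDate DATETIME  NOT NULL DEFAULT (GETDATE())
);
GO

-- Populate the lookup by truncating a hash to 8 bytes
-- (HASHBYTES is available in SQL Server 2005 and later).
DECLARE @Url VARCHAR(255);
SET @Url = 'http://example.com/some/very/long/landing/page?campaign=summer';

INSERT INTO dbo.UrlLookup (UrlHash, Url)
SELECT CAST(HASHBYTES('MD5', @Url) AS BINARY(8)), @Url
WHERE NOT EXISTS (SELECT 1 FROM dbo.UrlLookup WHERE Url = @Url);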
Roll Ups (Aggregation): By recording only one unique row along with an aggregation count of how many times it occurred within a unit of time, you can reduce the amount of data being recorded (example: John Doe hit your website home page 3 times in 1 hour; the log would contain one record with an aggregate count of 3). This is destructive to your data set, because you lose the ability to retrace a user's event path, but that may be a minimal and acceptable loss depending on your business.
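
A sketch of the roll-up for the example above, using hypothetical table and column names:

-- Raw log: one row per page hit.
--   PageHitRaw(UserName, PageUrl, HitDate)

-- Rolled-up log: one row per user / page / hour with an aggregate count.
SELECT UserName,
       PageUrl,
       DATEADD(HOUR, DATEDIFF(HOUR, 0, HitDate), 0) AS HitHour,  -- truncate to the hour
       COUNT(*) AS HitCount          -- e.g. John Doe + home page + 1 hour => 3
FROM dbo.PageHitRaw
GROUP BY UserName,
         PageUrl,
         DATEADD(HOUR, DATEDIFF(HOUR, 0, HitDate), 0);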

Or Get Up To 36x Compression
Using one or more of the above suggestions typically yields a 2x-6x compression ratio. The suggestions are extremely valuable even if you don't care about having query-enabled compression. If you add RAR on top of it all, you can gain roughly another 6x, which puts you between 12x and 36x overall compression. Not bad for saving space, eh!