Wednesday, December 28, 2005

Instrumentation of Websites

If you want to successfully mine useful data from your web logs, you should carefully instrument your websites. Garbage in, Garbage out. Parameters in the URL can be easily mined, but they should be standardized across all your websites or properties. This will simplify the Business and Customer Intelligence mining process and increase accuracy.
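As a minimal sketch of what "standardized parameters" buys you, the snippet below pulls the same small set of tracking parameters out of any URL. The parameter names (`site`, `page_group`, `campaign`) and the function name are hypothetical, chosen only for illustration; the point is that one extractor works for every property when the names are shared.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical parameter names that every website agrees to use.
STANDARD_PARAMS = ("site", "page_group", "campaign")


def extract_standard_params(url):
    """Return the standardized tracking parameters found in a URL.

    Missing parameters map to None so every site yields the same columns.
    """
    query = parse_qs(urlsplit(url).query)
    return {name: query.get(name, [None])[0] for name in STANDARD_PARAMS}


params = extract_standard_params(
    "http://example.com/products/view?site=store&page_group=catalog&x=1"
)
```

Parameters outside the standard set (such as `x` here) are simply ignored by this sketch; a real pipeline might keep them as passthrough values for the owning business.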

Some Things to Track:
1. Page Views
2. User Session
3. Click Through Path
4. Entry, Exit pages
5. External Reference Views
6. IP addresses -> reverse lookup of the user's general location
7. Page mapping to business owner
8. Logical Page Groupings
9. Hosting servers and their geographical locations, if you're distributed
10. Original web log file that this information came from
11. Passthrough parameters to be further processed and mined by the business specific to a given website.
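
Several items in this list (user sessions, click-through paths, entry and exit pages) fall out of one pass over the page views. The following is a rough sketch, not a definitive implementation: it assumes a visitor is identified by IP address and that 30 minutes of inactivity ends a session. Both are common conventions but are assumptions here, as are the function and field names.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed idle cutoff


def sessionize(page_views):
    """Group (visitor, timestamp, page) tuples into sessions.

    A new session starts when the visitor changes or has been idle
    longer than SESSION_TIMEOUT.
    """
    sessions = []
    current = None
    for visitor, ts, page in sorted(page_views):
        is_new = (
            current is None
            or current["visitor"] != visitor
            or ts - current["last_seen"] > SESSION_TIMEOUT
        )
        if is_new:
            current = {"visitor": visitor, "path": [], "last_seen": ts}
            sessions.append(current)
        current["path"].append(page)
        current["last_seen"] = ts
    return [
        {
            "visitor": s["visitor"],
            "entry_page": s["path"][0],
            "exit_page": s["path"][-1],
            "path": s["path"],  # click-through path
            "page_views": len(s["path"]),
        }
        for s in sessions
    ]


views = [
    ("10.0.0.1", datetime(2005, 12, 28, 9, 0), "/home"),
    ("10.0.0.1", datetime(2005, 12, 28, 9, 5), "/products"),
    ("10.0.0.1", datetime(2005, 12, 28, 11, 0), "/home"),
    ("10.0.0.2", datetime(2005, 12, 28, 9, 1), "/about"),
]
result = sessionize(views)
```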

Common Things

The following is a list of concepts commonly found within a data warehouse or transaction database.

Some things to think about:
1. Table Names
a. Domain Tables
b. Dimension Tables
c. Rule Tables
d. Fact Tables
e. Index Tables
f. Log or Change History Tables
g. General Tables
2. Table Column Names
a. ID, SK
b. Code, Name, Description
c. Binary Indicators/Flags
d. Ranges: Start and End
e. CreatedBy, ModifiedBy, CreatedDateTime, ModifiedDateTime
f. EffectiveDate, ExpiredDate
3. View Names
a. Current Views
b. Historical Views
c. Dimension Views
d. Pipeline Views
4. Stored Procedure Names
a. Insert, Update, Delete, Select
b. Import, Export, Transform, Map, Sessionize
5. Trigger Names
a. Insert, Update, Delete Triggers
6. Function Names
a. Encrypt
b. Hash
c. Specialized Calculations
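
To make a few of these conventions concrete, here is a small sketch of a dimension table with an SK, a Code/Name pair, a binary flag, Effective/Expired dates, and a "current" view over it. All table, column, and view names are hypothetical, and SQLite (via Python's `sqlite3`) is used only so the example runs anywhere; it is not meant to suggest a particular database product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimCustomer (
    CustomerSK      INTEGER PRIMARY KEY,          -- surrogate key
    CustomerCode    TEXT NOT NULL,                -- business code
    CustomerName    TEXT NOT NULL,
    IsActive        INTEGER NOT NULL DEFAULT 1,   -- binary indicator
    EffectiveDate   TEXT NOT NULL,
    ExpiredDate     TEXT,                         -- NULL while the row is current
    CreatedBy       TEXT NOT NULL,
    CreatedDateTime TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- Current view: only the rows that have not expired.
CREATE VIEW vCurrentCustomer AS
    SELECT CustomerSK, CustomerCode, CustomerName
    FROM DimCustomer
    WHERE ExpiredDate IS NULL;
""")

conn.executemany(
    "INSERT INTO DimCustomer "
    "(CustomerCode, CustomerName, EffectiveDate, ExpiredDate, CreatedBy) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("C001", "Acme", "2004-01-01", "2005-06-30", "etl"),
        ("C001", "Acme Corp", "2005-07-01", None, "etl"),
    ],
)

current_rows = conn.execute(
    "SELECT CustomerCode, CustomerName FROM vCurrentCustomer"
).fetchall()
```

A historical view would select from the same table without the `ExpiredDate IS NULL` filter.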

Tuesday, December 27, 2005

Data Modeling Versus Building a Universal Meta Data Model

Arguments I use with people who overuse meta data models (mostly overzealous C++, C#, and Java developers, UI developers, and beginning data modelers).

You May Not Know Your Business
The purpose of data modeling, in short, is to accurately define your business and the relationships between business concepts in great detail. Meta modeling your whole business raises a red flag: it suggests that you may not know your business, or that your business is not well defined to begin with. Business is all about knowing the needs of your customers. So go back and do your ‘Use Case Modeling’ before building yourself a crystal castle. Go and actually get to know your customers’ needs. Otherwise, you build a system for a non-existent customer or, worse, you build to your own ego.
Meta Models Are Expensive
Meta models are expensive over the life of an application/service/tool. The more complex the data a meta model must contain, the more it costs to maintain. Beware: just because you meta modeled the system doesn’t mean that you made it easy to extend over the life of the application. Sometimes it makes the system prohibitively expensive or risky to modify, especially when the original developers move on to other projects or companies. KISS the project (Keep It Simple, Stupid). Where possible, make the project easy enough to be maintained by contractors off the street.
Over Engineering
Don’t get me wrong: meta data models are great for well-defined uses. But question a meta model’s existence and use it sparingly. If in doubt, define it out. Basically, model the business out first. Then collapse only specific areas into meta models where it becomes PAINFULLY obvious that you need them for the business/process. There are many types of meta models. You may find that you only need a simple meta model for one very specific thing rather than having to create a unification theory to solve the world’s problems (over engineering).
SQL Server is already a Meta data engine
SQL Server, or any other database product, is already a meta data engine. So build around its strengths and strengthen specific areas where it has been empirically determined to be insufficient. This means that you start building real, usable, and scalable prototypes to prove your designs and/or bring to the fore the database system's design flaws. DON’T ASSUME OR GUESS. You don’t want to explain to your boss why you decided to build a database system on top of a database system instead of solving his business needs. :)
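
To illustrate the "database on top of a database" point, here is a hedged sketch comparing a plainly modeled table with one common form of generic meta model, an entity-attribute-value table. The post does not name that pattern, so treating it as the example is an assumption, as are all table and column names. SQLite is used only to keep the sketch runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Concrete model: the business concept is a table with named columns.
CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, Name TEXT, City TEXT);
INSERT INTO Customer VALUES (1, 'Acme', 'Seattle'), (2, 'Globex', 'Portland');

-- Generic meta model: every attribute of every entity becomes a row.
CREATE TABLE EntityAttribute (EntityID INTEGER, Attribute TEXT, Value TEXT);
INSERT INTO EntityAttribute VALUES
    (1, 'Name', 'Acme'),   (1, 'City', 'Seattle'),
    (2, 'Name', 'Globex'), (2, 'City', 'Portland');
""")

# "Which customers are in Seattle?" against the concrete model.
concrete = conn.execute(
    "SELECT Name FROM Customer WHERE City = 'Seattle'"
).fetchall()

# The same question against the meta model needs a self-join
# for each attribute involved.
meta = conn.execute("""
    SELECT n.Value
    FROM EntityAttribute AS c
    JOIN EntityAttribute AS n
      ON n.EntityID = c.EntityID AND n.Attribute = 'Name'
    WHERE c.Attribute = 'City' AND c.Value = 'Seattle'
""").fetchall()
```

Both queries return the same customer, but the second re-implements columns, typing, and constraints that the database engine already provides, and each additional attribute in a query adds another join.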

Businesses Drowning In Customer Information, Yet Not Understanding Their Customers

In my years of database development, I have seen many businesses suffer from information glut without a means to quickly and cheaply mine their data. A single business can lose terabytes of data every day without that data ever relinquishing its secrets or benefiting its owner with its hidden treasures.

Businesses may not even know what they are looking for, other than that they want to make money. First they must understand what their goal is as a business or business unit. Figuring out who their target customers are and understanding them is a start. You'd be surprised how many businesses really don't understand their own customers. It is sad to see highly educated people miss this very point because they are distracted by their own ego or self-interest.

So to begin with: Do a use case model of your customers. Really get to know them well.

Sources that you can turn to in understanding your customer: Customers themselves, WebLogs (Use Logs), Sales/Subscription Accounting books, Customer Service Logs/Emails, and Questionnaires and Polls.