Monday, December 18, 2006

How To: Sessionize Events

This post shows how to sessionize a raw set of events using nothing but SQL. We use a series of self-joins to accomplish the task.

Steps:
  • Determine your session timeout length.
  • Order events by date.
  • Identify the Exit Point Event and the duration between events. The Exit Point is the event that has a duration greater than the session timeout length.
  • Finally you can identify the sessionID and assign a sequence number for each event within a session.
-- HOW TO: SESSIONIZING EVENTS
-- This is an example of how to implicitly determine sessions with just a set of event date/times.

--------------------------------------------------------------------
-- Set up test scenario.
--------------------------------------------------------------------
-- DROP TABLE WebLog
CREATE TABLE WebLog(UserID INT NOT NULL, EventDate DATETIME NOT NULL)
DECLARE @SessionTimeOut INT
SET @SessionTimeOut = 1800 -- 60 seconds * 30 minutes

INSERT INTO WebLog VALUES (123,'01-jan-2005 00:01:00');
INSERT INTO WebLog VALUES (123,'01-jan-2005 00:02:00');
INSERT INTO WebLog VALUES (123,'01-jan-2005 00:03:00');
INSERT INTO WebLog VALUES (123,'01-jan-2005 00:06:00');
INSERT INTO WebLog VALUES (123,'01-jan-2005 00:10:00');
INSERT INTO WebLog VALUES (123,'01-jan-2005 00:20:00');
INSERT INTO WebLog VALUES (123,'01-jan-2005 01:00:00');
INSERT INTO WebLog VALUES (123,'01-jan-2005 00:30:00');
INSERT INTO WebLog VALUES (123,'01-jan-2005 00:31:00');
INSERT INTO WebLog VALUES (123,'01-jan-2005 03:00:00');
INSERT INTO WebLog VALUES (123,'01-jan-2005 03:32:00');
INSERT INTO WebLog VALUES (123,'01-jan-2005 03:33:00');
INSERT INTO WebLog VALUES (123,'01-jan-2005 03:34:00');
INSERT INTO WebLog VALUES (123,'01-jan-2005 03:34:10');
INSERT INTO WebLog VALUES (123,'01-jan-2005 03:34:30');
INSERT INTO WebLog VALUES (123,'01-jan-2005 04:00:00');
INSERT INTO WebLog VALUES (123,'01-jan-2005 05:01:00');
INSERT INTO WebLog VALUES (123,'01-jan-2005 05:02:00');
INSERT INTO WebLog VALUES (123,'01-jan-2005 05:03:00');
INSERT INTO WebLog VALUES (123,'01-jan-2005 05:04:00');
INSERT INTO WebLog VALUES (123,'01-jan-2005 05:05:00');

INSERT INTO WebLog VALUES (121,'01-jan-2005 00:01:00');
INSERT INTO WebLog VALUES (121,'01-jan-2005 00:02:00');
INSERT INTO WebLog VALUES (121,'01-jan-2005 00:03:00');
INSERT INTO WebLog VALUES (121,'01-jan-2005 00:06:00');
INSERT INTO WebLog VALUES (121,'01-jan-2005 00:10:00');
INSERT INTO WebLog VALUES (121,'01-jan-2005 00:20:00');
INSERT INTO WebLog VALUES (121,'01-jan-2005 01:00:00');
INSERT INTO WebLog VALUES (121,'01-jan-2005 00:30:00');
INSERT INTO WebLog VALUES (121,'01-jan-2005 00:31:00');
INSERT INTO WebLog VALUES (121,'01-jan-2005 03:00:00');
INSERT INTO WebLog VALUES (121,'01-jan-2005 03:32:00');
INSERT INTO WebLog VALUES (121,'01-jan-2005 03:33:00');
INSERT INTO WebLog VALUES (121,'01-jan-2005 03:34:00');
INSERT INTO WebLog VALUES (121,'01-jan-2005 03:34:10');
INSERT INTO WebLog VALUES (121,'01-jan-2005 03:34:30');
INSERT INTO WebLog VALUES (121,'01-jan-2005 04:00:00');
INSERT INTO WebLog VALUES (121,'01-jan-2005 05:01:00');
INSERT INTO WebLog VALUES (121,'01-jan-2005 05:02:00');
INSERT INTO WebLog VALUES (121,'01-jan-2005 05:03:00');
INSERT INTO WebLog VALUES (121,'01-jan-2005 05:04:00');
INSERT INTO WebLog VALUES (121,'01-jan-2005 05:05:00');

DECLARE @WebLogWithRowID TABLE (UserID INT NOT NULL,
EventDate DATETIME NOT NULL,
RowID BIGINT NOT NULL)

DECLARE @WebLogExitPoint TABLE (UserID INT NOT NULL,
EventDate DATETIME NOT NULL,
RowID BIGINT NOT NULL,
Duration BIGINT NOT NULL,
ExitPointFlag BIT NOT NULL)
--------------------------------------------------------------------
-- Assign Record ID
--------------------------------------------------------------------
INSERT INTO @WebLogWithRowID(UserID, EventDate, RowID)
SELECT UserID, EventDate, ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY EventDate) RowID
FROM WebLog
--------------------------------------------------------------------
-- Determine Duration and ExitPoint
--------------------------------------------------------------------
INSERT INTO @WebLogExitPoint(UserID, EventDate, RowID, Duration, ExitPointFlag)
SELECT a.UserID, a.EventDate, a.RowID,
ISNULL(DATEDIFF(SECOND, a.EventDate, b.EventDate), @SessionTimeOut) AS Duration,
CASE WHEN b.UserID IS NULL THEN 1 -- Last record
WHEN DATEDIFF(SECOND, a.EventDate, b.EventDate) > @SessionTimeOut THEN 1
ELSE 0
END AS ExitPointFlag
FROM @WebLogWithRowID a
LEFT OUTER JOIN @WebLogWithRowID b
ON b.RowID = a.RowID + 1
AND b.UserID = a.UserID

--------------------------------------------------------------------
-- Determine SeqNumber within Session and SessionID
--------------------------------------------------------------------
SELECT a.UserID, a.EventDate, a.RowID,
a.RowID - ISNULL(b.RowID, 0) AS SeqNumber,
a.Duration,
a.ExitPointFlag,
ISNULL(b.RowID, 1) AS SessionID
FROM @WebLogExitPoint a
LEFT OUTER JOIN @WebLogExitPoint b
ON b.RowID = (SELECT MAX(c.RowID)
FROM @WebLogExitPoint c
WHERE c.RowID < a.RowID
AND c.ExitPointFlag = 1 )
AND b.UserID = a.UserID
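
On SQL Server 2012 and later, the same result can be sketched without the self-joins by using the LAG window function. This is a hedged alternative sketch, assuming the same WebLog table and session timeout as above:

```sql
-- Sketch only: assumes SQL Server 2012+ for LAG, and the WebLog table above.
DECLARE @SessionTimeOut INT = 1800; -- 60 seconds * 30 minutes

WITH Gaps AS (
    SELECT UserID, EventDate,
           DATEDIFF(SECOND,
                    LAG(EventDate) OVER (PARTITION BY UserID ORDER BY EventDate),
                    EventDate) AS SecondsSincePrev
    FROM WebLog
)
SELECT UserID, EventDate,
       -- A new session starts on the first event, or when the gap exceeds the timeout.
       SUM(CASE WHEN SecondsSincePrev IS NULL
                  OR SecondsSincePrev > @SessionTimeOut THEN 1 ELSE 0 END)
           OVER (PARTITION BY UserID ORDER BY EventDate
                 ROWS UNBOUNDED PRECEDING) AS SessionID
FROM Gaps
ORDER BY UserID, EventDate;
```

The running SUM counts session boundaries, so each event within a user gets the ID of the session it falls into.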

Wednesday, November 01, 2006

A Place For a Meta Model: System Configuration!

Now I know I stated in previous posts that I'm not really excited about meta models. But there is a place for them: they can be used sparingly here and there where the business area is blurred, or when you need to store system configuration! Yep. A system can have many services that need to be controlled by a central database, and a meta model fits the bill.

Having a database schema that is flexible enough to handle configuration data is a wonderful thing. There are only 4-5 tables in such a model, but it saves a lot of hassle when service developers ask for a new variable to be stored. It's always a slow trickle, and it has nothing to do with business needs but rather operational ones. So when a developer asks in the 11th hour to add a new configuration field, I just add a new name/value pair. The test team doesn't have to do any regression testing, because there were no changes in the schema or APIs. That is what we call a zero-risk change.
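
A minimal sketch of such a name/value configuration model (all table and column names here are illustrative assumptions, not a spec):

```sql
-- Sketch only: an illustrative name/value configuration schema.
CREATE TABLE Service (
    ServiceID   INT          NOT NULL PRIMARY KEY,
    ServiceName VARCHAR(100) NOT NULL
);

CREATE TABLE ConfigSetting (
    ServiceID    INT          NOT NULL REFERENCES Service(ServiceID),
    SettingName  VARCHAR(100) NOT NULL,
    SettingValue VARCHAR(500) NULL,
    PRIMARY KEY (ServiceID, SettingName)
);

-- Adding a new variable is a data change, not a schema change:
INSERT INTO ConfigSetting (ServiceID, SettingName, SettingValue)
VALUES (1, 'RetryCount', '3');
```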

Sweet! OK, don't overdo it. When in doubt, model it out! This is just one exception to that rule ;) Keep reading below for some really cool information.

Sunday, March 26, 2006

How To: Experience the Data Model

Not everyone can be proficient at creating and comprehending data models. Modeling is a very abstract concept with its own unique hieroglyphic language. So to make it easy on everyone, why not let people experience the model as you design it?

How?
1. As you're defining the logical data model, stop each week and create a physical model based on the logical model designed so far
2. Create the data schema scripts and build the database on a server somewhere
3. Auto-generate a “Throw-Away” UI based on the schema using a code generator
4. Release the generated UI onto a web server somewhere
5. Give the key users who are providing business input on the design access to the UI
6. Get feedback from these users

Most users will describe data and system requirements in terms of UI presentation and reports. This approach is a very natural way to communicate with users and technology has advanced enough to enable us to accomplish this at very little cost.

There are several code generators that can generate UI and code based on templates and database schema. The one that I would recommend is http://www.ironspeed.com/, because it just works right out of the box. I was able to generate a complete UI within a matter of hours. This one generates code for .NET on Windows. There are other UI and code generators out there, but I had difficulties getting their output code to work the first time. It is very important to have hassle-free “Throw-Away” UI and code generation. NOTE: Though I say “Throw-Away”, the generated code and UI are actually quite good and usable. Plus you can create your own UI templates.

By generating a “Throw-Away” UI for users to play with, you allow the users to experience the data model. This speeds up communication and comprehension of the model. You can then capture the users' feedback and categorize it into: data definitions and relationships; business rules; workflow; aesthetics; and usability issues. Then place the feedback into a requirements document. Not a bad week's worth of work!

During this phase, don't get distracted into customizing the UI for aesthetic and usability reasons unless it is dirt cheap to do so. Just record this information in the requirements document. Stay focused on data definitions and relationships in order to get those defined properly. Once the model has been fully defined and explored, you can then start customizing the generated UI or start a customized UI from the ground up.

Note: If the data model is quite large, then you can break up the model into subject areas in which each subject area has its own exploratory and development cycles.

Benefits:
1. Key users get a better understanding of the data model being designed
2. Better feedback from key users
3. The database scripts can be reused for development
4. The data captured by the users can be used to seed the database with example data for developers and testers to use.
5. The generated UI may turn out to be satisfactory to the users' needs and require little or no further development or customization.
6. You have a solid requirements document to develop a customized UI
7. The weekly delivery primes the development cycle
8. Gets the developers hungry to use a code generator and code templates for elements of code needed to create a customized UI.
9. Lowers the number of development and test resources required for the project

Bottom Line:
1. Increases your chances of the new system fulfilling the actual business needs
2. Decreases your time-to-delivery for a working system
3. Increases reliability, predictability, and maintainability of the new system by encouraging the use of code generators and code templates in strategic ways early on in the design phase

Warning: Garbage in Garbage out. This is very important. Get an experienced data modeler. Your system is driven by the business definitions as well as processes.

Thursday, March 23, 2006

Model Design Patterns For: Document Management System

I've known the following design pattern for several years and have implemented it to great success. The subtleties are what will save you:
1. Having an explicit ExpirationDate saves you from having to look up an initial contract's end date whenever the start date of the replacing contract implicitly ends the initial contract (when in doubt, be explicit, not implicit);
2. You can use BETWEEN statements for getting contracts that are currently active;
3. Not mixing non-customer-visible change history with official versioned data saves you from having to always filter change history out of the result set.

The design pattern below will cover a simplified contract management model that will track the work flow status of a contract and provide simple change control. This example is just to demonstrate the design pattern and can be used in variation for other concepts as well. It will simplify the understanding of your system.

Concept: Contract
A contract defines the terms for services to be rendered in exchange for money between two or more legal entities. A contract cannot be changed once it has been activated. If you need to change a contract after it is activated, you must terminate it and create a new contract with the required changes.

Concept: Contract Effective Date
A contract’s effective date represents the date the contract will be activated. This date can’t be changed once the current date has reached the effective date.

Concept: Contract Expiration Date
A contract’s expiration date represents the date the contract expires. Open ended contracts have no expiration date.

Concept: Contract Termination Date
A contract’s termination date represents the date the contract was terminated. Use this date to prematurely cancel a contract and yet preserve the original expiration date.

Concept: Contract Status
A contract’s workflow is controlled by its status: Draft, Approved, Active, Terminated, Expired.
1. Draft – Represents that the contract is a draft. Contract and terms can be changed freely.
2. Approved – Represents that the contract has been approved. Contract and terms can NOT change, except the Contract Status, Effective Date, and Expiration Date.
3. Active – Represents that the contract is activated. Contract and terms can NOT change, except the Contract Status and Expiration Date.
4. Terminated – Represents that the contract has been terminated prematurely. Contract and terms can NOT be changed.
5. Expired – Represents that the contract has naturally expired based on the contract’s expiration date. Contract and terms can NOT be changed.


Concept: Change history tracking
Change history tracking tracks all changes done to a table. Please see http://dataglass.blogspot.com/2006/03/how-to-build-change-log-into-your.html for technique.

Contract
------------------
ContractID (PK)
ContractNumber (AK)
… (Other attributes about the contract)
EffectiveDate
ExpirationDate
TerminationDate
StatusCode
CreatedBy
CreatedDateTime
ModifiedBy
ModifiedDateTime


ContractHistory
-----------------------
ContractID, RecordedDateTime (PK)
ContractNumber
… (Other attributes about the contract)
EffectiveDate
ExpirationDate
TerminationDate
StatusCode
CreatedBy
CreatedDateTime
ModifiedBy
ModifiedDateTime
ActionType

Status
----------
StatusCode (PK)
StatusName (AK)
StatusDesc
CreatedBy
CreatedDateTime
ModifiedBy
ModifiedDateTime
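
The attribute lists above can be sketched as DDL. This is a hedged sketch; the data types, sizes, and the 'ACT' status code are illustrative assumptions:

```sql
-- Sketch only: illustrative types and sizes.
CREATE TABLE Status (
    StatusCode       CHAR(3)      NOT NULL PRIMARY KEY,
    StatusName       VARCHAR(50)  NOT NULL UNIQUE,   -- AK
    StatusDesc       VARCHAR(255) NULL,
    CreatedBy        VARCHAR(50)  NOT NULL,
    CreatedDateTime  DATETIME     NOT NULL,
    ModifiedBy       VARCHAR(50)  NOT NULL,
    ModifiedDateTime DATETIME     NOT NULL
);

CREATE TABLE Contract (
    ContractID       INT          NOT NULL PRIMARY KEY,
    ContractNumber   VARCHAR(30)  NOT NULL UNIQUE,   -- AK
    EffectiveDate    DATETIME     NOT NULL,
    ExpirationDate   DATETIME     NULL,  -- open-ended contracts have none
    TerminationDate  DATETIME     NULL,
    StatusCode       CHAR(3)      NOT NULL REFERENCES Status(StatusCode),
    CreatedBy        VARCHAR(50)  NOT NULL,
    CreatedDateTime  DATETIME     NOT NULL,
    ModifiedBy       VARCHAR(50)  NOT NULL,
    ModifiedDateTime DATETIME     NOT NULL
);

-- Currently active contracts, using BETWEEN as described above:
SELECT ContractID
FROM Contract
WHERE StatusCode = 'ACT'
  AND GETDATE() BETWEEN EffectiveDate
                    AND COALESCE(TerminationDate, ExpirationDate, GETDATE());
```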

Tuesday, March 21, 2006

Modeling Questionnaire

Here are some strategic questions to ask key business personnel to help accurately model how the business uniquely defines its information.

Entity Questions:
1. What is the list of Items (or Entities) that you are tracking or would like to track?
2. How do you uniquely identify each {Entity Name Here} ?
3. Do you manually assign identification to each {Entity Name Here}?
4. Can the unique identifier change for a {Entity Name Here}?

Entity Relationship Questions:
1. Can {Entity A} have 1 or more {Entity B}?
2. Can {Entity B} have 1 or more {Entity A}?
3. Can {Entity A} exist without {Entity B}?
4. Can {Entity B} exist without {Entity A}?

Entity Subtyping Questions:
1. Are there one or more types of {Entity Name Here}?
2. Is a {Entity Name Here} a type of {Entity Name Here}?
3. Can a {Entity Name Here} be classified under more than one subtype?

Entity Attribute Questions:
1. What is the list of attributes that you directly associate with a {Entity Name Here}?
2. Is this {Entity Attribute Name Here} Required?
3. Is this {Entity Attribute Name Here} Changeable?
4. Is this {Entity Attribute Name Here} made up of multiple values?
5. Is this {Entity Attribute Name Here} a Calculated value?
6. Is this {Entity Attribute Name Here} unique?
7. Is this {Entity Attribute Name Here} selected from a standard list of values?

Entity Mastering and Business Ownership Questions:
1. What is the list of business areas that are affected by this information model?
2. Does {Entity Name Here} get mastered by this {Business Area}?
3. Does {Entity Name Here} get used by this {Business Area}?

Wednesday, March 15, 2006

How To: Build a Change Log into Your Database

How do you track all changes to a table within a database? Easy. Build a corresponding history table that mirrors the table in question, with the addition of a datetime column in the primary key and an action column to identify the action that got the row into the history table (Insert, Update, Delete). The history table is maintained by a trigger on the master table.

Example:
Person
-------------
PersonID (PK)
FirstName
LastName
CreatedBy
CreatedDateTime
UpdatedBy
UpdatedDateTime


PersonHistory
-------------
PersonID, RecordedDateTime (PK)
FirstName
LastName
CreatedBy
CreatedDateTime
UpdatedBy
UpdatedDateTime
ActionType
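
A hedged sketch of such a trigger on SQL Server, following the example tables above (the `deleted` pseudo-table holds the prior row values on UPDATE and DELETE; inserts can be logged the same way from the `inserted` table):

```sql
-- Sketch only: logs the prior row values on every update or delete.
CREATE TRIGGER trg_Person_History ON Person
AFTER UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO PersonHistory
        (PersonID, RecordedDateTime, FirstName, LastName,
         CreatedBy, CreatedDateTime, UpdatedBy, UpdatedDateTime, ActionType)
    SELECT d.PersonID, GETDATE(), d.FirstName, d.LastName,
           d.CreatedBy, d.CreatedDateTime, d.UpdatedBy, d.UpdatedDateTime,
           -- rows exist in both inserted and deleted only for an UPDATE
           CASE WHEN EXISTS (SELECT 1 FROM inserted) THEN 'U' ELSE 'D' END
    FROM deleted d;
END
```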

Saturday, March 11, 2006

Logical Data Modeling: An Introduction

Data Modeling In General
It is worth noting that the purpose of creating a normalized logical data model is to accurately document the business entities and the relationships between them in a detailed model. The value of this logical model is twofold: one, the business owners gain a detailed comprehension of their own business information; two, this business knowledge transfers to the developers, equipping them to accurately build the system to the exact business needs. They will also create an optimized physical data model (schema) for the system based on this logical model. Therefore it is critical that you interview the appropriate knowledgeable business people and document the business as much as you can. You will run across business areas in which the business may not be well defined. It will then be your responsibility to understand that area as much as you can, model it accurately, and get approval on the design from the business owners.

Meta Modeling: A Warning
DO NOT fall into the temptation to meta model a business area that is not well understood by the business owners in an attempt to avoid your due diligence. This includes throwing many-to-many relationships into the model to solve problems where the relationship between entities is unclear. This leads to complications in development and results in a system that is hard to maintain and use and that raises the cost of ownership. Think like a lawyer. The point is to model the business accurately and provide a natural means to describe and enforce the definitions and relationships of the business entities that need to be managed. Only after you have exhausted all avenues of research should you use a meta model. But isolate that meta model so it applies specifically to the area requiring that level of flexibility. Then the meta model will be your friend and not your enemy.

Step 1: Use Case Modeling
The first thing you need to do is document the real-world scenarios of the business areas you need to model. This should be detailed in proper sentences to work out the nouns and verbs. The nouns and verbs will be your key to discovering the data model. Of course there are many other benefits as well: classifying your users, identifying external systems and data dependencies, identifying user interfaces, discovering required reports, understanding business processes, and finding the scope of the system you are to build.

Step 2: Extended Relational Analysis
Data modeling using the rules of normalization (http://www.datamodel.org/NormalizationRules.html) is not the most natural and easy way to model a system. This is proven by the very fact that many people don’t truly understand how to normalize a data model. But there is good news: a different approach, a technique called Extended Relational Analysis (ERA), makes it child’s play to create a normalized data model. I learned it 13 years ago and it has been critical to my career, so I highly recommend it. This doesn’t mean that you don’t need to understand the normalization rules, but it does mean that you don’t have to memorize them into a daily magical chant so you don’t forget. Think of ERA as a way of thinking and a means to organize your thoughts rather than a software tool.

The ERA technique is broken up into three areas: Entity Analysis, Relationship Analysis, and Attribute Analysis. Entity Analysis uses the nouns from the use case model to help define each entity the business is required to deal with. Relationship Analysis uses the verbs from the use case model to help define each relationship between business entities. Attribute Analysis uses the modifiers (adjectives and adverbs) to finish the definition of each business entity, or to create new ones missed in the other two analysis steps. There are many classes you can take to help you learn this modeling technique (http://www.era-sql.com/). It’s worth the expense, trust me. It will make your modeling efforts much easier, and the end result increases your chances of lowering the total cost of ownership of any system you build. Just do it.
For more detail see http://www.pmcomplete.com/BPM/HTML/bpm659v.asp.

Tuesday, February 21, 2006

Reasons for Partitioning Your Data

Partitioning splits data across multiple tables, databases, and/or database servers. There are two types: vertical and horizontal partitioning. Horizontal partitioning divides data across multiple tables based on rows that fall within the appropriate partition rules. Each partition has the same columns, but its own partition rule (Example: Between 01/01/2005-02/01/2005). Vertical partitioning splits the table definition into two or more tables based on the columns.

Here are some basic reasons why a system may need to have its data partitioned: Performance, Workflow, Security, & Change Tracking.


Performance: Volume
Performance issues due to data volume are the most common reason for partitioning tables. Data warehouses commonly use horizontal partitioning based on a clearly defined set of rules. If a row fits within a partition's rule set, it is inserted into that partition. When defining partitions, database designers look for partitions that are natural to the business; Example:

· Date Range: By Year, Month, or Day
· By Database Source: Database System X, Y, & Z
· By Collecting Point: Data is collected by Nth number of data collectors
· By Business Unit
· By User
· By User Location
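
On SQL Server 2005 and later, a date-range horizontal partition can be sketched with a partition function and scheme. This is a hedged sketch; the object names, boundary dates, and filegroup mapping are illustrative assumptions:

```sql
-- Sketch only: monthly range partitions on a date column.
CREATE PARTITION FUNCTION pfMonthly (DATETIME)
AS RANGE RIGHT FOR VALUES ('2005-01-01', '2005-02-01', '2005-03-01');

CREATE PARTITION SCHEME psMonthly
AS PARTITION pfMonthly ALL TO ([PRIMARY]);

-- Rows land in the partition whose date range they fall into.
CREATE TABLE WebLogPartitioned (
    UserID    INT      NOT NULL,
    EventDate DATETIME NOT NULL
) ON psMonthly (EventDate);
```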

Workflow: Status
Workflow partitioning is used in systems like document tracking and management services. It’s a horizontal partition that partitions based on Workflow Status Rules; Example:

· (Partition 1) Draft Incomplete, Draft Complete, Draft Cancelled, Draft Approved
· (Partition 2) Active, Suspended, Expired
· (Partition 3) Archived

WorkFlow: User Type
Partitioning based on workflow status may also be driven by the types of users involved. Contracts are the most commonly managed documents and are used by several different types of users. Example:

· Sales Department
· Legal Department
· Service Fulfillment Departments
· Accounting Department

Security: PII
Partitioning data based on levels of security risk is used to isolate PII (Personally Identifiable Information). Example:

· SSN
· Phone and Address
· Email
· Name
· Credit Cards and Accounts
· Passport ID


Security: Need To Know
Another type of information that may be partitioned for security reasons is information that requires a security clearance and/or a need to know.

· Medical History
· Sensitive Documents
· Trademark Secrets

Security partitions are usually horizontally partitioned using tables, databases, and/or servers. This type of partitioning lowers the security risk and increases the manageability of enforcing security requirements.


Change Tracking
Change tracking and history tracking partitions are very common in mission critical applications requiring full tracking of data changes within a database. These partitions are horizontal partitions with the addition of extra tracking attributes. These partitions are usually managed by a trigger on its corresponding master table. This trigger inserts a new row into the partition every time there is an update or delete. With this technique, it is easy to determine what changed when, and by whom.

Sunday, February 12, 2006

How To: Hierarchical Lookups Without a Cursor

Reporting on hierarchical (child/parent relationship) tables can be a huge time bottleneck when using an iterative cursor process. Here is a speedy way to look up hierarchical information.

Original Hierarchical Table
Traditionally, hierarchical tables self-reference their own primary key, giving it a parent role name. Example:

Person
PersonID (PK) , Name (Attributes) , ParentPersonID (Self Ref. Person.PersonID FK)
1 , John Doe , NULL
2 , Tom Doe , 1
3 , Jill Doe , 1
4 , Harry Doe , 2
5 , Jim Smith , NULL


Hierarchical Index Table
The key is building an index table that maps a parent to every child, grandchild, great-grandchild, etc., and assigns the generation each belongs to in the relationship. Example:

PersonHierarchyIndex
AncestoryID (Person.PersonID PK, FK), ChildID (Person.PersonID PK, FK), GenerationLevel
1 (John Doe), 1 (John Doe), 0
1 (John Doe), 2 (Tom Doe), 1
1 (John Doe), 3 (Jill Doe), 1
1 (John Doe), 4 (Harry Doe), 2
2 (Tom Doe), 2 (Tom Doe), 0
2 (Tom Doe), 4 (Harry Doe), 1
3 (Jill Doe), 3 (Jill Doe), 0
4 (Harry Doe), 4 (Harry Doe), 0
5 (Jim Smith), 5 (Jim Smith), 0


Index Usage
With the aid of the hierarchical index you will not need a cursor for your reports. Example:

-- Get all Progeny (Children, Grand Children, etc…)
SELECT Person.Name, Parent.Name AS ParentName, PersonHierarchyIndex.GenerationLevel
FROM PersonHierarchyIndex
JOIN Person
  ON Person.PersonID = PersonHierarchyIndex.ChildID
 AND PersonHierarchyIndex.AncestoryID = 1 -- (John Doe)
JOIN Person Parent
  ON Parent.PersonID = PersonHierarchyIndex.AncestoryID


Index Population
The following example uses a new feature in SQL Server 2005: we populate the index with a recursive query using a CTE (Common Table Expression). It is used to build the index at the time of creating or modifying the hierarchy. Most hierarchies are slowly changing domain data. With that in mind, it makes sense to take the cost of building the index when the domain changes rather than when selecting from the hierarchy to create your reports.

Use this function to then insert into the PersonHierarchyIndex table:

-- Recursive Query using Common Table Expression
CREATE FUNCTION dbo.BuildPersonHierarchyIndex () RETURNS TABLE
AS RETURN (
WITH AncestoryTree (PPID, PID, PersonID, ParentPersonID, DirectChild, GenerationLevel)
AS (SELECT PPID = Person.ParentPersonID, PID = Person.PersonID, Person.PersonID, Person.ParentPersonID, DirectChild = 1, GenerationLevel = 1
    FROM Person
    UNION ALL
    SELECT AncestoryTree.PPID, AncestoryTree.PID, Person.PersonID, Person.ParentPersonID, DirectChild = 0, GenerationLevel = AncestoryTree.GenerationLevel + 1
    FROM Person
    JOIN AncestoryTree
      ON AncestoryTree.ParentPersonID = Person.PersonID
    WHERE Person.PersonID <> Person.ParentPersonID)
SELECT PersonID = AncestoryTree.PID, AncestoryTree.ParentPersonID, AncestoryTree.DirectChild, AncestoryTree.GenerationLevel
FROM AncestoryTree
WHERE AncestoryTree.PPID IS NOT NULL
UNION ALL
SELECT PersonID, PersonID, 0, 0
FROM Person
-- OPTION (MAXRECURSION 200) -- Default is 100: make sure the recursion limit is set high enough
);
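
Refreshing the index after a hierarchy change could then look like this (a hedged sketch, assuming the PersonHierarchyIndex table above and the column order returned by the function):

```sql
-- Sketch only: rebuild the index whenever the hierarchy changes.
DELETE FROM PersonHierarchyIndex;

INSERT INTO PersonHierarchyIndex (AncestoryID, ChildID, GenerationLevel)
SELECT ParentPersonID,  -- the ancestor
       PersonID,        -- the child / descendant
       GenerationLevel
FROM dbo.BuildPersonHierarchyIndex();
```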

Friday, February 03, 2006

Best Practices for SQL Design Patterns

I’m a firm believer in creating symmetry through design patterns where it is applicable. Using good modeling techniques and naming conventions on data and process definitions will enable easier code generation. 80% of an application can be code generated based on templates that enforce design patterns. The remaining 20% may need to vary a little from the standards when it is too “costly” to do so. The following naming and coding suggestions have proved to be the most practical for me in starting any project.

Table Name Convention
In the past I‘ve seen several databases with difficult naming conventions that increased the cost of ownership due to confusion. The databases that have followed the suggestions below have lowered the cost of ownership by being easy to understand.

Pascal naming convention: The first letter of every word is capitalized.

Good Example: Person, PersonAddress, GBITax
Bad Example: personaddress, Person_Address, personAddress, tbl_PersonAddress

Do not pluralize the table name: It is best not to place an 'S' at the end of every table. It can be assumed that a table contains many entries.

Good Example: Person, Country, UserSignup, DetailRecord
Bad Example: Persons, DetailRecords

General rules for placing a prefix on tables: Prefixes can be useful for classifying tables into business areas when the number of tables becomes too unwieldy to deal with in a single database. Usually one should start asking whether to split the database into multiple databases when this starts to happen, but there are cases where this may not be the desired direction. In those cases you should start thinking about prefixes. You can use abbreviations as long as they are consistent.

Good Example: CatalogItem, UtilMessageLog, PromoRule
Bad Example: tblPerson

Do not use ‘_’ as a spacer: From a developer's and analyst's standpoint these little guys can be quite annoying. And with the Pascal naming convention, the use of ‘_’ for readability becomes redundant.

Column Name Convention
Pascal naming convention: The first letter of every word is capitalized.

Good Example: PersonID, FirstName, ZipCode, SSN
Bad Example: Personid, Firstname, lastName

Do not place the name of the data type in the column name: It is not a good idea to place the name of the data type in the column name. If you change the data type of the column, the column name becomes misleading, and changing the name of a column is full of troubling changes. There are a few exceptions: Date is one example.

Good Example: StartBatchDate, FirstName, USDAmount
Bad Example: ti_event_id, vcFirstName

Usage of Abbreviations: Abbreviations are good for reducing the size of a column name. One should always be consistent and only use abbreviations that are commonly understood. Remember that the name of the column should be self-descriptive. Avoid over-use of abbreviations which can make it difficult to understand and support. When in doubt spell it out.

Good Example: StartBatchDate, UserNum, SubscriptionDesc
Bad Example: ti_event_id, vcFirstName, SbDc, X

Schema Structure
Use Constraints:
It is sad to say that there are many databases in this world that do not use primary key (PK), foreign key (FK), and alternate key (AK) constraints. Unless there are technical reasons for not having these, you should always use them, if for nothing else, to ensure referential integrity. The use of data integrity constraints also makes the database design self-documenting (Example: schema extraction into Visio or Erwin.)

Name Your Indexes: If you name your indexes instead of having SQL Server generate the name dynamically, you will have an easier time altering those indexes in the future. You will explicitly know the name of the index rather than having to dynamically look up the index name to alter it. If you dynamically generate the name of the index, it will be different from environment to environment (Example Environments: Development, Test, & Production).
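
For example (the index name here is illustrative):

```sql
-- Explicitly named index: the same name in every environment.
CREATE NONCLUSTERED INDEX IX_Person_LastName
ON Person (LastName);
```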

Stored Procedure Naming Convention
Name Format: Don’t use SP_ as a prefix for your stored procedures. SP in SQL Server stands for System Procedure (Example: SP_HelpText). Also, if you use the SP_ prefix, SQL Server checks the master database first and then the application database, so it is a performance issue as well.

Good Example: GetPerson, ImportProductKey, CountPIDs, CommitPayments, GetProductReport
Bad Example: SP_GetPerson

Pascal naming convention: The first letter of every word is capitalized.

Good Example: GetPerson, ImportProductKey, CountPIDs, CommitPayments
Bad Example: Get_Person, IProductKey, COUNTPIDS

Examples of names: Get; Set; Ins (Insert); Upd (Update); Del (Delete); Import; Extract; Load; Transform; Export; Count; Process; Commit;

Abbreviation / Acronym dictionary: You should have a list of the most commonly used abbreviations. It’s not a hard and fast rule that a person can’t use an abbreviation that is not on the list. It is just a means to help create symmetry into the system. This list should be in a database design / development standards document.

Examples:
Abbreviation - Name
Num - Number
Ind - Indicator
Dtm - Date and Time (Millisecond)
Id - Identifier
Cnt - Count
Amt - Amount
Acct - Account
USD - United States Dollar Currency Code

Stored Procedure Argument Naming Convention
This is where coding preferences really start to vary from developer to developer. The concepts here are just helpful tips that I have found very useful as a SQL Developer.

Argument Prefix: @Arg. Looking through a complex stored procedure and attempting to quickly find where the stored procedure arguments are used can be time consuming. The use of the @Arg prefix eases this issue. I’ve avoided lots of coding mistakes by having done so.

Example: GetPerson(@ArgPersonID int)

Coding Stored Procedures
The RETURN statement: Avoid using the RETURN statement to return any values other than error codes, if even that. Use OUTPUT parameters where possible for returning values. (This is for stored procedures. Functions are a different beast altogether.)

SET NOCOUNT ON: Place SET NOCOUNT ON at the beginning of every stored procedure to reduce the noise coming back to Query Analyzer.

Error Trapping: Always check for errors at every statement and log these errors to a MessageLog table. Every stored procedure must do this.
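
A hedged sketch of what this can look like inside a stored procedure body, using @@ERROR (the MessageLog table and its columns are illustrative assumptions, not a fixed schema):

```sql
-- Sketch only: a fragment from inside a stored procedure.
-- Check @@ERROR after each statement and log failures.
DECLARE @Err INT;

UPDATE Person SET LastName = 'Doe' WHERE PersonID = 1;
SET @Err = @@ERROR;
IF @Err <> 0
BEGIN
    INSERT INTO MessageLog (ProcName, ErrorNumber, LoggedDateTime)
    VALUES ('UpdPerson', @Err, GETDATE());
    RETURN @Err;  -- surface the error code to the caller
END
```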

Transactions: K.I.S.S. is the general rule here. Keeping it simple will avoid some of the pitfalls many fall into with complex rollback designs. Always think of a unit of work in which several processes must live and die together. If any process within that unit of work fails, then the unit of work should be rolled back. If a unit of work is extensive, it is a good idea to break it up into multiple units of work, with each unit of work isolated in its own stored procedure. That stored procedure can make multiple stored procedure calls, but try to keep the design flat so as not to have a stored procedure that calls a stored procedure that calls a stored procedure, etc. Avoid combining units of work into one gigantic transaction. People from all walks of life will walk over to your desk and slap you with a wet noodle.

Avoid Deadlocks: To avoid deadlocks, get in the habit of accessing tables in the same order throughout your system. This will alleviate the most common cause of deadlocks.

InLine Code Documentation: Document the “You Know What” out of the stored procedure within the code. No excuses or exceptions here.

Stored Procedure Code Generator: This is a very useful tool for creating fundamental (basic) insert, update, delete, get, and set stored procedures. I’ve called them fundamental stored procedures as they are generated per table. These stored procedures are mainly used by the UI. I’ve found that generating these stored procedures increases symmetry and stability and reduces human error and deviation. This kind of tool can be used to generate 80% of all stored procedures where a UI is needed. The other 20% are complex stored procedures that need more LTC (Loving Tender Care) from the developers.

Once you’ve generated these stored procedures, you’re better off maintaining them manually, though some generators allow for “re-entry,” in which manual code changes are kept while regenerating the rest of the stored procedure.

I do have a stored procedure code generator that I’ve created over the years. It needs a little bit of updating, but works fine as-is and can save lots of time and money. It generates very clean and supportable code. There are many code generators available; I created my own because I like total control over my tools, and I had it before many code generators existed.