Al's BrainDump

RSS

Navigation





Quick Search
»
Advanced Search »

PoweredBy
2005 DW 10 Best Practises

When bulk inserting every column should be a var/char to ensure that all of a file can be loaded then any strange data handled in sql. If a stg table col is numeric and a char comes in for instance the bulk insert will break and the file will not load at all.

Consider partitioning fact tables that are 50 to 100GB or larger.

Build clustered index on the date key of the fact table

If you load data in a batch window then use the options ALLOW_ROW_LOCKS = OFF and ALLOW_PAGE_LOCKS = OFF for the clustered index on the fact table. This helps speed up table scan operations during query time and helps avoid excessive locking activity during large updates.

Build nonclustered indexes for each foreign key.

Use integer surrogate keys for all dimensions, other than the Date dimension. Use the smallest possible integer for the dimension surrogate keys. This helps to keep fact table narrow.

Use a meaningful date key of integer type derivable from the DATETIME data type (for example: 20060215).

Don't use a surrogate Key for the Date dimension:- Easy to write queries that put a WHERE clause on this column, which will allow partition elimination of the fact table.

Build a clustered index on the business key (not the surrogate key) for each dimension table:- Supports fast lookup while loading the fact table, Supports fast lookup of existing dimension rows to manage Type 2 changing dimensions.

Build a nonclustered primary key index on the dimension key (surrogate key) of each dimension table.

Build nonclustered indexes on other frequently searched dimension columns.

Avoid partitioning dimension tables.

Avoid enforcing foreign key relationships between the fact and the dimension tables, to allow faster data loads. You can create foreign key constraints with NOCHECK to document the relationships; but don’t enforce them. Ensure data integrity though Transform Lookups, or perform the data integrity checks at the source of the data.

Manually update statistics on large fact tables after loading new data. If table very big may not have time for full scan so maybe better update only index statistics.

Updating statistics on small dimension tables after incremental loads may also help performance. Use FULLSCAN option on update statistics on dimension tables for more accurate query plans.

ScrewTurn Wiki version 3.0.5.600. Some of the icons created by FamFamFam.