Pages

Thursday, March 21, 2013

What is Dimensional modeling? Types Of Dimensional Modeling?


The process of identifying the dimensions required for analysis, defining the hierarchies and levels they contain, and making sure they conform. . This is different from the 3rd normal form, commonly used for transactional (OLTP) type systems. 

To understand dimensional data modeling, let's define some of the terms commonly used in this type of modeling: 

Dimension: A business perspective useful for analyzing data. A dimension usually contains one or more hierarchies that can be used to drill up or down to different levels of detail. 
Typical dimensions include 

  • product 
  • customer 
  • time
  • department
  • location 
  • channel.

Dimension data is typically collected at the lowest level of detail and then aggregated into higher level totals that are more useful for analysis. These natural rollups or aggregations within a dimension table are called hierarchies.

Hierarchy:
An organisation of data into a logical tree structure defining parent-child relationships between the levels in a dimension. Controls data consolidation and drill down paths.
A typical time dimension would have a hierarchy based on 

  • date
  • week
  • month
  • quarter 
  • year.

Attribute: A unique level within a dimension.
For example, Month is an attribute in the Time Dimension. 

Level:
A position in a hierarchy. For example, a time dimension might have a hierarchy that represents data at the    

  • Month
  • Quarter
  • Year
Star Schema:
In the star schema design, a single object (the fact table) sits in the middle and is radically connected to other surrounding objects (dimension lookup tables) like a star. A star schema can be simple or complex. A simple star consists of one fact table; a complex star can have more than one fact table.











The most natural way to model a data warehouse is as a star schema, only one join establishes the relationship between the fact table and any one of the dimension tables.
A star schema optimizes performance by keeping queries simple and providing fast response time. All the information about each level is stored in one row. it maintains de normalized data, performance wise good, all dimension tables directly maintains relation with fact table.  

Snowflake schema:

A type of star schema in which the dimension tables are partly or fully normalized. The main advantage of the snowflake schema is the improvement in query performance due to minimized disk storage requirements and joining smaller lookup tables. 
The main disadvantage of the snowflake schema is the additional maintenance efforts needed due to the increase number of lookup tables. Performance wise bad, some dimension tables not maintain relation with fact table those are maintain relation with another dimension table. Whether one uses a star or a snowflake largely depends on personal preference and business needs.

No comments:

Post a Comment