Dimensional modeling, also known as the Star Schema approach, is a set of techniques and concepts used in data warehouse design proposed by Ralph Kimball.
A dimensional model is more of a conceptual model than a logical, or physical data model, because it is applicable to any physical form, such as a relational or multidimensional database. Star and snowflake schemas are the most commonly used schema types for implementing a dimensional model in a relational database. The Snowflake schema normalizes the dimension tables, while the star schema denormalizes the dimension tables. The transition from dimensional to logical, or physical data model is straightforward and should follow well-defined guidelines.
The architecture contains the following main layers:
A dimensional model is driven by well-defined and known analytical requirements. Simplification is a valid approach when designing a star schema, but it reduces information and requires certain steps in planning. One drawback is that it cannot answer previously unconsidered analytical questions. Designing a dimensional model has always been a very difficult trade-off between simplicity and flexibility of analysis As a result, the gap between the model's analytical capabilities and future analytical requirements increases as the system becomes more complex.
The complexity of the ETL process to the (dimensional) core model depends on several factors, including the number and complexity of source systems, the complexity of integration and business transformation procedures. Loading dimension tables is usually easier than loading fact tables because there are fewer source tables to consider.
ETL complexity for fact tables is often very high, typically requiring transformations and consolidations of many source tables. Star schemas are completely different from typical OLTP data models, so restructuring the data model can be very complex and often requires multiple steps and intermediate tables are required.
Different SCD types can be implemented to meet different historization requirements. The most common types are:
The SCD2 dimension table has two additional timestamps that indicate record validity. While the relational and data vault model approaches assign a surrogate key to each entity, the star/snowflake schema assigns a surrogate key to each new version of the entity. Surrogate keys also contain information about validity information, so you can join fact and dimension tables without considering the timestamp attribute. As a result, join operations are more intuitive to users and handled more efficiently by the database. Having different primary keys for each version has the drawback of the so-called "ripple effect". Each new version of the parent "triggers" a new version in all children, potentially leading to an "explosion" of the children.
Accelerate and automate your analytical data workflow with comprehensive features that biGENIUS-X offers.