In my previous article I have given the information about Snowflake schema with real industry examples. In this article I would like to give more information about Star Schema with real industry example. Snowflake Schema is the expansion of star schema in which each star explodes into other points. In a snowflake schema, that dimensional table is standardized into numerous lookup tables, each of which represents a level in the dimensional hierarchy. But in star schema, each dimension is given by single dimension table.
For instance, the Time Dimension has two distinct hierarchies:
- Year– Month– Day
- Week – Day
This schema is frequently used to create or develop dimensional data marts. One or more fact tables index any number of dimensional tables are included. A necessary cause of the snowflake schema is the star schema. Additionally, it handles simple queries well.
Why Star Schema is Important?
The simplicity of a star schema is one of the key factors contributing to its acceptance as the industry standard best practice for data organizing. Compared to the traditional source system models, the strategy is more simpler for the business user to
comprehend. Businesspeople can quickly navigate through the design and make sense of it because the content is arranged into sensible business areas. Star schemas have really been used inadvertently by Excel PivotTable users for years. Additionally, this enables the organization to take part in the debate around data modelling. In essence, the data represents how complicated the firm and its procedures
are.
What is Star Schema Architecture ?
The simplest data warehouse design is the star schema architecture. The fact table and dimension tables, which enable browsing through particular categories, summarizing, drilling down, and establishing criteria, are the two fundamental components of a star schema.
Fact Table :
The fact table is not a standard relational database table because it has been deliberately de-normalized to speed up query processing. The fact table normally has records that are ready for ad hoc queries to explore. Due to the time-varying nature of a data warehouse environment, records in the fact table are frequently referred to as events. All columns, with the exception of those containing scores or numeric values, make up the fact table’s main key (like QUANTITY, TURNOVER, exact invoice date and time).
Sales Fact Table – All information about sales orders is contained in the sales fact table.
Order Fact Table – The table may occasionally be divided into open orders and historical
orders.
Budget Fact Table – typically loaded once at the end of the year, organized by month.
Forecast Fact Table – typically loaded daily, weekly, or monthly and organized by month.
Inventory Fact Table – report stocks; typically updated every day.
Dimension Table :
One or more dimension tables include nearly all of the data in a standard fact table. The primary goal of maintaining Dimension Tables is to make it simple and quick to browse through the categories. The composite primary key of the fact table is created by connecting the individual primary keys of each of the dimension tables. For each dimension, there is only one de-normalized table in a star schema design.
In a data warehouse, typical dimension tables include:
- Customer Dimension Table
- Star Dimension Table
- Products Dimension Table
- KAM (Key account managers) Dimension Table
- Sales Office Dimension Table.
What are Characteristics of Star Schema ?
- Makes a denormalized database that can respond to queries quickly.
- Gives a flexible design that is simple to modify or expand upon as the database
develops and the development cycle continues. - Gives a design comparison to how end users normally think about and use the
data. - Makes metadata less complex for both developers and end users.
STAR SCHEMA vs SNOWFLAKE SCHEMA
The dimension tables of Star schema do not have any foreign keys. Therefore, neither the dimension tables nor their “sub-dimension tables” refer to any other tables. Since some information may be repeated in the dimension tables, they are typically de-normalized. This makes it possible to tune star schema databases for read and query efficiency along particular dimensions.
In that it has a single fact table and numerous dimension tables, a database with a snowflake schema is comparable to one with a star schema. A snowflake design, however, allows for the possibility of foreign keys connecting each dimension table to other dimension tables.
Therefore, from each dimension table, there is no further branching in a star schema. However, with a snowflake schema, each branch may have further branches, much like a snowflake with successively smaller branches growing out of a central core. Though not always totally normalized, a snowflake schema is also more normalized than a star schema.
An employee ID is present in the fact table’s sales record as a straightforward illustration. This employee ID is linked to a table of employee dimensions that includes details like the branch office, first and last name, and gender.
The record would include details like “male” and “Texas Office” in a star schema. This information would also be repeated in other rows for staff members of the same gender or from the same branch office.
What are Advantages of Star Schema ?
- Simple Design.
- In comparison to other join logic required to retrieve data from a transactional schema that is well normalized, star schema join logic is extremely simple.
- Easy data aggregation and easy integration with data cubes and OLAP.
- Cube Processer is faster.
- De-normalized data structure also run fast.
What are Disadvantages of Star Schema?
- Slower queries.
- Very limited flexibility on non-dimensional data.
- Star schemas do not support many-to-many links within business organizations.
- Larger Storage.
- Not flexible in normalized data model.
I hope you like this article. If you like this article or if you have any issues with the same kindly comment in comments section.