Data warehousing is a system which is used for reporting purpose as well as data analysis purpose where data is coming from multiple heterogeneous sources whether it is oracle, sqlserver, PostgreSQL, simple excel sheet. Data warehousing is specially used for reporting historical data. Data warehousing is core component of Business Intelligence (BI).In Data warehouse there is one central mechanism called as repository using which one can fetch the historical reports. In Data warehouse user can store current data as well as historical data on single place called as repository. Data warehouse provides generalized data and consolidated data in multidimensional view. Data warehouse provides Online Analytical tools to analyze the data and reporting purpose. I would like to start this article with definition of data warehouse in simple words.
What you will find in this article?
1.Definition of Data warehouse
2.Characteristics of data warehouse
3.OLAP vs OLTP
4.Types of Data warehouse
Definition of data warehouse :
Data warehousing is subject oriented, time variant, non-volatile collection of data which is used for creation of transactional reports as well as historical reports.
Characteristics of Data warehouse:
Data warehouse is first invented by bill Inmon in 1990
Following table shows the difference between Data warehouse and Transactional system :
Data warehouse | Transactional System |
Data warehouse stores complex and general form of the data. | Transactional system stores the updated daily transactions, workloads etc. |
Data warehouse stores historical information | Transactional System Contains current data of organization |
Data warehouse database generally have only read only access means user can only select the data | Operational or Transactional database have insert, update privileges as the data processing and updating needed. |
Data warehousing requires data cleaning, data validation and data consolidation | Transactional system requires parallel processing of the data, concurrency control. Data consolidation is less required in transactional system as compare to OLAP database. |
Data Mart is a simplest set of Data warehouse which is used to focus on single functional area of the business. We can say Dat
Data Mart is simply a subset of Organization’s Data warehouse.
OLAP is technology used in many Business Intelligence applications which includes complex analytical calculations.OLAP is used for complex calculations, Trends Analysis, sophisticated data modeling.OLAP database is stored in multidimensional database model.OLAP system contains less number of transactions but complex calculations like aggregation- Sum,count,average,min,max etc..
The Aggregated data in OLAP system must be in months,quarters,years,weeks etc.. The key purpose to use OLAP system is to reduce the query response time and increase the effectiveness of reporting.If these aggregated calculations are already stored in repository and if user wants fast access of data then user can use OLAP system.OLTP database stores aggregated historical data in multidimensional schema.
Real Example :
If Company head wants information of Resources salary in year 2000.
In spite of using the transactional system we will use OLAP system here where aggregated data of year 2000 for Resources is already present.
OLTP system is known as large number of small daily transactions like insert, update and delete. Operational database is known as OTLP system. OLTP system provides fast query processing as well as it is also responsible to provide data integrity and data consistency. The actual effectiveness of OLTP is measured in number of Transactions per second. OLTP normally contains current data and data normalization is used properly in OLTP system.
Real Example :
If Company head wants transactional report of all Employees In – Out time.
As Company head wants daily report of in-out time we need to provide it using OLTP system. We need to schedule report on daily basis using OLTP system.
Following Table shows difference between OLAP and OLTP system:
OLAP | OLTP |
It stores only historical data and historical data processing is done. | It involves daily processing of data |
OLAP System is used by higher management like managers , analysts , executives,CEO,CFOs | OLTP system used by DBAs, Database Professionals ,Programmers for applying business logic. |
OLAP is used to analyze the business | OLTP is used to run the business |
OLAP is very huge database so lot of indexes are used for fast data processing | OLTP uses less indexing as data is less here |
OLAP uses Star-Schema, Snowflakes schema or Fact-Dimensions | OLTP uses Entity Relations |
OLAP contains historical data | OLTP contains Current data |
OLAP has less joins and in de-normalized form database | OLTP has large no of joins and in normalized form |
OLAP system has aggregated multidimensional data | OLTP has not aggregated data. |
OLAP system gives summarized consolidated data | OLTP system gives data in detailed format |
OLAP database size is 100 GB to 100 TB | OLTP database size is 100 MB to 100 GB |
CLICK HERE TO CHECK OUT COMPLEX SQL QUERIES
Hope Everyone likes this article. If you like this article don’t forget to comment on comment section.
In my previous articles I have given the roles and responsibilities of L1,L2 and L3…
In my previous articles i have given the hierarchy of production support in real company…
In this article i would like to provide information about production support organization structure or…
In my previous article I have given roles for L1 and L2 support engineer with…
I have started this new series of how to become application support engineer. This article…
In this series we are starting with Roles and responsibilities of L1 support engineer .…