
Wednesday, January 20, 2010

Distributed Warehouse data model

The heart of any data warehouse is its database, where all the information is stored. Most traditional data warehouses use one of the relational products for this purpose. These products can manage extremely large amounts of data, even hundreds of terabytes. Mainframe relational databases such as DB2 are used for some of the world's largest data warehouses. Universal data servers such as those from Oracle or Informix may be a good choice for medium-sized warehouses because they can manage a variety of data types. Multidimensional databases are becoming increasingly popular, but they limit the size of a warehouse to less than 5 gigabytes.
In relational storage systems, the attributes of a tuple are placed contiguously in storage. With this row store architecture, a single disk write suffices to push all of the fields of a single record out to disk, hence high-performance writes are achieved. A DBMS with a row store architecture is called a write-optimized system. In contrast, systems oriented toward querying a large amount of data should be read-optimized. Data warehouses represent one class of read-optimized systems, in which a bulk load of new data is performed periodically, followed by a relatively long period of ad hoc queries. In such environments, a column store architecture, in which the values for each column are stored contiguously, should be more efficient. With a column store architecture, a DBMS need only read the values of the columns required for processing a given query and can avoid bringing irrelevant attributes into memory. In data warehouse environments, where typical queries involve aggregates performed over a large number of data items, a column store has a sizeable performance advantage.
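
The difference between the two layouts can be illustrated with a small in-memory sketch. This is only a toy model, not how any particular DBMS is implemented: the sample sales records, the field names, and the helper functions below are all hypothetical, and "fields touched" and "writes" are simple counters standing in for disk I/O.

```python
from array import array

# Hypothetical sample records: (order_id, region, quantity, price).
rows = [
    (1, "north", 3, 9.50),
    (2, "south", 1, 20.00),
    (3, "north", 5, 4.25),
    (4, "east", 2, 12.00),
]

# Row store: all fields of one record are kept together.
row_store = list(rows)

# Column store: all values of one attribute are kept together.
column_store = {
    "order_id": array("q", (r[0] for r in rows)),
    "region": [r[1] for r in rows],
    "quantity": array("q", (r[2] for r in rows)),
    "price": array("d", (r[3] for r in rows)),
}


def total_quantity_row_store(store):
    """Sum the quantity field; every field of every record is read."""
    total = 0
    fields_touched = 0
    for record in store:
        fields_touched += len(record)  # the whole record comes into memory
        total += record[2]
    return total, fields_touched


def total_quantity_column_store(store):
    """Sum the quantity field; only the quantity column is read."""
    column = store["quantity"]
    return sum(column), len(column)


def append_row_store(store, record):
    """Append one record; modeled as a single contiguous write."""
    store.append(record)
    return 1


def append_column_store(store, record):
    """Append one record; modeled as one write per column."""
    for name, value in zip(("order_id", "region", "quantity", "price"), record):
        store[name].append(value)
    return len(record)


if __name__ == "__main__":
    print(total_quantity_row_store(row_store))
    print(total_quantity_column_store(column_store))
```

In this model the aggregate over the row layout touches every field of every record, while the column layout touches only the one column the query needs. Appending a record shows the opposite trade-off: one write for the row layout, one per column for the column layout.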
