Knowledge (XXG)

Data vault modeling

Source 📝

707:. These consist of metadata linking them to their parent hub or link, metadata describing the origin of the association and attributes, as well as a timeline with start and end dates for the attribute. Where the hubs and links provide the structure of the model, the satellites provide the "meat" of the model, the context for the business processes that are captured in hubs and links. These attributes are stored both with regards to the details of the matter as well as the timeline and can range from quite complex (all of the fields describing a client's complete profile) to quite simple (a satellite on a link with only a valid-indicator and a timeline). 319:. Both techniques have issues when dealing with changes in the systems feeding the data warehouse. For conformed dimensions you also have to cleanse data (to conform it) and this is undesirable in a number of cases since this inevitably will lose information. Data vault is designed to avoid or minimize the impact of those issues, by moving them to areas of the data warehouse that are outside the historical storage area (cleansing is done in the data marts) and by separating the structural items (business keys and the associations between the business keys) from the descriptive attributes. 595:
to make them unique. Let's say, "unique number". The latter key is not a real business key, so it is no hub. However, we do need to use it in order to guarantee the correct granularity for the link. In this case, we do not use a hub with surrogate key, but add the business key "unique number" itself to the link. This is done only when there is no possibility of ever using the business key for another link or as key for attributes in a satellite. This construct has been called a 'peg-legged link' by Dan Linstedt on his (now defunct) forum.
969:). First you have to load all the hubs, creating surrogate IDs for any new business keys. Having done that, you can now resolve all business keys to surrogate ID's if you query the hub. The second step is to resolve the links between hubs and create surrogate IDs for any new associations. At the same time, you can also create all satellites that are attached to hubs, since you can resolve the key to a surrogate ID. Once you have created all the new links with their surrogate keys, you can add the satellites to all the links. 591:
and transport company. This could be a link called "Delivery". Referencing a link in another link is considered a bad practice, because it introduces dependencies between links that make parallel loading more difficult. Since a link to another link is the same as a new link with the hubs from the other link, in these cases creating the links without referencing other links is the preferred solution (see the section on loading practices for more information).
122: 25: 727:
whether this is the primary driver, the name of the insurance company for this car and person (could also be a separate hub) and a summary of the number of accidents involving this combination of vehicle and driver. Also included is a reference to a lookup- or reference table called R_RISK_CATEGORY containing the codes for the risk category in which this relationship is deemed to fall.
235: 450:
of a historical database. If you use these keys as the backbone of a data warehouse, you can organize the rest of the data around them. This means that choosing the correct keys for the hubs is of prime importance for the stability of your model. The keys are stored in tables with a few constraints on the structure. These key-tables are called hubs.
594:
Links sometimes link hubs to information that is not by itself enough to construct a hub. This occurs when one of the business keys associated by the link is not a real business key. As an example, take an order form with "order number" as key, and order lines that are keyed with a semi-random number
590:
Links can link to other links, to deal with changes in granularity (for instance, adding a new key to a database table would change the grain of the database table). For instance, if you have an association between customer and address, you could add a reference to a link between the hubs for product
976:
The ETL is quite straightforward and lends itself to easy automation or templating. Problems occur only with links relating to other links, because resolving the business keys in the link only leads to another link that has to be resolved as well. Due to the equivalence of this situation with a link
726:
This is an example for a satellite on the drivers-link between the hubs for cars and persons, called "Driver insurance" (S_DRIVER_INSURANCE). This satellite contains attributes that are specific to the insurance of the relationship between the car and the person driving it, for instance an indicator
972:
Since the hubs are not joined to each other except through links, you can load all the hubs in parallel. Since links are not attached directly to each other, you can load all the links in parallel as well. Since satellites can be attached only to hubs and links, you can also load these in parallel.
449:
The business keys and their associations are structural attributes, forming the skeleton of the data model. The data vault method has as one of its main axioms that real business keys only change when the business changes and are therefore the most stable elements from which to derive the structure
334:
Data vault's philosophy is that all data is relevant data, even if it is not in line with established definitions and business rules. If data are not conforming to these definitions and rules then that is a problem for the business, not the data warehouse. The determination of data being "wrong" is
1011:
For this purpose, the hubs and related satellites on those hubs can be considered as dimensions and the links and related satellites on those links can be viewed as fact tables in a dimensional model. This enables you to quickly prototype a dimensional model out of a data vault model using views.
1051:
Teams using the data vault methodology should readily adapt to the repeatable, consistent, and measurable projects that are expected at CMMI Level 5. Data that flow through the EDW data vault system will begin to follow the TQM life-cycle that has long been missing from BI (business intelligence)
903:
Reference tables are referenced from Satellites, but never bound with physical foreign keys. There is no prescribed structure for reference tables: use what works best in your specific case, ranging from simple lookup tables to small data vaults or even stars. They can be historical or have no
394:
Data vault modeling was originally conceived by Dan Linstedt in the 1990s and was released in 2000 as a public domain modeling method. In a series of five articles in The Data Administration Newsletter the basic rules of the Data Vault method are expanded and explained. These contain a general
445:
Data vault attempts to solve the problem of dealing with change in the environment by separating the business keys (that do not mutate as often, because they uniquely identify a business entity) and the associations between those business keys, from the descriptive attributes of those keys.
413:
According to Dan Linstedt, the Data Model is inspired by (or patterned off) a simplistic view of neurons, dendrites, and synapses – where neurons are associated with Hubs and Hub Satellites, Links are dendrites (vectors of information), and other Links are synapses (vectors in the opposite
598:
Links contain the surrogate keys for the hubs that are linked, their own surrogate key for the link and metadata describing the origin of the association. The descriptive attributes for the information on the association (such as the time, price or amount) are stored in structures called
335:
an interpretation of the data that stems from a particular point of view that may not be valid for everyone, or at every point in time. Therefore the data vault must capture all data and only when reporting or extracting data from the data vault is the data being interpreted.
373:
in Data Vault 2.0) and a presentation layer (data mart), and handling of data quality services and master data services), and the model. Within the methodology, the implementation of best practices is defined. Data Vault 2.0 has a focus on including new components such as
710:
Usually the attributes are grouped in satellites by source system. However, descriptive attributes such as size, cost, speed, amount or color can change at different rates, so you can also split these attributes up in different satellites based on their rate of change.
326:"The Data Vault Model is a detail oriented, historical tracking and uniquely linked set of normalized tables that support one or more functional areas of business. It is a hybrid approach encompassing the best of breed between 3rd normal form (3NF) and 1015:
Note that while it is relatively straightforward to move data from a data vault model to a (cleansed) dimensional model, the reverse is not as easy, given the denormalized nature of the dimensional model's fact tables, fundamentally different to the
382:- and also focuses on the performance of the existing model. The old specification (documented here for the most part) is highly focused on data vault modeling. It is documented in the book: Building a Scalable Data Warehouse with Data Vault 2.0. 912:
This is an example of a reference table with risk categories for drivers of vehicles. It can be referenced from any satellite in the data vault. For now we reference it from satellite S_DRIVER_INSURANCE. The reference table is R_RISK_CATEGORY.
342:
requirements in the USA and similar measures in Europe this is a relevant topic for many business intelligence implementations, hence the focus of any data vault implementation is complete traceability and auditability of all information.
437:. The data vault model actually provides a "graph based" model with hubs and relationships in a relational database world. In this manner, the developer can use SQL to get at graph-based relationships with sub-second responses. 422:
ratings. They can be created and dropped on the fly in accordance with learning about relationships that currently don't exist. The model can be automatically morphed, adapted, and adjusted as it is used and fed new structures.
887:
Reference tables are a normal part of a healthy data vault model. They are there to prevent redundant storage of simple reference data that is referenced a lot. More formally, Dan Linstedt defines reference data as follows:
582:
Associations or transactions between business keys (relating for instance the hubs for customer and product with each other through the purchase transaction) are modeled using link tables. These tables are basically
1836:
Dirk Lerner: Data Vault für agile Data-Warehouse-Architekturen. In: Stephan Trahasch, Michael Zimmer (Hrsg.): Agile Business Intelligence. Theorie und Praxis. dpunkt.verlag, Heidelberg 2016, ISBN 978-3-86490-312-0,
466:. The descriptive attributes for the information on the Hub (such as the description for the key, possibly in multiple languages) are stored in structures called Satellite tables which will be discussed below. 714:
All the tables contain metadata, minimally describing at least the source system and the date on which this entry became valid, giving a complete historical view of the data as it enters the data warehouse.
385:
It is necessary to evolve the specification to include the new components, along with the best practices in order to keep the EDW and BI systems current with the needs and desires of today's businesses.
404:
has arrived on the scene as of 2013 and brings to the table Big Data, NoSQL, unstructured, semi-structured seamless integration, along with methodology, architecture, and implementation best practices.
904:
history, but it is recommended that you stick to the natural keys and not create surrogate keys in that case. Normally, data vaults have a lot of reference tables, just like any other Data Warehouse.
430:
of the Enterprise in the sense that it describes the terms in the domain of the enterprise (Hubs) and the relationships among them (Links), adding descriptive attributes (Satellites) where necessary.
179:" where data that does not conform to the definitions is removed or "cleansed". A data vault enterprise data warehouse provides both; a single version of facts and a single source of truth. 167:
Data vault modeling makes no distinction between good and bad data ("bad" meaning not conforming to business rules). This is summarized in the statement that a data vault stores "
993: 893:
Any information deemed necessary to resolve descriptions from codes, or to translate keys in to (sic) a consistent manner. Many of these fields are "descriptive" in nature and
338:
Another issue to which data vault is a response is that more and more there is a need for complete auditability and traceability of all the data in the data warehouse. Due to
1846:
Dani Schnider, Claus Jordan u. a.: Data Warehouse Blueprints. Business Intelligence in der Praxis. Hanser, München 2016, ISBN 978-3-446-45075-2, S. 35–37, 161–173.
1840:
Daniel Linstedt: Super Charge Your Data Warehouse. Invaluable Data Modeling Rules to Implement Your Data Vault. Linstedt, Saint Albans, Vermont 2011, ISBN 978-1-4637-7868-2.
879:(*) at least one attribute is mandatory. (**) sequence number becomes mandatory if it is needed to enforce uniqueness for multiple valid satellites on the same hub or link. 160:
in a data vault must be accompanied by record source and load date attributes, enabling an auditor to trace values back to the source. The concept was published in 2000 by
1827:
John Giles: The Elephant in the Fridge. Guided Steps to Data Vault Success through Building Business-Centered Models. Technics, Basking Ridge 2019, ISBN 978-1-63462-489-3.
144:
coming in from multiple operational systems. It is also a method of looking at historical data that deals with issues such as auditing, tracing of data, loading speed and
988:
The data vault modelled layer is normally used to store data. It is not optimised for query performance, nor is it easy to query by the well-known query-tools such as
703:
The hubs and links form the structure of the model, but have no temporal attributes and hold no descriptive attributes. These are stored in separate tables called
1721: 1843:
Daniel Linstedt, Michael Olschimke: Building a Scalable Data Warehouse with Data Vault 2.0. Morgan Kaufmann, Waltham, Massachusetts 2016, ISBN 978-0-12-802510-9.
494:
A hub is not allowed to contain multiple business keys, except when two systems deliver the same business key but with collisions that have different meanings.
1048:
for build out and deployment. Data vault projects have a short, scope-controlled release cycle and should consist of a production release every 2 to 3 weeks.
1696: 213:
are well-suited for capturing changes that occur when a source system is changed or added, but are considered advanced techniques which require experienced
182:
The modeling method is designed to be resilient to change in the business environment where the data being stored is coming from, by explicitly separating
1883: 718:
An effectivity satellite is a satellite built on a link, "and record the time period when the corresponding link records start and end effectivity".
1769: 1745: 1539: 1520: 42: 897:
a specific state of the other more important information. As such, reference data lives in separate tables from the raw Data Vault tables
785:
This can happen if, for instance, you have a hub COURSE and the name of the course is an attribute but in several different languages.
2355: 611:
This is an example for a link-table between two hubs for cars (H_CAR) and persons (H_PERSON). The link is called "Driver" (L_DRIVER).
303:
modeling there are two well-known competing options for modeling the layer where the data are stored. Either you model according to
278: 108: 395:
overview, an overview of the components, a discussion about end dates and joins, link tables, and an article on loading practices.
245: 1824:
Patrick Cuba: The Data Vault Guru. A Pragmatic Guide on Building a Data Vault. Selbstverlag, ohne Ort 2020, ISBN 979-86-9130808-6.
89: 1830:
Kent Graziano: Better Data Modeling. An Introduction to Agile Data Engineering Using Data Vault 2.0. Data Warrior, Houston 2015.
61: 1833:
Hans Hultgren: Modeling the Agile Data Warehouse with Data Vault. Brighton Hamilton, Denver u. a. 2012, ISBN 978-0-615-72308-2.
2261: 1029: 427: 366: 354: 46: 68: 2315: 1876: 506: 2240: 1941: 977:
to multiple hubs, this difficulty can be avoided by remodeling such cases and this is in fact the recommended practice.
490:
optionally, you can also have metadata fields with information about manual updates (user/time) and the extraction date.
1729: 1966: 1137: 176: 145: 1117: – high-level sequence tasks used to design, develop and deploy a data warehouse or business intelligence system 260: 75: 1036:
Level 5 best practices. It includes multiple components of CMMI Level 5, and combines them with best practices from
2350: 2266: 1961: 1926: 194:
loading as much as possible, so that very large implementations can scale out without the need for major redesign.
1704: 1671: 398:
An alternative (and seldom used) name for the method is "Common Foundational Integration Modelling Architecture."
256: 35: 2384: 2219: 2119: 584: 358: 218: 57: 2140: 2135: 2093: 1869: 1691:
Cheat sheet reflecting the rules in v1.0.8 and additional clarification from the forums on the rules in v1.0.8.
1041: 962: 1219: 175:
as "all the data, all of the time") as opposed to the practice in other data warehouse methods of storing "a
2209: 370: 804:
Load End Date (enddate) for the validity of this combination of attribute values for parent key L_DRIVER_ID
2235: 2032: 2010: 168: 1458: 793:
Load Date (startdate) for the validity of this combination of attribute values for parent key L_DRIVER_ID
2327: 2194: 1936: 1114: 308: 202: 187: 1777: 1753: 2199: 2114: 1982: 1931: 1005: 997: 870:
An ID into a table with audit information, such as load time, duration of load, number of lines, etc.
687:
An ID into a table with audit information, such as load time, duration of load, number of lines, etc.
568:
An ID into a table with audit information, such as load time, duration of load, number of lines, etc.
291:
In its early days, Dan Linstedt referred to the modeling technique which was to become data vault as
2150: 2088: 1956: 415: 980:
Data is never deleted from the data vault, unless you have a technical error while loading data.
191: 82: 458:
Hubs contain a list of unique business keys with low propensity to change. Hubs also contain a
2062: 1535: 1516: 1045: 1017: 546:
The business key that drives this hub. Can be more than one field for a composite business key
419: 1550: 1245: 1004:
et al. Since these end-user computing tools expect or prefer their data to be contained in a
505:
This is an example for a hub-table containing cars, called "Car" (H_CAR). The driving key is
2245: 1951: 330:. The design is flexible, scalable, consistent and adaptable to the needs of the enterprise" 206: 125:
Simple data vault model with two hubs (blue), one link (green) and four satellites (yellow)
2145: 2109: 2048: 2000: 1069: 776: 434: 210: 1892: 1108: 487:
the record source, which can be used to see what system loaded each business key first.
339: 300: 214: 183: 157: 322:
Dan Linstedt, the creator of the method, describes the resulting database as follows:
2378: 2322: 2067: 1122: 474: 459: 350: 304: 2334: 2271: 2155: 1551:"The next generation EDW – Letting go of the idea of a single version of the truth" 1131: 316: 172: 161: 153: 121: 2204: 1946: 779:, to enforce uniqueness if there are several valid satellites for one parent key 481: 463: 327: 198: 149: 24: 414:
direction). By using a data mining set of algorithms, links can be scored with
2310: 2083: 1808:
Verhagen, K.; Vrijkorte, B. (June 10, 2008). "Relationeel versus Data Vault".
1096: 369:, etc..), the architecture (amongst others an input layer (data stage, called 312: 1799:
Ketelaars, M.W.A.M. (2005-11-25). "Datawarehouse-modelleren met Data Vault".
2214: 2057: 1992: 1102: 1037: 362: 140:
modeling method that is designed to provide long-term historical storage of
484:, the driver for this hub. The business key can consist of multiple fields. 1650: 1629: 1608: 1587: 1566: 2052: 1987: 1921: 375: 137: 1125: – American computer scientist and co-founder of the data warehouse 848:
The risk category for the driver. This is a reference to R_RISK_CATEGORY
764:(surrogate) primary key for the driver link, the parent of the satellite 1001: 859:
The recordsource of the information in this satellite when first loaded
1105: – System or repository of data stored in its natural/raw format 989: 263:. Statements consisting only of original research should be removed. 2027: 2022: 2017: 1861: 826:
The name of the insurance company for this vehicle and this driver
379: 120: 1857:
The homepage of Dan Linstedt, the inventor of Data Vault modeling
1493: 1064: 1033: 353:. The new specification consists of three pillars: methodology ( 141: 2292: 2176: 1903: 1865: 1044:(TQM), and SDLC. Particularly, it is focused on Scott Ambler's 965:
for updating a data vault model is fairly straightforward (see
815:
Indicator whether the driver is the primary driver for this car
661:
surrogate key for the person hub, the second anchor of the link
228: 18: 1856: 1278:
Building a scalable datawarehouse with data vault 2.0, p. xv
1269:
Building a scalable datawarehouse with data vault 2.0, p. 11
462:
for each Hub item and metadata describing the origin of the
1184:
Building a scalable datawarehouse with data vault 2.0, p. 6
1084: 750:
Sequence ID and surrogate key for the satellite on the link
648:
surrogate key for the car hub, the first anchor of the link
221:
models, but anchor models have a more normalized approach.
1079: 1074: 252: 837:
The number of accidents by this driver in this vehicle
674:
The recordsource of this association when first loaded
1530:
Thomas C. Hammergren; Alan R. Simon (February 2009).
994:
Oracle Business Intelligence Suite Enterprise Edition
477:, used to connect the other structures to this table. 1134: – Location where items are gathered before use 1127:
Pages displaying wikidata descriptions as a fallback
1119:
Pages displaying wikidata descriptions as a fallback
426:
Another view is that a data vault model provides an
2343: 2303: 2254: 2228: 2187: 2128: 2102: 2076: 2041: 1975: 1914: 433:Another way to think of a data vault model is as a 49:. Unsourced material may be challenged and removed. 497:Hubs should normally have at least one satellite. 1609:"Data Vault Series 3 – End Dates and Basic Joins" 1549:Ronald Damhof; Lidwine van As (August 25, 2008). 1220:"Rålager istället för ett strukturerat datalager" 469:The Hub contains at least the following fields: 1333:Data Vault Series 3 – End Dates and Basic Joins 557:The record source of this key when first loaded 324: 1354: 1352: 2356:Data warehousing products and their producers 1877: 1588:"Data Vault Series 2 – Data Vault Components" 1443: 1441: 1439: 148:to change as well as emphasizing the need to 8: 1426: 1424: 1567:"Data Vault Series 1 – Data Vault Overview" 1322:Data Vault Series 2 – Data Vault Components 2300: 2289: 2184: 2173: 1911: 1900: 1884: 1870: 1862: 1697:"Data Vault Modeling Specification v1.0.9" 1418:, page 61, why are business keys important 634:Sequence ID and surrogate key for the Link 293:common foundational warehouse architecture 1651:"Data Vault Series 5 – Loading Practices" 1532:Data Warehousing for Dummies, 2nd edition 1311:Data Vault Series 1 – Data Vault Overview 953:(*) at least one attribute is mandatory. 532:Sequence ID and surrogate key for the hub 297:common foundational modeling architecture 279:Learn how and when to remove this message 217:. Both data vaults and anchor models are 109:Learn how and when to remove this message 1448:Data Vault Modeling Specification v1.0.9 1111: – Centralized storage of knowledge 915: 729: 613: 511: 1722:"Data Vault Loading Specification v1.2" 1359:Data Vault Series 5 – Loading Practices 1154: 1028:The data vault methodology is based on 967:Data Vault Series 5 – Loading Practices 1672:"Data Vault Rules v1.0.8 Cheat Sheet" 1008:, a conversion is usually necessary. 7: 1803:(7). Array Publications B.V.: 36–40. 1657:. The Data Administration Newsletter 1636:. The Data Administration Newsletter 1615:. The Data Administration Newsletter 1594:. The Data Administration Newsletter 1573:. The Data Administration Newsletter 984:Data vault and dimensional modelling 47:adding citations to reliable sources 1630:"Data Vault Series 4 – Link Tables" 1431:Data Vault Forum, Standards section 1099: – American computer scientist 349:is the new specification. It is an 307:, with conformed dimensions and an 190:. Data vault is designed to enable 152:where all the data in the database 2241:MultiDimensional eXpressions (MDX) 1812:(4). Array Publications B.V.: 6–9. 944:A description of the risk category 14: 1746:"A short intro to #datavault 2.0" 1459:Effectivity Satellites - dbtvault 1344:Data Vault Series 4 – Link tables 1246:"Datamodeller för data warehouse" 1770:"Data Vault 2.0 Being Announced" 1513:Super Charge your Data Warehouse 1482:Super Charge your Data Warehouse 1470:Super Charge your Data Warehouse 1416:Super Charge your data warehouse 1404:Super Charge your Data Warehouse 1206:Super Charge your data warehouse 1194:Super Charge your data warehouse 1162:Super Charge your data warehouse 233: 23: 1511:Linstedt, Dan (December 2010). 1382:A short intro to #datavault 2.0 34:needs additional citations for 2262:Business intelligence software 2141:Extract, load, transform (ELT) 2136:Extract, transform, load (ETL) 1776:. Dan Linstedt. Archived from 1752:. Dan Linstedt. Archived from 1728:. Dan Linstedt. Archived from 1703:. Dan Linstedt. Archived from 1393:Data Vault 2.0 Being Announced 1300:A short intro to#datavault 2.0 933:The code for the risk category 1: 2210:Decision support system (DSS) 1447: 1430: 1173: 507:vehicle identification number 169:a single version of the facts 2236:Data Mining Extensions (DMX) 1481: 1469: 1415: 1403: 1392: 1370:Data Warehousing for Dummies 1369: 1205: 1193: 1161: 1060:Some examples of tools are: 587:tables, with some metadata. 311:, or you model according to 1997:Ensemble modeling patterns 1967:Single version of the truth 1288:The New Business Supermodel 1138:Agile Business Intelligence 603:which are discussed below. 409:Alternative interpretations 259:the claims made and adding 177:single version of the truth 2401: 2351:Comparison of OLAP servers 1406:, paragraph 5.20, page 110 1381: 1358: 1343: 1332: 1321: 1310: 1299: 1287: 966: 2299: 2288: 2220:Data warehouse automation 2183: 2172: 1910: 1905:Creating a data warehouse 1899: 1560:. Array Publications B.V. 1534:. John Wiley & Sons. 1484:, paragraph 8.0, page 149 1472:, paragraph 8.0, page 146 756:Recommended but optional 640:Recommended but optional 538:Recommended but optional 1810:Database Magazine (DB/M) 1801:Database Magazine (DB/M) 1558:Database Magazine (DB/M) 1042:total quality management 156:. This means that every 16:Database modeling method 2246:XML for Analysis (XMLA) 1433:, section 3.0 Hub Rules 1174:The next generation EDW 371:persistent staging area 2178:Using a data warehouse 2033:Operational data store 1793:Dutch language sources 1065:2150 Datavault Builder 901: 332: 225:History and philosophy 209:(3NF), data vault and 184:structural information 126: 2195:Business intelligence 1115:The Kimball lifecycle 890: 747:S_DRIVER_INSURANCE_ID 203:dimensional modelling 171:" (also expressed by 124: 58:"Data vault modeling" 2011:Focal point modeling 1983:Column-oriented DBMS 1932:Dimensional modeling 998:SAP Business Objects 205:) and the classical 43:improve this article 2316:Information factory 2089:Early-arriving fact 2006:Data vault modeling 1957:Reverse star schema 1707:on 30 November 2012 1670:Kunenborg, Ronald. 1290:, glossary, page 75 1020:of the data vault. 309:enterprise data bus 134:data vault modeling 2267:Reporting software 1681:. Grundsätzlich IT 941:RISK_CATEGORY_DESC 930:R_RISK_CATEGORY_CD 845:R_RISK_CATEGORY_CD 812:IND_PRIMARY_DRIVER 315:with the database 244:possibly contains 127: 2372: 2371: 2368: 2367: 2364: 2363: 2284: 2283: 2280: 2279: 2168: 2167: 2164: 2163: 2063:Sixth normal form 1655:Data Vault Series 1634:Data Vault Series 1613:Data Vault Series 1592:Data Vault Series 1571:Data Vault Series 1541:978-0-470-40747-9 1522:978-0-9866757-1-3 1070:Astera DW Builder 1046:agile methodology 1018:third normal form 1006:dimensional model 957:Loading practices 951: 950: 908:Reference example 877: 876: 823:INSURANCE_COMPANY 722:Satellite example 696: 695: 585:many-to-many join 575: 574: 289: 288: 281: 246:original research 186:from descriptive 119: 118: 111: 93: 2392: 2385:Data warehousing 2301: 2290: 2185: 2174: 1952:Snowflake schema 1912: 1901: 1886: 1879: 1872: 1863: 1813: 1804: 1788: 1786: 1785: 1764: 1762: 1761: 1740: 1738: 1737: 1716: 1714: 1712: 1701:Data Vault Forum 1690: 1688: 1686: 1679:Data Vault Rules 1676: 1666: 1664: 1662: 1645: 1643: 1641: 1624: 1622: 1620: 1603: 1601: 1599: 1582: 1580: 1578: 1561: 1555: 1545: 1526: 1515:. Dan Linstedt. 1497: 1491: 1485: 1479: 1473: 1467: 1461: 1456: 1450: 1445: 1434: 1428: 1419: 1413: 1407: 1401: 1395: 1390: 1384: 1379: 1373: 1367: 1361: 1356: 1347: 1341: 1335: 1330: 1324: 1319: 1313: 1308: 1302: 1297: 1291: 1285: 1279: 1276: 1270: 1267: 1261: 1260: 1258: 1257: 1241: 1235: 1234: 1232: 1231: 1215: 1209: 1203: 1197: 1191: 1185: 1182: 1176: 1171: 1165: 1159: 1128: 1120: 916: 883:Reference tables 730: 614: 601:satellite tables 512: 284: 277: 273: 270: 264: 261:inline citations 237: 236: 229: 207:relational model 114: 107: 103: 100: 94: 92: 51: 27: 19: 2400: 2399: 2395: 2394: 2393: 2391: 2390: 2389: 2375: 2374: 2373: 2360: 2339: 2295: 2276: 2250: 2224: 2179: 2160: 2124: 2120:Slowly changing 2110:Dimension table 2098: 2072: 2049:Data dictionary 2037: 2001:Anchor modeling 1971: 1906: 1895: 1893:Data warehouses 1890: 1853: 1821: 1816: 1807: 1798: 1783: 1781: 1774:DanLinstedt.com 1768:Linstedt, Dan. 1767: 1759: 1757: 1750:DanLinstedt.com 1744:Linstedt, Dan. 1743: 1735: 1733: 1726:DanLinstedt.com 1720:Linstedt, Dan. 1719: 1710: 1708: 1695:Linstedt, Dan. 1694: 1684: 1682: 1674: 1669: 1660: 1658: 1649:Linstedt, Dan. 1648: 1639: 1637: 1628:Linstedt, Dan. 1627: 1618: 1616: 1607:Linstedt, Dan. 1606: 1597: 1595: 1586:Linstedt, Dan. 1585: 1576: 1574: 1565:Linstedt, Dan. 1564: 1553: 1548: 1542: 1529: 1523: 1510: 1506: 1501: 1500: 1492: 1488: 1480: 1476: 1468: 1464: 1457: 1453: 1446: 1437: 1429: 1422: 1414: 1410: 1402: 1398: 1391: 1387: 1380: 1376: 1368: 1364: 1357: 1350: 1346:, paragraph 2.3 1342: 1338: 1331: 1327: 1320: 1316: 1309: 1305: 1298: 1294: 1286: 1282: 1277: 1273: 1268: 1264: 1255: 1253: 1244:Porsby, Johan. 1243: 1242: 1238: 1229: 1227: 1218:Porsby, Johan. 1217: 1216: 1212: 1204: 1200: 1192: 1188: 1183: 1179: 1172: 1168: 1160: 1156: 1151: 1146: 1126: 1121:, developed by 1118: 1093: 1058: 1026: 986: 959: 910: 885: 834:NR_OF_ACCIDENTS 777:sequence number 724: 701: 609: 580: 503: 456: 443: 435:graphical model 411: 392: 285: 274: 268: 265: 250: 238: 234: 227: 215:data architects 211:anchor modeling 115: 104: 98: 95: 52: 50: 40: 28: 17: 12: 11: 5: 2398: 2396: 2388: 2387: 2377: 2376: 2370: 2369: 2366: 2365: 2362: 2361: 2359: 2358: 2353: 2347: 2345: 2341: 2340: 2338: 2337: 2332: 2331: 2330: 2328:Enterprise bus 2320: 2319: 2318: 2307: 2305: 2297: 2296: 2293: 2286: 2285: 2282: 2281: 2278: 2277: 2275: 2274: 2269: 2264: 2258: 2256: 2252: 2251: 2249: 2248: 2243: 2238: 2232: 2230: 2226: 2225: 2223: 2222: 2217: 2212: 2207: 2202: 2197: 2191: 2189: 2181: 2180: 2177: 2170: 2169: 2166: 2165: 2162: 2161: 2159: 2158: 2153: 2148: 2143: 2138: 2132: 2130: 2126: 2125: 2123: 2122: 2117: 2112: 2106: 2104: 2100: 2099: 2097: 2096: 2091: 2086: 2080: 2078: 2074: 2073: 2071: 2070: 2065: 2060: 2055: 2045: 2043: 2039: 2038: 2036: 2035: 2030: 2025: 2020: 2015: 2014: 2013: 2008: 2003: 1995: 1990: 1985: 1979: 1977: 1973: 1972: 1970: 1969: 1964: 1959: 1954: 1949: 1944: 1939: 1934: 1929: 1924: 1918: 1916: 1908: 1907: 1904: 1897: 1896: 1891: 1889: 1888: 1881: 1874: 1866: 1860: 1859: 1852: 1851:External links 1849: 1848: 1847: 1844: 1841: 1838: 1837:S. 83–98. 1834: 1831: 1828: 1825: 1820: 1817: 1815: 1814: 1805: 1795: 1794: 1790: 1789: 1765: 1741: 1717: 1692: 1667: 1646: 1625: 1604: 1583: 1562: 1546: 1540: 1527: 1521: 1507: 1505: 1502: 1499: 1498: 1494:Melbournevault 1486: 1474: 1462: 1451: 1435: 1420: 1408: 1396: 1385: 1374: 1362: 1348: 1336: 1325: 1314: 1303: 1292: 1280: 1271: 1262: 1236: 1210: 1198: 1186: 1177: 1166: 1153: 1152: 1150: 1147: 1145: 1142: 1141: 1140: 1135: 1129: 1112: 1109:Data warehouse 1106: 1100: 1092: 1089: 1088: 1087: 1082: 1077: 1072: 1067: 1057: 1054: 1025: 1022: 985: 982: 958: 955: 949: 948: 945: 942: 938: 937: 934: 931: 927: 926: 923: 920: 909: 906: 884: 881: 875: 874: 871: 868: 864: 863: 860: 857: 853: 852: 849: 846: 842: 841: 838: 835: 831: 830: 827: 824: 820: 819: 816: 813: 809: 808: 805: 802: 798: 797: 794: 791: 787: 786: 783: 780: 773: 769: 768: 765: 762: 758: 757: 754: 751: 748: 744: 743: 740: 737: 734: 723: 720: 700: 697: 694: 693: 691: 688: 685: 681: 680: 678: 675: 672: 668: 667: 665: 662: 659: 655: 654: 652: 649: 646: 642: 641: 638: 635: 632: 628: 627: 624: 621: 618: 608: 605: 579: 576: 573: 572: 569: 566: 562: 561: 558: 555: 551: 550: 547: 544: 540: 539: 536: 533: 530: 526: 525: 522: 519: 516: 502: 499: 492: 491: 488: 485: 478: 455: 452: 442: 439: 410: 407: 402:Data Vault 2.0 391: 388: 347:Data Vault 2.0 340:Sarbanes-Oxley 301:data warehouse 287: 286: 241: 239: 232: 226: 223: 117: 116: 31: 29: 22: 15: 13: 10: 9: 6: 4: 3: 2: 2397: 2386: 2383: 2382: 2380: 2357: 2354: 2352: 2349: 2348: 2346: 2342: 2336: 2333: 2329: 2326: 2325: 2324: 2323:Ralph Kimball 2321: 2317: 2314: 2313: 2312: 2309: 2308: 2306: 2302: 2298: 2291: 2287: 2273: 2270: 2268: 2265: 2263: 2260: 2259: 2257: 2253: 2247: 2244: 2242: 2239: 2237: 2234: 2233: 2231: 2227: 2221: 2218: 2216: 2213: 2211: 2208: 2206: 2203: 2201: 2198: 2196: 2193: 2192: 2190: 2186: 2182: 2175: 2171: 2157: 2154: 2152: 2149: 2147: 2144: 2142: 2139: 2137: 2134: 2133: 2131: 2127: 2121: 2118: 2116: 2113: 2111: 2108: 2107: 2105: 2101: 2095: 2092: 2090: 2087: 2085: 2082: 2081: 2079: 2075: 2069: 2068:Surrogate key 2066: 2064: 2061: 2059: 2056: 2054: 2050: 2047: 2046: 2044: 2040: 2034: 2031: 2029: 2026: 2024: 2021: 2019: 2016: 2012: 2009: 2007: 2004: 2002: 1999: 1998: 1996: 1994: 1991: 1989: 1986: 1984: 1981: 1980: 1978: 1974: 1968: 1965: 1963: 1960: 1958: 1955: 1953: 1950: 1948: 1945: 1943: 1940: 1938: 1935: 1933: 1930: 1928: 1925: 1923: 1920: 1919: 1917: 1913: 1909: 1902: 1898: 1894: 1887: 1882: 1880: 1875: 1873: 1868: 1867: 1864: 1858: 1855: 1854: 1850: 1845: 1842: 1839: 1835: 1832: 1829: 1826: 1823: 1822: 1818: 1811: 1806: 1802: 1797: 1796: 1792: 1791: 1780:on 2012-08-21 1779: 1775: 1771: 1766: 1756:on 2014-01-03 1755: 1751: 1747: 1742: 1732:on 2014-01-03 1731: 1727: 1723: 1718: 1706: 1702: 1698: 1693: 1680: 1673: 1668: 1656: 1652: 1647: 1635: 1631: 1626: 1614: 1610: 1605: 1593: 1589: 1584: 1572: 1568: 1563: 1559: 1552: 1547: 1543: 1537: 1533: 1528: 1524: 1518: 1514: 1509: 1508: 1503: 1496:, 16 May 2023 1495: 1490: 1487: 1483: 1478: 1475: 1471: 1466: 1463: 1460: 1455: 1452: 1449: 1444: 1442: 1440: 1436: 1432: 1427: 1425: 1421: 1417: 1412: 1409: 1405: 1400: 1397: 1394: 1389: 1386: 1383: 1378: 1375: 1371: 1366: 1363: 1360: 1355: 1353: 1349: 1345: 1340: 1337: 1334: 1329: 1326: 1323: 1318: 1315: 1312: 1307: 1304: 1301: 1296: 1293: 1289: 1284: 1281: 1275: 1272: 1266: 1263: 1251: 1247: 1240: 1237: 1225: 1221: 1214: 1211: 1207: 1202: 1199: 1195: 1190: 1187: 1181: 1178: 1175: 1170: 1167: 1163: 1158: 1155: 1148: 1143: 1139: 1136: 1133: 1130: 1124: 1123:Ralph Kimball 1116: 1113: 1110: 1107: 1104: 1101: 1098: 1095: 1094: 1090: 1086: 1083: 1081: 1078: 1076: 1073: 1071: 1068: 1066: 1063: 1062: 1061: 1055: 1053: 1049: 1047: 1043: 1039: 1035: 1031: 1023: 1021: 1019: 1013: 1009: 1007: 1003: 999: 995: 991: 983: 981: 978: 974: 970: 968: 964: 956: 954: 946: 943: 940: 939: 935: 932: 929: 928: 924: 921: 918: 917: 914: 907: 905: 900: 898: 896: 889: 882: 880: 872: 869: 867:LOAD_AUDIT_ID 866: 865: 861: 858: 855: 854: 850: 847: 844: 843: 839: 836: 833: 832: 828: 825: 822: 821: 817: 814: 811: 810: 806: 803: 800: 799: 795: 792: 789: 788: 784: 781: 778: 774: 771: 770: 766: 763: 760: 759: 755: 752: 749: 746: 745: 741: 738: 735: 732: 731: 728: 721: 719: 716: 712: 708: 706: 698: 692: 689: 686: 684:LOAD_AUDIT_ID 683: 682: 679: 676: 673: 670: 669: 666: 663: 660: 657: 656: 653: 650: 647: 644: 643: 639: 636: 633: 630: 629: 625: 622: 619: 616: 615: 612: 606: 604: 602: 596: 592: 588: 586: 577: 570: 567: 565:LOAD_AUDIT_ID 564: 563: 559: 556: 553: 552: 548: 545: 543:VEHICLE_ID_NR 542: 541: 537: 534: 531: 528: 527: 523: 520: 517: 514: 513: 510: 508: 500: 498: 495: 489: 486: 483: 479: 476: 475:surrogate key 472: 471: 470: 467: 465: 461: 460:surrogate key 453: 451: 447: 441:Basic notions 440: 438: 436: 431: 429: 424: 421: 417: 408: 406: 403: 399: 396: 389: 387: 383: 381: 377: 372: 368: 364: 360: 356: 352: 351:open standard 348: 344: 341: 336: 331: 329: 323: 320: 318: 314: 310: 306: 305:Ralph Kimball 302: 298: 294: 283: 280: 272: 262: 258: 254: 248: 247: 242:This article 240: 231: 230: 224: 222: 220: 216: 212: 208: 204: 200: 195: 193: 189: 185: 180: 178: 174: 170: 165: 163: 159: 155: 151: 147: 143: 139: 135: 131: 123: 113: 110: 102: 99:November 2016 91: 88: 84: 81: 77: 74: 70: 67: 63: 60: –  59: 55: 54:Find sources: 48: 44: 38: 37: 32:This article 30: 26: 21: 20: 2335:Dan Linstedt 2005: 1809: 1800: 1782:. Retrieved 1778:the original 1773: 1758:. Retrieved 1754:the original 1749: 1734:. Retrieved 1730:the original 1725: 1711:26 September 1709:. Retrieved 1705:the original 1700: 1685:26 September 1683:. Retrieved 1678: 1661:12 September 1659:. Retrieved 1654: 1640:12 September 1638:. Retrieved 1633: 1619:12 September 1617:. Retrieved 1612: 1598:12 September 1596:. Retrieved 1591: 1577:12 September 1575:. Retrieved 1570: 1557: 1531: 1512: 1489: 1477: 1465: 1454: 1411: 1399: 1388: 1377: 1365: 1339: 1328: 1317: 1306: 1295: 1283: 1274: 1265: 1254:. Retrieved 1252:(in Swedish) 1250:www.agero.se 1249: 1239: 1228:. Retrieved 1226:(in Swedish) 1224:www.agero.se 1223: 1213: 1201: 1189: 1180: 1169: 1157: 1132:Staging area 1059: 1050: 1027: 1014: 1010: 987: 979: 975: 971: 960: 952: 911: 902: 894: 892: 891: 886: 878: 775:Ordering or 725: 717: 713: 709: 704: 702: 610: 607:Link example 600: 597: 593: 589: 581: 504: 496: 493: 482:business key 468: 464:business key 457: 448: 444: 432: 425: 412: 401: 400: 397: 393: 384: 346: 345: 337: 333: 325: 321: 296: 292: 290: 275: 266: 243: 219:entity-based 196: 181: 173:Dan Linstedt 166: 162:Dan Linstedt 133: 129: 128: 105: 96: 86: 79: 72: 65: 53: 41:Please help 36:verification 33: 2272:Spreadsheet 2205:Data mining 1947:Star schema 1024:Methodology 925:Mandatory? 922:Description 761:L_DRIVER_ID 736:Description 658:H_PERSON_ID 631:L_DRIVER_ID 620:Description 518:Description 501:Hub example 328:star schema 269:August 2019 199:star schema 197:Unlike the 2311:Bill Inmon 2115:Degenerate 2084:Fact table 1819:Literature 1784:2014-01-03 1760:2014-01-03 1736:2014-01-03 1256:2023-02-22 1230:2023-02-22 1144:References 1097:Bill Inmon 1085:AutomateDV 1080:Vaultspeed 1075:Wherescape 1052:projects. 739:Mandatory? 705:satellites 699:Satellites 623:Mandatory? 521:Mandatory? 416:confidence 317:normalized 313:Bill Inmon 253:improve it 188:attributes 146:resilience 69:newspapers 2229:Languages 2215:OLAP cube 2200:Dashboard 2151:Transform 2103:Dimension 2058:Data mart 1993:Data mesh 1962:Aggregate 1927:Dimension 1372:, page 83 1208:, page 76 1196:, page 21 1164:, page 74 1149:Citations 1103:Data lake 1038:Six Sigma 919:Fieldname 733:Fieldname 617:Fieldname 515:Fieldname 363:Six Sigma 257:verifying 154:came from 130:Datavault 2379:Category 2344:Products 2188:Concepts 2053:Metadata 2042:Elements 1988:Data hub 1976:Variants 1922:Database 1915:Concepts 1091:See also 895:describe 772:S_SEQ_NR 742:Comment 645:H_CAR_ID 626:Comment 529:H_CAR_ID 524:Comment 428:ontology 420:strength 376:big data 192:parallel 138:database 2294:Related 2146:Extract 2129:Filling 2094:Measure 1504:Sources 1002:Pentaho 947:No (*) 851:No (*) 840:No (*) 829:No (*) 818:No (*) 801:S_LEDTS 390:History 251:Please 83:scholar 2304:People 1538:  1519:  990:Cognos 856:S_RSRC 790:S_LDTS 782:No(**) 671:L_RSRC 554:H_RSRC 85:  78:  71:  64:  56:  2255:Tools 2028:ROLAP 2023:MOLAP 2018:HOLAP 1675:(PDF) 1554:(PDF) 1056:Tools 578:Links 380:NoSQL 299:. In 150:trace 136:is a 90:JSTOR 76:books 2156:Load 2077:Fact 1942:OLAP 1937:Fact 1713:2012 1687:2012 1663:2011 1642:2011 1621:2011 1600:2011 1579:2011 1536:ISBN 1517:ISBN 1034:CMMI 961:The 936:Yes 862:Yes 796:Yes 767:Yes 560:Yes 549:Yes 454:Hubs 418:and 367:SDLC 359:CMMI 142:data 62:news 1030:SEI 963:ETL 873:No 807:No 677:Yes 664:Yes 651:Yes 571:No 355:SEI 295:or 255:by 158:row 132:or 45:by 2381:: 1772:. 1748:. 1724:. 1699:. 1677:. 1653:. 1632:. 1611:. 1590:. 1569:. 1556:. 1438:^ 1423:^ 1351:^ 1248:. 1222:. 1040:, 1000:, 996:, 992:, 753:No 690:No 637:No 535:No 509:. 480:a 473:a 378:, 365:, 361:, 164:. 2051:/ 1885:e 1878:t 1871:v 1787:. 1763:. 1739:. 1715:. 1689:. 1665:. 1644:. 1623:. 1602:. 1581:. 1544:. 1525:. 1259:. 1233:. 1032:/ 899:. 357:/ 282:) 276:( 271:) 267:( 249:. 201:( 112:) 106:( 101:) 97:( 87:· 80:· 73:· 66:· 39:.

Index


verification
improve this article
adding citations to reliable sources
"Data vault modeling"
news
newspapers
books
scholar
JSTOR
Learn how and when to remove this message

database
data
resilience
trace
came from
row
Dan Linstedt
a single version of the facts
Dan Linstedt
single version of the truth
structural information
attributes
parallel
star schema
dimensional modelling
relational model
anchor modeling
data architects

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.