Knowledge (XXG)

Star schema

Source πŸ“

209: 27: 20: 130:
Dimension tables usually have a relatively small number of records compared to fact tables, but each record may have a very large number of attributes to describe the fact data. Dimensions can define a wide variety of characteristics, but some of the most common attributes defined by dimension tables
83:
The star schema separates business process data into facts, which hold the measurable, quantitative data about a business, and dimensions which are descriptive attributes related to fact data. Examples of fact data include sales price, sale quantity, and time, distance, speed and weight measurements.
99:
Fact tables record measurements or metrics for a specific event. Fact tables generally consist of numeric values, and foreign keys to dimensional data where descriptive information is kept. Fact tables are designed to a low level of uniform detail (referred to as "granularity" or
265:
column of the fact table in this example represents a measure or metric that can be used in calculations and analysis. The non-primary key columns of the dimension tables represent additional attributes of the dimensions (such as the
578:
Dedić, N. and Stanier C., 2016., "An Evaluation of the Challenges of Multilingualism in Data Warehouse Development" in 18th International Conference on Enterprise Information Systems - ICEIS 2016, p. 196.
104:"), meaning facts can record events at a very atomic level. This can result in the accumulation of a large number of records in a fact table over time. Fact tables are defined as one of three types: 167:, meaning the typical rules of normalization applied to transactional relational databases are relaxed during star-schema design and implementation. The benefits of star-schema denormalization are: 216:
Consider a database of sales, perhaps from a store chain, classified by date, store and product. The image of the schema to the right is a star schema version of the sample schema provided in the
208: 174:
Simplified business reporting logic – when compared to highly normalized schemas, the star schema simplifies common business reporting logic, such as period-over-period and as-of reporting.
91:. Having dimensions of only a few attributes, while simpler to maintain, results in queries with many table joins and makes the star schema less easy to use. 171:
Simpler queries – star-schema join-logic is generally simpler than the join logic required to retrieve data from a highly normalized transactional schema.
668: 177:
Query performance gains – star schemas can provide performance enhancements for read-only reporting applications when compared to highly
84:
Related dimension attribute examples include product models, product colors, product sizes, geographic locations, and salesperson names.
1140: 52:
and is the approach most widely used to develop data warehouses and dimensional data marts. The star schema consists of one or more
641: 1046: 184:
Fast aggregations – the simpler queries against a star schema can result in improved performance for aggregation operations.
135:
Time dimension tables describe time at the lowest level of time granularity for which events are recorded in the star schema
114:
Accumulating snapshot tables record aggregate facts at a given point in time (e.g., total month-to-date sales for a product)
1100: 661: 155:, usually a single-column integer data type, mapped to the combination of dimension attributes that form the natural key. 1025: 726: 549: 188: 751: 1135: 1051: 746: 711: 57: 588: 147:
Range dimension tables describe ranges of time, dollar values or other measurable quantities to simplify reporting
1169: 1004: 904: 199:
mode of operation which can use a star schema directly as a source without building a proprietary cube structure.
925: 920: 878: 654: 277:
For example, the following query answers how many TV sets have been sold, for each brand and country, in 1997:
994: 1174: 1020: 817: 795: 178: 164: 1112: 979: 721: 75:
with a fact table at its center and the dimension tables surrounding it representing the star's points.
984: 899: 767: 716: 935: 873: 790: 741: 554: 68: 544: 111:
Snapshot fact tables record facts at a given point in time (e.g., account details at month end)
847: 1030: 736: 559: 217: 61: 930: 894: 833: 785: 677: 621:
The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (Second Edition)
539: 49: 1163: 1107: 852: 245:
column, relating to one of the columns (viewed as rows in the example schema) of the
152: 119: 1119: 1056: 940: 72: 138:
Geography dimension tables describe location data, such as country, state, or city
122:
to ensure each row can be uniquely identified. This key is a simple primary key.
989: 108:
Transaction fact tables record facts about a specific event (e.g., sales events)
1095: 868: 101: 53: 999: 842: 777: 192: 46: 34: 837: 772: 706: 607:
C J Date, "An Introduction to Database Systems (Eighth Edition)", p. 708
26: 19: 594: 812: 807: 802: 646: 207: 196: 144:
Employee dimension tables describe employees, such as sales people
25: 18: 1077: 961: 688: 650: 636: 87:
A star schema that has many dimensions is sometimes called a
195:
efficiently; in fact, most major OLAP systems provide a
226:
is the fact table and there are three dimension tables
64:, and is more effective for handling simpler queries. 60:. The star schema is an important special case of the 637:
Stars: A Pattern Language for Query Optimized Schema
615: 613: 1128: 1088: 1039: 1013: 972: 913: 887: 861: 826: 760: 699: 241:Each dimension table has a primary key on its 1141:Data warehousing products and their producers 662: 249:table's three-column (compound) primary key ( 187:Feeding cubes – star schemas are used by all 8: 1085: 1074: 969: 958: 696: 685: 669: 655: 647: 151:Dimension tables are generally assigned a 141:Product dimension tables describe products 571: 67:The star schema gets its name from the 118:Fact tables are generally assigned a 7: 1026:MultiDimensional eXpressions (MDX) 14: 212:Star schema used by example query 1047:Business intelligence software 926:Extract, load, transform (ELT) 921:Extract, transform, load (ETL) 619:Ralph Kimball and Margy Ross, 1: 995:Decision support system (DSS) 191:systems to build proprietary 1021:Data Mining Extensions (DMX) 550:Online analytical processing 782:Ensemble modeling patterns 752:Single version of the truth 1191: 1136:Comparison of OLAP servers 56:referencing any number of 1084: 1073: 1005:Data warehouse automation 968: 957: 695: 690:Creating a data warehouse 684: 642:Fact constellation schema 45:is the simplest style of 279: 261:). The non-primary key 1031:XML for Analysis (XMLA) 16:Data warehousing schema 963:Using a data warehouse 818:Operational data store 593:, 2009, archived from 213: 30: 23: 980:Business intelligence 211: 153:surrogate primary key 29: 22: 796:Focal point modeling 768:Column-oriented DBMS 717:Dimensional modeling 1101:Information factory 874:Early-arriving fact 791:Data vault modeling 742:Reverse star schema 555:Reverse star schema 1052:Reporting software 545:Fact constellation 214: 31: 24: 1157: 1156: 1153: 1152: 1149: 1148: 1069: 1068: 1065: 1064: 953: 952: 949: 948: 848:Sixth normal form 163:Star schemas are 71:resemblance to a 1182: 1170:Data warehousing 1086: 1075: 970: 959: 737:Snowflake schema 697: 686: 671: 664: 657: 648: 624: 617: 608: 605: 599: 598: 585: 579: 576: 560:Snowflake schema 529: 526: 523: 520: 517: 514: 511: 508: 505: 502: 499: 496: 495:Product_Category 493: 490: 487: 484: 481: 478: 475: 472: 469: 466: 463: 460: 457: 454: 451: 448: 445: 442: 439: 436: 433: 430: 427: 424: 421: 418: 415: 412: 409: 406: 403: 400: 397: 394: 391: 388: 385: 382: 379: 376: 373: 370: 367: 364: 361: 358: 355: 352: 349: 346: 343: 340: 337: 334: 331: 328: 325: 322: 319: 316: 313: 310: 307: 304: 301: 298: 295: 292: 289: 286: 283: 273: 269: 264: 260: 256: 252: 248: 244: 237: 233: 229: 225: 218:snowflake schema 126:Dimension tables 89:centipede schema 69:physical model's 62:snowflake schema 58:dimension tables 1190: 1189: 1185: 1184: 1183: 1181: 1180: 1179: 1160: 1159: 1158: 1145: 1124: 1080: 1061: 1035: 1009: 964: 945: 909: 905:Slowly changing 895:Dimension table 883: 857: 834:Data dictionary 822: 786:Anchor modeling 756: 691: 680: 678:Data warehouses 675: 633: 628: 627: 618: 611: 606: 602: 597:on 16 July 2010 587: 586: 582: 577: 573: 568: 536: 531: 530: 527: 524: 521: 518: 515: 512: 509: 506: 503: 500: 497: 494: 491: 488: 485: 482: 479: 476: 473: 470: 467: 464: 461: 458: 455: 452: 449: 446: 443: 440: 437: 434: 431: 428: 425: 422: 419: 416: 413: 410: 407: 404: 401: 398: 395: 392: 389: 386: 383: 380: 377: 374: 371: 368: 365: 362: 359: 356: 353: 350: 347: 344: 341: 338: 335: 332: 329: 326: 323: 320: 317: 314: 311: 308: 305: 302: 299: 296: 293: 290: 287: 284: 281: 271: 267: 262: 258: 254: 250: 246: 242: 235: 231: 227: 223: 206: 161: 128: 97: 81: 17: 12: 11: 5: 1188: 1186: 1178: 1177: 1172: 1162: 1161: 1155: 1154: 1151: 1150: 1147: 1146: 1144: 1143: 1138: 1132: 1130: 1126: 1125: 1123: 1122: 1117: 1116: 1115: 1113:Enterprise bus 1105: 1104: 1103: 1092: 1090: 1082: 1081: 1078: 1071: 1070: 1067: 1066: 1063: 1062: 1060: 1059: 1054: 1049: 1043: 1041: 1037: 1036: 1034: 1033: 1028: 1023: 1017: 1015: 1011: 1010: 1008: 1007: 1002: 997: 992: 987: 982: 976: 974: 966: 965: 962: 955: 954: 951: 950: 947: 946: 944: 943: 938: 933: 928: 923: 917: 915: 911: 910: 908: 907: 902: 897: 891: 889: 885: 884: 882: 881: 876: 871: 865: 863: 859: 858: 856: 855: 850: 845: 840: 830: 828: 824: 823: 821: 820: 815: 810: 805: 800: 799: 798: 793: 788: 780: 775: 770: 764: 762: 758: 757: 755: 754: 749: 744: 739: 734: 729: 724: 719: 714: 709: 703: 701: 693: 692: 689: 682: 681: 676: 674: 673: 666: 659: 651: 645: 644: 639: 632: 631:External links 629: 626: 625: 609: 600: 580: 570: 569: 567: 564: 563: 562: 557: 552: 547: 542: 540:Data warehouse 535: 532: 280: 205: 202: 201: 200: 185: 182: 175: 172: 160: 157: 149: 148: 145: 142: 139: 136: 127: 124: 116: 115: 112: 109: 96: 93: 80: 77: 15: 13: 10: 9: 6: 4: 3: 2: 1187: 1176: 1175:Data modeling 1173: 1171: 1168: 1167: 1165: 1142: 1139: 1137: 1134: 1133: 1131: 1127: 1121: 1118: 1114: 1111: 1110: 1109: 1108:Ralph Kimball 1106: 1102: 1099: 1098: 1097: 1094: 1093: 1091: 1087: 1083: 1076: 1072: 1058: 1055: 1053: 1050: 1048: 1045: 1044: 1042: 1038: 1032: 1029: 1027: 1024: 1022: 1019: 1018: 1016: 1012: 1006: 1003: 1001: 998: 996: 993: 991: 988: 986: 983: 981: 978: 977: 975: 971: 967: 960: 956: 942: 939: 937: 934: 932: 929: 927: 924: 922: 919: 918: 916: 912: 906: 903: 901: 898: 896: 893: 892: 890: 886: 880: 877: 875: 872: 870: 867: 866: 864: 860: 854: 853:Surrogate key 851: 849: 846: 844: 841: 839: 835: 832: 831: 829: 825: 819: 816: 814: 811: 809: 806: 804: 801: 797: 794: 792: 789: 787: 784: 783: 781: 779: 776: 774: 771: 769: 766: 765: 763: 759: 753: 750: 748: 745: 743: 740: 738: 735: 733: 730: 728: 725: 723: 720: 718: 715: 713: 710: 708: 705: 704: 702: 698: 694: 687: 683: 679: 672: 667: 665: 660: 658: 653: 652: 649: 643: 640: 638: 635: 634: 630: 622: 616: 614: 610: 604: 601: 596: 592: 591: 584: 581: 575: 572: 565: 561: 558: 556: 553: 551: 548: 546: 543: 541: 538: 537: 533: 278: 275: 239: 221: 219: 210: 203: 198: 194: 190: 186: 183: 180: 176: 173: 170: 169: 168: 166: 158: 156: 154: 146: 143: 140: 137: 134: 133: 132: 125: 123: 121: 120:surrogate key 113: 110: 107: 106: 105: 103: 94: 92: 90: 85: 78: 76: 74: 70: 65: 63: 59: 55: 51: 48: 44: 40: 36: 28: 21: 1120:Dan Linstedt 731: 620: 603: 595:the original 589: 583: 574: 501:'tv' 276: 274:dimension). 240: 222: 215: 165:denormalized 162: 150: 129: 117: 98: 88: 86: 82: 66: 42: 38: 32: 1057:Spreadsheet 990:Data mining 732:Star schema 590:DWH Schemas 432:Dim_Product 236:Dim_Product 95:Fact tables 54:fact tables 39:star schema 1164:Categories 1096:Bill Inmon 900:Degenerate 869:Fact table 566:References 450:Product_Id 336:Fact_Sales 327:Units_Sold 263:Units_Sold 259:Product_Id 247:Fact_Sales 224:Fact_Sales 193:OLAP cubes 179:normalized 73:star shape 43:star model 1014:Languages 1000:OLAP cube 985:Dashboard 936:Transform 888:Dimension 843:Data mart 778:Data mesh 747:Aggregate 712:Dimension 390:Dim_Store 309:Countries 232:Dim_Store 220:article. 131:include: 47:data mart 35:computing 1129:Products 973:Concepts 838:Metadata 827:Elements 773:Data hub 761:Variants 707:Database 700:Concepts 623:, p. 393 534:See also 408:Store_Id 348:Dim_Date 272:Dim_Date 255:Store_Id 228:Dim_Date 181:schemas. 159:Benefits 1079:Related 931:Extract 914:Filling 879:Measure 528:Country 366:Date_Id 303:Country 270:of the 251:Date_Id 204:Example 1089:People 282:SELECT 50:schema 37:, the 1040:Tools 813:ROLAP 808:MOLAP 803:HOLAP 516:Brand 504:GROUP 468:WHERE 426:INNER 384:INNER 342:INNER 291:Brand 197:ROLAP 102:grain 79:Model 941:Load 862:Fact 727:OLAP 722:Fact 483:1997 477:Year 429:JOIN 387:JOIN 345:JOIN 333:FROM 268:Year 234:and 189:OLAP 486:AND 315:SUM 41:or 33:In 1166:: 612:^ 507:BY 462:Id 438:ON 420:Id 396:ON 378:Id 354:ON 306:AS 257:, 253:, 243:Id 238:. 230:, 836:/ 670:e 663:t 656:v 525:. 522:S 519:, 513:. 510:P 498:= 492:. 489:P 480:= 474:. 471:D 465:) 459:. 456:P 453:= 447:. 444:F 441:( 435:P 423:) 417:. 414:S 411:= 405:. 402:F 399:( 393:S 381:) 375:. 372:D 369:= 363:. 360:F 357:( 351:D 339:F 330:) 324:. 321:F 318:( 312:, 300:. 297:S 294:, 288:. 285:P 100:"

Index



computing
data mart
schema
fact tables
dimension tables
snowflake schema
physical model's
star shape
grain
surrogate key
surrogate primary key
denormalized
normalized
OLAP
OLAP cubes
ROLAP

snowflake schema
Data warehouse
Fact constellation
Online analytical processing
Reverse star schema
Snowflake schema
DWH Schemas
the original


Stars: A Pattern Language for Query Optimized Schema

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.

↑