Entity Types: In a data warehouse data can be classified into different types to signify the value it brings to the warehouse.
- 3rd Party Datasets are upstream data that are provided by external companies.
- Raw Logs are upstream data that are provided by other systems within the company. Examples are Website Logs, Sales or Purchase Orders, and Inventory Movement.
- Master Data Domains are data that came from a Master Data Service or collected across multiple company systems. This kind of data is used to build dimensions.
- Dimensions are data elements that enable filtering, grouping and labeling facts (measures).
- Staging Datasets are data that has been cleansed, enhanced, or conformed and ready for further processing.
- Atomic Facts are the lowest level of grain for a set of measures. Measures can be filtered, grouped, and labeled by a predefined set of dimensions
- Aggregated Facts are an summarized set of measures that have been summed up from an atomic fact. Used for business snapshot purposes like end month or end of year reporting or can be used for performance tuning.
- Reports are usually simple facts (sometimes non-pivotable) that the end user uses as a statement of measure on key performance (KPI) for the business.
Data Variance Check Definitions: An attribute to an entity may have a
Attribute Types: Each attribute in a entity has a specific meaning in a data warehouse.
- Dimension Keys are surrogate key that is used to make to a dimension.
- Business Keys are values used to lookup the dimension key using in the key mapping phase of an ETL. These are found in the upstream source streams.
- Measures Attributes are numeric values representing a count or output of an equation.
- Regular Attributes are general values used for pass-through processes or user consumption.
Please see (Baseline Conceptual Models Commentary) for further details on what conceptual models are to be used for.
No comments:
Post a Comment