Sub-Dimensions

What do you think of this post?
  • Awesome 
  • Interesting 
  • Useful 

A subdimension is a secondary dimension table linked to a dimension table. Although we try to have all the related attributes of a dimension in a single table, there are some sets of attributes that are used again and again in many dimensions. The same way that dimensions are used as an entry point in fact tables, subdimensions are used as an entry point in dimensions. The most usual subdimension is the date dimension, for example a last encounter date in a patient dimension.

What do you think of this post?
  • Awesome 
  • Interesting 
  • Useful 

Fact Tables

What do you think of this post?
  • Awesome 
  • Interesting 
  • Useful 

Fact Tables

Fact tables store the measurements of the processes/activities/work flows/events of the system we are trying to model. There is one row in the fact table for each measurement. The fields of the fact table that hold the result of the measurements are called facts (or measures). Dimensions (and their attributes) describe these measurements and contain the context surrounding them. This is known as the grain of the fact table (e.g. a test to determine the temperature of a patient on certain day at a certain location). A fact table contains foreign keys that point to the dimension tables, have one or more numerical fields (the facts) and may contain one or more degenerate dimensions. The primary key of these tables is defined by a subset of the foreign keys.

Delivering Fact Tables Steps

  • Structure
  • Referential Integrity
  • Surrogate Key Pipeline
  • Preparing the fact tables for loading
  • Incremental Loading
  • Aggregations
  • OLAP Cubes

 

What do you think of this post?
  • Awesome 
  • Interesting 
  • Useful 

Accumulating Snapshots

What do you think of this post?
  • Awesome 
  • Interesting 
  • Useful 

These fact tables represent the complete life of an activity or process, that has a definite beginning and end. They have multiple date dimensions (roles) to refer to the events that occur during a process. These kind of tables are used for to the spans of time between different events (lag calculations).

What do you think of this post?
  • Awesome 
  • Interesting 
  • Useful 

Periodic Snapshots

What do you think of this post?
  • Awesome 
  • Interesting 
  • Useful 

When we need to track the values of some measurements at regular, predictable time intervals, we can use periodic snapshots. Instead of having a row for each event we take a picture (snapshot) of the activity at the end of the day, week, or month. Sometimes, periodic snapshots are just an aggregation of the transactions that occurred during a time period, however they have more predictable sparseness as there is one row of facts generated for each combination of the dimensions that it refers to. Periodic snapshots are suited for tracking long-running processes.

What do you think of this post?
  • Awesome 
  • Interesting 
  • Useful 

Factless Fact Tables

What do you think of this post?
  • Awesome 
  • Interesting 
  • Useful 

These tables have no facts. They are used to represent event tracking and coverage. Those that are used for events, relate different dimensions values at a point in time and space (e.g. encounters, orders). However, when designing such tables we add a dummy fact, which is usually called table_name_count that takes the value 1, just to make the SQL queries more clean. These tables can also be used to track coverage (for example have a fact table with one row for every test, location, day combination with a single fact denoting if it was used or not).

What do you think of this post?
  • Awesome 
  • Interesting 
  • Useful