When creating a dimension table in a data warehouse, we generally create the tables with a system generated key to unqiuely identify a row in the dimension. This key is also known as a surrogate key.
Note on Key_Generation: To generate artificial keys in DI we can use either Key_Generation Transform or Key_Generation Funtion. It looks into the table and fetches max. existing key value and that will be used as a starting value. Based on this starting value, transform/function increments the value for each row.
Options: We have three options
- Table Name : You should provide table name along with the Datastore and Owner (DATASTORE.OWNER.TABLE)
- Generated Key Column : The new artificial keys are inserted into this column. Remember your key column should be in any number datatype (REAL, FLOAT, DOUBLE, INTEGER, DECIMAL), if it is any other data type, then DI will throw an error.
- Increment value : Specify your interval for system generated key values. Surrogate key will be incremented based on this interval value. From 11.7 version onwards, we can pass variables also.
Design Steps: Here I’m populating customer information, I’ve a primary called Customer_ID in the both source & target tables, but I want to maintain a SURROGATE_KEY.
Have a glance on soruce data, here it is
key_gen transform always expects a SURROGATE_KEY column in SCHEMA IN
After completion of your job execution, here is the target customers_dim target data with surrogate key values.