terça-feira, 2 de outubro de 2012

SCD (Slowly Changing Dimensions) - English Version


SCD (Slowly Changing Dimensions)

Slowly Changing Dimensions means that data changes without a temporal basis. There isn't a planned change by time. There are various types of SCD (Type 0, 1, 2, 3, 4, 6-hybrid-). The most common are type 1, 2 and 3.

Below an example of each type:

Type I:

Suppose we have a Customer dimension with the following attributes: social security number, name and state, as shown below:


Where SK is the Surrogate Key (sequential, numerical and meaningless), the social security number is the business key, name a descriptive attribute and state an attribute that we are looking to update.

In SCD type I, when there is a change in a given data this value is updated. No version is generated.
In the case below, the customer Erick Furst changed from the state of MG for the state of RJ (Brazilian States).

  
Note that SK remains the same and we have only one instance of customer Erick Furst.


Type II:

The SCD type II allows versioning the registry.




In this scenario the SCD Type II makes use of two new fields, Start Date and End Date that will allow the recording of the client's situation (This technique can also be used with the field "Current Record Indicator" or the "Version Number" ).

Note that the client Erick Furst was added on September 24, 2012 (Start Date field) and End Date field with empty value or a date that means null (eg 12/31/1800 or 12/31/9999) indicates that this record is valid, ie is the current record.

Suppose the client Erick Furst moved to the RJ on September 25, 2012. The new version will be included with a new SK (Value 2). And the older version will receive an End Date equivalent to the entry of the new record and the newest version will have the end date open (Null).




Type III:

The SCD type III allows versioning by column.
Assume the same position above in the SCD type II, instead of the versioned record the new version will be included in columns. The disadvantage of this technique is that the versioning is limited to the number of columns provided for this purpose.

Eg:


Sem comentários:

Enviar um comentário