quarta-feira, 3 de outubro de 2012

Particionamento Dinâmico - Dynamic Partitioning



A Stored Procedure (SQL Server) a seguir realiza o particionamento de uma tabela de nome "FACT_TABLE", o particionamento também tem a função de realizar a manutenção de expurgo da tabela. O código abaixo armazena os últimos 30 dias carregados na tabela "FACT_TABLE", e mantém o histórico dos meses fechados armazenando somente o último dia de cada mês já fechado.
Muito útil para processos que acompanham vendas diárias, mas precisam manter a última posição fechada; posições de Carteiras e etc.

The Stored Procedure (SQL Server) performs the partitioning of a table named "FACT_TABLE" this partitioning also has the function to perform the purge o the table. The code below stores the last 30 days in the loaded table "FACT_TABLE" and keeps track of the months closed storing only the last day of each month closed.
Very useful for processes like daily sales, but need to keep the last closed position; positions and etc.


Código abaixo (Code below)

--Cria função de partição da tabela de fato. (Create the partition function of the fact table)
------------------------------------------------------------------------------------------------------------
DECLARE @DATA SMALLDATETIME

SET @DATA = '1900-01-01'


CREATE PARTITION FUNCTION PFDataRef (SMALLDATETIME)
AS RANGE RIGHT FOR VALUES (@DATA)

GO

-- Cria função de partição da tabela auxiliar. (Creates the partition function of the auxiliary table)
------------------------------------------------------------------------------------------------------------
DECLARE @DATA SMALLDATETIME

SET @DATA = '1900-01-01'

CREATE PARTITION FUNCTION PFDataRefAUX (SMALLDATETIME)
AS RANGE RIGHT FOR VALUES (@DATA)

GO

-- Cria a partição da tabela de fato. (Create the partition of the fact table)
------------------------------------------------------------------------------------------------------------
CREATE PARTITION SCHEME PSDataRef1
AS PARTITION PFDataRef
ALL TO ([PRIMARY])

GO

-- Cria a partição da tabela dauxiliar. (Creates the partition of the auxiliary table)
------------------------------------------------------------------------------------------------------------
CREATE PARTITION SCHEME PSDataRefAUX1
AS PARTITION PFDataRefAUX
ALL TO ([PRIMARY])

GO

-- Cria a tabela de Fato. (Creates the Fact Table)
------------------------------------------------------------------------------------------------------------
CREATE TABLE FACT_TABLE( C1 SMALLDATETIME NOT NULL,
C2 INT NOT NULL) ON PSDataRefCNG1 (C1)

GO

-- Cria a tabela auxiliar. (Creates the Auxiliary Table)
------------------------------------------------------------------------------------------------------------
CREATE TABLE FACT_TABLE_AUX( C1 SMALLDATETIME NOT NULL,
C2 INT NOT NULL) ON PSDataRefCNGAUX1 (C1)

GO

-- Stored Procedure para Particionamento. (Stored Procedure for Partitioning.)
------------------------------------------------------------------------------------------------------------
CREATE PROCEDURE SP_ParticaoExpurgo @DATA SMALLDATETIME
AS

DECLARE @DataPart SMALLDATETIME,
@DataSplit SMALLDATETIME,
@DataRange SMALLDATETIME,
@Cont INT


SET @DataRange =  cast((select top 1 [value] from sys.partition_range_values
where function_id = (select function_id
from sys.partition_functions
where name = 'PFDataRef')
order by boundary_id ASC) as smalldatetime)

IF @DataRange = @DATA
RETURN

ALTER PARTITION SCHEME PSDataRef1 NEXT USED [PRIMARY]
ALTER PARTITION FUNCTION PFDataRef() SPLIT RANGE (@DATA)

DECLARE Part_cursor CURSOR
FOR SELECT CAST( [value]  AS SMALLDATETIME) AS DATA
FROM sys.partition_range_values
WHERE function_id = (SELECT function_id
FROM sys.partition_functions
WHERE name = 'PFDataRef')
ORDER BY boundary_id DESC

OPEN Part_cursor

FETCH NEXT FROM Part_cursor INTO @DataPart

SET @Cont = 1

CREATE TABLE ##Part (DataPart SMALLDATETIME, Cont INT)

WHILE (@@fetch_status <> -1)
BEGIN

If @Cont <= 30
BEGIN
INSERT INTO ##Part VALUES (@DataPart, @Cont)
END

SET @Cont = @Cont + 1

FETCH NEXT FROM Part_cursor INTO @DataPart
END

CLOSE Part_cursor

DEALLOCATE Part_cursor

DECLARE Split_cursor CURSOR
FOR SELECT CAST( [value] AS SMALLDATETIME) AS DATA
FROM sys.partition_range_values
WHERE function_id = (SELECT function_id
FROM sys.partition_functions
WHERE name = 'PFDataRef')
AND CAST( [value] AS SMALLDATETIME) NOT IN (SELECT DataPart
FROM ##Part
WHERE Cont <= 30
UNION
SELECT MAX(CAST( [value]  AS SMALLDATETIME)) AS DATA
FROM sys.partition_range_values
WHERE function_id = ( select function_id
from sys.partition_functions
where name = 'PFDataRef')
GROUP BY YEAR(CAST( [value]  AS SMALLDATETIME)) * 100 + MONTH(CAST( [value]  AS SMALLDATETIME)) )
ORDER BY boundary_id ASC


OPEN Split_cursor

FETCH NEXT FROM Split_cursor INTO @DataSplit

WHILE (@@fetch_status <> -1)
BEGIN

ALTER PARTITION SCHEME PSDataRefAUX1 NEXT USED [PRIMARY]
ALTER PARTITION FUNCTION PFDataRefAUX() SPLIT RANGE (@DataSplit)
ALTER TABLE dbo.UFACT_TABLE SWITCH PARTITION $PARTITION.PFDataRef(@DataSplit) TO dbo.FACT_TABLE_AUX PARTITION $PARTITION.PFDataRefAUX(@DataSplit)
TRUNCATE TABLE dbo.FACT_TABLE_AUX
ALTER PARTITION FUNCTION PFDataRefAUX() MERGE RANGE (@DataSplit)
ALTER PARTITION FUNCTION PFDataRef() MERGE RANGE (@DataSplit)

FETCH NEXT FROM Split_cursor INTO @DataSplit
END

CLOSE Split_cursor

DEALLOCATE Split_cursor

DROP TABLE ##Part


-- Teste: (Test)
------------------------------------------------------------------------------------------------------------
exec SP_ParticaoExpurgo '2008-01-01'
INSERT INTO FACT_TABLE VALUES ( '2008-01-01' , 1 )

exec SP_ParticaoExpurgo '2008-01-02'
INSERT INTO FACT_TABLE VALUES ( '2008-01-02' , 2 )

exec SP_ParticaoExpurgo '2008-01-03'
INSERT INTO FACT_TABLE VALUES ( '2008-01-03' , 3 )

/* ... Continue até gerar a quantidade de registros necessários ao teste. Você também pode mudar o valor da procedure de 30 para outro para testar com um volume menor. */
/* ... Continue to generate the quantity of records needed to test. You can also change the value of the quantity of records (less then 30) to test with a smaller volume.*/

-- Consultas para apurar o funcionamento do particionamento. (Queries to test the partitioning operation.)
------------------------------------------------------------------------------------------------------------
SELECT * FROM FACT_TABLE

SELECT *
FROM sys.partition_range_values
WHERE function_id = (SELECT function_id
FROM sys.partition_functions
WHERE name = 'PFDataRef')
ORDER BY boundary_id DESC

SELECT * FROM FACT_TABLE_AUX

SELECT *
FROM sys.partition_range_values
WHERE function_id = (SELECT function_id
FROM sys.partition_functions
WHERE name = 'PFDataRefAux')
ORDER BY boundary_id DESC

terça-feira, 2 de outubro de 2012

SCD (Slowly Changing Dimensions) - English Version


SCD (Slowly Changing Dimensions)

Slowly Changing Dimensions means that data changes without a temporal basis. There isn't a planned change by time. There are various types of SCD (Type 0, 1, 2, 3, 4, 6-hybrid-). The most common are type 1, 2 and 3.

Below an example of each type:

Type I:

Suppose we have a Customer dimension with the following attributes: social security number, name and state, as shown below:


Where SK is the Surrogate Key (sequential, numerical and meaningless), the social security number is the business key, name a descriptive attribute and state an attribute that we are looking to update.

In SCD type I, when there is a change in a given data this value is updated. No version is generated.
In the case below, the customer Erick Furst changed from the state of MG for the state of RJ (Brazilian States).

  
Note that SK remains the same and we have only one instance of customer Erick Furst.


Type II:

The SCD type II allows versioning the registry.




In this scenario the SCD Type II makes use of two new fields, Start Date and End Date that will allow the recording of the client's situation (This technique can also be used with the field "Current Record Indicator" or the "Version Number" ).

Note that the client Erick Furst was added on September 24, 2012 (Start Date field) and End Date field with empty value or a date that means null (eg 12/31/1800 or 12/31/9999) indicates that this record is valid, ie is the current record.

Suppose the client Erick Furst moved to the RJ on September 25, 2012. The new version will be included with a new SK (Value 2). And the older version will receive an End Date equivalent to the entry of the new record and the newest version will have the end date open (Null).




Type III:

The SCD type III allows versioning by column.
Assume the same position above in the SCD type II, instead of the versioned record the new version will be included in columns. The disadvantage of this technique is that the versioning is limited to the number of columns provided for this purpose.

Eg: