Archive for September, 2010

Interval Partitioning in SQL Server 2008

September 12, 2010

Another of those features in Oracle that we do not have natively in SQL Server is interval partitioning, where new partitions are generated for you automatically, based on a defined interval, as data arrives. Interval partitioning is yet another form of dynamic partitioning, so the thought was: could this be achieved within SQL Server?
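For anyone who has not seen the Oracle feature, the declaration looks roughly like this (Oracle syntax, with made-up table and column names); partitions beyond the initial one are created automatically as data arrives:

-- Oracle interval partitioning (sketch only): a new weekly partition is
-- created automatically whenever a row arrives beyond the highest boundary.
CREATE TABLE weekly_data (
  id        NUMBER,
  some_date DATE
)
PARTITION BY RANGE (some_date)
INTERVAL (NUMTODSINTERVAL(7, 'DAY'))
(PARTITION p_initial VALUES LESS THAN (TO_DATE('2010-01-04', 'YYYY-MM-DD')));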

My initial thought was to use an INSTEAD OF trigger, which would intercept the incoming values and, where appropriate, extend the partition function. An initial look at the documentation suggests this will not be trivial – Books Online states:

"Additionally, the following Transact-SQL statements are not allowed inside the
body of a DML trigger when it is used against the table or view that is the
target of the triggering action. CREATE INDEX, ALTER INDEX, DROP INDEX, DBCC
REINDEX, ALTER PARTITION FUNCTION, DROP TABLE..."

The issue there is ALTER PARTITION FUNCTION: any response to an incoming piece of data that is not covered by the existing partition boundaries will need to alter the partition information. There are always ways around such restrictions, but on experimenting it seems that Books Online is not correct to list the ALTER PARTITION FUNCTION command among them.

There are a few caveats to the following snippets of code: I am not attempting to deal with the rolling window effect, or with the more complex storage patterns of a partitioned table that I have written about in other posts. This is purely designed to demonstrate that it can be achieved, not to provide a total solution to the problem; the partitions created on the fly will all go to the PRIMARY filegroup, for example.

So start with a very basic partition function and scheme:

CREATE PARTITION FUNCTION pfTest (datetime)
AS RANGE LEFT FOR VALUES ('20100104' , '20100111' , '20100118', '20100125')
CREATE PARTITION SCHEME psTest AS
PARTITION pfTest ALL TO ([PRIMARY])

And generate a table on the partition scheme:

CREATE TABLE IntervalTest (
MyID int identity(1,1) not null,   
MyField Varchar(200),
MyDate datetime
) ON psTest(MyDate)

The next step is an INSTEAD OF trigger, which has to intercept the incoming data from the inserted table and extend the partition function if required:

CREATE TRIGGER tr_IntervalTest ON IntervalTest INSTEAD OF INSERT
AS
 BEGIN     

  -- variables for the current maximum boundary, the highest incoming date
  -- and the number of weekly splits required
  DECLARE @max_part_dt datetime;
  DECLARE @max_inserted_dt datetime;
  DECLARE @weeks_to_add int;

  -- highest date value being inserted by this statement
  SET @max_inserted_dt = (SELECT max(MyDate) FROM inserted);

  -- highest boundary value currently defined on the partition function
  SET @max_part_dt = (
   SELECT max(convert(datetime, rv.value))
   FROM sys.partition_functions f
   INNER JOIN sys.partition_range_values rv ON f.function_id = rv.function_id
   WHERE f.name = 'pfTest');
IF (@max_inserted_dt > dateadd(D,7,@max_part_dt))
  BEGIN
   -- need to potentially add multiple partition splits, it depends on
   -- how many weeks in advance the new data is.
   -- get a whole number of the weeks to add to ensure that we cover 
   -- the new data
   SET @weeks_to_add = ceiling(datediff(D,@max_part_dt, @max_inserted_dt) / 7.0)

   -- loop around splitting the partition function with the new weekly values
   -- that need to be covered
   WHILE @weeks_to_add > 0
   BEGIN
    -- increase the maximum partition date by 7 days and split the function
    SET @max_part_dt = dateadd(D,7,@max_part_dt);

    ALTER PARTITION SCHEME psTest NEXT USED [Primary];

    ALTER PARTITION FUNCTION pfTest() SPLIT RANGE (@max_part_dt);
    SET @weeks_to_add = @weeks_to_add - 1
   END;
  END;

  -- finally insert the values
  INSERT INTO IntervalTest (MyField, MyDate)
  SELECT MyField, MyDate
  FROM inserted;  
END     

The code is pretty self-explanatory, but I would point out that it only covers an insert, not an update, and that this is not production code but an experiment to see whether it could be done (contrary to Books Online). To productionise this would require significant work on exception handling, performance tuning, handling multiple-filegroup partitioning, the list goes on – but all of it is achievable.
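As one small illustration of that extra work, routing new partitions to a dedicated filegroup rather than PRIMARY would only need the NEXT USED statement inside the loop to point at the relevant filegroup. A sketch only; [FG_Archive] is a made-up filegroup name, and deciding which filegroup each new week should use is the real work:

-- Sketch: inside the trigger's loop, direct the next split to a named
-- filegroup instead of PRIMARY. [FG_Archive] is a hypothetical filegroup
-- that would already have to exist in the database.
ALTER PARTITION SCHEME psTest NEXT USED [FG_Archive];
ALTER PARTITION FUNCTION pfTest() SPLIT RANGE (@max_part_dt);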

A little test, inserting a couple of values dated well ahead of the existing partition boundaries:

insert into IntervalTest (MyField, MyDate)
select 'b', '20100505'
union
select 'c', '20100606'

A check of the partition function values now shows 23 values in the partition function, instead of the original 4:

2010-01-04 00:00:00.000, 2010-01-11 00:00:00.000, 2010-01-18 00:00:00.000,
2010-01-25 00:00:00.000, 2010-02-01 00:00:00.000, 2010-02-08 00:00:00.000,
2010-02-15 00:00:00.000, 2010-02-22 00:00:00.000, 2010-03-01 00:00:00.000,
2010-03-08 00:00:00.000, 2010-03-15 00:00:00.000, 2010-03-22 00:00:00.000,
2010-03-29 00:00:00.000, 2010-04-05 00:00:00.000, 2010-04-12 00:00:00.000,
2010-04-19 00:00:00.000, 2010-04-26 00:00:00.000, 2010-05-03 00:00:00.000,
2010-05-10 00:00:00.000, 2010-05-17 00:00:00.000, 2010-05-24 00:00:00.000,
2010-05-31 00:00:00.000, 2010-06-07 00:00:00.000

It has clearly created the partitions to cover the data being inserted and then performed that insertion.
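For reference, the boundary values above can be listed with a query along these lines, reusing the catalog views the trigger already queries (a sketch):

-- List the boundary values currently defined on the pfTest partition function
SELECT convert(datetime, rv.value) AS boundary_value
FROM sys.partition_functions f
INNER JOIN sys.partition_range_values rv ON f.function_id = rv.function_id
WHERE f.name = 'pfTest'
ORDER BY rv.boundary_id;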

So it can be done, but the constant cost of intercepting every insert and update to provide this kind of dynamic partitioning is really not ideal, and whether it could be made efficient enough to work at the sort of scale at which partitioning tends to be used is debatable. I have a feeling it would struggle; I would need to be lent a suitably sized server and SAN to test that one and see whether it could be made efficient enough.

Blank Transactions

September 5, 2010

Busy time with a new addition to the household – sleep is clearly an optional parameter these days – but on to one of those oddities you might see in the transaction log. On many occasions you will see transactions in the log that have no operations: the individual transaction entry has just an LOP_BEGIN_XACT followed by an LOP_COMMIT_XACT, with no operations recorded in between.
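The log inspections in this post can be reproduced with the undocumented fn_dblog function; a minimal sketch of the kind of query involved (test databases only, since the function is undocumented and unsupported):

-- Show the active portion of the transaction log for the current database
SELECT [Current LSN], Operation, [Transaction ID], [Transaction Name]
FROM fn_dblog(NULL, NULL)
ORDER BY [Current LSN];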

So what causes these?
The immediate thought is a straight:

Begin Transaction
Commit Transaction

If you try that and inspect the log, you will notice that it has not added this mysterious zero-operation transaction. So that is not the cause.
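The examples that follow use a trivial table called test with a name column; its definition is not shown in the post, but something along these lines is assumed (the exact schema is a guess):

-- Hypothetical definition of the test table used in the examples below
CREATE TABLE test (
 id int identity(1,1) not null,
 name varchar(200) null
);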

How about a rolled-back transaction? You should already know that this is not the answer, since the log reserves space ahead of time during a transaction to record the undo operations needed for a rollback. To show that in more detail, take the following simple snippet of SQL:

begin transaction
insert into test (name) values ('a')
rollback transaction

The log then shows the transaction beginning, performing the insert, and then rolling it back by deleting the record and recording it as an aborted transaction (the entries below are listed newest first):

LOP_ABORT_XACT
LOP_DELETE_ROWS
LOP_INSERT_ROWS
LOP_BEGIN_XACT

So neither of the obvious first choices is the cause. The reason seems a bit bizarre, but centres around whether a change was attempted and then ignored because it was unnecessary. Setting up a test table and inserting that single row into it with a value of 'a', run the following statement:

begin transaction
update test set name = 'a'
rollback transaction

Now when you inspect the log, there is a blank transaction: it recorded the start and end of the transaction, but no operations are shown. The same is true whether the transaction is rolled back, as above, or committed.

If the code is altered slightly so that it deliberately cannot make a modification, though, the same does not hold true:

begin transaction
update test set name = 'a' where 1 = 0
commit transaction

Clearly that code is designed to make no modifications, so it is not surprising that no entry occurs in the transaction log. To make the test a bit fairer, let's design the code so that it might make a modification, but in practice does not:

begin transaction
update test set name = 'a' where name = 'b'
commit transaction

Still no entry in the transaction log.

So the distinction in the logging of these zero-op transactions is whether or not there was matching data to be altered: if a record was found but the alteration turned out to be unnecessary, a zero-op transaction appears in the log. It does make you wonder why.

It also means that, from an auditing perspective, the attempt to modify the data was not logged – not because it was unsuccessful, but because the data was being altered to the values it already had.