DateTime2 | SQL Fascination

Why Are Date Tricks in SQL 2005 a Problem in Waiting?

January 26, 2010 Andrew Hogg Leave a comment

One of the long time annoyances about the Date functions in SQL is that a number of them do not function as developer would wish, the prime example I’m going to focus on is the DateDiff function. The way it operates is very contrary to what a developer might expect or want from it – the function counts the number of boundaries that crossed for the specific units selected, not the number of whole units between the dates, as an example:

declare @date1 datetime
declare @date2 datetime
set @date1 = '20100101 23:00:00'
set @date2 = '20100102 01:00:00'
select datediff(d, @date1, @date2)

And the result is 1, since the units selected was days, the boundary line is predictably at midnight, so even though the time span is only 2 hours, it would count as 1 day – that is not intuiative. Now this is all documented so we cannot complain or grumble. If you wanted to know whether a full day has passed, you used hourly units instead and made sure you had the logic to deal with this.

All of this leaves you with a pretty bad resolution however, you can get the hours difference, but minutes and seconds are not available – so you have to datediff on those units and do some maths. It really makes for a ham-fisted way of getting a duration.

So people work around the problem by converting the date to the numeric equivalent and manipulating that directly.

declare @date1 datetime
set @date1 = '20100101 12:15:30'
select convert(float,@date1) 40177.5107638889

The decimal part represents the percentage through the day, which is not really how the underlying binary storage stores it, it uses a number to represent the number of 1/300ths of a second since the day started.

This format was very forgiving though, if you wanted to add a day, instead of using DateAdd, you could just add 1 to the number, very convienient.

It does however make it easier to create a pseudo-timespan by deducting one date’s numeric representation from another dates, although the code is somewhat long-winded. As a side note, make sure you convert to float and not real – real has not go sufficient accuracy for this to work.

declare @date1 datetime
declare @date2 datetime
set @date1 = '20100101 12:00:00'
set @date2 = '20100102 13:15:35'
declare @result float
set @result = convert(float,@date2) - convert(float,@date1) declare @DurationDays float
declare @DurationTime float
declare @DurationHours float
declare @DurationMinutes float
declare @DurationSeconds float
set @DurationDays = floor(@result)
set @DurationTime = (@result - floor(@result) )
set @DurationTime = @DurationTime * 86400 set @DurationHours = floor(@DurationTime / 3600)
set @DurationTime = @DurationTime - @DurationHours * 3600
set @DurationMinutes = floor(@DurationTime/60)
set @DurationTime = @DurationTime - @DurationMinutes * 60
set @DurationSeconds  = @DurationTime
select @DurationDays as Days,  @DurationHours as Hours ,  
@DurationMinutes as Minutes,  @DurationSeconds as Seconds

Days              Hours              Minutes           Seconds
----------------- ------------------ ----------------- -----------------
1                 1                  15                35.0000002188608

Bit of a hack and was it really any shorter or better? Debatable. Whilst it can get time span information out, when used within SQL 2008 using the new datetime2 types, the wheels fall off:

declare @date1 datetime2(7)
set @date1 = '20100101 12:00:00'
select convert(float,@date1)
Msg 529, Level 16, State 2, Line 3 Explicit conversion from data type datetime2 to float is not allowed.

And that is where the problem comes in – the new datetime2 types will not allow the date to be converted to a number, and a number of these developer tricks no longer work.

Most if not all the tricks can be re-written to use multiple date functions with some mathematical logic – and it can be done without the code ever knowing if it had been embedded within a function / stored procedure. Where you would see a less transparent move to the datetime2 data types would be where developers had embedded some of the tricks directly into ad-hoc SQL, that will fail if the type is altered. In an ideal world, is that the code never contains these techniques of course, but we do not all live in that nirvana.

So on the one hand Datetime2 gives great accuracy and can reduce storage, but on the other hands, the tricks used in the past to deal with the inadequacies of the built-in Date functions no longer work.

What I would really like is a revamp of the Date functions and the introduction of a time span type – could be a long wait.

Categories: SQL Server Tags: Best Practise, DateTime2, SQL Server 2005, SQL Server 2008

What is the SQL Server 2008 DateTime2 Internal Structure?

October 11, 2009 Andrew Hogg 2 comments

SQL Server has a number of new date time formats, but the one I am most interested in is DateTime2. The internal format of the SQL DateTime is commonly mistaken as 2×4 byte integers, with the latter integer being milliseconds since midnight. It is in fact the number of 1/300ths of a second since midnight which is why the accuracy of the DateTime within SQL Server has historically been 3.33ms. (If you really want to see it, crack it open by converting it to a binary, adding 1 and re-converting, you add 3.33ms, not 1 ms.)

So DateTime2 must use a different format, and as a weekend exercise that had no purpose than understanding the internals I thought I’d take a look. I have not seen the information in the BoL or posted as yet, so might be of use. I am starting with the DateTime2(7) and looking at the maximum accuracy structure. The code used to crack it open each time is basically as follows:

declare @dt datetime2(7)
set @dt = '2000/01/01 00:00:00'
declare @bin varbinary(max)
set @bin = CONVERT(varbinary(max), @dt)

To make my life easier, SQL conveniently outputs all the values as hexi-decimal numbers. The results are not what you would expect.

0x07000000000007240B

The date which traditionally occupied the first 4 bytes, clearly is occupying the last few bytes. So the format is not going to be obvious or simple. Interestingly the returned result is 9 bytes, but the length is quoted as 8. It is returning 8 when checked using the length, that first byte is somewhat odd to make an appearance. It’s also suspiciously the accuracy value, and with a few tests using a change of accuracy, it show that value changes. So the first pseudo-byte is the accuracy indicator.

To start figuring out some more, let’s take the time back to the beginning point, which in this case is not 1900/01/01 but 0001/01/01 which when converted gives us:

'0001/01/01 00:00:00' => 0x070000000000000000

Start incrementing the day portion and there is an obvious pattern, the 6th byte changes.

'0001/01/02 00:00:00' => 0x070000000000010000
'0001/01/03 00:00:00' => 0x070000000000020000
'0001/01/31 00:00:00' => 0x0700000000001E0000

As you try the 2nd month, to check where the month is, the same byte alters, so it represents days, not specific date parts. Is it the number of days since the beginning of the year? No.

'0001/02/01 00:00:00' => 0x0700000000001F0000

If it was, there would be an issue since 1 byte does not represent enough values, as we can see, FF occurs on the 13th of September, and then it rolls over and puts a 1 in the 7th Byte position.

'0001/09/13 00:00:00' => 0x070000000000FF0000
'0001/09/14 00:00:00' => 0x070000000000000100
'0001/09/15 00:00:00' => 0x070000000000010100

It rolls over, then carries on as before. This immediately suggests the next test, to roll over the year, and the pattern continues.

'0001/12/31 00:00:00' => 0x0700000000006C0100  
'0001/12/31 00:00:00' => 0x0700000000006D0100

So the format is just counting, we see it in the example as hex, but it is a straight number count going on but the hex values are left-to-right. Only 2 bytes are used so far, which do not represent enough day combinations, add the third byte in by going past 180 years:

'0180/06/06 00:00:00' => 0x070000000000FFFF00  
'0180/06/07 00:00:00' => 0x070000000000000001

So the final byte is then increased, so the number of combinations becomes 16777215 – that seems a lot better and certainly going to cover the range required.

'2001/01/01 00:00:00' => 0x07000000000075250B

So that is the final 3 bytes decoded, a simple pattern – and provides the template of how the time is also stored.

'0001/01/01 00:00:00.0000000' => 0x070000000000000000
'0001/01/01 00:00:00.0000001' => 0x070100000000000000
'0001/01/01 00:00:00.0000255' => 0x07FF00000000000000
'0001/01/01 00:00:00.0065535' => 0x07FFFF000000000000
'0001/01/01 00:00:00.0065536' => 0x070000010000000000

So to check whether the format is the same,

'0001/01/01 00:00:00.9999999' => 0x077F96980000000000

Decode that again and it all matches:

select (152 * 256 * 256) + (150 * 256) + 127
-----------
9999999

When we click over into 1 second exactly, we increment the first byte by 1, so the time portion is still represented in 100ns intervals, with the normal system of each byte counting up 1 every time the previous byte rolls over. As we get to the limit of the 3 bytes, it rolls into the 4th and then the 5th.

'0001/01/01 00:00:01.0000000' => 0x078096980000000000

So the internal format of the DateTime2(7) is decoded, not difficult but it is an interesting choice – it is now a straight binary number, with the Least Significant Byte being on the Left, the Most Significant being on the right (for each section.) Within the byte however, to convert it you must still read it right-to-left.

The first 5 bytes are recording how many time units intervals have passed since midnight, and the last 3 bytes recording how many days have passed since 0001/01/01.

The time unit intervals are dictated by the accuracy of the number, 100ns for DateTime2(7), and 1 Micro second intervals for a DateTime2(6) etc. The way in which you interpret it does not change, but the units you are multiplying the time portion by, alters based on the accuracy.

You could construct 2 dates that are identical at a binary level, but due to the field meta-data on accuracy, they do not represent the same date time.

declare @dt1 dt1 datetime2(6)
set @dt1 = '0001/01/01 00:00:00.000001'
declare @dt2 datetime2(7)
set @dt2 = '0001/01/01 00:00:00.0000001'

0x060100000000000000
0x070100000000000000

And that is perhaps why on output they automatically have prefixed the binary value with the datetime accuracy, so that they are not entirely identical? I’m not sure but would be interested to find out.

Categories: SQL Server Tags: DateTime2, Internals, SQL Server 2008

SQL Fascination

Archive

Why Are Date Tricks in SQL 2005 a Problem in Waiting?

What is the SQL Server 2008 DateTime2 Internal Structure?

Recent Posts

Blogroll

Email Subscription

Archives