Archive

Archive for February 21, 2010

Decoding a Simple Update Statement Within the Transaction Log

February 21, 2010 3 comments

After decoding the delete statement within the transaction log, I thought I would tackle another statement – an update statement, although I am attempting the simple version of the update which is only to update a single value within the fixed width fields section. Again to assist in decoding the log entry, I have used the AdventureWorks database and know the values of the data before and after the modification, so it is easier to decode how the log entry is constructed.

As before the database has been backed up whilst in fully logged mode and the transaction log backed up to make it as small as possible. Once the database was ready I issued a simple statement against the database

update HumanResources.Employee
set MaritalStatus = 'M' where employeeid = 6

The employee with the ID  of 6 has changed marital status, congratulations employee id 6.

The first confusion is that two LOP_MODIFY_ROW entries show up against the LCX_Clustered context, for the HumanResources.Employee.PK_Employee_EmployeeID allocation unit / name, but only 1 modification has been made.

What is the second modification?

Against each LOP_MODIFY_ROW there is an [Offset in Row] column, which indicates where within the row the modification has been made. The first modification is recorded at byte 24, the second is at byte 57. Instead of providing a field number / name, the alteration is being recorded at a byte level, so to figure out which columns have changed, we must map out the structure in terms of byte positions for the row.

Offset Name Size in Bytes Value
0 Status Bits 2  
2 Column Number Offset 2  
4 EmployeeID 4 6
8 ContactID 4 1028
12 ManagerID 4 109
16 BirthDate 8 1965-04-19 00:00:00.000
24 MaritalStatus 2 S
26 Gender 2 M
28 HireDate 8 1998-01-20 00:00:00.000
36 SalariedFlag 1 1
37 VacationHours 2 40
39 SickLeaveHours 2 40
41 CurrentFlag 1 1
41 rowguid 16 E87029AA-2CBA-4C03-B948-D83AF0313E28
57 ModifiedDate 8 2010-02-20 18:13:03.710
65 Column Count 2  
67 Variable Column Count 2 3
69 Variable Offets 6  
75 NationalIDNumber 16 24756624
91 LoginID 44 adventure-works\david0
135 Title 100 Marketing Manager

Again, because there are no nullable columns at all, there is no nullability bitmap within the structure. This mapping is specific to this row, but the offsets for the fixed portion of the row would be valid for all the records, it is only the variable width columns that would change per data row.

The two modifications are at byte 24 and 57, which is clearly the Marital Status and the Modified Date. 

Why is the modified date being changed? The adventure works employee table has an update trigger on it, which alters the modified data column to be the current date when any change is made to the row.

For the Marital status transaction log entry, there are two values we need to logically know – what was it before the change, and what is the value after the change.

The Row Log Contents 0 and Row Log Contents 1 fields clearly provide that information

Row Log Contents 0 : 0x53
Row Log Contents 1 : 0x4D
select convert(nchar,0x53) = S
select convert(nchar,0x4D) = M

So they are clearly the before and after results.

Checking the transaction log entry against the trigger, the log row contents are dates within hex format, which are not too difficult to deal with

In Hex, and Row Log Contents  0 contains the previous modified date, although the SQL Management Studio results show the Hex has been shortened somewhat from the normal  8 bytes for a date. Padding the extra zeros in gives us:

0x 00 00 00 00 35 95 00 00 = 0x 00 00 95 35 00 00 00 00  = 2004-07-31 00:00:00.000

The same rules apply to the  RowLog Contents 1 which has also been cut short.

0x A5 F1 D6 00 24 9D 00 00 =  00 00 9D 24 00 D6 F1 A5 =  2010-02-21 13:02:35.217

Which is unsurprisingly today’s date and when I made the modification.

The immediate question is did the log contain two entries because there were two different causes of the change, or does it record a transaction log row per value modified. Since the Offset In Row value is within the transaction log row, you could guess at 1 entry per change.

To check, I restored the database to the starting position and issued a statement with two modifications:

update HumanResources.Employee
set MaritalStatus = 'M',
VacationHours = 60
where employeeid = 6

The transaction log alters signficantly, the modified date LOP_MODIFY_ROW entry with an offset of 57 still gets added, but the alteration of the two values does not produce 2 x LOP_MODIFY_ROW but a single LOP_MODIFY_COLUMNS – this appears to have a more complex format that is going to be tougher to crack, so that will have to wait until another day.