How Do You Decode A Simple Entry in the Transaction Log? (Part 1)
There is always a lot of interest in reading log records within SQL, although when you start getting into it you soon realise that it is an undocumented quagmire and quickly decide that there are better things you can do with your time.
It has taken a non-trivial amount of time to decode and has created a long documented process, so it will be in two parts, so I should caveat this undertaking with two clear statements first:
- There is no real reason you want to go poking around in the log except for ‘interest value’ – if you need to actually properly read log files then as you are about to see, using commercial applications will look very cheap in comparison to the time you would spend trying to do this. Even better – make sure you have proper backup’s so that you never need to try retrieve rows from the file!
- There is basically very little external documentation to guide you in reading an entry, I am learning it by doing it, and using what little documentation there is, along with trying some logical deductions and good understanding of internals from training – it is designed to act as an example of figuring the log entry out, but also demonstrating that this is not trivial and not something you will want to do.
That said, let’s start by trying to work out a single log record.
Using the adventure works example database, I have set it into fully logged mode and taken a backup to ensure that the transaction log is being used correctly. I’ve removed the foreign key constraints and triggers on the HumanResources.EmployeeTable, since I want to be able to delete a row and not trip over the example FK’s and trigger that prevents deletion.
To get to the log file values out, I am using the following snippet of SQL:
Select * from ::fn_dblog(null,null)
On initial inspection the log has only a not too many rows, altering the Diff Map, File Header etc. So I issue the command to delete a single row from the table.
Delete from HumanResources.Employee where EmployeeId = 1
And then select the contents of the log, and I have copied some of the fields from the output:
|Current LSN||Operation||Context||Transaction ID||Log Record Length||AllocUnitName|
|000001bd:000001d8:0004||LOP_SET_BITS||LCX_DIFF_MAP||0000:00000000||56||Unknown Alloc Unit|
|000001bd:000001d8:0006||LOP_SET_BITS||LCX_DIFF_MAP||0000:00000000||56||Unknown Alloc Unit|
|000001bd:000001d8:0009||LOP_SET_BITS||LCX_DIFF_MAP||0000:00000000||56||Unknown Alloc Unit|
That’s a lot of entries for a single delete – which is explained when you check the AllocUnitName, the delete has to also delete the entries within the indexes and the Adventure works table I am working against does indeed have 5 indexes, 4 Non-clustered and 1 Clustered. So that is making a lot of sense, and the whole operation is surrounded by a LOP_BEGIN_XACT and LOP_COMMIT_XACT, and we know from normal SQL terminology with SET XACT ABORT ON / OFF that it is about transactional scope and whether a whole transaction rolls back if a single item within the statement.
Let’s concentrate on the record deletion for the clustered index, with the larger log length of 288. That is a long record for what is in theory marking the row as a ghost, which suggests the row is within the log record. This is also backed up by the differing lengths for the other ghost record marks, which differ in size, just as the index row size does.
What are the interesting fields available to us from the Log record that we can pick up on straight away:
- Transaction ID : All the deletions belong to the same transaction ID, so we know they are related / part of the same transactional operation.
- Previous LSN : Each subsequent deletion in the list, shows the previous one as the previous LSN, so we can see the chain of log records.
- Page ID : We can tell which page was altered.
- Slot ID : Which slot within the page was altered.
- SPID : The LOP_BEGIN_XACT row shows the SPID that issued the command
- BeginTime : The LOP_BEGIN_XACT row shows the DateTime when the command was issued.
- Transaction Name : The LOP_BEGIN_XACT row shows the type of transaction, for this one shows DELETE.
- EndTime : The LOP_COMMIT_XACT row shows the end time of the transaction.
Interestingly, there is no indication as to who issued the command, there is a TransactionSID – which might turn out to be the account SID, but that will be a tangental investigation.
The final column to have a look at is RowLogContents, for the clustered index row deletion, rowlog contents 0 has the larger set of data to work on.
If that does make you go running for the hills, then you have a chance of decoding it.
That’s only 180 Bytes, but this is the log row contents, not the actual log entry itself, so the fixed log fields are not within it, let’s work on it. The structure of a row is a good place, since we know expect the structure of the row to be present within the log. The format is very nicely detailed in Kalen Delaney’s SQL Internal’s book so let’s use that.
- 2 Byte : Status Bits
- 2 Bytes: Offset to find number of columns
- X Bytes:Fixed Length Columns
- 2 Bytes: Total Number of Columns in the data row
- 1 Bit per column, Rounded up: Nullability Bitmap
- 2 Bytes:Number of Variable Length Columns within the data row
- 2 Bytes per variable length column : Row Offset marking the end of each variable length column
- X Bytes:Variable Length Columns
We can use this in Part 2 to decode the binary contents of the log record.