A Strange Case of a Very Large LOP_FORMAT_PAGE
I have been involved in a long running case with a client, trying to establish why an ETL process of a couple of gig of data, would result in over 5 gig of transaction log space being used.
The database was set in bulk mode, and confirmed as being correctly in bulk mode with appropriate full backups and transaction log backups occurring – so how could the log be expanding so rapidly for such a simple import. One clue was that this was only occurring on a single system, and that no other system had ever witnessed this behaviour, so it had to be something environmental.
The output from the transaction log captured during a load window was sent to me and some quick aggregation of the rows show that there was close to 280k LOP_FORMAT_PAGE entries, but instead of the normal log entry size of 84 bytes, they were coming in at 8276 bytes. This was due to the log record overhead but basically meant that for each page being allocated by the inload, an entry was added larger than the page just added.
I contacted Paul Randal to ask why the log entry would get allocated this large and he identified it as not being in bulk logged mode – which made sense given the log expansion but there was little proof. Examining an individual LOP_FORMAT_PAGE entry I could see a clear repeating pattern within the hex data – which would be consistent to a page with rows within it.
One field of what was being imported was a known value for every row (the data related to a specific week number), and translating that value into hex, it was visible multiple times. Using the breakdown of a row and that this value was field number 3 the row could be decoded and actual values of a row imported reconstructed.
Using these values the main database was queried and that row did indeed exist – the data has 27 decimal values within it also, so the chances of a random match are incredibly small. This proved that the log was recording every single row, even though it was in bulk logged mode.
There are a number of things that can prevent bulk logged mode from operating correctly, and on examining the list the problem became clear to the client’s DBA; the main database backup was still running at the same time as the data inload was being scheduled. The bulk log operations have to be fully logged during the backup to ensure that the database can be returned to a consistent state once it has been restored.
It is the only time I have seen the LOP_FORMAT_PAGE taking up this much space in a transaction log, and if anyone else gets stuck on a similar problem, google should at least find something this time.