Rolling a Partition Forward – Part 2
The first part of this topic provided a mini-guide to loading data into a partitioned table and a few helpful DMV based statements that can help you automate the process. The unloading of the data should in theory be easier, but to do this in an automated fashion you are more reliant on the DMVs and system views to get to the right information.
The steps to unload a partition of data are:
- Discover which file group the oldest partition is on.
- Create a staging table on the same filegroup with an identical schema and indexes
- Switch the data out to the staging table
- Merge the partition function
- Archive / Drop the data as appropriate.
As in part 1, there are 3 sections of the process which are not so common, whilst the creation of a table and the archive / drop of the old data at the end is standard T-SQL that you will be using regularly.
Discover which Filegroup the Oldest Partition is On
When checking for the oldest filegroup, I have assumed that the basis of the rolling window is that the highest boundary is the most recent data, whilst the lowest boundary is the oldest – in essence time is moving forward and the partition key ascends, not descends. The oldest boundary will therefore be boundary 1, how do you get the Filegroup name of the filegroup this partition is on? A somewhat complex use of a set of DMV’s.
SELECT sys.filegroups.Name as FileGroupName FROM sys.partition_schemes INNER JOIN sys.destination_data_spaces ON sys.destination_data_spaces.partition_scheme_id = sys.partition_schemes.data_space_id INNER JOIN sys.filegroups ON sys.filegroups.data_space_id = sys.destination_data_spaces.data_space_ID INNER JOIN sys.partition_range_values ON sys.partition_range_values.Boundary_ID = sys.destination_data_spaces.destination_id AND sys.partition_range_values.function_id = sys.partition_schemes.function_id WHERE sys.partition_schemes.name = 'YourPartitionScheme' and sys.partition_range_values.boundary_id = 1
This will return the name of file group, which allows you to create the staging table for the partition switch out on the correct filegroup.
Whilst the data space ID’s do alter in sequence depending on the partition function being a left or right based partition, the boundary ID remains consistent, which is why it is used to discover the oldest and not the destination_id / data_space_id.
Switch the Data Out to the Staging Table
Switching the data out is not complex, it just is the reverse syntax of switching the partition in essence. Under the hood you are redirecting IAM pointers, so the switch is considered a meta-data command and exceptionally fast.
ALTER TABLE YourPartitionedTable SWITCH PARTITION 1 TO YourStagingTable
The partition number used is in effect the boundary id, and the oldest boundary is for partition 1 the rolling window.
Merge the Partition Function
The last complex stage is the merging of the partition function, the command explicitly needs the value from the partition function that represents the partition. If you were doing this by hand you would know it, but to automate the process requires the discovery of this information from the DMV’s again.
SELECT value FROM sys.partition_range_values INNER JOIN sys.partition_functions ON sys.partition_functions.function_id = sys.partition_range_values.function_id WHERE name = 'YourPartitionFunctionName' AND boundary_id = 1
Again, we are using the boundary value of 1 to extract only the oldest partition function value, but this can then be used in a partition function merge command.
ALTER PARTITION FUNCTION YourPartitionFunctionName() MERGE RANGE (YourBoundaryValue)
Conclusion
Using the DMV’s and appropriate stored procedures, the rolling window can be automated and does not require hand-crufted SQL to work – just use of the DMV’s to get the key values you need to be able to construct the harder parts of the process.
If you are following the guide on partition layout I wrote before, then the filegroup you have just removed the data from becomes the next spare filegroup to be used to house the next time data is imported. If you store this within the database, the next load will be able to automatically know where to place the data and set the next used filegroup value to, closing the loop so to speak.
Hi,
Can you please suggest what partitioning strategy is to be used for managing a large no:of user accounts?
Users and partitions have nothing in common, unless you are planning to use partitioning for multi-tenancy, which I do not believe is a good use of partitioning. More details here : https://sqlfascination.com/2010/01/05/