Archive

Archive for September, 2009

How to Remember the Next Used Filegroup in a Partition Scheme

September 30, 2009 6 comments

Within SQL, the partitioned table feature provides an excellent way to store and roll forward data windows across large datasets without incurring huge loading and archiving penalties to the main table.

The process of loading, manipulating and decommissioning data from a partitioned table will need posts of their own – due to the depth of the information available and required as well as the detail needed to understand some of the best practises I have learnt from the field.

This entry is specifically relating to the ‘Next Used’ aspect of dealing with a partitioned table – for some time I wanted to know how could I tell what partition had been set as next used? There seems to be no record of the value – the BoL lists the Next Used facility as:

Specify the filegroup to be marked by the partition scheme as NEXT USED.

This is true in a simplistic sense, but the marking can not be at the filegroup level since the relation from partition schemes to filegroups can be many-to-many, so no single marking on a file group could suffice. I had never been able to figure out where to find that marking, and find a way to read what had been set.

I should add, that you really should not set the next used in one location of the code / stored proc and then perform the split of the partition function in another, it would be far safer to do them together, so the need to actually find it out is really borne out of investigative necessity in trying to understand for a given partition scheme that implements a rolling window that started to go bad, where does it think it should be splitting the partition to?

So the problem remained; it was given to Paul S. Randal  to figure out how we could see / infer this information outside of the dedicated admin console. He figured out the starting point in terms of which value in which DMV to start the solution with, and I ran with it from there to create a relatively easy way to get to it. So credit to Paul for finding out where to start on the issue.

So down to the scripts and some test cases:

There are 2 types of partition scheme to test:

  • Partition Function as Left.
  • Partition Function as Right.

When defining the scheme initially, you can also define it in two ways:

  • Partition scheme defined with the next used pre-set value.
  • Partition scheme defined with no pre-set next used value.

So there are 4 combinations to test initially, since we do not need data for this or are worried about the query plan, the main work is pure schema creation and checking.

First script is to just create a test database, nothing significant:

CREATE DATABASE[NextUsedTest] ON PRIMARY
( 
NAME = N'NextUsedTest', 
FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL
\DATA\NextUsedTest.mdf' , 
SIZE = 3072KB , 
FILEGROWTH = 1024KB 
), 
FILEGROUP[PartFG1] ( 
NAME = N'PartFile1', 
FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL
\DATA\PartFile1.ndf' , 
SIZE = 3072KB , 
FILEGROWTH = 1024KB ), 
FILEGROUP[PartFG2] ( 
NAME = N'PartFile2', 
FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL
\DATA\PartFile2.ndf' , 
SIZE = 3072KB , 
FILEGROWTH = 1024KB 
), 
FILEGROUP[PartFG3] ( 
NAME = N'PartFile3', 
FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL
\DATA\PartFile3.ndf' , 
SIZE = 3072KB , 
FILEGROWTH = 1024KB 
), 
FILEGROUP[PartFG4] ( NAME = N'PartFile4', 
FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL
\DATA\PartFile4.ndf' , 
SIZE = 3072KB , 
FILEGROWTH = 1024KB 
), 
FILEGROUP[PartFG5] ( 
NAME = N'PartFile5', 
FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL
\DATA\PartFile5.ndf' , 
SIZE = 3072KB , 
FILEGROWTH = 1024KB 
)
 LOG ON
( NAME = N'NextUsedTest_log', 
FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL
\DATA\NextUsedTest_log.ldf', 
SIZE = 1024KB , 
FILEGROWTH = 10%
) GO

Now the database is in place, lets create the partition functions /schemes:

CREATE PARTITION FUNCTION [pfLeftNoSpare](int) 
AS RANGE LEFT FOR VALUES (200801, 200802, 200803, 200804)

CREATE PARTITION SCHEME [psLeftNoSpare] 
AS PARTITION[pfLeftNoSpare] 
TO([PartFG1], [PartFG2], [PartFG3], [PartFG4], [Primary])

CREATE PARTITION FUNCTION [pfLeftWithNextUsedSet](int) 
AS RANGE LEFT FOR VALUES (200801, 200802, 200803, 200804)

CREATE PARTITION SCHEME [psLeftWithNextUsedSet] 
AS PARTITION [pfLeftWithNextUsedSet] 
TO([PartFG1], [PartFG2], [PartFG3], [PartFG4], [Primary], [PartFG5])

CREATE PARTITION FUNCTION [pfRightNoSpare](int) 
AS RANGE RIGHT FOR VALUES (200801, 200802, 200803, 200804)

CREATE PARTITION SCHEME [psRightNoSpare] 
AS PARTITION[pfRightNoSpare] 
TO([Primary], [PartFG1], [PartFG2], [PartFG3], [PartFG4])

CREATE PARTITION FUNCTION [pfRightWithSpareNextUsedSet](int) 
AS RANGE RIGHT FOR VALUES (200801, 200802, 200803, 200804)

CREATE PARTITION SCHEME [psRightWithSpareNextUsedSet] 
AS PARTITION[pfRightWithSpareNextUsedSet] 
TO([Primary],[PartFG1], [PartFG2], [PartFG3], [PartFG4],  [PartFG5])
GO

Four partition functions and schemes, for the left schemes the next used file group appears on the far right when the next used it pre-set. When you use this mechanisms to preset, it issues a message for the operation:

Partition scheme 'psLeftWithNextUsedSet' has been created successfully. 
'PartFG5' is marked as the next used filegroup 
in partition scheme 'psLeftWithNextUsedSet'. 
Partition scheme 'psRightWithSpareNextUsedSet' has been created successfully. 
'PartFG5' is marked as the next used filegroup 
in partition scheme 'psRightWithSpareNextUsedSet'.

The trick to finding the next used is to look for the mismatching record – if you join the partition function values to the partition scheme there are scheme entries that do not have a corresponding partition function value, since it has not been set by performing the split.

Lets start with the Left based partition and compare the two.

 select FG.Nameas FileGroupName
    , dds.destination_id
    , dds.data_space_id
    , prv.value, ps.Name
 from sys.partition_schemes PS
 inner join sys.destination_data_spaces as DDS 
    on DDS.partition_scheme_id = PS.data_space_id
 inner join sys.filegroups as FG 
    on FG.data_space_id = DDS.data_space_ID 
 left join sys.partition_range_values as PRV 
    on PRV.Boundary_ID = DDS.destination_id and prv.function_id=ps.function_id 
 where PS.name = 'psLeftNoSpare'

The output is:

FileGroupName destination_id data_space_id value Name
PartFG1 1 2 200801 psLeftNoSpare
PartFG2 2 3 200802 psLeftNoSpare
PartFG3 3 4 200803 psLeftNoSpare
PartFG4 4 5 200804 psLeftNoSpare
PRIMARY 5 1 NULL psLeftNoSpare

And check the other Left defined partition:

 select FG.Name asFileGroupName
    , dds.destination_id
    , dds.data_space_id
    , prv.value, ps.Name  
from sys.partition_schemes PS
 inner join sys.destination_data_spaces as DDS 
    on DDS.partition_scheme_id = PS.data_space_id
 inner join sys.filegroups as FG 
    on FG.data_space_id = DDS.data_space_ID 
 left join sys.partition_range_values as PRV 
    on PRV.Boundary_ID = DDS.destination_id and prv.function_id=ps.function_id 
 where PS.name = 'psLeftWithNextUsedSet'

Results:

FileGroupName destination_id data_space_id value Name
PartFG1 1 2 200801 psLeftNoSpare
PartFG2 2 3 200802 psLeftNoSpare
PartFG3 3 4 200803 psLeftNoSpare
PartFG4 4 5 200804 psLeftNoSpare
PRIMARY 5 1 NULL psLeftNoSpare
PartFG5 6 6 NULL psLeftNoSpare

And the difference appears, the next used appears as the highest destination ID, but only when there are two data spaces assigned to the partition that do not have a value. The ‘primary’ entry that shows up as null is due to the partition scheme running from -ve infinity to +ve infinity, and whilst 4 lines are drawn on the number line, that divides the number line into 5 sections.

Running the same statements for the partitions declared using the right shows similar results, but the Primary entry is given destination 5 whilst the spare shows on 6 again.

The query is slightly awkward in that it must pick up the second entry of the list if it exists, using ‘orderings and tops’ will not pull the result we need.

select FileGroupName, Destination_ID, Data_Space_ID, Name  from
(
  select  FG.Name as FileGroupName
   , dds.destination_id
   , dds.data_space_id
   , prv.value
   , ps.Name
   , RANK() OVER (PARTITION BY ps.name order by dds.destination_Id) as dest_rank
  from sys.partition_schemes PS
  inner join sys.destination_data_spaces as DDS 
    on DDS.partition_scheme_id = PS.data_space_id
  inner join sys.filegroups as FG 
    on FG.data_space_id = DDS.data_space_ID 
  left join sys.partition_range_values as PRV 
    on PRV.Boundary_ID = DDS.destination_id and prv.function_id=ps.function_id 
  where prv.Value is null
 ) as a
 where dest_rank = 2

Results:

FileGroupName destination_id data_space_id Name
PartFG5 6 6 psLeftWithNextUsedSet
PartFG5 6 6 psRightWithSpareNextUsedSet

To test whether it picks up setting the next used, let’s set it on the partitions that did not previously have it.

ALTER PARTITION SCHEME psLeftNoSpare NEXT USED [PartFG5]
ALTER PARTITION SCHEME psRightNoSpare NEXT USED [PartFG5]

And re-run the query

FileGroupName destination_id data_space_id Name
PartFG5 6 6 psLeftNoSpare
PartFG5 6 6 psLeftWithNextUsedSet
PartFG5 6 6 psRightNoSpare
PartFG5 6 6 psRightWithSpareNextUsedSet

To make it re-usable, I switched the query into a view

create view NextUseFileGroups
as
select FileGroupName, Destination_ID, Data_Space_ID, Name
 from
 (
   select  FG.Name as FileGroupName
    , dds.destination_id
    , dds.data_space_id, prv.value, ps.Name,
   RANK() OVER (PARTITION BY ps.name order by dds.destination_Id) as dest_rank
   from sys.partition_schemes PS
   inner join sys.destination_data_spaces as DDS
     on DDS.partition_scheme_id = PS.data_space_id
   inner join sys.filegroups as FG
     on FG.data_space_id = DDS.data_space_ID 
   left join sys.partition_range_values as PRV
     on PRV.Boundary_ID = DDS.destination_id and prv.function_id=ps.function_id 
   where prv.Value is null
 ) as a
 where dest_rank = 2

And a final check with removing the setting – you can blank a set next used value by specifying no value in the statement.

ALTER PARTITION SCHEME psLeftNoSpare NEXT USED
ALTER PARTITION SCHEME psRightNoSpare NEXT USED

Select from the view and the two file partition schemes / file groups no longer show up in the list as intended.

So finding out the ‘next used’ setting is possible, although there really is no need in normal operation of the partition window to have to find out, but as an investigative tool it could be useful.

The scripts were tested on both 2005 and 2008, so are good for both in terms of testing, or using the view.

When is MaxDop not MaxDop?

September 27, 2009 Leave a comment

MaxDop is in some sense a bit of a misnomer, in that you would think ‘Max Degree of Parallelism’, set by the system administrator would be the last and final word on the matter; That is your maximum, and there are no choices to be made.

However, whilst at the SQL Immersion event in Dublin hosted by Prodata I had made an off hand comment about increasing the thread count beyond the MaxDop setting whilst creating an on-line index on an OLTP based system that had MaxDop 1 set.

That gained me some quizzical looks, in that most assume MaxDop is set and has to be adhered to, so surely what I was indicating was not possible? Well – yes it is, and there is even a KB that relates to it and the difference between SQL Server 2000 /2005 and 2008. http://support.microsoft.com/default.aspx/kb/329204

I should also mention that the BoL is less than precise about the situation, against SQL 2008 the BoL states for ‘Max Degree of Parallelism Option’:

 Set max degree of parallelism to 1 to suppress parallel plan generation.

 And against the ‘Degree of Parallelism Page’ :

For example, you can use the MAXDOP option to control, by extending or reducing, the number of processors dedicated to an online index operation. In this way, you can balance the resources used by an index operation with those of the concurrent users.

So that’s entirely clear…

You can infer (aside from the KB) that something is not what it seems when  it states  ‘by extending or reducing’ – how can you extend it? that would not make logical sense if it was a hard limit.

So we have a slightly bizarre situation in which you can override the server level setting with your own value. Initial thought is, this is a bit dangerous isn’t it?

Would you want the average user or developer to start implementing the appropriate query hint and override you? They would get more CPU so they wouldn’t hesitate (but not guarenteed better performance for it, since they would all be doing it)

To perform a test, I used a large partitioned table since the parallelism that occurs when querying a partitioned table is far easier to predict and engineer specifix scenarios to. The table had ~688 million rows in it and the selection was asking for a simple row count from 2 of the available partitions. The server used was an 8-core.

Select Count(*) From LargePartitionedTable Where MyPartitionColumn in (200801, 200901)

The baseline case was setting the server to a maxdop of 1

sp_configure 'max degree of parallelism', 1
go
reconfigure
go

The query plan is not particularily complex and the two key parts are that the clustered index scan. 

MaxDop1QueryPlanMaxDop1ScanProperties

 The baseline case of using MaxDop 0 was then set up and the same query run. 

sp_configure 'max degree of parallelism', 0
go
reconfigure
go

MaxDop0QueryPlanMaxDop0ScanProperties

As expected, the query ran up 8 threads (8 core machine) and only 2 of the threads processed any work, since SQL 2005 threading model is to assigned 1 thread per partition, when more than 1 partitioned is requested.

So the base scenarios are in place, now the query is altered slightly to override the MaxDop upwards.

sp_configure ‘max degree of parallelism’, 1
go
reconfigure
go
Select Count(*) From tblODS Where TimelineID in (200801, 200802) Option (maxdop 4) 

If the override works, the thread count will go up from 1, and as per the KB article it does.

MaxDop4QueryPlanMaxDop4ScanProperties

So the MaxDop set by an administrator does not need to be obeyed – but does this apply to everyone? That is really the key issue and unfortunately it does not appear to fall the way you would wish it to. Placing the SQL into Mixed Authentication mode I created a user account, and gave it only the permission to select from the table and the show plan permission, nothing else.

Result?  The query parellelled again, indicating there is no special permission required for a user to overide the setting.

The facility to override it is useful, on an OLTP where you want to increase the indexing speed by using more threads it is essential – but to not require it as a granted permission to the service accounts / jobs performing those tasks – that seems slightly bizarre.

These tests were performed using SQL 2005, so I need to test SQL 2008 to see if the lack of controls continue to exist.