Archive

Archive for October 4, 2009

Do Filtered Index Have Different Tipping Points?

October 4, 2009 Leave a comment

I’ve been taking a look at filtered indexes and they do initially look particularly useful, but I was wandering what effect the filtered index has on the tipping point. The tipping point was described by Kimberley L. Tripp in an excellent post – and describes the point at which SQL opts to change the query plan strategy from a NC index seek to a scan of some kind (Table / Clustered Index)

So I was wondering whether Filtered indexes chose a different strategy, or would they ‘tip’ from a targeted index to a table scan in the same way.

First off was to create a dummy table:

create tabletblPersonnel ( 
PersonnelID Int Identity, 
FirstName Char(30), 
LastName Char(30),
Department Char(30), 
SomePadding Char(250)
)

This was deliberately created as a heap, since data had to be generated and loaded:

insert into tblPersonnel(firstname,lastname,Department) values( dbo.randomstring(20), dbo.randomstring(20), dbo.randomstring(20)); go 1000000

The dbo.randomstring is a user created function I added to generate a random string [A-Z] the length of the function parameter, so the insert statement is inserting random strings of length 20 into the first name, last name and department fields.

Once the data was loaded, a clustered index was applied.

alter table dbo.tblPersonnel add constraint
PK_tblPersonnel primary key clustered(PersonnelID) with( statistics_norecompute = off, ignore_dup_key = off, allow_row_locks = on, allow_page_locks = on) on [PRIMARY]
GO

A quick check of the table:

select COUNT(*) from tblpersonnel
-----------
1000000
(1 row(s) affected)

To get a rough idea of the tipping point we need to know the number of pages at the leaf level of the table. 

select * from sys.objects where name ='tblpersonnel'
-- grab the object id and query the physical stats.
select * from sys.dm_db_index_physical_stats(DB_ID('FilteredIndexTest'), 1117247035 , 1,0,'DETAILED')
-- 45455 pages shown in the table at leaf level 0 i.e. the clustered index

So create an NC index on the first name, last name.

create nonclustered index[IX_Normal] ON [dbo].[tblPersonnel] (  [FirstName] asc,  [LastName] asc
) with( pad_index = off ,statistics_norecompute = off, ignore_dup_key = off, allow_row_locks = on, allow_page_locks = on) on [PRIMARY]
GO

For reference, we will also check how many pages the NC index has:

select * from sys.dm_db_index_physical_stats(DB_ID('FilteredIndexTest'), 1117247035 , 2,0,'DETAILED')
-- Level 0 of the index shows 8696 pages

The tipping point is expected between 25% to 33% of the number of pages when expressed as rows, so something between 11363 to 15000 rows is where to expect it. Since the data was random when inserted, a bit of human binary chop is needed to find the tipping point. After some experimentation – the dividing point was between the firstname like ‘a[a-h]%’ and ‘a[a-i]%’

select * from tblpersonnel
where firstname like 'a[a-h]%'
-- 11898 rows - seeks
select * from tblpersonnel
where firstname like 'a[a-i]%'
-- 13299 rows - scans

So the tipping point showed up within the expected range, so next I created the filtered index based on FirstName ‘>=a’ and ‘<= ak’ since I can not use a like clause in the Filtered Index where statement.

create nonclustered index [IX_Filter] ON[dbo].[tblPersonnel] (  [FirstName] asc,  [LastName] asc
)
where[FirstName] >= 'a'and firstname <= 'ak'
) with( pad_index = off ,statistics_norecompute = off, ignore_dup_key = off, allow_row_locks = on, allow_page_locks = on) on [PRIMARY]
 GO

The filtered index is in play, so I disabled the non-filtered index and re-ran the same queries.

select * from tblpersonnel
where firstname like 'a[a-h]%'
-- 11898 rows - seeks
select * from tblpersonnel
where firstname like 'a[a-i]%'
-- 13299 rows - scans

Conclusion: The tipping point did not alter based on the filtered index – it still tips based on the ratio of rows to total pages in the leaf level of the clustered index.