<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>SQL Fascination &#187; Andrew Hogg</title>
	<atom:link href="http://sqlfascination.com/author/andrewhogg/feed/" rel="self" type="application/rss+xml" />
	<link>http://sqlfascination.com</link>
	<description>Weirdness and oddities within SQL</description>
	<lastBuildDate>Tue, 22 May 2012 22:43:53 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='sqlfascination.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>SQL Fascination &#187; Andrew Hogg</title>
		<link>http://sqlfascination.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://sqlfascination.com/osd.xml" title="SQL Fascination" />
	<atom:link rel='hub' href='http://sqlfascination.com/?pushpress=hub'/>
		<item>
		<title>Oracle : Recursive Common Table Expressions and Parallelism</title>
		<link>http://sqlfascination.com/2012/04/14/oracle-recursive-common-table-expressions-and-parallelism/</link>
		<comments>http://sqlfascination.com/2012/04/14/oracle-recursive-common-table-expressions-and-parallelism/#comments</comments>
		<pubDate>Sat, 14 Apr 2012 13:01:28 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[CTE]]></category>
		<category><![CDATA[Oracle 11gr1]]></category>
		<category><![CDATA[Oracle 11gr2]]></category>
		<category><![CDATA[Parallelism]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[RCTE]]></category>
		<category><![CDATA[Recursive Subquery Refactoring]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=621</guid>
		<description><![CDATA[Sometimes, no matter how hard you search you just can&#8217;t find an answer &#8211; that was the problem this week. Oracle&#8217;s recursive common table expressions (RCTE), or Recursive Sub Query Refactoring to put it in Oracle&#8217;s terms were proving to be pretty bad on performance. (Hopefully, the next person searching will now find this answer.) [...]]]></description>
			<content:encoded><![CDATA[<p>Sometimes, no matter how hard you search, you just can&#8217;t find an answer &#8211; that was the problem this week. Oracle&#8217;s recursive common table expressions (RCTEs), or recursive subquery factoring to put it in Oracle&#8217;s terms, were proving to be pretty bad on performance. (Hopefully, the next person searching will now find this answer.)</p>
<p>As features go, this one should be relatively well known &#8211; it&#8217;s part of the ANSI SQL-99 standard and available in a number of RDBMSs, with near-identical syntax across implementations.</p>
<p>Even the esteemed Mr Kyte has changed his position on RCTEs, from regarding them as more code and harder to understand than the CONNECT BY syntax to regarding them as a somewhat useful feature.</p>
<p>So what was the question to which we could find no answer?</p>
<p>Why does an RCTE seem to ignore parallel hints?</p>
<p>Amazingly, we couldn&#8217;t find anything documented about this, either against RCTEs themselves or in the parallelism sections of the documentation. No restrictions on parallelism for RCTEs are mentioned anywhere.</p>
<p>We have quite a complex example but needed a simple scenario to submit to Oracle to get an answer. Kudos for the krufting of this goes to Phil Miesle &#8211; it was his turn to deal with Oracle support.</p>
<p>First, create a numbers table and fill it with data; we even used an RCTE to do that part.</p>
<pre>create table vals (n ,constraint vals_pk primary key (n) ) 
as 
with numbers(n) as (
select 1 as n    
from dual   
union all  
select n+1 as n    
from numbers   
where n &lt; 1000000 
)
select n from numbers;</pre>
<p>We now need a data table that basically acts as a hierarchy for us to test the RCTE against. A simple parent / child table will suffice:</p>
<pre>create table pairs (
  parent 
  ,child 
  ,constraint pairs_pk primary key (parent,child) ) 
as 
select 
  'B'||to_char(mod(n,100000)+1) as parent      
  ,'A'||to_char(n) as child   from vals;</pre>
<p>Using the numbers table, the table now contains A1 to A1000000 linking to B1 to B100000 &#8211; so in effect, every B has 10 A&#8217;s linked to it.</p>
<p>This then continues, with every 10 B&#8217;s linking to a C:</p>
<pre>insert into pairs 
select distinct        
  'C'||to_char(mod(n,10000)+1) as parent      
   ,'B'||to_char(mod(n,100000)+1) as child  
from vals;</pre>
<p>And so on, with each successive layer having a 10 to 1 ratio:</p>
<pre>insert into pairs 
select distinct       
  'D'||to_char(mod(n,1000)+1) as parent      
  ,'C'||to_char(mod(n,10000)+1) as child  
from vals;

insert into pairs 
select distinct       
  'E'||to_char(mod(n,100)+1) as parent       
  ,'D'||to_char(mod(n,1000)+1) as child  
from vals;

insert into pairs 
select distinct        
  'F'||to_char(mod(n,10)+1) as parent      
  ,'E'||to_char(mod(n,100)+1) as child  
from vals;

insert into pairs 
select distinct       
  'G'||to_char(mod(n,1)+1) as parent      
  ,'F'||to_char(mod(n,10)+1) as child  
from vals;
commit;</pre>
<p>And finally we have G1 linking to F1 to F10 &#8211; so the structure is clearly a very basic tree.</p>
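<p>A quick sanity check on the load, as a sketch assuming the statements above ran unchanged against an empty <code>pairs</code> table: grouping by the first letter of the parent key should show each layer holding ten times fewer links than the one below it.</p>
<pre>-- Count the links held at each level of the hierarchy.
-- If the inserts above ran as written, the expected link counts are:
-- B = 1000000, C = 100000, D = 10000, E = 1000, F = 100, G = 10
select substr(parent,1,1) as parent_level
  ,count(*) as links
from pairs
group by substr(parent,1,1)
order by parent_level;</pre>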
<p>Gather the stats to make sure the optimizer is going to have a half chance at a decent plan.</p>
<pre>begin 
dbms_stats.gather_table_stats(ownname=&gt;user,tabname=&gt;'PAIRS'); 
end; 
/</pre>
<p>For the test case query, we wished to generate all the possible paths within the tree &#8211; which is in effect a directed acyclic graph. This is the ideal scenario for connect by / RCTE to perform its magic: we need to recurse through the dataset in a single set-based statement.</p>
<pre>explain plan for
select count(*)
from (
  with e (parent,child) 
  as (
    select parent,child
    from pairs
    union all
    select e.parent,pairs.child
    from e
    join pairs on e.child = pairs.parent
  )
  select * from e
);</pre>
<p>Grab the plan output using</p>
<pre>select * from table(dbms_xplan.display)</pre>
<p>The important part of the output is included below:</p>
<pre>---------------------------------------------------- 
| Id  | Operation                                  |
---------------------------------------------------- 
|   0 | SELECT STATEMENT                           |
|   1 |  SORT AGGREGATE                            |
|   2 |   VIEW                                     |
|   3 |    UNION ALL (RECURSIVE WITH) BREADTH FIRST|
|   4 |     TABLE ACCESS FULL                      |
|*  5 |     HASH JOIN                              |
|   6 |      TABLE ACCESS FULL                     |
|   7 |      RECURSIVE WITH PUMP                   |
----------------------------------------------------</pre>
<p>There is nothing shocking or unusual about the plan; it is what we would expect to see. So let&#8217;s now add some parallelism to the query:</p>
<pre>explain plan for
select count(*)
from (
  with e (parent,child) 
  as (
    select /*+ parallel(pairs,4) */
      parent,child
    from pairs
    union all
    select  /*+ parallel(pairs,4) parallel(e,4) */ 
      e.parent,pairs.child
    from e
    join pairs on e.child = pairs.parent
  )
  select * from e
);</pre>
<p>The expected effect on the plan is that we would see parallel operations against both table accesses.</p>
<pre>----------------------------------------------------
| Id  | Operation                                  |
----------------------------------------------------
|   0 | SELECT STATEMENT                           |
|   1 |  SORT AGGREGATE                            |
|   2 |   VIEW                                     |
|   3 |    UNION ALL (RECURSIVE WITH) BREADTH FIRST|
|   4 |     PX COORDINATOR                         |
|   5 |      PX SEND QC (RANDOM)                   |
|   6 |       PX BLOCK ITERATOR                    |
|   7 |        TABLE ACCESS FULL                   |
|*  8 |     HASH JOIN                              |
|   9 |      TABLE ACCESS FULL                     |
|  10 |      RECURSIVE WITH PUMP                   |
----------------------------------------------------</pre>
<p>This now shows us the problem: you can see the PX coordinator is present within the anchor clause of the RCTE, but there is no parallelism listed against the recursion. At first we thought it might be ignoring the hints for some reason, but the following idea disproved that theory immediately.</p>
<pre>explain plan for
select count(*)
from (
  with e (parent,child) 
  as (
    select /*+ parallel(pairs,4) */
      parent,child
    from pairs
    union all
    select  /*+ parallel(pairs,4) parallel(e,4) use_merge(e,pairs) */ 
      e.parent,pairs.child
    from e
    join pairs on e.child = pairs.parent
  )
  select * from e
);</pre>
<p>The plan altered to use the merge hint as follows:</p>
<pre>---------------------------------------------------- 
| Id  | Operation                                  |
----------------------------------------------------
|   0 | SELECT STATEMENT                           |
|   1 |  SORT AGGREGATE                            |
|   2 |   VIEW                                     |
|   3 |    UNION ALL (RECURSIVE WITH) BREADTH FIRST|
|   4 |     PX COORDINATOR                         |
|   5 |      PX SEND QC (RANDOM)                   |
|   6 |       PX BLOCK ITERATOR                    |
|   7 |        TABLE ACCESS FULL                   |
|   8 |     MERGE JOIN                             |
|   9 |      SORT JOIN                             |
|  10 |       RECURSIVE WITH PUMP                  |
|* 11 |      SORT JOIN                             |
|  12 |       TABLE ACCESS FULL                    |
----------------------------------------------------</pre>
<p>This is of course a terrible plan &#8211; there would be no reason to use a merge join &#8211; but the fact that it appears in the plan demonstrates that the hints on the recursive clause are being read by the query engine, and that it chose to discard only the parallelism ones.</p>
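<p>The listings above are trimmed by hand. As a sketch, assuming the documented <code>PARALLEL</code> format keyword of <code>dbms_xplan.display</code> behaves the same on your release, the following asks for just the basic columns plus the parallel execution details, which makes it easier to spot which branches carry no PX operations:</p>
<pre>-- Positional arguments: plan table (null = default),
-- statement id (null = most recently explained statement), format.
select *
from table(dbms_xplan.display(null, null, 'BASIC +PARALLEL'));</pre>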
<p>Given this example, an SR was raised to find out why the performance is so bad and whether we were looking at a bug. If it was a bug, then we could look for a fix of some kind.</p>
<p>The test case was accepted and reproduced inside Oracle very efficiently &#8211; it was given to the parallel query department to determine what the problem was.</p>
<p>The response back?</p>
<blockquote><p><em>This is the expected behavior. Oracle does not parallelize the iterations sub-query of connect-by/recursive-with clause.</em></p></blockquote>
<p>That&#8217;s the last thing we wanted to hear &#8211; &#8216;by design&#8217;. It&#8217;s by design that this feature is going to be incredibly slow on larger data sets. That&#8217;s not so much &#8216;design&#8217; as rendering RCTEs useless in Oracle unless you have small data sets, or don&#8217;t mind waiting around for a long time to get answers back.</p>
<p>We were already close to ditching any use of the RCTE syntax; this fully nailed the coffin shut on that feature.<br />
(The other reason is one we are still trying to build a test case for: we have witnessed problems with RCTEs contained within a view. When the view is joined to and accessed with a predicate against the view, we have seen the predicate pushed into the recursion, which results in an incorrect answer. The predicate pushing in effect cuts the recursion short. We had worked around this &#8211; but it was an annoying bug.)</p>
<p>Oracle stalwarts will consider that we were foolish to use RCTEs over Oracle&#8217;s CONNECT BY syntax &#8211; except that we were not. An RCTE can do far more complex recursion than CONNECT BY can, and for the specific instance in which we wanted to use it, that complexity was required.</p>
<p>Another reason for trying to go down that route was performance, because the connect by clause is no better at parallelism:</p>
<pre>explain plan for 
select count(*) 
from 
(  
  select /*+ parallel(pairs,4) */         
    parent,child    
  from pairs   
  start with parent = 'G1' 
  connect by parent = prior child 
);

-------------------------------------
| Id  | Operation                   | 
-------------------------------------
|   0 | SELECT STATEMENT            |
|   1 |  SORT AGGREGATE             |
|   2 |   VIEW                      |
|*  3 |    CONNECT BY WITH FILTERING|
|*  4 |     INDEX RANGE SCAN        |
|   5 |     NESTED LOOPS            |
|   6 |      CONNECT BY PUMP        |
|*  7 |      INDEX RANGE SCAN       |
-------------------------------------</pre>
<p>The plan is no better for using CONNECT BY &#8211; but when we ran some comparisons, the connect by clause was clearly faster.</p>
<p>So the verdict on Oracle and RCTE / recursive subquery factoring: excellent language feature, unscalable performance, and it will refuse to run the recursion in parallel &#8211; of very little use to those of us in the VLDB world.</p>
<br />Filed under: <a href='http://sqlfascination.com/category/oracle/'>Oracle</a> Tagged: <a href='http://sqlfascination.com/tag/cte/'>CTE</a>, <a href='http://sqlfascination.com/tag/oracle-11gr1/'>Oracle 11gr1</a>, <a href='http://sqlfascination.com/tag/oracle-11gr2/'>Oracle 11gr2</a>, <a href='http://sqlfascination.com/tag/parallelism/'>Parallelism</a>, <a href='http://sqlfascination.com/tag/performance/'>Performance</a>, <a href='http://sqlfascination.com/tag/rcte/'>RCTE</a>, <a href='http://sqlfascination.com/tag/recursive-subquery-refactoring/'>Recursive Subquery Refactoring</a>]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2012/04/14/oracle-recursive-common-table-expressions-and-parallelism/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Oracle : Duplicate GUID values being returned from sys_guid() when run in parallel</title>
		<link>http://sqlfascination.com/2012/01/22/oracle-duplicate-guid-values-being-returned-from-sys_guid-when-run-in-parallel/</link>
		<comments>http://sqlfascination.com/2012/01/22/oracle-duplicate-guid-values-being-returned-from-sys_guid-when-run-in-parallel/#comments</comments>
		<pubDate>Sun, 22 Jan 2012 12:31:59 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Oracle 11gr1]]></category>
		<category><![CDATA[sys_guid]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=602</guid>
		<description><![CDATA[A post? yes, it&#8217;s been a while and because I am having to spend all my time on Oracle these days &#8211; it&#8217;s a post relating to a problem in Oracle. I had to construct a test case recently to try track down a primary key failure. The primary key was a sys_guid value and the [...]]]></description>
			<content:encoded><![CDATA[<p>A post? Yes, it&#8217;s been a while, and because I am having to spend all my time on Oracle these days, it&#8217;s a post relating to a problem in Oracle.</p>
<p>I had to construct a test case recently to try to track down a primary key failure. The primary key was a sys_guid value and the failure was coming from the insertion of new values. That didn&#8217;t make much sense, since the odds against a collision on a GUID should be astronomically high &#8211; assuming they used an up-to-date algorithm. Even with those astronomical odds, primary key failures were occurring very regularly, so the immediate suspicion was that the sys_guid algorithm in Oracle is not up to date and not consistent across all platforms. It can return GUIDs that appear totally random, or GUIDs that are clearly within a sequence. It&#8217;s easy enough to test any individual platform to see how it behaves:</p>
<pre>select sys_guid() from dual
union
select sys_guid() from dual;
SYS_GUID()
--------------------------------
B71D52B1531167D9E040760ADD7E0B80
B71D52B1531267D9E040760ADD7E0B80</pre>
<p>The 12th character has increased by one; the rest of the GUID remains identical.</p>
<p>This isn&#8217;t too surprising; the documentation is delightfully vague in using the term &#8216;most&#8217;:</p>
<p><em><code>SYS_GUID</code> generates and returns a globally unique identifier (<code>RAW</code> value) made up of 16 bytes. On most platforms, the generated identifier consists of a host identifier, a process or thread identifier of the process or thread invoking the function, and a nonrepeating value (sequence of bytes) for that process or thread.</em></p>
<p>So &#8216;most&#8217; platforms will behave like this &#8211; that&#8217;s helpful documentation, thanks for that.</p>
<p>So back to the problem and test case. Whenever I come across potential Oracle bugs, I have an immediate suspicion that parallelism is at play; this comes from the consistent experience of Oracle getting parallelism wrong within the database. I have multiple outstanding SRs for various features that fail when combined with parallelism &#8211; anything from ORA-600s to incorrect data being returned. (Parallel + Pivot = missing columns, nice!)</p>
<p>When you have these GUIDs being generated in a pseudo sequence, it makes sense that adding parallelism is a recipe for disaster, since the parallel slaves would all have to communicate and co-ordinate to ensure that they did not duplicate values in that sequence. After many hours whittling down the original statement, I was able to construct a repeatable test case to finally submit to Oracle for fixing &#8211; the shocking part is how trivial it was to demonstrate the problem on a specific AIX environment.</p>
<p>So let&#8217;s walk through the test case. Firstly, create a numbers table:</p>
<pre>create table n (c1 number);</pre>
<p>...and populate it:</p>
<pre>begin  
  for i in 1..30 loop   
    insert into n     
      select i*100000 + level from dual connect by level&lt;=100000;  
  end loop;
  commit;
end;
/</pre>
<p>This just populates the table with 3 million rows, in 30 iterations of 100k rows. It&#8217;s a bit faster to do it that way than to populate it in a single statement &#8211; the connect by level goes slower as the number rises.</p>
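<p>To confirm the load before running the test, a quick check along these lines should do (a sketch, assuming the block above ran once against an empty table). Since <code>i*100000 + level</code> never repeats across iterations, every value should be distinct.</p>
<pre>-- Expected if the loop above ran once: 3000000 rows, 3000000 distinct values,
-- lowest value 100001 (i=1, level=1), highest value 3100000 (i=30, level=100000)
select count(*) as total_rows
  ,count(distinct c1) as distinct_values
  ,min(c1) as lowest
  ,max(c1) as highest
from n;</pre>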
<p>That is all we need for the setup. The test code is pretty simple, but I will explain it:</p>
<pre>declare 
  e number := 0;
begin  
  for i in 1..10 loop
    begin      
      select count(*) into e      
      FROM (        
        select sid, count(*)        
        from (
          select /*+ parallel(n,40) */                
            sys_guid() as sid              
          from n              
        )        
        group by sid        
        having count(*) &gt; 1      
      ) t;    
    exception      
      when no_data_found then null;      
      when others then raise;    
    end;            
    if e&gt;0 then raise_application_error(-20000
        ,e||' duplicates found in iteration '||i); end if;
  end loop;
end;
/</pre>
<p>The easiest way to explain this is from the inside out. The innermost query generates 3 million sys_guid values by selecting from the numbers table and asking for a sys_guid value per row &#8211; the statement is given a parallel hint.</p>
<p>We then perform an outer select that groups by the SID (Sys guID) values, and uses a having count(*) &gt; 1 clause to only show duplicates. Under normal conditions this should of course return 0 rows at that point, since every sys_guid generated should be unique. The next outer select counts how many duplicated values occurred and finally places this into a variable e.</p>
<p>If e is ever greater than 0, we have encountered a duplicate and an error will be raised.</p>
<p>When run on an AIX box with SMT enabled, the error does get raised.</p>
<pre>202148 duplicates found in iteration 1</pre>
<p>The number of duplicates changes per run and seems to have no pattern; it can be anything from about 40k duplicates up to about 250k duplicates. If you take the parallel hint out of the script, it never fails. So it is clearly linked to the simultaneous creation of sys_guid values.</p>
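<p>The serial control run can be made explicit rather than just deleting the hint. This is a sketch only: it reuses the <code>n</code> table from above and swaps in the <code>no_parallel</code> hint, and the <code>guid</code> alias is purely illustrative. Going by the behaviour described above, it should report 0.</p>
<pre>-- Same duplicate check as the test block, but forced to run serially.
select count(*) as duplicated_guids
from (
  select guid
  from (
    select /*+ no_parallel(n) */
      sys_guid() as guid
    from n
  )
  group by guid
  having count(*) &gt; 1
);</pre>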
<p>As yet, Oracle have not been able to reproduce this themselves, which suggests that this is a platform-specific bug, but the client&#8217;s DBAs have been provided with the script and have seen it churn out duplicates time and time again, much to their amazement. Oracle really should use a better algorithm; having such a predictable sequential GUID as the default for &#8216;most&#8217; platforms is less than ideal.</p>
<br />Filed under: <a href='http://sqlfascination.com/category/oracle/'>Oracle</a> Tagged: <a href='http://sqlfascination.com/tag/oracle-11gr1/'>Oracle 11gr1</a>, <a href='http://sqlfascination.com/tag/sys_guid/'>sys_guid</a>]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2012/01/22/oracle-duplicate-guid-values-being-returned-from-sys_guid-when-run-in-parallel/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>SQL Server Denali &#8211; Sequences</title>
		<link>http://sqlfascination.com/2011/01/09/sql-server-denali-sequences/</link>
		<comments>http://sqlfascination.com/2011/01/09/sql-server-denali-sequences/#comments</comments>
		<pubDate>Sun, 09 Jan 2011 15:40:47 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Identity]]></category>
		<category><![CDATA[SQL Server Denali]]></category>
		<category><![CDATA[T-SQL]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=582</guid>
		<description><![CDATA[Another new programmability feature added in the Denali CTP is &#8216;Sequences&#8217; &#8211; a concept very familiar to those of us who already deal with Oracle, but an unusual addition for SQL Server and one that makes me scratch my head thinking &#8211; why? We already have the identity column feature available to us within SQL [...]]]></description>
			<content:encoded><![CDATA[<p>Another new programmability feature added in the Denali CTP is &#8216;Sequences&#8217; &#8211; a concept very familiar to those of us who already deal with Oracle, but an unusual addition for SQL Server and one that makes me scratch my head thinking &#8211; why? We already have the identity column feature available to us within SQL Server; it is not available within Oracle, thus the need for sequences in Oracle. When using any identity / numeric key / FK mechanism, it is important that the value has no relation to the data it represents other than being an arbitrary number representing the row. If anything, it would be bad design to rely on identity values being in sequence, or contiguous in any way. In SQL Server they are not guaranteed to be contiguous at all &#8211; a transaction rollback, or an increment other than 1, for example, will prevent it.</p>
<p>Sequences are primarily for when you wish to know a number in advance of using it, or perhaps you wish to use the same number for a number of records spread across tables (and thus relate them using that number.)</p>
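<p>As a sketch of that second use, the batch below takes one number up front and uses it to relate rows in two tables. The sequence and both tables (<code>dbo.OrderNumber</code>, <code>dbo.OrderHeader</code>, <code>dbo.OrderLine</code>) are hypothetical names invented for illustration, and <code>go</code> is the batch separator used by tools such as Management Studio.</p>
<pre>-- Hypothetical objects, for illustration only.
create sequence dbo.OrderNumber as int start with 1 increment by 1;
create table dbo.OrderHeader (OrderId int primary key, Customer nvarchar(50));
create table dbo.OrderLine (OrderId int, LineNumber int, Product nvarchar(50));
go

-- Fetch the number before any row exists, then use it in both tables.
declare @id int = next value for dbo.OrderNumber;
insert into dbo.OrderHeader (OrderId, Customer) values (@id, N'Example customer');
insert into dbo.OrderLine (OrderId, LineNumber, Product) values (@id, 1, N'Example product');</pre>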
<p>Looking at the syntax, you can see that the SQL Server and Oracle syntax are very similar and share the same keywords.</p>
<p>SQL Server Syntax:</p>
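<pre>CREATE SEQUENCE [schema_name . ] sequence_name
    [ &lt;sequence_property_assignment&gt; [ ,…n ] ] [ ; ]

&lt;sequence_property_assignment&gt; ::=
{
    [ AS { built_in_integer_type | user-defined_integer_type } ]
    START WITH &lt;constant&gt;
    INCREMENT BY &lt;constant&gt;
    { MINVALUE &lt;constant&gt; | NO MINVALUE }
    { MAXVALUE &lt;constant&gt; | NO MAXVALUE }
    { CYCLE | NO CYCLE }
    { CACHE [ &lt;constant&gt; ] | NO CACHE }
}</pre>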
<p>Oracle Syntax:</p>
<pre>CREATE SEQUENCE &lt;sequence_name&gt;
INCREMENT BY &lt;#&gt;
START WITH &lt;#&gt;
MAXVALUE &lt;#&gt; / NOMAXVALUE
MINVALUE &lt;#&gt; / NOMINVALUE
CYCLE / NOCYCLE
CACHE &lt;#&gt; / NOCACHE
ORDER / NOORDER</pre>
<p>Most of the keywords are pretty self-explanatory. The one that makes me cringe the most is CYCLE &#8211; it&#8217;s bad enough using a sequence number instead of an identity, but even worse when you consider that it may not be unique. The advice there is to create an additional unique index on the field to prevent an insert / update from taking a duplicate &#8211; but that seems like a bit of a &#8216;fudge&#8217;: instead of solving the real problem, it works around it.</p>
<p>To add to the weirdness of the construct, you can even ask for a sequence based on an OVER clause. Using the AdventureWorks database as an example, I created a sequence:</p>
<pre><span style="color:#0000ff;">create </span><span style="color:#008080;">sequence testSequence </span><span style="color:#0000ff;">as integer</span>
<span style="color:#008080;">start </span><span style="color:#0000ff;">with </span>1
<span style="color:#008080;">increment </span><span style="color:#0000ff;">by </span>1
<span style="color:#008080;">minvalue </span>1
<span style="color:#008080;">maxvalue </span>10000</pre>
<p>And then used it within a select statement as follows:</p>
<pre><span style="color:#0000ff;">select next </span><span style="color:#008080;">value </span><span style="color:#0000ff;">for </span><span style="color:#008080;">testSequence </span><span style="color:#0000ff;">over </span>(<span style="color:#0000ff;">order by </span><span style="color:#008080;">Name </span><span style="color:#0000ff;">asc</span>) <span style="color:#0000ff;">as </span><span style="color:#008080;">id</span>, <span style="color:#008080;">Name</span>
<span style="color:#0000ff;">from </span><span style="color:#008080;">Production.Product</span></pre>
<p>The results come back:</p>
<pre>1 Adjustable Race
2 All-Purpose Bike Stand
3 AWC Logo Cap
4 BB Ball Bearing
...</pre>
<p>In case you were thinking that was relatively useful, when you re-run the command, you of course are returned a different set of numbers, as the sequence does not restart, making this one of the weirdest features I have seen.</p>
<pre>505 Adjustable Race
506 All-Purpose Bike Stand
507 AWC Logo Cap
508 BB Ball Bearing
...</pre>
<p>If you attempt to place the ORDER BY on the outside in the following manner, SQL Server will just throw an error.</p>
<pre><span style="color:#0000ff;">select next </span><span style="color:#008080;">value </span><span style="color:#0000ff;">for </span><span style="color:#008080;">testSequence </span><span style="color:#0000ff;">as </span><span style="color:#008080;">id</span>, <span style="color:#008080;">Name</span>
<span style="color:#0000ff;">from </span><span style="color:#008080;">Production</span>.<span style="color:#008080;">Product</span>
<span style="color:#0000ff;">order by </span><span style="color:#008080;">Name <span style="color:#0000ff;"><span style="color:#000000;">asc</span></span></span>
<span style="color:#ff0000;">Msg 11723, Level 15, State 1, Line 1</span>
<span style="color:#ff0000;">NEXT VALUE FOR function cannot be used directly in a statement that contains </span>
<span style="color:#ff0000;">an ORDER BY clause unless the OVER clause is specified.</span></pre>
<p>And to round off the errors you can expect to see when using this, when you run the sequence out of values, you will get:</p>
<pre><span style="color:#ff0000;">Msg 11728, Level 16, State 1, Line 1</span>
<span style="color:#ff0000;">The sequence object 'testSequence' has reached its minimum or maximum value. </span>
<span style="color:#ff0000;">Restart the sequence object to allow new values to be generated.</span></pre>
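<p>As the message says, restarting the sequence clears the error. If wrapping around is acceptable for your use, the sequence can instead be told to cycle, so that it starts again rather than failing. A minimal sketch, assuming duplicate values are tolerable wherever the numbers are used:</p>
<pre>-- after reaching the maximum, the next value wraps round to the minimum
alter sequence testSequence cycle;</pre>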
<p>Try to create a sequence based on a numeric or decimal with some scale, such as numeric(6,2), and you will get:</p>
<pre><span style="color:#ff0000;">Msg 11702, Level 16, State 2, Line 1</span>
<span style="color:#ff0000;">The sequence object 'testSequence' must be of data type int, bigint, smallint</span><span style="color:#ff0000;">,</span>
<span style="color:#ff0000;">tinyint, or decimal or numeric with a scale of 0, or any user-defined data type</span>
<span style="color:#ff0000;">that is based on one of the above integer data types.</span></pre>
<p>Or if you fail to get your starting value within the min and max boundaries you are setting:</p>
<pre><span style="color:#ff0000;">Msg 11703, Level 16, State 1, Line 1</span>
<span style="color:#ff0000;">The start value for sequence object 'testSequence' must be between the minimum</span>
<span style="color:#ff0000;">and maximum value of the sequence object.</span></pre>
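<p>For contrast, a definition that avoids both of these errors might look like the sketch below. The name exampleSequence and the bounds are made up for illustration: the type has a scale of 0, and the start value sits between the minimum and maximum.</p>
<pre>-- hypothetical sequence: scale of 0, start value inside the bounds
create sequence dbo.exampleSequence
    as numeric(6,0)
    start with 1
    increment by 1
    minvalue 1
    maxvalue 999999
    no cycle;</pre>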
<p>Overall, sequences remain a bit of a niche feature for me in SQL Server. I just cannot see any normal everyday activity needing to use them, although they would make porting applications between Oracle and SQL Server a bit easier, since both will be able to use them.</p>
<p>In that kind of situation, though, I would still prefer the GUID mechanisms that we have available to us. They have the same benefit of letting you know a record ID in advance of using it, and they can be stored in either database. They also have the added advantage that they can be created whilst offline from the database, something a sequence cannot do.</p>
<br />Filed under: <a href='http://sqlfascination.com/category/sql-server/'>SQL Server</a> Tagged: <a href='http://sqlfascination.com/tag/identity/'>Identity</a>, <a href='http://sqlfascination.com/tag/sql-server-denali/'>SQL Server Denali</a>, <a href='http://sqlfascination.com/tag/t-sql/'>T-SQL</a>]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2011/01/09/sql-server-denali-sequences/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Which User Made That Change?</title>
		<link>http://sqlfascination.com/2011/01/08/which-user-made-that-change/</link>
		<comments>http://sqlfascination.com/2011/01/08/which-user-made-that-change/#comments</comments>
		<pubDate>Sat, 08 Jan 2011 15:28:02 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[SQL Server Denali]]></category>
		<category><![CDATA[Transaction Log]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=576</guid>
		<description><![CDATA[If you have spent any time tinkering about in the transaction log, you will already have come across a bit of a problem when trying to decide what was done and by whom &#8211; the &#8216;what&#8217; part I have decoded in a few posts, but the &#8216;whom&#8217; part is a lot harder. As far as [...]]]></description>
			<content:encoded><![CDATA[<p>If you have spent any time tinkering about in the transaction log, you will already have come across a bit of a problem when trying to decide what was done and by whom &#8211; the &#8216;what&#8217; part I have decoded in a few posts, but the &#8216;whom&#8217; part is a lot harder. As far as I can tell the log only contains the SPID of the user who opened the transaction, and does not give us any indication as to who that user really was.</p>
<p>From an investigative perspective this is a bit of a painful exercise. I can see that a row was deleted, but to find out who or what did the deletion I would have to start examining either the SQL Server logs or the Windows Server logs. The default behaviour of SQL Server security, though, is to log only failed login attempts, so the successful ones will not show up unless you change your SQL Server security settings. You can access these logs from SQL Server Management Studio using either the xp_readerrorlog or sp_readerrorlog procedures, although the nature of the log and its textual values make it difficult to combine in a set-based manner: I can read the values as a human, but machine reading them for any purpose is a bit of a pain. There is also the issue that those logs will be cycled, and the old logs could well be completely offline.</p>
<p>So I would prefer an easier solution: keeping a record of the logins within the database regardless of the SQL Server security settings, in a form that allows a more set-based solution to be used against it. To start with, we will need a table to store the information available to us during the logon process:</p>
<pre><span style="color:#0000ff;">create table master</span><span style="color:#339966;">.dbo.spidArchive </span>(
 <span style="color:#339966;">LoginTime  </span><span style="color:#0000ff;">datetime2</span>(7)
 ,<span style="color:#339966;">SPID   </span><span style="color:#0000ff;">integer</span>
 ,<span style="color:#339966;">ServerName  </span><span style="color:#0000ff;">nvarchar</span>(100)
 ,<span style="color:#339966;">LoginName  </span><span style="color:#0000ff;">nvarchar</span>(100)
 ,<span style="color:#339966;">LoginType  </span><span style="color:#0000ff;">nvarchar</span>(100)
 ,<span style="color:#339966;">LoginSID  </span><span style="color:#0000ff;">nvarchar</span>(100)
 ,<span style="color:#339966;">ClientHost </span><span style="color:#0000ff;">nvarchar</span>(100)
 ,<span style="color:#339966;">IsPooled  </span><span style="color:#0000ff;">tinyint</span>
)</pre>
<p>The spidArchive table here is created in the master database so that it can cover the connections for any of the databases. You can see we have access to a lot of useful information: not just who executed the command, but which machine they logged in from. The next step is to get SQL Server to add a row to the table every time a login occurs. From SQL Server 2005 onwards we have had access to DDL triggers as well as DML triggers, and have the ability to intercept a number of non-DML events.</p>
<pre><span style="color:#0000ff;">create trigger </span><span style="color:#339966;">spidLogin</span> <span style="color:#0000ff;">on </span>all <span style="color:#0000ff;">server</span>
<span style="color:#0000ff;">after </span><span style="color:#339966;">logon</span>
<span style="color:#0000ff;">as</span>
 <span style="color:#0000ff;">declare </span><span style="color:#339966;">@eventdata </span><span style="color:#0000ff;">xml</span>;
 <span style="color:#0000ff;">set </span><span style="color:#339966;">@eventdata </span>= <span style="color:#ff00ff;">EVENTDATA</span>();

 <span style="color:#0000ff;">INSERT INTO master</span>.<span style="color:#339966;">dbo.spidArchive</span>
 (
  <span style="color:#339966;">LoginTime</span>
  ,<span style="color:#339966;">SPID   </span>
  ,<span style="color:#339966;">ServerName  </span>
  ,<span style="color:#339966;">LoginName  </span>
  ,<span style="color:#339966;">LoginType  </span>
  ,<span style="color:#339966;">LoginSID  </span>
  ,<span style="color:#339966;">ClientHost </span>
  ,<span style="color:#339966;">IsPooled</span>
 )
 <span style="color:#0000ff;">VALUES </span>
 (
  <span style="color:#33cccc;">@eventdata.value</span>(<span style="color:#ff0000;">'(/EVENT_INSTANCE/PostTime)[1]'</span>,<span style="color:#ff0000;">'datetime2(7)'</span>)
  ,<span style="color:#33cccc;">@eventdata.value</span>(<span style="color:#ff0000;">'(/EVENT_INSTANCE/SPID)[1]'</span>,<span style="color:#ff0000;">'int'</span>)
  ,<span style="color:#33cccc;">@eventdata.value</span>(<span style="color:#ff0000;">'(/EVENT_INSTANCE/ServerName)[1]'</span>,<span style="color:#ff0000;">'nvarchar(100)'</span>)
  ,<span style="color:#33cccc;">@eventdata.value</span>(<span style="color:#ff0000;">'(/EVENT_INSTANCE/LoginName)[1]'</span>,<span style="color:#ff0000;">'nvarchar(100)'</span>)
  ,<span style="color:#33cccc;">@eventdata.value</span>(<span style="color:#ff0000;">'(/EVENT_INSTANCE/LoginType)[1]'</span>,<span style="color:#ff0000;">'nvarchar(100)'</span>)
  ,<span style="color:#33cccc;">@eventdata.value</span>(<span style="color:#ff0000;">'(/EVENT_INSTANCE/SID)[1]'</span>,<span style="color:#ff0000;">'nvarchar(100)'</span>)
  ,<span style="color:#33cccc;">@eventdata.value</span>(<span style="color:#ff0000;">'(/EVENT_INSTANCE/ClientHost)[1]'</span>,<span style="color:#ff0000;">'nvarchar(100)'</span>)
  ,<span style="color:#33cccc;">@eventdata.value</span>(<span style="color:#ff0000;">'(/EVENT_INSTANCE/IsPooled)[1]'</span>,<span style="color:#ff0000;">'tinyint'</span>)
 )</pre>
<p>During the login process, the EventData() function returns a fixed-format XML fragment from which we can extract the values we seek and simply insert them into our spidArchive table. Now that a log is being taken of all connections established to the server, we can start using it to translate from a SPID to a user, even when the user is no longer connected. As long as we know the SPID and the time, we just need to look for the closest entry in the past for that SPID, and that will indicate which user was logged on at the time. The following function, which should also go in the master database, does that lookup.</p>
<pre><span style="color:#0000ff;">CREATE FUNCTION</span> <span style="color:#339966;">dbo</span>.<span style="color:#339966;">ConvertSpidToName</span>(<span style="color:#339966;">@SPID</span> <span style="color:#0000ff;">integer</span>,<span style="color:#339966;"> @Date </span><span style="color:#0000ff;">datetime2</span>(7)) <span style="color:#0000ff;">RETURNS nvarchar</span>(100) <span style="color:#0000ff;">AS</span>
<span style="color:#0000ff;">BEGIN</span>
 <span style="color:#0000ff;">DECLARE </span><span style="color:#339966;">@name </span><span style="color:#0000ff;">nvarchar</span>(100)
 <span style="color:#0000ff;">SELECT TOP</span>(1)<span style="color:#339966;"> @name</span> = <span style="color:#339966;">LoginName</span>
 <span style="color:#0000ff;">FROM </span>master.<span style="color:#339966;">dbo</span>.<span style="color:#339966;">spidArchive</span>
 <span style="color:#0000ff;">WHERE </span><span style="color:#339966;">SPID </span>= <span style="color:#339966;">@SPID </span>AND <span style="color:#339966;">LoginTime </span>&lt;= <span style="color:#339966;">@Date</span>
 <span style="color:#0000ff;">ORDER BY</span> <span style="color:#339966;">LoginTime </span><span style="color:#0000ff;">DESC</span>;
 <span style="color:#0000ff;">RETURN </span>@name;
<span style="color:#0000ff;">END</span></pre>
<p>This function just performs the logic stated above and converts the SPID and DateTime into the login name for the user. Once this infrastructure is in place we can use it directly in a call to ::fn_dblog(null,null) to translate the SPID column:</p>
<pre><span style="color:#0000ff;">select master</span>.<span style="color:#339966;">dbo</span>.<span style="color:#339966;">ConvertSpidToName</span>(<span style="color:#ff00ff;">log</span>.<span style="color:#339966;">SPID</span>, <span style="color:#ff00ff;">log</span>.[<span style="color:#339966;">Begin Time</span>]) as <span style="color:#339966;">UserName</span>, <span style="color:#ff00ff;">log</span>.* <span style="color:#0000ff;">from </span>::<span style="color:#008000;">fn_dblog</span>(null,null) <span style="color:#ff00ff;">log</span></pre>
<p>What you will notice is that for the majority of log lines there is no user name displayed. This is because the SPID is only recorded against the LOP_BEGIN_XACT entry, the beginning of the transaction. This doesn&#8217;t really present a problem: from previous experiments we know all the entries for an individual transaction are given a unique Transaction ID, which we can use to group them together. It becomes pretty trivial to join back to the log, connect any transaction entries to the LOP_BEGIN_XACT record, and produce the name on every row possible.</p>
<pre><span style="color:#0000ff;">select master</span>.<span style="color:#339966;">dbo</span>.<span style="color:#339966;">ConvertSpidToName</span>(<span style="color:#339966;">log2</span>.<span style="color:#339966;">SPID</span>, <span style="color:#339966;">log2</span>.[<span style="color:#339966;">Begin Time</span>]) as <span style="color:#339966;">UserName</span>, <span style="color:#ff00ff;">log</span>.*
<span style="color:#0000ff;">from </span>::<span style="color:#008000;">fn_dblog</span>(null,null) <span style="color:#ff00ff;">log</span>
left join ::<span style="color:#008000;">fn_dblog</span>(null,null) <span style="color:#339966;">log2 </span>on log.[<span style="color:#339966;">Transaction ID</span>] = log2.[<span style="color:#339966;">Transaction ID</span>] and <span style="color:#339966;">log2</span>.<span style="color:#339966;">Operation </span>= <span style="color:#ff0000;">'LOP_BEGIN_XACT'</span></pre>
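<p>To go back to the original question of who deleted a row, the same join can be narrowed down to just the delete operations. This is a sketch rather than a tested recipe: it assumes the deletes of interest appear in the log as LOP_DELETE_ROWS records, and the aliases d and b are simply names chosen here for the delete rows and the begin-transaction rows.</p>
<pre>-- d = the delete log records, b = the matching LOP_BEGIN_XACT records
select master.dbo.ConvertSpidToName(b.SPID, b.[Begin Time]) as UserName,
       d.[Current LSN], d.[Transaction ID], d.AllocUnitName
from ::fn_dblog(null,null) d
inner join ::fn_dblog(null,null) b
    on d.[Transaction ID] = b.[Transaction ID]
   and b.Operation = 'LOP_BEGIN_XACT'
where d.Operation = 'LOP_DELETE_ROWS'</pre>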
<p>So overall it is not too hard to get the log entries attributed to the accounts that generated them.</p>
<p>A couple of final notes / caveats:</p>
<ul>
<li>If your application is using a trusted sub-system approach this of course will not work as a technique, since all the users will be logged into the application through an internal mechanism (such as a users table) and the application service then connects using its own credentials. That is always a good thing, since the users then have no direct access to the database, but in that kind of situation this technique is of no value: every connection will show up as the same user / source.</li>
<li>Within my code I chose to use datetime2(7), to be as accurate as possible on the connections and timings. You could drop to just datetime for SQL Server 2005, but with only 1/300th of a second accuracy there is a chance on a very busy server that you could see two entries for a single SPID at the same datetime, which would pose a bit of a problem.</li>
<li>The spidArchive table cannot be allowed to grow unconstrained. I have not included anything here for clearing down the table, but it is not difficult to conceive of it being archived off, or cleaned up weekly via a SQL Agent job.</li>
</ul>
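<p>As a rough idea of what the clean-up mentioned in the last caveat could look like, the statement below could be scheduled as a SQL Agent job step. The 30 day retention period is an arbitrary assumption; keep at least as much history as the oldest transaction log you might want to investigate.</p>
<pre>-- hypothetical retention: remove login records older than 30 days
delete from master.dbo.spidArchive
where LoginTime &lt; dateadd(day, -30, sysdatetime());</pre>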
<br />Filed under: <a href='http://sqlfascination.com/category/sql-server/'>SQL Server</a> Tagged: <a href='http://sqlfascination.com/tag/internals/'>Internals</a>, <a href='http://sqlfascination.com/tag/sql-server-2005/'>SQL Server 2005</a>, <a href='http://sqlfascination.com/tag/sql-server-2008/'>SQL Server 2008</a>, <a href='http://sqlfascination.com/tag/sql-server-denali/'>SQL Server Denali</a>, <a href='http://sqlfascination.com/tag/transaction-log/'>Transaction Log</a>]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2011/01/08/which-user-made-that-change/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>SQL Server Denali &#8211; Paging</title>
		<link>http://sqlfascination.com/2010/12/31/sql-server-denali-paging/</link>
		<comments>http://sqlfascination.com/2010/12/31/sql-server-denali-paging/#comments</comments>
		<pubDate>Fri, 31 Dec 2010 16:56:43 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Indexes]]></category>
		<category><![CDATA[Query Plan]]></category>
		<category><![CDATA[SQL Server Denali]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=563</guid>
		<description><![CDATA[The introduction of paging within SQL Server Denali will have made a significant number of developers happy, all of whom will have previously created home-baked solutions to the same problem. All the solutions have the same underlying problem &#8211; paging is by its nature inefficient. Most solutions use the row number analytic function, and then sub-select [...]]]></description>
			<content:encoded><![CDATA[<p>The introduction of paging within SQL Server Denali will have made a significant number of developers happy, all of whom will have previously created home-baked solutions to the same problem. All the solutions have the same underlying problem &#8211; paging is by its nature inefficient. Most solutions use the row number analytic function, and then sub-select data from that. For a large dataset that presents a problem &#8211; the data has to be fully scanned, sorted and allocated row numbers. A suitable index can eliminate the sort operator, but you still end up scanning the entire index to allocate the row numbers.</p>
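<p>For reference, a typical home-baked solution of the row number kind looks something like the sketch below. It is written here against Production.Product purely for illustration, and returns rows 11 to 20 when ordered by name; the alias numbered and the column RowNum are names picked for this example.</p>
<pre>-- number every row, then sub-select the page we want
select ProductID, Name
from (
    select ProductID, Name,
           row_number() over (order by Name asc) as RowNum
    from Production.Product
) as numbered
where RowNum between 11 and 20
order by RowNum;</pre>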
<p>In Denali, we can see that they have added support to the ORDER BY clause for a starting offset, and for syntax to denote how many rows we wish to return. The syntax can be examined on MSDN (<a href="http://msdn.microsoft.com/en-us/library/ms188385%28v=SQL.110%29.aspx#Offset">http://msdn.microsoft.com/en-us/library/ms188385%28v=SQL.110%29.aspx#Offset</a>) and in brief is:</p>
<pre>ORDER BY order_by_expression
    [ COLLATE collation_name ]
    [ ASC | DESC ]
    [ ,...n ]
[ &lt;offset_fetch&gt; ]

&lt;offset_fetch&gt; ::=
{
    OFFSET { integer_constant | offset_row_count_expression } { ROW | ROWS }
    [
      FETCH { FIRST | NEXT } {integer_constant | fetch_row_count_expression }
      { ROW | ROWS } ONLY
    ]
}</pre>
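<p>Since both the offset and the fetch count accept expressions as well as constants, a page number and page size can be passed in as variables. A minimal sketch, where @PageNumber, @PageSize and @Offset are hypothetical names introduced for this example:</p>
<pre>declare @PageNumber int = 3;   -- hypothetical: 1-based page to return
declare @PageSize   int = 10;  -- hypothetical: rows per page
declare @Offset     int = (@PageNumber - 1) * @PageSize;

select ProductID, Name
from Production.Product
order by Name asc
offset @Offset rows
fetch next @PageSize rows only;</pre>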
<p>Seeing this new syntax made me want to try it out and see how the query plans are affected. I am using the trusty Adventure Works as usual &#8211; a version for Denali has been put on codeplex, so one quick download later and I was ready to test the new syntax. (Adventure Works download : <a href="http://msftdbprodsamples.codeplex.com/releases/view/55330">http://msftdbprodsamples.codeplex.com/releases/view/55330</a> )</p>
<p>For my tests, I used the Production.Product table, and wished to page the products based on their name. There is a non-clustered index on the Name field of the product table as well as a clustered index on the ProductID, so what would the query plan give?</p>
<pre><span style="color:#0000ff;">select </span>* <span style="color:#0000ff;">from </span>Production.Product <span style="color:#0000ff;">order by </span>name<span style="color:#0000ff;"> asc </span>
offset 10 <span style="color:#0000ff;">rows fetch first </span>10 rows only</pre>
<p>And the query plan is not very surprising:</p>
<p><img class="size-full wp-image-567 alignleft" title="query1" src="http://andrewhogg.files.wordpress.com/2010/12/query1.png?w=600&h=159" alt="" width="600" height="159" /></p>
<p style="text-align:left;">So even with a new syntax the underlying problem remains: the nature of paging is that you are scanning the data. With statistics io turned on, the stats come back with Table &#8216;Product&#8217;. Scan count 1, logical reads 15 etc. That is not particularly exciting, and is what we would expect given the table is contained within 15 pages. It was because of the stats, though, that I noticed an anomaly: in one of the tests I had dropped to returning only a single row from the table, as follows:</p>
<pre><span style="color:#0000ff;">select </span>* <span style="color:#0000ff;">from </span>Production.Product <span style="color:#0000ff;">order by </span>name asc
offset 10 <span style="color:#0000ff;">rows fetch first </span>1 <span style="color:#0000ff;">rows </span>only</pre>
<p>What I noticed was that the statistics changed to Table &#8216;Product&#8217;. Scan count 1, logical reads 24 &#8211; the entire table is contained within 15 pages, so how could it jump to reading 24?</p>
<p> <a href="http://andrewhogg.files.wordpress.com/2010/12/query2.png"><img class="size-full wp-image-568 alignleft" title="query2" src="http://andrewhogg.files.wordpress.com/2010/12/query2.png?w=600&h=218" alt="" width="600" height="218" /></a></p>
<p>A quick check of the query plan showed what had changed: the engine decided that it was cheaper to use the Name index, which for the purposes of the ordering was narrower and therefore more efficient, and then join back to the main table via the clustered key. Understandable, although the additional pages read are unlikely to make this more efficient, and I doubt you would see much real-world difference. An oddity, but nothing really significant in it.</p>
<p>This triggered a more interesting thought: what happens if we reduce our fields so that the index is considered a covering index? Is SQL Server going to get smart when making a selection? So far we have only seen full table scans occurring.</p>
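<p>For anyone wanting to reproduce the read counts quoted above, the two queries can be run back to back with statistics io switched on. The counts you see may differ from mine depending on the build and the state of the sample database.</p>
<pre>set statistics io on;

-- ten rows: the plan scanned the table in my test
select * from Production.Product order by Name asc
offset 10 rows fetch first 10 rows only;

-- one row: the plan switched to the Name index plus a lookup in my test
select * from Production.Product order by Name asc
offset 10 rows fetch first 1 rows only;

set statistics io off;</pre>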
<pre><span style="color:#0000ff;">select </span>Name, ProductID <span style="color:#0000ff;">from </span>Production.Product <span style="color:#0000ff;">order by </span>name asc
offset 20 <span style="color:#0000ff;">rows fetch first </span>10 <span style="color:#0000ff;">rows </span>only</pre>
<p>The query is now being covered by the name index, since the non-clustered index includes the clustered key (ProductID). This changes the query plan again, although it is a pretty subtle change to notice.</p>
<p><a href="http://andrewhogg.files.wordpress.com/2010/12/query3.png"><img class="alignleft size-full wp-image-569" title="query3" src="http://andrewhogg.files.wordpress.com/2010/12/query3.png?w=600&h=338" alt="" width="600" height="338" /></a></p>
<p>The expected index scan appears, but if you look closely at the tooltip for the scan, the number of rows being read in the scan is not the total number of rows in the index, but the sum of the offset and the number of rows requested. This was also reflected within the statistics, showing only 2 logical reads &#8211; the index uses 6 pages in total. As I changed the number of rows to offset / return, the actual number of rows read changed accordingly. So with a covering index in place, the query engine gets a bit more efficient and does a forward scan of the index until the point at which we have passed a sufficient number of rows. This sounds good &#8211; we have avoided scanning the whole index, and provided the paged results in a slightly more efficient manner.</p>
<p>Except those with a quick mind will realise that the performance degrades as you go further and further down the list: requesting the 490th to 500th products will result in 500 rows being checked, not 30. By putting in a covering index we have sacrificed consistency of query times to gain some potential performance. The full-scan solutions will, broadly speaking, take the same time regardless of which 10 rows you might be requesting, since they have to scan, sort, allocate numbers and then sub-select.</p>
<p>As features go, I like the paging &#8211; it removes the need for all the different homegrown solutions that are out there, but the performance of it remains a problem &#8211; this is no silver bullet to paging performance problems that people have.</p>
<br />Filed under: <a href='http://sqlfascination.com/category/sql-server/'>SQL Server</a> Tagged: <a href='http://sqlfascination.com/tag/indexes/'>Indexes</a>, <a href='http://sqlfascination.com/tag/query-plan/'>Query Plan</a>, <a href='http://sqlfascination.com/tag/sql-server-denali/'>SQL Server Denali</a>]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/12/31/sql-server-denali-paging/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>

		<media:content url="http://andrewhogg.files.wordpress.com/2010/12/query1.png" medium="image">
			<media:title type="html">query1</media:title>
		</media:content>

		<media:content url="http://andrewhogg.files.wordpress.com/2010/12/query2.png" medium="image">
			<media:title type="html">query2</media:title>
		</media:content>

		<media:content url="http://andrewhogg.files.wordpress.com/2010/12/query3.png" medium="image">
			<media:title type="html">query3</media:title>
		</media:content>
	</item>
		<item>
		<title>SQL Internals Viewer</title>
		<link>http://sqlfascination.com/2010/11/27/sql-internals-viewer/</link>
		<comments>http://sqlfascination.com/2010/11/27/sql-internals-viewer/#comments</comments>
		<pubDate>Sat, 27 Nov 2010 13:31:24 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[SQL Training]]></category>
		<category><![CDATA[Transaction Log]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=554</guid>
		<description><![CDATA[As Captain Oates once said, &#8216;I am just going outside and may be some time&#8217; &#8211; it feels like quite a while since I had the time to sit down and write something. I had a bit of time to take a look at the SQL Internals Viewer (http://internalsviewer.codeplex.com/), it has been out for some [...]]]></description>
			<content:encoded><![CDATA[<p>As Captain Oates once said, &#8216;I am just going outside and may be some time&#8217; &#8211; feels like quite a while since I had time to sit down and write something.</p>
<p>I had a bit of time to take a look at the SQL Internals Viewer (<a href="http://internalsviewer.codeplex.com/">http://internalsviewer.codeplex.com/</a>). It has been out for some time, but I had never downloaded it to see how useful it is as a way of learning more about the internals.</p>
<p>The Page Viewer is excellent: the breakdown of a page into the component parts of a row, together with the display of the page data, is a superb aid to anyone wanting to understand how data is stored on a page. Whilst you can use DBCC PAGE to get at all this information, presenting it in such a readable form will satisfy most people easily.</p>
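<p>For comparison, the raw route looks something like the sketch below. DBCC IND and DBCC PAGE are undocumented commands, so treat this as an illustration rather than a supported interface; the database name, table name and page number are placeholders, not values from this post.</p>
<pre>-- Sketch only: MyDatabase, MyTable and page 153 are placeholders.
DBCC TRACEON (3604);                     -- send DBCC output to the client session
DBCC IND ('MyDatabase', 'MyTable', -1);  -- list the pages allocated to the table (all indexes)
DBCC PAGE ('MyDatabase', 1, 153, 3);     -- file 1, page 153, print option 3 (per-row detail)</pre>
<p>The output is the same information the Page Viewer shows, just as a text dump that you have to decode by eye.</p>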
<p><img class="aligncenter size-full wp-image-555" title="pageviewer" src="http://andrewhogg.files.wordpress.com/2010/11/pageviewer.jpg?w=600&h=438" alt="" width="600" height="438" /></p>
<p>The page allocation map is a nice little addition, but really is just an extension of showing you what pages belong to which object etc.</p>
<p>I was really looking forward to seeing the transaction log viewer, primarily to help me decode more transactions, but it has been a bit disappointing &#8211; the level of detail shown is very limited, and it provides no real benefit over just looking at the log directly, or using the last transaction log trick I have previously posted.</p>
<p><img class="aligncenter size-full wp-image-556" title="Transaction Log Viewer" src="http://andrewhogg.files.wordpress.com/2010/11/translog.jpg?w=600&h=205" alt="" width="600" height="205" /></p>
<p>As you can see from the screenshot, the level of detail is pretty light for a simple transaction; no actual breakdown of the log record itself is provided, which is a shame &#8211; whilst it does give you some basic information and will help some people, I think if you are at the stage where you are taking an interest in the transaction log, you are already beyond this point.</p>
<p>So as an educational / learning aid, it is pretty good on the page internals side &#8211; for anyone wanting an easier way to visualize page structure while learning, it is still worth grabbing. I would love to see more on the Log side &#8211; but at present the project appears to be in hibernation, with no changes in some considerable time, so I suspect we will not see any enhancements now.</p>
<br />Filed under: <a href='http://sqlfascination.com/category/sql-server/'>SQL Server</a> Tagged: <a href='http://sqlfascination.com/tag/internals/'>Internals</a>, <a href='http://sqlfascination.com/tag/sql-server-2005/'>SQL Server 2005</a>, <a href='http://sqlfascination.com/tag/sql-server-2008/'>SQL Server 2008</a>, <a href='http://sqlfascination.com/tag/sql-training/'>SQL Training</a>, <a href='http://sqlfascination.com/tag/transaction-log/'>Transaction Log</a>]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/11/27/sql-internals-viewer/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>

		<media:content url="http://andrewhogg.files.wordpress.com/2010/11/pageviewer.jpg" medium="image">
			<media:title type="html">pageviewer</media:title>
		</media:content>

		<media:content url="http://andrewhogg.files.wordpress.com/2010/11/translog.jpg" medium="image">
			<media:title type="html">Transaction Log Viewer</media:title>
		</media:content>
	</item>
		<item>
		<title>Interval Partitioning in SQL Server 2008</title>
		<link>http://sqlfascination.com/2010/09/12/interval-partitioning-in-sql-server-2008/</link>
		<comments>http://sqlfascination.com/2010/09/12/interval-partitioning-in-sql-server-2008/#comments</comments>
		<pubDate>Sun, 12 Sep 2010 17:03:33 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Dynamic Partitioning]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[Table Partitioning]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=540</guid>
		<description><![CDATA[Another of those features in Oracle that we do not natively have in SQL Server is interval partitioning, where new partitions are generated for you automatically based on a defined interval. Interval partitioning is yet another form of dynamic partitioning, so the thought was: could this be achieved within SQL Server? [...]]]></description>
			<content:encoded><![CDATA[<p>Another of those features in Oracle that we do not natively have in SQL Server is interval partitioning, where new partitions are generated for you automatically based on a defined interval. Interval partitioning is yet another form of dynamic partitioning, so the thought was: could this be achieved within SQL Server?</p>
<p>My initial thought was to use an INSTEAD OF trigger, which would intercept the incoming values and, where appropriate, extend the partition function before inserting them. An initial look into the documentation suggests it will not be trivial &#8211; the BoL states:</p>
<pre>"Additionally, the following Transact-SQL statements are not allowed inside the
body of a DML trigger when it is used against the table or view that is the
target of the triggering action. CREATE INDEX, ALTER INDEX, DROP INDEX, DBCC
REINDEX, ALTER PARTITION FUNCTION, DROP TABLE..."</pre>
<p>The issue there is ALTER PARTITION FUNCTION &#8211; any response to an incoming piece of data that is not properly covered by the existing partition mechanism will need to alter the partition information. Of course there are still ways around such restrictions, but when experimenting, it seems the BoL is not correct to list the &#8216;ALTER PARTITION FUNCTION&#8217; command in the restrictions.</p>
<p>There are a few caveats to the following snippets of code: I am not attempting to deal with the rolling window effect, or the complex storage patterns of a partitioned table, both of which I have written about in other posts. This is purely designed to demonstrate that it could be achieved, not to provide a total solution to the problem. The partitions created on the fly will also all go to the Primary filegroup.</p>
<p>So start with a very basic partition function and scheme:</p>
<pre><span style="color:#0000ff;">CREATE </span><span style="color:#0000ff;">PARTITION FUNCTION</span> pfTest (<span style="color:#0000ff;">datetime</span>)
<span style="color:#0000ff;">AS RANGE</span> <span style="color:#666699;">LEFT </span><span style="color:#0000ff;">FOR VALUES</span> (<span style="color:#ff0000;">'20100104'</span> , <span style="color:#ff0000;">'20100111'</span> , <span style="color:#ff0000;">'20100118'</span>, <span style="color:#ff0000;">'20100125'</span>)
<span style="color:#0000ff;">CREATE PARTITION</span> SCHEME psTest <span style="color:#0000ff;">AS</span>
<span style="color:#0000ff;">PARTITION</span> pfTest <span style="color:#666699;">ALL</span> <span style="color:#0000ff;">TO</span> ([PRIMARY])</pre>
<p>And generate a table on the partition scheme:</p>
<pre><span style="color:#0000ff;">CREATE TABLE</span> IntervalTest (
MyID <span style="color:#0000ff;">int</span> <span style="color:#0000ff;">identity</span>(1,1) not null,   
MyField <span style="color:#0000ff;">Varchar</span>(200),
MyDate <span style="color:#0000ff;">datetime</span>
) ON psTest(MyDate)</pre>
<p>The next step is an &#8216;Instead of&#8217; trigger, which has to intercept the incoming data from the inserted table, and extend the partition function if required:</p>
<pre><span style="color:#0000ff;">CREATE TRIGGER</span> tr_IntervalTest <span style="color:#0000ff;">ON</span> IntervalTest <span style="color:#0000ff;">INSTEAD OF INSERT
</span><span style="color:#0000ff;">AS</span>
 <span style="color:#0000ff;">BEGIN</span>     

  -- get the current maximum partition value
  <span style="color:#0000ff;">DECLARE</span> @max_part_dt datetime;
  <span style="color:#0000ff;">DECLARE</span> @max_inserted_dt datetime;
  <span style="color:#0000ff;">DECLARE</span> @weeks_to_add int;

  <span style="color:#0000ff;">SET</span> @max_inserted_dt = (<span style="color:#0000ff;">SELECT</span> <span style="color:#ff00ff;">max</span>(MyDate) <span style="color:#0000ff;">FROM</span> inserted);

  <span style="color:#0000ff;">SET</span> @max_part_dt = (
SELECT <span style="color:#ff00ff;">max</span>(<span style="color:#ff00ff;">convert</span>(<span style="color:#0000ff;">datetime</span>,value))
from sys.partition_functions f
  <span style="color:#0000ff;">inner join</span> sys.partition_range_values rv on f.function_id = rv.function_id
  <span style="color:#0000ff;">where</span> name = 'pfTest');      
<span style="color:#0000ff;">IF</span> (@max_inserted_dt &gt; <span style="color:#ff00ff;">dateadd</span>(D,7,@max_part_dt))
  <span style="color:#0000ff;">BEGIN</span>
   <span style="color:#008000;">-- need to potentially add multiple partition splits, it depends on
   -- how many weeks in advance the new data is.
   -- get a whole number of the weeks to add to ensure that we cover </span>
<span style="color:#008000;">   -- the new data
</span>   <span style="color:#0000ff;">SET</span> @weeks_to_add = <span style="color:#ff00ff;">ceiling</span>(<span style="color:#ff00ff;">datediff</span>(D,@max_part_dt, @max_inserted_dt) / 7.0)

   <span style="color:#008000;">-- loop around splitting the partition function with the new weekly values
   -- that need to be covered
</span>   <span style="color:#0000ff;">WHILE</span> @weeks_to_add &gt; 0
   <span style="color:#0000ff;">BEGIN</span>
    <span style="color:#008000;">-- increase the maximum partition date by 7 days and split the function
</span>    <span style="color:#0000ff;">SET </span>@max_part_dt = <span style="color:#ff00ff;">dateadd</span>(D,7,@max_part_dt);

    <span style="color:#0000ff;">ALTER PARTITION</span> <span style="color:#0000ff;">SCHEME</span> psTest <span style="color:#0000ff;">NEXT USED</span> [Primary];

    <span style="color:#0000ff;">ALTER PARTITION FUNCTION</span> pfTest() <span style="color:#0000ff;">SPLIT RANGE</span> (@max_part_dt);
    <span style="color:#0000ff;">SET</span> @weeks_to_add = @weeks_to_add - 1
   <span style="color:#0000ff;">END;</span>
  <span style="color:#0000ff;">END</span>;

  <span style="color:#008000;">-- finally insert the values
</span>  <span style="color:#0000ff;">INSERT INTO</span> IntervalTest (MyField, MyDate)
  SELECT MyField, MyDate
  FROM inserted;  
<span style="color:#0000ff;">END</span>     </pre>
<p>The code is pretty self-explanatory, but I would point out that it is only covering an insert, not an update &#8211; and this is not production code but an experiment to see if it could be done (contrary to the BoL). To &#8216;productionize&#8217; this would require significant work on exception handling, performance tuning, handling multiple filegroup partitioning, the list goes on, but all of it achievable.</p>
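<p>To make the arithmetic in the trigger concrete, the sketch below runs the same calculation outside the trigger. It assumes the partition function is still as created above (highest boundary 25 January 2010) and uses an illustrative incoming date of 6 June 2010.</p>
<pre>DECLARE @max_part_dt datetime;
DECLARE @max_inserted_dt datetime;
SET @max_part_dt = '20100125';      -- highest boundary in pfTest as created above
SET @max_inserted_dt = '20100606';  -- assumed incoming value, for illustration only

SELECT datediff(D, @max_part_dt, @max_inserted_dt) AS days_ahead,                   -- 132
       ceiling(datediff(D, @max_part_dt, @max_inserted_dt) / 7.0) AS weeks_to_add,  -- 19
       dateadd(D, 7 * 19, @max_part_dt) AS new_max_boundary;                        -- 2010-06-07</pre>
<p>Dividing by 7.0 rather than 7 forces a decimal division, so the ceiling rounds 18.86 up to 19 weekly splits; on top of the 4 original boundaries that gives 23 values in the partition function.</p>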
<p>A little test to insert a couple of values dated ahead of the last partition boundary:</p>
<pre><span style="color:#0000ff;">insert into</span> IntervalTest (MyField, MyDate)
<span style="color:#0000ff;">select</span> <span style="color:#ff0000;">'b'</span>, <span style="color:#ff0000;">'20100505'
</span><span style="color:#0000ff;">union</span>
<span style="color:#0000ff;">select</span><span style="color:#ff0000;"> 'c'</span>, <span style="color:#ff0000;">'20100606'</span></pre>
<p>And a check of the partition function now shows 23 values, instead of the original 4, as follows:</p>
<pre>2010-01-04 00:00:00.000, 2010-01-11 00:00:00.000, 2010-01-18 00:00:00.000,
2010-01-25 00:00:00.000, 2010-02-01 00:00:00.000, 2010-02-08 00:00:00.000,
2010-02-15 00:00:00.000, 2010-02-22 00:00:00.000, 2010-03-01 00:00:00.000,
2010-03-08 00:00:00.000, 2010-03-15 00:00:00.000, 2010-03-22 00:00:00.000,
2010-03-29 00:00:00.000, 2010-04-05 00:00:00.000, 2010-04-12 00:00:00.000,
2010-04-19 00:00:00.000, 2010-04-26 00:00:00.000, 2010-05-03 00:00:00.000,
2010-05-10 00:00:00.000, 2010-05-17 00:00:00.000, 2010-05-24 00:00:00.000,
2010-05-31 00:00:00.000, 2010-06-07 00:00:00.000</pre>
<p>It has clearly created the partitions to cover the data being inserted and then performed that insertion.</p>
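<p>A check along those lines could be written as below. This is a sketch using the standard catalog views and the $PARTITION function against the pfTest and IntervalTest objects created earlier; it is not necessarily the exact query used to produce the list above.</p>
<pre>-- list the boundary values now held by the partition function
SELECT rv.boundary_id, convert(datetime, rv.value) AS boundary_value
FROM sys.partition_functions f
  INNER JOIN sys.partition_range_values rv ON f.function_id = rv.function_id
WHERE f.name = 'pfTest'
ORDER BY rv.boundary_id;

-- show which partition each inserted row landed in
SELECT MyField, MyDate, $PARTITION.pfTest(MyDate) AS partition_number
FROM IntervalTest;</pre>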
<p>So it can be done, but the constant cost of intercepting every insertion and update to provide this kind of dynamic partitioning is really not ideal; whether it could be made sufficiently efficient to work at the sort of scale that partitioning tends to be used at is debatable. I have a feeling that it would struggle &#8211; I would need to be lent a suitably sized server and SAN to test that one and see whether it could be made efficient enough.</p>
<br />Filed under: <a href='http://sqlfascination.com/category/sql-server/'>SQL Server</a> Tagged: <a href='http://sqlfascination.com/tag/dynamic-partitioning/'>Dynamic Partitioning</a>, <a href='http://sqlfascination.com/tag/sql-server-2005/'>SQL Server 2005</a>, <a href='http://sqlfascination.com/tag/sql-server-2008/'>SQL Server 2008</a>, <a href='http://sqlfascination.com/tag/table-partitioning/'>Table Partitioning</a>]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/09/12/interval-partitioning-in-sql-server-2008/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Blank Transactions</title>
		<link>http://sqlfascination.com/2010/09/05/blank-transactions/</link>
		<comments>http://sqlfascination.com/2010/09/05/blank-transactions/#comments</comments>
		<pubDate>Sun, 05 Sep 2010 22:08:02 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[Transaction Log]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=535</guid>
		<description><![CDATA[Busy time with a new addition to the household &#8211; sleep is clearly an optional parameter these days, but on to one of those oddities you might see in the transaction log. On many occasions you will see transactions in the log that have no operations: the individual transaction entry just has a LOP_BEGIN_XACT followed [...]]]></description>
			<content:encoded><![CDATA[<p>Busy time with a new addition to the household &#8211; sleep is clearly an optional parameter these days, but on to one of those oddities you might see in the transaction log. On many occasions you will see transactions in the log that have no operations: the individual transaction entry just has a LOP_BEGIN_XACT followed by a LOP_COMMIT_XACT, with no operations being recorded.</p>
<p>So what causes these?<br />
The immediate thought is a straight:</p>
<pre><span style="color:#0000ff;">Begin Transaction
</span><span style="color:#0000ff;">Commit Transaction</span></pre>
<p>If you try that and inspect the log, you will notice that it has not added in this mysterious, zero-operation transaction. So that is not the cause.</p>
<p>How about a rolled back transaction? Well, you should already know that this would not be the answer, since the log reserves space ahead during a transaction to record the undo operations needed for a rollback. To show that in more detail, given the following simple snippet of SQL:</p>
<pre><span style="color:#0000ff;">begin transaction</span>
<span style="color:#0000ff;">insert into</span> test (name) <span style="color:#0000ff;">values</span> (<span style="color:#ff0000;">'a'</span>)
<span style="color:#0000ff;">rollback transaction</span></pre>
<p>The log then shows the transaction beginning, performing the insert, and then rolling it back by deleting the record and recording it as an aborted transaction.</p>
<pre>LOP_ABORT_XACT
LOP_DELETE_ROWS
LOP_INSERT_ROWS
LOP_BEGIN_XACT</pre>
<p>So neither of the first obvious choices is the cause. The reason seems a bit bizarre, but centres around whether a change was attempted but ignored because it was unnecessary. Set up a test table, insert a single row into it with a value of &#8216;a&#8217;, and run the following statement:</p>
<pre><span style="color:#0000ff;">begin transaction</span>
<span style="color:#0000ff;">update</span> test <span style="color:#0000ff;">set</span> name = <span style="color:#ff0000;">'a'</span>
<span style="color:#0000ff;">commit transaction</span></pre>
<p>Now when you inspect the log, there is a blank transaction: it recorded the start and end of the transaction, but no operations are shown. The same is true if the transaction is rolled back.</p>
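<p>If you want to reproduce the inspection step, something like the sketch below should do. It assumes a scratch database in SIMPLE recovery and that the test table already holds its single &#8216;a&#8217; row; fn_dblog is undocumented, so the column names are the ones commonly seen in its output rather than a supported contract.</p>
<pre>CHECKPOINT;   -- in SIMPLE recovery this lets the inactive log be cleared, keeping the output short

begin transaction
update test set name = 'a'
commit transaction

-- show only the transaction markers and row-level operations
SELECT [Current LSN], Operation, Context, [Transaction ID]
FROM ::fn_dblog(null, null)
WHERE Operation IN ('LOP_BEGIN_XACT', 'LOP_COMMIT_XACT', 'LOP_ABORT_XACT',
                    'LOP_MODIFY_ROW', 'LOP_INSERT_ROWS', 'LOP_DELETE_ROWS')
ORDER BY [Current LSN];</pre>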
<p>If the code is altered slightly to deliberately mean that no modification would occur though, the same does not hold true:</p>
<pre><span style="color:#0000ff;">begin transaction</span>
<span style="color:#0000ff;">update</span> test <span style="color:#0000ff;">set</span> name = <span style="color:#ff0000;">'a'</span> <span style="color:#0000ff;">where </span>1 = 0
<span style="color:#0000ff;">commit transaction</span></pre>
<p>Clearly the code is designed to make no modifications, so it is not surprising that no entry occurs in the transaction log. To make the test a bit fairer, let&#8217;s design the code in a way that it might make a modification, but it doesn&#8217;t.</p>
<pre><span style="color:#0000ff;">begin transaction</span>
<span style="color:#0000ff;">update</span> test <span style="color:#0000ff;">set</span> name = <span style="color:#ff0000;">'a'</span> <span style="color:#0000ff;">where</span> name = <span style="color:#ff0000;">'b'</span>
<span style="color:#0000ff;">commit transaction</span></pre>
<p>Still no entry in the transaction log.</p>
<p>So the distinction in the logging of these zero-op transactions is whether or not there was matching data to be altered: if a record was found but the alteration was unnecessary, a zero-op transaction appears in the log. It does make you wonder, why?</p>
<p>It also means that from an auditing perspective, the attempt to modify the data was not logged, not because it was unsuccessful, but because it was being altered to the values it already had.</p>
<br />Filed under: <a href='http://sqlfascination.com/category/sql-server/'>SQL Server</a> Tagged: <a href='http://sqlfascination.com/tag/internals/'>Internals</a>, <a href='http://sqlfascination.com/tag/sql-server-2005/'>SQL Server 2005</a>, <a href='http://sqlfascination.com/tag/sql-server-2008/'>SQL Server 2008</a>, <a href='http://sqlfascination.com/tag/transaction-log/'>Transaction Log</a>]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/09/05/blank-transactions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Subtle change to ::fn_dblog in SQL 2008</title>
		<link>http://sqlfascination.com/2010/07/15/subtle-change-to-fn_dblog-in-sql-2008/</link>
		<comments>http://sqlfascination.com/2010/07/15/subtle-change-to-fn_dblog-in-sql-2008/#comments</comments>
		<pubDate>Thu, 15 Jul 2010 21:13:51 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[Transaction Log]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=527</guid>
		<description><![CDATA[Bizarrely, given that transaction log decoding is not really viable as a technique beyond interest, it&#8217;s quite surprising how many people visit just to understand or learn how to do it. Either a lot of data is being lost and no backups exist &#8211; or it fascinates a large number of people. In either instance, [...]]]></description>
			<content:encoded><![CDATA[<p>Bizarrely, given that transaction log decoding is not really viable as a technique beyond interest, it&#8217;s quite surprising how many people visit just to understand or learn how to do it. Either a lot of data is being lost and no backups exist &#8211; or it fascinates a large number of people.</p>
<p>In either instance, there are sometimes subtle changes that can pass you by, and old habits can make life a bit harder. One such recent change that I missed was an alteration to the ::fn_dblog function. I really should have spent some time investigating the new fields that had been added &#8211; but with documentation being limited to say the least, it has not been the top priority, and my habit is to fire up SQL 2005 when checking something in the transaction log &#8211; so I had not noticed them.</p>
<p>One of the changes is the creation of a link between the contents of sys.dm_tran_current_transaction and the transaction log itself. The contents of the view have not altered, but one of the new fields in the output from ::fn_dblog is called Xact ID &#8211; this field contains the same value as the transaction_id that is output from the DMV.</p>
<p>Now that is significantly convenient for when you&#8217;re poking around in logs trying to understand cause and effect.</p>
<p>A couple of simple stored procedures assist in saving and returning that filtered information, as follows:</p>
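<p>A quick way to see that link for yourself is to filter the log on the current transaction&#8217;s id while the transaction is still open. This is only a sketch: fn_dblog is undocumented, and tbltest / firstname are simply the table and column used in the example further down.</p>
<pre>BEGIN TRAN
UPDATE tbltest SET firstname = 'ab'

-- log records whose Xact ID matches the transaction_id reported by the DMV
SELECT [Current LSN], Operation, [Transaction ID], [Xact ID]
FROM ::fn_dblog(null,null)
WHERE [Xact ID] = (SELECT transaction_id FROM sys.dm_tran_current_transaction)
COMMIT TRAN</pre>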
<pre><span style="color:#0000ff;">CREATE PROCEDURE</span> sp_store_transaction
<span style="color:#0000ff;">AS</span>
<span style="color:#0000ff;">BEGIN</span>
<span style="color:#008000;">  -- capture the id of the transaction we are currently inside</span>
<span style="color:#0000ff;">  DECLARE</span> @dm_transaction_id bigint
<span style="color:#0000ff;">  SELECT </span>@dm_transaction_id = transaction_id <span style="color:#0000ff;">FROM </span><span style="color:#008000;">sys.dm_tran_current_transaction</span>
<span style="color:#0000ff;">  IF </span>OBJECT_ID (<span style="color:#ff0000;">N'tempdb.dbo.##db_last_transaction'</span>) IS NOT NULL
<span style="color:#0000ff;">    DROP TABLE</span> ##db_last_transaction
<span style="color:#008000;">  -- store the log's own transaction id for the matching Xact ID</span>
<span style="color:#0000ff;">  SELECT </span>[transaction id] INTO ##db_last_transaction
<span style="color:#0000ff;">  FROM </span>::<span style="color:#008000;">fn_dblog</span>(null,null)
<span style="color:#0000ff;">  WHERE </span>[Xact ID] = @dm_transaction_id
<span style="color:#0000ff;">END</span>
<span style="color:#0000ff;">GO</span></pre>
<pre><span style="color:#0000ff;">CREATE </span><span style="color:#0000ff;">PROCEDURE </span>sp_get_last_transaction
<span style="color:#0000ff;">AS</span>
<span style="color:#0000ff;">BEGIN</span>
<span style="color:#0000ff;">  SELECT</span> *
<span style="color:#0000ff;">  FROM </span>::<span style="color:#008000;">fn_dblog</span>(null,null)
<span style="color:#0000ff;">  WHERE</span> [Transaction ID] = (SELECT [transaction id] FROM ##db_last_transaction)
<span style="color:#0000ff;">END</span>
<span style="color:#0000ff;">GO</span></pre>
<p>And that is going to make looking at cause and effect far easier &#8211; call the first procedure anywhere within your transaction to store the current transaction id from the log, and the second to retrieve the log records for that stored transaction id. Something like:</p>
<pre><span style="color:#0000ff;">BEGIN TRAN</span>
<span style="color:#0000ff;">UPDATE </span>tbltest <span style="color:#0000ff;">SET </span>firstname = <span style="color:#ff0000;">'ab'</span>

<span style="color:#0000ff;">EXEC </span>sp_store_transaction
<span style="color:#0000ff;">COMMIT TRAN</span>

<span style="color:#0000ff;">EXEC </span>sp_get_last_transaction</pre>
<p>If you are wondering why I stored only the transaction id that ties the log records together, rather than dumping the records themselves &#8211; if you dumped the records before the transaction committed or rolled back, you would not see the effect of that action. You need to retrieve all the log rows associated with that transaction after it has finished.</p>
<br />Filed under: <a href='http://sqlfascination.com/category/sql-server/'>SQL Server</a> Tagged: <a href='http://sqlfascination.com/tag/internals/'>Internals</a>, <a href='http://sqlfascination.com/tag/sql-server-2008/'>SQL Server 2008</a>, <a href='http://sqlfascination.com/tag/transaction-log/'>Transaction Log</a>]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/07/15/subtle-change-to-fn_dblog-in-sql-2008/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>SQL Server Hash Partitioning</title>
		<link>http://sqlfascination.com/2010/05/31/sql-server-hash-partitioning/</link>
		<comments>http://sqlfascination.com/2010/05/31/sql-server-hash-partitioning/#comments</comments>
		<pubDate>Mon, 31 May 2010 18:53:47 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Dynamic Partitioning]]></category>
		<category><![CDATA[Query Parameterisation]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[Table Partitioning]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=514</guid>
		<description><![CDATA[It&#8217;s been a while since the last post, primarily due to changing jobs and now spending most of my time on Oracle &#8211; although it is always good to see the other side of the coin and see what it has to offer, but I won&#8217;t be abandoning SQL Server, that is for certain. One [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=sqlfascination.com&#038;blog=9662534&#038;post=514&#038;subd=andrewhogg&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s been a while since the last post, primarily due to changing jobs and now spending most of my time on Oracle. It is always good to see the other side of the coin and what it has to offer, but I won&#8217;t be abandoning SQL Server, that is for certain.</p>
<p>One of the more interesting features to me in Oracle is hash partitioning &#8211; the ability to spread a table across a defined number of partitions and let a hashing function decide which partition each row goes in. Why would that be handy? SQL Server partitioning is in effect range partitioning, in which you define the dividing points on the number line / alphabet. That suits partitioning on a defined number range or date range, but does not suit other types such as a GUID.</p>
<p>The merits of such a partition could be debated, since with a decent index in place the benefits of partition elimination within the query plan can be limited. Regardless of those merits (and I am pretty sure it is not going to be performant at scale), could SQL Server implement hash partitioning? On a side note, this could be considered semi-dynamic partitioning, in that the hash function lets the partitioning cope with additional data outside of the expected range.</p>
<p>I&#8217;ve seen a few articles try and perform hash partitioning by pre-processing the insert statement, prior to insertion into the database, but what about something a bit more native?</p>
<p>To start with, we need to create a partition function and partition scheme to support this endeavour; both are pretty easy to construct.</p>
<pre><span style="color:#0000ff;">CREATE</span> <span style="color:#0000ff;">PARTITION</span> <span style="color:#0000ff;">FUNCTION</span> [myPartitionFunction] (<span style="color:#0000ff;">int</span>)
<span style="color:#0000ff;">AS</span> <span style="color:#0000ff;">RANGE</span> LEFT <span style="color:#0000ff;">FOR</span> <span style="color:#0000ff;">VALUES</span> (100,200,300,400,500,600,700,800,900)
<span style="color:#0000ff;">CREATE</span> <span style="color:#0000ff;">PARTITION</span> <span style="color:#0000ff;">SCHEME</span> [myPartitionScheme] <span style="color:#0000ff;">AS</span>
<span style="color:#0000ff;">PARTITION</span> [myPartitionFunction] ALL <span style="color:#0000ff;">TO</span> ([FG1])</pre>
<p>I&#8217;ve set up the partition scheme to assign all of the partitions to FG1 just for convenience; it could easily be spread over multiple filegroups. Note that the 9 boundary values create 10 partitions, and the function could just as easily be constructed with 999 partitions.</p>
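<p>As a quick sanity check on the boundaries, the built-in $PARTITION function reports which partition a given value would land in. This is only a minimal sketch: it assumes the partition function above already exists in the current database, and the sample values are arbitrary.</p>
<pre>-- RANGE LEFT means each boundary value belongs to the partition on its left
SELECT $PARTITION.myPartitionFunction(0)   AS p_0,    -- partition 1
       $PARTITION.myPartitionFunction(100) AS p_100,  -- partition 1 (boundary stays left)
       $PARTITION.myPartitionFunction(101) AS p_101,  -- partition 2
       $PARTITION.myPartitionFunction(998) AS p_998   -- partition 10 (everything above 900)</pre>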
<p>There are a variety of hashing algorithms and functions, but given the range covered by the partition function, I have chosen a very simple modulo on the unique identifier after converting it to binary and then to a bigint. The only trick here is that we must create the function with schema binding, otherwise SQL Server will refuse to use the function later on when we persist the column and partition on it.</p>
<pre><span style="color:#0000ff;">CREATE</span> <span style="color:#0000ff;">FUNCTION</span> GuidHash (@guid_value uniqueidentifier) <span style="color:#0000ff;">RETURNS</span> <span style="color:#0000ff;">int</span>
<span style="color:#0000ff;">WITH</span> <span style="color:#0000ff;">SCHEMABINDING</span> <span style="color:#0000ff;">AS</span>
<span style="color:#0000ff;">BEGIN</span>
 <span style="color:#0000ff;">RETURN</span> <span style="color:#ff00ff;">abs</span>(<span style="color:#ff00ff;">convert</span>(<span style="color:#0000ff;">bigint</span>,<span style="color:#ff00ff;">convert</span>(<span style="color:#0000ff;">varbinary</span>,@guid_value))) % 999
<span style="color:#0000ff;">END</span></pre>
<p>That is a pretty simple hashing function, but the point is to demonstrate that it can be done, not to implement the best hashing algorithm that will give the most even distribution etc. The next step is to create the table, with the persisted column defined using the GuidHash function. If the function is not schema bound, you get an error thrown at this stage.</p>
<pre><span style="color:#0000ff;">CREATE</span> <span style="color:#0000ff;">TABLE</span> MyTable(  MyID <span style="color:#0000ff;">UniqueIdentifier</span> not null,  
SomeField <span style="color:#0000ff;">Char</span>(200), 
PartitionID <span style="color:#0000ff;">as</span> dbo.GuidHash(MyId) <span style="color:#0000ff;">PERSISTED</span>
)
<span style="color:#0000ff;">ON </span>myPartitionScheme(PartitionID)</pre>
<p>The surprise here is that it accepts the table definition &#8211; since when would you expect a table&#8217;s partitioning column to be a computed column?</p>
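<p>The schema-binding requirement can be checked directly, since a persisted computed column needs a function that SQL Server regards as deterministic. The sketch below assumes the function was created in the dbo schema, as the table definition implies; a value of 1 in a column means the property holds.</p>
<pre>-- Inspect the properties that matter for a persisted computed column
SELECT OBJECTPROPERTY(OBJECT_ID('dbo.GuidHash'), 'IsSchemaBound')   AS IsSchemaBound,
       OBJECTPROPERTY(OBJECT_ID('dbo.GuidHash'), 'IsDeterministic') AS IsDeterministic</pre>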
<p>Time to put an index on the table. Given the data is looked up by a unique identifier, it would not be unusual to place a non-clustered index on the table and to use index alignment, i.e. place it on the same partitioning scheme.</p>
<pre><span style="color:#0000ff;">CREATE NONCLUSTERED INDEX</span><span style="color:#0000ff;"> </span>[ix_id] <span style="color:#0000ff;">ON</span> [dbo].[MyTable] (  
[MyID] <span style="color:#0000ff;">ASC</span>,  
[PartitionID] <span style="color:#0000ff;">ASC</span>
) <span style="color:#0000ff;">ON</span> [myPartitionScheme]([PartitionID])</pre>
<p>Populate the table with some test data:</p>
<pre><span style="color:#0000ff;">DECLARE </span>@guid <span style="color:#0000ff;">uniqueidentifier</span>
<span style="color:#0000ff;">SET</span> @guid = <span style="color:#ff00ff;">newid</span>()
<span style="color:#0000ff;">INSERT INTO </span>mytable (myid, somefield) <span style="color:#0000ff;">VALUES</span> (@guid, <span style="color:#ff0000;">'some text'</span>)
go 10000</pre>
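<p>To see how the rows were spread by the hash, the row count per partition can be read from sys.partitions. This is a sketch that assumes the table lives in the dbo schema; because the table was created without a clustered index it is a heap, so the rows are reported under index_id 0.</p>
<pre>-- Row count per partition for the base table
SELECT p.partition_number, p.rows
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID('dbo.MyTable')
  AND p.index_id IN (0, 1)   -- 0 = heap, 1 = clustered index
ORDER BY p.partition_number</pre>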
<p>So what happens when we select a single row from our data? For convenience I looked up a value in the table and grabbed its GUID, then compared the following two queries side by side:</p>
<pre><span style="color:#0000ff;">SELECT </span>* <span style="color:#0000ff;">FROM </span>mytable <span style="color:#0000ff;">WHERE </span>myid =<span style="color:#ff0000;"> 'D41CA3AC-06D1-4ACC-ABCA-E67A18245596' </span>
<span style="color:#0000ff;">SELECT</span> * <span style="color:#0000ff;">FROM </span>mytable <span style="color:#0000ff;">WHERE </span>(partitionid = dbo.guidhash (<span style="color:#ff0000;">'D41CA3AC-06D1-4ACC-ABCA-E67A18245596'</span>) 
and myid = <span style="color:#ff0000;">'D41CA3AC-06D1-4ACC-ABCA-E67A18245596'</span>)</pre>
<p>The comparison is interesting. In percentage terms, the query cost split reported for the batch was 85% to 15%, and the IO statistics read:</p>
<pre>First Query : Scan count 10, Logical Reads 21
Second Query : Scan count 1, Logical Reads 3</pre>
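<p>For reference, figures like these can be captured by switching on IO statistics for the session before running the queries. This is a sketch of how that might look; the GUID is the one from my own data, so substitute a value that exists in your table.</p>
<pre>SET STATISTICS IO ON

-- Scan count and logical reads appear in the Messages tab for each statement
SELECT * FROM mytable WHERE myid = 'D41CA3AC-06D1-4ACC-ABCA-E67A18245596'

SELECT * FROM mytable
WHERE partitionid = dbo.GuidHash('D41CA3AC-06D1-4ACC-ABCA-E67A18245596')
  AND myid = 'D41CA3AC-06D1-4ACC-ABCA-E67A18245596'

SET STATISTICS IO OFF</pre>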
<p>So the hashing has clearly made the query faster, but only relative to the query that didn&#8217;t use the partition key, which shouldn&#8217;t be too surprising: partition elimination should beat checking every partition, so all this shows is that partition elimination is occurring. So how does it stack up against a normal table, i.e. have we gained anything? To test, we need to put 10k rows into the same table minus the computed column, index it and perform the same kind of select. That is all easy stuff so I will not write the code here. The results of a select from a normal table?</p>
<pre>Normal Table Query : Scan Count 1, Logical Reads 3</pre>
<p>And when run side by side, the SSMS window reports a 50% split of work between the two queries within the batch &#8211; which is not surprising given the IO costs were listed as the same &#8211; so where is the catch? There is no such thing as a free lunch, and the additional cost here is the CPU to generate the PartitionID value for the hashed GUID, but as a technique to partition based on a GUID, it has some merits.</p>
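<p>For anyone wanting to repeat the comparison, the unpartitioned table could be set up roughly as follows. The names MyNormalTable and ix_normal_id are made up for this sketch and are not from the original test.</p>
<pre>-- Same shape as MyTable, minus the computed column and the partition scheme
CREATE TABLE MyNormalTable (
  MyID uniqueidentifier NOT NULL,
  SomeField char(200)
)
GO
CREATE NONCLUSTERED INDEX ix_normal_id ON dbo.MyNormalTable (MyID ASC)
GO
-- GO with a count repeats the batch (SSMS / sqlcmd feature)
INSERT INTO MyNormalTable (MyID, SomeField) VALUES (NEWID(), 'some text')
GO 10000</pre>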
<p>One final thing that I did notice is that under simple parameterization the GuidHash-based query does not parameterize, which would start having detrimental effects on the query cache. Once the database was placed under forced parameterization, the query did parameterize appropriately, so you either want forced parameterization on or to use stored procedures. I would vote for the latter.</p>
<p>As a technique it has some merits, but you have to remember to manually include the PartitionID column within each query and run the value through the hashing function, which is not ideal but manageable.</p>
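<p>A stored procedure is one way to keep the hashing in a single place so callers cannot forget the PartitionID predicate. The procedure name below is hypothetical; treat it as a sketch rather than tested production code.</p>
<pre>CREATE PROCEDURE dbo.GetMyTableRow
  @id uniqueidentifier
AS
BEGIN
  SET NOCOUNT ON
  -- The hash predicate allows partition elimination; the MyID predicate finds the row
  SELECT MyID, SomeField
  FROM dbo.MyTable
  WHERE PartitionID = dbo.GuidHash(@id)
    AND MyID = @id
END
GO
EXEC dbo.GetMyTableRow @id = 'D41CA3AC-06D1-4ACC-ABCA-E67A18245596'</pre>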
<br />Filed under: <a href='http://sqlfascination.com/category/sql-server/'>SQL Server</a> Tagged: <a href='http://sqlfascination.com/tag/dynamic-partitioning/'>Dynamic Partitioning</a>, <a href='http://sqlfascination.com/tag/query-parameterisation/'>Query Parameterisation</a>, <a href='http://sqlfascination.com/tag/sql-server-2005/'>SQL Server 2005</a>, <a href='http://sqlfascination.com/tag/sql-server-2008/'>SQL Server 2008</a>, <a href='http://sqlfascination.com/tag/table-partitioning/'>Table Partitioning</a>]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/05/31/sql-server-hash-partitioning/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Immutable Primary Key vs Immutable Clustered Key</title>
		<link>http://sqlfascination.com/2010/04/19/immutable-primary-key-vs-immutable-clustered-key/</link>
		<comments>http://sqlfascination.com/2010/04/19/immutable-primary-key-vs-immutable-clustered-key/#comments</comments>
		<pubDate>Mon, 19 Apr 2010 22:07:10 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Best Practise]]></category>
		<category><![CDATA[Indexes]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[Transaction Log]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=509</guid>
		<description><![CDATA[It is often said that a primary key should be immutable, and this advice is echoed on a multitude of sites some of which strengthen it to a &#8216;law&#8217; - but we know with databases that absolutes are rare and it is very difficult to be 100% prescriptive. There is then no mention of the clustering [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=sqlfascination.com&#038;blog=9662534&#038;post=509&#038;subd=andrewhogg&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>It is often said that a primary key should be immutable, and this advice is echoed on a multitude of sites some of which strengthen it to a &#8216;law&#8217; - but we know with databases that absolutes are rare and it is very difficult to be 100% prescriptive. There is then no mention of the clustering key being immutable alongside it, which strikes me as strange since it is just as important.</p>
<p>What happens within SQL Server to the row if you change the clustered key?</p>
<ul>
<li>If you change the clustered key value the row must be physically moved and can result in page splits / fragmentation.</li>
<li>A change of the clustered key requires all the non-clustered indexes to be updated to reflect the new clustering key value.</li>
</ul>
<p>And if you change the primary key?</p>
<ul>
<li>A change of the primary key has to be reflected in each of the other tables that use the key as a linking mechanism.</li>
<li>The primary key must still uniquely identify the row within the table.</li>
</ul>
<p>Clearly different issues, but why does the primary key immutability get so much attention and not the clustering key? The default behaviour of SQL Server is that the primary key becomes the clustering key, so in essence all 4 points get applied, but you can choose a different primary key to the clustering key.</p>
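<p>Splitting the two keys just means declaring the primary key as non-clustered and creating the clustered index separately. The table and index names here are invented for illustration only.</p>
<pre>-- Primary key on the identity column, enforced by a non-clustered index
CREATE TABLE dbo.MyOtherTable (
  MyID int IDENTITY(1,1) NOT NULL,
  FirstName varchar(20) NOT NULL,
  SecondName varchar(30) NOT NULL,
  CONSTRAINT PK_MyOtherTable PRIMARY KEY NONCLUSTERED (MyID)
)
GO
-- Clustering key chosen independently of the primary key
CREATE CLUSTERED INDEX CIX_MyOtherTable_SecondName ON dbo.MyOtherTable (SecondName)</pre>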
<p>What sort of expense are we talking about if we allow the clustering key to be mutable and start altering a row&#8217;s clustered key value? To get a better understanding of what is going on under the hood, I&#8217;ll construct an example and check the transaction log.</p>
<p>Creating a simple table in SQL and inserting a few rows to set up the test is pretty easy:</p>
<pre><span style="color:#0000ff;">CREATE TABLE</span> [dbo].[MyTable](
 [MyID] [int] <span style="color:#0000ff;">IDENTITY</span>(1,1) NOT NULL,
 [FirstName] [varchar](20) <span style="color:#0000ff;">COLLATE</span> SQL_Latin1_General_CP1_CI_AS NOT NULL,
 [SecondName] [varchar](30) <span style="color:#0000ff;">COLLATE</span> SQL_Latin1_General_CP1_CI_AS NOT NULL,
 <span style="color:#0000ff;">CONSTRAINT</span> [PK_Table_1] <span style="color:#0000ff;">PRIMARY KEY CLUSTERED</span>
(
 [FirstName] <span style="color:#0000ff;">ASC</span>
)<span style="color:#0000ff;">WITH</span> (IGNORE_DUP_KEY = <span style="color:#0000ff;">OFF</span>) <span style="color:#0000ff;">ON</span> [PRIMARY]
) <span style="color:#0000ff;">ON</span> [PRIMARY]</pre>
<p>Insert a few rows and then issue a simple update statement.</p>
<pre><span style="color:#0000ff;">update</span> mytable <span style="color:#0000ff;">set</span> firstname = <span style="color:#ff0000;">'TestFirstName'</span>, Secondname = <span style="color:#ff0000;">'TestSecondName'</span> <span style="color:#0000ff;">where</span> MyID = 2</pre>
<p>Inspect the transaction log and it is noticeable that the log does not contain a LOP_MODIFY_ROW or LOP_MODIFY_COLUMNS within it, but contains a LOP_DELETE_ROWS and a LOP_INSERT_ROWS. Instead of just modifying the data, SQL has removed the row and reinserted it. A few other items appear with the transaction, LOP_BEGIN_XACT and LOP_COMMIT_XACT as you would expect to start and commit the transaction. There is also a LOP_SET_BITS on the LCX_PFS which is not surprising to see either, since we have potentially affected the free space level of the page the data was inserted into.</p>
<p>That maps to exactly what we expect from a high level logical perspective &#8211; the row has to be moved and there is no LOP_MOVE_ROW operation. This results in the row being placed into the transaction log twice, as a before and after.</p>
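<p>The log records can be listed with the undocumented fn_dblog function. Being undocumented, its output may differ between versions, so this is a sketch for a test database only; running a CHECKPOINT first under the simple recovery model is an assumption that keeps the output short.</p>
<pre>-- NULL, NULL = no start or end LSN filter, i.e. the whole active log
SELECT [Current LSN], Operation, Context, [Transaction ID],
       [Log Record Length], AllocUnitName
FROM fn_dblog(NULL, NULL)
ORDER BY [Current LSN]</pre>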
<p>What happens if we perform an update that does not include the clustered key?</p>
<pre><span style="color:#0000ff;">update</span> mytable <span style="color:#0000ff;">set</span> Secondname = <span style="color:#ff0000;">'AnotherTestSecondName'</span> <span style="color:#0000ff;">where</span> MyID = 3</pre>
<p>This time the log only includes 3 entries, the LOP_BEGIN_XACT / LOP_COMMIT_XACT pair and a single LOP_MODIFY_ROW, which is more what you would expect.</p>
<p>Size wise, the transaction log entry length for the first alteration was 96 + 148 + 56 + 160 + 52 = 512 bytes. For the second it was only 96 + 144 + 52 = 292 bytes. So the key alteration used more log space, and due to write-ahead logging that must be committed to disk, but the actual difference for a single row does not look too significant.</p>
<p>Well, whilst it does not look significant, you have to remember that the row being modified was very small. As previous examples have shown the LOP_DELETE_ROWS and LOP_INSERT_ROWS include the entire contents of the row being removed / added, so with a larger row the entire contents of the row would be added to the log twice, compared to the simple modification. That would start to get expensive.</p>
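<p>Rather than adding up the record lengths by hand, the totals can be grouped per transaction. As with any use of the undocumented fn_dblog, this is a sketch to run against a test database, not a supported technique.</p>
<pre>-- One row per transaction: number of log records and total bytes logged
SELECT [Transaction ID],
       COUNT(*) AS LogRecords,
       SUM([Log Record Length]) AS TotalBytes
FROM fn_dblog(NULL, NULL)
GROUP BY [Transaction ID]
ORDER BY [Transaction ID]</pre>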
<p>So altering the clustering key is clearly expensive for the transaction log in comparison to a normal update, and this example did not have additional non-clustered indexes added to the table, which would also then require even more entries to deal with the removal and re-insertion of the non-clustered index values.</p>
<p>Given a choice I would make both immutable. The primary key shouldn&#8217;t be the only one to get special treatment and be designed to be immutable; make the clustering key immutable as well.</p>
<br />Filed under: <a href='http://sqlfascination.com/category/sql-server/'>SQL Server</a> Tagged: <a href='http://sqlfascination.com/tag/best-practise/'>Best Practise</a>, <a href='http://sqlfascination.com/tag/indexes/'>Indexes</a>, <a href='http://sqlfascination.com/tag/sql-server-2005/'>SQL Server 2005</a>, <a href='http://sqlfascination.com/tag/sql-server-2008/'>SQL Server 2008</a>, <a href='http://sqlfascination.com/tag/transaction-log/'>Transaction Log</a>]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/04/19/immutable-primary-key-vs-immutable-clustered-key/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>SQL Immersion Event &#8211; Dublin 2010</title>
		<link>http://sqlfascination.com/2010/04/13/sql-immersion-event-dublin-2010/</link>
		<comments>http://sqlfascination.com/2010/04/13/sql-immersion-event-dublin-2010/#comments</comments>
		<pubDate>Tue, 13 Apr 2010 17:26:22 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[SQL Training]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=501</guid>
		<description><![CDATA[I attended the SQL Immersion event last year in Dublin and can honestly say that it was the best training course I have ever attended. The level of detail is phenomenal and the interaction with Paul and Kim is superb. I can not recommend the course heavily enough and anyone who is serious about SQL should [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=sqlfascination.com&#038;blog=9662534&#038;post=501&#038;subd=andrewhogg&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I attended the SQL Immersion event last year in Dublin and can honestly say that it was the best training course I have ever attended. The level of detail is phenomenal and the interaction with Paul and Kim is superb. I cannot recommend the course highly enough, and anyone who is serious about SQL should make the effort to attend one of these; I would even go as far as to say fund it yourself if you have to. <a href="http://www.prodata.ie" target="_blank">Prodata</a> have not only managed to get Paul and Kim back to run it again, but have also scheduled 2 additional master class courses. I have a feeling these must include more material from the SQL MCM course, which would be superb, but I will have to check whether the bank balance can handle doing them.</p>
<p>The Immersion event is again split into two tracks, DBA and Developer, although both sides of that fence benefit from having a good in-depth understanding of the other, so I would ignore the distinction and go for the full &#8216;Immersion&#8217;. There is no reason that a DBA shouldn&#8217;t understand indexes and index tuning in depth, or that a developer shouldn&#8217;t have a good understanding of the transaction log and internal structures within SQL. I did the full course before and spent most evenings using the day&#8217;s material to find out new things, many of which have become topics that I have written about.</p>
<p>Early registration to the courses attracts a 15% discount, but using the promotion code SQLH you will get a 20% discount instead. On the full Immersion course that is a further ~100 euros off the price, which can&#8217;t be bad.</p>
<p>The Immersion event is running from the 28th June to 1st July, and registration is <a href="http://www.prodata.ie/Events/sqlimmersiondublin2010/" target="_blank">here</a>.</p>
<p>The two additional master classes are being run the week after, and these are advertised as being material that is not on the Immersion course, but as mentioned &#8211; I&#8217;m not entirely sure what that is, and given how much detail is on the immersion course, that is going to have to be some very deep internals stuff.</p>
<p>The performance master class is being run on the 5th and 6th of July, registration is <a href="http://www.prodata.ie/Events/sqlimmersionDublin2010/MasterClassPTO.aspx" target="_blank">here</a>, whilst the DR master class registration is <a href="http://www.prodata.ie/Events/sqlimmersionDublin2010/MasterClassDR.aspx" target="_blank">here</a>. Where&#8217;s that cheque book?</p>
<br />Filed under: <a href='http://sqlfascination.com/category/sql-server/'>SQL Server</a> Tagged: <a href='http://sqlfascination.com/tag/sql-server-2005/'>SQL Server 2005</a>, <a href='http://sqlfascination.com/tag/sql-server-2008/'>SQL Server 2008</a>, <a href='http://sqlfascination.com/tag/sql-training/'>SQL Training</a>]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/04/13/sql-immersion-event-dublin-2010/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Query Parameterization and Wildcard Searches</title>
		<link>http://sqlfascination.com/2010/04/06/query-parameterization-and-wildcard-searches/</link>
		<comments>http://sqlfascination.com/2010/04/06/query-parameterization-and-wildcard-searches/#comments</comments>
		<pubDate>Tue, 06 Apr 2010 19:14:27 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Plan Cache]]></category>
		<category><![CDATA[Query Parameterisation]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=495</guid>
		<description><![CDATA[Time flies when you&#8217;re busy, and it has been far too long since I last posted. To business however, and I noticed a problem in query parameterization the other day which does not make much sense at first glance. To demonstrate, I will use the AdventureWorks example database and use a couple of simple queries. [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=sqlfascination.com&#038;blog=9662534&#038;post=495&#038;subd=andrewhogg&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Time flies when you&#8217;re busy, and it has been far too long since I last posted. To business however: I noticed a problem in query parameterization the other day which does not make much sense at first glance. To demonstrate, I will use the AdventureWorks example database and a couple of simple queries. As with all parameterization issues, you need to make sure that you know which mode the database is in, so I&#8217;ll begin by setting it to simple mode:</p>
<pre><span style="color:#0000ff;">ALTER DATABASE</span> AdventureWorks <span style="color:#0000ff;">SET</span> PARAMETERIZATION SIMPLE</pre>
<p>And then run two queries, separately so that they are not considered a single batch.</p>
<pre><span style="color:#0000ff;">Select </span>* <span style="color:#0000ff;">from</span> HumanResources.Employee <span style="color:#0000ff;">where</span> loginid like <span style="color:#ff0000;">'%a%'</span>
<span style="color:#0000ff;">Select</span> * <span style="color:#0000ff;">from</span> HumanResources.Employee <span style="color:#0000ff;">where</span> loginid like <span style="color:#ff0000;">'%b%'</span></pre>
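<p>The plan cache can be inspected with the plan cache DMVs. This is a minimal sketch; the filter on the table name is just a convenient assumption for finding the two test queries.</p>
<pre>-- objtype shows Adhoc for unparameterized plans and Prepared for parameterized ones
SELECT cp.usecounts, cp.objtype, st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE '%HumanResources.Employee%'
  AND st.text NOT LIKE '%dm_exec_cached_plans%'   -- hide this query itself</pre>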
<p>Under simple parameterization it should not be too surprising to see that when the query cache is inspected, the queries have not been parameterized, and 2 entries exist within the cache. So what happens when the mode is changed to Forced?</p>
<pre><span style="color:#0000ff;">ALTER DATABASE</span> AdventureWorks <span style="color:#0000ff;">SET</span> PARAMETERIZATION FORCED</pre>
<p>Clear down the query cache and try the two queries again, in the hope of a plan cache hit &#8211; and it hasn&#8217;t changed. Two query plans still show in the cache and there was no parameterization. Perhaps it is the existence of the 2 wildcard characters? No: altering the wildcards makes no difference, and even removing them entirely still results in a separate plan cache entry for each query.</p>
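<p>Clearing the cache between runs can be done with DBCC FREEPROCCACHE. It removes every plan on the instance, so this sketch is only meant for a test server.</p>
<pre>-- Flushes the entire plan cache for the instance; do not run on a busy production box
DBCC FREEPROCCACHE</pre>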
<p>Parameterization is not limited to dates and numbers, it will work on strings without any problem, but clearly the introduction of the like clause prevents the cache hit. This behaviour is on both SQL Server 2005 and 2008 &#8211; which is a bit annoying.</p>
<p>So how can we get around this problem?</p>
<p>Well, bizarrely, by just using a different syntax to mean the same thing. PATINDEX takes a wildcard pattern just like the LIKE clause, but returns the character position of the match rather than a simple true or false. If the pattern is not found it returns zero, so the simple replacement is to use PATINDEX and look for any value greater than zero.</p>
<pre><span style="color:#0000ff;">Select </span>* <span style="color:#0000ff;">from</span> HumanResources.Employee <span style="color:#0000ff;">where </span><span style="color:#ff00ff;">patindex</span>(<span style="color:#ff0000;">'%a%'</span>,loginid) &gt; 0 
<span style="color:#0000ff;">Select</span> * <span style="color:#0000ff;">from</span> HumanResources.Employee <span style="color:#0000ff;">where</span> <span style="color:#ff00ff;">patindex</span>(<span style="color:#ff0000;">'%b%'</span>,loginid) &gt; 0</pre>
<p>In simple mode this still produces 2 cache entries, but in forced mode you finally get a plan cache hit!</p>
<p>If only solving it was that simple&#8230; by using PATINDEX the where clause has become non-sargable. That makes no difference if you have a wildcard either side of your expression, but if you only had a trailing wildcard then this would produce a very bad performance hit. The cost of the extra query plans in memory is unlikely to be more than the cost of using scans to resolve the query, so faced with a few additional query plans in memory for wildcard searches, you might be best to leave them there.</p>
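<p>Where the calling code can be changed, another option is to parameterize explicitly instead of relying on the database setting. This is a sketch using sp_executesql, which was not part of the test above; the parameter name @pattern is my own choice, and the statement text has to stay identical between calls for the plan to be reused.</p>
<pre>-- Same statement text on every call, only the parameter value changes
EXEC sp_executesql
  N'SELECT * FROM HumanResources.Employee WHERE LoginID LIKE @pattern',
  N'@pattern nvarchar(256)',
  @pattern = N'%a%'

EXEC sp_executesql
  N'SELECT * FROM HumanResources.Employee WHERE LoginID LIKE @pattern',
  N'@pattern nvarchar(256)',
  @pattern = N'%b%'</pre>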
<br />Filed under: <a href='http://sqlfascination.com/category/sql-server/'>SQL Server</a> Tagged: <a href='http://sqlfascination.com/tag/plan-cache/'>Plan Cache</a>, <a href='http://sqlfascination.com/tag/query-parameterisation/'>Query Parameterisation</a>, <a href='http://sqlfascination.com/tag/sql-server-2005/'>SQL Server 2005</a>, <a href='http://sqlfascination.com/tag/sql-server-2008/'>SQL Server 2008</a>]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/04/06/query-parameterization-and-wildcard-searches/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Locating Table Scans Within the Query Cache</title>
		<link>http://sqlfascination.com/2010/03/10/locating-table-scans-within-the-query-cache/</link>
		<comments>http://sqlfascination.com/2010/03/10/locating-table-scans-within-the-query-cache/#comments</comments>
		<pubDate>Wed, 10 Mar 2010 21:53:00 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Indexes]]></category>
		<category><![CDATA[Plan Cache]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=484</guid>
		<description><![CDATA[Some time back, Pinal Dave published a blog entry with an example of an XML script that used XQuery to examine the query cache &#8211; the script in itself is a very useful example of using XQuery against the query plans, but doesn&#8217;t quite hit the mark in terms of being an invaluable performance tuning script since [...]]]></description>
			<content:encoded><![CDATA[<p>Some time back, Pinal Dave published a <a href="http://blog.sqlauthority.com/2009/03/17/sql-server-practical-sql-server-xml-part-one-query-plan-cache-and-cost-of-operations-in-the-cache/" target="_blank">blog entry</a> with an example of an XML script that used XQuery to examine the query cache. The script in itself is a very useful example of using XQuery against the query plans, but it doesn&#8217;t quite hit the mark as a performance tuning script, since it provides an information overload and doesn&#8217;t help locate those annoying query problems.</p>
<p>Using it as inspiration however, you might find this useful when tracking down dodgy queries. A number of key items have been added:</p>
<ul>
<li>The database the scan has occurred in.</li>
<li>The schema the scan has occurred in.</li>
<li>The table name the scan has been performed on.</li>
</ul>
<p>These are very useful fields to expose, since they allow the results to be filtered to exclude tables that are of no interest due to their size (small dimension / lookup tables, for example).</p>
<pre>WITH XMLNAMESPACES(DEFAULT N'http://schemas.microsoft.com/sqlserver/2004/07/showplan'),
CachedPlans
(DatabaseName,SchemaName,ObjectName,PhysicalOperator, LogicalOperator, QueryText,QueryPlan, CacheObjectType, ObjectType)
AS
(
SELECT
Coalesce(RelOp.op.value(N'TableScan[1]/Object[1]/@Database', N'varchar(50)') , 
RelOp.op.value(N'OutputList[1]/ColumnReference[1]/@Database', N'varchar(50)') ,
RelOp.op.value(N'IndexScan[1]/Object[1]/@Database', N'varchar(50)') ,
'Unknown'
)
as DatabaseName,
Coalesce(
RelOp.op.value(N'TableScan[1]/Object[1]/@Schema', N'varchar(50)') ,
RelOp.op.value(N'OutputList[1]/ColumnReference[1]/@Schema', N'varchar(50)') ,
RelOp.op.value(N'IndexScan[1]/Object[1]/@Schema', N'varchar(50)') ,
'Unknown'
)
as SchemaName,
Coalesce(
RelOp.op.value(N'TableScan[1]/Object[1]/@Table', N'varchar(50)') ,
RelOp.op.value(N'OutputList[1]/ColumnReference[1]/@Table', N'varchar(50)') ,
RelOp.op.value(N'IndexScan[1]/Object[1]/@Table', N'varchar(50)') ,
'Unknown'
)
as ObjectName,
RelOp.op.value(N'@PhysicalOp', N'varchar(50)') as PhysicalOperator,
RelOp.op.value(N'@LogicalOp', N'varchar(50)') as LogicalOperator,
st.text as QueryText,
qp.query_plan as QueryPlan,
cp.cacheobjtype as CacheObjectType,
cp.objtype as ObjectType
FROM
sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) qp
CROSS APPLY qp.query_plan.nodes(N'//RelOp') RelOp (op)
)
SELECT
DatabaseName,SchemaName,ObjectName,PhysicalOperator
, LogicalOperator, QueryText,CacheObjectType, ObjectType, queryplan
FROM
CachedPlans
WHERE
CacheObjectType = N'Compiled Plan'
and
(
PhysicalOperator = 'Clustered Index Scan' or PhysicalOperator = 'Table Scan' or
PhysicalOperator = 'Index Scan')</pre>
<p>The final alteration is the limitation of the results to only those query plans that include scans, although you could use this to target hash matches or other potentially expensive operations that indicate there is a query plan / indexing opportunity to investigate.</p>
<p>This script makes it far easier to run through the query cache, and it can easily be modified further to join to sys.dm_exec_query_stats via the plan_handle, so you could also pull the execution count for the queries with scans and use it to prioritize performance tuning work.</p>
<br />Filed under: <a href='http://sqlfascination.com/category/sql-server/'>SQL Server</a> Tagged: <a href='http://sqlfascination.com/tag/indexes/'>Indexes</a>, <a href='http://sqlfascination.com/tag/plan-cache/'>Plan Cache</a>, <a href='http://sqlfascination.com/tag/sql-server-2005/'>SQL Server 2005</a>, <a href='http://sqlfascination.com/tag/sql-server-2008/'>SQL Server 2008</a>]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/03/10/locating-table-scans-within-the-query-cache/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>A Strange Case of a Very Large LOP_FORMAT_PAGE</title>
		<link>http://sqlfascination.com/2010/02/28/a-strange-case-of-a-very-large-lop_format_page/</link>
		<comments>http://sqlfascination.com/2010/02/28/a-strange-case-of-a-very-large-lop_format_page/#comments</comments>
		<pubDate>Sun, 28 Feb 2010 15:49:35 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[Transaction Log]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=482</guid>
		<description><![CDATA[I have been involved in a long running case with a client, trying to establish why an ETL process of a couple of gig of data, would result in over 5 gig of transaction log space being used. The database was set in bulk mode, and confirmed as being correctly in bulk mode with appropriate [...]]]></description>
			<content:encoded><![CDATA[<p>I have been involved in a long-running case with a client, trying to establish why an ETL process loading a couple of gigabytes of data would result in over 5 GB of transaction log space being used.</p>
<p>The database was set in bulk logged mode, and confirmed as being correctly in bulk logged mode with appropriate full backups and transaction log backups occurring &#8211; so how could the log be expanding so rapidly for such a simple import? One clue was that this was only occurring on a single system, and that no other system had ever witnessed this behaviour, so it had to be something environmental.</p>
<p>The output from the transaction log captured during a load window was sent to me, and some quick aggregation of the rows showed that there were close to 280k LOP_FORMAT_PAGE entries, but instead of the normal log entry size of 84 bytes, they were coming in at 8276 bytes. That is exactly one 8192-byte page on top of the usual 84 bytes of log record overhead, which meant that for each page being allocated by the inload, an entry was added that was larger than the page just added.</p>
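<p>As a rough check of those numbers, the arithmetic can be sketched in a few lines of Python. The 8192-byte figure is the standard SQL Server page size; the entry count is rounded from the "close to 280k" quoted above, so the totals are approximate.</p>

```python
PAGE_SIZE_BYTES = 8192        # size of a SQL Server data page
NORMAL_ENTRY_BYTES = 84       # usual LOP_FORMAT_PAGE log record size
OBSERVED_ENTRY_BYTES = 8276   # size seen in the captured log
ENTRY_COUNT = 280_000         # "close to 280k", rounded for illustration

# The observed record is exactly one page larger than the normal record.
extra_per_entry = OBSERVED_ENTRY_BYTES - NORMAL_ENTRY_BYTES

expected_log_bytes = ENTRY_COUNT * NORMAL_ENTRY_BYTES
observed_log_bytes = ENTRY_COUNT * OBSERVED_ENTRY_BYTES

print(f"extra bytes per entry: {extra_per_entry}")
print(f"expected: {expected_log_bytes / 1024**2:.1f} MiB")
print(f"observed: {observed_log_bytes / 1024**3:.2f} GiB")
```

<p>On these assumptions the LOP_FORMAT_PAGE entries alone account for a little over 2 GiB of log, rather than a few tens of MiB.</p>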
<p>I contacted Paul Randal to ask why the log entry would get allocated this large, and he identified the cause as the database not operating in bulk logged mode &#8211; which made sense given the log expansion, but there was little proof. Examining an individual LOP_FORMAT_PAGE entry I could see a clear repeating pattern within the hex data &#8211; which would be consistent with a page with rows within it.</p>
<p>One field of what was being imported was a known value for every row (the data related to a specific week number), and translating that value into hex, it was visible multiple times. Knowing the breakdown of a row, and that this value was field number 3, the row could be decoded and the actual values of an imported row reconstructed.</p>
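<p>The search for a known value can be sketched as follows. This is only an illustration: the payload is synthetic, the helper name is made up, and treating the week number as a 4-byte little-endian integer in the third field is an assumption, since the real column types are not given here.</p>

```python
import struct

def find_int_occurrences(payload: bytes, value: int) -> list:
    """Return every byte position where `value` appears as a 4-byte little-endian int."""
    needle = struct.pack("<i", value)
    positions = []
    start = payload.find(needle)
    while start != -1:
        positions.append(start)
        start = payload.find(needle, start + 1)
    return positions

# Synthetic stand-in for a log record payload: three 12-byte "rows",
# each holding (row id, some amount, week number). Not real log data.
WEEK_NUMBER = 9
rows = [struct.pack("<iii", row_id, 1000 + row_id, WEEK_NUMBER)
        for row_id in (101, 102, 103)]
payload = b"".join(rows)

positions = find_int_occurrences(payload, WEEK_NUMBER)
print(positions)
# Evenly spaced hits suggest a repeating row structure of that width.
print([b - a for a, b in zip(positions, positions[1:])])
```

<p>A constant gap between hits gives the row width, which is the starting point for working out where each field sits.</p>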
<p>Using these values the main database was queried and that row did indeed exist &#8211; the data has 27 decimal values within it also, so the chances of a random match are incredibly small. This proved that the log was recording every single row, even though it was in bulk logged mode.</p>
<p>There are a number of things that can prevent bulk logged mode from operating correctly, and on examining the list the problem became clear to the client&#8217;s DBA; the main database backup was still running at the same time as the data inload was being scheduled. The bulk log operations have to be fully logged during the backup to ensure that the database can be returned to a consistent state once it has been restored.</p>
<p>It is the only time I have seen the LOP_FORMAT_PAGE taking up this much space in a transaction log, and if anyone else gets stuck on a similar problem, Google should at least find something this time.</p>
<br />Filed under: <a href='http://sqlfascination.com/category/sql-server/'>SQL Server</a> Tagged: <a href='http://sqlfascination.com/tag/internals/'>Internals</a>, <a href='http://sqlfascination.com/tag/sql-server-2005/'>SQL Server 2005</a>, <a href='http://sqlfascination.com/tag/transaction-log/'>Transaction Log</a>]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/02/28/a-strange-case-of-a-very-large-lop_format_page/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Decoding a Simple Update Statement Within the Transaction Log</title>
		<link>http://sqlfascination.com/2010/02/21/decoding-a-simple-update-statement-within-the-transaction-log/</link>
		<comments>http://sqlfascination.com/2010/02/21/decoding-a-simple-update-statement-within-the-transaction-log/#comments</comments>
		<pubDate>Sun, 21 Feb 2010 14:05:01 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[Transaction Log]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=473</guid>
		<description><![CDATA[After decoding the delete statement within the transaction log, I thought I would tackle another statement – an update statement, although I am attempting the simple version of the update which is only to update a single value within the fixed width fields section. Again to assist in decoding the log entry, I have used [...]]]></description>
			<content:encoded><![CDATA[<p>After <a href="http://sqlfascination.com/2010/02/03/how-do-you-decode-a-simple-entry-in-the-transaction-log-part-1/" target="_blank">decoding the delete</a> statement within the transaction log, I thought I would tackle another statement: an update. I am attempting the simple version of the update, which changes only a single value within the fixed width fields section. Again, to assist in decoding the log entry, I have used the AdventureWorks database and know the values of the data before and after the modification, so it is easier to decode how the log entry is constructed.</p>
<p>As before, the database has been backed up whilst in fully logged mode and the transaction log backed up to make it as small as possible. Once the database was ready I issued a simple statement against the database:</p>
<pre><span style="color:#0000ff;">update </span>HumanResources.Employee
<span style="color:#0000ff;">set</span> MaritalStatus = 'M' <span style="color:#0000ff;">where</span> employeeid = 6</pre>
<p>The employee with the ID of 6 has changed marital status; congratulations, employee ID 6.</p>
<p>The first confusion is that two LOP_MODIFY_ROW entries show up against the LCX_Clustered context, for the HumanResources.Employee.PK_Employee_EmployeeID allocation unit / name, but only 1 modification has been made.</p>
<p>What is the second modification?</p>
<p>Against each LOP_MODIFY_ROW there is an [Offset in Row] column, which indicates where within the row the modification has been made. The first modification is recorded at byte 24, the second is at byte 57. Instead of providing a field number / name, the alteration is being recorded at a byte level, so to figure out which columns have changed, we must map out the structure in terms of byte positions for the row.</p>
<table border="1" cellspacing="0" cellpadding="0" width="598">
<tbody>
<tr>
<td width="58" valign="top">Offset</td>
<td width="160" valign="top">Name</td>
<td width="95" valign="top">Size in Bytes</td>
<td width="284" valign="top">Value</td>
</tr>
<tr>
<td width="58" valign="top">0</td>
<td width="160" valign="top">Status Bits</td>
<td width="95" valign="top">2</td>
<td width="284" valign="top"> </td>
</tr>
<tr>
<td width="58" valign="top">2</td>
<td width="160" valign="top">Column Number Offset</td>
<td width="95" valign="top">2</td>
<td width="284" valign="top"> </td>
</tr>
<tr>
<td width="58" valign="top">4</td>
<td width="160" valign="top">EmployeeID</td>
<td width="95" valign="top">4</td>
<td width="284" valign="top">6</td>
</tr>
<tr>
<td width="58" valign="top">8</td>
<td width="160" valign="top">ContactID</td>
<td width="95" valign="top">4</td>
<td width="284" valign="top">1028</td>
</tr>
<tr>
<td width="58" valign="top">12</td>
<td width="160" valign="top">ManagerID</td>
<td width="95" valign="top">4</td>
<td width="284" valign="top">109</td>
</tr>
<tr>
<td width="58" valign="top">16</td>
<td width="160" valign="top">BirthDate</td>
<td width="95" valign="top">8</td>
<td width="284" valign="top">1965-04-19 00:00:00.000</td>
</tr>
<tr>
<td width="58" valign="top">24</td>
<td width="160" valign="top">MaritalStatus</td>
<td width="95" valign="top">2</td>
<td width="284" valign="top">S</td>
</tr>
<tr>
<td width="58" valign="top">26</td>
<td width="160" valign="top">Gender</td>
<td width="95" valign="top">2</td>
<td width="284" valign="top">M</td>
</tr>
<tr>
<td width="58" valign="top">28</td>
<td width="160" valign="top">HireDate</td>
<td width="95" valign="top">8</td>
<td width="284" valign="top">1998-01-20 00:00:00.000</td>
</tr>
<tr>
<td width="58" valign="top">36</td>
<td width="160" valign="top">SalariedFlag</td>
<td width="95" valign="top">1</td>
<td width="284" valign="top">1</td>
</tr>
<tr>
<td width="58" valign="top">37</td>
<td width="160" valign="top">VacationHours</td>
<td width="95" valign="top">2</td>
<td width="284" valign="top">40</td>
</tr>
<tr>
<td width="58" valign="top">39</td>
<td width="160" valign="top">SickLeaveHours</td>
<td width="95" valign="top">2</td>
<td width="284" valign="top">40</td>
</tr>
<tr>
<td width="58" valign="top">36</td>
<td width="160" valign="top">CurrentFlag</td>
<td width="95" valign="top">0 (bit packed into the SalariedFlag byte)</td>
<td width="284" valign="top">1</td>
</tr>
<tr>
<td width="58" valign="top">41</td>
<td width="160" valign="top">rowguid</td>
<td width="95" valign="top">16</td>
<td width="284" valign="top">E87029AA-2CBA-4C03-B948-D83AF0313E28</td>
</tr>
<tr>
<td width="58" valign="top">57</td>
<td width="160" valign="top">ModifiedDate</td>
<td width="95" valign="top">8</td>
<td width="284" valign="top">2010-02-20 18:13:03.710</td>
</tr>
<tr>
<td width="58" valign="top">65</td>
<td width="160" valign="top">Column Count</td>
<td width="95" valign="top">2</td>
<td width="284" valign="top">16</td>
</tr>
<tr>
<td width="58" valign="top">67</td>
<td width="160" valign="top">Null Bitmap</td>
<td width="95" valign="top">2</td>
<td width="284" valign="top"> </td>
</tr>
<tr>
<td width="58" valign="top">69</td>
<td width="160" valign="top">Variable Column Count</td>
<td width="95" valign="top">2</td>
<td width="284" valign="top">3</td>
</tr>
<tr>
<td width="58" valign="top">71</td>
<td width="160" valign="top">Variable Offsets</td>
<td width="95" valign="top">6</td>
<td width="284" valign="top"> </td>
</tr>
<tr>
<td width="58" valign="top">77</td>
<td width="160" valign="top">NationalIDNumber</td>
<td width="95" valign="top">16</td>
<td width="284" valign="top">24756624</td>
</tr>
<tr>
<td width="58" valign="top">93</td>
<td width="160" valign="top">LoginID</td>
<td width="95" valign="top">44</td>
<td width="284" valign="top">adventure-works\david0</td>
</tr>
<tr>
<td width="58" valign="top">137</td>
<td width="160" valign="top">Title</td>
<td width="95" valign="top">34</td>
<td width="284" valign="top">Marketing Manager</td>
</tr>
</tbody>
</table>
<p>The null bitmap is present whether or not any column is nullable, and with 16 columns it takes 2 bytes after the column count. This mapping is specific to this row, but the offsets for the fixed portion of the row would be valid for all the records; it is only the variable width columns that would change per data row.</p>
<p>The two modifications are at byte 24 and 57, which is clearly the Marital Status and the Modified Date. </p>
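<p>The lookup from a logged byte offset back to a column can be sketched in Python. The sizes are taken from the fixed width part of the table above; the function names are made up for illustration, and the sketch assumes the two bit columns share a single byte, which is what lets rowguid start at offset 41.</p>

```python
# Fixed-width portion of the HumanResources.Employee row, in physical order.
# Sizes are in bytes. Assumption: SalariedFlag and CurrentFlag share one byte.
FIXED_LAYOUT = [
    ("Status Bits", 2),
    ("Column Number Offset", 2),
    ("EmployeeID", 4),
    ("ContactID", 4),
    ("ManagerID", 4),
    ("BirthDate", 8),
    ("MaritalStatus", 2),
    ("Gender", 2),
    ("HireDate", 8),
    ("SalariedFlag / CurrentFlag", 1),
    ("VacationHours", 2),
    ("SickLeaveHours", 2),
    ("rowguid", 16),
    ("ModifiedDate", 8),
]

def build_offsets(layout):
    """Map each field name to its (start offset, size) by accumulating sizes."""
    offsets = {}
    position = 0
    for name, size in layout:
        offsets[name] = (position, size)
        position += size
    return offsets

def column_at(offsets, byte_offset):
    """Return the field whose byte range contains byte_offset, or None."""
    for name, (start, size) in offsets.items():
        if start <= byte_offset < start + size:
            return name
    return None

OFFSETS = build_offsets(FIXED_LAYOUT)
for logged_offset in (24, 57):
    print(logged_offset, column_at(OFFSETS, logged_offset))
```

<p>Running it reports MaritalStatus for offset 24 and ModifiedDate for offset 57, matching the two LOP_MODIFY_ROW entries.</p>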
<p>Why is the modified date being changed? The AdventureWorks employee table has an update trigger on it, which sets the ModifiedDate column to the current date when any change is made to the row.</p>
<p>For the marital status transaction log entry, there are two values we logically need to know: what the value was before the change, and what it is after the change.</p>
<p>The Row Log Contents 0 and Row Log Contents 1 fields clearly provide that information:</p>
<pre>Row Log Contents 0 : 0x53
Row Log Contents 1 : 0x4D
<span style="color:#0000ff;">select</span> <span style="color:#ff00ff;">convert</span>(char(1),0x53) -- returns S
<span style="color:#0000ff;">select</span> <span style="color:#ff00ff;">convert</span>(char(1),0x4D) -- returns M</pre>
<p>So they are clearly the before and after results.</p>
<p>Checking the transaction log entry produced by the trigger, the row log contents are dates in hex format, which are not too difficult to deal with.</p>
<p>Row Log Contents 0 contains the previous modified date in hex, although the SQL Server Management Studio results show the hex shortened somewhat from the normal 8 bytes for a date. The bytes are stored least significant first, so padding the missing zeros back in and reversing the byte order gives us:</p>
<pre>0x 00 00 00 00 35 95 00 00 = 0x 00 00 95 35 00 00 00 00  = 2004-07-31 00:00:00.000</pre>
<p>The same rules apply to Row Log Contents 1, which has also been cut short.</p>
<pre>0x A5 F1 D6 00 24 9D 00 00 =  00 00 9D 24 00 D6 F1 A5 =  2010-02-21 13:02:35.217</pre>
<p>Which is unsurprisingly today’s date and when I made the modification.</p>
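<p>The same decoding can be sketched in Python. A SQL Server datetime is stored as two 4-byte little-endian integers: the number of 1/300 second ticks since midnight, followed by the number of days since 1900-01-01. The function name below is made up for illustration, and it expects the full, zero-padded 8 bytes.</p>

```python
from datetime import datetime, timedelta

def decode_sql_datetime(raw: bytes) -> datetime:
    """Decode the 8-byte on-disk form of a SQL Server datetime value."""
    if len(raw) != 8:
        raise ValueError("expected exactly 8 bytes; pad shortened values first")
    ticks = int.from_bytes(raw[:4], "little")               # 1/300 s since midnight
    days = int.from_bytes(raw[4:], "little", signed=True)   # days since 1900-01-01
    # Integer arithmetic avoids float rounding; the result is truncated to microseconds.
    microseconds = ticks * 1_000_000 // 300
    return datetime(1900, 1, 1) + timedelta(days=days, microseconds=microseconds)

before = decode_sql_datetime(bytes.fromhex("0000000035950000"))
after = decode_sql_datetime(bytes.fromhex("A5F1D600249D0000"))
print(before)
print(after)
```

<p>The fractional seconds come out slightly different from the display in Management Studio, because the tick count is converted exactly here rather than rounded to milliseconds.</p>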
<p>The immediate question is: did the log contain two entries because there were two different causes of change, or does it record one transaction log row per value modified? Since the Offset in Row value is held within the transaction log row, you could guess at 1 entry per change.</p>
<p>To check, I restored the database to the starting position and issued a statement with two modifications:</p>
<pre><span style="color:#0000ff;">update </span>HumanResources.Employee
<span style="color:#0000ff;">set</span> MaritalStatus = <span style="color:#ff0000;">'M'</span>,
VacationHours = 60
<span style="color:#0000ff;">where</span> employeeid = 6</pre>
<p>The transaction log alters significantly. The modified date LOP_MODIFY_ROW entry with an offset of 57 still gets added, but the alteration of the two values does not produce 2 x LOP_MODIFY_ROW but a single LOP_MODIFY_COLUMNS. This appears to have a more complex format that is going to be tougher to crack, so that will have to wait until another day.</p>
<br />Filed under: <a href='http://sqlfascination.com/category/sql-server/'>SQL Server</a> Tagged: <a href='http://sqlfascination.com/tag/internals/'>Internals</a>, <a href='http://sqlfascination.com/tag/sql-server-2005/'>SQL Server 2005</a>, <a href='http://sqlfascination.com/tag/sql-server-2008/'>SQL Server 2008</a>, <a href='http://sqlfascination.com/tag/transaction-log/'>Transaction Log</a>]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/02/21/decoding-a-simple-update-statement-within-the-transaction-log/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Prodata March Event Sign Up Available</title>
		<link>http://sqlfascination.com/2010/02/07/prodata-march-event-sign-up-available/</link>
		<comments>http://sqlfascination.com/2010/02/07/prodata-march-event-sign-up-available/#comments</comments>
		<pubDate>Sun, 07 Feb 2010 15:43:03 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=468</guid>
		<description><![CDATA[Just a quick note to say the sign up for the March SQL Academy event run by Prodata is available here. The February event is coming shortly and I will be taking the flight over for some more Irish hospitality and in-depth discussions on SQL Server Analysis services. You can probably still sign up for [...]]]></description>
			<content:encoded><![CDATA[<p>Just a quick note to say the sign up for the March SQL Academy event run by <a href="http://www.prodata.ie" target="_blank">Prodata</a> is available <a href="http://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032428019&amp;Culture=en-IE" target="_blank">here</a>. The February event is coming shortly and I will be taking the flight over for some more Irish hospitality and in-depth discussions on SQL Server Analysis Services. You can probably still sign up for the February <a href="http://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032428019&amp;Culture=en-IE" target="_blank">event</a> on the 16th; throw me a comment if you plan on going.</p>
<br />Filed under: <a href='http://sqlfascination.com/category/sql-server/'>SQL Server</a>]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/02/07/prodata-march-event-sign-up-available/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>How Do You Decode A Simple Entry in the Transaction Log? (Part 2)</title>
		<link>http://sqlfascination.com/2010/02/05/how-do-you-decode-a-simple-entry-in-the-transaction-log-part-2/</link>
		<comments>http://sqlfascination.com/2010/02/05/how-do-you-decode-a-simple-entry-in-the-transaction-log-part-2/#comments</comments>
		<pubDate>Fri, 05 Feb 2010 08:56:44 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[Transaction Log]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=452</guid>
		<description><![CDATA[In the previous post we pulled out the raw information relating to a transaction log entry for a deletion of a single row.  In this post we will decode the binary section of the log entry for the deletion performed. A reminder of the Row Log Contents for the Clustered Index log entry that we need to [...]]]></description>
			<content:encoded><![CDATA[<p>In the <a href="http://sqlfascination.com/2010/02/03/how-do-you-decode-a-simple-entry-in-the-transaction-log-part-1/" target="_blank">previous post</a> we pulled out the raw information relating to a transaction log entry for a deletion of a single row.  In this post we will decode the binary section of the log entry for the deletion performed.</p>
<p>A reminder of the Row Log Contents for the Clustered Index log entry that we need to decode.</p>
<pre>0x3000410001000000B90400001000000000000000406700004D004D0000000000CB8900004A15001E004AD0E1AA37C27449B4D593524773771800000000359500001000000003005D008500BD003100340034003100370038003000370061006400760065006E0074007500720065002D0077006F0072006B0073005C006700750079003100500072006F00640075006300740069006F006E00200054006500630068006E0069006300690061006E0020002D0020005700430036003000</pre>
<p>So we have 189 bytes of information relating to the deletion; the question is, what is it? The most obvious answer is the record that has been deleted. Lots of fixed log information has already been exposed as named fields, so this generic variable-length value must be the row that we deleted.</p>
<p>How do we decode it, though? Applying the structure of a row is a good place to start. The format mentioned before is very nicely detailed in Kalen Delaney&#8217;s SQL Server internals book, so let&#8217;s use that as a guide.</p>
<ul>
<li>2 Bytes: Status Bits</li>
<li>2 Bytes: Offset to find number of columns</li>
<li>X Bytes: Fixed Length Columns</li>
<li>2 Bytes: Total Number of Columns in the data row</li>
<li>1 Bit per column, rounded up to whole bytes: Nullability Bitmap</li>
<li>2 Bytes: Number of Variable Length Columns within the data row</li>
<li>2 Bytes per variable length column: Row Offset marking the end of each variable length column</li>
<li>X Bytes: Variable Length Columns</li>
</ul>
<p>Ok, let&#8217;s start assigning the values from the log hex dump into the fields.</p>
<p>Status Bits (2 Bytes) : 3000<br />
Offset to Find Number of Columns (2 Bytes) : 4100<br />
Because the value is stored little-endian, the offset should be read as 0041, which is 65 in decimal.  So there must be 61 bytes of fixed-width data, since the first 4 bytes have already been used.</p>
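<p>As a quick cross-check of that byte swapping, here is a minimal Python sketch (Python is my choice for illustration only, and the variable names are made up) that reads the two header fields as little-endian values:</p>
<pre>
```python
# First 4 bytes of the row image, copied from the hex dump above.
header = bytes.fromhex("30004100")

# Both header fields are 2-byte little-endian values.
status_bits = int.from_bytes(header[0:2], "little")
column_count_offset = int.from_bytes(header[2:4], "little")

# The fixed-width data sits between the 4-byte header and that offset.
fixed_width_bytes = column_count_offset - len(header)

print(hex(status_bits), column_count_offset, fixed_width_bytes)
```
</pre>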
<p>Let&#8217;s remind ourselves of the table structure:</p>
<pre>CREATE TABLE [HumanResources].[Employee](
 [EmployeeID] [int] IDENTITY(1,1) NOT NULL,
 [NationalIDNumber] [nvarchar](15) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
 [ContactID] [int] NOT NULL,
 [LoginID] [nvarchar](256) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
 [ManagerID] [int] NULL,
 [Title] [nvarchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
 [BirthDate] [datetime] NOT NULL,
 [MaritalStatus] [nchar](1) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
 [Gender] [nchar](1) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
 [HireDate] [datetime] NOT NULL,
 [SalariedFlag] [dbo].[Flag] NOT NULL CONSTRAINT [DF_Employee_SalariedFlag]  DEFAULT ((1)),
 [VacationHours] [smallint] NOT NULL CONSTRAINT [DF_Employee_VacationHours]  DEFAULT ((0)),
 [SickLeaveHours] [smallint] NOT NULL CONSTRAINT [DF_Employee_SickLeaveHours]  DEFAULT ((0)),
 [CurrentFlag] [dbo].[Flag] NOT NULL CONSTRAINT [DF_Employee_CurrentFlag]  DEFAULT ((1)),
 [rowguid] [uniqueidentifier] ROWGUIDCOL  NOT NULL CONSTRAINT [DF_Employee_rowguid]  DEFAULT (newid()),
 [ModifiedDate] [datetime] NOT NULL CONSTRAINT [DF_Employee_ModifiedDate]  DEFAULT (getdate()),
 CONSTRAINT [PK_Employee_EmployeeID] PRIMARY KEY CLUSTERED</pre>
<p>The fixed columns are as follows:</p>
<pre>[EmployeeID] [int]  = 4
[ContactID] [int] = 4
[ManagerID] [int] = 4
[BirthDate] [datetime] = 8
[MaritalStatus] [nchar](1) = 2
[Gender] [nchar](1) = 2
[HireDate] [datetime] = 8
[SalariedFlag] [dbo].[Flag] = 1
[VacationHours] [smallint] = 2
[SickLeaveHours] [smallint] = 2
[CurrentFlag] [dbo].[Flag] = 0 - second bit.
[rowguid] [uniqueidentifier] = 16
[ModifiedDate] [datetime] = 8</pre>
<p>Adventure Works uses a user-defined type of Flag, which is just declared as a bit, so the two flags share a single byte. That&#8217;s 61 bytes of fixed-width storage, which is the value we expected.</p>
<p>We need to know the column order. We could assume it is the same as the table declaration, but we should double-check the actual column ordering to be certain.</p>
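<p>The arithmetic is easy to verify. A small Python sketch (the dictionary is simply the list above typed in by hand, nothing is read from the database):</p>
<pre>
```python
# Storage size in bytes of each fixed-width column, as listed above.
fixed_column_sizes = {
    "EmployeeID": 4,
    "ContactID": 4,
    "ManagerID": 4,
    "BirthDate": 8,
    "MaritalStatus": 2,
    "Gender": 2,
    "HireDate": 8,
    "SalariedFlag": 1,    # the first bit column allocates the shared byte
    "VacationHours": 2,
    "SickLeaveHours": 2,
    "CurrentFlag": 0,     # second bit, stored in the same byte as SalariedFlag
    "rowguid": 16,
    "ModifiedDate": 8,
}

HEADER_BYTES = 4  # status bits (2) + offset to the column count (2)

fixed_total = sum(fixed_column_sizes.values())
column_count_offset = HEADER_BYTES + fixed_total

print(fixed_total, column_count_offset)
```
</pre>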
<pre><span style="color:#0000ff;">select</span> colorder, syscolumns.name
<span style="color:#0000ff;">from</span> syscolumns
inner join systypes <span style="color:#0000ff;">on</span> syscolumns.xusertype = systypes.xusertype
<span style="color:#0000ff;">where </span>id =<span style="color:#ff00ff;">object_id</span>(<span style="color:#ff0000;">'HumanResources.Employee'</span>) and variable = 0
<span style="color:#0000ff;">order</span> <span style="color:#0000ff;">by</span> colorder

colorder name
-------- -------------
1        EmployeeID
3        ContactID
5        ManagerID
7        BirthDate
8        MaritalStatus
9        Gender
10       HireDate
11       SalariedFlag
12       VacationHours
13       SickLeaveHours
14       CurrentFlag
15       rowguid
16       ModifiedDate</pre>
<p>So the order looks good, let&#8217;s start assigning values:</p>
<pre>[EmployeeID] [int]  = 01000000 = 00000001 = 1
[ContactID] [int] = B9040000 = 000004B9 = 1209
[ManagerID] [int] = 10000000 = 00000010 = 16
[BirthDate] [datetime] = 0000000040670000 = 0000674000000000 = '1972-05-15 00:00:00.000'  - (Use Select convert(datetime, 0x0000674000000000) to check)
[MaritalStatus] [nchar](1) = 4D00 = 004D = 77 = Unicode for M (Use Select nchar(77) to check)
[Gender] [nchar](1) = 4D00 = 004D = 77 = M
[HireDate] [datetime] = 00000000CB890000 = 000089CB00000000 = '1996-07-31 00:00:00.000'
[SalariedFlag] [dbo].[Flag] = 4A = 01001010 in Binary, which will need further work.
[VacationHours] [smallint] = 1500 = 0015 = 21
[SickLeaveHours] [smallint] = 1E00 = 001E = 30
[CurrentFlag] [dbo].[Flag] = Contained within the First Flag Byte
[rowguid] [uniqueidentifier] = 4AD0E1AA37C27449B4D5935247737718
[ModifiedDate] [datetime] = 0000000035950000 = 0000953500000000 = '2004-07-31 00:00:00.000'
Number of Columns : 1000 = 0010 = 16 Columns - which is correct.
Nullability Bitmap : 1 bit per Column, so 16/8 = 2 Bytes : 0000
Number of Variable Columns : 0300 = 0003 = 3 Variable Length Columns
Var Column End 1 : 5D00 = 005D = 93
Var Column End 2 : 8500 = 0085 = 133
Var Column End 3 : BD00 = 00BD = 189</pre>
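<p>The same assignments can be sketched in Python. This is only an illustration: the helper names are made up, the byte offsets come from adding up the column sizes in declaration order, and the handling of the flag byte assumes that bit columns are packed starting from the lowest bit, which I have not confirmed here.</p>
<pre>
```python
from datetime import datetime, timedelta
import uuid

# First 65 bytes of the row image, split per field (hex from the dump above).
row = bytes.fromhex(
    "3000" "4100"                         # status bits, offset to column count
    "01000000" "B9040000" "10000000"      # EmployeeID, ContactID, ManagerID
    "0000000040670000"                    # BirthDate
    "4D00" "4D00"                         # MaritalStatus, Gender
    "00000000CB890000"                    # HireDate
    "4A"                                  # byte shared by the two bit columns
    "1500" "1E00"                         # VacationHours, SickLeaveHours
    "4AD0E1AA37C27449B4D5935247737718"    # rowguid
    "0000000035950000"                    # ModifiedDate
)

def le_int(data, start, size):
    """Read a little-endian integer of `size` bytes starting at `start`."""
    return int.from_bytes(data[start:start + size], "little")

def sql_datetime(data, start):
    """Decode a datetime: 4 bytes of 1/300 s ticks, then 4 bytes of days since 1900-01-01."""
    ticks = le_int(data, start, 4)
    days = le_int(data, start + 4, 4)
    return datetime(1900, 1, 1) + timedelta(days=days, seconds=ticks / 300)

flags = row[36]
employee = {
    "EmployeeID": le_int(row, 4, 4),
    "ContactID": le_int(row, 8, 4),
    "ManagerID": le_int(row, 12, 4),            # 0x00000010 = 16
    "BirthDate": sql_datetime(row, 16),
    "MaritalStatus": row[24:26].decode("utf-16-le"),
    "Gender": row[26:28].decode("utf-16-le"),
    "HireDate": sql_datetime(row, 28),
    "SalariedFlag": flags % 2,                  # assumption: lowest bit
    "VacationHours": le_int(row, 37, 2),
    "SickLeaveHours": le_int(row, 39, 2),
    "CurrentFlag": (flags // 2) % 2,            # assumption: second-lowest bit
    "rowguid": uuid.UUID(bytes_le=row[41:57]),
    "ModifiedDate": sql_datetime(row, 57),
}

for name, value in employee.items():
    print(name, "=", value)
```
</pre>
<p>The rowguid is read with <code>bytes_le</code> because the first three groups of a uniqueidentifier are stored byte-swapped, which is why the raw hex does not read like the GUID you would see in a query result.</p>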
<p>The number of columns was at offset 65 and used 2 bytes, the nullability bitmap used 2 bytes, the count of variable-length columns used 2 bytes, and each of the three variable column end offsets used a further 2 bytes. That is 12 bytes in total, so the variable-length data starts at byte 65 + 12 = 77. The first variable-width column ends at 93, so it is using 16 bytes of space.</p>
<p>I should mention why the nullability bitmap is 2 bytes: it holds one bit per column in the table, not one bit per nullable column, so 16 columns need 16 bits. Only ManagerID is declared as nullable here, and no column in this row is NULL, so every bit is 0.</p>
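<p>To double-check those offsets, here is a hedged Python sketch that walks the 12 bytes following the fixed-width data. The variable names are mine, and the bitmap size is computed as one bit per column rounded up to whole bytes, as in the row layout listed earlier.</p>
<pre>
```python
# The 12 bytes that follow the fixed-width data (hex from the dump above):
# column count, null bitmap, variable column count, three end offsets.
trailer = bytes.fromhex("1000" "0000" "0300" "5D00" "8500" "BD00")
COLUMN_COUNT_OFFSET = 65  # position of the trailer within the row

column_count = int.from_bytes(trailer[0:2], "little")
bitmap_bytes = (column_count + 7) // 8     # one bit per column, rounded up

pos = 2 + bitmap_bytes
var_count = int.from_bytes(trailer[pos:pos + 2], "little")
pos += 2

# Each entry marks where a variable-length column ends, counted from the row start.
end_offsets = [
    int.from_bytes(trailer[pos + 2 * i:pos + 2 * i + 2], "little")
    for i in range(var_count)
]

var_data_start = COLUMN_COUNT_OFFSET + pos + 2 * var_count
starts = [var_data_start] + end_offsets[:-1]
lengths = [end - start for start, end in zip(starts, end_offsets)]

print(column_count, bitmap_bytes, var_count, end_offsets, var_data_start, lengths)
```
</pre>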
<pre>Raw Values : 3100 3400 3400 3100 3700 3800 3000 3700

Reverse Bytes : 0031 0034 0034 0031 0037 0038 0030 0037
Result : 14417807  (You can use Select NChar(0x0031) etc to check each value.)</pre>
<p>Next up is 93 to 133, a gap of 40 bytes.</p>
<pre>Raw Values    : 6100 6400 7600 6500 6E00 7400 7500 7200 6500 
2D00 7700 6F00 7200 6B00 7300 5C00 6700 7500 7900 3100
Reverse Bytes : 0061 0064 0076 0065 006E 0074 0075 0072 0065 
002D 0077 006F 0072 006B 0073 005C 0067 0075 0079 0031
Output : adventure-works\guy1</pre>
<p>The final column should be 189 &#8211; 133 = 56 bytes long, which it is, and decoding it in the same way gives the following.</p>
<pre>Raw Values    : 5000 7200 6F00 6400 7500 6300 7400 6900 6F00 6E00 2000 5400 6500 6300
6800 6E00 6900 6300 6900 6100 6E00 2000 2D00 2000 5700 4300 3600 3000
Reverse Bytes : 0050 0072 006F 0064 0075 0063 0074 0069 006F 006E 0020 0054 0065 0063
 0068 006E 0069 0063 0069 0061 006E 0020 002D 0020 0057 0043 0036 0030
Result : Production Technician - WC60</pre>
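<p>All three variable-length columns can be decoded in one pass. The sketch below is in Python rather than T-SQL and rests on one assumption worth stating: that nvarchar data is UTF-16 little-endian, which is what reversing each byte pair by hand amounts to. The column names follow the table declaration order.</p>
<pre>
```python
# Variable-length section of the row (bytes 77 to 189), raw values as listed above.
var_data = bytes.fromhex(
    "3100 3400 3400 3100 3700 3800 3000 3700"
    "6100 6400 7600 6500 6E00 7400 7500 7200 6500 2D00"
    "7700 6F00 7200 6B00 7300 5C00 6700 7500 7900 3100"
    "5000 7200 6F00 6400 7500 6300 7400 6900 6F00 6E00 2000 5400 6500 6300"
    "6800 6E00 6900 6300 6900 6100 6E00 2000 2D00 2000 5700 4300 3600 3000"
)

VAR_DATA_START = 77              # first byte after the column end offsets
end_offsets = [93, 133, 189]     # taken from the decoded row trailer
names = ["NationalIDNumber", "LoginID", "Title"]

decoded = {}
start = VAR_DATA_START
for name, end in zip(names, end_offsets):
    # Offsets are relative to the row start, so shift them into var_data.
    raw = var_data[start - VAR_DATA_START:end - VAR_DATA_START]
    decoded[name] = raw.decode("utf-16-le")
    start = end

for name, value in decoded.items():
    print(name, "=", value)
```
</pre>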
<p>That is the row contents decoded, and the deleted row&#8217;s original values are now exposed.</p>
<p>As you can see from the complexity of decoding one part of a trivial delete operation within the log, this is not something that you would wish to rely on or use as an audit trail. But hopefully it has provided some insight and added to the minimal documentation on how to approach the decoding process. I will try to decode more of the operations, as insert and update shouldn&#8217;t be too hard to manage, but success is not guaranteed.</p>
<br />Filed under: <a href='http://sqlfascination.com/category/sql-server/'>SQL Server</a> Tagged: <a href='http://sqlfascination.com/tag/internals/'>Internals</a>, <a href='http://sqlfascination.com/tag/sql-server-2005/'>SQL Server 2005</a>, <a href='http://sqlfascination.com/tag/transaction-log/'>Transaction Log</a>]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/02/05/how-do-you-decode-a-simple-entry-in-the-transaction-log-part-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>How Do You Decode A Simple Entry in the Transaction Log? (Part 1)</title>
		<link>http://sqlfascination.com/2010/02/03/how-do-you-decode-a-simple-entry-in-the-transaction-log-part-1/</link>
		<comments>http://sqlfascination.com/2010/02/03/how-do-you-decode-a-simple-entry-in-the-transaction-log-part-1/#comments</comments>
		<pubDate>Wed, 03 Feb 2010 22:35:31 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[Transaction Log]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=438</guid>
		<description><![CDATA[There is always a lot of interest in reading log records within SQL, although when you start getting into it you soon realise that it is an undocumented quagmire and quickly decide that there are better things you can do with your time. It has taken a non-trivial amount of time to decode and has created a [...]]]></description>
			<content:encoded><![CDATA[<p>There is always a lot of interest in reading log records within SQL, although when you start getting into it you soon realise that it is an undocumented quagmire and quickly decide that there are better things you can do with your time.</p>
<p>It has taken a non-trivial amount of time to decode and has produced a long write-up, so it is split into two parts. I should caveat this undertaking with two clear statements first:</p>
<ul>
<li>There is no real reason to go poking around in the log except for &#8216;interest value&#8217;. If you actually need to read log files properly then, as you are about to see, commercial applications will look very cheap in comparison to the time you would spend trying to do this. Even better, make sure you have proper backups so that you never need to try to retrieve rows from the file!</li>
<li>There is very little external documentation to guide you in reading an entry. I am learning it by doing it, using what little documentation there is along with some logical deduction and a good understanding of internals from training. This is designed to act as an example of figuring a log entry out, but also to demonstrate that it is not trivial and not something you will want to do.</li>
</ul>
<p>That said, let&#8217;s start by trying to work out a single log record.</p>
<p>Using the Adventure Works example database, I have set it to the full recovery model and taken a backup to ensure that the transaction log is being used correctly. I&#8217;ve removed the foreign key constraints and triggers on the HumanResources.Employee table, since I want to be able to delete a row and not trip over the example FKs and the trigger that prevents deletion.</p>
<p>To get the log file values out, I am using the following snippet of SQL:</p>
<pre><span style="color:#0000ff;">Select </span>* <span style="color:#0000ff;">from</span> ::fn_dblog(null,null)</pre>
<p>On initial inspection the log has only a few rows, altering the Diff Map, File Header etc. So I issue the command to delete a single row from the table.</p>
<pre><span style="color:#0000ff;">Delete</span> <span style="color:#0000ff;">from </span>HumanResources.Employee <span style="color:#0000ff;">where</span> EmployeeId = 1</pre>
<p>And then select the contents of the log, and I have copied some of the fields from the output:</p>
<table>
<tbody>
<tr>
<th>Current LSN</th>
<th>Operation</th>
<th>Context</th>
<th>Transaction ID</th>
<th>Log Record Length</th>
<th>AllocUnitName</th>
</tr>
<tr>
<td>000001bd:000001d8:0002</td>
<td>LOP_BEGIN_XACT</td>
<td>LCX_NULL</td>
<td>0000:0000da58</td>
<td>96</td>
<td>NULL</td>
</tr>
<tr>
<td>000001bd:000001d8:0003</td>
<td>LOP_DELETE_ROWS</td>
<td>LCX_MARK_AS_GHOST</td>
<td>0000:0000da58</td>
<td>148</td>
<td>HumanResources.Employee.AK_Employee_LoginID</td>
</tr>
<tr>
<td>000001bd:000001d8:0004</td>
<td>LOP_SET_BITS</td>
<td>LCX_DIFF_MAP</td>
<td>0000:00000000</td>
<td>56</td>
<td>Unknown Alloc Unit</td>
</tr>
<tr>
<td>000001bd:000001d8:0005</td>
<td>LOP_DELETE_ROWS</td>
<td>LCX_MARK_AS_GHOST</td>
<td>0000:0000da58</td>
<td>124</td>
<td>HumanResources.Employee.AK_Employee_NationalIDNumber</td>
</tr>
<tr>
<td>000001bd:000001d8:0006</td>
<td>LOP_SET_BITS</td>
<td>LCX_DIFF_MAP</td>
<td>0000:00000000</td>
<td>56</td>
<td>Unknown Alloc Unit</td>
</tr>
<tr>
<td>000001bd:000001d8:0007</td>
<td>LOP_DELETE_ROWS</td>
<td>LCX_MARK_AS_GHOST</td>
<td>0000:0000da58</td>
<td>120</td>
<td>HumanResources.Employee.AK_Employee_rowguid</td>
</tr>
<tr>
<td>000001bd:000001d8:0008</td>
<td>LOP_DELETE_ROWS</td>
<td>LCX_MARK_AS_GHOST</td>
<td>0000:0000da58</td>
<td>108</td>
<td>HumanResources.Employee.IX_Employee_ManagerID</td>
</tr>
<tr>
<td>000001bd:000001d8:0009</td>
<td>LOP_SET_BITS</td>
<td>LCX_DIFF_MAP</td>
<td>0000:00000000</td>
<td>56</td>
<td>Unknown Alloc Unit</td>
</tr>
<tr>
<td>000001bd:000001d8:000a</td>
<td>LOP_DELETE_ROWS</td>
<td>LCX_MARK_AS_GHOST</td>
<td>0000:0000da58</td>
<td>288</td>
<td>HumanResources.Employee.PK_Employee_EmployeeID</td>
</tr>
<tr>
<td>000001bd:000001d8:000b</td>
<td>LOP_SET_BITS</td>
<td>LCX_PFS</td>
<td>0000:00000000</td>
<td>56</td>
<td>HumanResources.Employee.PK_Employee_EmployeeID</td>
</tr>
<tr>
<td>000001bd:000001d8:000c</td>
<td>LOP_COMMIT_XACT</td>
<td>LCX_NULL</td>
<td>0000:0000da58</td>
<td>52</td>
<td>NULL</td>
</tr>
</tbody>
</table>
<p>That&#8217;s a lot of entries for a single delete, which is explained when you check the AllocUnitName: the delete also has to remove the entries within the indexes, and the Adventure Works table I am working against does indeed have 5 indexes, 4 non-clustered and 1 clustered. So that makes sense. The whole operation is surrounded by a LOP_BEGIN_XACT and LOP_COMMIT_XACT, and we know from normal SQL terminology with SET XACT_ABORT ON / OFF that XACT is about transactional scope and whether a whole transaction rolls back if a single statement within it fails.</p>
<p>Let&#8217;s concentrate on the record deletion for the clustered index, with the larger log length of 288. That is a long record for what is in theory marking the row as a ghost, which suggests the row is within the log record. This is also backed up by the differing lengths for the other ghost record marks, which differ in size, just as the index row size does.</p>
<p>What are the interesting fields available to us from the log record that we can pick up on straight away?</p>
<ul>
<li>Transaction ID : All the deletions belong to the same transaction ID, so we know they are related / part of the same transactional operation.</li>
<li>Previous LSN : Each subsequent deletion in the list, shows the previous one as the previous LSN, so we can see the chain of log records.</li>
<li>Page ID : We can tell which page was altered.</li>
<li>Slot ID : Which slot within the page was altered.</li>
<li>SPID : The LOP_BEGIN_XACT row shows the SPID that issued the command</li>
<li>BeginTime : The LOP_BEGIN_XACT row shows the DateTime when the command was issued.</li>
<li>Transaction Name : The LOP_BEGIN_XACT row shows the type of transaction, for this one shows DELETE.</li>
<li>EndTime : The LOP_COMMIT_XACT row shows the end time of the transaction.</li>
</ul>
<p>Interestingly, there is no indication as to who issued the command. There is a TransactionSID, which might turn out to be the account SID, but that will be a tangential investigation.<br />
The final column to have a look at is RowLog Contents; for the clustered index row deletion, RowLog Contents 0 has the larger set of data to work on.</p>
<pre>0x3000410001000000B90400001000000000000000406700004D004D0000000000CB8900004A15001E004AD0E1AA37C27449B4D593524773771800000000359500001000000003005D008500BD003100340034003100370038003000370061006400760065006E0074007500720065002D0077006F0072006B0073005C006700750079003100500072006F00640075006300740069006F006E00200054006500630068006E0069006300690061006E0020002D0020005700430036003000</pre>
<p>If that does not make you go running for the hills, then you have a chance of decoding it.</p>
<p>That&#8217;s only 189 bytes, but this is the log row contents, not the actual log entry itself, so the fixed log fields are not within it. Let&#8217;s work on it. The structure of a row is a good place to start, since we expect the structure of the row to be present within the log. The format is very nicely detailed in Kalen Delaney&#8217;s SQL Server internals book, so let&#8217;s use that.</p>
<ul>
<li>2 Bytes: Status Bits</li>
<li>2 Bytes: Offset to find number of columns</li>
<li>X Bytes: Fixed Length Columns</li>
<li>2 Bytes: Total Number of Columns in the data row</li>
<li>1 Bit per column, rounded up to whole bytes: Nullability Bitmap</li>
<li>2 Bytes: Number of Variable Length Columns within the data row</li>
<li>2 Bytes per variable length column: Row Offset marking the end of each variable length column</li>
<li>X Bytes: Variable Length Columns</li>
</ul>
<p>We can use this in <a href="http://sqlfascination.com/2010/02/05/how-do-you-decode-a-simple-entry-in-the-transaction-log-part-2/" target="_blank">Part 2</a> to decode the binary contents of the log record.</p>
<br />Filed under: <a href='http://sqlfascination.com/category/sql-server/'>SQL Server</a> Tagged: <a href='http://sqlfascination.com/tag/internals/'>Internals</a>, <a href='http://sqlfascination.com/tag/sql-server-2005/'>SQL Server 2005</a>, <a href='http://sqlfascination.com/tag/transaction-log/'>Transaction Log</a>]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/02/03/how-do-you-decode-a-simple-entry-in-the-transaction-log-part-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Why Are Date Tricks in SQL 2005 a Problem in Waiting?</title>
		<link>http://sqlfascination.com/2010/01/26/why-are-date-tricks-in-sql-2005-a-problem-in-waiting/</link>
		<comments>http://sqlfascination.com/2010/01/26/why-are-date-tricks-in-sql-2005-a-problem-in-waiting/#comments</comments>
		<pubDate>Tue, 26 Jan 2010 23:22:34 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Best Practise]]></category>
		<category><![CDATA[DateTime2]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=429</guid>
		<description><![CDATA[One of the long-time annoyances about the Date functions in SQL is that a number of them do not function as a developer would wish; the prime example I&#8217;m going to focus on is the DateDiff function. The way it operates is contrary to what a developer might expect or want from it: the function [...]]]></description>
			<content:encoded><![CDATA[<p>One of the long-time annoyances about the Date functions in SQL is that a number of them do not function as a developer would wish; the prime example I&#8217;m going to focus on is the DateDiff function. The way it operates is contrary to what a developer might expect or want from it: the function counts the number of boundaries crossed for the units selected, not the number of whole units between the dates. As an example:</p>
<pre><span style="color:#0000ff;">declare </span>@date1 <span style="color:#0000ff;">datetime</span>
<span style="color:#0000ff;">declare</span> @date2 <span style="color:#0000ff;">datetime</span>
<span style="color:#0000ff;">set</span> @date1 = <span style="color:#ff0000;">'20100101 23:00:00'</span>
<span style="color:#0000ff;">set</span> @date2 = <span style="color:#ff0000;">'20100102 01:00:00'</span>
<span style="color:#0000ff;">select</span> <span style="color:#ff00ff;">datediff</span>(d, @date1, @date2)</pre>
<p>And the result is 1: since the unit selected was days, the boundary line is predictably at midnight, so even though the time span is only 2 hours, it counts as 1 day, which is not intuitive. This is all documented, so we cannot complain or grumble. If you wanted to know whether a full day had passed, you used hourly units instead and made sure you had the logic to deal with this.</p>
<p>All of this leaves you with pretty poor resolution, however: you can get the difference in hours, but the minutes and seconds are lost, so you have to datediff on those units as well and do some maths. It really makes for a ham-fisted way of getting a duration.</p>
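<p>To make the distinction concrete, here is a small Python sketch (Python purely for illustration; this is not how DateDiff is implemented) that contrasts counting midnight boundaries with counting whole elapsed days for the same two values:</p>
<pre>
```python
from datetime import datetime

date1 = datetime(2010, 1, 1, 23, 0, 0)
date2 = datetime(2010, 1, 2, 1, 0, 0)

# Boundary counting, in the style of datediff(d, ...): drop the time of day
# and count how many midnights lie between the two dates.
day_boundaries = (date2.date() - date1.date()).days

# Whole elapsed days: a day only counts once a full 24 hours has passed.
whole_days = (date2 - date1).days

elapsed_hours = (date2 - date1).total_seconds() / 3600

print(day_boundaries, whole_days, elapsed_hours)
```
</pre>
<p>One boundary is crossed, yet zero whole days have elapsed, which is exactly the mismatch described above.</p>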
<p>So people work around the problem by converting the date to the numeric equivalent and manipulating that directly.</p>
<pre><span style="color:#0000ff;">declare </span>@date1 <span style="color:#0000ff;">datetime</span>
<span style="color:#0000ff;">set</span> @date1 = <span style="color:#ff0000;">'20100101 12:15:30'</span>
<span style="color:#0000ff;">select</span> <span style="color:#ff00ff;">convert</span>(<span style="color:#0000ff;">float</span>,@date1)

40177.5107638889</pre>
<p>The decimal part represents the fraction of the day that has elapsed, which is not really how the underlying binary storage holds it: that uses an integer counting the number of 1/300ths of a second since the day started.</p>
<p>This format was very forgiving though: if you wanted to add a day, instead of using DateAdd you could just add 1 to the number, which was very convenient.</p>
<p>It does, however, make it easier to create a pseudo-timespan by deducting one date&#8217;s numeric representation from another&#8217;s, although the code is somewhat long-winded. As a side note, make sure you convert to float and not real: real does not have sufficient accuracy for this to work.</p>
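<p>As a rough cross-check of where that number comes from, the Python sketch below rebuilds it as whole days since 1 January 1900 plus the fraction of the day elapsed, and also works out the 1/300 second tick count. The function name is made up for the example.</p>
<pre>
```python
from datetime import datetime

SQL_EPOCH = datetime(1900, 1, 1)   # day zero for the datetime type
SECONDS_PER_DAY = 86400

def datetime_to_float(value):
    """Whole days since 1900-01-01 plus the fraction of the current day."""
    delta = value - SQL_EPOCH
    return delta.days + delta.seconds / SECONDS_PER_DAY

sample = datetime(2010, 1, 1, 12, 15, 30)
as_float = datetime_to_float(sample)

# What the storage holds for the time part: 1/300 s ticks since midnight.
ticks_since_midnight = (sample - SQL_EPOCH).seconds * 300

print(round(as_float, 10))         # 40177.5107638889
print(ticks_since_midnight)        # 13239000
```
</pre>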
<pre><span style="color:#0000ff;">declare </span>@date1 <span style="color:#0000ff;">datetime</span>
<span style="color:#0000ff;">declare</span> @date2 <span style="color:#0000ff;">datetime</span>
<span style="color:#0000ff;">set</span> @date1 = <span style="color:#ff0000;">'20100101 12:00:00'</span>
<span style="color:#0000ff;">set</span> @date2 = <span style="color:#ff0000;">'20100102 13:15:35'</span>
<span style="color:#0000ff;">declare </span>@result float
<span style="color:#0000ff;">set</span> @result = convert(float,@date2) - convert(float,@date1)
<span style="color:#0000ff;">declare</span> @DurationDays <span style="color:#0000ff;">float</span>
<span style="color:#0000ff;">declare</span> @DurationTime <span style="color:#0000ff;">float</span>
<span style="color:#0000ff;">declare</span> @DurationHours <span style="color:#0000ff;">float</span>
<span style="color:#0000ff;">declare </span>@DurationMinutes <span style="color:#0000ff;">float</span>
<span style="color:#0000ff;">declare</span> @DurationSeconds <span style="color:#0000ff;">float</span>
<span style="color:#0000ff;">set</span> @DurationDays = <span style="color:#ff00ff;">floor</span>(@result)
<span style="color:#0000ff;">set</span> @DurationTime = (@result - <span style="color:#ff00ff;">floor</span>(@result) )
<span style="color:#0000ff;">set</span> @DurationTime = @DurationTime * 86400
<span style="color:#0000ff;">set</span> @DurationHours = <span style="color:#ff00ff;">floor</span>(@DurationTime / 3600)
<span style="color:#0000ff;">set</span> @DurationTime = @DurationTime - @DurationHours * 3600
<span style="color:#0000ff;">set</span> @DurationMinutes = <span style="color:#ff00ff;">floor</span>(@DurationTime/60)
<span style="color:#0000ff;">set</span> @DurationTime = @DurationTime - @DurationMinutes * 60
<span style="color:#0000ff;">set</span> @DurationSeconds  = @DurationTime
<span style="color:#0000ff;">select </span>@DurationDays <span style="color:#0000ff;">as</span> Days,  @DurationHours <span style="color:#0000ff;">as</span> Hours ,  
@DurationMinutes <span style="color:#0000ff;">as</span> Minutes,  @DurationSeconds <span style="color:#0000ff;">as</span> Seconds

Days              Hours              Minutes           Seconds
----------------- ------------------ ----------------- -----------------
1                 1                  15                35.0000002188608</pre>
<p>Bit of a hack, and was it really any shorter or better? Debatable. Whilst it can get time span information out, when used within SQL 2008 with the new datetime2 type, the wheels fall off:</p>
<pre><span style="color:#0000ff;">declare</span> @date1 <span style="color:#0000ff;">datetime2</span>(7)
<span style="color:#0000ff;">set</span> @date1 =<span style="color:#ff0000;"> '20100101 12:00:00'
</span><span style="color:#0000ff;">select</span> <span style="color:#ff00ff;">convert</span>(float,@date1)
<span style="color:#ff0000;">Msg 529, Level 16, State 2, Line 3 Explicit conversion from data type datetime2 to float is not allowed.</span></pre>
<p>And that is where the problem comes in &#8211; the new datetime2 types will not allow the date to be converted to a number, and a number of these developer tricks no longer work.</p>
<p>Most, if not all, of the tricks can be rewritten to use multiple date functions with some mathematical logic, and if a trick had been embedded within a function or stored procedure, that can be done without the calling code ever knowing. Where you would see a less transparent move to the datetime2 data types is where developers had embedded some of the tricks directly into ad-hoc SQL, which will fail if the type is altered. In an ideal world the code would never contain these techniques, of course, but we do not all live in that nirvana.</p>
<p>So on the one hand datetime2 gives great accuracy and can reduce storage, but on the other hand, the tricks used in the past to deal with the inadequacies of the built-in Date functions no longer work.</p>
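<p>As a rough illustration of the kind of mathematical logic involved, the sketch below is in Python rather than T-SQL and is only an assumption about how such a rewrite might look: start from the difference in whole seconds (the sort of value a seconds-based datediff would give you) and split it with integer division, so no float conversion is needed.</p>
<pre>
```python
from datetime import datetime

date1 = datetime(2010, 1, 1, 12, 0, 0)
date2 = datetime(2010, 1, 2, 13, 15, 35)

# Difference in whole seconds between the two values.
total_seconds = int((date2 - date1).total_seconds())

# Peel off days, hours and minutes with integer division; the remainder
# carries forward each time, so there is no floating point noise.
days, remainder = divmod(total_seconds, 86400)
hours, remainder = divmod(remainder, 3600)
minutes, seconds = divmod(remainder, 60)

print(days, hours, minutes, seconds)
```
</pre>
<p>Working in integers also avoids the stray 35.0000002188608 seconds that the float version produces.</p>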
<p>What I would really like is a revamp of the Date functions and the introduction of  a time span type &#8211; could be a long wait.</p>
<br />Posted in SQL Server Tagged: Best Practise, DateTime2, SQL Server 2005, SQL Server 2008]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/01/26/why-are-date-tricks-in-sql-2005-a-problem-in-waiting/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Case Sensitivity in the Query Plan Cache</title>
		<link>http://sqlfascination.com/2010/01/17/case-sensitivity-in-the-query-plan-cache/</link>
		<comments>http://sqlfascination.com/2010/01/17/case-sensitivity-in-the-query-plan-cache/#comments</comments>
		<pubDate>Sun, 17 Jan 2010 18:10:49 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Plan Cache]]></category>
		<category><![CDATA[Query Parameterisation]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=416</guid>
		<description><![CDATA[A surprising facet of the query plan cache is that it matches ad-hoc queries not only on their text, but the case of the query text must exactly match as well. This is documented quite clearly on MSDN although it is a bit of a surprising behaviour. It also does not change based on whether [...]]]></description>
			<content:encoded><![CDATA[<p>A surprising facet of the query plan cache is that it matches ad-hoc queries on their exact text, so even the case of the query text must match. This is documented quite clearly on <a href="http://technet.microsoft.com/en-us/library/ee343986.aspx" target="_blank">MSDN</a>, although it is still a bit of a surprising behaviour. It also does not change based on whether the collation of the server is case-sensitive or not.</p>
<p>The documentation gives a statement on case-sensitivity in the plan cache, but no mention of whether the behaviour changes under &#8216;Forced&#8217; parameterization, which asks SQL to be more aggressive in extracting query literals and generating a query cache hit &#8211; so I decided to have a look and see how it acted in &#8216;Forced&#8217; vs &#8216;Simple&#8217;.</p>
<p>Whether the database was in &#8216;Simple&#8217; or &#8216;Forced&#8217; the behaviour did not change - but it turns out that it is not case-sensitive on keywords,  just on the object names.</p>
<p>To show the case sensitivity behaviour I have used the AdventureWorks sample database as a testing ground. Prior to each test I cleared the procedure cache using DBCC FreeProcCache.</p>
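<p>The exact commands are not shown in this post, so the following is only a sketch of how the test could be set up and the cache inspected. It assumes the database is named AdventureWorks and uses the sys.dm_exec_query_stats and sys.dm_exec_sql_text dynamic management objects, with column aliases chosen to match the output shown further down. Clearing the procedure cache like this should only ever be done on a test server:</p>
<pre>-- switch the parameterization mode under test (SIMPLE is the default)
alter database AdventureWorks set parameterization forced;

-- empty the plan cache before each test run
dbcc freeproccache;

-- ... run the test queries here, then list what ended up in the cache
select st.text as sql_statement, qs.execution_count
from sys.dm_exec_query_stats qs
cross apply sys.dm_exec_sql_text(qs.sql_handle) st
where st.text like '%humanresources%'
  and st.text not like '%dm_exec_query_stats%';</pre>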
<p>I then issued two very simple queries:</p>
<pre><span style="color:#0000ff;">SELECT </span>* <span style="color:#0000ff;">from</span> humanresources.employee
<span style="color:#0000ff;">Select</span> * <span style="color:#0000ff;">from</span> humanresources.employee</pre>
<p>When the query cache is inspected, there are two entries &#8211; it remains case-sensitive.</p>
<pre>sql_statement                                 execution_count
--------------------------------------------- --------------------
SELECT * from humanresources.employee         1
Select * from humanresources.employee         1</pre>
<p>The query is so simple, and has no parameters, that I suspect the simple / forced parameterization routines do not activate.</p>
<p>If we add a parameter to the query, then the parameterization activates and gets to work on the query string prior to the cache lookup. Both simple and forced are able to cope with such a simple query so both performed the parameterization.</p>
<pre><span style="color:#0000ff;">SELECT</span> * <span style="color:#0000ff;">FROM</span> humanresources.employee <span style="color:#0000ff;">WHERE</span> employeeID = 1
<span style="color:#0000ff;">Select</span> * <span style="color:#0000ff;">From</span> humanresources.employee <span style="color:#0000ff;">Where</span> employeeID = 1</pre>
<p>Inspect the query plan cache when running in forced:</p>
<pre>sql_statement                                             execution_count
--------------------------------------------------------- --------------------
(@0 int)select * from humanresources . employee           2
where employeeID = @0</pre>
<p>Inspect the query plan cache when running in simple:</p>
<pre>sql_statement                                             execution_count
--------------------------------------------------------- --------------------
(@1 tinyint)SELECT * FROM [humanresources].[employee]     2
WHERE [employeeID]=@1</pre>
<p>The results show a plan cache hit, but more importantly show up a rather obvious difference in the parameterization routines for each mode:</p>
<ul>
<li>Simple changes keywords to upper case, Forced changes them to lower case.</li>
<li>Simple places square brackets around the objects, Forced does not.</li>
<li>Simple chooses to replace the literal with a tinyint, Forced uses an int.</li>
<li>Simple starts the parameters at @1, Forced starts at @0.</li>
</ul>
<p>The differences can be filed under bizarre, strange and just inconsistent, although they do both get the job done, which counts at the end of the day.</p>
<p>What is then disappointing is that the same is not true for the tables and fields named in the query. Changing the case of one of the objects prevents the caching again.</p>
<pre><span style="color:#0000ff;">select</span> * <span style="color:#0000ff;">from</span> humanresources.employee <span style="color:#0000ff;">where</span> EmployeeID = 1
<span style="color:#0000ff;">Select</span> * <span style="color:#0000ff;">From</span> humanresources.employee <span style="color:#0000ff;">where</span> employeeID = 1</pre>
<p>Inspect the query cache (this one from forced mode):</p>
<pre>sql_statement                                            execution_count
-------------------------------------------------------- --------------------
(@0 int)select * from humanresources . employee          1
where EmployeeID = @0
(@0 int)select * from humanresources . employee          1
where employeeID = @0</pre>
<p>So we are back to the situation of no cache hit.</p>
<p>It seems very strange that the parameterization only makes the casing of the keywords consistent to give it a better chance of a query plan cache hit &#8211; on a case-insensitive server, doing the same for the object names would be a valid optimization to try to increase the chances of a plan cache hit.</p>
<p>The converse, you would think, is that it would be an inherently risky optimization on a case-sensitive database &#8211; but in fact it is an optimization that would never be needed or made. If anything, a case-sensitive database server has a better chance of making a query plan cache hit, since all the table names and field names have to exactly match the stored object names, and so the queries have a greater chance of matching each other.</p>
<p>It could clearly do more to try to give a match, but I suspect the complications and edge cases, such as a mismatch between database and server case-sensitive collations, account for why it is harder to improve than it might seem.</p>
<br />Posted in SQL Server Tagged: Plan Cache, Query Parameterisation, SQL Server 2005, SQL Server 2008]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/01/17/case-sensitivity-in-the-query-plan-cache/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Dynamic Partitioning : What objects are using a Partition Schema? (SQL Tuesday #002)</title>
		<link>http://sqlfascination.com/2010/01/12/dynamic-partitioning-what-objects-are-using-a-partition-schema-sql-tuesday-002/</link>
		<comments>http://sqlfascination.com/2010/01/12/dynamic-partitioning-what-objects-are-using-a-partition-schema-sql-tuesday-002/#comments</comments>
		<pubDate>Tue, 12 Jan 2010 23:18:08 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Dynamic Partitioning]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[SQL Tuesday]]></category>
		<category><![CDATA[Table Partitioning]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=383</guid>
		<description><![CDATA[As part of the look at dynamic partitioning one of the first problems I have come across is finding what objects are currently placing data within a partition schema, this can be both tables as well as indexes for a table or indexes for a view (which can also be partitioned). This has tied in [...]]]></description>
			<content:encoded><![CDATA[<p>As part of the look at <a href="http://sqlfascination.com/tag/dynamic-partitioning/" target="_blank">dynamic partitioning </a>one of the first problems I have come across is finding what objects are currently placing data within a partition scheme. These can be tables, indexes for a table or indexes for a view (which can also be partitioned).</p>
<p>This has tied in nicely with <a href="http://sqlblog.com/blogs/adam_machanic/archive/2010/01/04/invitation-for-t-sql-tuesday-002-a-puzzling-situation.aspx" target="_blank">Adam Machanic&#8217;s SQL Tuesday</a> in which we were to describe a confusing situation and possible solution.</p>
<p>Certainly it has been a bit confusing to get to the bottom of what should be a relatively trivial question &#8211; how do I programmatically determine what objects are using my partition scheme?</p>
<p>If I am going to auto-balance the partitions then I have to know what objects are using the partition scheme and will be affected by any balancing &#8211; we cannot assume it is a single object, since we know both the table and non-clustered indexes will often be aligned on the same partition scheme, as could other tables / objects.</p>
<p>So the first place I chose to check was the system views for the partition objects, sys.partition_functions and sys.partition_schemes &#8211; with the partition schemes being where you would expect to start.      </p>
<pre><span style="color:#0000ff;">SELECT </span>* <span style="color:#0000ff;">FROM</span> <span style="color:#008000;">sys.partition_schemes</span></pre>
<p> </p>
<table>
<tbody>
<tr>
<th>name</th>
<th>data_space_id</th>
<th>type</th>
<th>type_desc</th>
<th>is_default</th>
<th>function_id</th>
</tr>
<tr>
<td>psBalanceLeft</td>
<td>65601</td>
<td>PS</td>
<td>PARTITION_SCHEME</td>
<td>0</td>
<td>65536</td>
</tr>
</tbody>
</table>
<p>Unfortunately the result from the partition schemes view is spectacularly unhelpful: aside from inheriting a number of columns from the data spaces system view, it only adds function_id &#8211; the ID of the partition function used in the scheme. It at least has the name of the partition scheme, so that is definitely going to have to be used at a later point.</p>
<p>The immediately useful-looking value is the function_id linking the scheme to the partition function, so I had a look inside the partition functions view to remember what it has.</p>
<pre><span style="color:#0000ff;">SELECT </span>* <span style="color:#0000ff;">FROM</span> <span style="color:#008000;">sys.partition_functions</span></pre>
<p> </p>
<table>
<tbody>
<tr>
<th>name</th>
<th>function_id</th>
<th>type</th>
<th>type_desc</th>
<th>fanout</th>
<th>boundary_values_on_right</th>
</tr>
<tr>
<td>pfBalanceLeft</td>
<td>65536</td>
<td>R</td>
<td>RANGE</td>
<td>10</td>
<td>0</td>
</tr>
</tbody>
</table>
<p>The output does not particularly lead anywhere useful &#8211; the function most certainly is not going to tell me which objects are assigned to it, since the tables / indexes get directly assigned to the partition scheme, so this looks like a dead end. The only other option is to go to the data spaces system view:</p>
<pre><span style="color:#0000ff;">SELECT </span>ps.name, ds.*
<span style="color:#0000ff;">FROM</span> <span style="color:#008000;">sys.partition_schemes</span> ps
<span style="color:#0000ff;">INNER JOIN</span> <span style="color:#008000;">sys.data_spaces</span> ds <span style="color:#0000ff;">on </span>ps.data_space_id = ds.data_space_id</pre>
<p> </p>
<table>
<tbody>
<tr>
<th>name</th>
<th>name</th>
<th>data_space_id</th>
<th>type</th>
<th>type_desc</th>
<th>is_default</th>
</tr>
<tr>
<td>psBalanceLeft</td>
<td>psBalanceLeft</td>
<td>65601</td>
<td>PS</td>
<td>PARTITION_SCHEME</td>
<td>0</td>
</tr>
</tbody>
</table>
<p>Not a stellar move &#8211; there are no obvious leads here.</p>
<p>So I can obtain the relation between the partition scheme and the storage, but that is it so far. Given those two dead ends I next considered the problem from the opposite direction &#8211; sys.partitions claims to contain a row for each partition of every table and index in the database &#8211; which should provide another starting point.</p>
<pre><span style="color:#0000ff;">SELECT </span>* <span style="color:#0000ff;">FROM</span> <span style="color:#008000;">sys.partitions</span></pre>
<p>Some of the output was as follows:    </p>
<table>
<tbody>
<tr>
<th>partition_id</th>
<th>object_id</th>
<th>index_id</th>
<th>partition_number</th>
<th>hobt_id</th>
<th>rows</th>
</tr>
<tr>
<td>72057594040549376</td>
<td>53575229</td>
<td>2</td>
<td>1</td>
<td>72057594040549376</td>
<td>10</td>
</tr>
<tr>
<td>72057594040614912</td>
<td>53575229</td>
<td>2</td>
<td>2</td>
<td>72057594040614912</td>
<td>10</td>
</tr>
<tr>
<td>72057594040680448</td>
<td>53575229</td>
<td>2</td>
<td>3</td>
<td>72057594040680448</td>
<td>10</td>
</tr>
<tr>
<td>72057594040745984</td>
<td>53575229</td>
<td>2</td>
<td>4</td>
<td>72057594040745984</td>
<td>10</td>
</tr>
<tr>
<td>72057594040811520</td>
<td>53575229</td>
<td>2</td>
<td>5</td>
<td>72057594040811520</td>
<td>10</td>
</tr>
<tr>
<td>72057594040877056</td>
<td>53575229</td>
<td>2</td>
<td>6</td>
<td>72057594040877056</td>
<td>10</td>
</tr>
<tr>
<td>72057594040942592</td>
<td>53575229</td>
<td>2</td>
<td>7</td>
<td>72057594040942592</td>
<td>10</td>
</tr>
<tr>
<td>72057594041008128</td>
<td>53575229</td>
<td>2</td>
<td>8</td>
<td>72057594041008128</td>
<td>10</td>
</tr>
<tr>
<td>72057594041073664</td>
<td>53575229</td>
<td>2</td>
<td>9</td>
<td>72057594041073664</td>
<td>10</td>
</tr>
<tr>
<td>72057594041139200</td>
<td>53575229</td>
<td>2</td>
<td>10</td>
<td>72057594041139200</td>
<td>136</td>
</tr>
<tr>
<td>72057594041204736</td>
<td>149575571</td>
<td>1</td>
<td>1</td>
<td>72057594041204736</td>
<td>10</td>
</tr>
<tr>
<td>72057594041270272</td>
<td>149575571</td>
<td>1</td>
<td>2</td>
<td>72057594041270272</td>
<td>10</td>
</tr>
<tr>
<td>72057594041335808</td>
<td>149575571</td>
<td>1</td>
<td>3</td>
<td>72057594041335808</td>
<td>10</td>
</tr>
<tr>
<td>72057594041401344</td>
<td>149575571</td>
<td>1</td>
<td>4</td>
<td>72057594041401344</td>
<td>10</td>
</tr>
<tr>
<td>72057594041466880</td>
<td>149575571</td>
<td>1</td>
<td>5</td>
<td>72057594041466880</td>
<td>10</td>
</tr>
<tr>
<td>72057594041532416</td>
<td>149575571</td>
<td>1</td>
<td>6</td>
<td>72057594041532416</td>
<td>10</td>
</tr>
<tr>
<td>72057594041597952</td>
<td>149575571</td>
<td>1</td>
<td>7</td>
<td>72057594041597952</td>
<td>10</td>
</tr>
<tr>
<td>72057594041663488</td>
<td>149575571</td>
<td>1</td>
<td>8</td>
<td>72057594041663488</td>
<td>10</td>
</tr>
<tr>
<td>72057594041729024</td>
<td>149575571</td>
<td>1</td>
<td>9</td>
<td>72057594041729024</td>
<td>10</td>
</tr>
<tr>
<td>72057594041794560</td>
<td>149575571</td>
<td>1</td>
<td>10</td>
<td>72057594041794560</td>
<td>136</td>
</tr>
</tbody>
</table>
<p>This definitely has my partition scheme in there somewhere: I know I have 10 partitions and have set the row quantities up to be 10 rows for the first 9 partitions and 136 rows in the tenth, so it is pretty visible.</p>
<p>I&#8217;ve also got an indexed view on the same table, which explains the duplicate set of values, and a NC index on the table, which explains the triplicate set of values I&#8217;ve not pasted in. This is in essence what I am after though: finding out which object_ids reference the partition scheme.</p>
<p>A couple of immediate dead ends have also appeared:    </p>
<ul>
<li>The partition_id looks useful, but it is the unique ID of the partition record, not an ID relating to the partition scheme.</li>
<li>The hobt_id is the heap or b-tree pointer for that specific partition, so it is not going to be able to help us, since there are 10 hobt_ids per object on the scheme, all different.</li>
</ul>
<p>It does however provide the object_id, which we know we can translate into a name very easily, and a partition_number column, which only ever exceeds 1 on a partitioned object. So a bit of a throw-away style query, selecting just those rows with a partition_number of 2 to make sure we only pick up partitioned objects, gave me the following:</p>
<pre><span style="color:#0000ff;">SELECT </span>o.Name, s.*
<span style="color:#0000ff;">FROM</span> <span style="color:#008000;">sys.partitions </span>s
<span style="color:#0000ff;">INNER</span> <span style="color:#0000ff;">JOIN</span> <span style="color:#008000;">sys.objects</span> o on s.object_id = o.object_id
<span style="color:#0000ff;">WHERE</span> partition_number = 2</pre>
<p> </p>
<table>
<tbody>
<tr>
<th>Name</th>
<th>partition_id</th>
<th>object_id</th>
<th>index_id</th>
<th>partition_number</th>
</tr>
<tr>
<td>tblTestBalance</td>
<td>72057594039304192</td>
<td>53575229</td>
<td>1</td>
<td>2</td>
</tr>
<tr>
<td>foo</td>
<td>72057594039959552</td>
<td>85575343</td>
<td>0</td>
<td>2</td>
</tr>
<tr>
<td>tblTestBalance</td>
<td>72057594040614912</td>
<td>53575229</td>
<td>2</td>
<td>2</td>
</tr>
<tr>
<td>vwTestView</td>
<td>72057594041270272</td>
<td>149575571</td>
<td>1</td>
<td>2</td>
</tr>
</tbody>
</table>
<p>So now I can see the partitioned objects from the other direction, but I have found no relation between the objects identified as partitioned and the partition schemes discovered earlier.</p>
<p>There is also an index_id being shown, documented as &#8216;the index within the object to which this partition belongs&#8217;. This is not an object_id for an index, but the index_id you would normally use within a DBCC command: 0 for a heap, 1 for the clustered index, etc. So there must be a relation to the sys.indexes view &#8211; which when I thought about it made complete sense &#8211; the sys.indexes view is really badly named, since it is not a row per index, but a row per heap or index.</p>
<p>Not the best name in the world for it, but let&#8217;s move on &#8211; given we have both the object ID and the index_id we can join on both / filter in the where clause. </p>
<pre><span style="color:#0000ff;">SELECT </span>O.Name, p.*
<span style="color:#0000ff;">FROM</span> <span style="color:#008000;">sys.partitions</span> p
<span style="color:#0000ff;">INNER JOIN</span><span style="color:#008000;"> sys.objects</span> O on p.object_id = o.object_id
<span style="color:#0000ff;">INNER JOIN</span> <span style="color:#008000;">sys.indexes</span> I on O.object_id = I.object_id and p.index_id = I.index_id
<span style="color:#0000ff;">WHERE</span> partition_number = 2</pre>
<p>The output is the same as before, since I have selected no fields from sys.indexes yet. Checking the column list, an immediate candidate jumped out: data_space_id &#8211; I already had an odd-looking data_space_id earlier, so can the index link to it successfully?</p>
<pre><span style="color:#0000ff;">SELECT </span>o.name, ds.name
<span style="color:#0000ff;">FROM</span><span style="color:#008000;"> sys.partitions </span>p
<span style="color:#0000ff;">INNER JOIN</span> <span style="color:#008000;">sys.objects</span> o on p.object_id = o.object_id
<span style="color:#0000ff;">INNER JOIN</span> <span style="color:#008000;">sys.indexes</span> i on o.object_id = i.object_id and p.index_id = i.index_id
<span style="color:#0000ff;">INNER JOIN</span> <span style="color:#008000;">sys.data_spaces</span> ds on i.data_space_id = ds.data_space_id
<span style="color:#0000ff;">WHERE</span> p.partition_number = 2</pre>
<p>Which gave the following results:   </p>
<table>
<tbody>
<tr>
<th>name</th>
<th>name</th>
</tr>
<tr>
<td>tblTestBalance</td>
<td>psBalanceLeft</td>
</tr>
<tr>
<td>foo</td>
<td>psBalanceLeft</td>
</tr>
<tr>
<td>tblTestBalance</td>
<td>psBalanceLeft</td>
</tr>
<tr>
<td>vwTestView</td>
<td>psBalanceLeft</td>
</tr>
</tbody>
</table>
<p>I then rejoined in the partition schemes using the same join I did earlier:     </p>
<pre><span style="color:#0000ff;">SELECT</span> O.Name, ds.name
<span style="color:#0000ff;">FROM</span> <span style="color:#008000;">sys.partitions</span> p
<span style="color:#0000ff;">INNER</span> <span style="color:#0000ff;">JOIN</span> <span style="color:#008000;">sys.objects </span>O on p.object_id = o.object_id
<span style="color:#0000ff;">INNER JOIN</span> <span style="color:#008000;">sys.indexes</span> I on O.object_id = I.object_id and P.index_id = I.index_id
<span style="color:#0000ff;">INNER JOIN</span> <span style="color:#008000;">sys.data_spaces</span> ds on i.data_space_id = ds.data_space_id
<span style="color:#0000ff;">INNER JOIN</span> <span style="color:#008000;">sys.partition_schemes</span> ps on ds.data_space_id = ps.data_space_id</pre>
<p>Running this gave 40 rows, 4 objects x 10 partitions, so I had filtered down to the partitioned objects but returned too many rows &#8211; I could have used a group by clause, but it seemed simpler to just select a single partition number, and since I know partition number 1 will always exist, that was the simplest to use.</p>
<p>I am now all the way down to the partition scheme, so it is time to select some of the more interesting columns I found along the way that are applicable to the dynamic partitioning problem I am looking at, the main ones being the object name and type.</p>
<p>The final solution I&#8217;ve arrived at to get to the tables / indexes and indexed views using a partition scheme is:     </p>
<pre><span style="color:#0000ff;">SELECT </span>o.name as TableName, i.name as IndexName, i.type as Type_ID, i.type_desc as Type_Desc, ps.name as PartitionSchema
<span style="color:#0000ff;">FROM</span> <span style="color:#008000;">sys.objects </span>o
<span style="color:#0000ff;">INNER JOIN</span> <span style="color:#008000;">sys.partitions</span> p on p.object_id = o.object_id
<span style="color:#0000ff;">INNER JOIN</span> <span style="color:#008000;">sys.indexes</span> i on p.object_id = i.object_id and p.index_id = i.index_id
<span style="color:#0000ff;">INNER JOIN</span> <span style="color:#008000;">sys.data_spaces</span> ds on i.data_space_id = ds.data_space_id
<span style="color:#0000ff;">INNER JOIN</span> <span style="color:#008000;">sys.partition_schemes</span> ps on ds.data_space_id = ps.data_space_id
<span style="color:#0000ff;">WHERE</span> p.partition_number = 1</pre>
<table>
<tbody>
<tr>
<th>TableName</th>
<th>IndexName</th>
<th>Type_ID</th>
<th>Type_Desc</th>
<th>PartitionSchema</th>
</tr>
<tr>
<td>tblTestBalance</td>
<td>PK_tblTestBalance</td>
<td>1</td>
<td>CLUSTERED</td>
<td>psBalanceLeft</td>
</tr>
<tr>
<td>foo</td>
<td>NULL</td>
<td>0</td>
<td>HEAP</td>
<td>psBalanceLeft</td>
</tr>
<tr>
<td>tblTestBalance</td>
<td>testncindex</td>
<td>2</td>
<td>NONCLUSTERED</td>
<td>psBalanceLeft</td>
</tr>
<tr>
<td>vwTestView</td>
<td>IxViewTest</td>
<td>1</td>
<td>CLUSTERED</td>
<td>psBalanceLeft</td>
</tr>
</tbody>
</table>
<p style="text-align:justify;">This can trivially be filtered further to an individual partition scheme based on its name, but the output gives me the information I am after &#8211; a list of which objects, and which types of object, are allocated to the partition schemes.</p>
<p>There should be an easier way to get to this information than joining 5 system views, but that seems to be the only way I could manage to solve the problem.</p>
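<p>As a sketch of the name filter mentioned above, assuming the scheme name psBalanceLeft from the examples, the final query only needs one extra predicate. The second query is an assumption rather than part of the original investigation: since sys.indexes carries the data_space_id directly, it may be possible to join it straight to sys.partition_schemes and skip sys.partitions and sys.data_spaces altogether:</p>
<pre>-- the final query, restricted to a single (assumed) partition scheme name
select o.name as TableName, i.name as IndexName, i.type as Type_ID,
       i.type_desc as Type_Desc, ps.name as PartitionSchema
from sys.objects o
inner join sys.partitions p on p.object_id = o.object_id
inner join sys.indexes i on p.object_id = i.object_id and p.index_id = i.index_id
inner join sys.data_spaces ds on i.data_space_id = ds.data_space_id
inner join sys.partition_schemes ps on ds.data_space_id = ps.data_space_id
where p.partition_number = 1
  and ps.name = N'psBalanceLeft';

-- possible shorter variant: one row per heap / index, no partition filter needed
select o.name as TableName, i.name as IndexName, i.type as Type_ID,
       i.type_desc as Type_Desc, ps.name as PartitionSchema
from sys.indexes i
inner join sys.objects o on o.object_id = i.object_id
inner join sys.partition_schemes ps on ps.data_space_id = i.data_space_id
where ps.name = N'psBalanceLeft';</pre>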
<br />Posted in SQL Server Tagged: Dynamic Partitioning, SQL Server 2005, SQL Server 2008, SQL Tuesday, Table Partitioning]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/01/12/dynamic-partitioning-what-objects-are-using-a-partition-schema-sql-tuesday-002/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Dynamic Partitioning : Wishlist</title>
		<link>http://sqlfascination.com/2010/01/11/dynamic-partitioning-wishlist/</link>
		<comments>http://sqlfascination.com/2010/01/11/dynamic-partitioning-wishlist/#comments</comments>
		<pubDate>Mon, 11 Jan 2010 21:07:27 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Dynamic Partitioning]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[Table Partitioning]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=372</guid>
		<description><![CDATA[Whilst I consider dynamic partitioning something that really doesn&#8217;t serve a valid purpose that I can find yet, I decided to use it as an exercise to program a basic form of it within T-SQL over the coming weeks. Given a blank piece of paper and some realism, what are the aims for the design [...]]]></description>
			<content:encoded><![CDATA[<p>Whilst I consider <a href="http://sqlfascination.com/2010/01/05/is-dynamic-partitioning-in-sql-server-possible/" target="_blank">dynamic partitioning </a>something that really doesn&#8217;t serve a valid purpose that I can find yet, I decided to use it as an exercise to program a basic form of it within T-SQL over the coming weeks.</p>
<p>Given a blank piece of paper and some realism, what are the aims for the design and T-SQL:</p>
<ul>
<li>Batch based rebalancing &#8211; real-time is not realistic so let&#8217;s start with an overnight job.</li>
<li>Choice to Balance by different metrics (Rows vs Physical Storage)</li>
<li>Balance across a setup-defined fixed number of partitions &#8211; so that they do not run out.</li>
<li>Ability to migrate Filegroups into and out of the Partition Scheme &#8211; e.g. schedule them for removal over the coming nights.</li>
<li>Ability to limit the processing to a window &#8211; this is not easy, but a log of earlier migrations would offer guidance on how much processing could be done within an allotted time span.</li>
<li>Ability to specify the balancing as an online operation &#8211; with partitioning being Enterprise Edition only, we can rely on online index rebuilds being available.</li>
</ul>
<p>That&#8217;s not a bad start although I bet it is harder than it sounds.</p>
<p>Let&#8217;s just consider the &#8216;balancing act&#8217; itself regardless of the options. A partition scheme is not a database table &#8211; which automatically complicates matters since multiple tables and indexes can use the same partition scheme. This means that any change to a partition scheme / function will directly affect more than one table / index. Any calculations for the number of rows / size of data will have to take all the tables and indexes into account.</p>
<p>It might seem unusual to place more than one table on a partition scheme, but it really isn&#8217;t. You would commonly place any NC indexes on the same partition scheme to keep them &#8216;aligned&#8217;, so having multiple tables for the same &#8216;alignment&#8217; purpose shouldn&#8217;t seem weird. If you consider the multi-tenancy usage of the partitioned table, then you can see why you could have dozens of tables all on the same partition scheme.</p>
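<p>As a rough sketch of that calculation, the query below totals the rows and used space per partition number across every table and index placed on one partition scheme. It assumes a scheme called YourPartitionScheme (a placeholder name) and uses sys.dm_db_partition_stats; treat it as a starting point rather than the finished balancing logic.</p>
<pre>-- Totals per partition number for everything that sits on the partition scheme
SELECT ps.name AS PartitionSchemeName,
       s.partition_number,
       -- count rows once per table: heap (0) or clustered index (1) only
       SUM(CASE WHEN s.index_id IN (0, 1) THEN s.row_count ELSE 0 END) AS TableRows,
       -- pages are 8KB each, and every index contributes to the storage figure
       SUM(s.used_page_count) * 8 AS UsedKB
FROM sys.partition_schemes AS ps
INNER JOIN sys.indexes AS i ON i.data_space_id = ps.data_space_id
INNER JOIN sys.dm_db_partition_stats AS s ON s.object_id = i.object_id AND s.index_id = i.index_id
WHERE ps.name = 'YourPartitionScheme'
GROUP BY ps.name, s.partition_number
ORDER BY s.partition_number</pre>
<p>Rows are only counted for the heap or clustered index so that non-clustered indexes do not inflate the row total, whereas the storage figure deliberately includes every index on the scheme.</p>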
<p>These requirements are the starting point for the T-SQL and as I come across issues I will write them up.</p>
<br />Posted in SQL Server Tagged: Dynamic Partitioning, SQL Server 2005, SQL Server 2008, Table Partitioning]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/01/11/dynamic-partitioning-wishlist/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Is Dynamic Partitioning in SQL Server Possible?</title>
		<link>http://sqlfascination.com/2010/01/05/is-dynamic-partitioning-in-sql-server-possible/</link>
		<comments>http://sqlfascination.com/2010/01/05/is-dynamic-partitioning-in-sql-server-possible/#comments</comments>
		<pubDate>Tue, 05 Jan 2010 21:25:37 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Dynamic Partitioning]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[Table Partitioning]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=363</guid>
		<description><![CDATA[I often see people asking whether dynamic table partitioning exists in SQL Server, or they provide a scenario that would effectively be asking the same question. So let&#8217;s get the easy answer out now &#8211; straight out of the box SQL Server has no dynamic partitioning. To be fair, straight out of the box there [...]]]></description>
			<content:encoded><![CDATA[<p>I often see people asking whether dynamic table partitioning exists in SQL Server, or they provide a scenario that would effectively be asking the same question. So let&#8217;s get the easy answer out now &#8211; straight out of the box SQL Server has no dynamic partitioning.</p>
<p>To be fair, straight out of the box there is no tooling surrounding partitioning either except for a handful of DMVs &#8211; if you want to <a href="http://sqlfascination.com/2009/12/09/rolling-a-partition-forward-part-1/" target="_blank">automate a rolling window</a>, then you need to program that yourself. SQL Server 2008 added a few bits; it struck me that if you need to use a wizard to turn an existing table into a partitioned table then you&#8217;re not really planning ahead.</p>
<p>So if it is possible to automate a rolling window system, surely it is possible to automate some kind of dynamic partitioning?</p>
<p>Well, that depends on what the definition of &#8216;dynamic partitioning&#8217; is when it comes to SQL, which would normally be defined by the person who needs the feature to solve their specific issue. Before I start writing up a wish list of options and features to guide me in hacking some SQL together to solve the problem, you have to ask: do you really need dynamic partitioning?</p>
<p>Table Partitioning by its nature suits larger volumes of data in a rolling window, where we migrate older data out and bring in new values. However, partitioning has been used for a variety of purposes that it possibly was not considered for originally such as:</p>
<ul>
<li>Performance gain through Partition Elimination</li>
<li>Multi-Tenancy databases, placing each client in a separate partition</li>
</ul>
<p>Bizarrely each of those reasons has a counter argument:</p>
<ul>
<li>Partition elimination only benefits queries that include the partition key in the where clause; otherwise it is detrimental to the query, since every partition has to be examined.</li>
<li>Aside from the limit of 1000 partitions, and therefore 1000 customers, security is easier to compromise, upgrades per customer are not possible and the whole backup / restore strategy for individual customers gets very complex, since you do not wish to restore the whole table but a single partition.</li>
</ul>
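<p>To illustrate the first counter argument, here is a minimal sketch that assumes a hypothetical dbo.Orders table partitioned on an OrderDate column, with a CustomerId column that is not part of the partition key. None of these names come from a real system.</p>
<pre>-- The predicate is on the partition key, so partitions outside December 2009 can be eliminated
SELECT COUNT(*) FROM dbo.Orders
WHERE OrderDate &gt;= '20091201' AND OrderDate &lt; '20100101'

-- No predicate on the partition key, so every partition has to be examined
SELECT COUNT(*) FROM dbo.Orders
WHERE CustomerId = 42</pre>
<p>Comparing the actual execution plans of the two statements should show the difference in the partitions being accessed.</p>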
<p>Back to the question, do we really need dynamic partitioning?</p>
<p>The complexity and scale of most partitioned tables indicates that they should not occur by &#8216;accident&#8217;, and retro-fitting a partitioned table indicates a lack of data modelling / capacity planning. The &#8216;alternative&#8217; reasons for partitioning are amongst the drivers for the dynamic partitioning request.</p>
<p>To make best use of the partitioned table feature requires planning and design, in which case it does not need to be &#8216;dynamic&#8217;.</p>
<p>That all being said, in the coming posts I am going to write-up my wish list of features to start building a basic dynamic partitioning system and then make it more complex over time &#8211; it makes for a fun exercise.</p>
<p>If you have any thoughts on features you would want to see in it, just add them in a comment.</p>
<br />Posted in SQL Server Tagged: Dynamic Partitioning, SQL Server 2005, SQL Server 2008, Table Partitioning]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2010/01/05/is-dynamic-partitioning-in-sql-server-possible/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>How Can You Tell if a Database is in Pseudo Full/Bulk Logged Mode?</title>
		<link>http://sqlfascination.com/2009/12/13/how-can-you-tell-if-a-database-is-in-pseudo-fullbulk-logged-mode/</link>
		<comments>http://sqlfascination.com/2009/12/13/how-can-you-tell-if-a-database-is-in-pseudo-fullbulk-logged-mode/#comments</comments>
		<pubDate>Sun, 13 Dec 2009 17:39:21 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Backups]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=352</guid>
		<description><![CDATA[I was asked on Friday, &#8220;how do you tell if a database logging mode is reporting bulk or full, but it is still in simple?&#8221; &#8211; as mentioned before, a database is not really in full / bulk logged mode unless a full backup has been taken. Until that time the database is still running in [...]]]></description>
			<content:encoded><![CDATA[<p>I was asked on Friday, &#8220;how do you tell if a database logging mode is reporting bulk or full, but it is still in simple?&#8221; &#8211; as <a href="http://sqlfascination.com/2009/11/06/when-is-bulk-logged-mode-not-what-it-says/" target="_blank">mentioned before</a>, a database is not really in full / bulk logged mode unless a full backup has been taken. Until that time the database is still running in simple mode, sometimes referred to as pseudo-simple. It is not easy to spot, because the properties of the database will report full / bulk as appropriate and give no indication that it is not actually logging in the way it says.</p>
<p>The existence of a backup of the database is not a reliable enough mechanism for this, since the database can be backed up and then moved out of full / bulk logged mode into simple and back again. This breaks the backup and transaction log chain, but the database is still reporting full &#8211; to make it worse there is a backup record showing on the history, giving it an air of legitimacy.</p>
<p>The backup records can be accessed from sys.sysdatabases and msdb.dbo.backupset; <a href="http://code.msdn.microsoft.com/SQLExamples/Wiki/View.aspx?title=LastBackUpDate" target="_blank">MSDN</a> even has an example script showing how to see when a database was last backed up and by whom.</p>
<pre><span style="color:#0000ff;">SELECT </span>
T1.Name <span style="color:#0000ff;">as</span> DatabaseName, <span style="color:#ff00ff;">COALESCE</span>(<span style="color:#ff00ff;">Convert</span>(<span style="color:#0000ff;">varchar</span>(12), <span style="color:#ff00ff;">MAX</span>(T2.backup_finish_date), 101),<span style="color:#ff0000;">'Not Yet Taken'</span>) <span style="color:#0000ff;">as</span> LastBackUpTaken, <span style="color:#ff00ff;">COALESCE</span>(<span style="color:#ff00ff;">Convert</span>(<span style="color:#0000ff;">varchar</span>(12), <span style="color:#ff00ff;">MAX</span>(T2.user_name), 101),'NA') as UserName
<span style="color:#0000ff;">FROM </span><span style="color:#008000;">sys.sysdatabases </span>T1 LEFT OUTER JOIN msdb.dbo.backupset T2 <span style="color:#0000ff;">ON</span> T2.database_name = T1.name
<span style="color:#0000ff;">GROUP BY</span> T1.Name
<span style="color:#0000ff;">ORDER BY</span> T1.Name</pre>
<p>To play around with the scripts you probably want a test database:</p>
<pre><span style="color:#0000ff;">CREATE DATABASE </span>[LogModeTest]<span style="color:#0000ff;"> ON  PRIMARY</span>
( <span style="color:#0000ff;">NAME</span> = N<span style="color:#ff0000;">'LogModeTest'</span>, <span style="color:#0000ff;">FILENAME</span> = N<span style="color:#ff0000;">'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\LogModeTest.mdf' </span>, <span style="color:#0000ff;">SIZE</span> = 3072KB , <span style="color:#0000ff;">MAXSIZE</span> = UNLIMITED, <span style="color:#0000ff;">FILEGROWTH</span> = 1024KB )
 LOG ON
( <span style="color:#0000ff;">NAME</span> = N<span style="color:#ff0000;">'LogModeTest_log'</span>, <span style="color:#0000ff;">FILENAME</span> = N<span style="color:#ff0000;">'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\LogModeTest_log.ldf'</span> , <span style="color:#0000ff;">SIZE</span> = 1024KB , <span style="color:#0000ff;">MAXSIZE</span> = 2048GB , <span style="color:#0000ff;">FILEGROWTH</span> = 10%)
 <span style="color:#0000ff;">COLLATE</span> Latin1_General_CI_AI</pre>
<p>With a minor alteration to the MSDN script you can get the backup history for this database:</p>
<pre><span style="color:#0000ff;">SELECT</span>
T1.Name <span style="color:#0000ff;">as </span>DatabaseName, <span style="color:#ff00ff;">COALESCE</span>(<span style="color:#ff00ff;">Convert</span>(<span style="color:#0000ff;">varchar</span>(12), <span style="color:#ff00ff;">MAX</span>(T2.backup_finish_date), 101),<span style="color:#ff0000;">'Not Yet Taken'</span>) <span style="color:#0000ff;">as</span> LastBackUpTaken <span style="color:#0000ff;">FROM </span><span style="color:#008000;">sys.sysdatabases </span>T1
LEFT OUTER JOIN msdb.dbo.backupset T2 ON T2.database_name = T1.name
<span style="color:#0000ff;">WHERE</span> T1.Name = 'LogModeTest'
<span style="color:#0000ff;">GROUP BY</span> T1.Name</pre>
<p>The results show the database is not yet backed up:</p>
<pre>DatabaseName                  LastBackUpTaken 
----------------------------- ---------------
LogModeTest                   Not Yet Taken</pre>
<p>That is easy to fix, so let&#8217;s take a backup of the database and recheck the last backup value.</p>
<pre><span style="color:#0000ff;">BACKUP DATABASE</span> [LogModeTest]<span style="color:#0000ff;"> TO  DISK</span> = N<span style="color:#ff0000;">'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Backup\LogModeTest.bak'</span> <span style="color:#0000ff;">WITH</span> NOFORMAT, NOINIT,  <span style="color:#0000ff;">NAME</span> = N<span style="color:#ff0000;">'LogModeTest-Full Database Backup'</span>, SKIP, NOREWIND, NOUNLOAD,  STATS = 10

DatabaseName                   LastBackUpTaken 
------------------------------ ---------------
LogModeTest                    12/13/2009</pre>
<p>As expected the date of the backup is now set. If we alter the logging mode of the database to simple we will break the transaction log chain. To demonstrate the backup information being an unreliable source, let&#8217;s change to simple, create a table and then return to the fully logged mode.</p>
<pre><span style="color:#0000ff;">ALTER DATABASE</span> [LogModeTest] <span style="color:#0000ff;">SET </span>RECOVERY SIMPLE <span style="color:#0000ff;">WITH</span> NO_WAIT
<span style="color:#0000ff;">CREATE TABLE</span> foo(id <span style="color:#0000ff;">int</span> identity)
<span style="color:#0000ff;">ALTER DATABASE</span> [LogModeTest] <span style="color:#0000ff;">SET</span> RECOVERY FULL <span style="color:#0000ff;">WITH</span> NO_WAIT</pre>
<p>If we now attempt to backup the transaction log, SQL is going to throw an error.</p>
<pre><span style="color:#0000ff;">BACKUP LOG </span>[LogModeTest] <span style="color:#0000ff;">TO  DISK</span> = N<span style="color:#ff0000;">'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Backup\LogModeTest.bak'</span> <span style="color:#0000ff;">WITH</span> NOFORMAT, NOINIT,  <span style="color:#0000ff;">NAME</span> = N<span style="color:#ff0000;">'LogModeTest-Transaction Log  Backup'</span>, SKIP, NOREWIND, NOUNLOAD,  STATS = 10 

<span style="color:#ff0000;">Msg 4214, Level 16, State 1, Line 1
BACKUP LOG cannot be performed because there is no current database backup.
Msg 3013, Level 16, State 1, Line 1
BACKUP LOG is terminating abnormally.</span></pre>
<p>And if we check the database backup history using the MSDN script:</p>
<pre>DatabaseName                   LastBackUpTaken
------------------------------ ---------------
LogModeTest                    12/13/2009</pre>
<p>So the backup history continues to show a date of the last full backup even though the transaction log chain is now broken. SQL certainly knows the database has not had a full backup since swapping into fully logged mode, so any transaction log backup is invalid, thus the error.</p>
<p>There is an easier way to find out that you are in pseudo-simple mode, without trying to perform a transaction log backup:</p>
<pre><span style="color:#0000ff;">SELECT </span>name, <span style="color:#ff00ff;">COALESCE</span>(<span style="color:#ff00ff;">Convert</span>(<span style="color:#0000ff;">varchar</span>(30),last_log_backup_lsn), 'No Full Backup Taken') as BackupLSN 
<span style="color:#0000ff;">FROM</span> <span style="color:#008000;">sys.databases</span>
<span style="color:#0000ff;">INNER JOIN</span><span style="color:#008000;"> sys.database_recovery_status</span> on <span style="color:#008000;">sys.databases</span>.database_id = <span style="color:#008000;">sys.database_recovery_status</span>.database_id</pre>
<p>Run this against your server and it lists the databases that have had a backup taken (by the existence of a backup LSN) and those which have not had a full backup that could be used in recovery. If we then back up the database and recheck the values, the test database now records an LSN, showing it is out of pseudo-simple and into the full / bulk logged modes.</p>
<p>So that indicates whether we are in pseudo-simple or not, but does not link back to the properties of the database to check what the actual database logging mode is &#8211; you are primarily only interested in databases that are not in simple mode in the first place, but are running in pseudo-simple due to the lack of a relevant full database backup. We can alter the query to handle this specific situation and the result is:</p>
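<p>As a sketch of that recheck against the LogModeTest database, the statements below reuse the backup path from the earlier example and filter the query down to the one database. The path is whatever suits your own server.</p>
<pre>-- Take a fresh full backup now that the database is back in the full recovery model
BACKUP DATABASE [LogModeTest] TO  DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Backup\LogModeTest.bak' WITH NOFORMAT, NOINIT,  NAME = N'LogModeTest-Full Database Backup', SKIP, NOREWIND, NOUNLOAD,  STATS = 10

-- Recheck: a non-null LSN shows the database has left pseudo-simple
SELECT name, COALESCE(Convert(varchar(30),last_log_backup_lsn), 'No Full Backup Taken') as BackupLSN
FROM sys.databases
INNER JOIN sys.database_recovery_status on sys.databases.database_id = sys.database_recovery_status.database_id
WHERE sys.databases.name = 'LogModeTest'</pre>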
<pre><span style="color:#0000ff;">SELECT</span> name, recovery_model_desc, <span style="color:#ff00ff;">COALESCE</span>(<span style="color:#ff00ff;">Convert</span>(<span style="color:#0000ff;">varchar</span>(30),last_log_backup_lsn), 'No Full Backup Taken') <span style="color:#0000ff;">as</span> BackupLSN 
<span style="color:#0000ff;">FROM</span> <span style="color:#008000;">sys.databases</span>
<span style="color:#0000ff;">INNER JOIN</span> <span style="color:#008000;">sys.database_recovery_status</span> on<span style="color:#008000;"> sys.databases</span>.database_id = <span style="color:#008000;">sys.database_recovery_status</span>.database_id
<span style="color:#0000ff;">WHERE</span> <span style="color:#008000;">sys.databases</span>.recovery_model &lt;&gt; 3 <span style="color:#0000ff;">AND</span> last_log_backup_lsn is null</pre>
<p>If you run that query against your database server and get any results, then you have databases that are not running in the recovery model they are indicating / that you think they are &#8211; which would generally not be a good thing.</p>
<br />Posted in SQL Server Tagged: Backups, Internals, SQL Server 2005, SQL Server 2008]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/12/13/how-can-you-tell-if-a-database-is-in-pseudo-fullbulk-logged-mode/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Rolling a Partition Forward &#8211; Part 2</title>
		<link>http://sqlfascination.com/2009/12/10/rolling-a-partition-forward-part-2/</link>
		<comments>http://sqlfascination.com/2009/12/10/rolling-a-partition-forward-part-2/#comments</comments>
		<pubDate>Thu, 10 Dec 2009 22:52:37 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[Table Partitioning]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=290</guid>
		<description><![CDATA[The first part of this topic provided a mini-guide to loading data into a partitioned table and a few helpful DMV based statements that can help you automate the process. The unloading of the data should in theory be easier, but to do this in an automated fashion you are more reliant on the DMVs [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://sqlfascination.com/2009/12/09/rolling-a-partition-forward-part-1/" target="_blank">first part</a> of this topic provided a mini-guide to loading data into a partitioned table and a few helpful DMV based statements that can help you automate the process. The unloading of the data should in theory be easier, but to do this in an automated fashion you are more reliant on the DMVs and system views to get to the right information.</p>
<p>The steps to unload a partition of data are:</p>
<ul>
<li><em>Discover which file group the oldest partition is on.</em></li>
<li>Create a staging table on the same filegroup with an identical schema and indexes</li>
<li><em>Switch the data out to the staging table</em></li>
<li><em>Merge the partition function</em></li>
<li>Archive / Drop the data as appropriate.</li>
</ul>
<p>As in part 1, three steps of the process (shown in italics) are not so common, whilst the creation of a table and the archive / drop of the old data at the end are standard T-SQL that you will be using regularly.</p>
<p><strong>Discover which Filegroup the Oldest Partition is On</strong></p>
<p>When checking for the oldest filegroup, I have assumed that the basis of the rolling window is that the highest boundary is the most recent data, whilst the lowest boundary is the oldest &#8211; in essence time is moving forward and the partition key ascends, not descends. The oldest boundary will therefore be boundary 1, so how do you get the name of the filegroup this partition is on? With a somewhat complex use of a set of DMVs.</p>
<pre><span style="color:#0000ff;">SELECT<span style="color:#008000;"> </span></span><span style="color:#008000;">sys.filegroups.Name </span><span style="color:#0000ff;">as</span> FileGroupName <span style="color:#0000ff;">FROM</span> <span style="color:#008000;">sys.partition_schemes </span>
<span style="color:#0000ff;">INNER JOIN</span> <span style="color:#008000;">sys.destination_data_spaces</span> <span style="color:#0000ff;">ON</span> <span style="color:#008000;">sys.destination_data_spaces</span>.partition_scheme_id =<span style="color:#008000;"> sys.partition_schemes</span>.data_space_id
<span style="color:#0000ff;">INNER JOIN</span> <span style="color:#008000;">sys.filegroups</span> <span style="color:#0000ff;">ON  </span><span style="color:#008000;">sys.filegroups</span>.data_space_id = <span style="color:#008000;">sys.destination_data_spaces</span>.data_space_ID
<span style="color:#0000ff;">INNER JOIN</span> <span style="color:#008000;">sys.partition_range_values</span> <span style="color:#0000ff;">ON</span>  <span style="color:#008000;">sys.partition_range_values</span>.Boundary_ID = <span style="color:#008000;">sys.destination_data_spaces</span>.destination_id
AND <span style="color:#008000;">sys.partition_range_values</span>.function_id =<span style="color:#008000;"> sys.partition_schemes</span>.function_id
<span style="color:#0000ff;">WHERE</span><span style="color:#008000;"> sys.partition_schemes</span>.name = '<span style="color:#ff0000;">YourPartitionScheme</span>'
and<span style="color:#008000;"> sys.partition_range_values</span>.boundary_id = 1</pre>
<p>This will return the name of the filegroup, which allows you to create the staging table for the partition switch out on the correct filegroup.</p>
<p>Whilst the data space IDs do alter in sequence depending on the partition function being a left or right based partition, the boundary ID remains consistent, which is why it is used to discover the oldest and not the destination_id / data_space_id.</p>
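<p>Once the filegroup name is known, the staging table can be created on it. The sketch below is only illustrative: the columns, the primary key and the filegroup name YourOldestFilegroup are hypothetical and must be replaced so that the staging table matches YourPartitionedTable exactly.</p>
<pre>DECLARE @FileGroupName sysname, @Sql nvarchar(1000)

-- In practice this value comes from the filegroup query above
SET @FileGroupName = N'YourOldestFilegroup'

-- The filegroup cannot be a variable in CREATE TABLE, so build the statement dynamically
SET @Sql = N'CREATE TABLE YourStagingTable
(
    id int NOT NULL,
    created_date datetime NOT NULL,
    CONSTRAINT PK_YourStagingTable PRIMARY KEY CLUSTERED (created_date, id)
) ON ' + QUOTENAME(@FileGroupName)

EXEC sp_executesql @Sql</pre>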
<p><strong>Switch the Data Out to the Staging Table</strong></p>
<p>Switching the data out is not complex; in essence it is the reverse of the syntax used to switch a partition in. Under the hood you are redirecting IAM pointers, so the switch is considered a meta-data command and is exceptionally fast.</p>
<pre><span style="color:#0000ff;">ALTER TABLE</span> YourPartitionedTable <span style="color:#0000ff;">SWITCH PARTITION</span> 1 <span style="color:#0000ff;">TO</span> YourStagingTable</pre>
<p>The partition number used is in effect the boundary ID, and the oldest boundary is partition 1 of the rolling window.</p>
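<p>One way to check the switch, sketched here against the placeholder table name YourPartitionedTable, is to compare the row counts per partition before and after. Partition 1 should report zero rows once the data has been switched out.</p>
<pre>-- Row count per partition, taken from the heap (0) or clustered index (1)
SELECT partition_number, rows
FROM sys.partitions
WHERE object_id = OBJECT_ID('YourPartitionedTable')
AND index_id IN (0, 1)
ORDER BY partition_number</pre>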
<p><strong>Merge the Partition Function</strong></p>
<p>The last complex stage is the merging of the partition function; the command explicitly needs the boundary value from the partition function that represents the partition. If you were doing this by hand you would know it, but automating the process requires discovering this information from the DMVs again.</p>
<pre><span style="color:#0000ff;">SELECT</span> value
<span style="color:#0000ff;">FROM</span> <span style="color:#008000;">sys.partition_range_values</span>
<span style="color:#0000ff;">INNER JOIN</span> <span style="color:#008000;">sys.partition_functions</span> <span style="color:#0000ff;">ON</span> <span style="color:#008000;">sys.partition_functions</span>.function_id = <span style="color:#008000;">sys.partition_range_values</span>.function_id  
<span style="color:#0000ff;"> WHERE </span>name = '<span style="color:#ff0000;">YourPartitionFunctionName</span>' <span style="color:#0000ff;">AND </span>boundary_id = 1</pre>
<p>Again, we are using the boundary ID of 1 to extract only the oldest partition function value; this value can then be used in a partition function merge command.</p>
<pre><span style="color:#0000ff;">ALTER PARTITION FUNCTION </span><span style="color:#ff00ff;">YourPartitionFunctionName</span>()<span style="color:#0000ff;"> MERGE RANGE</span> (YourBoundaryValue)</pre>
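<p>To automate the merge, the two statements can be combined with dynamic SQL. This is a minimal sketch that assumes the partition key is a datetime; other data types would need a different conversion when building the command.</p>
<pre>DECLARE @BoundaryValue datetime, @Sql nvarchar(500)

-- Fetch the oldest boundary value (stored as a sql_variant)
SELECT @BoundaryValue = CONVERT(datetime, prv.value)
FROM sys.partition_range_values AS prv
INNER JOIN sys.partition_functions AS pf ON pf.function_id = prv.function_id
WHERE pf.name = 'YourPartitionFunctionName' AND prv.boundary_id = 1

-- Style 126 (ISO 8601) keeps the date literal independent of language settings
SET @Sql = N'ALTER PARTITION FUNCTION YourPartitionFunctionName() MERGE RANGE ('''
         + CONVERT(nvarchar(30), @BoundaryValue, 126) + N''')'

EXEC sp_executesql @Sql</pre>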
<p><strong>Conclusion</strong></p>
<p>Using the DMVs and appropriate stored procedures, the rolling window can be automated and does not require hand-crafted SQL to work &#8211; just use the DMVs to get the key values you need to construct the harder parts of the process.</p>
<p>If you are following the <a href="http://sqlfascination.com/2009/10/15/guidance-on-how-to-layout-a-partitioned-table-across-filegroups/" target="_blank">guide on partition layout</a> I wrote before, then the filegroup you have just removed the data from becomes the spare filegroup that will house the next import of data. If you store this within the database, the next load can automatically work out where to place the data and which filegroup to set as the next used, closing the loop so to speak.</p>
<br />Posted in SQL Server Tagged: SQL Server 2005, SQL Server 2008, Table Partitioning]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/12/10/rolling-a-partition-forward-part-2/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Rolling a Partition Forward &#8211; Part 1</title>
		<link>http://sqlfascination.com/2009/12/09/rolling-a-partition-forward-part-1/</link>
		<comments>http://sqlfascination.com/2009/12/09/rolling-a-partition-forward-part-1/#comments</comments>
		<pubDate>Wed, 09 Dec 2009 23:10:55 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[Table Partitioning]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=284</guid>
		<description><![CDATA[I have covered how to layout a partitioned table across filegroups previously, but have not gone through the steps of rolling a partitioned window &#8211; it sounds a simple process but with all the file group and pre-requisites for it to run smoothly anyone starting with partitioned tables could probably use a little guide. As [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=sqlfascination.com&#038;blog=9662534&#038;post=284&#038;subd=andrewhogg&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I have previously covered how to lay out a <a href="http://sqlfascination.com/2009/10/15/guidance-on-how-to-layout-a-partitioned-table-across-filegroups/" target="_blank">partitioned table across filegroups</a>, but have not gone through the steps of rolling a partitioned window &#8211; it sounds like a simple process, but with all the filegroup handling and prerequisites needed for it to run smoothly, anyone starting with partitioned tables could probably use a little guide. As you are about to see, the process is quite intricate, so I will go through the load process in this post and the unload in the next.</p>
<p>Because no one case fits all, I have made some assumptions / limitations to provide a guide, specifically:</p>
<ul>
<li>The main partitioned table has a clustered index.</li>
<li>The layout follows the mechanism of keeping a staging filegroup and a spare filegroup, as detailed in the layout post.</li>
<li>The rollout process intends to remove the oldest data / partition.</li>
<li>The process is designed for large loads, not single inserts.</li>
</ul>
<p>So let&#8217;s see what it takes to prepare and get data into a partitioned table:</p>
<ul>
<li>Create a staging table on your dedicated ETL filegroup, with a column schema identical to that of your partitioned table.</li>
<li>Load the data into the staging table.</li>
<li>Move the staging table to the spare filegroup, using a clustered index creation. (The need for the spare was covered in the layout post)</li>
<li>Add any additional Non-Clustered indexes required to match the partitioned table indexes.</li>
<li>Constrain the data so that it is considered trusted &#8211; the constraint must ensure all values are within the partition boundary you intend to place it within.</li>
<li><em>Set the Partition Schema Next Used Filegroup</em></li>
<li><em>Split the Partition Function</em></li>
<li><em>Switch the staging table into the main partitioned table</em></li>
</ul>
<p>That was all just to bulk load data into a partitioned table &#8211; a long list with plenty of opportunity for it to go wrong, but most of these steps use T-SQL that you will be very familiar with &#8211; it is only the last 3 items that use less common SQL and are harder to automate, since there are no built-in tools to do the work for you.</p>
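<p>The first five, more familiar, steps might look something like the sketch below. Every table, column, index and filegroup name here is a hypothetical placeholder, and the date range assumes monthly partitions with December 2009 as the incoming month, so treat it as an outline rather than a finished script:</p>
<pre>-- Hypothetical names throughout; adjust to match your partitioned table.
-- 1. Staging table on the ETL filegroup, same columns as the partitioned table.
CREATE TABLE dbo.SalesStaging (
    SaleDate datetime NOT NULL,
    Amount   money    NOT NULL
) ON [YourEtlFG];

-- 2. Bulk load into dbo.SalesStaging here (BULK INSERT, SSIS, bcp etc.).

-- 3. Creating the clustered index on the spare filegroup moves the data there.
CREATE CLUSTERED INDEX CIX_SalesStaging
    ON dbo.SalesStaging (SaleDate) ON [YourSpareFG];

-- 4. Add any non-clustered indexes that exist on the partitioned table.
CREATE NONCLUSTERED INDEX IX_SalesStaging_Amount
    ON dbo.SalesStaging (Amount) ON [YourSpareFG];

-- 5. Trusted constraint limiting the rows to the target partition range
--    (assumed here to be the whole of December 2009).
ALTER TABLE dbo.SalesStaging WITH CHECK
    ADD CONSTRAINT CK_SalesStaging_SaleDate
    CHECK (SaleDate &gt;= '20091201' AND SaleDate &lt; '20100101');</pre>
<p>The WITH CHECK option is what makes the constraint trusted, since the existing rows are validated when it is added.</p>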
<p><strong>Setting the Next Used Filegroup</strong></p>
<p>The intention when setting the filegroup is to declare where the new partition should locate its data for the partition function split that is about to occur. Whilst you can <a href="http://sqlfascination.com/2009/09/30/how-to-remember-the-next-used-filegroup-in-a-partition-scheme/" target="_blank">discover</a> what the previous setting might be, it is not advisable to rely on it; instead, set it every time, just before performing a partition function split. The syntax for the command is:</p>
<pre><span style="color:#0000ff;">ALTER PARTITION SCHEME</span> YourPartitionSchemeName <span style="color:#0000ff;">NEXT USED</span> [YourSpareFG]</pre>
<p><strong>Splitting the Partition Function</strong></p>
<p>Splitting the partition function is in effect the creation of an extra dividing point on the number line / date line representing the partitioned table. If you split a partition that already holds data, the operation can be quite expensive, since SQL Server can be forced to move data between filegroups. It is therefore common in a rolling window scenario to split in order to handle the incoming data, which is always in advance of your existing data. For example, if you are storing sales data partitioned by the month/year of the sales date, and currently only hold data up until November, you would not insert any data for December until the partition for December had been created.</p>
<p>The syntax is straightforward:</p>
<pre><span style="color:#0000ff;">ALTER PARTITION FUNCTION </span>YourPartitionFunctionName() <span style="color:#0000ff;">SPLIT RANGE</span> (YourBoundaryValue)</pre>
<p>But when importing new data in an automated fashion, you might not know whether the new partition split has already been performed, so how can you check whether the new boundary value already exists in the partition function? The partitioning catalog views can provide the answer:</p>
<pre><span style="color:#0000ff;">SELECT </span><span style="color:#ff00ff;">count</span>(value) as ValueExists <span style="color:#0000ff;">FROM </span><span style="color:#339966;">sys.partition_range_values</span>
<span style="color:#0000ff;">INNER JOIN </span><span style="color:#339966;">sys.partition_functions</span> ON <span style="color:#339966;">sys.partition_functions</span>.function_id = <span style="color:#339966;">sys.partition_range_values</span>.function_id
<span style="color:#0000ff;">WHERE </span>name = 'YourPartitionFunctionName' AND value = YourBoundaryValue</pre>
<p>A returned value of 0 would indicate it did not exist, whilst a 1 would indicate a boundary value had already been created.</p>
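<p>Putting the check and the split together, an automated load could guard the split along the lines of the sketch below. The object names are the same placeholders used above, while the datetime type and the December boundary are assumptions made purely for illustration:</p>
<pre>-- Assumed: the partition function takes a datetime parameter.
DECLARE @boundary datetime;
SET @boundary = '20091201';  -- hypothetical new boundary value

IF NOT EXISTS (
    SELECT 1
    FROM sys.partition_range_values AS prv
    INNER JOIN sys.partition_functions AS pf
        ON pf.function_id = prv.function_id
    WHERE pf.name = 'YourPartitionFunctionName'
      AND prv.value = @boundary
)
BEGIN
    -- Always set NEXT USED immediately before the split.
    ALTER PARTITION SCHEME YourPartitionSchemeName NEXT USED [YourSpareFG];
    ALTER PARTITION FUNCTION YourPartitionFunctionName() SPLIT RANGE (@boundary);
END</pre>
<p>Declaring the variable with the same data type as the partition function parameter is deliberate: the value column is a sql_variant, and as far as I am aware values from different data type families (a string literal against a datetime boundary, for instance) are never treated as equal in a sql_variant comparison.</p>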
<p><strong>Switching the Staging Table In</strong></p>
<p>Switching the staging table into the newly created partition looks relatively easy but needs the partition number:</p>
<pre><span style="color:#0000ff;">ALTER TABLE</span> yourStagingTable <span style="color:#0000ff;">SWITCH TO</span> YourPartitionedTable <span style="color:#0000ff;">PARTITION</span> PartitionNumber</pre>
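<p>Where do you get the partition number from? Partitions are numbered sequentially from 1, starting at the leftmost partition. For a RANGE LEFT partition function, the partition that holds a given boundary value has the same number as that boundary&#8217;s boundary ID; for a RANGE RIGHT function the boundary value belongs to the partition on its right, so the partition number is the boundary ID plus one. If you know the boundary value you have set for the partition, you can get the boundary ID using the catalog views again:</p>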
<pre><span style="color:#0000ff;">SELECT </span>boundary_id
<span style="color:#0000ff;">FROM</span> <span style="color:#008000;">sys.partition_range_values</span>
<span style="color:#0000ff;">INNER JOIN</span> <span style="color:#008000;">sys.partition_functions</span> ON<span style="color:#008000;"> sys.partition_functions</span>.function_id  = <span style="color:#008000;">sys.partition_range_values</span>.function_id
<span style="color:#0000ff;">WHERE</span>  name = 'YourPartitionFunctionName' AND value= YourBoundaryValue</pre>
<p>These catalog views give you access to the data you need to automate the process in stored procedures, finding out the boundary IDs in one step so that they can be used in the next, and so on.</p>
<p>These are the trickier parts of the process to automate, and the ones that need the help of the catalog views. In the next post I will go through the unloading of the old data.</p>
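<p>An alternative to looking up the boundary ID is to ask SQL Server which partition a value falls into, using the built-in $PARTITION function. A minimal sketch, assuming a datetime partitioning column and a hypothetical December boundary, with the same placeholder object names as above:</p>
<pre>DECLARE @boundary datetime;        -- assumed partitioning column type
DECLARE @partitionNumber int;
SET @boundary = '20091201';        -- hypothetical boundary value

-- $PARTITION returns the number of the partition a value would fall into,
-- whether the function is RANGE LEFT or RANGE RIGHT.
SET @partitionNumber = $PARTITION.YourPartitionFunctionName(@boundary);

ALTER TABLE yourStagingTable
    SWITCH TO YourPartitionedTable PARTITION @partitionNumber;</pre>
<p>The value passed in should be one that actually lives in the target partition, such as any date covered by the check constraint on the staging table.</p>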
<br />Posted in SQL Server Tagged: SQL Server 2005, SQL Server 2008, Table Partitioning]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/12/09/rolling-a-partition-forward-part-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>How Can You Spot the Procedure Cache Being Flooded?</title>
		<link>http://sqlfascination.com/2009/12/03/how-can-you-spot-the-procedure-cache-being-flooded/</link>
		<comments>http://sqlfascination.com/2009/12/03/how-can-you-spot-the-procedure-cache-being-flooded/#comments</comments>
		<pubDate>Thu, 03 Dec 2009 22:32:34 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Plan Cache]]></category>
		<category><![CDATA[Query Parameterisation]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=278</guid>
		<description><![CDATA[This comes from a question I had a couple of days ago &#8211; the SQL Server: Buffer Manager : Page Life Expectancy provides a performance counter that indicates the current lifetime of a page within memory. As data pages and query object pages are being added to the buffer pool they will of course build up [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=sqlfascination.com&#038;blog=9662534&#038;post=278&#038;subd=andrewhogg&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>This comes from a question I had a couple of days ago &#8211; SQL Server: Buffer Manager: Page Life Expectancy is a performance counter that indicates the current lifetime of a page within memory. As data pages and query object pages are added to the buffer pool they will of course build up, and SQL Server will come under memory pressure as a result. The normal advice is that this figure should be above 300 seconds, indicating that a page should stay in memory for at least 5 minutes.</p>
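<p>Outside of Perfmon, the same counter can be read from within SQL Server through the sys.dm_os_performance_counters DMV. A small sketch (the LIKE filter is there because the object name prefix differs between default and named instances):</p>
<pre>SELECT [object_name], counter_name, cntr_value AS page_life_expectancy_seconds
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
  AND [object_name] LIKE '%Buffer Manager%';</pre>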
<p>This figure, however, includes both the data cache and the procedure cache &#8211; which means you cannot determine whether the pages being flushed are the result of churning data pages or of ad hoc queries flooding the procedure cache. You can of course look at the procedure cache using DMVs and see the number of objects grow and then shrink, but this is not particularly scientific, nor is it measurable within a trace.</p>
<p>The page life expectancy can easily be traced within Perfmon, but how do you measure the procedure cache? Well, there are a couple of events you can trace in SQL Profiler; the primary one, which I would like to be working, does not seem to register properly, whilst the secondary one does at least work. The two events are SP:Cache Remove and SP:Cache Insert.</p>
<p>The SP:Cache Remove event has 2 event sub classes listed in <a href="http://blogs.msdn.com/sqlprogrammability/archive/2007/01/19/12-0-plan-cache-trace-events-and-performance-counters.aspx" target="_blank">documentation</a> produced by the SQL Programmability team: sub class 2 is for a deliberate procedure cache flush, such as a DBCC FreeProcCache command, and sub class 1 is for when a compiled plan is removed due to memory pressure. In testing, the deliberate procedure cache flush does show up in the profiler traces, with an event subclass value of &#8216;2 &#8211; Proc Cache Flush&#8217; &#8211; but after a number of tests, I have never been able to get the event raised when the procedure cache is under memory pressure. If it were raised, we would have exactly what I was after: an easy, traceable and recordable way to show a procedure cache under too much pressure.</p>
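<p>As an example of the kind of DMV query meant here, the following sketch summarises the plan cache by object type using sys.dm_exec_cached_plans. A large and growing number of single-use Adhoc plans is the pattern to look for, although what counts as large is a judgement call for your own system:</p>
<pre>SELECT objtype,
       COUNT(*) AS plan_count,
       SUM(CASE WHEN usecounts = 1 THEN 1 ELSE 0 END) AS single_use_plans,
       SUM(CAST(size_in_bytes AS bigint)) / 1048576 AS size_mb
FROM sys.dm_exec_cached_plans
GROUP BY objtype
ORDER BY size_mb DESC;</pre>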
<p>The SP:Cache Insert event is more of a backup mechanism to show the procedure cache is being flooded, but only on the basis that you count the number of times the event shows up within a trace over a period of time. In essence an SP:Cache Insert is only going to occur if a query does not have a matching query plan within the cache. A large number of these within a short period of time is an indication that the procedure cache is potentially being flooded.</p>
<p>Combine a large number of SP:Cache Inserts with a low Page Life Expectancy and you can be fairly confident that you have a procedure cache flooding problem.</p>
<p>So there is a mechanism of sorts to determine whether a low page life expectancy comes from data page churn or query page churn, but if the SP:Cache Remove subclass 1 event actually worked, it would be a lot easier. Once you know your plan cache is being flooded, you are then looking to <a href="http://sqlfascination.com/2009/10/31/simple-vs-forced-query-parameterization/" target="_blank">check whether forced parameterization</a> is worth using to eliminate the issue.</p>
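<p>If the trace has been saved to a file, the counting can be done with a query over sys.fn_trace_gettable. This is only a sketch: the file path is hypothetical, and it relies on event class 35 being SP:CacheInsert, which is the ID listed for that event in the sp_trace_setevent documentation:</p>
<pre>-- Hypothetical trace file path; event class 35 is SP:CacheInsert.
SELECT DATEADD(minute, DATEDIFF(minute, 0, StartTime), 0) AS trace_minute,
       COUNT(*) AS cache_inserts
FROM sys.fn_trace_gettable('C:\Traces\PlanCache.trc', DEFAULT)
WHERE EventClass = 35
GROUP BY DATEADD(minute, DATEDIFF(minute, 0, StartTime), 0)
ORDER BY trace_minute;</pre>
<p>Grouping by minute gives a simple rate that can be lined up against the Page Life Expectancy readings for the same period.</p>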
<br />Posted in SQL Server Tagged: Plan Cache, Query Parameterisation, SQL Server 2005, SQL Server 2008]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/12/03/how-can-you-spot-the-procedure-cache-being-flooded/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Prodata SQL Academy Events</title>
		<link>http://sqlfascination.com/2009/12/02/prodata-sql-academy-events/</link>
		<comments>http://sqlfascination.com/2009/12/02/prodata-sql-academy-events/#comments</comments>
		<pubDate>Wed, 02 Dec 2009 11:21:25 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[SQL Training]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=273</guid>
		<description><![CDATA[If you haven&#8217;t seen them advertised, Bob Duffy from Prodata is running a series of SQL Academy half day training session in Dublin, hosted at the Microsoft Auditorium in their offices in Leopardstown &#8211; the events are level 300 which suits the half day slot allocated for the sessions &#8211; yesterday&#8217;s was about performance tuning [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=sqlfascination.com&#038;blog=9662534&#038;post=273&#038;subd=andrewhogg&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>If you haven&#8217;t seen them advertised, <a href="http://blogs.msdn.com/boduff/default.aspx" target="_blank">Bob Duffy</a> from <a href="http://www.prodata.ie/" target="_blank">Prodata</a> is running a series of SQL Academy half day training sessions in Dublin, hosted at the Microsoft Auditorium in their offices in Leopardstown &#8211; the events are level 300, which suits the half day slot allocated for the sessions &#8211; yesterday&#8217;s was about performance tuning and optimisation, so a colleague and I took a short flight over and enjoyed the excellent Irish hospitality. The talk was recorded, so there will no doubt be a webcast published at some point by Technet in Ireland. The talk primarily went through using perfmon counters and wait states &#8211; and the available tools that can make this a lot easier by wrapping up and correlating results from different logging mechanisms.</p>
<p>I would recommend keeping an eye out for the cast when it appears, since troubleshooting a production environment is all about using non-intrusive means to understand what is crippling the systems &#8211; memory, CPU, IO etc. If you are not practised at this form of troubleshooting, it is very difficult to know which performance counters and wait states to observe amongst the thousands that exist &#8211; as well as which DMVs can give you the critical information to diagnose the problems. (It was quite interesting that the demonstration performance issue he was looking at was fundamentally a combination of a missing index and, more critically, a lack of query parameterisation, since it was in simple mode. I have <a href="http://sqlfascination.com/2009/10/31/simple-vs-forced-query-parameterization/" target="_blank">previously written about</a> the counters used to diagnose this problem and the symptoms that you might encounter.)</p>
<p>The wait-state side of the talk was very interesting. I often use a combination of DMVs and perfmon in the field to diagnose problems, but have only used a certain amount of the wait-state information and do not delve into it as deeply &#8211; I will definitely be adding a few more wait states to the list for the future.</p>
<p>The next event is on February 16th and covers SQL Analysis Services &#8211; <a href="http://msevents.microsoft.com/CUI/EventDetail.aspx?EventID=1032428016&amp;Culture=en-IE" target="_blank">registration</a> is already open.</p>
<br />Posted in SQL Server Tagged: SQL Server 2005, SQL Server 2008, SQL Training]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/12/02/prodata-sql-academy-events/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Why is SQL Azure and Index Fragmentation a Bad Combination?</title>
		<link>http://sqlfascination.com/2009/11/25/why-is-sql-azure-and-index-fragmentation-a-bad-combination/</link>
		<comments>http://sqlfascination.com/2009/11/25/why-is-sql-azure-and-index-fragmentation-a-bad-combination/#comments</comments>
		<pubDate>Wed, 25 Nov 2009 22:20:10 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Indexes]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[SQL Azure]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=261</guid>
		<description><![CDATA[I&#8217;ve been thinking through and experimenting a bit more with some of the concepts in SQL Azure &#8211; specifically I was considering the impact of fragmentation on both the storage (in terms of the storage limit) as well as the maintenance. This is not a new issue, DBA&#8217;s face fragmentation regularly and can deal with it [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=sqlfascination.com&#038;blog=9662534&#038;post=261&#038;subd=andrewhogg&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been thinking through and experimenting a bit more with some of the concepts in SQL Azure &#8211; specifically I was considering the impact of fragmentation on both the storage (in terms of the storage limit) and the maintenance. This is not a new issue; DBAs face fragmentation regularly and can deal with it in a variety of ways, but with SQL Azure the problem looks magnified by a lack of tools and working space. Whilst looking into this, I realised that there is an unfortunate consequence of not knowing how much data space your index is actually using.</p>
<p>Each table in SQL Azure has to have a clustered index if data is going to be inserted into it, and clustered indexes can suffer from fragmentation if chosen poorly. Combining SQL Azure with time-honoured fragmentation has three consequences. Fragmentation:</p>
<ul>
<li>will occur and you have no way in which to measure it due to the lack of DMV support.</li>
<li>will create wasted space within your space allocation limit.</li>
<li>will reduce your performance.</li>
</ul>
<p>You could work it out if you knew how much space you had actually used vs. what the size of the data held is, but we are unable to measure either of those values. If you have chosen the data compression option on the index then even those values would not give you a fragmentation ratio.</p>
<p>This leaves us in a situation in which you cannot know how fragmented you are, meaning you either:</p>
<ul>
<li>Schedule a regular index rebuild.</li>
<li>Hope SQL Azure performs index rebuilds for you.</li>
</ul>
<p>I&#8217;m not aware of SQL Azure doing this for you &#8211; and you do not have SQL Agent facilities either.</p>
<p>So this seems very wrong; the concept of SQL Azure is to take away a lot of the implementation details and hassle from the subscriber &#8211; DR and failover are handled etc. But there looks to be a gap into which certain items such as fragmentation are falling &#8211; I have not seen any documentation saying SQL Azure handles it (but there could be some hidden somewhere and I hope there is!) and neither are you given the right tools with which to program and handle it yourself.</p>
<p>What happens when you hit that size limit?</p>
<pre><span style="color:#ff0000;">Msg 40544, Level 20, State 5, Line 1 The database has reached its size quota. Partition or delete data, drop indexes, or consult the documentation for possible resolutions. Code: 524289 </span></pre>
<p>That took a lot of time to get to, (SQL Azure is not fast), but was generated using a simple example that would also demonstrate fragmentation.</p>
<pre><span style="color:#0000ff;">Create Table</span> fragtest ( id <span style="color:#0000ff;">uniqueidentifier primary key clustered</span>,
padding <span style="color:#0000ff;">char</span>(3000)
) </pre>
<p>Very simple stuff, deliberately using a clustered key on a GUID to cause a decent level of fragmentation, and also using the fixed-width character padding field to ensure only 2 rows fit per page, maximising the page splits.</p>
<pre><span style="color:#0000ff;">insert into</span> fragtest <span style="color:#0000ff;">values</span> (<span style="color:#ff00ff;">newid</span>(), <span style="color:#ff00ff;">replicate</span>(<span style="color:#ff0000;">'a'</span>,1000))
go 200000</pre>
<p>Because of the randomness of the newid() function, the level of fragmentation is not predictable but will certainly occur &#8211; in my test I hit the wall on 196,403 records and failed with an out of space message.</p>
<p>Given the 2 rows per page and the number of rows, with ~0% fragmentation the data should occupy about 767 MB &#8211; that is considerably short of 1 GB &#8211; so there is a significant level of fragmentation in there wasting space, about 23% of it. If you include the 2 KB per page being wasted by the awkward row size, then the actual raw data stored is roughly 60% of the overall size, allowing for row overheads etc.</p>
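<p>The arithmetic behind those figures can be reproduced with a quick query. This is only a rough sketch: it assumes 8 KB pages, exactly 2 rows per page, and a 16 byte uniqueidentifier plus the 3000 byte padding per row, and it ignores row and page overheads:</p>
<pre>DECLARE @rows int;
SET @rows = 196403;  -- rows inserted before the quota error in the test above

SELECT CEILING(@rows / 2.0)                     AS pages_at_two_rows_per_page,
       CEILING(@rows / 2.0) * 8 / 1024.0        AS unfragmented_size_mb,  -- roughly 767 MB
       @rows * CAST(3016 AS bigint) / 1048576.0 AS raw_data_mb;           -- roughly 565 MB</pre>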
<p>So there are two important points from this contrived example:</p>
<ul>
<li>You can lose significant space from bad design.</li>
<li>Doing this backs you into a corner that you will not be able to get out of &#8211; this is the worst part.</li>
</ul>
<p>How are you cornered? Well, try to work out how to get out of the situation and defrag the clustered index / free up the space. You could:</p>
<ul>
<li>Attempt an index rebuild.</li>
<li>Try to rebuild it with SORT_IN_TEMPDB.</li>
<li>Drop the index.</li>
<li>Delete data.</li>
</ul>
<p>The first three fail; SORT_IN_TEMPDB is not supported and would not have rescued the situation either, since you have no working space in which to write the newly sorted rows prior to removing the old ones. So do you really want to delete data? I don&#8217;t think we can consider that an option for now.</p>
<p>This all seems like being caught between a &#8216;rock&#8217; and a &#8216;hard place&#8217;; whilst SQL Azure can support these data quantities, it seems prudent never to go close to them at all &#8211; and you are equally going to find it difficult to understand whether you are close to them, since there is no way of measuring the fragmentation. The alternative is to manually rebuild indexes on a regular basis to control fragmentation, but then enough free space has to be left to allow you to rebuild your largest index without running out of space &#8211; reducing your data capacity significantly.</p>
<p>The corner is not entirely closed off, the way out of the corner would be to create another SQL Azure database within my account and select the data from database1.fragtest to database2.fragtest and then drop the original table and transfer it back &#8211; not ideal but it would work in an emergency.</p>
<p>I think the key is to design to make sure you do not have to face this issue; keep your data quantities very much under the SQL Azure size limits, and watch for the potential of tables being larger than the remaining space, which would prevent a re-index from occurring.</p>
<p>Interested to know your thoughts on this one, and what other consequences of being close to the limit will come out.</p>
<br />Posted in SQL Server Tagged: Indexes, Internals, SQL Azure]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/11/25/why-is-sql-azure-and-index-fragmentation-a-bad-combination/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>PDC09 : Day 3 &#8211; SQL Azure and Codename &#8216;Houston&#8217; announcement</title>
		<link>http://sqlfascination.com/2009/11/20/pdc-09-day-3-sql-azure-and-codename-houston-announcement/</link>
		<comments>http://sqlfascination.com/2009/11/20/pdc-09-day-3-sql-azure-and-codename-houston-announcement/#comments</comments>
		<pubDate>Fri, 20 Nov 2009 01:03:48 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[PDC 09]]></category>
		<category><![CDATA[SQL Server]]></category>
		<category><![CDATA['Houston']]></category>
		<category><![CDATA[MaxDop]]></category>
		<category><![CDATA[PDC09]]></category>
		<category><![CDATA[Query Parameterisation]]></category>
		<category><![CDATA[SQL Azure]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=248</guid>
		<description><![CDATA[The PDC is just about over, the final sessions have finished and the place is emptying rapidly &#8211; the third day has included a lot of good information about SQL Azure, the progress made to date on it as well as the overall direction &#8211; including a new announcement by David Robinson, Senior PM on [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=sqlfascination.com&#038;blog=9662534&#038;post=248&#038;subd=andrewhogg&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>The PDC is just about over: the final sessions have finished and the place is emptying rapidly. The third day included a lot of good information about SQL Azure, the progress made on it to date and its overall direction, including a new announcement by David Robinson, Senior PM on the Azure team, about a project codenamed &#8216;Houston&#8217;.</p>
<p>During the sessions today the 10GB limit on a SQL Azure database was mentioned a number of times, but each time with the caveat that this is purely the limit right now and that it will be increased. To get around the limit you can partition your data across multiple SQL Azure databases, as long as your application logic understands which database holds which data. There is no intrinsic way of creating a view across the databases, but it immediately made me wonder whether, using the linked server feature of the full SQL Server, you could link to multiple Azure databases and create a partitioned view across them. I have got to try that out when I get back to the office, but I do not expect it to work.</p>
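<p>For reference, the partitioned view idea would look roughly like the sketch below on a full SQL Server instance. It is untested against SQL Azure and may well not work there. The linked server names (AzureA, AzureB), the database names (ShardA, ShardB) and the Orders table are all hypothetical and used purely for illustration.</p>
<pre>-- On each member database: the same table shape, with a CHECK constraint
-- giving each database a non-overlapping key range (range shown for ShardA).
CREATE TABLE dbo.Orders
(
    OrderID   int      NOT NULL PRIMARY KEY CHECK (OrderID BETWEEN 1 AND 999999),
    OrderDate datetime NOT NULL
)
GO

-- On the full SQL Server instance, assuming linked servers AzureA and AzureB
-- (hypothetical names) have already been defined: union the member tables.
CREATE VIEW dbo.AllOrders
AS
SELECT OrderID, OrderDate FROM AzureA.ShardA.dbo.Orders
UNION ALL
SELECT OrderID, OrderDate FROM AzureB.ShardB.dbo.Orders
GO</pre>
<p>The CHECK constraints are what allow the optimiser to skip member tables that cannot hold the requested keys. Without them the view would still return the right rows, but every database would be queried.</p>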
<p>SQL Azure handles all of the resilience, backup, DR modes and so on, and it remains hidden from you, although when connected to a SQL Azure database you do see a &#8216;Master&#8217; database present. It is not really a &#8216;Master&#8217; in the way that we normally think of one, and it quickly becomes apparent how limited that version of the &#8216;Master&#8217; really is: it exists purely to give you a place to create logins and databases. It could have been called something else to make that a bit clearer, but one of the SQL Azure team said the name was kept for compatibility with third-party applications that expect there to be a master.</p>
<p>SQL Azure supports transactions, as mentioned before, but given the current 10GB limit on a database you will be partitioning your data across databases. That will be a problem because the system does not support distributed transactions, so any atomic work that has to be committed on multiple databases at once is going to have to be controlled manually in code, which is not ideal and is a limitation to be aware of.</p>
<p>Equally, cross-database joins came up as a problem area. They can be made, but it appears there are performance issues. I am interested to start running some more tests there and see whether you can mimic a partitioned view across databases using joins. The recommendation was to duplicate reference data between databases, so lookup tables would appear in each database, in effect removing the cross-database join.</p>
<p>On the futures list:</p>
<ul>
<li>The ability to have dynamic partition splits looked interesting. Regular SQL Server does not have this facility within a partitioned table, so if Azure can do it across databases then it might come up on the SQL Server roadmap as a feature &#8211; though that could be wishful thinking.</li>
<li>Better tooling for developers and administrators &#8211; that is a standard future roadmap entry.</li>
<li>Ability to Merge database partitions.</li>
<li>Ability to Split database partitions.</li>
</ul>
<p>So SQL Azure has grown up considerably and continues to grow. In the hands-on labs today I got to have more of a play with it and start testing more of the subtle limitations and boundaries that are in place. Connecting to an Azure database via SQL Server Management Studio is trivial, and the Object Explorer contains a cut-down version of the normal object tree but includes all the things you would expect, such as tables, views and stored procedures.</p>
<p>Some limitations from the lack of a real master and real admin access become apparent pretty fast: no DMV support, no ability to check your current size, and no ability to change a significant number of options. In fact, the bulk of the options are not even exposed.</p>
<p>I took an immediate look at two of my personal favourites, MAXDOP and parameterisation.</p>
<ul>
<li>MAXDOP is set at 1, although you cannot see it, and attempting to override it throws an error from the query window telling you that it is not permitted. Do not plan on parallel query execution; you will not get it.</li>
<li>I attempted to test query parameterisation using the date literal trick, and the query appeared to remain parameterised, as though the database is in &#8216;Forced&#8217; parameterisation mode, which would make it more likely to hit parameter sniffing problems. I have not been able to prove it concretely as yet, but the early indication is that the setting is &#8216;Forced&#8217;.</li>
</ul>
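<p>For anyone wanting to repeat the MAXDOP test, these are the usual ways of asking for parallelism on a regular SQL Server. This is only a sketch: dbo.SomeTable is a placeholder name, the value 4 is arbitrary, and it is an assumption that both forms are refused on SQL Azure in the way described above.</p>
<pre>-- Per-query override using a query hint
SELECT COUNT(*)
FROM dbo.SomeTable   -- placeholder table name
OPTION (MAXDOP 4)
GO

-- Server-wide setting on a full SQL Server instance
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
GO
EXEC sp_configure 'max degree of parallelism', 4
RECONFIGURE
GO</pre>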
<p>One other interesting concept was that a table has to have a clustered index; it is not optional if you want to get data into the table. It did not stop me from creating a table without a clustered index, though, and I did not get as far as populating it to see the limit in action &#8211; a case of too much to do and so little time.</p>
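<p>The clustered index requirement could be tested along these lines. This is a minimal sketch that has not been run against SQL Azure. dbo.HeapDemo and the index name are made up for illustration, and the assumption (based on what was said in the session) is that the first insert is rejected while the table is still a heap.</p>
<pre>-- Creating a table without a clustered index is allowed
CREATE TABLE dbo.HeapDemo
(
    id  int         NOT NULL,
    val varchar(50) NULL
)
GO

-- Assumption: on SQL Azure this insert fails because the table is a heap
INSERT INTO dbo.HeapDemo (id, val) VALUES (1, 'a')
GO

-- Adding a clustered index should then allow the same insert to succeed
CREATE CLUSTERED INDEX CIX_HeapDemo_id ON dbo.HeapDemo (id)
GO
INSERT INTO dbo.HeapDemo (id, val) VALUES (1, 'a')
GO</pre>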
<p>In one of the final talks about SQL Azure, David Robinson announced a project codenamed &#8216;Houston&#8217; (there will be so many &#8216;we have a problem&#8217; jokes on that one), which is basically a Silverlight equivalent of SQL Server Management Studio. The reasoning is that SQL Azure lives in the cloud, and if the only way to interact with it is by installing SSMS locally then it does not feel like a consistent story.</p>
<p>From the limited preview it only contains the basics, but it clearly lets you create tables, stored procedures and views, edit them, and even add data to tables in a grid view reminiscent of Microsoft Access. The UI was based around the standard ribbon bar, with an object window on the left and a working pane on the right. It was lo-fi to say the least, but you could see conceptually where it could go. Given enough time it could become a very good SSMS replacement, but I doubt it will be taken that far. There were Import and Export buttons on the ribbon with what looked to be &#8216;Excel&#8217;-like icons, but nothing was said or shown of them. Date-wise it is &#8216;targeting sometime in 2010&#8217;, so this has some way to go and is not even in beta as yet.</p>
<p>So that was PDC09, excellent event, roll on the next one!</p>
<br />Posted in PDC 09, SQL Server Tagged: 'Houston', MaxDop, PDC09, Query Parameterisation, SQL Azure]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/11/20/pdc-09-day-3-sql-azure-and-codename-houston-announcement/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>PDC09 : Day 2 Keynote</title>
		<link>http://sqlfascination.com/2009/11/19/pdc09-day-2-keynote/</link>
		<comments>http://sqlfascination.com/2009/11/19/pdc09-day-2-keynote/#comments</comments>
		<pubDate>Thu, 19 Nov 2009 03:59:02 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[PDC 09]]></category>
		<category><![CDATA[PDC09]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=242</guid>
		<description><![CDATA[The PDC has been an amazing place to be today, the buzz and excitement generated from the keynote this morning has permeated the entire convention centre and understandably so &#8211; this is primarily a conference for IT people and of course what is the best way to get IT folk on board? give them some hardware, [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=sqlfascination.com&#038;blog=9662534&#038;post=242&#038;subd=andrewhogg&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>The PDC has been an amazing place to be today. The buzz and excitement generated by the keynote this morning has permeated the entire convention centre, and understandably so. This is primarily a conference for IT people, and of course what is the best way to get IT folk on board? Give them some hardware: USB sticks, USB drives, t-shirts. But Steven Sinofsky and Microsoft went one better this morning in the PDC keynote.</p>
<p>You could sense something was coming, in that they were going through a number of netbook and laptop devices talking about how they have learnt more about how the hardware is constructed and used. This led them to create a kind of reference laptop with it all built in, so that it became an ultimate development platform. I was half expecting something along the lines of &#8216;and we are offering this to you today for a discount of&#8230;&#8217;, since it was clearly a very nice device that was being shown to the crowd. The spec, we now know, is:</p>
<ul>
<li>Acer Laptop</li>
<li>Dual core Celeron U2300 chip</li>
<li>2GB of RAM</li>
<li>250GB hard disk</li>
<li>Windows 7 Ultimate 64-bit, preinstalled with the Office 2010 beta</li>
<li>Tablet style PC with touch screen</li>
<li>1366&#215;768 resolution, Intel GMA 4500 MHD graphics</li>
<li>Webcam, 3G Sim Card support, HDMI output, built-in memory card reader.</li>
</ul>
<p>It manages to score 3.2 on the Windows Experience Index, which is pretty impressive for a semi-netbook style laptop; the score is understandably held back by the graphics performance.</p>
<p>What I did not expect, and what the hordes went wild at, was that he said &#8211; &#8220;today we are giving you one of these laptops, for free&#8221; &#8211; cue complete madness.</p>
<p>But to be fair, there is also a considerable amount of buzz from the announcements and features being demonstrated. I have to confess that they are not SQL / data related, which is of course my passion, but they are worth mentioning:</p>
<ul>
<li>Silverlight 4 &#8211; entered beta today and can be downloaded; release is expected in the first half of 2010, so expect MIX10 to contain the release announcements on that.</li>
<li>The Silverlight 4 feature set has been pumped up in all the key areas where the technology was lacking: print support, context menus, access to media devices such as webcams and audio, drag / drop, rich text support, clipboard access&#8230; the list goes on.</li>
<li>Office 2010 beta is now available and can be downloaded; PowerPivot (what was codenamed &#8216;Gemini&#8217;) is now available to all.</li>
<li>Visual Studio 2010 beta is now available, which brings along a whole host of templates for all these new features.</li>
<li>SharePoint 2010 beta was released today, and the integration between the development surface and the SharePoint site looks to be a consistent story and got cheers from the SharePoint developers in the audience. (I kind of feel sorry for the SharePoint presenters and demos: they followed on the heels of Steve and then Scott, who had just made the most significant announcements of the conference and given away the best &#8216;goodie&#8217; you could get. How can you follow that?)</li>
</ul>
<p>So did this leave anything for the database side of my passion?</p>
<p>Well yes, in a roundabout way. What is interesting is that Silverlight is being extended to include a trusted model, which gives you far wider access to the underlying OS and starts bringing it into the realms of local data consumption. They have also allowed calls into the older COM object models to be made from within Silverlight when running in full trust mode. This means that, technically, Silverlight can make direct calls into the database via the COM ADO libraries, instead of using the System.Data namespace and ADO.NET. Up until now there has been no way for the platform to connect to a SQL Server directly, but this provides a very roundabout way in which to do it.</p>
<p>It seems puzzling that you would allow that scenario but not give Silverlight some form of direct access to the database itself. At the ask-the-experts session later in the day we asked whether a proper data access technology for connecting to SQL Server was being included, and the answer indicated there would be something there to do it, but no specifics were mentioned. I also managed to spend some time in the big room and chat to some of the SQL guys at the booth, as well as the patterns and practices team. I want to get a chance to chat more to the SQL Azure team but will have to wait until tomorrow to fit that in.</p>
<br />Posted in PDC 09 Tagged: PDC09]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/11/19/pdc09-day-2-keynote/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>PDC09 : Day 1 Keynote</title>
		<link>http://sqlfascination.com/2009/11/18/pdc09-day-1-keynote/</link>
		<comments>http://sqlfascination.com/2009/11/18/pdc09-day-1-keynote/#comments</comments>
		<pubDate>Wed, 18 Nov 2009 05:04:08 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[PDC 09]]></category>
		<category><![CDATA[SQL Server]]></category>
		<category><![CDATA['Dallas']]></category>
		<category><![CDATA[PDC09]]></category>
		<category><![CDATA[SQL Azure]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=233</guid>
		<description><![CDATA[As promised, I wanted to only blog about the bits of the PDC that relate to SQL / Database / Data Services, and not every session within the PDC that I am attending. Many of the sessions have been interesting, but I am viewing them with my Architect&#8217;s hat on, and not from the viewpoint [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=sqlfascination.com&#038;blog=9662534&#038;post=233&#038;subd=andrewhogg&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>As promised, I wanted to only blog about the bits of the PDC that relate to SQL / Database / Data Services, and not every session within the PDC that I am attending. Many of the sessions have been interesting, but I am viewing them with my Architect&#8217;s hat on, and not from the viewpoint of my personal passion for SQL Server. I feel fortunate to be here and listening to the speakers and chatting to them offline instead of watching the PDC on the released videos after the event.</p>
<p>The keynote today contained a number of very interesting looking prospects on the data side of the fence, primarily &#8216;compered&#8217; by Ray Ozzie, Chief Software Architect at Microsoft. There were also some demos, some of which were quite good, whilst others suffered from over-scripting. I am sure twitter was going wild at times during the keynote as people were giving real-time feedback about what they thought. (Whether that is a good thing or not I am not sure, walking off stage to find a few hundred bad reviews can not be nice.) But this is not about the demos but about the SQL / Data stuff.</p>
<p>The phrase repeated throughout, summing up a lot of the work Microsoft have been doing, was &#8216;3 screens and a cloud&#8217;: the 3 screens of mobile, computer and TV represent 3 different delivery paradigms, but fundamentally the same technology stack delivers all 3.</p>
<p>The Azure data centres were announced to be going into production on Jan 1st 2010, and billing for those services will commence on the 1st Feb. However, the European and far eastern data centres were not listed as coming online until late in 2010, so the only data centres that will be up and running will be the Chicago and San Antonio data centres.</p>
<p>This may not seem a big problem, and in fact having 3 pairs of data centres around the world is far better than a single centralised resource, but for Europeans there are data protection laws in place that prohibit the movement of personal data outside of the bounds of Europe. In effect, you may not move the data into another jurisdiction whose data laws remove the legal protection the data subject has. So from a data angle it will be more interesting when the Dublin / Amsterdam data centre comes online in 2010, at which point storing data in the Azure cloud has a better data protection story.</p>
<p>SQL Azure has clearly been &#8216;beefed&#8217; up and can now be connected to via SQL Server Management Studio just like a normal database, and be administered / interacted with &#8211; even supporting transactions. The disaster recovery and physical administration of the SQL remains out of sight and handled by the cloud, and not the application / vendor. SQL Azure understands TDS, so connecting to the SQL Azure is pretty seamless and appears like a regular SQL server. It has clearly matured as a platform, and rightly so.</p>
<p>Another project, codenamed <a href="http://pinpoint.microsoft.com/en-US/Dallas" target="_blank">&#8216;Dallas&#8217;</a>, was announced, which forms part of <a href="http://pinpoint.microsoft.com/en-US/about.aspx" target="_blank">Pinpoint</a>. Pinpoint is a products / services portal, which instantly made me think of Apple&#8217;s &#8216;App Store&#8217; but for Windows products and companies offering services. The interesting part is the <a href="http://pinpoint.microsoft.com/en-US/Dallas" target="_blank">&#8216;Dallas&#8217;</a> section, which is something like a &#8216;Data Store&#8217;, allowing the discovery and consumption of centralised data services.</p>
<p>There has always been an issue when consuming data from other sources, that you are required to download it, understand the schema of the data and often ETL it from the format it is being supplied in, such as CSV, XML, Atom etc into a format that you can work with. Each data source often has its own schema and delivery mechanism and handling updates to the data remains an operational issue.</p>
<p>With &#8216;Dallas&#8217; you are buying into the data being held within the cloud, and it will auto-generate the proxy class for the data being consumed, so the schema of the data is available to you within code on the development side. This is an awesome concept, and if they can tie in some form of micro-payment structure you could easily visualise a set of data services that you consume within an application on an as-needed basis. Without micro-payments you would have to purchase a license, and whether that is a one-off cost or a monthly subscription, neither deals with the &#8216;elastic&#8217; nature of the applications being placed onto the cloud, where one of the key benefits is that the data centres can scale up / down as your apps require. Given that the billing for that is based on usage, and you specifically want to take advantage of the elasticity of the infrastructure provision, it would make sense to have a similar elasticity in the data service charging arena.</p>
<p>This is definitely a technology to keep a close eye on, and I will be signing up an account to get access to the free data services that they are going to expose.</p>
<br />Posted in PDC 09, SQL Server Tagged: 'Dallas', PDC09, SQL Azure]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/11/18/pdc09-day-1-keynote/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>At the PDC 09</title>
		<link>http://sqlfascination.com/2009/11/17/at-the-pdc-09/</link>
		<comments>http://sqlfascination.com/2009/11/17/at-the-pdc-09/#comments</comments>
		<pubDate>Tue, 17 Nov 2009 12:39:31 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[PDC 09]]></category>
		<category><![CDATA[PDC09]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=226</guid>
		<description><![CDATA[I am at the PDC this week in Los Angeles &#8211; the session selection is massive and covers a wide range of topics, although there are only a few sessions involving SQL Server.  I think this is a testament to how important the PASS conference is now for SQL Server. If I fit a SQL session in I [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=sqlfascination.com&#038;blog=9662534&#038;post=226&#038;subd=andrewhogg&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I am at the PDC this week in Los Angeles &#8211; the session selection is massive and covers a wide range of topics, although there are only a few sessions involving SQL Server.  I think this is a testament to how important the PASS conference is now for SQL Server.</p>
<p>If I fit a SQL session in I will blog about it, or about any major announcements, but I will try to keep this purely about SQL since that is my passion.</p>
<p>PDC is a great technology event as well as a good networking event, so shoot me a comment if you want to try to locate me in the mass of people roaming the halls.</p>
<br />Posted in PDC 09 Tagged: PDC09]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/11/17/at-the-pdc-09/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Inconsistent Date Literal Parameterization Behaviour</title>
		<link>http://sqlfascination.com/2009/11/12/inconsistent-date-literal-parameterization-behaviour/</link>
		<comments>http://sqlfascination.com/2009/11/12/inconsistent-date-literal-parameterization-behaviour/#comments</comments>
		<pubDate>Thu, 12 Nov 2009 22:20:12 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Indexes]]></category>
		<category><![CDATA[Query Parameterisation]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=211</guid>
		<description><![CDATA[I have mentioned query parameterization before and the process by which SQL extracts literal values from a query and re-writes the query in effect to use parameters to get a query plan cache hit, which negates the need to recompile the plan costing both time and CPU. There are a lot of good articles and [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=sqlfascination.com&#038;blog=9662534&#038;post=211&#038;subd=andrewhogg&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I have mentioned query parameterization before: the process by which SQL Server extracts literal values from a query and in effect rewrites the query to use parameters, so that it gets a query plan cache hit and avoids recompiling the plan, which costs both time and CPU. There are a lot of good articles and book chapters that cover the topic.</p>
<p>What has confused me for a while is seeing date literals being parameterized in one query and not in another, even though both databases have parameterization set to &#8216;Simple&#8217; mode. When a date literal is not parameterized, the chances of getting a query plan cache hit are obviously very low, which has a performance impact. The problem to date has been that I had been unable to work out what the queries that did get parameterized had in common, and why others just kept the literal. I was paying too much attention to the query and, as it turns out, not enough to the table.</p>
<p>Well, after a number of hours getting to know the brick wall very well I finally tracked it down, and oddly it had nothing to do with the query I was submitting. I could reproduce it reliably by using an unrelated non-clustered index, which is confusing to say the least, and I cannot yet think of any reason why. Is it a bug, or just &#8216;Weird and Odd&#8217;?</p>
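<p>As a quick check of which parameterization mode a database is in, something like the following can be used on SQL Server 2005 and 2008. It is a small sketch only, and ParamTestDb is a hypothetical database name.</p>
<pre>-- 0 = Simple (the default), 1 = Forced
SELECT name, is_parameterization_forced
FROM sys.databases
WHERE name = 'ParamTestDb'   -- hypothetical database name
GO

-- Set the mode explicitly; swap SIMPLE for FORCED to compare behaviour
ALTER DATABASE ParamTestDb SET PARAMETERIZATION SIMPLE
GO</pre>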
<p>The following steps reproduce the issue in both SQL Server 2005 and 2008.</p>
<p>Create a database with default settings, nothing specific. The second step is to create our test table; a simple structure will suffice.</p>
<pre><span style="color:#0000ff;">CREATE</span> <span style="color:#0000ff;">TABLE</span> [dbo].[paramtest](  [id] [int] <span style="color:#0000ff;">IDENTITY</span>(1,1)<span style="color:#808080;"> NOT NULL</span>, [somedate] [datetime] <span style="color:#808080;">NOT NULL</span>, [somefield] [varchar](50) <span style="color:#0000ff;">COLLATE</span> SQL_Latin1_General_CP1_CI_AS <span style="color:#808080;">NULL</span>,
 <span style="color:#0000ff;">CONSTRAINT</span> [PK_paramtest] <span style="color:#0000ff;">PRIMARY</span> <span style="color:#0000ff;">KEY</span> <span style="color:#0000ff;">CLUSTERED</span>
(
 [id] <span style="color:#0000ff;">ASC</span>
)<span style="color:#0000ff;">WITH</span> (IGNORE_DUP_KEY = <span style="color:#0000ff;">OFF</span>) <span style="color:#0000ff;">ON</span> [PRIMARY]
) <span style="color:#0000ff;">ON</span> [PRIMARY]</pre>
<p>We need some data to work against just to make sure we are selecting results, and we can insert these trivially as follows: </p>
<pre><span style="color:#0000ff;">insert </span><span style="color:#0000ff;">into</span> [paramtest] <span style="color:#0000ff;">values</span> (<span style="color:#ff00ff;">getdate</span>(),<span style="color:#0000ff;">'a'</span>)
go 10000</pre>
<p>So we now have 10k rows within the table, and a clustered primary key index on the identity column.</p>
<p>The test starts with freeing up the procedure cache and then running the select statement. The datetime I used was roughly in the middle of the range of values I had inserted, but it is not a deciding factor in the query plan results.</p>
<pre><span style="color:#0000ff;">dbcc</span> freeproccache
<span style="color:#0000ff;">select </span>* <span style="color:#0000ff;">from</span> paramtest <span style="color:#0000ff;">where</span> somedate &gt;<span style="color:#ff0000;"> '2009-11-12 21:14:50.000'</span></pre>
<p>Using a standard query plan cache extraction query, the specific line of the XML plan we are interested in is the SQL statement:</p>
<pre><span style="color:#ff0000;">&lt;StmtSimple StatementText="(@1 varchar(8000))SELECT * FROM [paramtest] WHERE [somedate]&gt;@1" StatementId="1" StatementCompId="1" StatementType="SELECT" StatementSubTreeCost="0.0379857" StatementEstRows="3831.48" StatementOptmLevel="TRIVIAL"&gt;</span></pre>
<p>From it you can see the literal varchar value was extracted as @1 with a type of varchar(8000) and the query altered to use this parameter &#8211; this is exactly the behaviour we would expect from simple parameterization.</p>
<p>Next step is to create a non-clustered index on the varchar &#8216;somefield&#8217; &#8211; it is completely unrelated to the date literal being used, so it should have no impact on the query at all.</p>
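<p>For reference, a plan cache extraction query of the kind mentioned above could look like the following. This is a minimal sketch and not necessarily the exact query used here: it relies on the standard dynamic management views, and the LIKE filters are just an assumption to narrow the output down to the test statement. A statement that was parameterized would be expected to show up with an objtype of Prepared.</p>
<pre>SELECT cp.usecounts, cp.objtype, st.text, qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
-- keep only plans that touch the test table, and skip this query itself
WHERE st.text LIKE '%paramtest%'
  AND st.text NOT LIKE '%dm_exec_cached_plans%'</pre>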
<pre><span style="color:#0000ff;">CREATE </span><span style="color:#0000ff;">NONCLUSTERED</span> <span style="color:#0000ff;">INDEX</span> [ix_test] <span style="color:#0000ff;">ON</span> [dbo].[paramtest] ([somefield] <span style="color:#0000ff;">ASC</span>
)<span style="color:#0000ff;">WITH</span> (SORT_IN_TEMPDB = <span style="color:#0000ff;">OFF</span>, DROP_EXISTING = <span style="color:#0000ff;">OFF</span>, IGNORE_DUP_KEY = <span style="color:#0000ff;">OFF</span>, ONLINE = <span style="color:#0000ff;">OFF</span>) <span style="color:#0000ff;">ON</span> [PRIMARY]</pre>
<p>Free the procedure cache up again and rerun the query:</p>
<pre><span style="color:#0000ff;">dbcc</span> freeproccache
<span style="color:#0000ff;">select </span>* <span style="color:#0000ff;">from</span> paramtest <span style="color:#0000ff;">where</span> somedate &gt; <span style="color:#ff0000;">'2009-11-12 21:14:50.000'</span></pre>
<p>Extract the query plan again from the cache, and this time it is noticeably different: the parameterisation has not occurred. The literal has been kept in the statement text:</p>
<pre><span style="color:#ff0000;">&lt;StmtSimple StatementText="select * from paramtest  where somedate &gt; '2009-11-12 21:14:50.000'" StatementId="1" StatementCompId="1" StatementType="SELECT" /&gt;</span></pre>
<p>To revert to the old plan, drop the index:</p>
<pre><span style="color:#0000ff;">DROP INDEX</span> [ix_test] <span style="color:#0000ff;">ON</span> [dbo].[paramtest] <span style="color:#0000ff;">WITH</span> (ONLINE = <span style="color:#0000ff;">OFF</span>)</pre>
<p>Then clear the cache again and run the query:</p>
<pre><span style="color:#0000ff;">dbcc</span> freeproccache
<span style="color:#0000ff;">select</span> * <span style="color:#0000ff;">from</span> paramtest <span style="color:#0000ff;">where</span> somedate &gt; <span style="color:#ff0000;">'2009-11-12 21:14:50.000'</span></pre>
<p>And we are back to being parameterized.</p>
<p>So the application of a single non-clustered index, on a separate field to the one being queried, is preventing simple parameterization from parameterizing the date literal &#8211; this makes absolutely no sense, and you can play around with it a lot more knowing what is causing the effect on the query plan. Even placing the additional non-clustered index on the identity field, which already has a clustered index, results in the parameterization failing. If this behaviour is by design, then it makes for an interesting design or limitation on the parameterization.</p>
<p>As soon as the database was in &#8216;Forced&#8217; parameterization mode, the literal was converted each time, so this looks specific to simple mode, but it is not explainable, just demonstrable.</p>
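<p>For anyone wanting to repeat the comparison, switching modes is a single database option. A minimal sketch, assuming the test database is called ParamTestDb (a made-up name, substitute your own):</p>
<pre>-- ParamTestDb is a hypothetical database name
ALTER DATABASE [ParamTestDb] SET PARAMETERIZATION FORCED
GO
-- is_parameterization_forced: 1 = forced, 0 = simple
SELECT name, is_parameterization_forced FROM sys.databases WHERE name = 'ParamTestDb'
GO
-- and back to the default once finished
ALTER DATABASE [ParamTestDb] SET PARAMETERIZATION SIMPLE
GO</pre>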
<br />Posted in SQL Server Tagged: Indexes, Query Parameterisation, SQL Server 2005, SQL Server 2008]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/11/12/inconsistent-date-literal-parameterization-behaviour/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Can a Covering NC Index be Tipped?</title>
		<link>http://sqlfascination.com/2009/11/07/can-a-covering-nc-index-be-tipped/</link>
		<comments>http://sqlfascination.com/2009/11/07/can-a-covering-nc-index-be-tipped/#comments</comments>
		<pubDate>Sat, 07 Nov 2009 17:56:03 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Indexes]]></category>
		<category><![CDATA[RandomString]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[Tipping Point]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=198</guid>
		<description><![CDATA[Non-clustered indexes normally have a &#8216;tipping point&#8217;, which is the point at which the query engine decides to change strategy: it stops seeking the index and using a nested loop operator to look up the rows in the underlying table, and chooses instead to just scan the underlying table and ignore the index. Kimberly Tripp wrote a great article [...]]]></description>
			<content:encoded><![CDATA[<p>Non-clustered indexes normally have a &#8216;tipping point&#8217;, which is the point at which the query engine decides to change strategy: it stops seeking the index and using a nested loop operator to look up the rows in the underlying table, and chooses instead to just scan the underlying table and ignore the index. <a href="http://www.sqlskills.com/blogs/kimberly/">Kimberly Tripp</a> wrote a great article about <a href="http://www.sqlskills.com/BLOGS/KIMBERLY/category/The-Tipping-Point.aspx">&#8216;The Tipping Point&#8217;</a>, and the guidance is that at about the 25-33% mark the query engine will change strategies.</p>
<p>If the non-clustered index is a covering index (it contains all the fields within the query) the query engine does not take the same decision &#8211; it makes sense that if any change in strategy occurs, it would have to be at a far higher figure, and as we are about to see, it will not take that decision and tip.</p>
<p>To test what strategy the engine would use, I created a test situation of 2 separate tables with different page counts: the padding column forces the second table to use far more pages (5953 pages vs 9233).</p>
<pre><span style="color:#0000ff;">CREATE TABLE</span> [dbo].[tblIxTest1]( [PersonnelID] [int] <span style="color:#0000ff;">IDENTITY</span>(1,1)<span style="color:#808080;"> NOT NULL</span>, [FirstName] [char](30) <span style="color:#808080;">NULL</span>, [LastName] [char](30) <span style="color:#808080;">NULL</span>,
   [Department] [char](30) <span style="color:#808080;">NULL</span>, [SomePadding] [char](10) <span style="color:#808080;">NULL</span>
) <span style="color:#0000ff;">ON </span>[PRIMARY]</pre>
<p>And,</p>
<pre><span style="color:#0000ff;">CREATE TABLE</span> [dbo].[tblIxTest2]( [PersonnelID] [int] <span style="color:#0000ff;">IDENTITY</span>(1,1) <span style="color:#808080;">NOT NULL</span>, [FirstName] [char](30) <span style="color:#808080;">NULL</span>, [LastName] [char](30) <span style="color:#808080;">NULL</span>,
   [Department] [char](30) <span style="color:#808080;">NULL</span>, [SomePadding] [char](1000) <span style="color:#808080;">NULL</span>
) <span style="color:#0000ff;">ON</span> [PRIMARY]</pre>
<p>Next step was to insert some data. I needed random data to ensure the index was not unbalanced in some way, so I broke out my useful little random string generation function. I should mention how to create this: a SQL function does not directly support a Rand() call within it, and any attempt to include one results in the error:</p>
<pre><span style="color:#ff0000;">Msg 443, Level 16, State 1, Procedure test, Line 13
Invalid use of a side-effecting operator 'rand' within a function.</span></pre>
<p>However, there is nothing stopping a view from using this, and the function from using the view to get around the limitation: </p>
<pre><span style="color:#0000ff;">Create View</span> [dbo].[RandomHelper] <span style="color:#0000ff;">as Select</span> <span style="color:#ff00ff;">Rand</span>() <span style="color:#0000ff;">as</span> r</pre>
<p>The function can then be created to use this view. It is not necessarily the most efficient random string generation function, but it works nicely.</p>
<pre><span style="color:#0000ff;">CREATE FUNCTION</span> [dbo].[RandomString] (@Length <span style="color:#0000ff;">int</span>) <span style="color:#0000ff;">RETURNS varchar</span>(100)
<span style="color:#0000ff;">WITH EXECUTE AS CALLER
AS
BEGIN
</span>  <span style="color:#0000ff;">DECLARE</span> @Result <span style="color:#0000ff;">Varchar</span>(100)
  <span style="color:#0000ff;">SET</span> @Result = <span style="color:#ff0000;">''
</span>  <span style="color:#0000ff;">DECLARE</span> @Counter <span style="color:#0000ff;">int</span>
  <span style="color:#0000ff;">SET</span> @Counter = 0
  <span style="color:#0000ff;">WHILE</span> @Counter &lt; @Length
  <span style="color:#0000ff;">BEGIN</span>
     <span style="color:#0000ff;">SET </span>@Result = @Result + <span style="color:#0000ff;">Char</span>(<span style="color:#ff00ff;">Ceiling</span>((<span style="color:#0000ff;">select</span> R <span style="color:#0000ff;">from</span> randomhelper) * 26) + 64)       
     <span style="color:#0000ff;">SET </span>@Counter = @Counter + 1
  <span style="color:#0000ff;">END</span>
  <span style="color:#0000ff;">RETURN</span>(@Result)
<span style="color:#0000ff;">END</span></pre>
<p>This now allows me to generate random data and insert it into the tables to get a nice data distribution, and this was run for both of the tables.</p>
<pre><span style="color:#0000ff;">insert into </span>tblIxTest1 <span style="color:#0000ff;">values</span> (dbo.RandomString(20),dbo.RandomString(20),dbo.RandomString(20),<span style="color:#ff0000;">''</span>)
<span style="color:#0000ff;">go </span>1000000</pre>
<p>Two NC indexes are now needed, one for each table. Both are identical and cover just the FirstName and PersonnelID fields within the table.</p>
<pre><span style="color:#0000ff;">CREATE NONCLUSTERED INDEX</span> [IX_Test1] <span style="color:#0000ff;">ON</span> [dbo].[tblIxTest1] ( [FirstName] <span style="color:#0000ff;">ASC</span>, [PersonnelID] <span style="color:#0000ff;">ASC</span>
)<span style="color:#0000ff;">WITH</span> (<span style="color:#0000ff;">STATISTICS_NORECOMPUTE</span>  = <span style="color:#0000ff;">OFF</span>, <span style="color:#0000ff;">SORT_IN_TEMPDB</span> = <span style="color:#0000ff;">OFF</span>, <span style="color:#0000ff;">IGNORE_DUP_KEY</span> = <span style="color:#0000ff;">OFF</span>, <span style="color:#0000ff;">DROP_EXISTING</span> = <span style="color:#0000ff;">OFF</span>, <span style="color:#0000ff;">ONLINE</span> = <span style="color:#0000ff;">OFF</span>, <span style="color:#0000ff;">ALLOW_ROW_LOCKS</span>  = <span style="color:#0000ff;">ON</span>, <span style="color:#0000ff;">ALLOW_PAGE_LOCKS</span>  = <span style="color:#0000ff;">ON</span>) <span style="color:#0000ff;">ON</span> [PRIMARY]
GO
<span style="color:#0000ff;">CREATE NONCLUSTERED INDEX</span> [IX_Test2] <span style="color:#0000ff;">ON</span> [dbo].[tblIxTest2] ( [FirstName] <span style="color:#0000ff;">ASC</span>, [PersonnelID] <span style="color:#0000ff;">ASC</span>
)<span style="color:#0000ff;">WITH</span> (<span style="color:#0000ff;">STATISTICS_NORECOMPUTE</span>  = <span style="color:#0000ff;">OFF</span>, <span style="color:#0000ff;">SORT_IN_TEMPDB</span> = <span style="color:#0000ff;">OFF</span>, <span style="color:#0000ff;">IGNORE_DUP_KEY</span> = <span style="color:#0000ff;">OFF</span>, <span style="color:#0000ff;">DROP_EXISTING</span> = <span style="color:#0000ff;">OFF</span>, <span style="color:#0000ff;">ONLINE</span> = <span style="color:#0000ff;">OFF</span>, <span style="color:#0000ff;">ALLOW_ROW_LOCKS</span>  = <span style="color:#0000ff;">ON</span>,<span style="color:#0000ff;"> ALLOW_PAGE_LOCKS</span>  = <span style="color:#0000ff;">ON</span>) <span style="color:#0000ff;">ON</span> [PRIMARY]
GO</pre>
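<p>To check the page counts for the tables and their indexes, something like the following can be used. It is only a sketch against sys.dm_db_partition_stats: an index_id of 0 is the heap itself, and the non-clustered indexes have an index_id of 2 or higher.</p>
<pre>SELECT OBJECT_NAME(ps.object_id) AS table_name, i.name AS index_name, ps.index_id, ps.used_page_count, ps.row_count
FROM sys.dm_db_partition_stats AS ps
JOIN sys.indexes AS i ON i.object_id = ps.object_id AND i.index_id = ps.index_id
-- limit the output to the two test tables
WHERE ps.object_id IN (OBJECT_ID('dbo.tblIxTest1'), OBJECT_ID('dbo.tblIxTest2'))
ORDER BY table_name, ps.index_id</pre>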
<p>The setup is complete, and it is now pretty easy to show that the NC covering index is not going to tip. The most extreme where clause is one that allows every record to be returned:</p>
<pre><span style="color:#0000ff;">select</span> personnelid , firstname <span style="color:#0000ff;">from</span> tblixtest1 <span style="color:#0000ff;">where</span> firstname &gt;= 'a' <span style="color:#0000ff;">and</span> firstname &lt;= <span style="color:#ff0000;">'zzzzzzzzzzzzzzzzzzzzz'</span></pre>
<p>This still produces a query plan with a seek strategy, regardless of which of my two tables it was executed on:</p>
<pre>select personnelid , firstname from tblixtest1  where firstname &gt;= 'a' and firstname &lt;= 'zzzzzzzzzzzzzzzzzzzzz'  
    |--Index Seek(OBJECT:([FilteredIndexTest].[dbo].[tblIxTest1].[IX_Test1]), SEEK:([FilteredIndexTest].[dbo].[tblIxTest1].[FirstName] &gt;= [@1] AND [FilteredIndexTest].[dbo].[tblIxTest1].[FirstName] &lt;= [@2]) ORDERED FORWARD)</pre>
<p>If we just select the entire table, unsurprisingly at that point it chooses to perform an index scan.</p>
<pre><span style="color:#0000ff;">select</span> personnelid , firstname <span style="color:#0000ff;">from </span>tblixtest1</pre>
<p>Results in the following plan:</p>
<pre>select personnelid , firstname from tblixtest1   |--Index Scan(OBJECT:([FilteredIndexTest].[dbo].[tblIxTest1].[IX_Test1])) </pre>
<p>The row counts on both queries were identical at 1 million. Slightly more interesting is that if I use a Like clause instead of a direct string evaluation, the behaviour alters slightly when selecting all the values:</p>
<pre><span style="color:#0000ff;">select </span>personnelid , firstname <span style="color:#0000ff;">from </span>tblixtest1 <span style="color:#0000ff;">where</span> firstname like '<span style="color:#ff0000;">[a-z]%</span>'</pre>
<p>Gives the query plan:</p>
<pre>select personnelid , firstname from tblixtest1  where firstname like '[a-z]%'  
   |--Index Scan(OBJECT:([FilteredIndexTest].[dbo].[tblIxTest1].[IX_Test1]),  WHERE:([FilteredIndexTest].[dbo].[tblIxTest1].[FirstName] like '[a-z]%'))</pre>
<p>So the query engine is potentially making an optimisation that it knows the like clause covers 100% and adopts an automatic scan, but it is not really very clear why it has this optimisation path. If the like clause changes to [a-y] then it reverts back to a seek, so it looks specific to covering all the values within the like statement. If a between statement is used, it remains a seek regardless.</p>
<p>So the result is that a non-clustered covering index is very unlikely to tip. You either have to not give it a where clause, or use a like statement across all the values available; otherwise it will steadfastly refuse to scan and will choose the seek.</p>
<p>Why?</p>
<p>Well, the I/O cost of the operation remains the same: it has to read every page in the index, and it considers the cost of traversing the B-Tree negligible, so the difference between seek and scan is not very great. Running the seek based query and scan based query in the same batch, the relative percentages are 48% vs 52% &#8211; that is the scan scoring 52% even though they read the same number of rows.</p>
<p>Outputting the IO statistics when they are run side by side shows the same number of pages being read, but the seek is clearly being favoured and is slightly cheaper as far as SQL Server is concerned &#8211; it is quite weird to consider that a seek of an entire index is more efficient than a scan of the index.</p>
<pre>(1000000 row(s) affected)
Table 'tblIxTest1'. Scan count 1, logical reads 5995, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
(1000000 row(s) affected)
Table 'tblIxTest1'. Scan count 1, logical reads 5995, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.</pre>
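<p>The figures above come from the IO statistics output. A sketch of how the two statements can be run side by side to reproduce it, using the same queries as earlier:</p>
<pre>SET STATISTICS IO ON
GO
-- seek across the whole covering index
select personnelid , firstname from tblixtest1 where firstname &gt;= 'a' and firstname &lt;= 'zzzzzzzzzzzzzzzzzzzzz'
-- scan of the same index
select personnelid , firstname from tblixtest1
GO
SET STATISTICS IO OFF
GO</pre>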
<p>So if you come across a covering index in a query plan that is scanning, it would be worth investigating whether that is intended. It is more likely that the index field order does not support the predicates being used than that the engine has chosen to tip the index, as it would for non-covering non-clustered indexes.</p>
<br />Posted in SQL Server Tagged: Indexes, RandomString, SQL Server 2005, SQL Server 2008, Tipping Point]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/11/07/can-a-covering-nc-index-be-tipped/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>When is Bulk Logged Mode Not What it Says?</title>
		<link>http://sqlfascination.com/2009/11/06/when-is-bulk-logged-mode-not-what-it-says/</link>
		<comments>http://sqlfascination.com/2009/11/06/when-is-bulk-logged-mode-not-what-it-says/#comments</comments>
		<pubDate>Fri, 06 Nov 2009 00:01:33 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Backups]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=193</guid>
		<description><![CDATA[I had somewhat of a &#8216;doh&#8217; moment when dealing with a SQL Server in bulk logged mode: the transaction log was behaving very weirdly and growing very large for smaller index operations. When performing an additional operation, the actual percentage of log space used went down, and it was a repeatable test. At no point would you [...]]]></description>
			<content:encoded><![CDATA[<p>I had somewhat of a &#8216;doh&#8217; moment when dealing with a SQL Server in bulk logged mode: the transaction log was behaving very weirdly and growing very large for smaller index operations. When performing an additional operation, the actual percentage of log space used went down, and it was a repeatable test. At no point would you expect an operation to actually result in the log space showing as less space used. So something was up with the scenario, and I was asked to check it.</p>
<p>In bulk logged mode index creations are classed as a minimally logged operation, whether online or offline. A drop index is a bit of a mixture: the index page deallocation is fully logged, but the heap rebuild is listed as a minimally logged operation. There was no obvious reason why the database was behaving as it was.</p>
<p>The steps to recreate this situation on your own computer / server are rather simple: create a database for testing purposes and preallocate the log space as 200 Meg. I set the database to bulk logged mode and then created a basic table:</p>
<pre><span style="color:#0000ff;">create table</span> [tblOnlineIndexTest]
(
  id <span style="color:#0000ff;">int</span>,
  padding <span style="color:#0000ff;">char</span>(500)
)</pre>
<p>The table does not have a primary key and the id values will not be unique, so the clustered index created later will need a uniquifier, which we are not worried about. To create a number of rows, I simply ran the following SQL to generate some data, just over 60 meg of it.</p>
<pre><span style="color:#0000ff;">insert into</span> tblOnlineIndexTest (id, padding) <span style="color:#0000ff;">values</span> (1, <span style="color:#ff00ff;">REPLICATE</span>(<span style="color:#ff0000;">'a'</span>,100))
go 100000</pre>
<p>To check the log space, I used the DBCC command:</p>
<pre><span style="color:#0000ff;">dbcc </span>sqlperf(logspace)</pre>
<p>26.28% of the log space was in use. The next step was the creation of a clustered index on the table:</p>
<pre><span style="color:#0000ff;">create clustered index</span> [PK_tblOnlineIndexTest] on [tblOnlineIndexTest] (id)
WITH (PAD_INDEX  = <span style="color:#0000ff;">OFF</span>, STATISTICS_NORECOMPUTE  = <span style="color:#0000ff;">OFF</span>, SORT_IN_TEMPDB = <span style="color:#0000ff;">OFF</span>,
IGNORE_DUP_KEY = <span style="color:#0000ff;">OFF</span>, ONLINE = <span style="color:#0000ff;">ON</span>, ALLOW_ROW_LOCKS  = <span style="color:#0000ff;">ON</span>, ALLOW_PAGE_LOCKS  = <span style="color:#0000ff;">ON</span>) <span style="color:#0000ff;">ON</span> [primary]</pre>
<p>The log was checked again and the space used was 21.34%. I twigged the underlying issue at this point, but continued to test a bit more to be sure.</p>
<p>Next I dropped the index and the log space jumped up to 53.27% of the total log space, and the log had grown in size, even though it was not at 100% of the space. The log has to reserve undo space in advance of the transaction to make sure it can always record an undo, so that a rollback does not get stopped by the transaction log being full / having no space left.</p>
<p>Last step was to recreate the same index, and this time the log space dropped to 2.23%, but the actual log file had grown by 66 Meg at this point. So the log got larger, but the actual amount of data in it was smaller.</p>
<p>The problem was a simple one: the database was not actually in bulk logged mode. Just as with a database set to &#8216;full&#8217; logging, until the first full database backup is taken the database still behaves as if it were in the simple logging mode. A transaction log has to be replayed against a starting position of a full backup, so the lack of a full backup on the database automatically prevents the bulk logging mode from actually being active.</p>
<p>When I was presented with the initial problem to look at, I did not ask whether the initial database backup had been taken &#8211; I assumed it had been, which was clearly a bad assumption. Whilst the SQL Server identified the database as being in bulk logged mode, the reality was that it was behaving as if in simple mode, so the transaction log was automatically clearing down committed transactions and reusing the space. Just because the database says it is in bulk logged or full mode does not mean it actually is yet.</p>
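<p>One way to spot this situation is sketched below, on the assumption that the test database is called LogTest and that C:\Backups exists (both are made up for the example). A NULL last_log_backup_lsn in sys.database_recovery_status indicates that no log backup chain has been started yet, whatever recovery model the database reports.</p>
<pre>-- LogTest and the backup path are hypothetical
SELECT d.name, d.recovery_model_desc, rs.last_log_backup_lsn
FROM sys.databases AS d
JOIN sys.database_recovery_status AS rs ON rs.database_id = d.database_id
WHERE d.name = 'LogTest'
GO
-- taking the first full backup starts the log chain
BACKUP DATABASE [LogTest] TO DISK = N'C:\Backups\LogTest_full.bak'
GO</pre>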
<br />Posted in SQL Server Tagged: Backups, SQL Server 2005, SQL Server 2008]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/11/06/when-is-bulk-logged-mode-not-what-it-says/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Simple vs Forced &#8211; Query Parameterization</title>
		<link>http://sqlfascination.com/2009/10/31/simple-vs-forced-query-parameterization/</link>
		<comments>http://sqlfascination.com/2009/10/31/simple-vs-forced-query-parameterization/#comments</comments>
		<pubDate>Sat, 31 Oct 2009 23:14:54 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Best Practise]]></category>
		<category><![CDATA[Query Parameterisation]]></category>
		<category><![CDATA[SQL Server 2005]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=183</guid>
		<description><![CDATA[On the drive back from a relaxing week off I decided to write about Query Parameterization &#8211; this is the process where the query optimizer works on ad-hoc queries and chooses to take literal values used within a query predicate, and turn them into parameters on the fly. This optimised query is then [...]]]></description>
			<content:encoded><![CDATA[<p>On the drive back from a relaxing week off I decided to write about Query Parameterization &#8211; this is the process where the query optimizer works on ad-hoc queries and chooses to take literal values used within a query predicate, and turn them into parameters on the fly.</p>
<p>This optimised query is then the one used to check the query cache, and not the original query string, so it can result in a far higher percentage of query plan cache hits than would otherwise occur on ad-hoc queries.</p>
<p> There are two modes to this feature, and neither of them is &#8216;off&#8217;, you automatically have &#8216;simple&#8217; parameterization turned on, and can increase this to &#8216;forced&#8217; mode for your database if you desire. The difference in the two modes is that the simple mode will only deal with relatively simple literal values within the predicates, and will not consider every literal a candidate for parameterization.</p>
<p>So, overall it sounds like a good thing, but there is a known problem with parameterization, closely related to what is usually called parameter sniffing &#8211; the query plan generated for one value of the parameter is not necessarily the optimal plan for another. Consider a search on a personnel database where the EmployeeAge is between 95 and 99. Assuming an index is in place, you can expect the number of matching rows to be low to nil, so the index will not tip and an index seek is the expected behaviour.</p>
<p>If the next time the query is executed the range is 20 to 60, the number of people matching would be very high, yet the query plan in cache has already fixed the method by which the query will be executed, and it is no longer optimal.</p>
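<p>The same effect can be sketched with an explicitly parameterized query. The table dbo.Employee, its EmployeeAge column and the index on it are hypothetical, made up purely for illustration:</p>
<pre>-- First execution: the plan is compiled for a very selective range,
-- so an index seek is the likely choice.
EXEC sp_executesql
    N'SELECT * FROM dbo.Employee WHERE EmployeeAge BETWEEN @lo AND @hi',
    N'@lo int, @hi int',
    @lo = 95, @hi = 99;

-- Second execution: same statement text, so the cached plan is reused,
-- even though this range matches most of the table.
EXEC sp_executesql
    N'SELECT * FROM dbo.Employee WHERE EmployeeAge BETWEEN @lo AND @hi',
    N'@lo int, @hi int',
    @lo = 20, @hi = 60;</pre>
<p>Running the two calls in the opposite order, against an empty plan cache, would be expected to cache the plan suited to the wide range instead.</p>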
<p>From a user&#8217;s perspective this will lead to what appears to be random / non-deterministic behaviour &#8211; sometimes the query will be very fast and other times it will be slow. There will not necessarily be a pattern to the behaviour, because the plan might be removed from cache under memory pressure, or invalidated for a wide variety of other reasons, the simplest being a change in the data statistics.</p>
<p>So with the knowledge that parameter sniffing can cause issues, when should you consider investing the time into testing this setting? The time and effort to test could be considerable so I have written how I would come to that decision.</p>
<p>The first thing I check is the CPU utilisation on the SQL Server. There should be some kind of issue forcing this setting to be considered, and one of the issues that can occur from insufficient parameterization is an increase in CPU utilisation due to a high query compilation rate. There are a lot of other causes of high CPU, such as incorrect indexing leading to table scans, but for the purpose of this explanation we can assume those have already been checked and were not enough to explain it.</p>
<p>The performance counters I would check next are:</p>
<pre>SQLServer : SQL Statistics : Auto-Param Attempts/sec
SQLServer : SQL Statistics : Batch Requests/sec</pre>
<p>I could also check the plan cache hit ratio, or the SQL Compilations/sec figure, but having the Auto-Param Attempts and Batch Requests figures per second allows you to do a rough calculation of the ratio of queries that auto-parameterization is being attempted on (successfully or not) to incoming batches. The higher the ratio, the more ad-hoc queries are already being affected by the &#8216;simple&#8217; mode parameterization.</p>
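<p>If you would rather not open Performance Monitor, the same two counters can be sampled from T-SQL. This is only a sketch: the per-second counters in sys.dm_os_performance_counters are cumulative, so it takes two samples and divides the difference by the interval, and the 10 second interval is an arbitrary choice.</p>
<pre>DECLARE @before TABLE (counter_name nvarchar(128), cntr_value bigint);

-- First sample of the two cumulative counters
INSERT INTO @before (counter_name, cntr_value)
SELECT RTRIM(counter_name), cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%SQL Statistics%'
  AND counter_name IN ('Auto-Param Attempts/sec', 'Batch Requests/sec');

WAITFOR DELAY '00:00:10';

-- Second sample; the difference over 10 seconds gives the rate
SELECT RTRIM(a.counter_name) AS counter_name,
       (a.cntr_value - b.cntr_value) / 10.0 AS per_second
FROM sys.dm_os_performance_counters AS a
JOIN @before AS b
  ON b.counter_name = RTRIM(a.counter_name)
WHERE a.object_name LIKE '%SQL Statistics%';</pre>
<p>Dividing the Auto-Param Attempts figure by the Batch Requests figure gives the rough ratio described above.</p>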
<p>If I see a high ratio, the next check would be on the plan cache statistics, primarily:</p>
<pre>SQL Server : Plan Cache : Cache Object Counts for SQL Plans
SQL Server : Plan Cache : Cache Pages for SQL Plans</pre>
<p>This is to try to gauge how many plans are sitting in the cache and whether the cache is under memory pressure. If you have a high number of SQL plan objects / cached pages then you can estimate whether you are under memory pressure. You can also look at the buffer manager performance counter for the page life expectancy value, but this could be a bit misleading since data pages will also be affecting it.</p>
<p>At this point I would start pulling the query plan cache and checking the query text for a sample of plans, to see whether a high number of literals remain within the submitted queries, since these are the ones that simple parameterisation failed to convert to parameters. Sorting the cached plans in SQL text order allows you to easily spot when queries that are near identical except for unconverted literals have been submitted but each resulted in a separate query plan.</p>
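<p>One way to pull the plan cache in SQL text order is sketched below, using the plan cache DMVs available from SQL Server 2005 onwards. The filter on object type is my own choice, to keep stored procedures out of the list:</p>
<pre>SELECT st.text,
       cp.objtype,
       cp.usecounts,
       cp.size_in_bytes
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype IN ('Adhoc', 'Prepared')
ORDER BY st.text;</pre>
<p>Runs of adjacent rows whose text differs only in a literal value, each with a usecounts of 1, are the candidates that forced parameterization would be expected to collapse into a single plan.</p>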
<p>There is no prescriptive guidance on how many of these near identical queries it takes before you consider it, but clearly a single duplicate is not an issue, whilst thousands of them could be quite an issue and be resulting in a high number of unnecessary SQL plan compilations, which is increasing the CPU utilisation of the server.</p>
<p>As you can tell, this setting is certainly not one that should be changed without considerable forethought and thorough testing to ensure the situation you have is actually worth the switch, preferably with some well-instrumented testing in a realistic test environment to ensure the benefits you are getting from the increased level of parameterisation are not being lost to occurrences of non-optimal plan cache hits.</p>
]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/10/31/simple-vs-forced-query-parameterization/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>The Sequence of an Index Uniquifier</title>
		<link>http://sqlfascination.com/2009/10/20/the-sequence-of-an-index-uniquifier/</link>
		<comments>http://sqlfascination.com/2009/10/20/the-sequence-of-an-index-uniquifier/#comments</comments>
		<pubDate>Tue, 20 Oct 2009 20:46:29 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[Uniquifier]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=174</guid>
		<description><![CDATA[During a training session today, I was asked about the structure of the Uniquifier, and whether it was a straight identity column. Off the top of my head I couldn&#8217;t remember the exact structure, I considered it a 4 byte int, but was not sure whether it acted as a pure identity value or acted in a [...]]]></description>
			<content:encoded><![CDATA[<p>During a training session today, I was asked about the structure of the Uniquifier, and whether it was a straight identity column. Off the top of my head I couldn&#8217;t remember the exact structure. I considered it a 4 byte int, but was not sure whether it acted as a pure identity value or in a more complex manner when incrementing, so I decided to investigate it tonight.</p>
<p>To start from the beginning, an index uniquifier is the term given to the field that is automatically generated by SQL Server when you create a clustered index but the index key is not specified as unique. Since each record in the table has to be uniquely identifiable, SQL will automatically assign a 4 byte field to the row to make it unique, commonly called the &#8216;Uniquifier&#8217;. At this point I am sure English scholars will be frowning, pondering the nature of the word and whether it qualifies as English; however, that is the term used, so we will run with it.</p>
<p>It is actually quite easy to see this field in action, let&#8217;s create a simple table:</p>
<pre><span style="color:#0000ff;">CREATE TABLE</span> dbo.unique_test  (  
firstname char(20) NOT NULL,  
surname char(20) NOT NULL  
)  <span style="color:#0000ff;">ON</span> [PRIMARY]
GO
<span style="color:#0000ff;">CREATE CLUSTERED INDEX</span> [ix_test] <span style="color:#0000ff;">ON</span> [dbo].[unique_test]
 (
[firstname] ASC
)
<span style="color:#0000ff;">ON</span> [PRIMARY]</pre>
<p>The clustered index is not unique, by design, so let&#8217;s start adding duplicate rows to see the effect:</p>
<pre><span style="color:#0000ff;">insert into</span> unique_test <span style="color:#0000ff;">values</span> ('John', 'Smith')
go 10</pre>
<p>The table now contains 10 rows, each with the same details. This does not cause any undue concern, because each row is actually still unique &#8211; the way to show this is using the DBCC IND and DBCC PAGE commands. I&#8217;ve cut the output down since it is so wide.</p>
<pre>dbcc ind ('testdb','unique_test',1)
PageFID PagePID     IAMFID IAMPID      PageType
------- ----------- ------ ----------- --------
1       41          NULL   NULL        10      
1       174         1      41          1       </pre>
<p>The output shows a data page numbered 174 for my example and the IAM page with an ID of 41. We can crack open the page and view the contents very easily using DBCC Page.</p>
<pre>dbcc traceon(3604)
dbcc page ('testdb',1,174,3)</pre>
<p>The output is quite large, but in essence, the first record is stored with the following details: </p>
<pre>UNIQUIFIER = [NULL]                 
Slot 0 Column 1 Offset 0x4 Length 20
firstname = John                    
Slot 0 Column 2 Offset 0x18 Length 20</pre>
<p> The second record:</p>
<pre>Slot 1 Column 0 Offset 0x33 Length 4
UNIQUIFIER = 1                      
Slot 1 Column 1 Offset 0x4 Length 20
firstname = John                    
Slot 1 Column 2 Offset 0x18 Length 20
surname = Smith  </pre>
<p> The third record:</p>
<pre>Slot 2 Column 0 Offset 0x33 Length 4
UNIQUIFIER = 2                      
Slot 2 Column 1 Offset 0x4 Length 20
firstname = John                    
Slot 2 Column 2 Offset 0x18 Length 20
surname = Smith                     </pre>
<p>And so forth. The first record&#8217;s uniquifier is visible and clearly named within the data page, but set to null. The second copy of the same value receives a uniquifier of 1, the third copy receives a 2, and so on. This count is maintained separately for each duplicated key, so inserting a new name multiple times will give it its own counter, beginning at null and working upwards: 1, 2, 3 etc. So although the uniquifier is 4 bytes, this does not limit the total number of rows in the table to ~2.1 billion, but it does logically limit the number of duplicates of any one key to ~2.1 billion. I must confess to not having tested that limit; generating 2.1 billion rows of duplicate data is not trivial, and a scrapbook calculation predicts 435 hours of processing on a virtual PC. I suspect the error message it raises when it hits the limit would be interesting.</p>
<p>If we remove all the rows from the table and then add 10 more, does the uniquifier reset? It is easy to test, and the short answer was no: the uniquifier continued to rise, 10 through 19.</p>
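<p>For anyone wanting to repeat the test, the steps were roughly as follows. Treat this as a sketch: the page number passed to DBCC PAGE is whatever DBCC IND reports on your own system (174 was simply the value on mine), and GO 10 relies on the batch repeat feature of Management Studio / sqlcmd.</p>
<pre>-- Remove every row, then add ten more duplicates
DELETE FROM dbo.unique_test
GO
INSERT INTO dbo.unique_test VALUES ('John', 'Smith')
GO 10

-- Find the data page again, then dump it to see the uniquifier values
DBCC IND ('testdb', 'unique_test', 1)
DBCC TRACEON (3604)
DBCC PAGE ('testdb', 1, 174, 3)</pre>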
<p>I was a bit suspicious of this, since any requirement for the uniquifier to rise / remember what existed before requires it to be stored somewhere &#8211; it has to survive a crash after all &#8211; but there is no apparent place the current count is stored. If there was, you wouldn&#8217;t be storing just 1 value, you would be forced to store a value for each record key that had duplicates. This could run into thousands of separate counters being maintained per clustered index, so it just doesn&#8217;t make sense that it is stored; it would be a very noticeable overhead.</p>
<p>When checking the DBCC IND output for the empty table it insisted it still had a data page, but the only content of the data page was a single ghost record &#8211; a row that has been marked as deleted. The ghost record was the one for the &#8216;John Smith&#8217; with the highest uniquifier before. Was this coincidence? The other ghost records had not hung around, so why did this one persist?</p>
<p>I dropped and recreated the table again, inserted 10 rows and then deleted them. Checking DBCC Ind the table still showed a HoBT IAM allocation page for the table and a data page, the data page contained a single ghost record, the one with a Uniquifier of 9 &#8211; the highest given out when 10 duplicates were added. Even waiting some considerable time the ghost record was not cleaned up, so it appears that it will not delete it.</p>
<p>If I added another duplicate row, it picked up the next number in the sequence (10) and shortly after the ghost record was removed from the page. Very convenient and not a coincidence at all &#8211; the memory of the last uniquifier given out  persists as a ghost record, even if all the duplicates for the table have been removed.  What seems strange is this ghost record hanging about, persisting an entire record, to keep the duplicate count for that key, when no instances of it remain on the table.</p>
<p>It cannot possibly do this for every key, since the space overhead would become very noticeable again, so how does it choose what to persist? The last entry? Unfortunately it doesn&#8217;t appear to be that simple. After a number of tests it appeared to only be interested in keeping the ghost entry for the record that had the highest key value, so alphabetically, the one closest to &#8216;Z&#8217; for my example.</p>
<p>Conclusion? On the evidence, whilst the other ghost records still persist, deleting and then adding more duplicates can see the number continue from where it left off, but once the ghost records have been cleaned up a short time later, the uniquifier restarts the sequence at Null, 1, 2 etc. The exception is the highest entry from the index perspective: that ghost record sticks around until there is another entry using the same key, which continues the sequence, at which point the ghost record finally disappears.</p>
<p>I can not think of any sensible reason why it would do this, can you?</p>
<p>Overall, the uniquifier is the cost of not having a unique index, and at a cost of 4 bytes, an int identity column makes a lot of sense. For all purposes it acts the same and serves the same purpose, but in a far more visible manner &#8211; so it really does not make much sense to rely on the uniquifier provided for you. Take control and create your own.</p>
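<p>A minimal sketch of that alternative, reusing the shape of the test table (the table and index names here are made up for the example):</p>
<pre>CREATE TABLE dbo.unique_test_identity (
id int IDENTITY(1,1) NOT NULL,
firstname char(20) NOT NULL,
surname char(20) NOT NULL
) ON [PRIMARY]
GO
-- The identity column makes the key unique,
-- so SQL Server has no need to add a hidden uniquifier.
CREATE UNIQUE CLUSTERED INDEX [ix_test_identity] ON [dbo].[unique_test_identity]
(
[firstname] ASC,
[id] ASC
)
ON [PRIMARY]</pre>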
]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/10/20/the-sequence-of-an-index-uniquifier/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Guidance on How to Layout a Partitioned Table across Filegroups</title>
		<link>http://sqlfascination.com/2009/10/15/guidance-on-how-to-layout-a-partitioned-table-across-filegroups/</link>
		<comments>http://sqlfascination.com/2009/10/15/guidance-on-how-to-layout-a-partitioned-table-across-filegroups/#comments</comments>
		<pubDate>Thu, 15 Oct 2009 22:06:19 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Best Practise]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[Table Partitioning]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=148</guid>
		<description><![CDATA[SQL Server&#8217;s Table Partitioning was one of my favourite features within the 2005 release. It really brought SQL into the mainstream when it came to holding very large data quantities and allowed us to talk with confidence about large tables containing &#8216;Billions&#8217; of rows and not be limited to the &#8216;Millions&#8217;. From extensive use and [...]]]></description>
			<content:encoded><![CDATA[<p>SQL Server&#8217;s Table Partitioning was one of my favourite features within the 2005 release. It really brought SQL into the mainstream when it came to holding very large data quantities and allowed us to talk with confidence about large tables containing &#8216;Billions&#8217; of rows and not be limited to the &#8216;Millions&#8217;. From extensive use and a couple of SQL Labs specifically surrounding the partitioning feature, there are some rules and best practises I use to try to maximise the benefit / flexibility of the partitioned table, without causing unnecessary drops in performance or increases in the disk space requirements.</p>
<p>Many examples of table partitions focus heavily on using date range based partition functions and schemes, with a common layout mechanism of striping the weeks / months across a number of file groups. The file groups are re-used, and in the example picture you can see 2 years worth of data striped across 12 filegroups. <img class="aligncenter size-full wp-image-152" title="InitialLayout" src="http://andrewhogg.files.wordpress.com/2009/10/initiallayout2.png?w=600&h=200" alt="InitialLayout" width="600" height="200" /></p>
<p>This is pretty common and has an alluring charm of simplicity, but it is going to hurt when you start rolling the window. The underlying problem is that there is no gap.</p>
<p>For a good number of systems it would be unacceptable to remove a month of data unless you had already inserted the new month successfully. So the system starts to roll a new month of data in and is required to use the same filegroups, and the layout transforms into the following:</p>
<p><img class="aligncenter size-full wp-image-153" title="AddedMonth" src="http://andrewhogg.files.wordpress.com/2009/10/addedmonth.png?w=600&h=260" alt="AddedMonth" width="600" height="260" /></p>
<p>The file group has to expand by 50% to accommodate the new data before the old data can be removed &#8211; and once the old data is removed the filegroups now look like:</p>
<p><img class="aligncenter size-full wp-image-155" title="AfterRemoval" src="http://andrewhogg.files.wordpress.com/2009/10/afterremoval.png?w=600&h=260" alt="AfterRemoval" width="600" height="260" /></p>
<p>So that 50% of space is now wasted unless you use a shrink, which is probably the worst thing you can do to your filegroup and data files at that point in time. Shrink can fragment the data to the extreme and is to be avoided at all costs, which means you will have to maintain a 50% space penalty for the layout on every filegroup. That might not sound a lot, but on a large database in an enterprise with mirrored SANs, that additional 50% is going to cost a substantial amount.</p>
<p>There are also other issues. SQL allows you to back up at a filegroup level, and since the bulk of the data is historic and will not alter, you are forced to re-back up historic data (Jan 08) when you back up the recently inserted Jan 09 data. So there is an impact on backup space, backup times and restore times.</p>
<p>The simplicity of the initial layout makes it seem like a good idea, but the side-effects are not pleasant. You can alter the layout and choose to have 6 filegroups, each storing 4 months of data, and then the expansion is only from 4 to 5, so a 25% overhead. It is better, but it is still a cost overhead. The ultimate extreme is to then place it all in one filegroup, but there are a number of difficulties and contention points with that.</p>
<p>A better approach is to use a different file group per month, but then also create an additional spare filegroup, so that no filegroup is forced to expand, as shown:<img class="aligncenter size-full wp-image-157" title="BetterLayout" src="http://andrewhogg.files.wordpress.com/2009/10/betterlayout1.png?w=600&h=129" alt="BetterLayout" width="600" height="129" /></p>
<p>The difference here is that we have one free filegroup that is not a part of the current partition scheme / function definition, but will be allocated as the &#8216;Next Used&#8217; filegroup for the partition scheme, so that when we split the partition function that filegroup is brought into the scheme and used. The old data that is going to be removed, will provide an empty filegroup that will be used for the next month&#8217;s import. In essence the head of the partition function is chasing the tail and will never catch up.</p>
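<p>In T-SQL terms, bringing the spare filegroup into play looks roughly like this. The partition function and scheme names (pf_monthly, ps_monthly), the filegroup name and the boundary date are all placeholders for the example rather than anything from a real system:</p>
<pre>-- Nominate the spare filegroup as the home of the next partition created
ALTER PARTITION SCHEME ps_monthly NEXT USED [FG25];

-- Splitting the function creates the new (empty) partition on that filegroup
ALTER PARTITION FUNCTION pf_monthly() SPLIT RANGE ('20100101');</pre>
<p>Because the split lands on an empty range, it should be a metadata-only operation rather than one that physically moves rows.</p>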
<p>The expansion of size for this is 1/n where n is the number of partitions to be stored, so for our 24 month example 1/24th expansion &#8211; a considerable saving. Even better is that you can choose to utilise file group backups for the older static months of data.</p>
<p>This though is not perfect for a couple of reasons, primarily data loading and indexing. To make the &#8216;scene&#8217; better there should be a further filegroup dedicated to the data staging / loading that is going to occur to bring the data into the partitioned table.<img class="aligncenter size-full wp-image-158" title="WithStaging" src="http://andrewhogg.files.wordpress.com/2009/10/withstaging.png?w=600&h=116" alt="WithStaging" width="600" height="116" /></p>
<p>The purpose of this is twofold:</p>
<ul>
<li>The disks used for loading are separate from the disks used for the actual data so that we can maintain Quality of Service on the main table.</li>
<li>The data wants to be loaded as a heap for speed, but is likely to need sorting prior to insertion into the main table; an in-place sort would expand one of the main file groups, which is to be avoided.</li>
</ul>
<p>By loading into a staging filegroup, you give yourself the opportunity to then use a clustered index to sort the data and move it to the appropriate file group, prior to being constrained and switched in to the main partitioned table. If you had loaded the data into FG25 and then tried to apply a clustered index, it would have doubled the size of the filegroup again, as it needs to write all the new rows out and commit them before the heap can be deleted. That would put you back at square one, wasting 50% of the disk space.</p>
<p>The staging filegroup does cost us another file group worth of space, so the 24 initial filegroups have grown to 26, which is still a smaller expansion than the potential 50%.</p>
<p>So some simple guidelines are:</p>
<ul>
<li>For a partition function that is going to have N partitions, create N+1 filegroups.</li>
<li>ETL / stage your data into a dedicated staging file group.</li>
<li>Move your data using a clustered index creation to the &#8216;spare&#8217; filegroup you always have.</li>
<li>Switch new data in, then switch old out, creating the new &#8216;spare&#8217;.</li>
</ul>
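<p>Put together, the monthly roll might be sketched as below. Everything named here is an assumption for illustration: a partitioned table dbo.sales on pf_monthly (assumed to be RANGE RIGHT on a NOT NULL sale_date column, with its first boundary at 1 Feb 2008), a heap dbo.sales_stage loaded on the staging filegroup, an empty table dbo.sales_old of identical structure on the oldest month&#8217;s filegroup, and a new partition for January 2010 already created on FG25. The index on the staging table has to match the clustered index of the main table for the switch to be allowed.</p>
<pre>-- Sort the staged heap and move it onto the spare filegroup in one step
CREATE CLUSTERED INDEX ix_sales_stage
ON dbo.sales_stage (sale_date)
ON [FG25];

-- Constrain the staged data to the range of the target partition
ALTER TABLE dbo.sales_stage WITH CHECK
ADD CONSTRAINT ck_sales_stage_range
CHECK (sale_date &gt;= '20100101' AND sale_date &lt; '20100201');

-- Switch the new month in (metadata only)
ALTER TABLE dbo.sales_stage
SWITCH TO dbo.sales PARTITION $PARTITION.pf_monthly('20100101');

-- Switch the oldest month out, then remove its boundary;
-- the filegroup it occupied becomes the new spare
ALTER TABLE dbo.sales
SWITCH PARTITION 1 TO dbo.sales_old;

ALTER PARTITION FUNCTION pf_monthly() MERGE RANGE ('20080201');</pre>
<p>The order matters: the new month is in place before the old one is removed, which is the requirement that caused the 50% expansion in the original layout.</p>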
<p>It is not entirely flawless however &#8211; there is a problem in using such a design if the quantity of data per partition is going to vary by a large amount; you then have to provision each filegroup with enough space to cope with the fluctuations, which in itself can result in wasted space and starts chipping away at the gains made.</p>
<p>It works best as a technique on datasets that remain relatively consistent in terms of the number of rows per partition and as the number of partitions goes up, the savings increase.</p>
]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/10/15/guidance-on-how-to-layout-a-partitioned-table-across-filegroups/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>

		<media:content url="http://andrewhogg.files.wordpress.com/2009/10/initiallayout2.png" medium="image">
			<media:title type="html">InitialLayout</media:title>
		</media:content>

		<media:content url="http://andrewhogg.files.wordpress.com/2009/10/addedmonth.png" medium="image">
			<media:title type="html">AddedMonth</media:title>
		</media:content>

		<media:content url="http://andrewhogg.files.wordpress.com/2009/10/afterremoval.png" medium="image">
			<media:title type="html">AfterRemoval</media:title>
		</media:content>

		<media:content url="http://andrewhogg.files.wordpress.com/2009/10/betterlayout1.png" medium="image">
			<media:title type="html">BetterLayout</media:title>
		</media:content>

		<media:content url="http://andrewhogg.files.wordpress.com/2009/10/withstaging.png" medium="image">
			<media:title type="html">WithStaging</media:title>
		</media:content>
	</item>
		<item>
		<title>What is the SQL Server 2008 DateTimeOffset Internal Structure?</title>
		<link>http://sqlfascination.com/2009/10/13/what-is-the-sql-server-2008-datetimeoffset-internal-structure/</link>
		<comments>http://sqlfascination.com/2009/10/13/what-is-the-sql-server-2008-datetimeoffset-internal-structure/#comments</comments>
		<pubDate>Tue, 13 Oct 2009 19:50:12 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[DateTimeOffset]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[SQL Server 2008]]></category>

		<guid isPermaLink="false">http://sqlfascination.com/?p=137</guid>
		<description><![CDATA[After decoding the DateTime2 internal structure I thought I would take a quick look at the DateTimeOffset structure, since it should not present too many difficulties and post it up quickly, but was surprised at the initial result of a test. It follows the same basic premise that the time portion is followed by the date portion [...]]]></description>
			<content:encoded><![CDATA[<p>After decoding the <a title="DateTime2 Internal Structure" href="http://sqlfascination.com/2009/10/11/what-is-the-sql-server-2008-datetime2-internal-structure/">DateTime2 internal structure</a> I thought I would take a quick look at the DateTimeOffset structure, since it should not present too many difficulties, and post it up quickly, but I was surprised at the initial result of a test. It follows the same basic premise: the time portion is followed by the date portion, the date is a day count from the epoch of 0001/01/01, and the time is the number of intervals since midnight, where the interval is defined by the accuracy.</p>
<p>I was expecting the datetime offset value itself to occupy the additional 2 bytes quoted, and not affect the other values. Bad assumption: as soon as I cracked open a few examples I could immediately see that setting the offset also alters the underlying time / date component values.</p>
<p>Using the same comparison methods as before the underlying time value is clearly being adjusted:</p>
<pre>'0001/01/01 00:00:00 -0:00' =&gt; 0x0700000000000000000000
'0001/01/01 00:00:00 -12:00' =&gt; 0x0700E034956400000030FD</pre>
<p>So, even though an offset is being stored, the underlying time is also being altered to match, and the internal storage is using UTC as the reference point. This makes sense as the most valid reference that you could use.</p>
<pre>'0001-01-01 12:00:00.0000000 +00:00' =&gt; 0x0700E03495640000000000
'0001-01-01 00:00:00.0000000 -12:00' =&gt; 0x0700E034956400000030FD</pre>
<p>The same time / date is generated for the two values, but the last two bytes hold the offset that was supplied when the value was initially stored. The underlying storage of the time though is clearly identical in both, so UTC is the common ground they each get stored against.</p>
<p>The final 2 bytes for the offset are pretty easy to decode, since the pattern is the same as before with a slight twist. The offset records the number of minutes in hex, with the first byte of the two being the least significant, as before with the time. In other words the two bytes have to be swapped before they are read as a number, so 30 FD is read as 0xFD30.</p>
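<p>The hex values in these examples can be reproduced with a double cast. This is a sketch of the approach rather than anything official, and it assumes the cast to varbinary exposes the storage bytes with the accuracy byte in front, which is what the outputs above suggest:</p>
<pre>SELECT CAST(CAST('0001-01-01 12:00:00.0000000 +00:00' AS datetimeoffset(7)) AS varbinary(11)) AS utc_noon,
       CAST(CAST('0001-01-01 00:00:00.0000000 -12:00' AS datetimeoffset(7)) AS varbinary(11)) AS local_midnight;</pre>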
<p>The twist is that for positive offsets, the value increments 1,2,3, in hex as appropriate, but for negative values, it starts decrementing by considering -1 equal to &#8216;FFFF&#8217;, I&#8217;ve split the hex output into the individual components by adding some spaces to make it easier to read. (Accuracy Code, Time Value, Date Value, Offset Used)</p>
<pre>'2001-01-01 12:00:00.0000000 +00:01' =&gt; 0x07   009A717164   75250B  0100
'2001-01-01 12:00:00.0000000 +00:00' =&gt; 0x07   00E0349564   75250B  0000
'2001-01-01 12:00:00.0000000 -00:01' =&gt; 0x07   0026F8B864   75250B  FFFF</pre>
<p>Since the offsets supported are only +14 hours to -14 hours, there is no risk of the two ranges overlapping. When I think about this a bit more, it is acting as a signed number in two&#8217;s complement, -1 being all sixteen bits set. So the 2 bytes at the end are a signed 16-bit integer holding the number of minutes of offset.</p>
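<p>The layout worked out above can be sketched in a few lines of Python. This is only an illustration of the decoding described in this post, not anything taken from SQL Server itself: the function name decode_datetimeoffset is made up for the example, and it assumes the input is the hex dump exactly as shown above, with the accuracy byte first and the offset in the last two bytes.</p>

```python
from datetime import date, timedelta


def decode_datetimeoffset(hex_text):
    """Split a datetimeoffset hex dump into (accuracy, UTC date, UTC seconds, offset minutes).

    Hypothetical helper: the byte layout is the one observed in the post.
    """
    text = hex_text.replace(" ", "")
    if text.lower().startswith("0x"):
        text = text[2:]
    raw = bytes.fromhex(text)

    accuracy = raw[0]                                   # leading accuracy code, e.g. 7
    time_units = int.from_bytes(raw[1:-5], "little")    # intervals since midnight, UTC
    day_count = int.from_bytes(raw[-5:-2], "little")    # days since 0001-01-01, UTC
    # The last two bytes are a little-endian signed 16-bit count of minutes.
    offset_minutes = int.from_bytes(raw[-2:], "little", signed=True)

    utc_date = date(1, 1, 1) + timedelta(days=day_count)
    utc_seconds = time_units // 10 ** accuracy          # whole seconds since midnight
    return accuracy, utc_date, utc_seconds, offset_minutes


print(decode_datetimeoffset("0x07 0026F8B864 75250B FFFF"))
```

<p>For the -00:01 example the stored time decodes to 43260 seconds, which is 12:01 UTC, and the offset decodes to -1 minute, which matches the value that was originally entered.</p>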
<p>There are a number of time zones in the world that do not sit at exact hourly intervals from UTC. Some are on the half hour mark, such as Caracas (-4:30) or Delhi (+5:30) to name a few, whilst Kathmandu (+5:45) is even more specific. In theory the format allows offsets specified to even greater levels of distinction, although I am not sure why you would wish to use them. Do you really want an offset of +3:17? It is potentially scary to consider that a valid input.</p>
<p>That has made me wonder why the accuracy was set so high, when in reality 15 minute intervals would have been sufficient. With a -14 to +14 range, that is 113 different values inclusive, which could be accommodated within a single byte.</p>
<p>So why spend 2 bytes on the format, when 1 was enough? Was it just to be compatible with the ISO format in some way that required it? Not sure.</p>
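<p>The count of 113 is easy to double-check. A small sketch, assuming nothing more than the -14:00 to +14:00 range and a 15 minute step mentioned above (the variable names are just for illustration):</p>

```python
# Every offset from -14:00 to +14:00 in 15 minute steps, expressed in minutes.
quarter_hour_offsets = list(range(-14 * 60, 14 * 60 + 1, 15))

count = len(quarter_hour_offsets)
spare_values = 256 - count      # values left unused if one byte held an index

print(count, spare_values)
```

<p>A single byte has 256 possible values, so an index into this list would fit with room to spare.</p>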
<br />Posted in SQL Server Tagged: DateTimeOffset, Internals, SQL Server 2008]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/10/13/what-is-the-sql-server-2008-datetimeoffset-internal-structure/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>What is the SQL Server 2008 DateTime2 Internal Structure?</title>
		<link>http://sqlfascination.com/2009/10/11/what-is-the-sql-server-2008-datetime2-internal-structure/</link>
		<comments>http://sqlfascination.com/2009/10/11/what-is-the-sql-server-2008-datetime2-internal-structure/#comments</comments>
		<pubDate>Sun, 11 Oct 2009 22:15:47 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[DateTime2]]></category>
		<category><![CDATA[Internals]]></category>
		<category><![CDATA[SQL Server 2008]]></category>

		<guid isPermaLink="false">http://andrewhogg.wordpress.com/?p=121</guid>
		<description><![CDATA[SQL Server has a number of new date time formats, but the one I am most interested in is DateTime2. The internal format of the SQL DateTime is commonly mistaken as 2&#215;4 byte integers, with the latter integer being milliseconds since midnight. It is in fact the number of 1/300ths of a second since midnight which is [...]]]></description>
			<content:encoded><![CDATA[<p>SQL Server has a number of new date time formats, but the one I am most interested in is DateTime2. The internal format of the SQL DateTime is commonly mistaken as 2&#215;4 byte integers, with the latter integer being milliseconds since midnight. It is in fact the number of 1/300ths of a second since midnight which is why the accuracy of the DateTime within SQL Server has historically been 3.33ms. (If you really want to see it, crack it open by converting it to a binary, adding 1 and re-converting, you add 3.33ms, not 1 ms.)</p>
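<p>As a rough illustration of the 1/300th of a second tick, here is a small Python sketch that decodes the 8-byte hex dump of an old-style DateTime. It rests on assumptions that are not shown in this post: that the dump is two 4-byte integers printed most significant byte first, with the first being days relative to 1900/01/01 and the second being ticks since midnight. The function name is made up for the example.</p>

```python
from datetime import datetime, timedelta


def decode_legacy_datetime(hex_text):
    """Decode an 8-byte datetime hex dump (assumed layout, see the text above)."""
    text = hex_text.strip()
    if text.lower().startswith("0x"):
        text = text[2:]
    raw = bytes.fromhex(text)
    days = int.from_bytes(raw[:4], "big", signed=True)   # days relative to 1900-01-01
    ticks = int.from_bytes(raw[4:], "big")               # 1/300 second ticks since midnight
    return datetime(1900, 1, 1) + timedelta(days=days) + timedelta(seconds=ticks) / 300


print(decode_legacy_datetime("0x00008EAC00000000"))
print(decode_legacy_datetime("0x00008EAC00000001"))
```

<p>Adding 1 to the binary value moves the decoded time on by a third of a hundredth of a second (Python rounds it to 3333 microseconds), not by 1 millisecond.</p>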
<p>So DateTime2 must use a different format, and as a weekend exercise that had no purpose other than understanding the internals I thought I&#8217;d take a look. I have not seen the information in BoL or posted elsewhere as yet, so it might be of use. I am starting with DateTime2(7) and looking at the maximum accuracy structure. The code used to crack it open each time is basically as follows:</p>
<pre><span style="color:#0000ff;">declare</span> @dt <span style="color:#0000ff;">datetime2</span>(7)
<span style="color:#0000ff;">set</span> @dt = <span style="color:#ff0000;">'2000/01/01 00:00:00'
</span><span style="color:#0000ff;">declare</span> @bin <span style="color:#0000ff;">varbinary</span>(<span style="color:#ff00ff;">max</span>)
<span style="color:#0000ff;">set</span> @bin = <span style="color:#ff00ff;">CONVERT</span>(<span style="color:#0000ff;">varbinary</span>(<span style="color:#ff00ff;">max</span>), @dt)
<span style="color:#0000ff;">select</span> @bin</pre>
<p>To make my life easier, SQL conveniently outputs the values as hexadecimal numbers. The results are not what you would expect.</p>
<pre>0x07000000000007240B</pre>
<p>The date, which traditionally occupied the first 4 bytes, is clearly occupying the last few bytes, so the format is not going to be obvious or simple. Interestingly the returned result is 9 bytes, but the length of a DateTime2(7) is quoted as 8, and 8 is what comes back when the length is checked, so that first byte is an odd one to make an appearance. It&#8217;s also suspiciously the accuracy value, and a few tests with a different accuracy show that the value changes to match. So the first pseudo-byte is the accuracy indicator.</p>
<p>To start figuring out some more, let&#8217;s take the time back to the beginning point, which in this case is not 1900/01/01 but 0001/01/01 which when converted gives us:</p>
<pre>'0001/01/01 00:00:00' =&gt; 0x070000000000000000</pre>
<p>Start incrementing the day portion and there is an obvious pattern: the 6th byte changes.</p>
<pre>'0001/01/02 00:00:00' =&gt; 0x070000000000010000
'0001/01/03 00:00:00' =&gt; 0x070000000000020000
'0001/01/31 00:00:00' =&gt; 0x0700000000001E0000</pre>
<p>As you try the 2nd month, to check where the month is, the same byte alters, so it represents days, not specific date parts. Is it the number of days since the beginning of the year? No.</p>
<pre>'0001/02/01 00:00:00' =&gt; 0x0700000000001F0000</pre>
<p>If it was, there would be an issue, since 1 byte does not represent enough values. As we can see, FF occurs on the 13th of September, and then it rolls over and puts a 1 in the 7th byte position.</p>
<pre>'0001/09/13 00:00:00' =&gt; 0x070000000000FF0000
'0001/09/14 00:00:00' =&gt; 0x070000000000000100
'0001/09/15 00:00:00' =&gt; 0x070000000000010100</pre>
<p>It rolls over, then carries on as before. This immediately suggests the next test, to roll over the year, and the pattern continues.</p>
<pre>'0001/12/31 00:00:00' =&gt; 0x0700000000006C0100
'0002/01/01 00:00:00' =&gt; 0x0700000000006D0100</pre>
<p>So the format is just counting. We see it in the examples as hex, but it is a straight number count with the bytes ordered least significant first, reading left to right. Only 2 bytes are used so far, which do not represent enough day combinations, so bring the third byte in by going about 180 years past the epoch:</p>
<pre>'0180/06/06 00:00:00' =&gt; 0x070000000000FFFF00  
'0180/06/07 00:00:00' =&gt; 0x070000000000000001</pre>
<p>So the final byte then starts to increase, and with three bytes the highest day count becomes 16777215 &#8211; that seems a lot better and is certainly going to cover the range required.</p>
<pre>'2001/01/01 00:00:00' =&gt; 0x07000000000075250B</pre>
<p>So that is the final 3 bytes decoded, a simple pattern - and provides the template of how the time is also stored.</p>
<pre>'0001/01/01 00:00:00.0000000' =&gt; 0x070000000000000000
'0001/01/01 00:00:00.0000001' =&gt; 0x070100000000000000
'0001/01/01 00:00:00.0000255' =&gt; 0x07FF00000000000000
'0001/01/01 00:00:00.0065535' =&gt; 0x07FFFF000000000000
'0001/01/01 00:00:00.0065536' =&gt; 0x070000010000000000</pre>
<p>So to check whether the format is the same,</p>
<pre>'0001/01/01 00:00:00.9999999' =&gt; 0x077F96980000000000</pre>
<p>Decode that again and it all matches:</p>
<pre><span style="color:#0000ff;">select </span>(152 * 256 * 256) + (150 * 256) + 127
-----------
9999999</pre>
<p>When we click over into 1 second exactly, we increment the first byte by 1, so the time portion is still represented in 100ns intervals, with the normal system of each byte counting up 1 every time the previous byte rolls over. As we get to the limit of the 3 bytes, it rolls into the 4th and then the 5th.</p>
<pre>'0001/01/01 00:00:01.0000000' =&gt; 0x078096980000000000</pre>
<p>So the internal format of the DateTime2(7) is decoded, not difficult but it is an interesting choice &#8211; it is now a straight binary number, with the Least Significant Byte being on the Left, the Most Significant being on the right (for each section.) Within the byte however, to convert it you must still read it right-to-left.</p>
<p>The first 5 bytes record how many time unit intervals have passed since midnight, and the last 3 bytes record how many days have passed since 0001/01/01.</p>
<p>The time unit intervals are dictated by the accuracy of the number: 100ns for DateTime2(7), 1 microsecond intervals for DateTime2(6) and so on. The way in which you interpret it does not change, but the unit you multiply the time portion by alters based on the accuracy.</p>
<p>You could construct 2 dates that are identical at a binary level, but due to the field meta-data on accuracy, they do not represent the same date time.</p>
<pre>declare @dt1 datetime2(6)
set @dt1 = '0001/01/01 00:00:00.000001'
declare @dt2 datetime2(7)
set @dt2 = '0001/01/01 00:00:00.0000001'

0x060100000000000000
0x070100000000000000 </pre>
<p>And that is perhaps why the accuracy is automatically prefixed to the binary value on output, so that the two are not entirely identical? I&#8217;m not sure but would be interested to find out.</p>
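<p>Putting the pieces together, the whole decode can be sketched in Python. This is just the layout worked out in this post turned into code, not an official description of the format: the function name decode_datetime2 is made up, and it assumes the hex dump starts with the accuracy byte and ends with the 3 date bytes.</p>

```python
from datetime import date, timedelta
from fractions import Fraction


def decode_datetime2(hex_text):
    """Return (accuracy, date, seconds since midnight) for a datetime2 hex dump."""
    text = hex_text.strip()
    if text.lower().startswith("0x"):
        text = text[2:]
    raw = bytes.fromhex(text)

    accuracy = raw[0]                                  # leading accuracy indicator
    time_units = int.from_bytes(raw[1:-3], "little")   # least significant byte first
    day_count = int.from_bytes(raw[-3:], "little")     # days since 0001-01-01

    # The same time_units value means a different time at a different accuracy.
    seconds = Fraction(time_units, 10 ** accuracy)
    return accuracy, date(1, 1, 1) + timedelta(days=day_count), seconds


print(decode_datetime2("0x07000000000075250B"))
print(decode_datetime2("0x060100000000000000")[2])
print(decode_datetime2("0x070100000000000000")[2])
```

<p>The last two calls take values whose time and date bytes are identical and return 1/1000000 and 1/10000000 of a second respectively, so only the accuracy byte tells them apart.</p>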
<br />Posted in SQL Server Tagged: DateTime2, Internals, SQL Server 2008]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/10/11/what-is-the-sql-server-2008-datetime2-internal-structure/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Finding Next Identity Value, and a Wild Goose Chase.</title>
		<link>http://sqlfascination.com/2009/10/09/finding-next-identity-value-and-a-wild-goose-chase/</link>
		<comments>http://sqlfascination.com/2009/10/09/finding-next-identity-value-and-a-wild-goose-chase/#comments</comments>
		<pubDate>Fri, 09 Oct 2009 21:49:12 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Identity]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>

		<guid isPermaLink="false">http://andrewhogg.wordpress.com/?p=114</guid>
		<description><![CDATA[A question asked on stack overflow was to find the next identity value that would occur on a table, without being required to add a record to work it out. The problem lies in that if the highest row is deleted, the number is not reused so any answers using the existing rows can be [...]]]></description>
			<content:encoded><![CDATA[<p>A question asked on Stack Overflow was how to find the next identity value that would occur on a table, without being required to add a record to work it out. The problem is that if the highest row is deleted, the number is not reused, so any answer based on the existing rows can be incorrect.</p>
<p>Logically the value must be stored, so the first place I checked was the DMVs. They store the last used value, but do not store the next value.</p>
<p>The wild goose chase started there&#8230;</p>
<ul>
<li>I used the dedicated admin connection to pull all the system tables, expecting it to be in sys.syshobtcolumns, no joy. I dumped the whole system table contents before and after an insert looking for the difference and didn&#8217;t spot it.</li>
<li>Took a dump of every page in the file before and after inserting a new row and compared the textual dumps in a text comparison application, still no joy.</li>
<li>Started dumping out log records using the following script and pulled the identity calls:</li>
</ul>
<pre><span style="color:#0000ff;">select</span> *
<span style="color:#0000ff;">from</span> ::fn_dblog(null, null)
<span style="color:#0000ff;">where </span>operation = <span style="color:#ff0000;">'LOP_IDENT_NEWVAL'</span></pre>
<ul>
<li>After a couple of hours running around the pages trying to find it, I realised I should have stuck with the DMV and that I really went the wrong way around it.</li>
</ul>
<p>The DMV has the right answer it seems, but as two fields that you have to combine to get the answer.</p>
<pre><span style="color:#0000ff;">create table</span> foo (MyID <span style="color:#0000ff;">int identity</span> not null, MyField <span style="color:#0000ff;">char</span>(10))
<span style="color:#0000ff;">insert into</span> foo <span style="color:#0000ff;">values</span> ('test')
go 10

-- Inserted 10 rows
<span style="color:#0000ff;">select</span> <span style="color:#ff00ff;">Convert</span>(<span style="color:#0000ff;">varchar</span>(8),increment_value) <span style="color:#0000ff;">as</span> IncrementValue,
   <span style="color:#ff00ff;">Convert</span>(<span style="color:#0000ff;">varchar</span>(8),last_value) <span style="color:#0000ff;">as</span> LastValue
<span style="color:#0000ff;">from </span><span style="color:#008000;">sys.identity_columns</span> <span style="color:#0000ff;">where</span> name ='myid'

-- insert another row
<span style="color:#0000ff;">insert into</span> foo values (<span style="color:#ff0000;">'test'</span>)

-- check the values again
<span style="color:#0000ff;">select</span> <span style="color:#ff00ff;">Convert</span>(<span style="color:#0000ff;">varchar</span>(8),increment_value) <span style="color:#0000ff;">as</span> IncrementValue,
   <span style="color:#ff00ff;">Convert</span>(<span style="color:#0000ff;">varchar</span>(8),last_value) <span style="color:#0000ff;">as</span> LastValue
<span style="color:#0000ff;">from</span> <span style="color:#008000;">sys.identity_columns</span> where name =<span style="color:#ff0000;">'myid'</span>

-- delete the rows
<span style="color:#0000ff;">delete from</span> foo

-- check the DMV again
<span style="color:#0000ff;">select</span> <span style="color:#ff00ff;">Convert</span>(<span style="color:#0000ff;">varchar</span>(8),increment_value) <span style="color:#0000ff;">as</span> IncrementValue,
   <span style="color:#ff00ff;">Convert</span>(<span style="color:#0000ff;">varchar</span>(8),last_value) <span style="color:#0000ff;">as</span> LastValue
from <span style="color:#008000;">sys.identity_columns</span> where name ='myid'

-- value is currently 11 and increment is 1, so the next insert gets 12
<span style="color:#0000ff;">insert into</span> foo <span style="color:#0000ff;">values</span> (<span style="color:#ff0000;">'test'</span>)
<span style="color:#0000ff;">select </span>* <span style="color:#0000ff;">from</span> foo

Result:
MyID        MyField
----------- ----------
12          test      

(1 row(s) affected)</pre>
<p>So adding the increment to the last value will predict the next value correctly, assuming someone else does not grab it in the meantime. That race is why it is not a good idea to rely on this in code, but if you need to investigate a table and want to know what it thinks is next, without actually inserting a row and affecting the table, then it is useful.</p>
<p>All that and the easy way is then:</p>
<pre>select ident_current('foo') + ident_incr('foo')</pre>
<p>Ah well, was fun investigating it &#8211; but what a wild goose chase to find it was an easy answer.</p>
<br />Posted in SQL Server Tagged: Identity, SQL Server 2005, SQL Server 2008]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/10/09/finding-next-identity-value-and-a-wild-goose-chase/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>Do Filtered Index Have Different Tipping Points?</title>
		<link>http://sqlfascination.com/2009/10/04/do-filtered-index-have-different-tipping-points/</link>
		<comments>http://sqlfascination.com/2009/10/04/do-filtered-index-have-different-tipping-points/#comments</comments>
		<pubDate>Sun, 04 Oct 2009 19:17:57 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[Indexes]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[Tipping Point]]></category>

		<guid isPermaLink="false">http://andrewhogg.wordpress.com/?p=102</guid>
		<description><![CDATA[I&#8217;ve been taking a look at filtered indexes and they do initially look particularly useful, but I was wondering what effect the filtered index has on the tipping point. The tipping point was described by Kimberly L. Tripp in an excellent post - and describes the point at which SQL opts to change the query plan [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been taking a look at filtered indexes and they do initially look particularly useful, but I was wondering what effect the filtered index has on the tipping point. The tipping point was described by <a href="http://www.sqlskills.com/BLOGS/KIMBERLY/">Kimberly L. Tripp</a> in an excellent <a href="http://www.sqlskills.com/BLOGS/KIMBERLY/category/The-Tipping-Point.aspx" target="_blank">post</a>, and is the point at which SQL opts to change the query plan strategy from an NC index seek to a scan of some kind (Table / Clustered Index).</p>
<p>So I was wondering whether Filtered indexes chose a different strategy, or would they &#8216;tip&#8217; from a targeted index to a table scan in the same way.</p>
<p>First off was to create a dummy table:</p>
<pre><span style="color:#0000ff;">create table</span> tblPersonnel (
PersonnelID Int <span style="color:#0000ff;">Identity</span>, 
FirstName <span style="color:#0000ff;">Char</span>(30), 
LastName <span style="color:#0000ff;">Char</span>(30),
Department <span style="color:#0000ff;">Char</span>(30), 
SomePadding <span style="color:#0000ff;">Char</span>(250)
)</pre>
<p>This was deliberately created as a heap, since data had to be generated and loaded:</p>
<pre><span style="color:#0000ff;">insert into </span>tblPersonnel(firstname,lastname,Department) <span style="color:#0000ff;">values</span>( dbo.randomstring(20), dbo.randomstring(20), dbo.randomstring(20));
<span style="color:#0000ff;">go</span> 1000000</pre>
<p>The dbo.randomstring is a user created function I added to generate a random string [A-Z] the length of the function parameter, so the insert statement is inserting random strings of length 20 into the first name, last name and department fields.</p>
<p>Once the data was loaded, a clustered index was applied.</p>
<pre><span style="color:#0000ff;">alter table</span> dbo.tblPersonnel <span style="color:#0000ff;">add constraint</span>
PK_tblPersonnel <span style="color:#0000ff;">primary key clustered</span>(PersonnelID) <span style="color:#0000ff;">with</span>( <span style="color:#0000ff;">statistics_norecompute </span>= <span style="color:#0000ff;">off</span>, <span style="color:#0000ff;">ignore_dup_key</span> = <span style="color:#0000ff;">off</span>,<span style="color:#0000ff;"> allow_row_locks</span> = <span style="color:#0000ff;">on</span>, <span style="color:#0000ff;">allow_page_locks</span> = <span style="color:#0000ff;">on</span>) <span style="color:#0000ff;">on</span> [PRIMARY]
GO</pre>
<p>A quick check of the table:</p>
<pre><span style="color:#0000ff;">select</span> <span style="color:#ff00ff;">COUNT</span>(*) <span style="color:#0000ff;">from</span> tblpersonnel
-----------
1000000
(1 row(s) affected)</pre>
<p>To get a rough idea of the tipping point we need to know the number of pages at the leaf level of the table. </p>
<pre><span style="color:#0000ff;">select</span> * <span style="color:#0000ff;">from</span> <span style="color:#008000;">sys.objects</span> <span style="color:#0000ff;">where</span> name =<span style="color:#ff0000;">'tblpersonnel'</span>
<span style="color:#008000;">-- grab the object id and query the physical stats.
</span><span style="color:#0000ff;">select</span> * <span style="color:#0000ff;">from</span> <span style="color:#008000;">sys.dm_db_index_physical_stats</span>(<span style="color:#ff00ff;">DB_ID</span>(<span style="color:#ff0000;">'FilteredIndexTest'</span>), 1117247035 , 1,0,<span style="color:#ff0000;">'DETAILED'</span>)
<span style="color:#008000;">-- 45455 pages shown in the table at leaf level 0 i.e. the clustered index</span></pre>
<p>So create an NC index on the first name, last name.</p>
<pre><span style="color:#0000ff;">create nonclustered index</span> [IX_Normal] ON [dbo].[tblPersonnel] (  [FirstName] <span style="color:#0000ff;">asc</span>,  [LastName] <span style="color:#0000ff;">asc</span>
) <span style="color:#0000ff;">with</span>( pad_index = off, <span style="color:#0000ff;">statistics_norecompute </span>= <span style="color:#0000ff;">off</span>, <span style="color:#0000ff;">ignore_dup_key</span> = <span style="color:#0000ff;">off</span>, <span style="color:#0000ff;">allow_row_locks</span> = <span style="color:#0000ff;">on</span>, <span style="color:#0000ff;">allow_page_locks</span> = <span style="color:#0000ff;">on</span>) <span style="color:#0000ff;">on</span> [PRIMARY]
GO</pre>
<p>For reference, we will also check how many pages the NC index has:</p>
<pre><span style="color:#0000ff;">select</span> * <span style="color:#0000ff;">from</span> <span style="color:#008000;">sys.dm_db_index_physical_stats</span>(<span style="color:#ff00ff;">DB_ID</span>(<span style="color:#ff0000;">'FilteredIndexTest'</span>), 1117247035 , 2,0,<span style="color:#ff0000;">'DETAILED'</span>)
-- Level 0 of the index shows 8696 pages</pre>
<p>The tipping point is expected at between 25% and 33% of the number of pages when expressed as rows, so somewhere between 11363 and 15000 rows is where to expect it. Since the data was random when inserted, a bit of human binary chop is needed to find the tipping point. After some experimentation, the dividing point was between firstname like &#8216;a[a-h]%&#8217; and &#8216;a[a-i]%&#8217;:</p>
<pre><span style="color:#0000ff;">select</span> * <span style="color:#0000ff;">from</span> tblpersonnel
<span style="color:#0000ff;">where</span> firstname like 'a[a-h]%'
<span style="color:#008000;">-- 11898 rows - seeks
</span><span style="color:#0000ff;">select</span> * <span style="color:#0000ff;">from</span> tblpersonnel
<span style="color:#0000ff;">where</span> firstname like 'a[a-i]%'
<span style="color:#008000;">-- 13299 rows - scans</span></pre>
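<p>The expected band is simple arithmetic on the leaf page count, and can be checked with a few lines of Python. This is only a sketch of the 25% to 33% rule of thumb quoted above, using the page and row counts from this test; the variable names are made up for the example.</p>

```python
leaf_pages = 45455          # leaf-level pages of the clustered index, from the test above

# Rough tipping-point band: 25% to 33% of the page count, read as a row count.
lower_rows = leaf_pages * 25 // 100
upper_rows = leaf_pages * 33 // 100

observed = {"a[a-h]%": 11898, "a[a-i]%": 13299}   # row counts measured in the test

print(lower_rows, upper_rows)
for pattern, rows in observed.items():
    # True when the measured row count sits inside the expected band.
    print(pattern, rows, rows in range(lower_rows, upper_rows + 1))
```

<p>Both measured row counts fall inside the band, so the switch from seek to scan happens where the rule of thumb says it should.</p>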
<p>The tipping point showed up within the expected range, so next I created the filtered index based on FirstName &gt;= &#8216;a&#8217; and &lt;= &#8216;ak&#8217;, since I cannot use a LIKE clause in the filtered index WHERE clause.</p>
<pre><span style="color:#0000ff;">create nonclustered index</span> [IX_Filter] <span style="color:#0000ff;">ON</span> [dbo].[tblPersonnel] (  [FirstName] <span style="color:#0000ff;">asc</span>,  [LastName] <span style="color:#0000ff;">asc</span>
)
<span style="color:#0000ff;">where</span> [FirstName] &gt;= <span style="color:#ff0000;">'a'</span> and [FirstName] &lt;= <span style="color:#ff0000;">'ak'</span>
<span style="color:#0000ff;">with</span>( pad_index = off, <span style="color:#0000ff;">statistics_norecompute </span>= <span style="color:#0000ff;">off</span>, <span style="color:#0000ff;">ignore_dup_key</span> = <span style="color:#0000ff;">off</span>, <span style="color:#0000ff;">allow_row_locks</span> = <span style="color:#0000ff;">on</span>, <span style="color:#0000ff;">allow_page_locks</span> = <span style="color:#0000ff;">on</span>) <span style="color:#0000ff;">on</span> [PRIMARY]
GO</pre>
<p>The filtered index is in play, so I disabled the non-filtered index and re-ran the same queries.</p>
<pre><span style="color:#0000ff;">select</span> * <span style="color:#0000ff;">from</span> tblpersonnel
<span style="color:#0000ff;">where</span> firstname like 'a[a-h]%'
<span style="color:#008000;">-- 11898 rows - seeks
</span><span style="color:#0000ff;">select</span> * <span style="color:#0000ff;">from</span> tblpersonnel
<span style="color:#0000ff;">where</span> firstname like 'a[a-i]%'
<span style="color:#008000;">-- 13299 rows - scans</span></pre>
<p>Conclusion: The tipping point did not alter based on the filtered index &#8211; it still tips based on the ratio of rows to total pages in the leaf level of the clustered index.</p>
<br />Posted in SQL Server Tagged: Indexes, SQL Server 2008, Tipping Point]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/10/04/do-filtered-index-have-different-tipping-points/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>How to Remember the Next Used Filegroup in a Partition Scheme</title>
		<link>http://sqlfascination.com/2009/09/30/how-to-remember-the-next-used-filegroup-in-a-partition-scheme/</link>
		<comments>http://sqlfascination.com/2009/09/30/how-to-remember-the-next-used-filegroup-in-a-partition-scheme/#comments</comments>
		<pubDate>Wed, 30 Sep 2009 20:23:20 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>
		<category><![CDATA[Table Partitioning]]></category>

		<guid isPermaLink="false">http://andrewhogg.wordpress.com/?p=68</guid>
		<description><![CDATA[Within SQL, the partitioned table feature provides an excellent way to store and roll forward data windows across large datasets without incurring huge loading and archiving penalties to the main table. The process of loading, manipulating and decommissioning data from a partitioned table will need posts of their own &#8211; due to the depth of [...]]]></description>
			<content:encoded><![CDATA[<p>Within SQL Server, the partitioned table feature provides an excellent way to store and roll forward data windows across large datasets without incurring huge loading and archiving penalties to the main table.</p>
<p>The process of loading, manipulating and decommissioning data from a partitioned table will need posts of its own, due to the depth of the information available and required, as well as the detail needed to understand some of the best practices I have learnt in the field.</p>
<p>This entry relates specifically to the &#8216;Next Used&#8217; aspect of dealing with a partitioned table. For some time I wanted to know how I could tell which filegroup had been set as next used. There seems to be no record of the value; the BoL describes the Next Used facility as:</p>
<blockquote><p>Specify the filegroup to be marked by the partition scheme as NEXT USED.</p></blockquote>
<p>This is true in a simplistic sense, but the marking cannot be at the filegroup level: the relation from partition schemes to filegroups can be many-to-many, so no single marking on a filegroup could suffice. I had never been able to figure out where that marking is kept, or find a way to read what had been set.</p>
<p>I should add that you really should not set the next used in one location of the code / stored proc and then perform the split of the partition function in another; it would be far safer to do them together. The need to actually find it out is really borne out of investigative necessity: for a given partition scheme that implements a rolling window and has started to go bad, where does it think it should be splitting the partition to?</p>
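<p>As a rough sketch of what &#8216;doing them together&#8217; could look like, the two statements can sit next to each other in the same batch. The scheme, function and filegroup names are the ones from the test scripts later in this post; the new boundary value 200805 is purely an assumption for illustration.</p>
<pre>USE NextUsedTest;
GO
-- Nominate the filegroup that should receive the next new partition...
ALTER PARTITION SCHEME psLeftNoSpare NEXT USED [PartFG5];
-- ...and consume that nomination straight away by adding the new boundary.
-- 200805 is a hypothetical next boundary value, not one used in the post.
ALTER PARTITION FUNCTION pfLeftNoSpare() SPLIT RANGE (200805);
GO</pre>
<p>Keeping the pair adjacent means there is never a window in which the next used setting has to be remembered or looked up.</p>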
<p>So the problem remained; it was given to <a title="Paul S. Randal" href="http://www.sqlskills.com/blogs/paul" target="_blank">Paul S. Randal</a> to figure out how we could see / infer this information outside of the dedicated admin console. He worked out which value in which DMV to start the solution with, and I ran with it from there to create a relatively easy way to get to it. So credit to Paul for finding out where to start on the issue.</p>
<p>So down to the scripts and some test cases:</p>
<p>There are 2 types of partition scheme to test:</p>
<ul>
<li>Partition Function as Left.</li>
<li>Partition Function as Right.</li>
</ul>
<p>When defining the scheme initially, you can also define it in two ways:</p>
<ul>
<li>Partition scheme defined with the next used pre-set value.</li>
<li>Partition scheme defined with no pre-set next used value.</li>
</ul>
<p>So there are 4 combinations to test initially. Since we do not need data for this and are not worried about the query plan, the main work is pure schema creation and checking.</p>
<p>First script is to just create a test database, nothing significant:</p>
<pre><span style="color:#0000ff;">CREATE DATABASE</span> [NextUsedTest] <span style="color:#0000ff;">ON PRIMARY</span>
( 
<span style="color:#0000ff;">NAME</span> = N<span style="color:#ff0000;">'NextUsedTest'</span>, 
<span style="color:#0000ff;">FILENAME</span> = N<span style="color:#ff0000;">'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\NextUsedTest.mdf'</span> , 
<span style="color:#0000ff;">SIZE</span> = 3072KB , 
<span style="color:#0000ff;">FILEGROWTH</span> = 1024KB 
), 
<span style="color:#0000ff;">FILEGROUP</span> [PartFG1] ( 
<span style="color:#0000ff;">NAME</span> = N<span style="color:#ff0000;">'PartFile1'</span>, 
<span style="color:#0000ff;">FILENAME</span> = N<span style="color:#ff0000;">'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\PartFile1.ndf'</span> , 
<span style="color:#0000ff;">SIZE</span> = 3072KB , 
<span style="color:#0000ff;">FILEGROWTH</span> = 1024KB 
), 
<span style="color:#0000ff;">FILEGROUP</span> [PartFG2] ( 
<span style="color:#0000ff;">NAME</span> = N<span style="color:#ff0000;">'PartFile2'</span>, 
<span style="color:#0000ff;">FILENAME</span> = N<span style="color:#ff0000;">'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\PartFile2.ndf'</span> , 
<span style="color:#0000ff;">SIZE</span> = 3072KB , 
<span style="color:#0000ff;">FILEGROWTH</span> = 1024KB 
), 
<span style="color:#0000ff;">FILEGROUP</span> [PartFG3] ( 
<span style="color:#0000ff;">NAME</span> = N<span style="color:#ff0000;">'PartFile3'</span>, 
<span style="color:#0000ff;">FILENAME</span> = N<span style="color:#ff0000;">'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\PartFile3.ndf'</span> , 
<span style="color:#0000ff;">SIZE</span> = 3072KB , 
<span style="color:#0000ff;">FILEGROWTH</span> = 1024KB 
), 
<span style="color:#0000ff;">FILEGROUP</span> [PartFG4] ( 
<span style="color:#0000ff;">NAME</span> = N<span style="color:#ff0000;">'PartFile4'</span>, 
<span style="color:#0000ff;">FILENAME</span> = N<span style="color:#ff0000;">'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\PartFile4.ndf'</span> , 
<span style="color:#0000ff;">SIZE</span> = 3072KB , 
<span style="color:#0000ff;">FILEGROWTH</span> = 1024KB 
), 
<span style="color:#0000ff;">FILEGROUP</span> [PartFG5] ( 
<span style="color:#0000ff;">NAME</span> = N<span style="color:#ff0000;">'PartFile5'</span>, 
<span style="color:#0000ff;">FILENAME</span> = N<span style="color:#ff0000;">'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\PartFile5.ndf'</span> , 
<span style="color:#0000ff;">SIZE</span> = 3072KB , 
<span style="color:#0000ff;">FILEGROWTH</span> = 1024KB 
)
 LOG <span style="color:#0000ff;">ON</span>
( <span style="color:#0000ff;">NAME</span> = N<span style="color:#ff0000;">'NextUsedTest_log'</span>, 
<span style="color:#0000ff;">FILENAME</span> = N<span style="color:#ff0000;">'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\NextUsedTest_log.ldf'</span>, 
<span style="color:#0000ff;">SIZE</span> = 1024KB , 
<span style="color:#0000ff;">FILEGROWTH</span> = 10%
)
GO</pre>
<p>Now the database is in place, let&#8217;s create the partition functions / schemes:</p>
<pre><span style="color:#0000ff;">CREATE PARTITION FUNCTION </span>[pfLeftNoSpare](<span style="color:#0000ff;">int</span>) 
<span style="color:#0000ff;">AS RANGE</span> LEFT <span style="color:#0000ff;">FOR VALUES</span> (200801, 200802, 200803, 200804)

<span style="color:#0000ff;">CREATE PARTITION SCHEME </span>[psLeftNoSpare] 
<span style="color:#0000ff;">AS PARTITION</span> [pfLeftNoSpare] 
<span style="color:#0000ff;">TO</span>([PartFG1], [PartFG2], [PartFG3], [PartFG4], [Primary])

<span style="color:#0000ff;">CREATE PARTITION FUNCTION </span>[pfLeftWithNextUsedSet](<span style="color:#0000ff;">int</span>) 
<span style="color:#0000ff;">AS RANGE</span> LEFT <span style="color:#0000ff;">FOR VALUES</span> (200801, 200802, 200803, 200804)

<span style="color:#0000ff;">CREATE PARTITION SCHEME </span>[psLeftWithNextUsedSet] 
<span style="color:#0000ff;">AS PARTITION </span>[pfLeftWithNextUsedSet] 
<span style="color:#0000ff;">TO</span>([PartFG1], [PartFG2], [PartFG3], [PartFG4], [Primary], [PartFG5])

<span style="color:#0000ff;">CREATE PARTITION FUNCTION </span>[pfRightNoSpare](int) 
<span style="color:#0000ff;">AS RANGE</span> RIGHT<span style="color:#0000ff;"> FOR VALUES</span> (200801, 200802, 200803, 200804)

<span style="color:#0000ff;">CREATE PARTITION SCHEME </span>[psRightNoSpare] 
<span style="color:#0000ff;">AS PARTITION</span>[pfRightNoSpare] 
<span style="color:#0000ff;">TO</span>([Primary], [PartFG1], [PartFG2], [PartFG3], [PartFG4])

<span style="color:#0000ff;">CREATE PARTITION FUNCTION </span>[pfRightWithSpareNextUsedSet](<span style="color:#0000ff;">int</span>) 
<span style="color:#0000ff;">AS RANGE</span> RIGHT <span style="color:#0000ff;">FOR VALUES</span> (200801, 200802, 200803, 200804)

<span style="color:#0000ff;">CREATE PARTITION SCHEME </span>[psRightWithSpareNextUsedSet] 
<span style="color:#0000ff;">AS PARTITION</span>[pfRightWithSpareNextUsedSet] 
<span style="color:#0000ff;">TO</span>([Primary],[PartFG1], [PartFG2], [PartFG3], [PartFG4],  [PartFG5])
GO</pre>
<p>That gives four partition functions and schemes. When the next used is pre-set, the next used filegroup is the extra entry at the far right of the filegroup list. When you use this mechanism to pre-set it, a message is issued for the operation:</p>
<pre>Partition scheme 'psLeftWithNextUsedSet' has been created successfully. 
'PartFG5' is marked as the next used filegroup 
in partition scheme 'psLeftWithNextUsedSet'. 
Partition scheme 'psRightWithSpareNextUsedSet' has been created successfully. 
'PartFG5' is marked as the next used filegroup 
in partition scheme 'psRightWithSpareNextUsedSet'.</pre>
<p>The trick to finding the next used is to look for the mismatching record: if you join the partition function values to the partition scheme, there are scheme entries that do not have a corresponding partition function value, since no boundary has yet been assigned to them by performing a split.</p>
<p>Let&#8217;s start with the Left based partition schemes and compare the two.</p>
<pre> <span style="color:#0000ff;">select</span> FG.<span style="color:#0000ff;">Name</span> <span style="color:#0000ff;">as</span> FileGroupName
    , dds.destination_id
    , dds.data_space_id
    , prv.value, ps.<span style="color:#0000ff;">Name</span>
 <span style="color:#0000ff;">from</span> <span style="color:#339966;">sys.partition_schemes</span> PS
 inner join <span style="color:#339966;">sys.destination_data_spaces </span><span style="color:#0000ff;">as</span> DDS 
    <span style="color:#0000ff;">on</span> DDS.partition_scheme_id = PS.data_space_id
 inner join <span style="color:#339966;">sys.filegroups</span> <span style="color:#0000ff;">as</span> FG 
    <span style="color:#0000ff;">on</span> FG.data_space_id = DDS.data_space_ID 
 left join <span style="color:#339966;">sys.partition_range_values</span> <span style="color:#0000ff;">as</span> PRV 
    <span style="color:#0000ff;">on</span> PRV.Boundary_ID = DDS.destination_id and prv.function_id=ps.function_id 
 <span style="color:#0000ff;">where</span> PS.<span style="color:#0000ff;">name</span> = <span style="color:#ff0000;">'psLeftNoSpare'</span></pre>
<p>The output is:</p>
<table border="1">
<tbody>
<tr>
<th>FileGroupName</th>
<th>destination_id</th>
<th>data_space_id</th>
<th>value</th>
<th>Name</th>
</tr>
<tr>
<td>PartFG1</td>
<td>1</td>
<td>2</td>
<td>200801</td>
<td>psLeftNoSpare</td>
</tr>
<tr>
<td>PartFG2</td>
<td>2</td>
<td>3</td>
<td>200802</td>
<td>psLeftNoSpare</td>
</tr>
<tr>
<td>PartFG3</td>
<td>3</td>
<td>4</td>
<td>200803</td>
<td>psLeftNoSpare</td>
</tr>
<tr>
<td>PartFG4</td>
<td>4</td>
<td>5</td>
<td>200804</td>
<td>psLeftNoSpare</td>
</tr>
<tr>
<td>PRIMARY</td>
<td>5</td>
<td>1</td>
<td>NULL</td>
<td>psLeftNoSpare</td>
</tr>
</tbody>
</table>
<p>And check the other Left defined partition:</p>
<pre> <span style="color:#0000ff;">select</span> FG.Name <span style="color:#0000ff;">as</span> FileGroupName
    , dds.destination_id
    , dds.data_space_id
    , prv.value, ps.Name  
<span style="color:#0000ff;">from</span> <span style="color:#339966;">sys.partition_schemes</span> PS
 inner join <span style="color:#339966;">sys.destination_data_spaces</span> <span style="color:#0000ff;">as</span> DDS 
    <span style="color:#0000ff;">on</span> DDS.partition_scheme_id = PS.data_space_id
 inner join<span style="color:#339966;"> sys.filegroups</span> <span style="color:#0000ff;">as</span> FG 
    <span style="color:#0000ff;">on</span> FG.data_space_id = DDS.data_space_ID 
 left join <span style="color:#339966;">sys.partition_range_values</span> <span style="color:#0000ff;">as</span> PRV 
    <span style="color:#0000ff;">on</span> PRV.Boundary_ID = DDS.destination_id and prv.function_id=ps.function_id 
 <span style="color:#0000ff;">where</span> PS.<span style="color:#0000ff;">name</span> = <span style="color:#ff0000;">'psLeftWithNextUsedSet'</span></pre>
<p>Results:</p>
<table border="1">
<tbody>
<tr>
<th>FileGroupName</th>
<th>destination_id</th>
<th>data_space_id</th>
<th>value</th>
<th>Name</th>
</tr>
<tr>
<td>PartFG1</td>
<td>1</td>
<td>2</td>
<td>200801</td>
<td>psLeftWithNextUsedSet</td>
</tr>
<tr>
<td>PartFG2</td>
<td>2</td>
<td>3</td>
<td>200802</td>
<td>psLeftWithNextUsedSet</td>
</tr>
<tr>
<td>PartFG3</td>
<td>3</td>
<td>4</td>
<td>200803</td>
<td>psLeftWithNextUsedSet</td>
</tr>
<tr>
<td>PartFG4</td>
<td>4</td>
<td>5</td>
<td>200804</td>
<td>psLeftWithNextUsedSet</td>
</tr>
<tr>
<td>PRIMARY</td>
<td>5</td>
<td>1</td>
<td>NULL</td>
<td>psLeftWithNextUsedSet</td>
</tr>
<tr>
<td>PartFG5</td>
<td>6</td>
<td>6</td>
<td>NULL</td>
<td>psLeftWithNextUsedSet</td>
</tr>
</tbody>
</table>
<p>And the difference appears: the next used filegroup shows up as the highest destination ID, but only when two data spaces in the scheme have no boundary value. The &#8216;primary&#8217; entry that shows up as null is due to the partition function covering everything from -ve infinity to +ve infinity; 4 boundary lines drawn on the number line divide it into 5 sections, so one section never has a boundary value to pair with.</p>
<p>Running the same statements for the schemes declared using Right shows similar results: destination 5 again has no boundary value, although with Primary listed first in those schemes it is PartFG4 that holds destination 5, and the spare shows on 6 again.</p>
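<p>To see the &#8216;4 lines, 5 sections&#8217; idea in practice, the $PARTITION function reports which partition number a value would land in. This is only an illustrative sketch against the test functions created above; the probe values 200712 and 200805 are my own picks for values outside the defined boundaries.</p>
<pre>USE NextUsedTest;
GO
-- RANGE LEFT: each boundary value belongs to the partition on its left.
SELECT $PARTITION.pfLeftNoSpare(200712) AS BelowFirst    -- 1
     , $PARTITION.pfLeftNoSpare(200801) AS FirstBoundary -- 1
     , $PARTITION.pfLeftNoSpare(200804) AS LastBoundary  -- 4
     , $PARTITION.pfLeftNoSpare(200805) AS AboveLast;    -- 5

-- RANGE RIGHT: each boundary value belongs to the partition on its right.
SELECT $PARTITION.pfRightNoSpare(200712) AS BelowFirst    -- 1
     , $PARTITION.pfRightNoSpare(200801) AS FirstBoundary -- 2
     , $PARTITION.pfRightNoSpare(200804) AS LastBoundary  -- 5
     , $PARTITION.pfRightNoSpare(200805) AS AboveLast;    -- 5
GO</pre>
<p>Either way there are five partitions for four boundaries, which is why one destination always shows a NULL value in the join even when no next used filegroup is set.</p>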
<p>The query is slightly awkward in that it must pick up the second entry of the list, if it exists; using &#8216;orderings and tops&#8217; will not pull the result we need.</p>
<pre><span style="color:#0000ff;">select </span>FileGroupName, Destination_ID, Data_Space_ID, Name  <span style="color:#0000ff;">from</span>
(
  <span style="color:#0000ff;">select</span>  FG.Name <span style="color:#0000ff;">as</span> FileGroupName
   , dds.destination_id
   , dds.data_space_id
   , prv.value
   , ps.Name
   , RANK() <span style="color:#0000ff;">OVER</span> (<span style="color:#0000ff;">PARTITION BY</span> ps.name <span style="color:#0000ff;">order by</span> dds.destination_Id) <span style="color:#0000ff;">as</span> dest_rank
  <span style="color:#0000ff;">from</span> <span style="color:#339966;">sys.partition_schemes</span> PS
  inner join <span style="color:#339966;">sys.destination_data_spaces </span><span style="color:#0000ff;">as</span> DDS 
    <span style="color:#0000ff;">on</span> DDS.partition_scheme_id = PS.data_space_id
  inner join <span style="color:#339966;">sys.filegroups</span> <span style="color:#0000ff;">as</span> FG 
    <span style="color:#0000ff;">on</span> FG.data_space_id = DDS.data_space_ID 
  left join <span style="color:#339966;">sys.partition_range_values</span> <span style="color:#0000ff;">as</span> PRV 
    <span style="color:#0000ff;">on</span> PRV.Boundary_ID = DDS.destination_id and prv.function_id=ps.function_id 
  <span style="color:#0000ff;">where</span> prv.Value is null
 ) <span style="color:#0000ff;">as</span> a
 <span style="color:#0000ff;">where</span> dest_rank = 2</pre>
<p>Results:</p>
<table border="1">
<tbody>
<tr>
<th>FileGroupName</th>
<th>destination_id</th>
<th>data_space_id</th>
<th>Name</th>
</tr>
<tr>
<td>PartFG5</td>
<td>6</td>
<td>6</td>
<td>psLeftWithNextUsedSet</td>
</tr>
<tr>
<td>PartFG5</td>
<td>6</td>
<td>6</td>
<td>psRightWithSpareNextUsedSet</td>
</tr>
</tbody>
</table>
<p>To test whether it picks up setting the next used, let&#8217;s set it on the partition schemes that did not previously have it.</p>
<pre><span style="color:#0000ff;">ALTER PARTITION SCHEME </span>psLeftNoSpare <span style="color:#0000ff;">NEXT USED </span>[PartFG5]
<span style="color:#0000ff;">ALTER PARTITION SCHEME </span>psRightNoSpare <span style="color:#0000ff;">NEXT USED</span> [PartFG5]</pre>
<p>And re-run the query:</p>
<table border="1">
<tbody>
<tr>
<th>FileGroupName</th>
<th>destination_id</th>
<th>data_space_id</th>
<th>Name</th>
</tr>
<tr>
<td>PartFG5</td>
<td>6</td>
<td>6</td>
<td>psLeftNoSpare</td>
</tr>
<tr>
<td>PartFG5</td>
<td>6</td>
<td>6</td>
<td>psLeftWithNextUsedSet</td>
</tr>
<tr>
<td>PartFG5</td>
<td>6</td>
<td>6</td>
<td>psRightNoSpare</td>
</tr>
<tr>
<td>PartFG5</td>
<td>6</td>
<td>6</td>
<td>psRightWithSpareNextUsedSet</td>
</tr>
</tbody>
</table>
<p>To make it re-usable, I switched the query into a view:</p>
<pre style="padding-left:30px;"><span style="color:#0000ff;">create view NextUseFileGroups</span>
<span style="color:#0000ff;">as</span>
<span style="color:#0000ff;">select </span>FileGroupName, Destination_ID, Data_Space_ID, Name
 <span style="color:#0000ff;">from</span>
 (
   <span style="color:#0000ff;">select</span>  FG.Name <span style="color:#0000ff;">as</span> FileGroupName
    , dds.destination_id
    , dds.data_space_id
    , prv.value
    , ps.Name
    , RANK() <span style="color:#0000ff;">OVER</span> (<span style="color:#0000ff;">PARTITION BY</span> ps.name <span style="color:#0000ff;">order by</span> dds.destination_Id) <span style="color:#0000ff;">as</span> dest_rank
   <span style="color:#0000ff;">from</span> <span style="color:#339966;">sys.partition_schemes</span> PS
   inner join <span style="color:#339966;">sys.destination_data_spaces </span><span style="color:#0000ff;">as</span> DDS
     <span style="color:#0000ff;">on</span> DDS.partition_scheme_id = PS.data_space_id
   inner join <span style="color:#339966;">sys.filegroups</span> <span style="color:#0000ff;">as</span> FG
     <span style="color:#0000ff;">on</span> FG.data_space_id = DDS.data_space_ID 
   left join <span style="color:#339966;">sys.partition_range_values</span> <span style="color:#0000ff;">as</span> PRV
     <span style="color:#0000ff;">on</span> PRV.Boundary_ID = DDS.destination_id and prv.function_id=ps.function_id 
   <span style="color:#0000ff;">where</span> prv.Value is null
 ) <span style="color:#0000ff;">as</span> a
 <span style="color:#0000ff;">where</span> dest_rank = 2</pre>
<p>And a final check with removing the setting &#8211; you can blank a set next used value by specifying no value in the statement.</p>
<pre><span style="color:#0000ff;">ALTER PARTITION SCHEME </span>psLeftNoSpare <span style="color:#0000ff;">NEXT USED</span>
<span style="color:#0000ff;">ALTER PARTITION SCHEME</span> psRightNoSpare <span style="color:#0000ff;">NEXT USED</span></pre>
<p>Select from the view and the two partition schemes / filegroups no longer show up in the list, as intended.</p>
<p>So finding out the &#8216;next used&#8217; setting is possible. There really is no need to find it out in normal operation of the partition window, but as an investigative tool it could be useful.</p>
<p>The scripts were tested on both SQL Server 2005 and 2008, so they are good for both, whether for testing or for using the view.</p>
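<p>As a possible cross-check, and not something that formed part of the tests above, the same answer can be reached without the rank by comparing each destination against the number of partitions the function defines. This sketch assumes the fanout column of sys.partition_functions (the partition count) and relies on the next used filegroup being the one destination numbered beyond it.</p>
<pre>-- Alternative sketch: the next used filegroup is the destination whose
-- ordinal is higher than the number of partitions in the function.
select FG.name as FileGroupName
    , DDS.destination_id
    , DDS.data_space_id
    , PS.name as Name
from sys.partition_schemes as PS
inner join sys.partition_functions as PF
    on PF.function_id = PS.function_id
inner join sys.destination_data_spaces as DDS
    on DDS.partition_scheme_id = PS.data_space_id
inner join sys.filegroups as FG
    on FG.data_space_id = DDS.data_space_id
where DDS.destination_id &gt; PF.fanout</pre>
<p>If both this query and the view return the same rows for a scheme, that is reasonable confirmation of which filegroup the next split will use.</p>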
<br />Posted in SQL Server Tagged: SQL Server 2005, SQL Server 2008, Table Partitioning]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/09/30/how-to-remember-the-next-used-filegroup-in-a-partition-scheme/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>
	</item>
		<item>
		<title>When is MaxDop not MaxDop?</title>
		<link>http://sqlfascination.com/2009/09/27/when-is-maxdop-not-maxdop/</link>
		<comments>http://sqlfascination.com/2009/09/27/when-is-maxdop-not-maxdop/#comments</comments>
		<pubDate>Sun, 27 Sep 2009 17:03:30 +0000</pubDate>
		<dc:creator>Andrew Hogg</dc:creator>
				<category><![CDATA[SQL Server]]></category>
		<category><![CDATA[MaxDop]]></category>
		<category><![CDATA[SQL Server 2005]]></category>
		<category><![CDATA[SQL Server 2008]]></category>

		<guid isPermaLink="false">http://andrewhogg.wordpress.com/?p=3</guid>
		<description><![CDATA[MaxDop is in some sense a bit of a misnomer, in that you would think &#8216;Max Degree of Parallelism&#8217;, set by the system administrator would be the last and final word on the matter; That is your maximum, and there are no choices to be made. However, whilst at the SQL Immersion event in Dublin [...]]]></description>
			<content:encoded><![CDATA[<p>MaxDop is in some sense a bit of a misnomer, in that you would think &#8216;Max Degree of Parallelism&#8217;, set by the system administrator, would be the last and final word on the matter: that is your maximum, and there are no choices to be made.</p>
<p>However, whilst at the <a title="SQL Immersion" href="https://www.eventznet.com/295/ac/prodata/sie09/default.aspx" target="_blank">SQL Immersion</a> event in Dublin hosted by <a title="ProData" href="http://www.prodata.ie">Prodata</a>, I made an offhand comment about increasing the thread count beyond the MaxDop setting whilst creating an online index on an OLTP based system that had MaxDop 1 set.</p>
<p>That gained me some quizzical looks, in that most assume MaxDop is set and has to be adhered to, so surely what I was indicating was not possible? Well, yes it is, and there is even a KB article that covers it and the difference between SQL Server 2000 / 2005 and 2008: <a href="http://support.microsoft.com/default.aspx/kb/329204">http://support.microsoft.com/default.aspx/kb/329204</a></p>
<p>I should also mention that the BoL is less than precise about the situation. For SQL 2008, the BoL states under &#8216;Max Degree of Parallelism Option&#8217;:</p>
<blockquote><p><em> Set <strong>max degree of parallelism</strong> to 1 to suppress parallel plan generation.</em></p></blockquote>
<p>And against the &#8216;Degree of Parallelism Page&#8217;:</p>
<blockquote><p><em>For example, you can use the MAXDOP option to control, by extending or reducing, the number of processors dedicated to an online index operation. In this way, you can balance the resources used by an index operation with those of the concurrent users.</em></p></blockquote>
<p>So that&#8217;s entirely clear&#8230;</p>
<p>You can infer (aside from the KB) that something is not what it seems when it states &#8216;by extending or reducing&#8217;. How can you extend it? That would not make logical sense if it were a hard limit.</p>
<p>So we have a slightly bizarre situation in which you can override the server level setting with your own value. The initial thought is: this is a bit dangerous, isn&#8217;t it?</p>
<p>Would you want the average user or developer to start implementing the appropriate query hint and overriding you? They would get more CPU so they wouldn&#8217;t hesitate (but with no guarantee of better performance, since they would all be doing it).</p>
<p>To perform a test, I used a large partitioned table, since the parallelism that occurs when querying a partitioned table is far easier to predict and to engineer specific scenarios for. The table had ~688 million rows in it and the selection was asking for a simple row count from 2 of the available partitions. The server used was an 8-core machine.</p>
<pre>Select Count(*) From LargePartitionedTable Where MyPartitionColumn in (200801, 200901)</pre>
<p>The baseline case was setting the server to a MaxDop of 1:</p>
<pre>sp_configure 'max degree of parallelism', 1
go
reconfigure
go</pre>
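<p>One practical note when repeating this: &#8216;max degree of parallelism&#8217; is an advanced option, so sp_configure may refuse to change it until advanced options are shown. A minimal sketch of enabling that and then confirming what the server is actually running with:</p>
<pre>-- Make advanced options (including max degree of parallelism) visible.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
GO
-- value is the configured setting; value_in_use is what is currently active.
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = 'max degree of parallelism';
GO</pre>
<p>Checking value_in_use before each test run avoids comparing plans taken under a setting that had not yet been applied.</p>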
<p>The query plan is not particularly complex; the key part is the clustered index scan, shown below with its properties.</p>
<p><img class="size-medium wp-image-22 alignleft" title="MaxDop1QueryPlan" src="http://andrewhogg.files.wordpress.com/2009/09/maxdop1queryplan.png?w=248&h=112" alt="MaxDop1QueryPlan" width="248" height="112" /><img class="alignnone size-full wp-image-23" title="MaxDop1ScanProperties" src="http://andrewhogg.files.wordpress.com/2009/09/maxdop1scanproperties.png?w=600" alt="MaxDop1ScanProperties"   /></p>
<p>The second baseline, MaxDop 0, was then set up and the same query run.</p>
<pre>sp_configure 'max degree of parallelism', 0
go
reconfigure
go</pre>
<p><img class="size-medium wp-image-38 alignleft" title="MaxDop0QueryPlan" src="http://andrewhogg.files.wordpress.com/2009/09/maxdop0queryplan4.png?w=273&h=95" alt="MaxDop0QueryPlan" width="273" height="95" /><img class="alignnone size-medium wp-image-39" title="MaxDop0ScanProperties" src="http://andrewhogg.files.wordpress.com/2009/09/maxdop0scanproperties2.png?w=215&h=273" alt="MaxDop0ScanProperties" width="215" height="273" /></p>
<p>As expected, the query started 8 threads (8-core machine) but only 2 of them processed any work, since the SQL 2005 threading model is to assign 1 thread per partition when more than 1 partition is requested.</p>
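<p>The thread counts here were read from the plan properties, but a rough way to watch the same thing from outside is to count tasks per session whilst the query is running. This is only a sketch, run from a second connection, and it needs the VIEW SERVER STATE permission.</p>
<pre>-- Sessions currently running more than one task, i.e. running in parallel.
SELECT t.session_id
     , COUNT(*) AS task_count
     , COUNT(DISTINCT t.scheduler_id) AS schedulers_used
FROM sys.dm_os_tasks AS t
WHERE t.session_id IS NOT NULL
GROUP BY t.session_id
HAVING COUNT(*) &gt; 1
ORDER BY task_count DESC;</pre>
<p>It shows how many tasks were started for a session, not how much work each one did; for the per-thread row counts the actual execution plan is still the place to look.</p>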
<p>So the base scenarios are in place, now the query is altered slightly to override the MaxDop upwards.</p>
<pre>sp_configure 'max degree of parallelism', 1
go
reconfigure
go
Select Count(*) From tblODS Where TimelineID in (200801, 200802) Option (maxdop 4)</pre>
<p>If the override works, the thread count will go up from 1, and as per the KB article it does.</p>
<p><img class="size-medium wp-image-40 alignleft" title="MaxDop4QueryPlan" src="http://andrewhogg.files.wordpress.com/2009/09/maxdop4queryplan1.png?w=273&h=98" alt="MaxDop4QueryPlan" width="273" height="98" /><img class="alignnone size-full wp-image-41" title="MaxDop4ScanProperties" src="http://andrewhogg.files.wordpress.com/2009/09/maxdop4scanproperties1.png?w=600" alt="MaxDop4ScanProperties"   /></p>
<p>So the MaxDop set by an administrator does not need to be obeyed, but does this apply to everyone? That is really the key issue, and unfortunately it does not fall the way you would wish. After placing SQL Server into Mixed Authentication mode, I created a user account and gave it only permission to select from the table plus the show plan permission, nothing else.</p>
<p>Result? The query went parallel again, indicating that no special permission is required for a user to override the setting.</p>
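<p>For anyone wanting to repeat the permission test, a setup along these lines should be close. The login and user names and the password are hypothetical, the dbo schema is an assumption, and EXECUTE AS is used here as a shortcut instead of logging in separately as I did.</p>
<pre>-- Hypothetical low-privilege account; run the CREATE USER / GRANT part
-- in the database that holds the table.
CREATE LOGIN MaxDopTestLogin WITH PASSWORD = 'Use-A-Str0ng-Passw0rd-Here!';
GO
CREATE USER MaxDopTestUser FOR LOGIN MaxDopTestLogin;
GRANT SELECT ON dbo.tblODS TO MaxDopTestUser;
GRANT SHOWPLAN TO MaxDopTestUser;
GO
-- Run the hinted query as that user, then switch back.
EXECUTE AS USER = 'MaxDopTestUser';
SELECT COUNT(*) FROM dbo.tblODS WHERE TimelineID IN (200801, 200802) OPTION (MAXDOP 4);
REVERT;
GO</pre>
<p>If the plan for the hinted query still shows parallel operators under this account, the override is not tied to any elevated permission.</p>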
<p>The facility to override it is useful; on an OLTP system where you want to increase indexing speed by using more threads it is essential. But not requiring a granted permission for the service accounts / jobs performing those tasks seems slightly bizarre.</p>
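<p>The indexing case mentioned above would look something like the following sketch. The index name is hypothetical and the table and column are borrowed from the test query; online index operations also depend on the edition of SQL Server in use.</p>
<pre>-- Build the index online with up to 4 threads, regardless of the
-- server-level max degree of parallelism setting.
CREATE NONCLUSTERED INDEX IX_tblODS_TimelineID
    ON dbo.tblODS (TimelineID)
    WITH (ONLINE = ON, MAXDOP = 4);</pre>
<p>The MAXDOP index option applies only to this one operation, so the server-wide setting of 1 stays in place for the OLTP workload.</p>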
<p>These tests were performed using SQL 2005, so I need to test SQL 2008 to see if the lack of controls continues to exist.</p>
<br />Posted in SQL Server Tagged: MaxDop, SQL Server 2005, SQL Server 2008]]></content:encoded>
			<wfw:commentRss>http://sqlfascination.com/2009/09/27/when-is-maxdop-not-maxdop/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/8215e290861f1c44a457d26c4f24af70?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">andrewhogg</media:title>
		</media:content>

		<media:content url="http://andrewhogg.files.wordpress.com/2009/09/maxdop1queryplan.png?w=300" medium="image">
			<media:title type="html">MaxDop1QueryPlan</media:title>
		</media:content>

		<media:content url="http://andrewhogg.files.wordpress.com/2009/09/maxdop1scanproperties.png" medium="image">
			<media:title type="html">MaxDop1ScanProperties</media:title>
		</media:content>

		<media:content url="http://andrewhogg.files.wordpress.com/2009/09/maxdop0queryplan4.png?w=300" medium="image">
			<media:title type="html">MaxDop0QueryPlan</media:title>
		</media:content>

		<media:content url="http://andrewhogg.files.wordpress.com/2009/09/maxdop0scanproperties2.png?w=236" medium="image">
			<media:title type="html">MaxDop0ScanProperties</media:title>
		</media:content>

		<media:content url="http://andrewhogg.files.wordpress.com/2009/09/maxdop4queryplan1.png?w=300" medium="image">
			<media:title type="html">MaxDop4QueryPlan</media:title>
		</media:content>

		<media:content url="http://andrewhogg.files.wordpress.com/2009/09/maxdop4scanproperties1.png" medium="image">
			<media:title type="html">MaxDop4ScanProperties</media:title>
		</media:content>
	</item>
	</channel>
</rss>
