Archive

Archive for January, 2011

SQL Server Denali – Sequences

January 9, 2011 Leave a comment

Another new programmability feature added in the Denali CTP is ‘Sequences’ – a concept very familiar to those of us who already deal with Oracle, but an unusual addition for SQL Server and one that makes me scratch my head thinking – why? We already have the identity column feature available to us within SQL Server but not available within Oracle, thus the need for sequences in Oracle. When using any identity / numeric key / FK mechanism it is important that the actual value for this identity has no actual relation to the data it represents other than an arbitrary number representing the row, if anything it would be a bad design to rely on the identity value be in sequence, or contiguous in any way. In SQL Server it is not guaranteed to be contiguous at all – transaction rollback or a value other than 1 for the increment for example will prevent it.

Sequences are primarily for when you wish to know a number in advance of using it, or perhaps you wish to use the same number for a number of records spread across tables (and thus relate them using that number.)

Looking at the syntax you can see that the SQL Server and Oracle Syntax are very similar and share the same keywords. 

SQL Server Syntax:

CREATE SEQUENCE [schema_name . ] sequence_name [ [ ,…n ] ] [ ; ]
::= {[ AS { built_in_integer_type | user-defined_integer_type}]    
START WITH        
INCREMENT BY        
{ MINVALUE | NO MINVALUE }        
{ MAXVALUE | NO MAXVALUE }        
{ CYCLE | NO CYCLE }        
{ CACHE [ ] | NO CACHE } }

Oracle Syntax:

CREATE SEQUENCE INCREMENT BY
START WITH
MAXVALUE / NOMAXVALUE
MINVALUE / NOMINVALUE
CYCLE / NOCYCLE
CACHE <#> / NOCACHE
ORDER / NOORDER 

Most of the keywords are self-explanatory, and from a comparison of syntax you can see that SQL Server and Oracle are pretty similar in terms of the syntax.

Most of the keywords are pretty self-explanatory, the one that makes me cringe the most is CYCLE – It’s bad enough using a sequence number instead of an identity, but even worse when you consider that it may not be unique. The advice there is to create an additional unique index on the field to prevent an insertion / updated from taking a duplicate – but that seems like a bit of a ‘fudge’, and instead of solving the real problem, works around it.

To add to the weirdness of the construct, you can even ask for a sequence based on an OVER clause, using the adventure works database as an example I created a sequence:

create sequence testSequence as integer
start with 1
increment by 1
minvalue 1
maxvalue 10000

And then used it within a select statement as follows:

select next value for testSequence over (order by Name asc) as id, Name
from Production.Product

The results come back:

1 Adjustable Race 2 All-Purpose Bike Stand 3 AWC Logo Cap 4 BB Ball Bearing ...

In case you were thinking that was relatively useful, when you re-run the command, you of course are returned a different set of numbers, as the sequence does not restart, making this one of the weirdest features I have seen.

505 Adjustable Race 506 All-Purpose Bike Stand 507 AWC Logo Cap 508 BB Ball Bearing ...

If you attempt to place the order on the outside in the following manner, SQL Server will just throw an error. 

select next value for testSequence as id, Name
from Production.Product
order by Name asc
Msg 11723, Level 15, State 1, Line 1
NEXT VALUE FOR function cannot be used directly in a statement that contains 
an ORDER BY clause unless the OVER clause is specified.

And to round off the errors you can expect to see when using this, when you run the sequence out of values, you will get:

Msg 11728, Level 16, State 1, Line 1
The sequence object 'testSequence' has reached its minimum or maximum value. 
Restart the sequence object to allow new values to be generated.

Try create a sequence based on a numeric or decimal with some scale, such as numeric(6,2):

Msg 11702, Level 16, State 2, Line 1
The sequence object 'testSequence' must be of data type int, bigint, smallint,
tinyint, or decimal or numeric with a scale of 0, or any user-defined data type
that is based on one of the above integer data types.

Or if you fail to get your starting value within the min and max boundaries you are setting:

Msg 11703, Level 16, State 1, Line 1
The start value for sequence object 'testSequence' must be between the minimum
and maximum value of the sequence object.

Overall sequences remain a bit of a niche feature for me in SQL Server, I just can not see any normal everyday activity needing to use them, although it would make porting of applications between Oracle and SQL Server a bit easier since they will both be able to use them.

In that kind of situation though I would still prefer the GUID mechanisms that we have available to us. They have the same benefits of being able to know a record ID in advance of using it as well as the ability to be stored in either database. It also has the added advantage of being able to be created whilst offline from the database, something a sequence can not do.

Which User Made That Change?

January 8, 2011 4 comments

If you have spent any time tinkering about in the transaction log, you will of already come across a bit of a problem when trying to decide what was done and by whom – the ‘what part’ I have decoded in a few posts, but the ‘whom’ part is a lot harder. As far as I can tell the log only contains the SPID of the user who opened the transaction, and does not give us any indication as to who that user really was.

From an actual investigative perspective this is a bit of a painful exercise, I can see a row was deleted but to find out who / what did that actual deletion I would have to start examining either the SQL Server logs or the Windows Server Logs. The default behaviour of SQL Server security though is to only log failed login attempts so the successful ones will not show up by default – to get those appearing you need to change your SQL Server security settings. You can access these logs from the SQL management studio using either the xp_readerrorlogs or sp_readerrorlogs procedures although the nature of the log and textural values make it difficult to then combine in a set based manner – I can humanly read the values but machine reading them for any purpose is a bit of a pain – there is also the issue that those logs will be cycled – and the old logs could well be completely offline.

So I would prefer an easier solution, keeping a record of the logins within the database regardless of the SQL Server security settings, and being in a form that allows me to use a bit more of a set based solution against it. To start with, we will need a table to store the information available to us during the logon process:

create table master.dbo.spidArchive (
 LoginTime  datetime2(7)
 ,SPID   integer
 ,ServerName  nvarchar(100)
 ,LoginName  nvarchar(100)
 ,LoginType  nvarchar(100)
 ,LoginSID  nvarchar(100)
 ,ClientHost nvarchar(100)
 ,IsPooled  tinyint
)

The spidArchive table here is created in the master database so that it can cover the connections for any of the databases. You can see we have access to a lot of useful information, not just who executed the command, but from which machine they logged in from. The next step is to get SQL Server to add a row to the table every time a login occurs – from SQL Server 2005 onwards we have had access to DDL triggers as well as DML triggers and have the ability to intercept a number of non-DML events.

create trigger spidLogin on all server
after logon
as
 declare @eventdata xml;
 set @eventdata = EVENTDATA();

 INSERT INTO master.dbo.spidArchive
 (
  LoginTime
  ,SPID   
  ,ServerName  
  ,LoginName  
  ,LoginType  
  ,LoginSID  
  ,ClientHost 
  ,IsPooled
 )
 VALUES 
 (
  @eventdata.value('(/EVENT_INSTANCE/PostTime)[1]','datetime2(7)')
  ,@eventdata.value('(/EVENT_INSTANCE/SPID)[1]','nvarchar(100)')
  ,@eventdata.value('(/EVENT_INSTANCE/ServerName)[1]','nvarchar(100)')
  ,@eventdata.value('(/EVENT_INSTANCE/LoginName)[1]','nvarchar(100)')
  ,@eventdata.value('(/EVENT_INSTANCE/LoginType)[1]','nvarchar(100)')
  ,@eventdata.value('(/EVENT_INSTANCE/SID)[1]','nvarchar(100)')
  ,@eventdata.value('(/EVENT_INSTANCE/ClientHost)[1]','nvarchar(100)')
  ,@eventdata.value('(/EVENT_INSTANCE/IsPooled)[1]','tinyint')
 )

During the login process, the EventData() function returns a fixed format XML fragment from which we can extract the values we seek and simply insert into our spidArchive table. Now we have a log being taken of all connections being established to the server, we can start using this to translate from a SPID to a user, even when the user is no longer connected – as long as we know the SPID and the time, we just need to look for the closest entry in the past for that SPID, and that will indicate which user was currently logged on at the time. This function should go in the master database again.

CREATE FUNCTION dbo.ConvertSpidToName(@SPID integer, @Date datetime2(7)) RETURNS nvarchar(100) AS
BEGIN
 DECLARE @name nvarchar(100)
 SELECT TOP(1) @name = LoginName
 FROM master.dbo.spidArchive
 WHERE SPID = @SPID AND LoginTime <= @Date
 ORDER BY LoginTime DESC;
 RETURN @name;
END

This function just performs the logic stated above  and converts the SPID and DateTime into the login name for the user. Once this infrastructure is in place we can now directly use that in a call to ::fn_dblog(null,null) to translate the SPID column

select master.dbo.ConvertSpidToName(log.SPID, log.[Begin Time]) as UserName, log.* from ::fn_dblog(null,null) log

What you will notice is that for the majority of log lines, there is no user name displayed – this is because the SPID is only recorded against the LOP_BEGIN_XACT entry, the beginning of the transaction. This doesn’t really present a problem, from previous experiments we know all the entries for an individual transaction are given a unique Tansaction ID which we can use to group them together. It becomes pretty trivial to join back to the log, and connect any transaction entries to the LOP_BEGIN_XACT record and produce the name on every row possible.

select master.dbo.ConvertSpidToName(log2.SPID, log2.[Begin Time]) as UserName, log.*
from ::fn_dblog(null,null) log
left join ::fn_dblog(null,null) log2 on log.[Transaction ID] = log2.[Transaction ID] and log2.Operation = 'LOP_BEGIN_XACT'

So overall it is not too hard to get the log entries attributed to the accounts that generated them.

A couple of final notes / caveats:

  • If your application is using a trusted sub-system approach this of course will not work as a technique, since all the users will be logged into the application through an internal mechanism (such as a users table) and then the application service connects using it’s own credentials – always a good thing since then the user’s have no direct access to the database. In that kind of situation this is of no value, every connection will be shown up as the same user/ source.
  • Within my code I chose to use datetime2(7), to be as accurate as possible on the connections and timings, you could drop to just datetime for SQL Server 2005 but with only 1/300ths of a second accuracy there is a chance on a very busy server that you could see two entries for a single SPID at the same datetime – which would pose a bit of a problem.
  • The spidArchive table can not be allowed to grow unconstrained – I have not included anything here for clearing down the table, but it is not difficult to conceive of it being archived off, or cleaned up weekly via a SQL Agent job.