|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Stored Procedure Problem (Bug?)We're using SQL Server 2000 SP3 on Windows Server 2003 (no SP) with 24 Windows XP Pro SP2 clients. We have a stored procedure which allocates sequence numbers from a table. It's called frequently, and concurrently, by code running on clients and on the server. Normally, it works fine but very occasionally the stored procedure will fail. When it fails, it returns a "lock request timeout" error immediately, with no delay. This happens even though the connection being used has a lock timeout of 30-240 seconds. We have Profiler traces logged for several of the times the failure has occurred. Each time, two clients called the SP at exactly the same time, to the millisecond. Does anyone know if this is a known problem that is fixed by SQL Server 2000 SP4? Regards Gary Gary Harding (nospam@please.com) writes:
Show quote > We're using SQL Server 2000 SP3 on Windows Server 2003 (no SP) with 24 It's not an issue that I recognize, and I would suspect that this is> Windows XP Pro SP2 clients. > We have a stored procedure which allocates sequence numbers from a > table. It's called frequently, and concurrently, by code running on > clients and on the server. > Normally, it works fine but very occasionally the stored procedure will > fail. When it fails, it returns a "lock request timeout" error > immediately, with no delay. > This happens even though the connection being used has a lock timeout of > 30-240 seconds. > We have Profiler traces logged for several of the times the failure has > occurred. Each time, two clients called the SP at exactly the same time, > to the millisecond. > Does anyone know if this is a known problem that is fixed by SQL Server > 2000 SP4? an issue in your application. When you say that the connection has a lock time of 30-240 seconds how do you set it? Keep in mind that on the connection on client level, you can set a command timeout, and if you don't set it, it is 30 seconds by default, and a timeout of 0 means "no timeout". A lock timeout is set on the SQL Server side. This timeout has a default of -1, meaning "no timeout", and 0 means "timeout immediately". This timeout is measured in milliseconds, so if you say "SET LOCK_TIMEOUT 30" this means a 30 ms timeout. You can of course issue a "SET LOCK_TIMEOUT 30000" as soon as you connect, in which case the setting will be effect for the rest of the connection. But the setting can be overridden in a stored procedure. So I would suggest that you verify that: 1) You are indeed setting the lock timeout, and not command timeout. 2) That oy are actually setting the lock timeout to a 5-6 digit number. 3) There are no local overrisds of SET LOCK_TIMEOUT in your stored procedures. -- Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx In additional to Erlands comments make sure you are owner qualifying the sp
when you call it. -- Show quoteAndrew J. Kelly SQL MVP "Gary Harding" <nospam@please.com> wrote in message news:Bc6dndoNSvbQyuvenZ2dnUVZ8qGdnZ2d@pipex.net... > Hello > > We're using SQL Server 2000 SP3 on Windows Server 2003 (no SP) with 24 > Windows XP Pro SP2 clients. > We have a stored procedure which allocates sequence numbers from a table. > It's called frequently, and concurrently, by code running on clients and > on the server. > Normally, it works fine but very occasionally the stored procedure will > fail. > When it fails, it returns a "lock request timeout" error immediately, with > no delay. > This happens even though the connection being used has a lock timeout of > 30-240 seconds. > We have Profiler traces logged for several of the times the failure has > occurred. > Each time, two clients called the SP at exactly the same time, to the > millisecond. > Does anyone know if this is a known problem that is fixed by SQL Server > 2000 SP4? > > Regards > Gary > > Erland, Andrew
Thanks for your help. All our connections do indeed have "SET LOCK_TIMEOUT 30000" executed straight after the connect. The connection used for the stored procedure later has the setting increased to "240000". (This is to avoid contention for reasons unconnected with acquiring the sequence number). The SP itself does not amend the lock timeout. At no point do we set lock timeouts as low as the few milliseconds that the SP takes to fail, when it fails. In addition, we normally set client-side command timeouts to zero, i.e. infinite. We call the SP through ADO's Command.Execute mechanism, so I'm not sure if it has owner qualification. Could that be an issue? For further information, the SP starts a transaction, takes an update lock on the required row, updates the row then commits (or rolls back on error). I now appreciate that this can be done more efficiently, after seeing some sample code while searching for other people's experiences in this area. But I suspect that something more fundamental is causing the error. Regards Gary What command type are you using? Make sure it is a commandtype of Stored
Procedure so that the call is an RPC call and not a Batch. You can see which is being done via profiler. Also if when you set the command object witht he name of the sp if you don't have it owner qualified there then it is not getting qualified. It should be something like this: dbo.YourSp Why does the sp take an update lock? You should be able to generate a new sequence number and update the row in a single statement making the Explicit transaction unnecessary altogether. If you post the code we can probably make some suggestions. But in any case I think you need more info to get to the root of the issue. You may want to collect some perfmon, locking and profiler data to see where you stand. -- Show quoteAndrew J. Kelly SQL MVP "Gary Harding" <nospam@please.com> wrote in message news:r8edndiEC_PwgerenZ2dnUVZ8qqdnZ2d@pipex.net... > Erland, Andrew > > Thanks for your help. > All our connections do indeed have "SET LOCK_TIMEOUT 30000" executed > straight after the connect. > The connection used for the stored procedure later has the setting > increased to "240000". > (This is to avoid contention for reasons unconnected with acquiring the > sequence number). > The SP itself does not amend the lock timeout. > At no point do we set lock timeouts as low as the few milliseconds that > the SP takes to fail, when it fails. > In addition, we normally set client-side command timeouts to zero, i.e. > infinite. > We call the SP through ADO's Command.Execute mechanism, so I'm not sure if > it has owner qualification. > Could that be an issue? > For further information, the SP starts a transaction, takes an update lock > on the required row, updates the > row then commits (or rolls back on error). I now appreciate that this can > be done more efficiently, after > seeing some sample code while searching for other people's experiences in > this area. But I suspect that > something more fundamental is causing the error. > > Regards > Gary > > Andrew
The CommandType is adStoredProc, and Profiler shows the call is being made as an RPC. The exec doesn't have the "dbo." prefix, though I can't see how this would matter overly. The SP works most of the time, even when called by two clients at exactly the same time. It's just a few of these simultaneous calls that are failing in the manner described. The stored procedure has been used without problems under NT/SQL 7 for several years. We've only seen these problems since migrating our system to Server 2003/XP/SQL 2000. But it's possible that the SP is being called simultaneously more now than it was in the past. The updated row is only ever locked momentarily, as each caller uses a fresh connection. Unfortunately, we don't really have time to gather detailed information on the failures. Instead, we'll probably install a workaround which will... a) Retry on failure so that the simultaneity is no longer a factor. b) Rework the stored procedure so it doesn't use a transaction. c) Add that "dbo." prefix to the call. Thanks for your help, but I only raised the problem in the hope it was a known "feature". Regards Gary Gary Harding (nospam@please.com) writes:
> The exec doesn't have the "dbo." prefix, though I can't see how this would See http://support.microsoft.com/kb/q263889/.> matter overly. -- Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx Erland
Most interesting! It looks like MSDN article 263889 *might* describe the problem we're having. Though the failure does seem to occur within our SP rather than at the compile stage. The SP is returning a code that corresponds to 1222 (lock timeout period exceeded). That return code mapping can only take place if the SP gets executed. I imagine if there's Compile-lock blocking the procedure wouldn't be executed at all. It's difficult to check because the problem only occurs in the operational environment. And as it's a defence system, the SP text is encrypted making it hard to trace properly. In any case, I shall certainly be adding the "dbo." prefix to all stored procedure calls. Thanks for all the tips Gary Harding Gary Harding (nospam@please.com) writes:
> It looks like MSDN article 263889 *might* describe the problem we're Well, this is a little blurred, since a stored procedure can be recompiled> having. > Though the failure does seem to occur within our SP rather than at the > compile stage. > The SP is returning a code that corresponds to 1222 (lock timeout period > exceeded). > That return code mapping can only take place if the SP gets executed. > I imagine if there's Compile-lock blocking the procedure wouldn't be > executed at all. in the middle of execution. This frequently happens if temp tables are involved. Then again, if the return value of the procedure is 1222, because you have something like: SELECT @err = @@error IF @err <> 0 RETURN @err A lock-timeout that happens during recompilation would not cause 1222 to be returned, as errors during recompilation aborts execution of the procedure directly. Then if SP1 calls SP2 this could happen. Provided that is, that for some reason compile errors would lead to immediate lock-timeout. Also, if you are using the Profiler and add lock timeouts to the events, you will see a lot more lock timeouts that really happens. This is apparently used internally for some stuff. -- Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx Erland
Thanks again for your continued assistance. The stored procedure doesn't use temporary tables, so recompilation may not be an issue. I'm rather stymied by an inability to reproduce the problem reliably in our development or test environments. I will try to obtain detailed Profiler traces at times when the problem seems more likely to occur. I can't post the SP code, but it looks like this... create procedure fred @input int, @output1 int, @output2 int as declare @err int, @rowcount int, @ret int set nocount on begin transaction save transaction blah -- table1 provides a fixed component of the sequence number select @output1 = column1 from table1 with(nolock) <error check> -- table2 provides the incrementing part of the sequence number select @output2 = column2 from table2 with(updlock,rowlock) where PKcolumn = @input <error check> update table2 with(rowlock) set column2 = column2 + 1 where PKcolumn = @input <error check> commit transaction <error check> set @ret = 0 return @ret The <error check> is basically... select @err = @@error, @rowcount = @@rowcount if @err <> 0 goto ERROR_CHECK if @rowcount <> 1 begin set @ret = 4 goto ROLL_BACK end At ERROR_CHECK we map SQL errors (1222, etc.) to our own return codes (1,2,3,4). At ROLL_BACK we execute: rollback transaction blah, commit transaction. The SP is returning error code 2, which corresponds to 1222 (lock timeout). If anything looks likely to cause the problem then shout, but nothing seems obviously wrong. I *do* appreciate we can do without the transaction and the select-for-update by instead using... update table2 with(rowlock) set @output2 = column2 = column2 + 1 where PKcolumn = @input Regards Gary Gary,
Obviously this is not the actual sp code but I have a few comments that should help. First off I see no need for an Explicit Transaction much less a Save Tran. The first part of the sp is simply a Select and does not need to be inside the Begin - Commit. The second part gets a value and updates a table with the incremented value. This can all be done with a single statement and as such does not need an explicit tran either. Every statement is ATOMIC in and of itself so it will work or it wont. No need to wrap it in a tran. There also should be no need for the locking hints. If you have a proper index (PK in the case of the second table) it will only lock one row anyway. So it should look more like this: -- table1 provides a fixed component of the sequence number select @output1 = column1 from table1 <error check> -- table2 provides the incrementing part of the sequence number -- and updates the table UPDATE table2 SET @output2 = column2 = (column2 + 1) WHERE PKcolumn = @input <error check> Here is a simple example of how you can generate generic ID's. CREATE TABLE [dbo].[NEXT_ID] ( [ID_NAME] [varchar] (20) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL , [NEXT_VALUE] [int] NOT NULL , CONSTRAINT [PK_NEXT_ID_NAME] PRIMARY KEY CLUSTERED ( [ID_NAME] ) WITH FILLFACTOR = 100 ON [PRIMARY] ) ON [PRIMARY] GO CREATE PROCEDURE get_next_id @ID_Name VARCHAR(20) , @ID int OUTPUT AS SET NOCOUNT ON UPDATE NEXT_ID SET @ID = NEXT_VALUE = (NEXT_VALUE + 1) WHERE ID_NAME = @ID_Name -- Show quoteAndrew J. Kelly SQL MVP "Gary Harding" <nospam@please.com> wrote in message news:ELCdnfHSz_2U5ebenZ2dnUVZ8qqdnZ2d@pipex.net... > Erland > > Thanks again for your continued assistance. > The stored procedure doesn't use temporary tables, so recompilation may > not be an issue. > I'm rather stymied by an inability to reproduce the problem reliably in > our development or test environments. > I will try to obtain detailed Profiler traces at times when the problem > seems more likely to occur. > > I can't post the SP code, but it looks like this... > > create procedure fred @input int, @output1 int, @output2 int as > declare @err int, @rowcount int, @ret int > set nocount on > begin transaction > save transaction blah > -- table1 provides a fixed component of the sequence number > select @output1 = column1 from table1 with(nolock) > <error check> > -- table2 provides the incrementing part of the sequence number > select @output2 = column2 from table2 with(updlock,rowlock) where PKcolumn > = @input > <error check> > update table2 with(rowlock) set column2 = column2 + 1 where PKcolumn = > @input > <error check> > commit transaction > <error check> > set @ret = 0 > return @ret > > The <error check> is basically... > select @err = @@error, @rowcount = @@rowcount > if @err <> 0 goto ERROR_CHECK > if @rowcount <> 1 begin set @ret = 4 goto ROLL_BACK end > > At ERROR_CHECK we map SQL errors (1222, etc.) to our own return codes > (1,2,3,4). > At ROLL_BACK we execute: rollback transaction blah, commit transaction. > The SP is returning error code 2, which corresponds to 1222 (lock > timeout). > > If anything looks likely to cause the problem then shout, but nothing > seems obviously wrong. > I *do* appreciate we can do without the transaction and the > select-for-update by instead using... > update table2 with(rowlock) set @output2 = column2 = column2 + 1 where > PKcolumn = @input > > Regards > Gary > > |
|||||||||||||||||||||||