|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Performance ProblemHi,
I am having a table which consists of 8 lakh records in it. I have removed 5 lakh rows from the table which are unused. Before removing the rows the pages are 8943 and after deleting the 5 lakh rows in the table the pages are 4520. After deleting the rows i have done the index fragmentation using DBCC INDEXDEFRAG and now the number of pages decreased to 3940. But still my query is taking time for scanning the table. The performance hasn't changed. Could anybody please help whether i have done anything wrong in the process of deleting the rows in the table. Thanks, Ramnadh. Ok, after discovering what "lakh" means (thanks, wikipedia.org!), I still
would like to see the DDL and some sample data - less than 1 lakh, if you please. ;) Where are your indexes? What does the query look like? ML ramnadh nalluri (ramnadh.nall***@gmail.com) writes:
> I am having a table which consists of 8 lakh records in it. I'm afraid that this question is not answerable without:> I have removed 5 lakh rows from the table which are unused. > Before removing the rows the pages are 8943 and after deleting the 5 > lakh rows in the table the pages are 4520. After deleting the rows i > have done the index fragmentation using DBCC INDEXDEFRAG > and now the number of pages decreased to 3940. But still my query is > taking time for scanning the table. The performance hasn't changed. > > Could anybody please help whether i have done anything wrong in the > process of deleting the rows in the table. o CREATE TABLE and CREATE INDEX statments for the table(s) in question. o The query you are using. o Explanation of what a "lakh" is. o The query plan for the query. (Precede the query with SET STATISTICS PROFILE ON). As a shot in the dark, try UPDATE STATISTICS WITH FULLSCAN. -- Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se Books Online for SQL Server SP3 at http://www.microsoft.com/sql/techinfo/productdoc/2000/books.asp Hi,
Sorry for not giving the proper information. The below are my tables (Message,MessageRecipient) schemas. CREATE TABLE [dbo].[Message] ( [Id] [int] IDENTITY (1, 1) NOT NULL , [SenderId] [int] NULL , [SentOn] [datetime] NOT NULL , [Subject] [Title] NOT NULL , [PriorityTypeId] [int] NULL , [TypeId] [int] NOT NULL , [Message] [text] NULL , [SentFlag] [bit] NULL , [SentState] [int] NULL ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY] GO ALTER TABLE [dbo].[Message] WITH NOCHECK ADD CONSTRAINT [PK_Message] PRIMARY KEY CLUSTERED ( [Id] ) WITH FILLFACTOR = 90 ON [PRIMARY] GO ALTER TABLE [dbo].[Message] ADD CONSTRAINT [DF__Message__SentFla__29C39D7E] DEFAULT (0) FOR [SentFlag] GO CREATE INDEX [IX_Message_SentState] ON [dbo].[Message]([SentState]) WITH FILLFACTOR = 90 ON [PRIMARY] GO CREATE TABLE [dbo].[MessageRecipient] ( [MessageId] [int] NOT NULL , [RecipientId] [int] NULL , [ExtRecipient] [Email] NULL , [BelongsTo] [bit] NULL , [IsNew] [bit] NOT NULL , [FolderId] [int] NULL , [MessageState] [int] NULL ) ON [PRIMARY] GO CREATE CLUSTERED INDEX [CI_MessageRecipient_RecipientId] ON [dbo].[MessageRecipient]([RecipientId]) WITH FILLFACTOR = 90 ON [PRIMARY] GO ALTER TABLE [dbo].[MessageRecipient] ADD CONSTRAINT [DF_MessageRecipient_IsNew] DEFAULT (1) FOR [IsNew], CONSTRAINT [DF_MessageRecipient_MessageState] DEFAULT (887) FOR [MessageState] GO CREATE INDEX [IX_MessageRecipient_MessageId] ON [dbo].[MessageRecipient]([MessageId]) WITH FILLFACTOR = 90 ON [PRIMARY] GO CREATE INDEX [IX_MessageRecipient_MessageState] ON [dbo].[MessageRecipient]([MessageState]) WITH FILLFACTOR = 90 ON [PRIMARY] GO Prior deleting the rows from Message table it is having 800,000 rows (1 lakh = 100,000). MessageRecipient is having 500,000 rows. I found there are 500,000 invalid rows in the Message table, i deleted those rows in the Message table. Before deleting the rows in the message table the rows are arranged in above 8000 pages. After deleting those rows those are arranged in 4120 pages. I thought of index fragmentation and i have done the index fragmentation on Message table using DBCC INDEXDEFRAG. After doing fragmentation message table rows are arranged in 3900 pages. After deleting above 500,000 rows from the table also my query doesn't increased much. What may be the reason. Have i done anything wrong in deleting those rows in the table.Could you please help me. This is my query SELECT MSGS.Id , MSGS.SentOn , MSGS.Subject , MSGR.MessageState , MSGS.Message , MSGR.FolderId , MSGR.IsNew FROM dbo.Message MSGS (NOLOCK) INNER JOIN dbo.MessageRecipient MSGR (NOLOCK) ON MSGS.Id = MSGR.MessageId AND MSGR.MessageState <> 891 AND MSGR.RecipientId = 32 ORDER BY MSGS.SentOn DESC This is the execution plan i got while executing this query |--Compute Scalar(DEFINE:([MSGS].[Message]=[MSGS].[Message])) MERGE:([MSGS].[Id])=([MSGR].[MessageId]),|--Sort(ORDER BY:([MSGS].[SentOn] DESC)) |--Merge Join(Inner Join, RESIDUAL:([MSGR].[MessageId]=[MSGS].[Id])) |--Clustered Index Scan(OBJECT:([Viper511].[dbo].[Message].[PK_Message] AS [MSGS]), ORDEREDFORWARD) |--Sort(ORDER BY:([MSGR].[MessageId] ASC)) Seek(OBJECT:([Viper511].[dbo].[MessageRecipient].[CI_MessageRecipient_Re|--Clustered Index cipientId] AS [MSGR]), SEEK:([MSGR].[RecipientId]=32), WHERE:([MSGR].[MessageState]<891 OR [MSGR].[MessageState]>891) ORDERED FORWARD) Thanks in advance for the help. Ramnadh. *** Sent via Developersdex http://www.developersdex.com *** MessageState <> 891
....might cause trouble. Try moving it to the where clause. Also consider analyzing the actual values of the MessageState column: are there any values that are larger than 891? There's also no primary key on the MessageRecipient table. Other than the above, the execution plan looks ok. How many data files are used by the primary file group? ML 1. 887,888,889,890,891,892 are the only possible values for the
MessageState.what is the difference between having MessageState <> 891 condition while joining the table and keeping condition in where clause. Sql query analyzer will build the best possible query execution plan right. I checked it by keeping the MessageState <> 891 in the where clause but there is no difference in the performance or in the query plan. 2. The MessageId,RecipientId,ExtRecipient combination will be unique. So i can declare Primary key on the combination of MessageId,RecipientId,ExtRecipient columns. But in the MessageRecipient table RecipientId and ExtRecipient are NULL able columns. Is it feasible to keep composite of three columns which are having nullable values as primary key ? Present we are having Clustered Index on RecipientId column, which is a nullable column. When i see the execution plan for this query it is doing Clustered Index Scan on all the rows of the Message Table. It is taking 92% of the query time. We haven't created the filegroups for the files. There is only one default filegroup for the database. *** Sent via Developersdex http://www.developersdex.com *** One reason that you don't see a big performance improved might be the
combination of the following two observations: 1) Your current query plan looks very efficient 2) The Message table occupied some 8000 pages. I'll assume that MessageRecipient occupied some 5000 pages. Together it takes SQL-Server only 100 MByte to keep the entire tables in memory. If that is the case, then removing rows will not improve the performance with this query plan The real question is: what performance are you currently getting, and what performance do you consider acceptable? What are the current statistics for the query? How many seconds does it take to run (elapsed time), how much CPU is used, and how many physical reads are performed? How many rows are returned by the query, and is the result used on the database server, or transferred over a network and processed by a client system? HTH, Gert-Jan ramnadh nalluri wrote: [snip] Show quote > Prior deleting the rows from Message table it is having > 800,000 rows (1 lakh = 100,000). MessageRecipient is having 500,000 > rows. I found there are 500,000 invalid rows in the Message table, i > deleted those rows in the Message table. Before deleting the rows in the > message table the rows are arranged in above 8000 pages. After deleting > those rows those are arranged in 4120 pages. I thought of index > fragmentation and i have done the index fragmentation on Message table > using DBCC INDEXDEFRAG. After doing fragmentation message table rows are > arranged in 3900 pages. After deleting above 500,000 rows from the table > also my query doesn't increased much. What may be the reason. Have i > done anything wrong in deleting those rows in the table.Could you please > help me. > > This is my query > > SELECT MSGS.Id > , MSGS.SentOn > , MSGS.Subject > , MSGR.MessageState > , MSGS.Message > , MSGR.FolderId > , MSGR.IsNew > FROM dbo.Message MSGS (NOLOCK) > INNER JOIN dbo.MessageRecipient MSGR (NOLOCK) > ON MSGS.Id = MSGR.MessageId > AND MSGR.MessageState <> 891 > AND MSGR.RecipientId = 32 > ORDER BY MSGS.SentOn DESC > > This is the execution plan i got while executing this query > > |--Compute Scalar(DEFINE:([MSGS].[Message]=[MSGS].[Message])) > |--Sort(ORDER BY:([MSGS].[SentOn] DESC)) > |--Merge Join(Inner Join, > MERGE:([MSGS].[Id])=([MSGR].[MessageId]), > RESIDUAL:([MSGR].[MessageId]=[MSGS].[Id])) > |--Clustered Index > Scan(OBJECT:([Viper511].[dbo].[Message].[PK_Message] AS [MSGS]), ORDERED > FORWARD) > |--Sort(ORDER BY:([MSGR].[MessageId] ASC)) > |--Clustered Index > Seek(OBJECT:([Viper511].[dbo].[MessageRecipient].[CI_MessageRecipient_Re > cipientId] AS [MSGR]), SEEK:([MSGR].[RecipientId]=32), > WHERE:([MSGR].[MessageState]<891 OR [MSGR].[MessageState]>891) ORDERED > FORWARD) > > Thanks in advance for the help. > > Ramnadh. > > *** Sent via Developersdex http://www.developersdex.com *** Thanks for the reply.
After deleting the 500,000 rows in the message table and doing index fragmentation the message table occupied only 3940 Pages. The performance i am getting is 19sec for getting the result set of 3576 rows. What bothers me most is that if the number of results increases the performance is even worse. I am expecting to bring down the performace to 3 or 4 sec, as we are getting only the result set of 3576 rows. The result is used on the database server itself. These are the current statistics for the query Table 'MessageRecipient'. Scan count 1, logical reads 20, physical reads 0, read-ahead reads 18. Table 'Message'. Scan count 1, logical reads 4797, physical reads 0, read-ahead reads 4593. SQL Server Execution Times: CPU time = 954 ms, elapsed time = 19996 ms. SQL Server Execution Times: CPU time = 0 ms, elapsed time = 0 ms. *** Sent via Developersdex http://www.developersdex.com *** Thanks for the information.
I don't know if you cleared the buffer cache before running the query (if so, then please post back), but if these are the typical stats, then the query is I/O bound. Unfortunately, reading 4800 pages in 20 seconds is not a bad performance. If these are the stats, then your system does not have enough memory (or not enough memory assigned to SQL-Server). Adding more memory would definitely help. It is a borderline decision, but the optimizer's choice to scan the Message table seems wrong, because it needed 4797 reads for 3576 rows. You can test if it helps to rewrite the query to this: SELECT MSGS.Id , MSGS.SentOn , MSGS.Subject , MSGR.MessageState , MSGS.Message , MSGR.FolderId , MSGR.IsNew FROM dbo.MessageRecipient MSGR (NOLOCK) INNER LOOP JOIN dbo.Message MSGS (NOLOCK) ON MSGS.Id = MSGR.MessageId WHERE MSGR.MessageState <> 891 AND MSGR.RecipientId = 32 ORDER BY MSGS.SentOn DESC Another thing I notices, is that you are select a column of data type "text". I will assume that the user defined data type "Title" is derived from "char" or "varchar" (not "text"). So another approach could be to change the behavior of the "text in row" option. If it is on, you could try disabling it. If it is off, you could try enabling it. Make sure your reindex the table afterwards. With that new configuration, you could retry the query above and your original query. BTW: what is the primary key of table MessageRecipient? Hope this helps, Gert-Jan ramnadh nalluri wrote: Show quote > > Thanks for the reply. > > After deleting the 500,000 rows in the message table and doing index > fragmentation the message table occupied only 3940 Pages. > > The performance i am getting is 19sec for getting the result set of 3576 > rows. What bothers me most is that if the number of results increases > the performance is even worse. I am expecting to bring down the > performace to 3 or 4 sec, as we are getting only the result set of 3576 > rows. > The result is used on the database server itself. > > These are the current statistics for the query > > Table 'MessageRecipient'. Scan count 1, logical reads 20, physical reads > 0, read-ahead reads 18. > Table 'Message'. Scan count 1, logical reads 4797, physical reads 0, > read-ahead reads 4593. > > SQL Server Execution Times: CPU time = 954 ms, elapsed time = 19996 > ms. > SQL Server Execution Times: CPU time = 0 ms, elapsed time = 0 ms. > > *** Sent via Developersdex http://www.developersdex.com *** Gert-Jan Strik (sorry@toomuchspamalready.nl) writes:
> BTW: what is the primary key of table MessageRecipient? I would place my bets on (MessageId, RecipentID), but that informationis not there in the table. Which maybe is why it picks the merge join, and not the loop join, which I think should be better. By the way, 4800 pages in 20 seconds does not sound very fantastic in my ears. -- Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se Books Online for SQL Server SP3 at http://www.microsoft.com/sql/techinfo/productdoc/2000/books.asp Erland Sommarskog wrote:
> You're right. For a one disk system (like a laptop) it is to be> Gert-Jan Strik (sorry@toomuchspamalready.nl) writes: > > BTW: what is the primary key of table MessageRecipient? > > I would place my bets on (MessageId, RecipentID), but that information > is not there in the table. Which maybe is why it picks the merge join, > and not the loop join, which I think should be better. > > By the way, 4800 pages in 20 seconds does not sound very fantastic in > my ears. expected, but with a RAID configuration of just a few disks (a server) it should do much better... Gert-Jan Thanks for the valuable information for optimizing this query.
I have done the text column in the Message table to enable the Text In Row option by using these statements. EXEC sp_tableoption 'orders', 'text in row', 'ON' GO EXEC sp_tableoption 'orders', 'text in row', '1000' So from now onwards whatever text data i insert below 1000 will be inserted into Data Pages rather than Text Pages. Is there any mechanism for updating the old Rows Text Pages to Convert into Data Pages.How can i give this option for the old rows ? By keeping LOOP JOIN my query performance decreased to half the time. Can you explain me what is the use of LOOP JOIN here, how does it optimizes my query ? This is the MessageRecipient table schema CREATE TABLE [dbo].[MessageRecipient] ( [MessageId] [int] NOT NULL , [RecipientId] [int] NULL , [ExtRecipient] [Email] NULL , [BelongsTo] [bit] NULL , [IsNew] [bit] NOT NULL , [FolderId] [int] NULL , [MessageState] [int] NULL ) ON [PRIMARY] GO CREATE CLUSTERED INDEX [CI_MessageRecipient_RecipientId] ON [dbo].[MessageRecipient]([RecipientId]) WITH FILLFACTOR = 90 ON [PRIMARY] GO ALTER TABLE [dbo].[MessageRecipient] ADD CONSTRAINT [DF_MessageRecipient_IsNew] DEFAULT (1) FOR [IsNew], CONSTRAINT [DF_MessageRecipient_MessageState] DEFAULT (887) FOR [MessageState] GO CREATE INDEX [IX_MessageRecipient_MessageId] ON [dbo].[MessageRecipient]([MessageId]) WITH FILLFACTOR = 90 ON [PRIMARY] GO CREATE INDEX [IX_MessageRecipient_MessageState] ON [dbo].[MessageRecipient]([MessageState]) WITH FILLFACTOR = 90 ON [PRIMARY] GO There is no Primary Key on this table, because we can deduct unique value only by the combination of MessageId,RecipientId and ExtRecipient. Whereas RecipientId can be nullable, MessageId is the foreign key reference to the Message table Id and ExtRecipient is also a nullable column. So cannot create Primary Key on MessageRecipient table. i had create Clustered Index on RecipientId column. But i am not sure whether it is good thing to specify clustered index on recipientId column. Could you please check the Messagerecipient table schema and suggest me what are the keys should be... Thanks for your help. ramnadh. *** Sent via Developersdex http://www.developersdex.com *** See inline
ramnadh nalluri wrote: > I hope you meant "Message" instead of "orders"...> Thanks for the valuable information for optimizing this query. > > I have done the text column in the Message table to enable the Text In > Row option by using these statements. > > EXEC sp_tableoption 'orders', 'text in row', 'ON' > GO > EXEC sp_tableoption 'orders', 'text in row', '1000' > So from now onwards whatever text data i insert below 1000 will be I guess you could simply do this:> inserted into Data Pages rather than Text Pages. Is there any mechanism > for updating the old Rows Text Pages to Convert into Data Pages.How can > i give this option for the old rows ? UPDATE Message SET Message = Message > By keeping LOOP JOIN my query performance decreased to half the time. The query optimizer has several methods to join two sets: Loop Joins,> Can you explain me what is the use of LOOP JOIN here, how does it > optimizes my query ? Merge Joins and Hash Joins. In your case it chose the Merge method, which was suboptimal. You can find more information about these methods in Books Online (BOL). Show quote > This is the MessageRecipient table schema This is an indication that your design is not correct (not properly> > CREATE TABLE [dbo].[MessageRecipient] ( > [MessageId] [int] NOT NULL , > [RecipientId] [int] NULL , > [ExtRecipient] [Email] NULL , > [BelongsTo] [bit] NULL , > [IsNew] [bit] NOT NULL , > [FolderId] [int] NULL , > [MessageState] [int] NULL > ) ON [PRIMARY] > GO > > CREATE CLUSTERED INDEX [CI_MessageRecipient_RecipientId] ON > [dbo].[MessageRecipient]([RecipientId]) WITH FILLFACTOR = 90 ON > [PRIMARY] > GO [snip] > There is no Primary Key on this table, because we can deduct unique > value only by the combination of MessageId,RecipientId and ExtRecipient. > Whereas RecipientId can be nullable, MessageId is the foreign key > reference to the Message table Id and ExtRecipient is also a nullable > column. So cannot create Primary Key on MessageRecipient table. normalized). For example: why would one need an ExtRecipient value to uniquely identify the "Recipient" if there is already a RecipientID? Another example: if the table describes the relation between an Message and a Recipient, then how can it be that there are rows without Recipient? It doesn't make sense to allow NULLs in column RecipientID. Having said all that: you could still create a unique constraint on (MessageId, RecipientID, ExtRecipient) as an alternative for a primary key. > i had Well, it is good for the query you are trying to optimize... There is no> create Clustered Index on RecipientId column. But i am not sure whether > it is good thing to specify clustered index on recipientId column. clear formula which index you should make clustered. > Could you please check the Messagerecipient table schema and suggest me That is probably beyond the scope of a newsgroup. Maybe other people can> what are the keys should be... give you suggestions on that. Gert-Jan Show quote Gert-Jan Strik (sorry@toomuchspamalready.nl) writes:
> This is an indication that your design is not correct (not properly I can guess that: RecipentID is a known and registered recipent, > normalized). For example: why would one need an ExtRecipient value to > uniquely identify the "Recipient" if there is already a RecipientID? ExtRecipent is a one off, that we will not send mail to very often, if ever again. > Another example: if the table describes the relation between an Message There should probably be a constraint requiring that exactly one of> and a Recipient, then how can it be that there are rows without > Recipient? It doesn't make sense to allow NULLs in column RecipientID. RecpipentID and ExtRecipent to be non-NULL. -- Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se Books Online for SQL Server SP3 at http://www.microsoft.com/sql/techinfo/productdoc/2000/books.asp I tried using Text in row option. I has given
EXEC sp_tableoption 'Message', 'text in row', 'ON' GO EXEC sp_tableoption 'Message', 'text in row', '1000' for enabling that option and tried to update the Message text column in the Message table with UPDATE Message SET Message = Message It updated the text column. But before using the text in row option the table is distributed over 4000 pages but after text in row option the number of pages increased by 200%. How can it happen. I have seen in the table for 85% of the text column values having datalength <=350 and 15% are having <=2300. Does it make any difference in arranging the value in the data pages. I have kept the clustered index on the combination of MessageId,RecipientId and gave the local join hint as LOOP JOIN it gave the query plan as below |--Compute Scalar(DEFINE:([MSGS].[Message]=[MSGS].[Message])) REFERENCES:([MSGS].[Id]) WITH PREFETCH)|--Parallelism(Gather Streams, ORDER BY:([MSGS].[SentOn] DESC)) |--Sort(ORDER BY:([MSGS].[SentOn] DESC)) |--Nested Loops(Inner Join, OUTER |--Clustered Index Scan(OBJECT:([Viper511].[dbo].[Message].[PK_Message] AS [MSGS]), ORDEREDFORWARD) |--Clustered Index Seek(OBJECT:([Viper511].[dbo].[MessageRecipient].[IX_MessageRecipient_MessageId] AS [MSGR]), SEEK:([MSGR].[MessageId]=[MSGS].[Id] AND [MSGR].[RecipientId]=32), WHERE:([MSGR].[MessageState]<>891) ORDERED FORWARD) now it was optimized to 50% (8 sec) of the average time. Is there any other thing that i can try to optimize this query. Thanks for giving the valuable information and helping in fixing the performance of this query.I appreciate your help. *** Sent via Developersdex http://www.developersdex.com *** |
|||||||||||||||||||||||