|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Need help optimizing a stored procedure - SQL2kexplain, to improve performance and take down the whole duration. I have a table with about 5.000.000 rows: TABLE1: ID bigint, ID_PM bigint, DATE datetime, ...other fields... and another table with abount 5.000 rows: TABLE2: ID_PM bigint, ID_SE bigint ...other fields... At the end of every month, for each row in the 2nd table, i have to perform calculations on the first table, to write down some reports. Actually we are using user defined functions that acts on the first table, for example to extract the first row of every ID_PM of every month of this year, something like: CREATE FUNCTION [dbo].[MyFunction] (@GETDATE smalldatetime) RETURNS TABLE AS RETURN ( SELECT MIN(S.ID) AS ID, S.ID_PM FROM dbo.SMS_Dati S WHERE YEAR(DATE) = YEAR(@GETDATE) GROUP BY MONTH(Data), S.ID_PM ) this is not the only function, we have 2 or 3 of them that perform different regroups and calculations. And we use this function in a stored procedure several times, in queryes that shows like SELECT COUNT(*) FROM dbo.MyFunction(GETDATE()) S LEFT OUTER JOIN TABLE2 P ON S.ID_PM=S.ID_PM WHERE P.ID_SE=@ID_SE So i was asking if there's a better way to do this, for example to build a temporary table #temptable at the start of the stored procedure, and perform counts on that, and drop the table at the end of the stored. Or perhaps other ways. The whole process right now is taking about 3 hours to complete (not only this calculation i wrote above, obviously), so it will be great if we can reduce the execution time using temporary tables or something else. Thank you in advance. Marco. Consider eliminating the UDF and going with a view. Perhaps, you might be
able to index the view. You didn't post your DDL, so I can't tell. -- Show quoteTom ---------------------------------------------------- Thomas A. Moreau, BSc, PhD, MCSE, MCDBA SQL Server MVP Columnist, SQL Server Professional Toronto, ON Canada t**@cips.ca www.pinpub.com "M. Simioni" <m.simioniREMOVET***@TOCONTACTMEgmail.com> wrote in message news:%236$ZHdY$FHA.1600@TK2MSFTNGP11.phx.gbl... > Hi, i was thinking about a way to rewrite the procedure i'm going to > explain, to improve performance and take down the whole duration. > > I have a table with about 5.000.000 rows: > TABLE1: ID bigint, ID_PM bigint, DATE datetime, ...other > fields... > and another table with abount 5.000 rows: > TABLE2: ID_PM bigint, ID_SE bigint ...other fields... > > At the end of every month, for each row in the 2nd table, i have to > perform calculations on the first table, to write down some reports. > > Actually we are using user defined functions that acts on the first table, > for example to extract the first row of every ID_PM of every month of this > year, something like: > > CREATE FUNCTION [dbo].[MyFunction] (@GETDATE smalldatetime) > RETURNS TABLE AS > RETURN ( > SELECT MIN(S.ID) AS ID, S.ID_PM > FROM dbo.SMS_Dati S > WHERE YEAR(DATE) = YEAR(@GETDATE) > GROUP BY MONTH(Data), S.ID_PM > ) > > this is not the only function, we have 2 or 3 of them that perform > different regroups and calculations. > And we use this function in a stored procedure several times, in queryes > that shows like > > SELECT COUNT(*) FROM dbo.MyFunction(GETDATE()) S > LEFT OUTER JOIN TABLE2 P ON S.ID_PM=S.ID_PM > WHERE P.ID_SE=@ID_SE > > So i was asking if there's a better way to do this, for example to build a > temporary table #temptable at the start of the stored procedure, and > perform counts on that, and drop the table at the end of the stored. Or > perhaps other ways. > > The whole process right now is taking about 3 hours to complete (not only > this calculation i wrote above, obviously), so it will be great if we can > reduce the execution time using temporary tables or something else. > > Thank you in advance. > Marco. > Here is the TABLE1 definition:
******************************************************************************* CREATE TABLE [SMS_Dati] ( [ID_SMSDATI] [bigint] IDENTITY (1, 1) NOT NULL , [ID_PUNTOMISURA] [bigint] NULL CONSTRAINT [DF_SMS_Dati_ID_PUNTOMISURA] DEFAULT (1), [DataOraCreazione] [datetime] NULL CONSTRAINT [DF_SMS_Dati_DataOraCreazione] DEFAULT (getdate()), [Data] [datetime] NULL CONSTRAINT [DF_SMS_Dati_Data] DEFAULT (getdate()), [MIN1] [float] NULL , [MED1] [float] NULL , [MAX1] [float] NULL , [SQM1] [float] NULL , [NAL1] [int] NULL , [TFL1] [int] NULL , [MIN2] [float] NULL , [MED2] [float] NULL , [MAX2] [float] NULL , [SQM2] [float] NULL , [NAL2] [int] NULL , [TFL2] [int] NULL , [CH1] [varchar] (5) COLLATE Latin1_General_CI_AS NULL CONSTRAINT [DF_SMS_Dati_CH1] DEFAULT (''), [CH2] [varchar] (5) COLLATE Latin1_General_CI_AS NULL CONSTRAINT [DF_SMS_Dati_CH2] DEFAULT (''), [TGC1] [int] NULL , [TGC2] [int] NULL , [NomeOperatore] [varchar] (40) COLLATE Latin1_General_CI_AS NULL CONSTRAINT [DF_SMS_Dati_NomeOperatore] DEFAULT ('Automatico'), [Conformita] [bit] NULL CONSTRAINT [DF_SMS_Dati_Conformita] DEFAULT (0), [TipologiaMisura] [tinyint] NULL CONSTRAINT [DF_SMS_Dati_TipologiaMisura] DEFAULT (0), [TGCColor] [tinyint] NULL CONSTRAINT [DF_SMS_Dati_TGCColor] DEFAULT (0), CONSTRAINT [PK_SMS_Dati] PRIMARY KEY CLUSTERED ( [ID_SMSDATI] ) ON [PRIMARY] ) ON [PRIMARY] GO ******************************************************************************* and here the TABLE2 definition: ******************************************************************************* CREATE TABLE [PuntiMisura] ( [ID_PUNTOMISURA] [bigint] IDENTITY (1, 1) NOT NULL , [ID_SISTEMAEL] [bigint] NULL CONSTRAINT [DF_PuntiMisura_ID_SISTEMAEL] DEFAULT (1), [ID_COMUNE] [bigint] NULL , [ID_DISPOSITIVO] [bigint] NULL CONSTRAINT [DF_PuntiMisura_ID_DISPOSITIVO] DEFAULT (1), [Descrizione] [varchar] (50) COLLATE Latin1_General_CI_AS NULL CONSTRAINT [DF_PuntiMisura_Descrizione] DEFAULT (''), [TipologiaDispositivo] [varchar] (50) COLLATE Latin1_General_CI_AS NULL CONSTRAINT [DF_PuntiMisura_TipologiaDispositivo] DEFAULT (''), [Attributo] [tinyint] NULL , [TipoMisura] [varchar] (50) COLLATE Latin1_General_CI_AS NULL CONSTRAINT [DF_PuntiMisura_TipoMisura] DEFAULT (''), [ConformitaMisura] [varchar] (50) COLLATE Latin1_General_CI_AS NULL CONSTRAINT [DF_PuntiMisura_ConformitaMisura] DEFAULT (''), [NomeOperatore] [varchar] (50) COLLATE Latin1_General_CI_AS NULL CONSTRAINT [DF_PuntiMisura_NomeOperatore] DEFAULT (''), [Label] [varchar] (100) COLLATE Latin1_General_CI_AS NULL CONSTRAINT [DF_PuntiMisura_Label] DEFAULT (''), [Status] [varchar] (50) COLLATE Latin1_General_CI_AS NULL CONSTRAINT [DF_PuntiMisura_Status] DEFAULT (''), [DataApertura] [smalldatetime] NULL CONSTRAINT [DF_PuntiMisura_DataApertura] DEFAULT (getdate()), [Chiuso] [bit] NULL CONSTRAINT [DF_PuntiMisura_Chiuso] DEFAULT (0), [DataChiusura] [smalldatetime] NULL , [CodiceAutority] [varchar] (20) COLLATE Latin1_General_CI_AS NOT NULL CONSTRAINT [DF_PuntiMisura_CodiceAutority] DEFAULT (''), [Diario] [varchar] (50) COLLATE Latin1_General_CI_AS NULL CONSTRAINT [DF_PuntiMisura_Diario] DEFAULT (''), CONSTRAINT [PK_Dispositivo] PRIMARY KEY CLUSTERED ( [ID_PUNTOMISURA] ) ON [PRIMARY] ) ON [PRIMARY] GO ******************************************************************************* here is the function: ******************************************************************************* CREATE FUNCTION [dbo].[spReport_RCA_A7_Monthly] (@GETDATE smalldatetime) RETURNS TABLE AS RETURN ( SELECT MIN(S.ID_SMSDATI) AS ID_SMSDATI, S.ID_PUNTOMISURA FROM dbo.SMS_Dati S WHERE YEAR(Data) = YEAR(@GETDATE) GROUP BY MONTH(Data), S.ID_PUNTOMISURA ) ******************************************************************************* and how i use it in a stored procedure: SET @TOT1 = ( SELECT COUNT(*) FROM dbo.spReport_RCA_A7_MonthlyCON(@GETDATE) S LEFT OUTER JOIN PuntiMisura P ON S.ID_PUNTOMISURA = P.ID_PUNTOMISURA LEFT OUTER JOIN SistemiEl E ON P.ID_SISTEMAEL=E.ID_SISTEMAEL WHERE E.ID_IMPIANTO=@ID_IMPIANTO AND P.TipologiaDispositivo='<img src="Dispositivi\man.png">' ) ******************************************************************************* should i try with indexed views instead of udf? I'd try a plain view first and then index it, if necessary. Also, I'd
change the WHERE clause to: WHERE Data > = convert (char(4), @GETDATE, 112) + '0101' AND Data < dateadd (yy, 1, convert (char(4), @GETDATE, 112) + '0101') Try that before indexing the view. Also, on SMS_Data, clustered it on Date and change the PK to nonclustered. Let us know how that performs and we can take it from there. -- Show quoteTom ---------------------------------------------------- Thomas A. Moreau, BSc, PhD, MCSE, MCDBA SQL Server MVP Columnist, SQL Server Professional Toronto, ON Canada t**@cips.ca www.pinpub.com "M. Simioni" <m.simioniREMOVET***@TOCONTACTMEgmail.com> wrote in message news:eif9InY$FHA.2812@TK2MSFTNGP09.phx.gbl... > Here is the TABLE1 definition: > > ******************************************************************************* > CREATE TABLE [SMS_Dati] ( > [ID_SMSDATI] [bigint] IDENTITY (1, 1) NOT NULL , > [ID_PUNTOMISURA] [bigint] NULL CONSTRAINT [DF_SMS_Dati_ID_PUNTOMISURA] > DEFAULT (1), > [DataOraCreazione] [datetime] NULL CONSTRAINT > [DF_SMS_Dati_DataOraCreazione] DEFAULT (getdate()), > [Data] [datetime] NULL CONSTRAINT [DF_SMS_Dati_Data] DEFAULT (getdate()), > [MIN1] [float] NULL , > [MED1] [float] NULL , > [MAX1] [float] NULL , > [SQM1] [float] NULL , > [NAL1] [int] NULL , > [TFL1] [int] NULL , > [MIN2] [float] NULL , > [MED2] [float] NULL , > [MAX2] [float] NULL , > [SQM2] [float] NULL , > [NAL2] [int] NULL , > [TFL2] [int] NULL , > [CH1] [varchar] (5) COLLATE Latin1_General_CI_AS NULL CONSTRAINT > [DF_SMS_Dati_CH1] DEFAULT (''), > [CH2] [varchar] (5) COLLATE Latin1_General_CI_AS NULL CONSTRAINT > [DF_SMS_Dati_CH2] DEFAULT (''), > [TGC1] [int] NULL , > [TGC2] [int] NULL , > [NomeOperatore] [varchar] (40) COLLATE Latin1_General_CI_AS NULL > CONSTRAINT [DF_SMS_Dati_NomeOperatore] DEFAULT ('Automatico'), > [Conformita] [bit] NULL CONSTRAINT [DF_SMS_Dati_Conformita] DEFAULT (0), > [TipologiaMisura] [tinyint] NULL CONSTRAINT [DF_SMS_Dati_TipologiaMisura] > DEFAULT (0), > [TGCColor] [tinyint] NULL CONSTRAINT [DF_SMS_Dati_TGCColor] DEFAULT (0), > CONSTRAINT [PK_SMS_Dati] PRIMARY KEY CLUSTERED > ( > [ID_SMSDATI] > ) ON [PRIMARY] > ) ON [PRIMARY] > GO > ******************************************************************************* > > and here the TABLE2 definition: > > ******************************************************************************* > CREATE TABLE [PuntiMisura] ( > [ID_PUNTOMISURA] [bigint] IDENTITY (1, 1) NOT NULL , > [ID_SISTEMAEL] [bigint] NULL CONSTRAINT [DF_PuntiMisura_ID_SISTEMAEL] > DEFAULT (1), > [ID_COMUNE] [bigint] NULL , > [ID_DISPOSITIVO] [bigint] NULL CONSTRAINT [DF_PuntiMisura_ID_DISPOSITIVO] > DEFAULT (1), > [Descrizione] [varchar] (50) COLLATE Latin1_General_CI_AS NULL CONSTRAINT > [DF_PuntiMisura_Descrizione] DEFAULT (''), > [TipologiaDispositivo] [varchar] (50) COLLATE Latin1_General_CI_AS NULL > CONSTRAINT [DF_PuntiMisura_TipologiaDispositivo] DEFAULT (''), > [Attributo] [tinyint] NULL , > [TipoMisura] [varchar] (50) COLLATE Latin1_General_CI_AS NULL CONSTRAINT > [DF_PuntiMisura_TipoMisura] DEFAULT (''), > [ConformitaMisura] [varchar] (50) COLLATE Latin1_General_CI_AS NULL > CONSTRAINT [DF_PuntiMisura_ConformitaMisura] DEFAULT (''), > [NomeOperatore] [varchar] (50) COLLATE Latin1_General_CI_AS NULL > CONSTRAINT [DF_PuntiMisura_NomeOperatore] DEFAULT (''), > [Label] [varchar] (100) COLLATE Latin1_General_CI_AS NULL CONSTRAINT > [DF_PuntiMisura_Label] DEFAULT (''), > [Status] [varchar] (50) COLLATE Latin1_General_CI_AS NULL CONSTRAINT > [DF_PuntiMisura_Status] DEFAULT (''), > [DataApertura] [smalldatetime] NULL CONSTRAINT > [DF_PuntiMisura_DataApertura] DEFAULT (getdate()), > [Chiuso] [bit] NULL CONSTRAINT [DF_PuntiMisura_Chiuso] DEFAULT (0), > [DataChiusura] [smalldatetime] NULL , > [CodiceAutority] [varchar] (20) COLLATE Latin1_General_CI_AS NOT NULL > CONSTRAINT [DF_PuntiMisura_CodiceAutority] DEFAULT (''), > [Diario] [varchar] (50) COLLATE Latin1_General_CI_AS NULL CONSTRAINT > [DF_PuntiMisura_Diario] DEFAULT (''), > CONSTRAINT [PK_Dispositivo] PRIMARY KEY CLUSTERED > ( > [ID_PUNTOMISURA] > ) ON [PRIMARY] > ) ON [PRIMARY] > GO > ******************************************************************************* > > here is the function: > > ******************************************************************************* > CREATE FUNCTION [dbo].[spReport_RCA_A7_Monthly] (@GETDATE smalldatetime) > RETURNS TABLE AS > RETURN ( > SELECT MIN(S.ID_SMSDATI) AS ID_SMSDATI, S.ID_PUNTOMISURA > FROM dbo.SMS_Dati S > WHERE YEAR(Data) = YEAR(@GETDATE) > GROUP BY MONTH(Data), S.ID_PUNTOMISURA > ) > ******************************************************************************* > > and how i use it in a stored procedure: > > SET @TOT1 = ( > SELECT COUNT(*) FROM dbo.spReport_RCA_A7_MonthlyCON(@GETDATE) S > LEFT OUTER JOIN PuntiMisura P ON S.ID_PUNTOMISURA = P.ID_PUNTOMISURA > LEFT OUTER JOIN SistemiEl E ON P.ID_SISTEMAEL=E.ID_SISTEMAEL > WHERE E.ID_IMPIANTO=@ID_IMPIANTO > AND P.TipologiaDispositivo='<img src="Dispositivi\man.png">' > ) > > ******************************************************************************* > > should i try with indexed views instead of udf? > > > > > > > > M. Simioni (m.simioniREMOVET***@TOCONTACTMEgmail.com) writes:
> CREATE FUNCTION [dbo].[spReport_RCA_A7_Monthly] (@GETDATE smalldatetime) Tom suggested to drop the UDF, but since this is a inline UDF, it does> RETURNS TABLE AS > RETURN ( > SELECT MIN(S.ID_SMSDATI) AS ID_SMSDATI, S.ID_PUNTOMISURA > FROM dbo.SMS_Dati S > WHERE YEAR(Data) = YEAR(@GETDATE) > GROUP BY MONTH(Data), S.ID_PUNTOMISURA > ) not do any harm as such. However, this is bad: WHERE YEAR(Data) = YEAR(@GETDATE) IF there is an index on Data, it cannot be used. Rewrite as Tom suggested. Making this the clustered index, can also be a good idea. But there is another thing that I don't understand: GROUP BY MONTH(Data), S.ID_PUNTOMISURA You group by month, but the month is not included in the result set. Furthermore: > SET @TOT1 = ( You do not use ID_SMSDATI either, so it appears that all the UDF does> SELECT COUNT(*) FROM dbo.spReport_RCA_A7_MonthlyCON(@GETDATE) S > LEFT OUTER JOIN PuntiMisura P ON S.ID_PUNTOMISURA = P.ID_PUNTOMISURA > LEFT OUTER JOIN SistemiEl E ON P.ID_SISTEMAEL=E.ID_SISTEMAEL > WHERE E.ID_IMPIANTO=@ID_IMPIANTO > AND P.TipologiaDispositivo='<img src="Dispositivi\man.png">' > ) is to produce a row for each month ID_PUNTOMISURA appears in. You have LEFT JOIN in your query, but with the WHERE conditions, it is obvious that you mean an inner join, so you should rewrite the query as such. And, finally you should add indexes to PuntiMisura.TipologiaDispositivo and SistemiEl.ID_IMPIANTO. -- Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx > Tom suggested to drop the UDF, but since this is a inline UDF, it does Ok, i'll try to rewrite this> not do any harm as such. However, this is bad: > > WHERE YEAR(Data) = YEAR(@GETDATE) > > IF there is an index on Data, it cannot be used. Rewrite as Tom suggested. > Making this the clustered index, can also be a good idea. > Show quote > But there is another thing that I don't understand: I want to extract the first row for every month, if there is one.> > GROUP BY MONTH(Data), S.ID_PUNTOMISURA > > You group by month, but the month is not included in the result set. > Furthermore: > > > SET @TOT1 = ( > > SELECT COUNT(*) FROM dbo.spReport_RCA_A7_MonthlyCON(@GETDATE) S > > LEFT OUTER JOIN PuntiMisura P ON S.ID_PUNTOMISURA = P.ID_PUNTOMISURA > > LEFT OUTER JOIN SistemiEl E ON P.ID_SISTEMAEL=E.ID_SISTEMAEL > > WHERE E.ID_IMPIANTO=@ID_IMPIANTO > > AND P.TipologiaDispositivo='<img src="Dispositivi\man.png">' > > ) > > You do not use ID_SMSDATI either, so it appears that all the UDF does > is to produce a row for each month ID_PUNTOMISURA appears in. I don't use the ID_PUNTOMISURA out of this funcion, you're right, but i didn't know another way to count down the rows, considering a maximum of 1 per month for each ID_PUNTOMISURA. > Ok i'l try to do that.> You have LEFT JOIN in your query, but with the WHERE conditions, it is > obvious that you mean an inner join, so you should rewrite the query > as such. > > And, finally you should add indexes to PuntiMisura.TipologiaDispositivo > and SistemiEl.ID_IMPIANTO. Thank you for your advice, i'll start from this and let you know. |
|||||||||||||||||||||||