Home All Groups Group Topic Archive Search About

Views and CTEs performance problem

Author
20 Jul 2006 7:34 PM
Rich
I have a hierarchical structure to contain objects which maps users (belong
to roles) to an ACL table for permissioning on these objects.

I then need to create hierarchical XML based on the data hierarchy with the
ACL access included. My lookup on the data is peachy, but I'm noticing that
as soon as I add a JOIN to my ACL view (which is a recursive CTE), the query
time goes from a few milliseconds to 14 seconds.

Here is some code stripped to its essence for clarity:

Table "data" : guid uniqueidentifier, parent_guid uniqueidentifier, label
varchar(50)
Table "acl" : acl_id int, role_guid uniqueidentifier, object_guid
uniqueidentifier, acl tinyint, is_rootnode_for_role bit

The ACL table is then expanded into a full view for each object for each
role for quick lookup hierarchically:

CREATE VIEW [myschema].[vACL] AS
WITH hierarchy (guid, role_guid, acl) AS (
    -- Anchor member
    SELECT d.guid, a.role_guid, a.acl
        FROM myschema.data d, myschema.acl a
        WHERE d.guid = a.object_guid AND a.is_rootnode_for_role=1

    UNION ALL

    -- Recursive member
    SELECT d.guid, h.role_guid,
        CASE WHEN EXISTS (SELECT a.acl_id FROM myschema.acl a
                WHERE a.role_guid=h.role_guid AND a.object_guid=d.guid AND NOT
a.acl=h.acl)
        THEN (SELECT a.acl FROM myschema.acl a WHERE a.object_guid=d.guid AND
a.role_guid=h.role_guid AND NOT a.acl=h.acl)
            ELSE h.acl END as acl

        FROM myschema.data d
        JOIN hierarchy h ON h.guid = d.parent_guid
)

SELECT * FROM hierarchy WHERE acl>0

Now I have a hierarchy with ACL for each Role. Now, I need to produce a
hierarchical XML structure to represent the data hierarchy, with its
corresponding ACL. This first recursion runs super fast but does not return
my needed ACL value:

CREATE FUNCTION [myschema].[fsHierarchy]
(
    @guid uniqueidentifier,
    @role uniqueidentifier
)
RETURNS xml
AS
BEGIN RETURN
    (SELECT d.guid as '@guid', d.label as '@label',
myschema.fsHierarchy(d.guid, @role)
        FROM myschema.data d WHERE d.parent_guid=@guid
        FOR XML PATH('node'), TYPE)
END

Example result:

<node guid="F8BAA99D-4B6B-4EF1-8E6C-379837604F6F" label="node1">
  <node guid="C809425B-C1AB-452B-8D8C-3961BC9E5B6F" label="another node">
    <node guid="D46DA255-F2DE-4725-81CA-AA14BF9E8156" label="leaf node" />
  </node>
  <node guid="C107706B-5065-4E3A-A36E-8D90D4CF76C3" label="child of node 1" />
</node>
<node guid="5B91C1E4-8120-45B7-9C4E-59A4E5AA7014" label="node2">
  <node guid="3F756D47-09AC-4215-80BB-0959592F26A1" label="...">
    <node guid="08C84B70-DB2D-44F0-81A7-41A105EE9340" label="......" />
  </node>
  <node guid="C7EA4344-D909-48A1-931A-B73415CDC86D" label="node2 child">
    <node guid="2210FAF6-E7D2-456C-8677-F00E5F34A7B8" label="...." />
  </node>
</node>
<node guid="0DCB2423-E725-442B-8577-A1061001F091" label="node3">
  <node guid="D29E12C9-EF04-4819-BD37-141DE87E0636" label="3child">
    <node guid="E866C828-BBA8-4836-B17E-084206A2700F" label="3childchild" />
    <node guid="34750E36-5FB7-42DB-8FA3-5974485E5DF7" label="another
3childchild" />
  </node>
</node>

Exactly what I need except is missing my much needed ACL. So, I add in a
JOIN to my vACL:

ALTER FUNCTION [myschema].[fsHierarchy]
(
    @guid uniqueidentifier,
    @role uniqueidentifier
)
RETURNS xml
AS
BEGIN RETURN
    (SELECT a.guid as '@guid', d.label as '@label', a.acl as '@acl',
myschema.fsHierarchy(a.guid, @group)
        FROM myschema.data d
        JOIN myschema.vACL a ON a.guid = d.guid
        WHERE a.role_guid=@role AND d.parent_guid=@guid
        FOR XML PATH('node'), TYPE)
END

This works great and adds my acl="x" to each node in the above XML precisely
as I need it. However, its taking 14 seconds to run where it ran instantly
before. I've analyzed the recursive calls to fsHierarchy in Profiler, and its
crawling the exact same number of rows, however, the CPU, Reads and Duration
are dramatically increased.

I'm thinking this may be due to invoking a recursive function that is using
a recursive CTE and perhaps my lack of any indexes is cumbersome.

Any help would be greatly appreciated - hope everyone has a fantastic day!

Cheers,
    Rich

Author
20 Jul 2006 10:14 PM
Erland Sommarskog
Rich (R***@discussions.microsoft.com) writes:
> This works great and adds my acl="x" to each node in the above XML
> precisely as I need it. However, its taking 14 seconds to run where it
> ran instantly before. I've analyzed the recursive calls to fsHierarchy
> in Profiler, and its crawling the exact same number of rows, however,
> the CPU, Reads and Duration are dramatically increased.
>
> I'm thinking this may be due to invoking a recursive function that is
> using a recursive CTE and perhaps my lack of any indexes is cumbersome.

Without knowledge about keys and indexes, what the column really means,
and no sample data, it is difficult to assist.

But I believe this should be a bit faster, but still yield the same result:

   CREATE VIEW [myschema].[vACL] AS
   WITH hierarchy (guid, role_guid, acl) AS (
      -- Anchor member
      SELECT d.guid, a.role_guid, a.acl
      FROM myschema.data d, myschema.acl a
      WHERE d.guid = a.object_guid AND a.is_rootnode_for_role=1

      UNION ALL

      -- Recursive member
      SELECT d.guid, h.role_guid,
             coalesce ((SELECT a.acl
                        FROM   myschema.acl a
                        WHERE  a.object_guid=d.guid
                          AND  a.role_guid=h.role_guid
                           AND NOT a.acl=h.acl)), h.acl) AS acl
      FROM myschema.data d
      JOIN hierarchy h ON h.guid = d.parent_guid
   )

This may be even better, but I'm less sure that it returns the correct
result:

   CREATE VIEW [myschema].[vACL] AS
   WITH hierarchy (guid, role_guid, acl) AS (
      -- Anchor member
      SELECT d.guid, a.role_guid, a.acl
      FROM myschema.data d, myschema.acl a
      WHERE d.guid = a.object_guid AND a.is_rootnode_for_role=1

      UNION ALL

      -- Recursive member
      SELECT d.guid, h.role_guid, coalesce(a.acl, h.acl) AS acl
      FROM myschema.data d
      JOIN hierarchy h ON h.guid = d.parent_guid
      LEFT JOIN myschema.acl a ON a.object_guid=d.guid
                              AND a.role_guid=h.role_guid
                              AND NOT a.acl=h.acl
   )



--
Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Author
21 Jul 2006 5:22 AM
Omnibuzz
We need more info. And how did that second function work?
You haven't declared @group. Or am I missing something obvious?
--
-Omnibuzz (The SQL GC)

http://omnibuzz-sql.blogspot.com/
Author
25 Jul 2006 9:01 PM
Rich
Dear Omnibuzz, Erland,

    I have uploaded a bak of the DB here with test data and the
functionality:   http://r9x.com/_sql/temp1.zip

    Also note, I created a test stored procedure which demonstrates a call
to both functions: EXEC silo.testdemo

- the first runs quickly but does not include the "acl" column I need to join
- the second runs very slowly but includes my needed "acl" field

    Once again, thank you very much for all your help and insights! Please
just let me know if there is anything else you need me to clarify or provide
info on.

Cheers,
    Rich

Show quote
"Omnibuzz" wrote:

> We need more info. And how did that second function work?
> You haven't declared @group. Or am I missing something obvious?
> --
> -Omnibuzz (The SQL GC)
>
> http://omnibuzz-sql.blogspot.com/
>
>
>
Author
26 Jul 2006 7:44 PM
Erland Sommarskog
Rich (R***@discussions.microsoft.com) writes:
>     I have uploaded a bak of the DB here with test data and the
> functionality:   http://r9x.com/_sql/temp1.zip
>
>     Also note, I created a test stored procedure which demonstrates a call
> to both functions: EXEC silo.testdemo
>
> - the first runs quickly but does not include the "acl" column I need to
> join
> - the second runs very slowly but includes my needed "acl" field
>
>     Once again, thank you very much for all your help and insights! Please
> just let me know if there is anything else you need me to clarify or
> provide info on.

I can't say that I have done a through analysis of why it takes so long
time, but the recursive function that invokes a recursive view gives me
some headache. And I'm that well versed in the FOR XML stuff. On top of
that, I have about zero understanding of what the code is supposed to
achieve.

Nevertheless I was able to speed up the query with:

   SELECT * INTO silo.tblData FROM silo.vData

and then replaced all references of silo.vData with silo.tblData. This cut
down execution time to a second or two.

In practice if you were to do this, you would have a trigger on silo.data
that feeds the table with the shreded data.

Since this solution is a little cheesy, I will ask people in our internal
MVP forum to have a look, and see if they can up with something better.

--
Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Author
26 Jul 2006 8:26 PM
Rich
Thank you very much Erland! That is an interesting solution - I'm not sure I
understand why this makes a difference (table vs view).

To clarify the purpose of the function, I need to retrieve a hierarchical
XML representation of the data based on parent/child relationship with an ACL
value for each item (guid) from the vACL view.

Thanks for your help and I look forward to anything your team can offer.

Best regards,
     Rich


Show quote
"Erland Sommarskog" wrote:

> Rich (R***@discussions.microsoft.com) writes:
> >     I have uploaded a bak of the DB here with test data and the
> > functionality:   http://r9x.com/_sql/temp1.zip
> >
> >     Also note, I created a test stored procedure which demonstrates a call
> > to both functions: EXEC silo.testdemo
> >
> > - the first runs quickly but does not include the "acl" column I need to
> > join
> > - the second runs very slowly but includes my needed "acl" field
> >
> >     Once again, thank you very much for all your help and insights! Please
> > just let me know if there is anything else you need me to clarify or
> > provide info on.

> I can't say that I have done a through analysis of why it takes so long
> time, but the recursive function that invokes a recursive view gives me
> some headache. And I'm that well versed in the FOR XML stuff. On top of
> that, I have about zero understanding of what the code is supposed to
> achieve.
>
> Nevertheless I was able to speed up the query with:
>
>    SELECT * INTO silo.tblData FROM silo.vData
>
> and then replaced all references of silo.vData with silo.tblData. This cut
> down execution time to a second or two.
>
> In practice if you were to do this, you would have a trigger on silo.data
> that feeds the table with the shreded data.
>
> Since this solution is a little cheesy, I will ask people in our internal
> MVP forum to have a look, and see if they can up with something better.
>
> --
> Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se
>
> Books Online for SQL Server 2005 at
> http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
> Books Online for SQL Server 2000 at
> http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
>
Author
26 Jul 2006 10:25 PM
Erland Sommarskog
Rich (R***@discussions.microsoft.com) writes:
> Thank you very much Erland! That is an interesting solution - I'm not
> sure I understand why this makes a difference (table vs view).

Apparently it takes a lot of time to shred those XML documents over
and over again. By expanding that data to a table, that moment is not
performed. There were a lot of joins to d.parent_guid, which was in
the XML document - so for each access to the table, the table had
to be scanned and each XML document unpacked.

You can define index on XML data, but I will have to admit that I've
not really grasped how that works yet. But maybe that is something you
can play with.


--
Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Author
27 Jul 2006 2:42 PM
Rich
Thanks Erland. Actually, my queries against the vData view seem really fast;
its only when I do a JOIN with vACL that things get really slow.

This may be a silly question, but how do views work in the SQL 2005 engine?
I was under the impression they are only rebuilt / updated when the
underlying data changed. Or do views that contain CTE's actually get rebuilt
with each query against the view? If so, that may explain the slowness.

Thanks and have a super day!

Cheers,
    Rich

Show quote
"Erland Sommarskog" wrote:

> Rich (R***@discussions.microsoft.com) writes:
> > Thank you very much Erland! That is an interesting solution - I'm not
> > sure I understand why this makes a difference (table vs view).
>
> Apparently it takes a lot of time to shred those XML documents over
> and over again. By expanding that data to a table, that moment is not
> performed. There were a lot of joins to d.parent_guid, which was in
> the XML document - so for each access to the table, the table had
> to be scanned and each XML document unpacked.
>
> You can define index on XML data, but I will have to admit that I've
> not really grasped how that works yet. But maybe that is something you
> can play with.
>
>
> --
> Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se
>
> Books Online for SQL Server 2005 at
> http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
> Books Online for SQL Server 2000 at
> http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
>
Author
27 Jul 2006 5:00 PM
Steve Kass
Rich wrote:

>Thanks Erland. Actually, my queries against the vData view seem really fast;
>its only when I do a JOIN with vACL that things get really slow.
>
>This may be a silly question, but how do views work in the SQL 2005 engine?
>I was under the impression they are only rebuilt / updated when the
>underlying data changed. Or do views that contain CTE's actually get rebuilt
>with each query against the view? If so, that may explain the slowness.

>

Views are not even built in the first place.  A view is just "syntactic
sugar" for its definition.
When you issue a query that refers to a view, the view definition is
immediatly substituted in
for the name of the view, and then the optimizer does what it can to
optimize the query.

Only in the case of an indexed (hence materialized) view, which is an
option for a restricted
category of views, is a view materialized and updated only upon changes
to the underlying
tables.  Even then, the view definition is substituted in and the index
is not routinely used
by the optimizer (specifying the NOEXPAND hint will coax the optimizer
into using the
views index/materialization).

Steve Kass
Drew University
www.stevekass.com

Show quote
>Thanks and have a super day!
>
>Cheers,
>    Rich
>
>"Erland Sommarskog" wrote:
>

>
>>Rich (R***@discussions.microsoft.com) writes:
>>   
>>
>>>Thank you very much Erland! That is an interesting solution - I'm not
>>>sure I understand why this makes a difference (table vs view).
>>>     
>>>
>>Apparently it takes a lot of time to shred those XML documents over
>>and over again. By expanding that data to a table, that moment is not
>>performed. There were a lot of joins to d.parent_guid, which was in
>>the XML document - so for each access to the table, the table had
>>to be scanned and each XML document unpacked.
>>
>>You can define index on XML data, but I will have to admit that I've
>>not really grasped how that works yet. But maybe that is something you
>>can play with.
>>
>>
>>--
>>Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se
>>
>>Books Online for SQL Server 2005 at
>>http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
>>Books Online for SQL Server 2000 at
>>http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
>>
>>   
>>
Author
27 Jul 2006 9:00 PM
Erland Sommarskog
Rich (R***@discussions.microsoft.com) writes:
> Thanks Erland. Actually, my queries against the vData view seem really
> fast; its only when I do a JOIN with vACL that things get really slow.

And vACL in its turn refers to vData, and you to one of the fields
in the XML document. And then there are several levels of recursion.
For each row in silo.acl, you need to scan all 155 rows in silo.data
to see if you have the right parent_guid and also retrieve object_name.

> This may be a silly question, but how do views work in the SQL 2005
> engine?

Regular views are essentially macros that are pasted into the query.

> I was under the impression they are only rebuilt / updated when the
> underlying data changed. Or do views that contain CTE's actually get >
> rebuilt with each query against the view? If so, that may explain the
> slowness.

There is not even nothing to rebuild.

As Steve mention you have index a view to materialise, but it does not
work in your case. I tried to index your view, but it does not meet
the criteria.


I played with XML indexes, but that did not give any effect. I did
however find an alternative to having a complete shadow table. I
added two computed and persisted columns to silo.data for the crucial
columns parent_guid and object_name, and then I replaced these in
the view. Here is the script:

   CREATE FUNCTION p_guid_func(@content xml) RETURNS uniqueidentifier
   WITH SCHEMABINDING AS
   BEGIN
      RETURN @content.value('(/silo/@parent_guid)[1]','uniqueidentifier')
   END  
   go
   ALTER FUNCTION o_name_func(@content xml) RETURNS varchar(100)
   WITH SCHEMABINDING AS
   BEGIN
      RETURN @content.value(
      'data(//*[@name=data(/silo/@instance_name_field)])[1]','varchar(100)')
   END  
   go
   ALTER TABLE [silo].[data] ADD
       p_guid AS  dbo.p_guid_func(content) PERSISTED
   ALTER TABLE [silo].[data] ADD
       o_name AS  dbo.o_name_func(content) PERSISTED

The functions are needed because you cannot use the XML methods directly
in a computed column expression.

The view:

ALTER VIEW [silo].[vData] AS
SELECT guid,
    [type]            = [content].value('data(/silo/@type)[1]','varchar(10)'), 
    template_guid        = [content].value('data(/silo/@template_guid)[1]','uniqueidentifier'), 
    parent_guid        = p_guid, -- [content].value('(/silo/@parent_guid)[1]','uniqueidentifier'), 
    lang_code        = [content].value('data(/silo/@lang_code)[1]','char(2)'), 
    is_system        = [content].value('data(/silo/@is_system)[1]','bit'), 
    start_stamp        = [content].value('data(/silo/@start_stamp)[1]','datetime'), 
    end_stamp        = [content].value('data(/silo/@end_stamp)[1]','datetime'),
    [order]        = [content].value('data(/silo/@order)[1]','varchar(5)'),
    last_modified_stamp    = [content].value('data(/silo/@last_modified_stamp)[1]','datetime'),
    last_modified_guid  = [content].value('data(/silo/@last_modified_guid)[1]','varchar(40)'),
--    instance_name_field = [content].value('data(/silo/@instance_name_field)[1]','varchar(40)'),
    [object_name]        = o_name -- [content].value('data(//*[@name=data(/silo/@instance_name_field)])[1]','varchar(100)')
FROM silo.data


--
Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

AddThis Social Bookmark Button