|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
More on querying remote serverSELECT TransactionNumber FROM servertest.CMSArchiveTraining.dbo.Transactions_90 WHERE transactionGUID = '4956E240-8B6E-437E-B9B0-5E83FC25E3F0' Issues the remote query: SOURCE:(servertest), QUERY:( SELECT Tbl1001."TransactionGUID" Col1003,Tbl1001."TransactionNumber" Col1004 FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 ) And tries to bring back 8.5M rows - and takes minutes to return, when really there is only one row. Question #1: Why doesn't it include the WHERE clause in the remote query? If i change the query to: SELECT TOP 1 TransactionNumber FROM servertest.CMSArchiveTraining.dbo.Transactions_90 WHERE transactionGUID = '4956E240-8B6E-437E-B9B0-5E83FC25E3F0' It still issues the remote query: SOURCE:(servertest), QUERY:( SELECT Tbl1001."TransactionGUID" Col1003, Tbl1001."TransactionNumber" Col1004 FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 ) But this time it only brings over 10 rows (somehow), performs a top 1 filter, and returns instantly. Question #2: How is it bringing over 10 rows only? (hint: cursor) If i change the query to: SELECT TransactionNumber FROM servertest.CMSArchiveTraining.dbo.Transactions_90 WHERE TransactionNumber = 9679 It issues the remote query: SOURCE:(servertest), QUERY:( SELECT Tbl1001."TransactionNumber" Col1004 FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 WHERE Tbl1001."TransactionNumber"=(9679) ) which returns only one row, and returns instantly. Obviously SQL Server can *sometimes* do the optimization, othertimes it won't. Question #3: Why does it include the WHERE clause in the remote query. If i change the query to: SELECT TransactionNumber FROM servertest.CMSArchiveTraining.dbo.Transactions_90 WHERE TransactionDate = '2002-06-10 08:19:10.513' It issues the remote query: SOURCE:(servertest), QUERY:( SELECT Tbl1001."TransactionNumber" Col1004 FROM "cmsarchivetraining"."dbo"."Transactions_90" Tbl1001 WHERE Tbl1001."TransactionDate"='2002-06-10T08:19:10.513' ) Question #4: Why does it include the WHERE clause in the remote query? Now i create a view: CREATE VIEW CMSArchiveTranasctions AS SELECT * FROM SERVERTEST.CMSArchiveTraining.dbo.Transactions_90 and issue the query: SELECT TransactionNumber FROM CMSArchiveTransactions WHERE TransactionNumber = 9679 It wants to issues the remote query: SOURCE:(SERVERTEST), QUERY:( SELECT Tbl1001."TransactionNumber" Col1005 FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 ) And tries to bring back 8.5M rows - and takes minutes to return, when really there is only one row. Question #5: Why does it not include the WHERE clause in the remote query? If i change the query to: SELECT TOP 1 TransactionNumber FROM CMSArchiveTransactions WHERE TransactionNumber = 9679 it issues the remote query: SOURCE:(SERVERTEST), QUERY:( SELECT TOP 1 Col1004 FROM ( SELECT Tbl1001."TransactionNumber" Col1004 FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 WHERE Tbl1001."TransactionNumber"=(9679) ) Qry1018 ) Question #6: Why is it now including the TOP inside the remote query, when my earlier issue of a TOP query didn't include the TOP limiter? Question #7: Why is it now including the WHERE clause in the remote query, when not including a TOP 1 it won't include the where clause limiter? Question #8: Why does it include the WHERE clause when i have any TOP limiter (even TOP 999999999, or TOP 100 PERCENT), but does not include the where clause when i don't? So i try changing my query to: SELECT TransactionNumber FROM ( SELECT * FROM SERVERTEST.CMSArchiveTraining.dbo.Transactions_90) t WHERE TransactionNumber = 9679 And it issues the remote query: SOURCE:(SERVERTEST), QUERY:( SELECT Tbl1001."TransactionNumber" Col1004 FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 WHERE Tbl1001."TransactionNumber"=(9679) ) So querying through a view it will not include the WHERE clause, but if i include the view as a derived table, it can include the WHERE clause. Question #9: Why does it include the WHERE clause when i query through a derived table, and not through a view? So i try changing the query to SELECT TransactionNumber FROM ( SELECT * FROM CMSArchiveTransactions) t WHERE TransactionNumber = 9679 So here we are, pay attention to this one. If i attempt to query the view directly, it doesn't get optimized. Now i am going to query my view THROUGH a derived table. What do you think it will do? SOURCE:(SERVERTEST), QUERY:( SELECT Tbl1001."TransactionNumber" Col1004 FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 WHERE Tbl1001."TransactionNumber"=(9679) ) It in fact now DOES include the where clause, but if i try to query my view directly, it can't optimize it. This is a stunning development. It means the rules are random, which makes it difficult to optimize distributed queries. Rules for including where clause: Filter on uniqueidentifer: Query includes TOP 1: Yes* Query includes TOP 2: No Query includes TOP n: No Query includes TOP n PERCENT: No Query doesn't include TOP: No Filter on integer: Query linked SQL Server: Yes Query view that queries linked SQL Server: Query view directly: Query includes TOP 1: Yes Query includes TOP 2: Yes Query includes TOP 100: Yes Query includes TOP 999999999: Yes Query includes TOP 100 PERCENT: Yes Query doesn't include TOP: No Query view through derived table: Query includes TOP 1: Yes Query includes: TOP 2: Yes Query includes: TOP n: Yes Query includes: TOP 100 PERCENT: Yes Query includes TOP: Yes You haven't posted any DDL. Are there any indexes on the remote table? Which
columns are indexed? ML --- http://milambda.blogspot.com/ "ML" <M*@discussions.microsoft.com> wrote in message Before answering that, let me ask you this: why does it matter? It doesn't news:E89D60ED-EAEF-46F9-9F31-1D56842B277E@microsoft.com... > You haven't posted any DDL. Are there any indexes on the remote table? > Which > columns are indexed? seem to matter *sometimes*. And even if there aren't an indexes on the remote tables, everyone can agree that sending any filtering criteria to a remote server is faster than trying to filter rows after they've crossed a link. Even a table scan on a remote server is better than bringing the rows over a network to be filtered. Additionally, my post is not dealing with query performance. When SQL Server does what it should do, the queries run fine. If SQL Server doesn't do what it's supposed to do, the queries do not run fine. Indexes are not the performance limitation here. That having been said: yes. If you think you know your data better than the optimizer does (as might be
the case - especially with outdated statistics), you could use the REMOTE join hint to tell the optimizer to process the join on the remote server. Basically, the optimizer can only rely on statistics. If they are missing on the remote site, he prefers "staying local" although filtering data remotely might have yielded better performance. Why put a stress on a remote server if poor performance is expected anyway? But when statistics are up to date on both sites, then the optimizer still has full control over the execution of queries (unless specified otherwise through the use of hints) which basically means that he can still make the wrong assumptions - and I'm as much in the dark here as you are. ML --- http://milambda.blogspot.com/ > Basically, the optimizer can only rely on statistics. If they are missing i guess this is where common sense gets to take a back seat.> on > the remote site, he prefers "staying local" although filtering data > remotely > might have yielded better performance. Why put a stress on a remote server > if > poor performance is expected anyway? > But when statistics are up to date on both sites, then the optimizer still Is there anyone who could explain the bewildering optimizer choices?> has full control over the execution of queries (unless specified otherwise > through the use of hints) which basically means that he can still make the > wrong assumptions - and I'm as much in the dark here as you are. I would hazard a guess that the local SQL server has no knowledge of the
statistics on the remote server and can only determine which query to send across to the remote server based on the existence of keys (possibly indexes or unique indexes). The question would be what indexes exist on the remote table. I believe if you look at these closely your results will turn out to be much more consistent than you think. Show quote "ML" <M*@discussions.microsoft.com> wrote in message news:064A71B2-9B67-4F97-8635-6854A4B9551C@microsoft.com... > If you think you know your data better than the optimizer does (as might be > the case - especially with outdated statistics), you could use the REMOTE > join hint to tell the optimizer to process the join on the remote server. > > Basically, the optimizer can only rely on statistics. If they are missing on > the remote site, he prefers "staying local" although filtering data remotely > might have yielded better performance. Why put a stress on a remote server if > poor performance is expected anyway? > > But when statistics are up to date on both sites, then the optimizer still > has full control over the execution of queries (unless specified otherwise > through the use of hints) which basically means that he can still make the > wrong assumptions - and I'm as much in the dark here as you are. > > > ML > > --- > http://milambda.blogspot.com/ "Jim Underwood" <james.underwoodATfallonclinic.com> wrote in message Performs badly:news:ug9G18ZWGHA.3972@TK2MSFTNGP02.phx.gbl... >I would hazard a guess that the local SQL server has no knowledge of the > statistics on the remote server and can only determine which query to send > across to the remote server based on the existence of keys (possibly > indexes > or unique indexes). > > The question would be what indexes exist on the remote table. I believe > if > you look at these closely your results will turn out to be much more > consistent than you think. SELECT * FROM ViewOfTransactions WHERE TransactionNumber = 1234 Performs goodly: SELECT * FROM (SELECT * FROM ViewOfTransactions) WHERE TransactionNumber = 1234 Same statistics on the remote server. Same indexes on the remote server. Same query. Returning same results. Indexes and statistics are not the problem. I don't think SQL Server will necessarily handle a view identically to an
inline view as far as optimization goes, even for local tables. In this case I would agree that it should (and may very well be a bug), but I believe using views instead of inline views does affect the optimizer and change the priority of it's processes. The question of keys and indexes is in reference to why it sends the filter to the remote database for TransactionNumber and TransactionDate, but not transactionGUID. If the indexes on all three of these fields are identical, then you are correct, but I would be surprised to find that they are identical. On the other hand, data types may matter as well, although I can't think of a reason why they would. Show quote "Ian Boyd" <ian.msnews***@avatopia.com> wrote in message news:OnxleGaWGHA.4924@TK2MSFTNGP05.phx.gbl... > > "Jim Underwood" <james.underwoodATfallonclinic.com> wrote in message > news:ug9G18ZWGHA.3972@TK2MSFTNGP02.phx.gbl... > >I would hazard a guess that the local SQL server has no knowledge of the > > statistics on the remote server and can only determine which query to send > > across to the remote server based on the existence of keys (possibly > > indexes > > or unique indexes). > > > > The question would be what indexes exist on the remote table. I believe > > if > > you look at these closely your results will turn out to be much more > > consistent than you think. > > > Performs badly: > SELECT * FROM ViewOfTransactions > WHERE TransactionNumber = 1234 > > Performs goodly: > SELECT * FROM (SELECT * FROM ViewOfTransactions) > WHERE TransactionNumber = 1234 > > Same statistics on the remote server. > Same indexes on the remote server. > Same query. > Returning same results. > > Indexes and statistics are not the problem. > > > The question of keys and indexes is in reference to why it sends the i will confess, that i knew the answer to filtering on uniqueidentifier > filter > to the remote database for TransactionNumber and TransactionDate, but not > transactionGUID. columns before posting. Question #1 was a plant, to see who knows what they're talking about. You can see my previous posts on remote uniqueidentifier filtering: http://tinyurl.com/rrwtp http://tinyurl.com/owgr9 http://tinyurl.com/ppybe http://tinyurl.com/pwb26 You should remember that I am speaking not from what I know, but from what I
can deduce based on the information provided (hence my referencing to making guesses on what was happening). I have found that the optimizers are too complex to ever fully understand, and that the best we can do is find general strategies that can sometimes make a difference in the execution plan. In this case, looking at the indexes is a necessary first step to determining what the difference could be. Excluding that as a difference, looking at the datatypes is a logical next step. If you are looking for answers as to exactly what the code does under the covers, then I apologize for cluttering up the thread with theory and guesses. However, if you are looking for idea that might help explain what is happening, then posting all of the information available will contribute to the discussion (including indexes and applicable links). Show quote "Ian Boyd" <ian.msnews***@avatopia.com> wrote in message news:e22t4eaWGHA.3800@TK2MSFTNGP03.phx.gbl... > > The question of keys and indexes is in reference to why it sends the > > filter > > to the remote database for TransactionNumber and TransactionDate, but not > > transactionGUID. > > i will confess, that i knew the answer to filtering on uniqueidentifier > columns before posting. > > Question #1 was a plant, to see who knows what they're talking about. > > > You can see my previous posts on remote uniqueidentifier filtering: > http://tinyurl.com/rrwtp > http://tinyurl.com/owgr9 > http://tinyurl.com/ppybe > http://tinyurl.com/pwb26 > > > posting all of the information available will contribute 1. Setting up databases on multiple servers, and linked servers, with > to the discussion (including indexes and applicable links). required volumes is something nobody here would attempt - only PSS could be convinced to try it. Even then, making it fail in the way would be iffy. (Just because it fails the exact same time for me, every time, every database i have tried it on - doesn't make it repeatable). 2. The nature of the symptoms is beyond everyone except those familiar with SQL Sever itself - not just people used to using it. None of those people patrols these groups. 3. The only reliable way for said person to test it would be to connect to our databases servers and run the queries for themselves - not very likely 4. People like to answer the question that wasn't being asked. My question: "Should these queries have different execution plans?" What they answer: "What kind of indexes and statistics can i use to ensure that these queries don't have different execution plans?" > If you think you know your data better than the optimizer does It seems to have no problem figuring out better data access strategies if i rewrite my queries to something identical. > you could use the REMOTE i'm doing a select, not a join. Plus, this isn't available in 2000.> join hint to tell the optimizer to process the join on the remote server. > Basically, the optimizer can only rely on statistics. Again, why does it then work sometimes, and not others - for the same query - same statistics - complete or not. > But when statistics are up to date on both sites, then the optimizer still i do appreciate the effort.> has full control over the execution of queries (unless specified otherwise > through the use of hints) which basically means that he can still make the > wrong assumptions - and I'm as much in the dark here as you are. Is anyone out there that knows the secret way to sucessfully query remote SQL Servers? This whole distributed federated sales pitch fails miserably whenever anyone uses it. And, as i've been asking for 3 or 4 years, is it a failing in the 2000 optimizer? Has it been fixed in 2005? If it has been fixed in 2005 then you have to admit it was broken in 2000. I dont know if it is improved in 2005, but I have always considered linked
servers (DB Links in Oracle) to be miserable for performance. They have their uses, but performance does not apear to be one of them. Show quote "Ian Boyd" <ian.msnews***@avatopia.com> wrote in message news:OUeVDFaWGHA.1900@TK2MSFTNGP04.phx.gbl... > > And, as i've been asking for 3 or 4 years, is it a failing in the 2000 > optimizer? Has it been fixed in 2005? If it has been fixed in 2005 then you > have to admit it was broken in 2000. > > FYI: the REMOTE join hint is available in SQL 2000 as well.
Described under "FROM" in Books Online. And, yes it's a join hint which can only be declared when joining local and remote objects. ML --- http://milambda.blogspot.com/ > Described under "FROM" in Books Online. Yeah, i was silly. i tried finding it on the web verses in the BOL.> And, yes it's a join hint which can only be declared when joining local Excellent, i can definetly make use of it when trying to join 30 local rows > and > remote objects. to 8million remote rows - and SQL Server insists on bringing 8M over here first. It's more like a finger than an arm, isn't it...? :)
The improvement for the view in SQL 2005 is refreshing news, though. If you don't have a blog up already, this whole experiment is definitely something to put up. Thanks! ML --- http://milambda.blogspot.com/ Can anyone
ever imagine any situation under any circumstances in any manner in any capacity in any way with any product running on any kind of data link fast or slow on any database setup existing in any universe moving at any velocity at any time since the big bang itself where it would be better to filter rows after fetching them? Absolutely. The same way there are cases when a full table scan is better
than using an index. If you have 1,000 rows in a table and you add a filter on and indexed field that returns 999 records, then SQL server will do a full table scan instead of using the index. If you have 1,000 rows in a remote table and you add a filter that returns 999 records, let the database that is actually going to use the data spend the cycles filtering it. Let the remote server just do a quick IO and send the data over the network with minimal CPU usage. Now, if the local server has statistics on the tables in the remote database, then it could make much better decisions regarding when to send the filter and when to not. Because it doesn't know the remote database, it has no idea how many rows will be returned (unless querying on a key field maybe?) and makes a (rather useless) judgement call. Show quote "Ian Boyd" <ian.msnews***@avatopia.com> wrote in message news:%23fsEP7ZWGHA.2064@TK2MSFTNGP03.phx.gbl... > Can anyone > ever imagine > any situation > under any circumstances > in any manner > in any capacity > in any way > with any product > running on any kind of data link > fast or slow > on any database setup > existing in any universe > moving at any velocity > at any time since the big bang itself > where it would be better to filter rows after fetching them? > > > If you have 1,000 rows in a remote table and you add a filter that returns i disagree. If the table contains only 1 row, perform the filter.> 999 records, let the database that is actually going to use the data spend > the cycles filtering it. Let the remote server just do a quick IO and > send > the data over the network with minimal CPU usage. Even if the table contains a million rows, and your statistics tell you that every row matches the filter, do the filter. You're going to have to serialize each row anyway - so don't send the non-matching ones. What you keep forgetting is that there are no statistics available until
after the query gets sent to the remote server. The local SQL server has to make a guess as to whether or not passing the parameter will be more efficent than processing the entire set. When it chooses (incorrectly in this case) to not send the parameter, the statistics are never taken into consideration. Show quote "Ian Boyd" <ian.msnews***@avatopia.com> wrote in message news:OvYnbhaWGHA.4148@TK2MSFTNGP03.phx.gbl... > > If you have 1,000 rows in a remote table and you add a filter that returns > > 999 records, let the database that is actually going to use the data spend > > the cycles filtering it. Let the remote server just do a quick IO and > > send > > the data over the network with minimal CPU usage. > > i disagree. If the table contains only 1 row, perform the filter. > > Even if the table contains a million rows, and your statistics tell you that > every row matches the filter, do the filter. > > You're going to have to serialize each row anyway - so don't send the > non-matching ones. > > > What you keep forgetting is that there are no statistics available until i see your point: always send over as much filtering criteria as possible.> after the query gets sent to the remote server. > The local SQL server has to It is always the incorrect choice to not send the parameters.> make a guess as to whether or not passing the parameter will be more > efficent than processing the entire set. When it chooses (incorrectly in > this case) to not send the parameter, the statistics are never taken into > consideration. "Ian Boyd" <ian.msnews***@avatopia.com> wrote in message On this point we differ. Although I would agree that it would benews:%23ImnTzaWGHA.196@TK2MSFTNGP04.phx.gbl... > It is always the incorrect choice to not send the parameters. > appropriate almost every time, and as a result sending over the criteria by default would be the logical thing to do, there would be rare cases when it might be more efficent to do it locally. Of course you don't write the code to cater to the rare occurances by sacrificing the performance on the vast majorty. I agree that is should be sending the parameters, assuming there are not compatability issues with them. However, soemtimes SQL server may not be smart enough to properly break down a query and determine which parameters can be passed to the server and which must be applied locally. Here is some SQL to create a table named "Transactions", and will contain
some fields including "TransactionDate", "TransactionNumber", "TransactionGUID" USE pubs go DROP TABLE Transactions go CREATE Table Transactions ( [TransactionID] [int] NOT NULL IDENTITY, TransactionGUID uniqueidentifier NOT NULL default newid(), [title] [varchar] (80) COLLATE SQL_Latin1_General_CP1_CI_AS NULL , [type] [char] (12) COLLATE SQL_Latin1_General_CP1_CI_AS NULL , [pub_id] [char] (4) COLLATE SQL_Latin1_General_CP1_CI_AS NULL , [price] [money] NULL , [advance] [money] NULL , [royalty] [int] NULL , [ytd_sales] [int] NULL , [notes] [varchar] (200) NULL , [TransactionDate] [datetime] NOT NULL DEFAULT (getdate()) ) go INSERT INTO Transactions (title, type, pub_id, price, advance, royalty, ytd_sales, notes, TransactionDate) SELECT [title], [type], [pub_id], [price], [advance], [royalty], [ytd_sales], [notes], CAST(CAST([pubdate] as real) + RAND(3234)*2000 - 1000 AS datetime) FROM Titles CROSS JOIN ( SELECT (a.Number * 256) + b.Number AS Number FROM master..spt_values a, master..spt_values b WHERE a.Type = 'p' AND b.Type = 'p') numbers You'll have to create the linked server yourself, you'll need two servers. You'll also have to substitute your own values for TransactionGUID, TransactionDate, TransactionNumber I am guessing to a fair extent here, but some possible (and I think likely)
explanations for this behavior... The main thing is, I think, that SQL Server has no statistics available when accessing a remote database and makes a best guess based on the existence of keys. Incidently, when working with Oracle 8 I found similar issues with their "database links". >Question #1: Why doesn't it include the WHERE clause in the remote query? No Index on transactionGUID?>Question #2: How is it bringing over 10 rows only? (hint: cursor) 10 rows is the size of the chunks retrieved. The top 1 is in the first tenrows, so SQL Server doesnt bother retrieving any further data from the remote database. >Question #3: Why does it include the WHERE clause in the remote query. Index or PK on TransactionNumber?>Question #4: Why does it include the WHERE clause in the remote query? Index or PK on TransactionDate?>Question #5: Why does it not include the WHERE clause in the remote query? Entire view is retrieved. The optimizer will sometimes do this with localviews as well, depending on how they are used. >Question #6: Why is it now including the TOP inside the remote query, when Top says return only X rows from the view. Because it needs to order the>my earlier issue of a TOP query didn't include the TOP limiter? >Question #7: Why is it now including the WHERE clause in the remote query, >when not including a TOP 1 it won't include the where clause limiter? >Question #8: Why does it include the WHERE clause when i have any TOP >limiter (even TOP 999999999, or TOP 100 PERCENT), but does not include the >where clause when i don't? results, a different execution plan is returned. This will happen with local views also, depending on statistics and how they are used. >Question #9: Why does it include the WHERE clause when i query through a The local SQL server is rewriting your select to directly access the table.>derived table, and not through a view? This particular SQL may look different to a human, but logically, and to SQL Server, it is identical to your 3rd example. This is perfectly consistent. Show quote "Ian Boyd" <ian.msnews***@avatopia.com> wrote in message news:O$7N6IZWGHA.4424@TK2MSFTNGP05.phx.gbl... > i'll just keep posting the problems, as my own archive. > > SELECT TransactionNumber > FROM servertest.CMSArchiveTraining.dbo.Transactions_90 > WHERE transactionGUID = '4956E240-8B6E-437E-B9B0-5E83FC25E3F0' > > Issues the remote query: > SOURCE:(servertest), > QUERY:( > SELECT Tbl1001."TransactionGUID" > Col1003,Tbl1001."TransactionNumber" Col1004 > FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 > ) > > And tries to bring back 8.5M rows - and takes minutes to return, when really > there is only one row. > > Question #1: Why doesn't it include the WHERE clause in the remote query? > > If i change the query to: > > SELECT TOP 1 TransactionNumber > FROM servertest.CMSArchiveTraining.dbo.Transactions_90 > WHERE transactionGUID = '4956E240-8B6E-437E-B9B0-5E83FC25E3F0' > > It still issues the remote query: > SOURCE:(servertest), > QUERY:( > SELECT > Tbl1001."TransactionGUID" Col1003, > Tbl1001."TransactionNumber" Col1004 > FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 > ) > > But this time it only brings over 10 rows (somehow), performs a top 1 > filter, and returns instantly. > > Question #2: How is it bringing over 10 rows only? (hint: cursor) > > If i change the query to: > SELECT TransactionNumber > FROM servertest.CMSArchiveTraining.dbo.Transactions_90 > WHERE TransactionNumber = 9679 > > It issues the remote query: > SOURCE:(servertest), > QUERY:( > SELECT > Tbl1001."TransactionNumber" Col1004 > FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 > WHERE Tbl1001."TransactionNumber"=(9679) > ) > > which returns only one row, and returns instantly. Obviously SQL Server can > *sometimes* do the optimization, othertimes it won't. > > Question #3: Why does it include the WHERE clause in the remote query. > > If i change the query to: > SELECT TransactionNumber > FROM servertest.CMSArchiveTraining.dbo.Transactions_90 > WHERE TransactionDate = '2002-06-10 08:19:10.513' > > It issues the remote query: > SOURCE:(servertest), > QUERY:( > SELECT > Tbl1001."TransactionNumber" Col1004 > FROM "cmsarchivetraining"."dbo"."Transactions_90" Tbl1001 > WHERE Tbl1001."TransactionDate"='2002-06-10T08:19:10.513' > ) > > Question #4: Why does it include the WHERE clause in the remote query? > > Now i create a view: > CREATE VIEW CMSArchiveTranasctions AS > SELECT * > FROM SERVERTEST.CMSArchiveTraining.dbo.Transactions_90 > > and issue the query: > > SELECT TransactionNumber > FROM CMSArchiveTransactions > WHERE TransactionNumber = 9679 > > It wants to issues the remote query: > SOURCE:(SERVERTEST), > QUERY:( > SELECT > Tbl1001."TransactionNumber" Col1005 > FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 > ) > > And tries to bring back 8.5M rows - and takes minutes to return, when really > there is only one row. > > Question #5: Why does it not include the WHERE clause in the remote query? > > If i change the query to: > SELECT TOP 1 TransactionNumber > FROM CMSArchiveTransactions > WHERE TransactionNumber = 9679 > > it issues the remote query: > SOURCE:(SERVERTEST), > QUERY:( > SELECT TOP 1 Col1004 > FROM ( > SELECT > Tbl1001."TransactionNumber" Col1004 > FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 > WHERE Tbl1001."TransactionNumber"=(9679) > ) Qry1018 > ) > > Question #6: Why is it now including the TOP inside the remote query, when > my earlier issue of a TOP query didn't include the TOP limiter? > > Question #7: Why is it now including the WHERE clause in the remote query, > when not including a TOP 1 it won't include the where clause limiter? > > Question #8: Why does it include the WHERE clause when i have any TOP > limiter (even TOP 999999999, or TOP 100 PERCENT), but does not include the > where clause when i don't? > > So i try changing my query to: > SELECT TransactionNumber > FROM ( > SELECT * > FROM SERVERTEST.CMSArchiveTraining.dbo.Transactions_90) t > WHERE TransactionNumber = 9679 > > And it issues the remote query: > SOURCE:(SERVERTEST), > QUERY:( > SELECT > Tbl1001."TransactionNumber" Col1004 > FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 > WHERE Tbl1001."TransactionNumber"=(9679) > ) > > So querying through a view it will not include the WHERE clause, but if i > include the view as a derived table, it can include the WHERE clause. > > Question #9: Why does it include the WHERE clause when i query through a > derived table, and not through a view? > > So i try changing the query to > SELECT TransactionNumber > FROM ( > SELECT * > FROM CMSArchiveTransactions) t > WHERE TransactionNumber = 9679 > > So here we are, pay attention to this one. If i attempt to query the view > directly, it doesn't get optimized. Now i am going to query my view THROUGH > a derived table. What do you think it will do? > > SOURCE:(SERVERTEST), > QUERY:( > SELECT Tbl1001."TransactionNumber" Col1004 > FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 > WHERE Tbl1001."TransactionNumber"=(9679) > ) > > It in fact now DOES include the where clause, but if i try to query my view > directly, it can't optimize it. This is a stunning development. > > It means the rules are random, which makes it difficult to optimize > distributed queries. > > > Rules for including where clause: > Filter on uniqueidentifer: > Query includes TOP 1: Yes* > Query includes TOP 2: No > Query includes TOP n: No > Query includes TOP n PERCENT: No > Query doesn't include TOP: No > Filter on integer: > Query linked SQL Server: Yes > Query view that queries linked SQL Server: > Query view directly: > Query includes TOP 1: Yes > Query includes TOP 2: Yes > Query includes TOP 100: Yes > Query includes TOP 999999999: Yes > Query includes TOP 100 PERCENT: Yes > Query doesn't include TOP: No > Query view through derived table: > Query includes TOP 1: Yes > Query includes: TOP 2: Yes > Query includes: TOP n: Yes > Query includes: TOP 100 PERCENT: Yes > Query includes TOP: Yes > > None of this is index or statistics related. Rewriting the queries into a
different form causes SQL Server to just "figure it out." It is an optimizer failing. >>Question #1: Why doesn't it include the WHERE clause in the remote query? At least for this one i have documentation on how it's the optimizer's > No Index on transactionGUID? fault: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/optimsql/odp_tun_1a_6oxf.asp aka http://tinyurl.com/ekfah Read from the 2nd paragraph under "Remote Query Execution" So that explains Question#1. Can anyone explain questions #2 through #9? >>Question #2: How is it bringing over 10 rows only? (hint: cursor) Question #10: Why does it not do that for a TOP 2?> 10 rows is the size of the chunks retrieved. The top 1 is in the first > ten > rows, so SQL Server doesnt bother retrieving any further data from the > remote database. > Index or PK on ... Question #12: Why does it sometimes work well, and not others, when i don't Indexed, fullscanned. change indexes or statistics. >>Question #5: Why does it not include the WHERE clause in the remote query? The "entire view is returned", and yet it can also sometimes work without > Entire view is retrieved. >>Question #6: Why is it now including the TOP inside the remote query, when >>my earlier issue of a TOP query didn't include the TOP limiter? >>Question #7: Why is it now including the WHERE clause in the remote query, >>when not including a TOP 1 it won't include the where clause limiter? >>Question #8: Why does it include the WHERE clause when i have any TOP >>limiter (even TOP 999999999, or TOP 100 PERCENT), but does not include the >>where clause when i don't? > Top says return only X rows from the view. Because it needs to order the > results, a different execution plan is returned. This will happen with > local views also, depending on statistics and how they are used. returning the entire view. e.g. Bad SELECT * FROM View WHERE ... Good: SELECT * FROM (SELECT * FROM View) dt WHERE ... > The local SQL server is rewriting your select to directly access the They are all identical, yet SQL Server performs then differently.> table. > This particular SQL may look different to a human, but logically, and to > SQL > Server, it is identical to your 3rd example. Query#1 SELECT * FROM SERVERTEST.dbo.CMSArchiveTraining.Transactions_90 WHERE TransactionNumber = 1234 is identical to Query#2 SELECT * FROM ( SELECT * FROM SERVERTEST.dbo.CMSArchiveTraining.Transactions_90) AS DerivedTable WHERE TransactionNumber = 1234 is identical to CREATE VIEW Transactions AS SELECT * FROM SERVERTEST.dbo.CMSArchiveTraining.Transactions_90 Query#3 SELECT * FROM Transactions WHERE TransactionNumber = 1234 is identical to CREATE VIEW Transactions AS SELECT * FROM SERVERTEST.dbo.CMSArchiveTraining.Transactions_90 Query#4 SELECT * FROM ( SELECT * FROM Tranasctions) AS DerivedTable WHERE TransactionNumber = 1234 And yet they all perform differently. Some perform exactly as you would expect, and others are unusable. And this wide variation all happens without changing anything on the remote server. Indexes are there, statistics are there (or equivalently indexes are not there, statistics are not there). It's not like i change anything on the remote server between each query attempt. Query 1, 2, 3 and 4 above are all the same. Why do they perform differently? Can i get pointed to a section in the BOL that explains how to why they could possibly perform differently. Then, i need the documentation that explains the right one to use. Generally, any construct involved a distributed view is bad. You have to rewrite your queries to not us a view, but you have to queries the remote servers yourself and join the results. Hence the half-dozen posts i've made on the subject in the last 4 years. Hello Ian.
I'm sorry that you've been having trouble with the distributed query feature. I'll attempt to answer some of your questions, but I'll preface this by saying that the reasons are not easy to explain. First, SQL 2005 now properly remotes filters over the uniqueidentifier type. The reasons for this are partially because we did not have statistical support and partially due to the fact that people weren't generally doing comparisons over these types when SQL 2000 was shipped. We've added statistical support for GUIDs in SQL 2005, and this allows us to make better choices for remoting and we enabled this functionality. In regards to some of your other questions, there are sometimes functional reasons why we can not remote things. (ie we would not return correct results in all cases if we did remote this element of the query). This can cause the query to always be executed locally. In other cases, we didn't have the resources to get every element remoted (and tested) before shipping. I know quite well that this causes customer pain, and there isn't a better answer on this. Without debugging, it's hard to make official answers on your other problems. I would suspect that the TOP case would retrieve a network packet full of rows and then perform the TOP if it is not remoted. This might be ~10 rows in your case. The interaction of several features is likely the cause of this, and 10 rows vs. 1 row isn't a huge performance issue if they always fit into 1 logical network packet. We wrote a paper that described some of the technicals about how the distributed query feature is implemented in SQL Server. Here's a link to it. I'll warn everyone that it's a research conference paper, and it doesn't go into the practical details at the level that you want for specific questions like "why doesn't it remote X?", but it should give an overview about how the system works and this might help you answer some of those specific questions you have the next time you see an issue and want to get more insight into why.. http://citeseer.ist.psu.edu/733756.html I'll try to post up a link to this on the Optimizer's team blog later today: http://blogs.msdn.com/QueryOptTeam I hope that this helps answer some of your questions. Thanks, Conor Show quote "Ian Boyd" <ian.msnews***@avatopia.com> wrote in message news:ODrpebaWGHA.4972@TK2MSFTNGP02.phx.gbl... > None of this is index or statistics related. Rewriting the queries into a > different form causes SQL Server to just "figure it out." It is an > optimizer failing. > >>>Question #1: Why doesn't it include the WHERE clause in the remote query? >> No Index on transactionGUID? > > At least for this one i have documentation on how it's the optimizer's > fault: > http://msdn.microsoft.com/library/default.asp?url=/library/en-us/optimsql/odp_tun_1a_6oxf.asp > > aka http://tinyurl.com/ekfah > Read from the 2nd paragraph under "Remote Query Execution" > > So that explains Question#1. Can anyone explain questions #2 through #9? > >>>Question #2: How is it bringing over 10 rows only? (hint: cursor) >> 10 rows is the size of the chunks retrieved. The top 1 is in the first >> ten >> rows, so SQL Server doesnt bother retrieving any further data from the >> remote database. > > Question #10: Why does it not do that for a TOP 2? > >> Index or PK on ... > Indexed, fullscanned. > > Question #12: Why does it sometimes work well, and not others, when i > don't change indexes or statistics. > >>>Question #5: Why does it not include the WHERE clause in the remote >>>query? >> Entire view is retrieved. > >>>Question #6: Why is it now including the TOP inside the remote query, >>>when >>>my earlier issue of a TOP query didn't include the TOP limiter? >>>Question #7: Why is it now including the WHERE clause in the remote >>>query, >>>when not including a TOP 1 it won't include the where clause limiter? >>>Question #8: Why does it include the WHERE clause when i have any TOP >>>limiter (even TOP 999999999, or TOP 100 PERCENT), but does not include >>>the >>>where clause when i don't? >> Top says return only X rows from the view. Because it needs to order the >> results, a different execution plan is returned. This will happen with >> local views also, depending on statistics and how they are used. > > The "entire view is returned", and yet it can also sometimes work without > returning the entire view. > > e.g. Bad > SELECT * FROM View > WHERE ... > Good: > SELECT * FROM (SELECT * FROM View) dt > WHERE ... > >> The local SQL server is rewriting your select to directly access the >> table. >> This particular SQL may look different to a human, but logically, and to >> SQL >> Server, it is identical to your 3rd example. > > They are all identical, yet SQL Server performs then differently. > > Query#1 > SELECT * > FROM SERVERTEST.dbo.CMSArchiveTraining.Transactions_90 > WHERE TransactionNumber = 1234 > > is identical to > > Query#2 > SELECT * > FROM ( > SELECT * > FROM SERVERTEST.dbo.CMSArchiveTraining.Transactions_90) AS DerivedTable > WHERE TransactionNumber = 1234 > > is identical to > > CREATE VIEW Transactions AS > SELECT * > FROM SERVERTEST.dbo.CMSArchiveTraining.Transactions_90 > > Query#3 > SELECT * > FROM Transactions > WHERE TransactionNumber = 1234 > > is identical to > > CREATE VIEW Transactions AS > SELECT * > FROM SERVERTEST.dbo.CMSArchiveTraining.Transactions_90 > > Query#4 > SELECT * > FROM ( > SELECT * FROM Tranasctions) AS DerivedTable > WHERE TransactionNumber = 1234 > > And yet they all perform differently. Some perform exactly as you would > expect, and others are unusable. And this wide variation all happens > without changing anything on the remote server. Indexes are there, > statistics are there (or equivalently indexes are not there, statistics > are not there). It's not like i change anything on the remote server > between each query attempt. > > Query 1, 2, 3 and 4 above are all the same. Why do they perform > differently? Can i get pointed to a section in the BOL that explains how > to why they could possibly perform differently. Then, i need the > documentation that explains the right one to use. Generally, any construct > involved a distributed view is bad. You have to rewrite your queries to > not us a view, but you have to queries the remote servers yourself and > join the results. > > Hence the half-dozen posts i've made on the subject in the last 4 years. > > First, SQL 2005 now properly remotes filters over the uniqueidentifier i tested this, and posted my results as a respose to my OP. It doesn't > type. remotely filter on a uniqueidentifier. i should preface that by saying that SQL 2005 does not remote it - i don't know if it is important that the remote server is not SQL 2005 (i wouldn't think it would be important) > The reasons for this are partially because we did not have statistical Is that why SQL 2005 isn't remotely filtering on uniqueidentifiers, the > support and partially due to the fact that people weren't generally doing > comparisons over these types when SQL 2000 was shipped. We've added > statistical support for GUIDs in SQL 2005, and this allows us to make > better choices for remoting and we enabled this functionality. remote data source does not have statistics available on a uniqueidentifier column? Does SQL Server just not collect them, even if you think you have them? Are statistics important in order to remote a filter? > In regards to some of your other questions, there are sometimes functional Would it be possible that any of my examples fit into this category? i > reasons why we can not remote things. (ie we would not return correct > results in all cases if we did remote this element of the query). This > can cause the query to always be executed locally. cannot imagine how remoting more of the query can affect the results - especially since when it does perform remoting - the results are fine. > In other cases, we didn't have the resources to get every element remoted Remoting is partially done in 2000. i've been fighting with it for a few > (and tested) before shipping. I know quite well that this causes customer > pain, and there isn't a better answer on this. years, and it's not done in 2005? For shame. Show quote > Without debugging, it's hard to make official answers on your other Any free mirrors?> problems. I would suspect that the TOP case would retrieve a network > packet full of rows and then perform the TOP if it is not remoted. This > might be ~10 rows in your case. The interaction of several features is > likely the cause of this, and 10 rows vs. 1 row isn't a huge performance > issue if they always fit into 1 logical network packet. > > We wrote a paper that described some of the technicals about how the > distributed query feature is implemented in SQL Server. > > Here's a link to it. I'll warn everyone that it's a research conference > paper, and it doesn't go into the practical details at the level that you > want for specific questions like "why doesn't it remote X?", but it should > give an overview about how the system works and this might help you answer > some of those specific questions you have the next time you see an issue > and want to get more insight into why.. > > http://citeseer.ist.psu.edu/733756.html > I'll try to post up a link to this on the Optimizer's team blog later Still left trying to infer things...and decipher SQL's thinking by trial and > today: > http://blogs.msdn.com/QueryOptTeam > I hope that this helps answer some of your questions. error. OP was for 2000 talking to 2000
Now testing 2005 talking to 2000. Test#1 > SELECT TransactionNumber 2005: Same, fails.> FROM servertest.CMSArchiveTraining.dbo.Transactions_90 > WHERE transactionGUID = '4956E240-8B6E-437E-B9B0-5E83FC25E3F0' > > QUERY:( > SELECT Tbl1001."TransactionGUID" > Col1003,Tbl1001."TransactionNumber" Col1004 > FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 > ) Test#2 > SELECT TOP 1 TransactionNumber 2005: Same, acceptable fail.> FROM servertest.CMSArchiveTraining.dbo.Transactions_90 > WHERE transactionGUID = '4956E240-8B6E-437E-B9B0-5E83FC25E3F0' > > QUERY:( > SELECT > Tbl1001."TransactionGUID" Col1003, > Tbl1001."TransactionNumber" Col1004 > FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 > ) > But this time it only brings over 10 rows (somehow), performs a top 1 > filter, and returns instantly. Test#3 > SELECT TransactionNumber 2005: Same, pass.> FROM servertest.CMSArchiveTraining.dbo.Transactions_90 > WHERE TransactionNumber = 9679 > > QUERY:( > SELECT > Tbl1001."TransactionNumber" Col1004 > FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 > WHERE Tbl1001."TransactionNumber"=(9679) > ) Test#4 > SELECT TransactionNumber Same: Same, pass.> FROM servertest.CMSArchiveTraining.dbo.Transactions_90 > WHERE TransactionDate = '2002-06-10 08:19:10.513' > > QUERY:( > SELECT > Tbl1001."TransactionNumber" Col1004 > FROM "cmsarchivetraining"."dbo"."Transactions_90" Tbl1001 > WHERE Tbl1001."TransactionDate"='2002-06-10T08:19:10.513' > ) Test#5 > CREATE VIEW CMSArchiveTranasctions AS 2005: FIXED!> SELECT * > FROM SERVERTEST.CMSArchiveTraining.dbo.Transactions_90 > > SELECT TransactionNumber > FROM CMSArchiveTransactions > WHERE TransactionNumber = 9679 > > QUERY:( > SELECT > Tbl1001."TransactionNumber" Col1005 > FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 > ) QUERY:( SELECT Tbl1001."TransactionNumber" Col1005 FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 WHERE Tbl1001."TransactionNumber"=? ) It parameterized the query, interesting. Test#6 > SELECT TOP 1 TransactionNumber 2005: Same, pass.> FROM CMSArchiveTransactions > WHERE TransactionNumber = 9679 > > QUERY:( > SELECT TOP 1 Col1004 > FROM ( > SELECT > Tbl1001."TransactionNumber" Col1004 > FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 > WHERE Tbl1001."TransactionNumber"=(9679) > ) Qry1018 > ) Test#7 > SELECT TransactionNumber 2005: Same, pass.> FROM ( > SELECT * > FROM SERVERTEST.CMSArchiveTraining.dbo.Transactions_90) t > WHERE TransactionNumber = 9679 > > QUERY:( > SELECT > Tbl1001."TransactionNumber" Col1004 > FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 > WHERE Tbl1001."TransactionNumber"=(9679) > ) Test#8 > SELECT TransactionNumber 2005: Same, pass.> FROM ( > SELECT * > FROM CMSArchiveTransactions) t > WHERE TransactionNumber = 9679 > > QUERY:( > SELECT Tbl1001."TransactionNumber" Col1004 > FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 > WHERE Tbl1001."TransactionNumber"=(9679) > ) Fixed: querying a remote server through a view. Still broken: Filtering by uniqueidentifier (and presumably also bit) Still hack: Using TOP 1 to select a single row when filtering by uniqueidentifier. But the view thing alone is nice. i hope it keeps working.
Show quote
> Test#5 Nice. Have you tried the view with the guid maybe?> > CREATE VIEW CMSArchiveTranasctions AS > > SELECT * > > FROM SERVERTEST.CMSArchiveTraining.dbo.Transactions_90 > > > > SELECT TransactionNumber > > FROM CMSArchiveTransactions > > WHERE TransactionNumber = 9679 > > > > QUERY:( > > SELECT > > Tbl1001."TransactionNumber" Col1005 > > FROM "CMSArchiveTraining"."dbo"."Transactions_90" Tbl1001 > > ) > > 2005: FIXED! ML --- http://milambda.blogspot.com/ > Nice. Have you tried the view with the guid maybe? That is an excellent idea.Unfortunaly, when i try to query the linked server from 2005 all i get is "An error occurred while executing batch. Error message is: Internal connection fatal error." Yet i can query it fine via a linked server from 2000. So, if 2005 decides to start working again i will have an answer for you. Otherwise, we will never know. At ML's suggestion, i've testing querying by uniqueidentifier from a 2005
machine to 2000. (i.e. 2005 to something remote) NOTES 1. A "pass" is if the WHERE clause is passed to the remote machine. 2. A "fail" is if the 8 million rows are transferred across the network in order to find the one matching 3. The view is defined as: CREATE VIEW CMSArchiveTransactions AS SELECT * FROM servertest.CMSArchiveTraining.dbo.Transactions_90 4. Collation compatable option on the linked server does not affect the results. 5. It is claimed that 2005 supports remoting of uniqueidentifiers (where 2000 documented that it did not support it). http://blogs.msdn.com/QueryOptTeam/ Test#1 SELECT TransactionNumber FROM servertest.CMSArchiveTraining.dbo.Transactions_90 WHERE TransactionGUID = '3A753496-CEB5-4DF6-8855-833BFDB22094' Result: FAIL Test#2 SELECT TOP 1 TransactionNumber FROM servertest.CMSArchiveTraining.dbo.Transactions_90 WHERE TransactionGUID = '3A753496-CEB5-4DF6-8855-833BFDB22094' Result: FAIL* (didn't remote filter, but stopped after bringing 9 rows over) Test#3 SELECT TransactionNumber FROM (SELECT * FROM servertest.CMSArchiveTraining.dbo.Transactions_90) dt WHERE TransactionGUID = '3A753496-CEB5-4DF6-8855-833BFDB22094' Result: FAIL Test#4 SELECT TransactionNumber FROM CMSArchiveTransactions WHERE TransactionGUID = '3A753496-CEB5-4DF6-8855-833BFDB22094' Result: FAIL Test#5 SELECT TransactionNumber FROM (SELECT * FROM CMSArchiveTransactions) dt WHERE TransactionGUID = '3A753496-CEB5-4DF6-8855-833BFDB22094' Result: FAIL Test#6 SELECT TOP 1 TransactionNumber FROM (SELECT * FROM CMSArchiveTransactions) dt WHERE TransactionGUID = '3A753496-CEB5-4DF6-8855-833BFDB22094' Result: FAIL* (didn't remote filter, but stopped after bringing 9 rows over) SUMMARY SQL 2005 will not remote uniqueidentifiers. They don't remote queries from SQL 2005 to SQL 2000 that contain filters comparing columns to constant uniqueidentifier values as well. The game is: What should the remote query be?
Get ready. Go! NOTE: These can be solved by inspection. No knowledge of DDL, indexes or statistics is required. Question#1: Querying a linked server SELECT * FROM servertest.CMSArchiveTraining.dbo.Transactions_90 WHERE TransactionGUID = '9B2D0E34-F91C-45E8-A904-1BDA0E72802B' Answer#1: QUERY:( SELECT * FROM "CMSArchiveTraining"."dbo"."Transactions_90" WHERE TransactionGUID = '9B2D0E34-F91C-45E8-A904-1BDA0E72802B' ) Question#2: Querying a linked server through a derived table SELECT * FROM (SELECT * FROM servertest.CMSArchiveTraining.dbo.Transactions_90) CMSArchiveTransactions WHERE TransactionGUID = '9B2D0E34-F91C-45E8-A904-1BDA0E72802B' Answer#2: QUERY:( SELECT * FROM "CMSArchiveTraining"."dbo"."Transactions_90" WHERE TransactionGUID = '9B2D0E34-F91C-45E8-A904-1BDA0E72802B' ) Question#3: Querying a linked server through a view CREATE VIEW CMSArchiveTransactions AS SELECT * FROM servertest.CMSArchiveTraining.dbo.Transactions_90 SELECT * FROM CMSArchiveTransactions WHERE TransactionGUID = '9B2D0E34-F91C-45E8-A904-1BDA0E72802B' Answer#3: QUERY:( SELECT * FROM "CMSArchiveTraining"."dbo"."Transactions_90" WHERE TransactionGUID = '9B2D0E34-F91C-45E8-A904-1BDA0E72802B' ) Question#4: Joining to a linked server table SELECT CMSArchiveTransactions.* FROM CMSArchiveTransactions t INNER JOIN Customers c ON t.CustomerID = c.CustomerID Answer#4: QUERY:( SELECT * FROM "CMSArchiveTraining"."dbo"."Transactions_90" ) Question#5: Joining to a linked server with join criteria SELECT CMSArchiveTransactions.* FROM CMSArchiveTransactions t INNER JOIN Customers c ON t.CustomerID = c.CustomerID WHERE c.CustomerID = 4463 Answer#5: QUERY:( SELECT * FROM "CMSArchiveTraining"."dbo"."Transactions_90" WHERE CustomerID = 4463 ) Question#6: Joining to a linked server with join criteria SELECT CMSArchiveTransactions.* FROM CMSArchiveTransactions t INNER JOIN Customers c ON t.CustomerID = c.CustomerID WHERE t.TransactionAmount > 1000 Answer#6: QUERY:( SELECT * FROM "CMSArchiveTraining"."dbo"."Transactions_90" WHERE TransactionAmount > 1000 ) Question#7: Joining to a linked server with join criteria SELECT CMSArchiveTransactions.* FROM CMSArchiveTransactions t INNER JOIN Customers c ON t.CustomerID = c.CustomerID WHERE t.TransactionAmount > 1000 AND c.CustomerID = 4463 Answer#7: QUERY:( SELECT * FROM "CMSArchiveTraining"."dbo"."Transactions_90" WHERE TransactionAmount > 1000 AND CustomerID = 4463 ) Question#8 Joining to a linked server with join criteria SELECT c.CustomerName, t.TransactionAmount FROM CMSArchiveTransactions t INNER JOIN Customers c ON t.CustomerID = c.CustomerID WHERE c.CustomerID = 4463 Answer#8: QUERY:( SELECT CustomerID, TransactionAmount FROM "CMSArchiveTraining"."dbo"."Transactions_90" WHERE CustomerID = 4463 ) Question#9: Joining to a linked server with join criteria SELECT c.CustomerName, t.TransactionAmount FROM CMSArchiveTransactions t INNER JOIN Customers c ON t.CustomerID = c.CustomerID WHERE c.CustomerID >= 4000 AND c.CustomerID <= 5000 Answer#9: QUERY:( SELECT CustomerID, TransactionAmount FROM "CMSArchiveTraining"."dbo"."Transactions_90" WHERE CustomerID >= 4000 AND CustomerID <= 5000 ) Question#10: Joining to a linked server with aggregates ELECT c.CustomerID, c.CustomerName, LastCustomerTransactions.LastTransactionDate FROM Customers c LEFT JOIN ( SELECT CustomerID, MAX(TransactionDate) AS LastTransactionDate FROM CMSArchiveTransactions GROUP BY CustomerID) AS LastCustomerTransactions ON c.CustomerID = LastCustomerTransactions.CustomerID Answer#10: QUERY:( SELECT CustomerID, MAX(TransactionDate) Col1042 FROM "CMSArchiveTraining"."dbo"."Transactions_90" ) Question#11 Joining to a linked server with aggregates and join criteria SELECT c.CustomerID, c.CustomerName, LastCustomerTransactions.LastTransactionDate FROM Customers c LEFT JOIN ( SELECT CustomerID, MAX(TransactionDate) AS LastTransactionDate FROM CMSArchiveTransactions GROUP BY CustomerID) AS LastCustomerTransactions ON c.CustomerID = LastCustomerTransactions.CustomerID WHERE c.CustomerID = 4463 Answer#11: QUERY:( SELECT CustomerID, MAX(TransactionDate) Col1042 FROM "CMSArchiveTraining"."dbo"."Transactions_90" WHERE CustomerID = 4463 ) Question#12: Joining to a linked server with aggregates SELECT c.CustomerID, c.CustomerName, LastCustomerTransactions.LastTransactionDate FROM Customers c LEFT JOIN ( SELECT CustomerID, MAX(TransactionDate) AS LastTransactionDate FROM CMSArchiveTransactions GROUP BY CustomerID) AS LastCustomerTransactions ON c.CustomerID = LastCustomerTransactions.CustomerID WHERE c.CustomerName = 'Ian Boyd' Answer#12: QUERY:( SELECT CustomerID, MAX(TransactionDate) Col1042 FROM "CMSArchiveTraining"."dbo"."Transactions_90" ) Question#13: Joining to a linked server with aggregates SELECT c.CustomerName, SUM(CMSArchiveTransactions.TransactionAmount) FROM CMSArchiveTransactions t INNER JOIN Customers c ON t.CustomerID = c.CustomerID GROUP BY c.CustomerName Answer#13: QUERY:( SELECT CustomerID, SUM(TransactionAmount) Col1042 FROM "CMSArchiveTraining"."dbo"."Transactions_90" GROUP BY CustomerID ) Question#14: Joining to a linked server with aggregates SELECT c.CustomerName, SUM(CMSArchiveTransactions.TransactionAmount) FROM CMSArchiveTransactions t INNER JOIN Customers c ON t.CustomerID = c.CustomerID WHERE CustomerName = 'Ian Boyd' GROUP BY c.CustomerName Answer#14: QUERY:( SELECT CustomerID, SUM(TransactionAmount) Col1042 FROM "CMSArchiveTraining"."dbo"."Transactions_90" GROUP BY CustomerID ) Question#15: Joining to a linked server with aggregates SELECT c.CustomerName, SUM(CMSArchiveTransactions.TransactionAmount) FROM CMSArchiveTransactions t INNER JOIN Customers c ON t.CustomerID = c.CustomerID WHERE c.CustomerName LIKE '%ian%' GROUP BY c.CustomerName Answer#15: QUERY:( SELECT CustomerID, SUM(TransactionAmount) Col1042 FROM "CMSArchiveTraining"."dbo"."Transactions_90" GROUP BY CustomerID ) Question#16: Joining to a linked server with aggregates SELECT c.CustomerName, SUM(CMSArchiveTransactions.TransactionAmount) FROM CMSArchiveTransactions t INNER JOIN Customers c ON t.CustomerID = c.CustomerID WHERE CustomerName LIKE '%ian%' AND c.CustomerID >= 4000 AND c.CustomerID <= 5000 GROUP BY c.CustomerName Answer#16: QUERY:( SELECT CustomerID, SUM(TransactionAmount) Col1042 FROM "CMSArchiveTraining"."dbo"."Transactions_90" WHERE CustomerID >= 4000 AND CustomerID <= 5000 GROUP BY CustomerID ) Question#17: Right join to linked table SELECT * FROM Customers c RIGHT OUTER JOIN CMSArchiveTransactions t ON c.CustomerID = t.CustomerID Answer#17: QUERY: ( SELECT * FROM "CMSArchiveTraining"."dbo"."Transactions_90" ) Question#18: Right join to linked table with left table filtering SELECT * FROM Customers c RIGHT OUTER JOIN CMSArchiveTransactions t ON c.CustomerID = t.CustomerID WHERE c.CustomerID = 4463 Answer#18: QUERY: ( SELECT * FROM "CMSArchiveTraining"."dbo"."Transactions_90" ) Question#19: Right join to linked table with right table filtering SELECT * FROM Customers c RIGHT OUTER JOIN CMSArchiveTransactions t ON c.CustomerID = t.CustomerID WHERE t.TransactionAmount >= 1000 Answer#19: QUERY: ( SELECT * FROM "CMSArchiveTraining"."dbo"."Transactions_90" WHERE TransactionAmount >= 1000 ) Question#20: Right join to linked table with both table filtering SELECT * FROM Customers c RIGHT OUTER JOIN CMSArchiveTransactions t ON c.CustomerID = t.CustomerID WHERE c.CustomerID = 4463 AND t.TransactionAmount >= 1000 Answer#20: QUERY: ( SELECT * FROM "CMSArchiveTraining"."dbo"."Transactions_90" WHERE TransactionAmount >= 1000 ) Extra credit questions.
Question#21: CREATE VIEW AllTransactions AS SELECT t.*, 'local' AS SourceTable FROM LocalTransactions t UNION ALL SELECT t.*, 'remote' AS SourceTable FROM servertest.CMSArchiveTraining.dbo.Transactions_90 SELECT * FROM AllTransactions WHERE Source = 'local' Answer#21 no remote query Question#22 SELECT * FROM AllTransactions WHERE Source <> 'remote' Answer#22 no remote query Question#23 SELECT MAX(TranasctionDate) FROM AllTransactions Answer#23 QUERY( SELECT MAX(TransactionDate) Col1402 FROM CMSArchiveTraining.dbo.Transactions_90) Bonus points: AGGREGATE CONCAT LOCALQUERY( SELECT MAX(TransactionDate) Col1402 FROM LocalTransactions) REMOTEQUERY( SELECT MAX(TransactionDate) Col1402 FROM CMSArchiveTraining.dbo.Transactions_90) Question#24 SELECT c.CustomerID, MAX(TransactionDate) FROM Customers c INNER JOIN AllTransactions t GROUP BY c.CustomerID Answer#24 QUERY( SELECT CustomerID, MAX(TranasctionDate) FROM CMSArchiveTraining.dbo.Transactions_90 GROUP BY CustomerID) Bonus points: INNER JOIN Customers AGGREGATE CONCAT REMOTEQUERY( SELECT CustomerID, MAX(TranasctionDate) FROM CMSArchiveTraining.dbo.Transactions_90 GROUP BY CustomerID) LOCALQUERY( SELECT CustomerID, MAX(TransactionDate) FROM LocalTransactions GROUP BY CustomerID) |
|||||||||||||||||||||||