|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Subqueries vs. JoinsI was writing a demo case for instructional purposes on the virtues of using joins instead of subqueries, and I was planning on using test cases like these (using Northwind): Code: SELECT o.orderid, o.orderdate, (SELECT companyname FROM shippers s WHERE s.shipperid = o.shipvia) companyname, (SELECT lastname + ', ' + firstname FROM employees e WHERE e.employeeid = o.employeeid) FullName FROM orders o WHERE o.orderdate between '1-1-1996' and '12-31-1996' ORDER BY o.orderdate Code: SELECT o.orderid, o.orderdate, s.companyname, e.lastname + ', ' + e.firstname FullName FROM orders o INNER JOIN shippers s ON s.shipperid = o.shipvia INNER JOIN employees e ON e.employeeid = o.employeeid WHERE o.orderdate between '1-1-1996' and '12-31-1996' ORDER BY o.orderdate The first uses subqueries, the second uses joins. (Please ignore for the moment that I would probably never return first + last name in this manner, or that this query is inane to begin with.) The point was going to be: don't use a subquery when you can use a join, as it's less efficient. If you run these queries as-is, the execution plan shows a (slightly) better "cost" efficiency for the one that uses joins. However, if you remove the WHERE and ORDER BY clauses, the subquery has a (significantly!) better cost efficiency (running them together in QA and then checking the combined execution plan is the easiest way to get an idea of relative costs, if you want to try this yourself). Anyone know why the subqueries would be more efficient in this case? If the tables had 50,000 records in them, do you think the joins would be more efficient? Anyone know of a good source on this subject? Thanks for any insight, Phil Hi
The point is does an optimizer available to use any indexes defined on the tables , doesn't it? Personally, I'd prefer using JOIN's with an index on FK columns . http://www.sql-server-performance.com/tuning_joins.asp http://www.sql-server-performance.com/query_execution_plan_analysis.asp http://www.sql-server-performance.com/transact_sql_where.asp <psandle***@hotmail.com> wrote in message Show quote news:1123821016.890254.39970@z14g2000cwz.googlegroups.com... > Hello all, > > I was writing a demo case for instructional purposes on the virtues of > using joins instead of subqueries, and I was planning on using test > cases like these (using Northwind): > > Code: > SELECT > o.orderid, > o.orderdate, > (SELECT companyname FROM shippers s WHERE s.shipperid = o.shipvia) > companyname, > (SELECT lastname + ', ' + firstname FROM employees e WHERE > e.employeeid = o.employeeid) FullName > FROM > orders o > WHERE > o.orderdate between '1-1-1996' and '12-31-1996' > ORDER BY > o.orderdate > > > > Code: > SELECT > o.orderid, > o.orderdate, > s.companyname, > e.lastname + ', ' + e.firstname FullName > FROM > orders o > INNER JOIN shippers s ON s.shipperid = o.shipvia > INNER JOIN employees e ON e.employeeid = o.employeeid > WHERE > o.orderdate between '1-1-1996' and '12-31-1996' > ORDER BY > o.orderdate > > > The first uses subqueries, the second uses joins. (Please ignore for > the moment that I would probably never return first + last name in this > manner, or that this query is inane to begin with.) > > The point was going to be: don't use a subquery when you can use a > join, as it's less efficient. > > If you run these queries as-is, the execution plan shows a (slightly) > better "cost" efficiency for the one that uses joins. > > However, if you remove the WHERE and ORDER BY clauses, the subquery has > a (significantly!) better cost efficiency (running them together in QA > and then checking the combined execution plan is the easiest way to get > an idea of relative costs, if you want to try this yourself). > > Anyone know why the subqueries would be more efficient in this case? > > If the tables had 50,000 records in them, do you think the joins would > be more efficient? > > Anyone know of a good source on this subject? > > > Thanks for any insight, > > Phil > Phil wrote:
> If you run these queries as-is, the execution plan shows a (slightly) Not in my Northwind: the first query has a cost of 0.0952 and the> better "cost" efficiency for the one that uses joins. second query has a cost of 0.101 (I'm using SQL Server 2000 SP4). The first query seems to have a better cost because it uses a "Hash Match/Cache" to minimize the number of "Clustered Index Seek"-s in the Employees table. Anyway, I think the difference in performance is too small to be significant in this case. I think that joins are more appropriate (for this query), because they seem easier to read (at least, for me). The optimizer may perform better in later versions of SQL Server, i.e. it may choose the same (best) plan for both queries. Razvan The queries below are not just subqueries, they are correlated subqueries.
There is a big difference. Sometimes I will use a subquery in the FROM clause--usually when I need to aggregate interim results before joining. The optimizer can most of the time decompose a correlated subquery into a join, so most of the time a similar execution plan will be generated whether you use a join or a correlated subquery. In my experience the optimizer seems to have the most trouble when one or more self-joins are involved in a query, and many times in those cases a cursor will actually perform better because you can eliminate one or more iteration steps from the execution plan. The questions posed below have no precise answer. The execution plan that the optimizer chooses depends on a number of different factors: the cardinality of each interim result, the availability of indexes and whether or not they're clustered or covering, the existence of statistics and how current they are, etc. I disagree with your point. The optimizer can most of the time decompose a correlated subquery into a join, so most of the time a similar execution plan will be generated whether you use a join or a correlated subquery. In my experience the optimizer seems to have the most trouble when one or more self-joins are involved in a query, and many times in those cases a cursor will actually perform better because you can eliminate one or more iteration steps from the execution plan. Query optimization is a really difficult subject to cover in a newsgroup post. There is some information in BOL, and there are a number of books on the subject, but I find that the best way to optimize queries is on a case-by-case basis by using Query Analyzer and Profiler. <psandle***@hotmail.com> wrote in message Show quote news:1123821016.890254.39970@z14g2000cwz.googlegroups.com... > Hello all, > > I was writing a demo case for instructional purposes on the virtues of > using joins instead of subqueries, and I was planning on using test > cases like these (using Northwind): > > Code: > SELECT > o.orderid, > o.orderdate, > (SELECT companyname FROM shippers s WHERE s.shipperid = o.shipvia) > companyname, > (SELECT lastname + ', ' + firstname FROM employees e WHERE > e.employeeid = o.employeeid) FullName > FROM > orders o > WHERE > o.orderdate between '1-1-1996' and '12-31-1996' > ORDER BY > o.orderdate > > > > Code: > SELECT > o.orderid, > o.orderdate, > s.companyname, > e.lastname + ', ' + e.firstname FullName > FROM > orders o > INNER JOIN shippers s ON s.shipperid = o.shipvia > INNER JOIN employees e ON e.employeeid = o.employeeid > WHERE > o.orderdate between '1-1-1996' and '12-31-1996' > ORDER BY > o.orderdate > > > The first uses subqueries, the second uses joins. (Please ignore for > the moment that I would probably never return first + last name in this > manner, or that this query is inane to begin with.) > > The point was going to be: don't use a subquery when you can use a > join, as it's less efficient. > > If you run these queries as-is, the execution plan shows a (slightly) > better "cost" efficiency for the one that uses joins. > > However, if you remove the WHERE and ORDER BY clauses, the subquery has > a (significantly!) better cost efficiency (running them together in QA > and then checking the combined execution plan is the easiest way to get > an idea of relative costs, if you want to try this yourself). > > Anyone know why the subqueries would be more efficient in this case? > > If the tables had 50,000 records in them, do you think the joins would > be more efficient? > > Anyone know of a good source on this subject? > > > Thanks for any insight, > > Phil > |
|||||||||||||||||||||||