Home All Groups Group Topic Archive Search About

When to switch from SWITCH() to JOIN

Author
29 Jul 2005 2:41 PM
D Babin
This is a rather complex post in which I state my concerns about the
performance of SWITCH() and JOIN.

I'm writing a utility for use with MS Access databases, generating SQL
executed via ADO.  First I create a series of fact tables for most fields
(but not every field).  The objective is to recreate the source tables with
new values substituted for most field values - derived from the fact tables.

MAIN                      |FACT1      |FACT2        |FACT3     |...Could be
KEY COL1    COL2      COL3|ORG1   NEW1|ORG2     NEW2|ORG3  NEW3|perhaps 100
--------------------------|-----------|-------------|----------|fact tables
1   GM      11/05/04  NO  |BOEING VAL1|07/21/04 VAL1|NO    VAL1|depending on
the
2   SEARS   02/13/05  YES |GM     VAL2|11/05/04 VAL2|YES   VAL2|number of
columns
3   SHELL   07/21/04  YES |IBM    VAL3|02/13/05 VAL3|          |in MAIN
4   BOEING  05/05/05  NO  |SEARS  VAL4|04/12/05 VAL4|          |
5   IBM     04/12/05  YES |SHELL  VAL5|05/05/05 VAL5|          |

.... could be more than
    1,000,000 rows


Here is my solution - recreating the original table with the substituted
values through a series of code-generated queries:

1) First I recreate the table, i.e., CREATE TABLE ...

2) Next I populate ALL of the rows and populate SOME of the fields via INSERT

INSERT INTO NEWMAIN (
       COL1,
       COL2,
       COL3
    SELECT
       MAIN.KEY, 
       IIF(FACT1.NEW1 IS NULL,NULL,FACT1.NEW1) AS COL1,
       IIF(FACT1.NEW2 IS NULL,NULL,FACT1.NEW2) AS COL2,
       SWITCH(MAIN.COL3=,NULL,NULL,MAIN.COL3=,NO,VAL1,MAIN.COL3=,YES,VAL2)
AS COL3
    FROM (MAIN 
       LEFT JOIN FACT1 ON MAIN.COL1 = FACT1.ORG1) 
       LEFT JOIN FACT2 ON MAIN.COL2 = FACT1.ORG2;

3) Finally, if all columns were not populated, I finish populating them with
one or more UPDATE queries similar to the INSERT query above.



Now here are my performance questions:

1) Clearly SWITCH() is superior to JOIN for tiny value sets such as 2,3,4 or
5 values.  However, when does SWITCH() become less efficient than JOIN?  Will
SWITCH() execute faster than JOIN for 100 values?  And if I have multiple
SWITCH() functions containing 100 values each, the SQL statement will become
bloated fast - which begs the next question.

2) Does Microsoft Jet 4.0 OLEDB impose a limitation on SQL statement size -
and what is this limitation?

3) Regarding the JOINS, is there a rule-of-thumb regarding the number of
acceptable JOINS before performance takes a nose dive?

4) Are there performance benefits by arranging the joined tables in a
specific sequence, for example:

   LEFT JOIN FACT7  ... (FACT7 contains 1,000 records)
   LEFT JOIN FACT12 ... (FACT12 contains 10,000 records)
   LEFT JOIN FACT9  ... (FACT9 contains 100,000 records)
   LEFT JOIN FACT20 ... (FACT20 contains 1,000,000 records)

   vs.

   LEFT JOIN FACT20 ... (FACT20 contains 1,000,000 records)
   LEFT JOIN FACT9  ... (FACT9 contains 100,000 records)
   LEFT JOIN FACT12 ... (FACT12 contains 10,000 records)
   LEFT JOIN FACT7  ... (FACT7 contains 1,000 records)

5) Finally, given the SQL language limitations imposed by Microsoft Access,
is there a radically better approach than the general approach that I have
outlined above?

As a side note, I would like to thank Aaron Bertrand, Anith Sen, David
Portas, Frank Chang and Jatinder Singh for replying to my post yesterday "How
to find gaps in sequential key".  I had to leave after posting, and this is
my first opportunity to say Thank you to all.

Best regards,

D Babin

Author
29 Jul 2005 2:49 PM
Aaron Bertrand [SQL Server MVP]
This seems like it involves Microsoft Access, and not SQL Server.  You may
have better luck posting in a newsgroup dedicated to Access...



Show quote
> This is a rather complex post in which I state my concerns about the
> performance of SWITCH() and JOIN.

AddThis Social Bookmark Button