Home All Groups Group Topic Archive Search About

Re: SELECT INTO temp table - Pros and Cons

Author
9 Sep 2006 9:22 AM
Erland Sommarskog
John Smith (some***@microsoft.com) writes:
Show quote
> We've been having a lively debate, at my office, about the pros and cons
> of using SELECT INTO versus CREATE TABLE/INSERT INTO when populating
> temp tables in SQL Server 2000 stored procedures. Without stating my
> opinion, I'd like to hear what others think. Ideally, I'd like to see
> links to supporting documentation from reputable sites (preferably
> Microsoft). Here are some of the points that have been made thus far:
>
> 1) SELECT INTO will be more efficient on large resultsets because tempdb
> is set to simple recovery; therefore, the insert will be minimally
> logged. INSERT INTO will be fully logged.
>
> 2) CREATE TABLE/INSERT INTO is better form because it is explicit and the
> data types of the table columns are easily identifiable within the
> procedure.
>
> 3) SELECT INTO can be problematic because it can cause unnecessary
> recompiles if not coded properly.
>
> 4) It used to be that SELECT INTO would lock tempdb for the duration of
> the SELECT statement's execution; however, this has not been an issue
> with SQL Server version 7.0 and later.
>
> It's hard to find anything definitive from Microsoft. Hopefully someone
> here can point me in the right direction.

It is hard to find anything definitive, because the answer for many
performance questions are "it depends".

I think you make a fairly good summary, although I'm not sure that I
understand the third point - you can get recompiles with CREATE TABLE +
INSERT too. I don't think SELECT INTO is more prone to this.

I like to add a few points:

5) With CREATE TABLE you can defines keys etc for the table directly.
If you add it later you, will get recompiles - which you may or may not
want.

6) I you want to populate an IDENTITY column with a certain order, you
cannot rely on that this will work with SELECT INTO. INSERT usually works.

7) There is a larger overhead for the table creation with SELECT INTO.

The latter point may be seem minor, but I was once researching a
performance problem, and I had about to given up, but decieded to
look at a minor issue in the Profiler trace. In a trigger I had written

    SELECT * INTO #inserted FROM inserted

the reason for this is in my experience the virtual tables inserted and
deleted are very slow, since on SQL 2000 they are read from the log. In
the function I was looking at this particular table was inserted into and
updated frequently - one row at a time in a loop. When I removed the
temp table from the trigger that resolved the performance problem I was
looking at! Part of the issue here is that the entire loop was one
transaction, so I ended up with very many locks in tempdb. Of course,
using CREATE TABLE had similar effects, but the cost for SELECT INTO
was higher. (I'm now using table variables for inserted/deletd.)

I had another interesting experience recently. I've been running tests of
various methods to unpack character lists into tables. My test procedures
used look like this:

   DECLARE @start datetime
   SELECT @start = getdate()

   SELECT number
   INTO   tmp
   FROM   iter_intlist_to_tbl(@str)

   SELECT @tookms = datediff(ms, @start, getdate());

   SELECT number FROM tmp
   DROP TABLE tmp

This worked fine when I ran the test for SQL 2000 a couple of years ago.
But on SQL 2005 some invocations would give higher execution times,
despite the test machine being idle. I had a suspicion that autostats
on the system tables could be the reason, and these autostats cannot
be turned off. So I decided to run a Perl script over all test procedures
and change them to use INSERT into a permanent table instaed. (With a
TRUNCATE TABLE at the end.)

The result of this was interesting. I run these tests with 20, 200, 650,
2000 and 10000 element in the lists. For 10000 elements there was indeed
generally a 10% in execution time. But for the shortest lists, there was
a drastic decrease on the slowest test machine. Average for the fastest
methods fell from over 40 ms to around 5 ms. (And my issue with occassional
odd execution times was resovled.)

My bottom line of this is that if you have real big datasets, SELECT INTO
can give you some benefit. But I prefer CREATE TABLE + INSERT for the
reason listed in point 2.

--
Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

Author
9 Sep 2006 10:17 AM
Roy Harvey
On Sat, 9 Sep 2006 09:22:09 +0000 (UTC), Erland Sommarskog
<esq***@sommarskog.se> wrote:

Show quote
>> 1) SELECT INTO will be more efficient on large resultsets because tempdb
>> is set to simple recovery; therefore, the insert will be minimally
>> logged. INSERT INTO will be fully logged.
>>
>> 2) CREATE TABLE/INSERT INTO is better form because it is explicit and the
>> data types of the table columns are easily identifiable within the
>> procedure.
>>
>> 3) SELECT INTO can be problematic because it can cause unnecessary
>> recompiles if not coded properly.
>>
>> 4) It used to be that SELECT INTO would lock tempdb for the duration of
>> the SELECT statement's execution; however, this has not been an issue
>> with SQL Server version 7.0 and later.
>>
>> It's hard to find anything definitive from Microsoft. Hopefully someone
>> here can point me in the right direction.
>
>It is hard to find anything definitive, because the answer for many
>performance questions are "it depends".
>
>I think you make a fairly good summary, although I'm not sure that I
>understand the third point - you can get recompiles with CREATE TABLE +
>INSERT too. I don't think SELECT INTO is more prone to this.
>
>I like to add a few points:
>
>5) With CREATE TABLE you can defines keys etc for the table directly.
>If you add it later you, will get recompiles - which you may or may not
>want.
>
>6) I you want to populate an IDENTITY column with a certain order, you
>cannot rely on that this will work with SELECT INTO. INSERT usually works.
>
>7) There is a larger overhead for the table creation with SELECT INTO.

One other potential issue when using SELECT INTO:

8) Any computed columns in the source table become physical columns in
the target.

Roy Harvey
Beacon Falls, CT

AddThis Social Bookmark Button