|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Re: SELECT INTO temp table - Pros and ConsShow quote > We've been having a lively debate, at my office, about the pros and cons It is hard to find anything definitive, because the answer for many> of using SELECT INTO versus CREATE TABLE/INSERT INTO when populating > temp tables in SQL Server 2000 stored procedures. Without stating my > opinion, I'd like to hear what others think. Ideally, I'd like to see > links to supporting documentation from reputable sites (preferably > Microsoft). Here are some of the points that have been made thus far: > > 1) SELECT INTO will be more efficient on large resultsets because tempdb > is set to simple recovery; therefore, the insert will be minimally > logged. INSERT INTO will be fully logged. > > 2) CREATE TABLE/INSERT INTO is better form because it is explicit and the > data types of the table columns are easily identifiable within the > procedure. > > 3) SELECT INTO can be problematic because it can cause unnecessary > recompiles if not coded properly. > > 4) It used to be that SELECT INTO would lock tempdb for the duration of > the SELECT statement's execution; however, this has not been an issue > with SQL Server version 7.0 and later. > > It's hard to find anything definitive from Microsoft. Hopefully someone > here can point me in the right direction. performance questions are "it depends". I think you make a fairly good summary, although I'm not sure that I understand the third point - you can get recompiles with CREATE TABLE + INSERT too. I don't think SELECT INTO is more prone to this. I like to add a few points: 5) With CREATE TABLE you can defines keys etc for the table directly. If you add it later you, will get recompiles - which you may or may not want. 6) I you want to populate an IDENTITY column with a certain order, you cannot rely on that this will work with SELECT INTO. INSERT usually works. 7) There is a larger overhead for the table creation with SELECT INTO. The latter point may be seem minor, but I was once researching a performance problem, and I had about to given up, but decieded to look at a minor issue in the Profiler trace. In a trigger I had written SELECT * INTO #inserted FROM inserted the reason for this is in my experience the virtual tables inserted and deleted are very slow, since on SQL 2000 they are read from the log. In the function I was looking at this particular table was inserted into and updated frequently - one row at a time in a loop. When I removed the temp table from the trigger that resolved the performance problem I was looking at! Part of the issue here is that the entire loop was one transaction, so I ended up with very many locks in tempdb. Of course, using CREATE TABLE had similar effects, but the cost for SELECT INTO was higher. (I'm now using table variables for inserted/deletd.) I had another interesting experience recently. I've been running tests of various methods to unpack character lists into tables. My test procedures used look like this: DECLARE @start datetime SELECT @start = getdate() SELECT number INTO tmp FROM iter_intlist_to_tbl(@str) SELECT @tookms = datediff(ms, @start, getdate()); SELECT number FROM tmp DROP TABLE tmp This worked fine when I ran the test for SQL 2000 a couple of years ago. But on SQL 2005 some invocations would give higher execution times, despite the test machine being idle. I had a suspicion that autostats on the system tables could be the reason, and these autostats cannot be turned off. So I decided to run a Perl script over all test procedures and change them to use INSERT into a permanent table instaed. (With a TRUNCATE TABLE at the end.) The result of this was interesting. I run these tests with 20, 200, 650, 2000 and 10000 element in the lists. For 10000 elements there was indeed generally a 10% in execution time. But for the shortest lists, there was a drastic decrease on the slowest test machine. Average for the fastest methods fell from over 40 ms to around 5 ms. (And my issue with occassional odd execution times was resovled.) My bottom line of this is that if you have real big datasets, SELECT INTO can give you some benefit. But I prefer CREATE TABLE + INSERT for the reason listed in point 2. -- Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx On Sat, 9 Sep 2006 09:22:09 +0000 (UTC), Erland Sommarskog
<esq***@sommarskog.se> wrote: Show quote >> 1) SELECT INTO will be more efficient on large resultsets because tempdb One other potential issue when using SELECT INTO:>> is set to simple recovery; therefore, the insert will be minimally >> logged. INSERT INTO will be fully logged. >> >> 2) CREATE TABLE/INSERT INTO is better form because it is explicit and the >> data types of the table columns are easily identifiable within the >> procedure. >> >> 3) SELECT INTO can be problematic because it can cause unnecessary >> recompiles if not coded properly. >> >> 4) It used to be that SELECT INTO would lock tempdb for the duration of >> the SELECT statement's execution; however, this has not been an issue >> with SQL Server version 7.0 and later. >> >> It's hard to find anything definitive from Microsoft. Hopefully someone >> here can point me in the right direction. > >It is hard to find anything definitive, because the answer for many >performance questions are "it depends". > >I think you make a fairly good summary, although I'm not sure that I >understand the third point - you can get recompiles with CREATE TABLE + >INSERT too. I don't think SELECT INTO is more prone to this. > >I like to add a few points: > >5) With CREATE TABLE you can defines keys etc for the table directly. >If you add it later you, will get recompiles - which you may or may not >want. > >6) I you want to populate an IDENTITY column with a certain order, you >cannot rely on that this will work with SELECT INTO. INSERT usually works. > >7) There is a larger overhead for the table creation with SELECT INTO. 8) Any computed columns in the source table become physical columns in the target. Roy Harvey Beacon Falls, CT |
|||||||||||||||||||||||