Home All Groups Group Topic Archive Search About
Author
27 Jul 2006 11:22 AM
Simon Woods
Hi

I wonder if someone could explain, or direct me to an explanation, of what
happens under the hood when a LIKE statement executes and an IN statement.
Theoretically, should using an IN be faster on a non-indexed field than
using a LIKE on the same field - or is it not possible to say generally?

Thanks

Simon

Author
27 Jul 2006 2:31 PM
Anith Sen
>> Theoretically, should using an IN be faster on a non-indexed field than
>> using a LIKE on the same field - or is it not possible to say generally?

General conclusions are mostly impossible with performance.

--
Anith
Author
27 Jul 2006 8:49 PM
Hugo Kornelis
On Thu, 27 Jul 2006 12:22:43 +0100, Simon Woods wrote:

>Hi
>
>I wonder if someone could explain, or direct me to an explanation, of what
>happens under the hood when a LIKE statement executes and an IN statement.

Hi Simon,

An IN simply gets expanded to a bunch of comparisons with OR. E.g.

  WHERE col1 IN (1, 2, 4)

is equivalent to

  WHERE col1 = 1 OR col1 = 2 OR col1 = 4


How a LIKE is executed depends on the search string, and on whether the
column is indexed and whether the optimizer has chosen to use that
index.

If the search argument doesn't start with a wildcard character (e.g.
WHERE col1 LIKE 'abc[def][^g]%x'), then the index is used to navigate to
the first row that might match (in this case the first row that has
'abc' as first characters in col1); then the index is navigated in
logical index order and each row is checked against the complete filter
to decide if it is selected or not. This stops as soon as the first row
that doesn't match the fixed starting part of the LIKE expression is
found (in the example: the first row starting with 'abd' or above).

If there's no index, or if the LIKE expression starts with a wildcard
(e.g. WHERE col1 LIKE '%key'), then all rows in the table have to be
examined (unless other parts of the query can be used to quickly narrow
doown the amount of rows to be searched).

>Theoretically, should using an IN be faster on a non-indexed field than
>using a LIKE on the same field - or is it not possible to say generally?

There is no single always-correct answer. Best to test both alternatives
on a system that matches your production server as closely as possible,
with data that matches the real data as possible, and in a workload that
matches the real workloads as closely as possible.

--
Hugo Kornelis, SQL Server MVP
Author
28 Jul 2006 6:58 AM
Simon Woods
Hugo Kornelis wrote:

<snip>

Thx Hugo, very helpful.

AddThis Social Bookmark Button