Home All Groups Group Topic Archive Search About

parse out set of numbers from a column

Author
31 Aug 2006 6:16 PM
gv
Hi all,

I have the following text in a Varchar(200) column in a table and want to
parse out just the first set of numbers:

Example data

Col1
Order the 90311 - Prov Order Notify User of Ack 8894
Order 30511 was assigned. 1 tsirry 2000-2-5 00:00: 2003-12-29 1:56:23.827
pwilliams NULL 186 pwilliams
Order -- 8028433 was assigned. 10284 1064 WF - Prov Order Notify User of Ack
8899
Order        202784 was assigned. 1 lbd 2001-12-16 00:00:00.000 203
Order 90159 - WF - Prov Order FOC/PCN Reminder 822
Order # $1088379 - FC or PN Dates needed.  FOC = [Not all Specified]. 1
smray 2002-12-20 00:00:00.000 2003-12-2000-

I would like to return  like this

90311
30511
8028433
202784
90159
1088379

thanks
gv

Author
31 Aug 2006 6:51 PM
gv
This will find the first occurance of a number
       CHARINDEX(CAST(PATINDEX('%[0-9]%',Col1)AS VARCHAR),Col1)

I'm not sure how to determin the end of the set of numbers to find the
length
of numbers to use substring to pull the set of numbers?

thanks
gv


Show quote
"gv" <viator.ge***@gmail.com> wrote in message
news:uvmhqmSzGHA.4116@TK2MSFTNGP02.phx.gbl...
> Hi all,
>
> I have the following text in a Varchar(200) column in a table and want to
> parse out just the first set of numbers:
>
> Example data
>
> Col1
> Order the 90311 - Prov Order Notify User of Ack 8894
> Order 30511 was assigned. 1 tsirry 2000-2-5 00:00: 2003-12-29 1:56:23.827
> pwilliams NULL 186 pwilliams
> Order -- 8028433 was assigned. 10284 1064 WF - Prov Order Notify User of
> Ack 8899
> Order        202784 was assigned. 1 lbd 2001-12-16 00:00:00.000 203
> Order 90159 - WF - Prov Order FOC/PCN Reminder 822
> Order # $1088379 - FC or PN Dates needed.  FOC = [Not all Specified]. 1
> smray 2002-12-20 00:00:00.000 2003-12-2000-
>
> I would like to return  like this
>
> 90311
> 30511
> 8028433
> 202784
> 90159
> 1088379
>
> thanks
> gv
>
Author
31 Aug 2006 7:12 PM
Arnie Rowland
Something like this 'could' work. (I'm concerned that there may be too may variances in the data for this to be reliable...)

DECLARE @MyTestData varchar(100)

SET @MyTestData = 'Order the 90311 - Prov Order Notify User of Ack 8894'

SELECT substring( @MyTestData, patindex( '%[0-9]%', @MyTestData ), ( charindex( ' ', @MyTestData, patindex( '%[0-9]%', @MyTestData )) - patindex( '%[0-9]%', @MyTestData )))

--
Arnie Rowland, Ph.D.
Westwood Consulting, Inc

Most good judgment comes from experience.
Most experience comes from bad judgment.
- Anonymous


Show quote
"gv" <viator.ge***@gmail.com> wrote in message news:uvmhqmSzGHA.4116@TK2MSFTNGP02.phx.gbl...
> Hi all,
>
> I have the following text in a Varchar(200) column in a table and want to
> parse out just the first set of numbers:
>
> Example data
>
> Col1
> Order the 90311 - Prov Order Notify User of Ack 8894
> Order 30511 was assigned. 1 tsirry 2000-2-5 00:00: 2003-12-29 1:56:23.827
> pwilliams NULL 186 pwilliams
> Order -- 8028433 was assigned. 10284 1064 WF - Prov Order Notify User of Ack
> 8899
> Order        202784 was assigned. 1 lbd 2001-12-16 00:00:00.000 203
> Order 90159 - WF - Prov Order FOC/PCN Reminder 822
> Order # $1088379 - FC or PN Dates needed.  FOC = [Not all Specified]. 1
> smray 2002-12-20 00:00:00.000 2003-12-2000-
>
> I would like to return  like this
>
> 90311
> 30511
> 8028433
> 202784
> 90159
> 1088379
>
> thanks
> gv
>
>
Author
31 Aug 2006 8:45 PM
gv
Thanks everyone for your help,

That is great, is it possiable to check for any non numeric char at the end of the numeric number other
then just a space using charindex?

charindex( ' ', @MyTestData, patindex( '%[0-9]%', @MyTestData ))


thks
gv

  "Arnie Rowland" <ar***@1568.com> wrote in message news:e0OXXFTzGHA.744@TK2MSFTNGP05.phx.gbl...
  Something like this 'could' work. (I'm concerned that there may be too may variances in the data for this to be reliable...)

  DECLARE @MyTestData varchar(100)

  SET @MyTestData = 'Order the 90311 - Prov Order Notify User of Ack 8894'

  SELECT substring( @MyTestData, patindex( '%[0-9]%', @MyTestData ), ( charindex( ' ', @MyTestData, patindex( '%[0-9]%', @MyTestData )) - patindex( '%[0-9]%', @MyTestData )))

  --
  Arnie Rowland, Ph.D.
  Westwood Consulting, Inc

  Most good judgment comes from experience.
  Most experience comes from bad judgment.
  - Anonymous


Show quote
  "gv" <viator.ge***@gmail.com> wrote in message news:uvmhqmSzGHA.4116@TK2MSFTNGP02.phx.gbl...
  > Hi all,
  >
  > I have the following text in a Varchar(200) column in a table and want to
  > parse out just the first set of numbers:
  >
  > Example data
  >
  > Col1
  > Order the 90311 - Prov Order Notify User of Ack 8894
  > Order 30511 was assigned. 1 tsirry 2000-2-5 00:00: 2003-12-29 1:56:23.827
  > pwilliams NULL 186 pwilliams
  > Order -- 8028433 was assigned. 10284 1064 WF - Prov Order Notify User of Ack
  > 8899
  > Order        202784 was assigned. 1 lbd 2001-12-16 00:00:00.000 203
  > Order 90159 - WF - Prov Order FOC/PCN Reminder 822
  > Order # $1088379 - FC or PN Dates needed.  FOC = [Not all Specified]. 1
  > smray 2002-12-20 00:00:00.000 2003-12-2000-
  >
  > I would like to return  like this
  >
  > 90311
  > 30511
  > 8028433
  > 202784
  > 90159
  > 1088379
  >
  > thanks
  > gv
  >
  >
Author
31 Aug 2006 9:07 PM
Arnie Rowland
Unfortunately, charindex() does not allow for multiple comparisons like patindex() does.

However, you could use a substring taking the right portion of the string, staring at the first numeric (position returned by patindex), and then look for the patindex() for multiple character comparisons, i.e., '%[ -.,;:]%'. That shouldn't be too much additional trouble.



--
Arnie Rowland, Ph.D.
Westwood Consulting, Inc

Most good judgment comes from experience.
Most experience comes from bad judgment.
- Anonymous


  "gv" <viator.ge***@gmail.com> wrote in message news:u8b8z5TzGHA.4920@TK2MSFTNGP06.phx.gbl...
  Thanks everyone for your help,

  That is great, is it possiable to check for any non numeric char at the end of the numeric number other
  then just a space using charindex?

   charindex( ' ', @MyTestData, patindex( '%[0-9]%', @MyTestData ))


  thks
  gv

    "Arnie Rowland" <ar***@1568.com> wrote in message news:e0OXXFTzGHA.744@TK2MSFTNGP05.phx.gbl...
    Something like this 'could' work. (I'm concerned that there may be too may variances in the data for this to be reliable...)

    DECLARE @MyTestData varchar(100)

    SET @MyTestData = 'Order the 90311 - Prov Order Notify User of Ack 8894'

    SELECT substring( @MyTestData, patindex( '%[0-9]%', @MyTestData ), ( charindex( ' ', @MyTestData, patindex( '%[0-9]%', @MyTestData )) - patindex( '%[0-9]%', @MyTestData )))

    --
    Arnie Rowland, Ph.D.
    Westwood Consulting, Inc

    Most good judgment comes from experience.
    Most experience comes from bad judgment.
    - Anonymous


Show quote
    "gv" <viator.ge***@gmail.com> wrote in message news:uvmhqmSzGHA.4116@TK2MSFTNGP02.phx.gbl...
    > Hi all,
    >
    > I have the following text in a Varchar(200) column in a table and want to
    > parse out just the first set of numbers:
    >
    > Example data
    >
    > Col1
    > Order the 90311 - Prov Order Notify User of Ack 8894
    > Order 30511 was assigned. 1 tsirry 2000-2-5 00:00: 2003-12-29 1:56:23.827
    > pwilliams NULL 186 pwilliams
    > Order -- 8028433 was assigned. 10284 1064 WF - Prov Order Notify User of Ack
    > 8899
    > Order        202784 was assigned. 1 lbd 2001-12-16 00:00:00.000 203
    > Order 90159 - WF - Prov Order FOC/PCN Reminder 822
    > Order # $1088379 - FC or PN Dates needed.  FOC = [Not all Specified]. 1
    > smray 2002-12-20 00:00:00.000 2003-12-2000-
    >
    > I would like to return  like this
    >
    > 90311
    > 30511
    > 8028433
    > 202784
    > 90159
    > 1088379
    >
    > thanks
    > gv
    >
    >
Author
1 Sep 2006 10:57 PM
Hugo Kornelis
On Thu, 31 Aug 2006 14:07:19 -0700, Arnie Rowland wrote:

>Unfortunately, charindex() does not allow for multiple comparisons like patindex() does.
>
>However, you could use a substring taking the right portion of the string, staring at the first numeric (position returned by patindex), and then look for the patindex() for multiple character comparisons, i.e., '%[ -.,;:]%'. That shouldn't be too much additional trouble.

Hi Arnie,

Or use an "inverted pattern": '%[^0-9]%'

--
Hugo Kornelis, SQL Server MVP
Author
2 Sep 2006 12:06 AM
Arnie Rowland
EXCELLENT IDEA!

Thanks Hugo, I didn't consider that. Covers all the bases...

--
Arnie Rowland, Ph.D.
Westwood Consulting, Inc

Most good judgment comes from experience.
Most experience comes from bad judgment.
- Anonymous


Show quote
"Hugo Kornelis" <h***@perFact.REMOVETHIS.info.INVALID> wrote in message
news:llehf2l30sud5j7463rnpdcf0s8j25l40n@4ax.com...
> On Thu, 31 Aug 2006 14:07:19 -0700, Arnie Rowland wrote:
>
>>Unfortunately, charindex() does not allow for multiple comparisons like
>>patindex() does.
>>
>>However, you could use a substring taking the right portion of the string,
>>staring at the first numeric (position returned by patindex), and then
>>look for the patindex() for multiple character comparisons, i.e.,
>>'%[ -.,;:]%'. That shouldn't be too much additional trouble.
>
> Hi Arnie,
>
> Or use an "inverted pattern": '%[^0-9]%'
>
> --
> Hugo Kornelis, SQL Server MVP
Author
31 Aug 2006 9:03 PM
Steve Dassin
In the absence of proper abstraction we anoint
new knights. But their armour is not always
shinning for there is always creeping doubt.
Who is it or what is it that can offer the needed
assurance. And the poor soul is left in the
eternal dilemma, to do or not to do that is the
question.

best,
http://racster.blogspot.com

Westwood - Bruin land? :)
  "Arnie Rowland" <ar***@1568.com> wrote in message news:e0OXXFTzGHA.744@TK2MSFTNGP05.phx.gbl...
  Something like this 'could' work. (I'm concerned that there may be too may variances in the data for this to be reliable...)

  DECLARE @MyTestData varchar(100)

  SET @MyTestData = 'Order the 90311 - Prov Order Notify User of Ack 8894'

  SELECT substring( @MyTestData, patindex( '%[0-9]%', @MyTestData ), ( charindex( ' ', @MyTestData, patindex( '%[0-9]%', @MyTestData )) - patindex( '%[0-9]%', @MyTestData )))

  --
  Arnie Rowland, Ph.D.
  Westwood Consulting, Inc

  Most good judgment comes from experience.
  Most experience comes from bad judgment.
  - Anonymous


Show quote
  "gv" <viator.ge***@gmail.com> wrote in message news:uvmhqmSzGHA.4116@TK2MSFTNGP02.phx.gbl...
  > Hi all,
  >
  > I have the following text in a Varchar(200) column in a table and want to
  > parse out just the first set of numbers:
  >
  > Example data
  >
  > Col1
  > Order the 90311 - Prov Order Notify User of Ack 8894
  > Order 30511 was assigned. 1 tsirry 2000-2-5 00:00: 2003-12-29 1:56:23.827
  > pwilliams NULL 186 pwilliams
  > Order -- 8028433 was assigned. 10284 1064 WF - Prov Order Notify User of Ack
  > 8899
  > Order        202784 was assigned. 1 lbd 2001-12-16 00:00:00.000 203
  > Order 90159 - WF - Prov Order FOC/PCN Reminder 822
  > Order # $1088379 - FC or PN Dates needed.  FOC = [Not all Specified]. 1
  > smray 2002-12-20 00:00:00.000 2003-12-2000-
  >
  > I would like to return  like this
  >
  > 90311
  > 30511
  > 8028433
  > 202784
  > 90159
  > 1088379
  >
  > thanks
  > gv
  >
  >
Author
31 Aug 2006 7:27 PM
Raymond
Try this:

declare @tbl TABLE (keyid int not null identity, col1 varchar(200), col2
int)
insert into @tbl (col1)
select 'Order the 90311 - Prov Order Notify User of Ack 8894'
union
select 'Order 30511 was assigned. 1 tsirry 2000-2-5 00:00: 2003-12-29
1:56:23.827 pwilliams NULL 186 pwilliams'
union
select 'Order -- 8028433 was assigned. 10284 1064 WF - Prov Order Notify
User of Ack 8899'
union
select 'Order        202784 was assigned. 1 lbd 2001-12-16 00:00:00.000 203'
union
select 'Order 90159 - WF - Prov Order FOC/PCN Reminder 822'
union
select 'Order # $1088379 - FC or PN Dates needed.  FOC = [Not all
Specified]. 1 smray 2002-12-20 00:00:00.000 2003-12-2000-'
declare @col1 varchar(200), @col2 varchar(20), @keyid int
declare cur cursor fast_forward for
select keyid, col1 from @tbl
open cur
fetch cur into @keyid, @col1
while @@fetch_status <> -1
begin
set @col2 = ''
while dbo.isReallyNumeric(left(@col1,1)) = 0
begin
  select @col1 = right(@col1,len(@col1)-1)
end
while dbo.isReallyNumeric(left(@col1,1)) = 1
begin
  select @col2 = @col2 + left(@col1,1)
  select @col1 = right(@col1,len(@col1)-1)
end
update @tbl set col2 = @col2 where keyid = @keyid
fetch cur into @keyid, @col1
end
close cur
deallocate cur
select col2 from @tbl

Note, for the fuction dbo.isReallyNumeric see this:
http://classicasp.aspfaq.com/general/what-is-wrong-with-isnumeric.html

--

Raymond Yap

Show quote
"gv" <viator.ge***@gmail.com> wrote in message
news:uvmhqmSzGHA.4116@TK2MSFTNGP02.phx.gbl...
> Hi all,
>
> I have the following text in a Varchar(200) column in a table and want to
> parse out just the first set of numbers:
>
> Example data
>
> Col1
> Order the 90311 - Prov Order Notify User of Ack 8894
> Order 30511 was assigned. 1 tsirry 2000-2-5 00:00: 2003-12-29 1:56:23.827
> pwilliams NULL 186 pwilliams
> Order -- 8028433 was assigned. 10284 1064 WF - Prov Order Notify User of
Ack
> 8899
> Order        202784 was assigned. 1 lbd 2001-12-16 00:00:00.000 203
> Order 90159 - WF - Prov Order FOC/PCN Reminder 822
> Order # $1088379 - FC or PN Dates needed.  FOC = [Not all Specified]. 1
> smray 2002-12-20 00:00:00.000 2003-12-2000-
>
> I would like to return  like this
>
> 90311
> 30511
> 8028433
> 202784
> 90159
> 1088379
>
> thanks
> gv
>
>
Author
1 Sep 2006 12:59 AM
Steve Dassin
Solution to parsing out 1st number from a string.
Using RAC utility v2.34 beta
and
Sql Server 2005 sp1

Dr.Rowland is a very bright guy and an accomplished programmer
and I'm sure he'll provide you a solution. But even he might admit
to getting vertigo in the midst of the black hole created by
the minutiae of the string manipulation involved. I take
it you can identify with this:)

But perhaps the solution can be framed another way
obviating the need for all the horrendous string machinations.

I would frame this problem, like so many similar problems,
as a relatively simple ranking problem. And by abstracting
the pain in the a_s drudgery work, the solution becomes
more intuitive and easily manageable.

Here is some test data:

create table ##t2 (rowid int primary key,strings varchar(2000))
       insert ##t2 values(1,' 12  ')
       insert ##t2 values(2,'Order 90159$ - WF63& - Prov ')
       insert ##t2 values(3,'Order # $1088379 - FC')
       insert ##t2 values(4,'Order#%1088379- & FC')
       insert ##t2 values(5,'Bu74zz- 54==')
       insert ##t2 values(6,'533')
       insert ##t2 values(7,'74 873')
       insert ##t2 values(8,'341$53')

--select * from ##t2

I'll present the solution in 3 easy steps. All this can be
condensed of course:)

Step 1.
Drudgery #1. Split each string by single character per rowid.
             And create a new column that is 1 if the character
             is a digit (ascii between 48 and 57) and 2 for
             everything else. This is column 'type'.

Exec Rac
@split='[position]',
@rows='rowid & [position]',
@pvtcol='strings',@rank='str',
@splitby='1', -- Splits a string by N consecutive positions.
@from='##t2',@defaults1='y',@rowbreak='n',@racheck='y',
@select='select rd,1*rowid as rowid,1*position as pos,str1,
         case when ascii(str1) between 48 and 57 then 1 else 2 end as type
         into ##t3
           from rac
             order by rd'

--select * from ##t3
Here is data for rowid=8. Column 'rd' is an integer
in ascending order over the whole table. Column 'pos'
is an integer indicating the position of each character
in the overall string. Column 'str1' is the character.
So we've just tranposed each string into rows where the
order of the rows corresponds to the left to right
order of the characters in 'strings'.

rd          rowid       pos         str1 type
----------- ----------- ----------- ---- -----------
96          8           1           3    1
97          8           2           4    1
98          8           3           1    1
99          8           4           $    2
100         8           5           5    1
101         8           6           3    1

Step 2. This is the real solution. Looking at
'type' within 'rowid' ordered by 'rd'
(ie partition (rowid,type order by rd))
obtain a rank ('typerank') that increments
by 1 only when 'type' changes. Within a
'type' it stays the same. Ranks go from 1
to N within each 'rowid', where N is the
number of changes (digit/non-digit)+1.
Ranks are reset to 1 with each new 'rowid'.

This is data for rowid=8 showing what we want.

rowid  type pos  str1 typerank
------ ---- ---- ---- -----------
8      1    1    3    1
8      1    2    4    1
8      1    3    1    1
8      2    4    $    2
8      1    5    5    3
8      1    6    3    3

Now the key part. The rows corresponding to the
1st number in each 'strings' (by 'rowid') must
be of 'type' 1 (their digits) and must have a rank
('typerank') <=2. Any other possibilities must
mean it is not the 1st number.

Execute Rac to get the ranks and filter on
'type'=1 and 'typerank'<=2.

Exec Rac
@transform='_dummy_',
@rows='rowid & type & pos & str1',
@rowsort='rd',
@pvtcol='Sql*Plus',
@from='##t3',@defaults1='y',@rowbreak='n',@racheck='y',
@rowindicators='type{typerank}',@counterdatatype='int',
@wherecounters='type=1 and typerank<=2',
@select='select rd,1*rowid as rowid,1*pos as pos,str1
         into ##t4
         from rac
          order by rd'

--select * from ##t4
Here's the data for rowid=7 and 8.
rowid  type pos  str1 typerank
------ ---- ---- ---- -----------
7      1    1    7    1
7      1    2    4    1
8      1    1    3    1
8      1    2    4    1
8      1    3    1    1

We now have the 1st number in rows for
each 'rowid'.

Step 3. Just the drudgery of concatenating the characters
        ('str1'), the rows of each number, per rowid.
        Trivial with Rac.

Exec Rac
@transform='Max(str1) as str',
@rows='rowid',
@pvtcol='pos',
@from='##t4',
@defaults1='y',@racheck='y',
@concatenate='str',@stringname='number',
@separator='',@cutpvt='y'

And here are the numbers. Essentially getting
the occurrence of the Nth number in a string
becomes simplified by filting on 'type' and 'rank'

rowid  number
------ -------
1      12
2      90159
3      1088379
4      1088379
5      74
6      533
7      74
8      341

Amateurs believe their creativity is the mother
of their work. Professionals embrace abstraction
and don't feel orphaned.

Visit RAC @
www.rac4sql.net

I think an even more interesting solution using
D4 (Dataphor) and Sql Server 2005 is possible.

best,
http://racster.blogspot.com
Author
1 Sep 2006 5:18 AM
Arnie Rowland
Steve,

Best promotional presentation about RAC that I've seen!!

Keep it coming.

--
Arnie Rowland, Ph.D.
Westwood Consulting, Inc

Most good judgment comes from experience.
Most experience comes from bad judgment.
- Anonymous


Show quote
"Steve Dassin" <rac4sqlnospam@net> wrote in message
news:eeTKKpXzGHA.2036@TK2MSFTNGP05.phx.gbl...
> Solution to parsing out 1st number from a string.
> Using RAC utility v2.34 beta
> and
> Sql Server 2005 sp1
>
> Dr.Rowland is a very bright guy and an accomplished programmer
> and I'm sure he'll provide you a solution. But even he might admit
> to getting vertigo in the midst of the black hole created by
> the minutiae of the string manipulation involved. I take
> it you can identify with this:)
>
> But perhaps the solution can be framed another way
> obviating the need for all the horrendous string machinations.
>
> I would frame this problem, like so many similar problems,
> as a relatively simple ranking problem. And by abstracting
> the pain in the a_s drudgery work, the solution becomes
> more intuitive and easily manageable.
>
> Here is some test data:
>
> create table ##t2 (rowid int primary key,strings varchar(2000))
>       insert ##t2 values(1,' 12  ')
>       insert ##t2 values(2,'Order 90159$ - WF63& - Prov ')
>       insert ##t2 values(3,'Order # $1088379 - FC')
>       insert ##t2 values(4,'Order#%1088379- & FC')
>       insert ##t2 values(5,'Bu74zz- 54==')
>       insert ##t2 values(6,'533')
>       insert ##t2 values(7,'74 873')
>       insert ##t2 values(8,'341$53')
>
> --select * from ##t2
>
> I'll present the solution in 3 easy steps. All this can be
> condensed of course:)
>
> Step 1.
> Drudgery #1. Split each string by single character per rowid.
>             And create a new column that is 1 if the character
>             is a digit (ascii between 48 and 57) and 2 for
>             everything else. This is column 'type'.
>
> Exec Rac
> @split='[position]',
> @rows='rowid & [position]',
> @pvtcol='strings',@rank='str',
> @splitby='1', -- Splits a string by N consecutive positions.
> @from='##t2',@defaults1='y',@rowbreak='n',@racheck='y',
> @select='select rd,1*rowid as rowid,1*position as pos,str1,
>         case when ascii(str1) between 48 and 57 then 1 else 2 end as type
>         into ##t3
>           from rac
>             order by rd'
>
> --select * from ##t3
> Here is data for rowid=8. Column 'rd' is an integer
> in ascending order over the whole table. Column 'pos'
> is an integer indicating the position of each character
> in the overall string. Column 'str1' is the character.
> So we've just tranposed each string into rows where the
> order of the rows corresponds to the left to right
> order of the characters in 'strings'.
>
> rd          rowid       pos         str1 type
> ----------- ----------- ----------- ---- -----------
> 96          8           1           3    1
> 97          8           2           4    1
> 98          8           3           1    1
> 99          8           4           $    2
> 100         8           5           5    1
> 101         8           6           3    1
>
> Step 2. This is the real solution. Looking at
> 'type' within 'rowid' ordered by 'rd'
> (ie partition (rowid,type order by rd))
> obtain a rank ('typerank') that increments
> by 1 only when 'type' changes. Within a
> 'type' it stays the same. Ranks go from 1
> to N within each 'rowid', where N is the
> number of changes (digit/non-digit)+1.
> Ranks are reset to 1 with each new 'rowid'.
>
> This is data for rowid=8 showing what we want.
>
> rowid  type pos  str1 typerank
> ------ ---- ---- ---- -----------
> 8      1    1    3    1
> 8      1    2    4    1
> 8      1    3    1    1
> 8      2    4    $    2
> 8      1    5    5    3
> 8      1    6    3    3
>
> Now the key part. The rows corresponding to the
> 1st number in each 'strings' (by 'rowid') must
> be of 'type' 1 (their digits) and must have a rank
> ('typerank') <=2. Any other possibilities must
> mean it is not the 1st number.
>
> Execute Rac to get the ranks and filter on
> 'type'=1 and 'typerank'<=2.
>
> Exec Rac
> @transform='_dummy_',
> @rows='rowid & type & pos & str1',
> @rowsort='rd',
> @pvtcol='Sql*Plus',
> @from='##t3',@defaults1='y',@rowbreak='n',@racheck='y',
> @rowindicators='type{typerank}',@counterdatatype='int',
> @wherecounters='type=1 and typerank<=2',
> @select='select rd,1*rowid as rowid,1*pos as pos,str1
>         into ##t4
>         from rac
>          order by rd'
>
> --select * from ##t4
> Here's the data for rowid=7 and 8.
> rowid  type pos  str1 typerank
> ------ ---- ---- ---- -----------
> 7      1    1    7    1
> 7      1    2    4    1
> 8      1    1    3    1
> 8      1    2    4    1
> 8      1    3    1    1
>
> We now have the 1st number in rows for
> each 'rowid'.
>
> Step 3. Just the drudgery of concatenating the characters
>        ('str1'), the rows of each number, per rowid.
>        Trivial with Rac.
>
> Exec Rac
> @transform='Max(str1) as str',
> @rows='rowid',
> @pvtcol='pos',
> @from='##t4',
> @defaults1='y',@racheck='y',
> @concatenate='str',@stringname='number',
> @separator='',@cutpvt='y'
>
> And here are the numbers. Essentially getting
> the occurrence of the Nth number in a string
> becomes simplified by filting on 'type' and 'rank'
>
> rowid  number
> ------ -------
> 1      12
> 2      90159
> 3      1088379
> 4      1088379
> 5      74
> 6      533
> 7      74
> 8      341
>
> Amateurs believe their creativity is the mother
> of their work. Professionals embrace abstraction
> and don't feel orphaned.
>
> Visit RAC @
> www.rac4sql.net
>
> I think an even more interesting solution using
> D4 (Dataphor) and Sql Server 2005 is possible.
>
> best,
> http://racster.blogspot.com
>
>
Author
1 Sep 2006 3:23 AM
Steve Dassin
"Arnie Rowland" <ar***@1568.com> wrote in message
news:O2YsVYYzGHA.2076@TK2MSFTNGP04.phx.gbl...
> Best promotional presentation about RAC that I've seen!!

Your generosity is appreciated! :)

best,
steve
Author
1 Sep 2006 2:14 PM
gv
Thanks Steve
Your help is gratiously appretiated

gv



Show quote
"Steve Dassin" <rac4sqlnospam@net> wrote in message
news:eeTKKpXzGHA.2036@TK2MSFTNGP05.phx.gbl...
> Solution to parsing out 1st number from a string.
> Using RAC utility v2.34 beta
> and
> Sql Server 2005 sp1
>
> Dr.Rowland is a very bright guy and an accomplished programmer
> and I'm sure he'll provide you a solution. But even he might admit
> to getting vertigo in the midst of the black hole created by
> the minutiae of the string manipulation involved. I take
> it you can identify with this:)
>
> But perhaps the solution can be framed another way
> obviating the need for all the horrendous string machinations.
>
> I would frame this problem, like so many similar problems,
> as a relatively simple ranking problem. And by abstracting
> the pain in the a_s drudgery work, the solution becomes
> more intuitive and easily manageable.
>
> Here is some test data:
>
> create table ##t2 (rowid int primary key,strings varchar(2000))
>       insert ##t2 values(1,' 12  ')
>       insert ##t2 values(2,'Order 90159$ - WF63& - Prov ')
>       insert ##t2 values(3,'Order # $1088379 - FC')
>       insert ##t2 values(4,'Order#%1088379- & FC')
>       insert ##t2 values(5,'Bu74zz- 54==')
>       insert ##t2 values(6,'533')
>       insert ##t2 values(7,'74 873')
>       insert ##t2 values(8,'341$53')
>
> --select * from ##t2
>
> I'll present the solution in 3 easy steps. All this can be
> condensed of course:)
>
> Step 1.
> Drudgery #1. Split each string by single character per rowid.
>             And create a new column that is 1 if the character
>             is a digit (ascii between 48 and 57) and 2 for
>             everything else. This is column 'type'.
>
> Exec Rac
> @split='[position]',
> @rows='rowid & [position]',
> @pvtcol='strings',@rank='str',
> @splitby='1', -- Splits a string by N consecutive positions.
> @from='##t2',@defaults1='y',@rowbreak='n',@racheck='y',
> @select='select rd,1*rowid as rowid,1*position as pos,str1,
>         case when ascii(str1) between 48 and 57 then 1 else 2 end as type
>         into ##t3
>           from rac
>             order by rd'
>
> --select * from ##t3
> Here is data for rowid=8. Column 'rd' is an integer
> in ascending order over the whole table. Column 'pos'
> is an integer indicating the position of each character
> in the overall string. Column 'str1' is the character.
> So we've just tranposed each string into rows where the
> order of the rows corresponds to the left to right
> order of the characters in 'strings'.
>
> rd          rowid       pos         str1 type
> ----------- ----------- ----------- ---- -----------
> 96          8           1           3    1
> 97          8           2           4    1
> 98          8           3           1    1
> 99          8           4           $    2
> 100         8           5           5    1
> 101         8           6           3    1
>
> Step 2. This is the real solution. Looking at
> 'type' within 'rowid' ordered by 'rd'
> (ie partition (rowid,type order by rd))
> obtain a rank ('typerank') that increments
> by 1 only when 'type' changes. Within a
> 'type' it stays the same. Ranks go from 1
> to N within each 'rowid', where N is the
> number of changes (digit/non-digit)+1.
> Ranks are reset to 1 with each new 'rowid'.
>
> This is data for rowid=8 showing what we want.
>
> rowid  type pos  str1 typerank
> ------ ---- ---- ---- -----------
> 8      1    1    3    1
> 8      1    2    4    1
> 8      1    3    1    1
> 8      2    4    $    2
> 8      1    5    5    3
> 8      1    6    3    3
>
> Now the key part. The rows corresponding to the
> 1st number in each 'strings' (by 'rowid') must
> be of 'type' 1 (their digits) and must have a rank
> ('typerank') <=2. Any other possibilities must
> mean it is not the 1st number.
>
> Execute Rac to get the ranks and filter on
> 'type'=1 and 'typerank'<=2.
>
> Exec Rac
> @transform='_dummy_',
> @rows='rowid & type & pos & str1',
> @rowsort='rd',
> @pvtcol='Sql*Plus',
> @from='##t3',@defaults1='y',@rowbreak='n',@racheck='y',
> @rowindicators='type{typerank}',@counterdatatype='int',
> @wherecounters='type=1 and typerank<=2',
> @select='select rd,1*rowid as rowid,1*pos as pos,str1
>         into ##t4
>         from rac
>          order by rd'
>
> --select * from ##t4
> Here's the data for rowid=7 and 8.
> rowid  type pos  str1 typerank
> ------ ---- ---- ---- -----------
> 7      1    1    7    1
> 7      1    2    4    1
> 8      1    1    3    1
> 8      1    2    4    1
> 8      1    3    1    1
>
> We now have the 1st number in rows for
> each 'rowid'.
>
> Step 3. Just the drudgery of concatenating the characters
>        ('str1'), the rows of each number, per rowid.
>        Trivial with Rac.
>
> Exec Rac
> @transform='Max(str1) as str',
> @rows='rowid',
> @pvtcol='pos',
> @from='##t4',
> @defaults1='y',@racheck='y',
> @concatenate='str',@stringname='number',
> @separator='',@cutpvt='y'
>
> And here are the numbers. Essentially getting
> the occurrence of the Nth number in a string
> becomes simplified by filting on 'type' and 'rank'
>
> rowid  number
> ------ -------
> 1      12
> 2      90159
> 3      1088379
> 4      1088379
> 5      74
> 6      533
> 7      74
> 8      341
>
> Amateurs believe their creativity is the mother
> of their work. Professionals embrace abstraction
> and don't feel orphaned.
>
> Visit RAC @
> www.rac4sql.net
>
> I think an even more interesting solution using
> D4 (Dataphor) and Sql Server 2005 is possible.
>
> best,
> http://racster.blogspot.com
>
>

AddThis Social Bookmark Button