Home All Groups Group Topic Archive Search About

Parsing a TEXT column in an SP or trigger (SQL SERVER 2000)

Author
11 Aug 2005 10:09 PM
rhaley
I have a datatable that contains a TEXT column that I will need to
parse on the server side in a stored procedure or in a trigger. I would
like to be able to read the data "line by line" using Carriage Returns
but I don't think that will be possible. Regardless, I have to retrieve
the TEXT column from the table and then search through it for specific
character configurations . I have thus far gleened that I use a TEXTPTR
function to create a pointer to the TEXT and the READTEXT function to
get 'chunks' of data back. What I don't know is how to get the data
into a parsable datatype such as varchar. The following is as far as
I've got:

ALTER PROCEDURE parse_for_values
(
@message_id INT
)
AS
BEGIN

DECLARE @text_ptr VARBINARY(16)
DECLARE @temp varchar(4000)

SELECT @text_ptr = TEXTPTR(content) FROM message WHERE message_id =
@message_id

READTEXT message.content @text_ptr 0 0

set @temp = @text_ptr --Doesn't do what I thought

print cast(@temp as varchar) --definetly returns the wrong stuff
END


????? Where to go from here? I don't need help parsing (lol, unless
someone has some snazy ideas), I just need to know how to get the data
into a parsable datatype. Or is there some way of parsing the TEXTPTR?

Any suggestions would be a tremendous help.  Thanks all!

Sincerely,

Russ

Author
11 Aug 2005 10:41 PM
ML
Erm... just one question. What if a chunk between two linefeed characters
happens to be larger than varchar(8000)?

Text and image data types are used to store large pieces of data. Processing
them IMHO belongs to the application layer. To an appropriate editor.


ML
Author
11 Aug 2005 10:51 PM
rhaley
I FULLY agree. This is business logic, no two ways about it. That being
said, I still need a solution to the problem. :(

Russ
Author
11 Aug 2005 10:55 PM
Wayne Snyder
Use Substring to pull off values. and Patindex to find the cR/LF

IF EXISTS (SELECT table_name FROM INFORMATION_SCHEMA.TABLES
      WHERE table_name = 'npub_info')
   DROP TABLE npub_info
GO
-- Create npub_info table in pubs database. Borrowed from instpubs.sql.
USE pubs
GO
CREATE TABLE npub_info
(
pub_id         char(4)           NOT NULL
         REFERENCES publishers(pub_id)
         CONSTRAINT UPKCL_npubinfo PRIMARY KEY CLUSTERED,
pr_info        ntext             NULL
)

GO

-- Fill the pr_info column in npub_info with international data.
RAISERROR('Now at the inserts to pub_info...',0,1)

GO

INSERT npub_info VALUES('0736', N'üThis is sample text data for New Moon
Books, publisher 0736 in the pubs database')
INSERT npub_info values('0877', N'üThis is sample text data for Binnet &
Hardley, publisher 0877 in the pubs databa')
INSERT npub_info values('1389', N'üThis is sample text data for Algodata
Infosystems, publisher 1389 in the pubs da')
INSERT npub_info values('9952', N'üThis is sample text data for Scootney
Books, publisher 9952 in the pubs database')
INSERT npub_info values('1622', N'üThis is sample text data for Five Lakes
Publishing, publisher 1622 in the pubs d')
INSERT npub_info values('1756', N'üThis is sample text data for Ramona
Publishers, publisher 1756 in the pubs datab')
INSERT npub_info values('9901', N'üThis is sample text data for GGG&G,
publisher 9901 in the pubs database. GGG&G i')
INSERT npub_info values('9999', N'üThis is sample text data for Lucerne
Publishing, publisher 9999 in the pubs data')
GO
-- Join between npub_info and pub_info on pub_id.
SELECT pr.pub_id, SUBSTRING(pr.pr_info, 1, 35) AS pr_info,
   SUBSTRING(npr.pr_info, 1, 35) AS npr_info
FROM pub_info pr INNER JOIN npub_info npr
   ON pr.pub_id = npr.pub_id
ORDER BY pr.pub_id ASC

--
Wayne Snyder, MCDBA, SQL Server MVP
Mariner, Charlotte, NC
www.mariner-usa.com
(Please respond only to the newsgroups.)

I support the Professional Association of SQL Server (PASS) and it's
community of SQL Server professionals.
www.sqlpass.org

<rha***@axys.com> wrote in message
Show quote
news:1123798183.726000.203480@g44g2000cwa.googlegroups.com...
>I have a datatable that contains a TEXT column that I will need to
> parse on the server side in a stored procedure or in a trigger. I would
> like to be able to read the data "line by line" using Carriage Returns
> but I don't think that will be possible. Regardless, I have to retrieve
> the TEXT column from the table and then search through it for specific
> character configurations . I have thus far gleened that I use a TEXTPTR
> function to create a pointer to the TEXT and the READTEXT function to
> get 'chunks' of data back. What I don't know is how to get the data
> into a parsable datatype such as varchar. The following is as far as
> I've got:
>
> ALTER PROCEDURE parse_for_values
> (
> @message_id INT
> )
> AS
> BEGIN
>
> DECLARE @text_ptr VARBINARY(16)
> DECLARE @temp varchar(4000)
>
> SELECT @text_ptr = TEXTPTR(content) FROM message WHERE message_id =
> @message_id
>
> READTEXT message.content @text_ptr 0 0
>
> set @temp = @text_ptr --Doesn't do what I thought
>
> print cast(@temp as varchar) --definetly returns the wrong stuff
> END
>
>
> ????? Where to go from here? I don't need help parsing (lol, unless
> someone has some snazy ideas), I just need to know how to get the data
> into a parsable datatype. Or is there some way of parsing the TEXTPTR?
>
> Any suggestions would be a tremendous help.  Thanks all!
>
> Sincerely,
>
> Russ
>
Author
11 Aug 2005 11:50 PM
Aaron Bertrand [SQL Server MVP]
>I have a datatable that contains a TEXT column that I will need to
> parse on the server side in a stored procedure or in a trigger. I would
> like to be able to read the data "line by line" using Carriage Returns
> but I don't think that will be possible.

Sure, but it's not very trivial (in fact I've put over an hour into it now,
and still am not satisfied with the solution).  I will say that you don't
need TEXTPTR or READTEXT to do it, but I'm not convinced that (a) you need
to handle this in the database at all, and (b) that you can't do this with
LIKE or PATINDEX.

> Regardless, I have to retrieve
> the TEXT column from the table and then search through it for specific
> character configurations .

What is wrong with LIKE or PATINDEX?  Why do you have to treat each line
differently?  If these lines of data mean different things, why aren't you
storing each line separately?

You might want to look at how I used TEXTPTR etc. to provide a solution for
simulating REPLACE() in TEXT/NTEXT columns:

http://www.aspfaq.com/2445
Author
12 Aug 2005 6:45 PM
rhaley
What I'm trying to do is pull patterned strings out of an email that
has been stored in a database. There can be one or more lines per email
so I need to parse through the email to pull out the strings. The file
will look something like this:

{Imagine an email header hear}

<start pattern>text that I need to pull out<end pattern>CRLF
<start pattern>text that I need to pull out<end pattern>CRLF
<start pattern>text that I need to pull out<end pattern>CRLF
<start pattern>text that I need to pull out<end pattern>CRLF

This is why I need to be able to parse line by line. I can either take
the field from the original table (i.e. in an sp) or from inserted
(trigger) and then parse the field and insert the lines into a
different table.

The idea was to do this on the database side so the functionality can
be re-used by any other application  (i.e. all that any other app needs
to do is dump the email into the database and the trigger will take
care of the processing).

Thank being said, I just spent an hour on it in C# and have most of the
functionality done.

I guess we'll just see where this goes...

Thanks for all your help everyone!

Cheers
Russ
Author
12 Aug 2005 10:56 PM
Aaron Bertrand [SQL Server MVP]
> What I'm trying to do is pull patterned strings out of an email that
> has been stored in a database. There can be one or more lines per email
> so I need to parse through the email to pull out the strings. The file
> will look something like this:
>
> {Imagine an email header hear}
>
> <start pattern>text that I need to pull out<end pattern>CRLF
> <start pattern>text that I need to pull out<end pattern>CRLF
> <start pattern>text that I need to pull out<end pattern>CRLF
> <start pattern>text that I need to pull out<end pattern>CRLF
>
> This is why I need to be able to parse line by line.

I don't understand.  Why do you need to treat the text line by line?  What
happens if you have this:

<start pattern>text that I needCRLF
to pull out<end pattern>

?

Anyway, you can play with this, but egads, be careful.



CREATE TABLE #blat
(
id INT,
foo TEXT
)

SET NOCOUNT ON

INSERT #blat SELECT 1, 'foo

bar

blatchung

aaaaaaa...--> out to 1000, you can generate this easily using REPLICATE()
bbbbbbb...--> out to 1000
cccccccc...--> out to 1000
eeeeeee...--> out to 1000
ddddddd...--> out to 1000
ffffffffffff...--> out to 1000
gggggggg...--> out to 1000
hhhhhhh...--> out to 1000
iiiiiiiiiiiiiiiii...--> out to 1000
jjjjjjjjjjjjjjjjj...--> out to 1000
kkkkkkk...--> out to 1000



..
'



DECLARE @lastStart INT,
@lastEnd INT,
@id INT,
@counter INT,
@var VARCHAR(8000)

SELECT @id = 1,
        @lastStart = 1,
        @lastEnd = 1,
        @counter = 0

WHILE 1 = 1
BEGIN

SELECT @lastStart = PATINDEX('%'+CHAR(13)+CHAR(10)+'%', SUBSTRING(foo,
@lastEnd, 8000)) FROM #blat WHERE id = @id

--SELECT @lastEnd = @lastStart+COALESCE(@lastEnd+1,0)

SET @counter = @counter + 1

IF @lastStart = 0
  SELECT @var = REPLACE(REPLACE(SUBSTRING(foo, @lastEnd, 8000), CHAR(10),
''), CHAR(13), '') FROM #blat WHERE id = @id
ELSE
  SELECT @var = REPLACE(REPLACE(SUBSTRING(foo, @lastEnd, @lastStart),
CHAR(10), ''), CHAR(13), '') FROM #blat WHERE id = @id

SELECT @lastEnd = @lastEnd+LEN(@var)+2 ---CASE @counter WHEN 1 THEN 2 ELSE
0 END--COALESCE(@lastEnd+1,0)

IF RTRIM(@var) <> ''
BEGIN
PRINT '-------------------'
PRINT 'You can parse line '+RTRIM(@counter)+' here:'
PRINT '-------------------'
PRINT 'VAR = '+@var
END

IF @lastStart = 0
  BREAK
END



DROP TABLE #blat
Author
12 Aug 2005 11:04 PM
Aaron Bertrand [SQL Server MVP]
Sorry, that example could have been formatted much better (and I really wish
OE didn't translate tabs to spaces on send).


CREATE TABLE #blat
(
id INT,
foo TEXT
)

SET NOCOUNT ON

INSERT #blat SELECT 1, 'foo
bar
blatchung
aaaaaaa...--> out to 1000, you can generate this easily using REPLICATE()
bbbbbbb...--> out to 1000

cccccccc...--> out to 1000

eeeeeee...--> out to 1000
ddddddd...--> out to 1000
ffffffffffff...--> out to 1000
gggggggg...--> out to 1000
hhhhhhh...--> out to 1000
iiiiiiiiiiiiiiiii...--> out to 1000
jjjjjjjjjjjjjjjjj...--> out to 1000
kkkkkkk...--> out to 1000



..
'

DECLARE @lastStart INT,
@lastEnd INT,
@id INT,
@counter INT,
@var VARCHAR(8000),
@range INT

SELECT @id = 1,
        @lastStart = 1,
        @lastEnd = 1,
        @counter = 0

WHILE 1 = 1
BEGIN

SELECT @lastStart = PATINDEX
(
  '%'+CHAR(13)+CHAR(10)+'%',
  SUBSTRING(foo, @lastEnd, 8000)
)
FROM #blat
WHERE id = @id

SET @counter = @counter + 1

IF @lastStart = 0
SET @range = 8000
ELSE
SET @range = @lastStart

SELECT @var = REPLACE(REPLACE(
SUBSTRING(foo, @lastEnd, @range),
  CHAR(10), ''), CHAR(13), '')
FROM #blat
WHERE id = @id

SELECT @lastEnd = @lastEnd+LEN(@var)+2

IF RTRIM(@var) <> ''
BEGIN
PRINT '-------------------'
PRINT 'You can parse line '+RTRIM(@counter)+' here:'
PRINT '-------------------'
PRINT 'VAR = '+@var
END

IF @lastStart = 0
  BREAK
END

DROP TABLE #blat

AddThis Social Bookmark Button