Home All Groups Group Topic Archive Search About

BCP exporting data, what's this ÿ character?

Author
2 Sep 2005 10:17 AM
Scott A. Keen
I'm using BCP to export data.

When I look at the data with Notepad, some empty fields have this "ÿ"
character, but this character does not appear in the database table.

This is apparently causing problems with the company we're exchange data
with. They are asking for the field to be empty, not to contain this "ÿ"
character.

What can I do to make sure the field is blank instead of containing this "ÿ"
character? Is this a character set issue? How do I correct this?

Thanks,
Scott


For your reference, I've set up a format file which looks like this...

8.0
55
1 SQLCHAR 0 36 "|" 1 Order_GUID SQL_Latin1_General_Cp437_BIN
....snip...
54 SQLCHAR 0 50 "|" 54 PlaceReceipt SQL_Latin1_General_Cp437_BIN
55 SQLCHAR 0 50 "\r\n" 55 container_SealNumber SQL_Latin1_General_Cp437_BIN

Author
2 Sep 2005 10:42 AM
John Bell
Hi

You may want to post DDL and sample data for this so it can be re-produced.

If you try bcp with the format option, does the format file differ to the
one you have?

John

Show quote
"Scott A. Keen" wrote:

> I'm using BCP to export data.
>
> When I look at the data with Notepad, some empty fields have this "ÿ"
> character, but this character does not appear in the database table.
>
> This is apparently causing problems with the company we're exchange data
> with. They are asking for the field to be empty, not to contain this "ÿ"
> character.
>
> What can I do to make sure the field is blank instead of containing this "ÿ"
> character? Is this a character set issue? How do I correct this?
>
> Thanks,
> Scott
>
>
> For your reference, I've set up a format file which looks like this...
>
> 8.0
> 55
> 1 SQLCHAR 0 36 "|" 1 Order_GUID SQL_Latin1_General_Cp437_BIN
> ....snip...
> 54 SQLCHAR 0 50 "|" 54 PlaceReceipt SQL_Latin1_General_Cp437_BIN
> 55 SQLCHAR 0 50 "\r\n" 55 container_SealNumber SQL_Latin1_General_Cp437_BIN
>
>
>
Author
4 Sep 2005 3:13 PM
David
We have a UNIX system that exports that character when a blank field should
be returned.  That strange character typically represents a place holder for
the table / field DB structure.  On the UNIX system we have it uses Informix
as its DB and informix uses the ÿ to repersent a blank field.  Thus, when I
import into MS SQL the character gets loaded into the table.

When exporting the data, I have got around this by using the replace
command, replace(Field, 'ÿ','') to get the blank field, but I do not know if
you can do this with bcp, I typically use dts.

-David

Show quote
"John Bell" wrote:

> Hi
>
> You may want to post DDL and sample data for this so it can be re-produced.
>
> If you try bcp with the format option, does the format file differ to the
> one you have?
>
> John
>
> "Scott A. Keen" wrote:
>
> > I'm using BCP to export data.
> >
> > When I look at the data with Notepad, some empty fields have this "ÿ"
> > character, but this character does not appear in the database table.
> >
> > This is apparently causing problems with the company we're exchange data
> > with. They are asking for the field to be empty, not to contain this "ÿ"
> > character.
> >
> > What can I do to make sure the field is blank instead of containing this "ÿ"
> > character? Is this a character set issue? How do I correct this?
> >
> > Thanks,
> > Scott
> >
> >
> > For your reference, I've set up a format file which looks like this...
> >
> > 8.0
> > 55
> > 1 SQLCHAR 0 36 "|" 1 Order_GUID SQL_Latin1_General_Cp437_BIN
> > ....snip...
> > 54 SQLCHAR 0 50 "|" 54 PlaceReceipt SQL_Latin1_General_Cp437_BIN
> > 55 SQLCHAR 0 50 "\r\n" 55 container_SealNumber SQL_Latin1_General_Cp437_BIN
> >
> >
> >
Author
4 Sep 2005 8:07 PM
Steve Kass
Scott,

This is a code page issue, probably combined with some database
system's (yours?) use of char(255) to mean something (that something
may well be "nothing", in fact). The byte value 255 represents ÿ in
the default Windows code page 1252, but it was the non-breaking
space character in code page 437. Once upon a time
(and still, with 6.5 compatibility) SQL Server could not store
an empty string, and this could have been used to distinguish
"empty" from space(1), if the latter was meaningful data. I'm
speculating, based on the fact that code page 437 and
no-empty-strings both flourished in the 1990's.

If you are opening the data in Notepad and seeing ÿ, you
are probably interpreting the data in code page 1252, and I can
only guess that when you look "in the database table" the data is
interpreted in code page 437, so you "see" the non-breaking space,
which looks like nothing. You may need to take a look at your
data after conversion to binary to be sure what's there, since when
someone says "contains the ÿ character" they aren't making an
unambiguous statement.

The reverse match, by the way, which is possible if you are seeing
ÿ in code page 437, is that the data is supposed to represent the tilde
character ˜ (That's the little raised tilde, not to be confused with
the bigger in-line one, ~, which is also called tilde,) But I think this
is less likely, if these values are supposed to be empty.

If the client is also seeing ÿ, the client may not require the data
to be encoded in 437. The client may or may not need
the non-breaking spaces in the data. In fact, if there are legacy
systems in the mix, it might be that no one knows what they need. :(
Code page 1252 has a non-breaking space, by the way, but it is
char(160).

The solution depends on your requirements, and could amount
to changing the collation specifier in the format file, scrubbing your
database to replace or remove all non-breaking spaces, or
post-processing the bcp output file to remove these characters,
replace them with spaces, or replace them with non-breaking
spaces.

Sorry I don't have a "just do this" answer, but I hope it helps.

Steve Kass
Drew University

Scott A. Keen wrote:

Show quote
>I'm using BCP to export data.
>
>When I look at the data with Notepad, some empty fields have this "ÿ"
>character, but this character does not appear in the database table.
>
>This is apparently causing problems with the company we're exchange data
>with. They are asking for the field to be empty, not to contain this "ÿ"
>character.
>
>What can I do to make sure the field is blank instead of containing this "ÿ"
>character? Is this a character set issue? How do I correct this?
>
>Thanks,
>Scott
>
>
>For your reference, I've set up a format file which looks like this...
>
>8.0
>55
>1 SQLCHAR 0 36 "|" 1 Order_GUID SQL_Latin1_General_Cp437_BIN
>...snip...
>54 SQLCHAR 0 50 "|" 54 PlaceReceipt SQL_Latin1_General_Cp437_BIN
>55 SQLCHAR 0 50 "\r\n" 55 container_SealNumber SQL_Latin1_General_Cp437_BIN
>
>

>

AddThis Social Bookmark Button