Full-Text search indexing HTML content

I've spent the last few days looking for the best way to create a
full-text search catalog for a field composed entirely of HTML content.
The best approach I found, so that the search ignores HTML tags, is to
make the column an Image-type field and store the associated file type
in a separate column.
I developed a small application that reads each of the text field entries, converts it to a byte[] variable using the UnicodeEncoding class (and its GetString and GetBytes methods), and saves the resulting byte array as binary data in the new image field. I even tested how the field data would fare in a physical file, by using a FileStream to write the byte[] data for several different records. So far, so good; it all seemed to be working.

The problem is that after I build the full-text catalog (using the wizard, and specifying the file type column associated with the image field where the HTML is stored), queries using either CONTAINS or FREETEXT return no records. The same queries worked fine when the field was a text field instead of an image field, so I'm guessing something went wrong either in the data conversion or in the catalog-building process.

Does anyone have any ideas?

-- bacusgod

Posted via http://www.codecomments.com
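
For anyone trying to reproduce the conversion step described above, here is a minimal sketch of it in Python rather than the original .NET code, which was not posted. It assumes the application produced little-endian UTF-16 bytes, which is what UnicodeEncoding's GetBytes emits by default, without a byte-order mark. The function names, the sample HTML, and the optional `with_bom` flag are all hypothetical. The flag is only a diagnostic aid for comparing the two byte layouts, and the post does not establish whether a byte-order mark matters to the full-text filter.

```python
import codecs
import os
import tempfile


def html_to_utf16_bytes(html: str, with_bom: bool = False) -> bytes:
    """Encode text as little-endian UTF-16, optionally prefixed with a BOM."""
    data = html.encode("utf-16-le")
    return codecs.BOM_UTF16_LE + data if with_bom else data


def utf16_bytes_to_html(data: bytes) -> str:
    """Decode little-endian UTF-16 bytes, dropping a leading BOM if present."""
    if data.startswith(codecs.BOM_UTF16_LE):
        data = data[len(codecs.BOM_UTF16_LE):]
    return data.decode("utf-16-le")


def round_trip_through_file(data: bytes) -> bytes:
    """Write the bytes to a temporary file and read them back unchanged."""
    fd, path = tempfile.mkstemp(suffix=".htm")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        with open(path, "rb") as handle:
            return handle.read()
    finally:
        os.remove(path)


# Hypothetical sample record standing in for one HTML field value.
sample = "<html><body><p>Full-text <b>search</b> test</p></body></html>"
stored = html_to_utf16_bytes(sample)
restored = utf16_bytes_to_html(round_trip_through_file(stored))
```

A check like this only confirms that the bytes survive the trip to a file and back, which matches the FileStream test in the post. It says nothing about how the full-text catalog interprets those bytes.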