Tell me SQL Server Full-Text searcher is crazy, not me.

Posted by Ian Boyd on Stack Overflow See other posts from Stack Overflow or by Ian Boyd
Published on 2010-06-03T13:29:29Z Indexed on 2010/06/03 13:34 UTC
Read the original article Hit count: 298

i have some customers with a particular address that the user is searching for:

123 generic way

There are 5 rows in the database that match:

ResidentialAddress1
=============================
123 GENERIC WAY
123 GENERIC WAY
123 GENERIC WAY
123 GENERIC WAY
123 GENERIC WAY

i run a FT query to look for these rows. i'll show you each step as i add more criteria to the search:

SELECT ResidentialAddress1 FROM Patrons
WHERE CONTAINS(Patrons.ResidentialAddress1, '"123*"')

ResidentialAddress1
=========================
123 MAPLE STREET
12345 TEST
123 MINE STREET
123 GENERIC WAY
123 FAKE STREET
...

(30 row(s) affected)

Okay, so far so good, now adding the word "generic":

SELECT ResidentialAddress1 FROM Patrons
WHERE  CONTAINS(Patrons.ResidentialAddress1, '"123*"')
AND CONTAINS(Patrons.ResidentialAddress1, '"generic*"')

ResidentialAddress1
=============================
123 GENERIC WAY
123 GENERIC WAY
123 GENERIC WAY
123 GENERIC WAY
123 GENERIC WAY

(5 row(s) affected)

Excellent. And now i'l add the final keyword that the user wants to make sure exists:

SELECT ResidentialAddress1 FROM Patrons
WHERE  CONTAINS(Patrons.ResidentialAddress1, '"123*"')
AND CONTAINS(Patrons.ResidentialAddress1, '"generic*"')
AND CONTAINS(Patrons.ResidentialAddress1, '"way*"')


ResidentialAddress1            
------------------------------ 

(0 row(s) affected)

Huh? No rows? What if i query for just "way*":

SELECT ResidentialAddress1 FROM Patrons
WHERE CONTAINS(Patrons.ResidentialAddress1, '"way*"')

ResidentialAddress1            
------------------------------ 

(0 row(s) affected)

At first i thought that perhaps it's because of the *, and it's requiring that the root way have more characters after it. But that's not true:

  • Searching for "123*" matches "123"
  • Searching for "generic*" matches "generic"
  • Books online says, The asterisk matches zero, one, or more characters

What if i remove the * just for s&g:

SELECT ResidentialAddress1 FROM Patrons
WHERE CONTAINS(Patrons.ResidentialAddress1, '"way"')

Server: Msg 7619, Level 16, State 1, Line 1
A clause of the query contained only ignored words. 

So one might think that you are just not allowed to even search for way, either alone, or as a root. But this isn't true either:

SELECT * FROM Patrons
WHERE CONTAINS(Patrons.*, '"way*"')

AccountNumber FirstName Lastname
------------- --------- --------
33589         JOHN      WAYNE                    

So sum up, the user is searching for rows that contain all the words:

123 generic way

Which i, correctly, translate into the WHERE clauses:

SELECT * FROM Patrons
WHERE CONTAINS(Patrons.*, '"123*"')
AND CONTAINS(Patrons.*, '"generic*"')
AND CONTAINS(Patrons.*, '"way*"')

which returns no rows. Tell me this just isn't going to work, that it's not my fault, and SQL Server is crazy.

Note: i've emptied the FT index and rebuilt it.

© Stack Overflow or respective owner

Related posts about sql-server

Related posts about sql-server-2000