1

I have the following MYSQL query in a Cold Fusion Component being called from a ColdFusion website. I have a custom take that basically builds a boolean query for the following fields: "PlaceName,Country,Adm1,adm2,adm3,locality".

<!--- SEARCH PLACES 2--->
<cffunction name="searchPlaces_full" access="public" returntype="query">
    <cfargument name="q" type="string" required="yes">


    <CF_BOOLSEARCH searchterm="#arguments.q#" field="PlaceName,Country,Adm1,adm2,adm3,locality " booloperator="and">
    <cfquery name="GetPlaces" datasource="#application.settings.dsn#">
    SELECT 
        places.CatID,
        places.PlaceName,
        places.PlaceID,
        places.Address1,
        places.PostalCode,
        places.Locality,
        places.Address2,
        places.ImageThumb,
        places.Adm1,
        places.Country,
        places.Adm2,
        places.Adm3,
        places.contributorid,
        places.lng,
        places.lat,
        places.verified,
        places.verified_by,
        places.verified_date
    FROM places INNER JOIN places_cats ON places.PlaceID = places_cats.PlaceID
    WHERE  
    <cfif len(trim(arguments.q))>
    (#PreserveSingleQuotes(boolsearch)#)
    <cfelse>
    1=0
    </cfif>
    AND places_cats.CATID IN (#arguments.categories#)
    GROUP BY 
        places.CatID,
        places.PlaceName,
        places.PlaceID,
        places.Address1,
        places.PostalCode,
        places.Locality,
        places.Address2,
        places.ImageThumb,
        places.Adm1,
        places.Country,
        places.Adm2,
        places.Adm3,
        places.contributorid,
        places.lng,
        places.lat,
        places.verified,
        places.verified_by,
        places.verified_date
        ORDER BY PlaceName
        </cfquery>


    <cfreturn getPlaces>
</cffunction>

There are 624227 records in the database. If I do a search for Chappaqua, the actually SQL that gets run is as follows:

       SELECT 
        places.CatID,
        places.PlaceName,
        places.PlaceID,
        places.Address1,
        places.PostalCode,
        places.Locality,
        places.Address2,
        places.ImageThumb,
        places.Adm1,
        places.Country,
        places.Adm2,
        places.Adm3,
        places.contributorid,
        places.lng,
        places.lat,
        places.verified,
        places.verified_by,
        places.verified_date
    FROM places INNER JOIN places_cats ON places.PlaceID = places_cats.PlaceID
    WHERE  

    (((PlaceName LIKE '%chappaqua%') OR (Country LIKE '%chappaqua%') OR (Adm1 LIKE '%chappaqua%') OR (adm2 LIKE '%chappaqua%') OR (adm3 LIKE '%chappaqua%') OR (locality  LIKE '%chappaqua%')))

    AND places_cats.CATID IN (1,21,15,32,16,26,29,27,28,25,75,89,38,5,36,88,87,31,33,24,35,37,90,39,40,34,30,9,8,7,11,20,19,96,97,95,13,17,14,12,3,2,4,84,85,86)
    GROUP BY 
        places.CatID,
        places.PlaceName,
        places.PlaceID,
        places.Address1,
        places.PostalCode,
        places.Locality,
        places.Address2,
        places.ImageThumb,
        places.Adm1,
        places.Country,
        places.Adm2,
        places.Adm3,
        places.contributorid,
        places.lng,
        places.lat,
        places.verified,
        places.verified_by,
        places.verified_date
        ORDER BY PlaceName

I know it is ugly and complex. It takes about 1836ms to run. Is there any better way to write the query or code so that it speeds up the returned data in under a second?

Here is the Explain on the SQL: SQL EXPLAIN

12
  • 4
    Show us your indexes. Show us your EXPLAIN. Commented Feb 1, 2013 at 20:05
  • You appear to use no aggregating functions, so what's with the GROUP BY? Commented Feb 1, 2013 at 20:07
  • 2
    @Strawberry They are trying to get DISTINCT rows Commented Feb 1, 2013 at 20:08
  • Suggest they use that then! Commented Feb 1, 2013 at 20:09
  • @Strawberry That's why the GROUP BY is frustrating with MySQL Commented Feb 1, 2013 at 20:10

1 Answer 1

4

The problem is that the engine is doing a full table scan of the table, then sorting the results for the group by.

The full table scan almost seems necessary because of the likes:

(((PlaceName LIKE '%chappaqua%') OR (Country LIKE '%chappaqua%') OR (Adm1 LIKE '%chappaqua%') OR (adm2 LIKE '%chappaqua%') OR (adm3 LIKE '%chappaqua%') OR (locality  LIKE '%chappaqua%')))

The issue is that an index on PlaceName cannot be used, because the initial characters are not fixed.

So . . . Can you remove the group by? You can at least replace it with distinct, although I don't think that would affect the query plan. Do you have an index on place_cats(placeId, catId)? That would at least prevent the query from reading the category table so it could just do index lookups.

Can you limit the search only to words at the beginning of the fields?

The only other alternative that I can think of is to switch to using a full text index in MySQL.

Sign up to request clarification or add additional context in comments.

4 Comments

OK... Shaving the wildcard off the beginning of the word does save a half second, but I would hate to have to do that.
Yes, I do have an index on place_cats(placeId, catId).
Yeah, wouldn't it be worth looking at doing a full-text search in place of all that LIKE stuff?
@user1224994 . . . One option is full text search (dev.mysql.com/doc/refman/5.5/en/fulltext-search.html). It is possible that if you had a table of just the names and then joining that back would work in your case. The smaller table would scan faster and the overhead of the join would be small on a not-very-selective query.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.