I have strings stored in a table like:

1. "the quick brown fox"
2. "the quick brown fox jumps"
3. "the quick brown fox jumps over the lazy dog"
4. "the quick potato does nothing"

Given three input words, I want to return an entry when all three words are found in its string.

So I'm doing this:

WHERE word1 IN stringfield AND word2 IN stringfield AND word3 IN stringfield

However, I want to optionally provide additional input words and rank the results by how many of the input words each entry contains. Every returned entry must still contain at least three of the input words.

So for example the input words of:

"the", "quick", "brown", "fox", "jumps", "over"

returns:

3.
2. 
1. 

Because 3 has the most matches, then 2, then 1. And 4 doesn't get selected because it didn't contain at least three matches.

Is that at all possible? And is this the fastest way to do it, or would I be better to use junction tables? If so how? Thanks so much.

2 Comments

I guess it is doable in MySQL, but I guess it would be easier in PHP, given that you have tagged the question with "php". Commented May 30, 2015 at 11:14
Yes, either works for me. Commented May 30, 2015 at 11:32
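If the filtering is done in application code, as the comment above suggests, the logic is short. The following is only a sketch in Python (the same idea carries over to PHP); the function name `rank_by_word_matches` and the `entries` dictionary are made up for illustration, and it assumes the rows have already been fetched and that words should match as whole words, counted once per entry.

```python
import re

def rank_by_word_matches(entries, words, min_matches=3):
    """Return (entry_id, match_count) pairs, best match first.

    entries: dict mapping id -> string (hypothetical, already fetched rows).
    A word counts at most once per entry and only as a whole word.
    """
    wanted = {w.lower() for w in words}
    scored = []
    for entry_id, text in entries.items():
        tokens = set(re.findall(r"\w+", text.lower()))
        count = len(wanted & tokens)
        if count >= min_matches:
            scored.append((entry_id, count))
    # Highest count first; ties are ordered by ascending id.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored

entries = {
    1: "the quick brown fox",
    2: "the quick brown fox jumps",
    3: "the quick brown fox jumps over the lazy dog",
    4: "the quick potato does nothing",
}
ranked = rank_by_word_matches(
    entries, ["the", "quick", "brown", "fox", "jumps", "over"]
)
print(ranked)  # [(3, 6), (2, 5), (1, 4)]
```

This reads every row into memory, so it only makes sense for small tables.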

2 Answers

First, you might be best off using MySQL's full-text functionality. Read about it here.

I am assuming that you are constructing your where clause dynamically, so if you have five words, you can construct:

WHERE stringfield LIKE '%word1%' OR
      stringfield LIKE '%word2%' OR
      stringfield LIKE '%word3%' OR
      stringfield LIKE '%word4%' OR
      stringfield LIKE '%word5%' 

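The IN operator simply does not do what you think it is doing: it tests whether a value equals one of a list of values, and it does not search for a word inside a string.

If you can do this, then to require at least three matches and sort by the number of matches, the full query would use: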

WHERE ((stringfield LIKE '%word1%') +
       (stringfield LIKE '%word2%') +
       (stringfield LIKE '%word3%') +
       (stringfield LIKE '%word4%') +
       (stringfield LIKE '%word5%')
      ) >= 3
ORDER BY ((stringfield LIKE '%word1%') +
          (stringfield LIKE '%word2%') +
          (stringfield LIKE '%word3%') +
          (stringfield LIKE '%word4%') +
          (stringfield LIKE '%word5%')
         ) DESC

MySQL treats boolean expressions as integers in a numeric context. This makes it particularly easy to count the number of matches. But, as I say, a full text index may be what you really need.
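Since the clause has to be built dynamically, here is one way that could look. This is a sketch, not the only way to do it: it is written in Python rather than PHP, the table name `strings`, the `id` column and both function names are assumptions, and SQLite is used only as a stand-in so the example runs without a MySQL server (SQLite also evaluates `LIKE` to 0 or 1). The words are passed as bound parameters, and `%`, `_` and the escape character are escaped so they match literally.

```python
import sqlite3

def escape_like(word, esc="!"):
    """Escape LIKE wildcards so the word is matched literally."""
    for ch in (esc, "%", "_"):
        word = word.replace(ch, esc + ch)
    return word

def build_match_query(words, table="strings", column="stringfield",
                      min_matches=3, placeholder="?"):
    """Build the WHERE ... >= n ORDER BY ... DESC query for any word count.

    table and column are interpolated, so they must be trusted identifiers.
    Use placeholder="%s" for MySQL drivers that use the format paramstyle.
    """
    term = f"({column} LIKE {placeholder} ESCAPE '!')"
    score = "(" + " + ".join([term] * len(words)) + ")"
    sql = (f"SELECT id, {column} FROM {table} "
           f"WHERE {score} >= {placeholder} "
           f"ORDER BY {score} DESC")
    patterns = [f"%{escape_like(w)}%" for w in words]
    # The score expression appears twice, so the patterns are bound twice.
    params = patterns + [min_matches] + patterns
    return sql, params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE strings (id INTEGER PRIMARY KEY, stringfield TEXT)")
conn.executemany("INSERT INTO strings VALUES (?, ?)", [
    (1, "the quick brown fox"),
    (2, "the quick brown fox jumps"),
    (3, "the quick brown fox jumps over the lazy dog"),
    (4, "the quick potato does nothing"),
])
sql, params = build_match_query(
    ["the", "quick", "brown", "fox", "jumps", "over"]
)
ids = [row[0] for row in conn.execute(sql, params)]
print(ids)  # [3, 2, 1]
```

Keep in mind that `LIKE '%word%'` matches substrings, so "the" would also match "other".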

5 Comments

That's fantastic. Just to complicate things even further, is there any way via PHP or MySQL that I can store how many matches each resulting entry had? Also, full-text looks good, but any idea how to achieve the same thing using those commands?
@Lukesmith . . . Yes. Just put the expression in the SELECT clause.
Like this?: SELECT *, ((stringfield LIKE '%word1%') + (stringfield LIKE '%word2%') + (stringfield LIKE '%word3%') + (stringfield LIKE '%word4%') + (stringfield LIKE '%word5%') ) AS numMatches ... Is this going to be a very slow algorithm now? Say, if I have on average 10 words and 300 - 500 database entries?
@Lukesmith . . . It will take the same amount of time, regardless of whether it is in the select, where, or order by clauses. You can use order by NumMatches for that case. However, for speed, you should look into the full text functions.
OK great thanks so much. And yeah I'll definitely look into the full-text functions.
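To make the numMatches idea from these comments concrete, here is a sketch that selects the count once in a derived table and then filters and sorts on the alias, so each row comes back with its number of matches. It rests on several assumptions: Python instead of PHP, SQLite as a runnable stand-in for MySQL, a hypothetical table `strings` with an `id` column, a made-up function name, and input words that contain no `%` or `_`.

```python
import sqlite3

def fetch_with_match_counts(conn, words, min_matches=3):
    """Return (id, stringfield, num_matches) rows, most matches first."""
    term = "(stringfield LIKE ?)"
    score = " + ".join([term] * len(words))
    sql = (
        "SELECT id, stringfield, num_matches FROM ("
        f" SELECT id, stringfield, ({score}) AS num_matches FROM strings"
        ") AS scored "
        "WHERE num_matches >= ? "
        "ORDER BY num_matches DESC"
    )
    # One pattern per word, then the threshold for the outer WHERE.
    params = [f"%{w}%" for w in words] + [min_matches]
    return conn.execute(sql, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE strings (id INTEGER PRIMARY KEY, stringfield TEXT)")
conn.executemany("INSERT INTO strings VALUES (?, ?)", [
    (1, "the quick brown fox"),
    (2, "the quick brown fox jumps"),
    (3, "the quick brown fox jumps over the lazy dog"),
    (4, "the quick potato does nothing"),
])
rows = fetch_with_match_counts(
    conn, ["the", "quick", "brown", "fox", "jumps", "over"]
)
for row in rows:
    print(row)
```

With the sample data this returns entries 3, 2 and 1 with 6, 5 and 4 matches respectively.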
Well, when I start to think that something is hard, I try to get it done... Here is a solution (the name of the database is 'Test'):

1st, create this function:

DELIMITER $$
CREATE DEFINER = 'root'@'%'
FUNCTION Test.countOccurence (LineTocheck nvarchar(255), criteriaToMatch nvarchar(15))
RETURNS int(11)
-- DETERMINISTIC is required when binary logging is enabled.
DETERMINISTIC
BEGIN
  -- Occurrences = length removed by REPLACE / length of the search term.
  DECLARE Occurences int DEFAULT 0;
  SELECT
    (LENGTH(LineTocheck) - LENGTH(REPLACE(LineTocheck, criteriaToMatch, ''))) / LENGTH(criteriaToMatch) INTO Occurences;
  RETURN Occurences;
END
$$

DELIMITER ;

2nd, execute the query:

SELECT Generic.id
    ,Description
    ,SUM(countOccurence(Description, c.criteria)) AS totalOccurences
FROM Generic
CROSS JOIN criteria c
GROUP BY Description
    ,Generic.id
ORDER BY totalOccurences DESC
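To see what the function and the query compute, here is a small Python mirror of the same arithmetic. It is only an illustration: `count_occurrences` is a made-up name, and the data is the sample data from this answer (criteria 'fox', 'brown', 'over'). Note that the formula counts every occurrence of a substring, so a word that appears twice contributes 2, and the query applies no minimum, so entries with a total of 0 are still listed.

```python
def count_occurrences(line, criterion):
    """Same formula as the SQL function: removed length / criterion length."""
    if not criterion:
        return 0  # avoid dividing by zero for an empty search term
    removed = len(line) - len(line.replace(criterion, ""))
    return removed // len(criterion)

criteria = ["fox", "brown", "over"]
descriptions = {
    1: "the quick brown fox",
    2: "the quick brown fox jumps",
    3: "the quick brown fox jumps over the lazy dog",
    4: "the quick potato does nothing",
}

# Equivalent of the cross join plus SUM(...) GROUP BY id.
totals = {
    entry_id: sum(count_occurrences(text, c) for c in criteria)
    for entry_id, text in descriptions.items()
}
# Equivalent of ORDER BY ... DESC (ties keep their original order here).
ranked = sorted(totals.items(), key=lambda item: -item[1])
print(ranked)  # [(3, 3), (1, 2), (2, 2), (4, 0)]
```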

P.S. The table structure is as follows. For the criteria:

CREATE TABLE Test.criteria (
  id int(11) NOT NULL AUTO_INCREMENT,
  criteria varchar(15) NOT NULL,
  PRIMARY KEY (id)
)
ENGINE = INNODB
AUTO_INCREMENT = 1
CHARACTER SET utf8
COLLATE utf8_general_ci;

For the table in which you want to count the occurrences:

CREATE TABLE Test.Generic (
  id int(11) NOT NULL AUTO_INCREMENT,
  Description varchar(255) NOT NULL,
        PRIMARY KEY (id)
)
ENGINE = INNODB
AUTO_INCREMENT = 1
CHARACTER SET utf8
COLLATE utf8_general_ci;

SET NAMES 'utf8';

INSERT INTO Test.criteria(id, criteria) VALUES
(1, 'fox');
INSERT INTO Test.criteria(id, criteria) VALUES
(2, 'brown');
INSERT INTO Test.criteria(id, criteria) VALUES
(3, 'over');

SET NAMES 'utf8';

INSERT INTO Test.Generic(id, Description) VALUES
(1, 'the quick brown fox');
INSERT INTO Test.Generic(id, Description) VALUES
(2, 'the quick brown fox jumps');
INSERT INTO Test.Generic(id, Description) VALUES
(3, 'the quick brown fox jumps over the lazy dog');
INSERT INTO Test.Generic(id, Description) VALUES
(4, 'the quick potato does nothing');

Use Dbforge MySQL Studio Express (free) to connect to MySQL and run the statements http://www.devart.com/login.html?returnToUrl=/dbforge/mysql/studio/download.html%3Ffd=dbforgemysqlfree.exe

Test it and let me know

2 Comments

Sorry but this was beyond me, hopefully it will be useful to some future user though.
@Lukesmith It's quite straightforward... just copy and paste the SQL statements. I will modify the answer to include sample statements.
