7

I have a search box that searches the title field based on the given input, so that the user is offered all available titles starting with the inserted text. It is based on Lucene and Hibernate Search. It works fine until a space is entered; then the results disappear. For example, I want "Learning H" to give me "Learning Hibernate" as the result, but this doesn't happen. Could you please advise me on what I should use here instead?

Query Builder:

    QueryBuilder qBuilder = fullTextSession.getSearchFactory()
            .buildQueryBuilder().forEntity(LearningGoal.class).get();
    Query query = qBuilder.keyword().wildcard().onField("title")
            .matching(searchString + "*").createQuery();

    BooleanQuery bQuery = new BooleanQuery();
    bQuery.add(query, BooleanClause.Occur.MUST);
    for (LearningGoal exGoal : existingGoals) {
        Term omittedTerm = new Term("id", String.valueOf(exGoal.getId()));
        bQuery.add(new TermQuery(omittedTerm), BooleanClause.Occur.MUST_NOT);
    }
    @SuppressWarnings("unused")
    org.hibernate.Query hibQuery = fullTextSession.createFullTextQuery(
            query, LearningGoal.class);

Hibernate class:

    @AnalyzerDef(name = "searchtokenanalyzer",
        tokenizer = @TokenizerDef(factory = StandardTokenizerFactory.class),
        filters = {
            @TokenFilterDef(factory = StandardFilterFactory.class),
            @TokenFilterDef(factory = LowerCaseFilterFactory.class),
            @TokenFilterDef(factory = StopFilterFactory.class, params = {
                @Parameter(name = "ignoreCase", value = "true") }) })
    @Analyzer(definition = "searchtokenanalyzer")
    public class LearningGoal extends Node {
  • Printing the query to the output will definitely help you. Commented Mar 8, 2013 at 1:41
  • It is useful indeed, but it didn't help me understand why I get no results. For example, I have a learning goal whose title is "Learning Probability Theory". For the input string "learning p", the two queries print as bQuery:+title:learning p* and hibQuery:FullTextQueryImpl(title:learning p*). It finds the value if the input string is "learning". Commented Mar 8, 2013 at 5:07
  • I also tried replacing the space with ?, but that didn't give any results either. Commented Mar 8, 2013 at 5:40
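
A likely explanation for the empty result, sketched here without Lucene: the analyzer splits the title into separate lowercase terms at index time, while a wildcard query is not analyzed and is compared against each indexed term as a whole. No single term contains a space, so "learning p*" cannot match. The class and method names below are hypothetical, whitespace splitting is only a rough stand-in for the configured analyzer, and a regular expression stands in for Lucene's wildcard matching.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class WildcardVsTokens {

    // Rough stand-in for the analyzer: lowercase the title and split on whitespace.
    static List<String> indexTerms(String title) {
        return Arrays.asList(title.toLowerCase().trim().split("\\s+"));
    }

    // Translate a wildcard pattern (* = any run of characters, ? = one character) into a regex.
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString());
    }

    // A wildcard matches a document only if some single indexed term matches it entirely.
    static boolean matchesAnyTerm(String title, String wildcard) {
        Pattern p = toRegex(wildcard);
        for (String term : indexTerms(title)) {
            if (p.matcher(term).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String title = "Learning Probability Theory";
        System.out.println(matchesAnyTerm(title, "learning*"));   // true
        System.out.println(matchesAnyTerm(title, "learning p*")); // false
        System.out.println(matchesAnyTerm(title, "learning?p*")); // false
    }
}
```

Under these assumptions, the space (or a ? in its place) can never match, because the term boundary falls exactly where the space was.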

2 Answers

9

I found a workaround for this problem. The idea is to tokenize the input string and remove stop words. For the last token I create a query using a keyword wildcard, and for all the preceding words I create a TermQuery. Here is the full code:

    BooleanQuery bQuery = new BooleanQuery();
    Session session = persistence.currentManager();
    FullTextSession fullTextSession = Search.getFullTextSession(session);
    Analyzer analyzer = fullTextSession.getSearchFactory().getAnalyzer("searchtokenanalyzer");
    QueryParser parser = new QueryParser(Version.LUCENE_35, "title", analyzer);

    // Run the input through the analyzer used for the title field, so the
    // tokens are lowercased and stop words are removed.
    // Start with an empty array so a parse failure cannot cause a NullPointerException below.
    String[] tokenized = new String[0];
    try {
        Query parsed = parser.parse(searchString);
        String cleanedText = parsed.toString("title");
        tokenized = cleanedText.split("\\s+");
    } catch (ParseException e) {
        // Unparseable input: no title clauses are added.
        e.printStackTrace();
    }

    QueryBuilder qBuilder = fullTextSession.getSearchFactory()
            .buildQueryBuilder().forEntity(LearningGoal.class).get();
    for (int i = 0; i < tokenized.length; i++) {
        if (i == tokenized.length - 1) {
            // Last token: the user may still be typing it, so match it as a prefix.
            Query query = qBuilder.keyword().wildcard().onField("title")
                    .matching(tokenized[i] + "*").createQuery();
            bQuery.add(query, BooleanClause.Occur.MUST);
        } else {
            // Earlier tokens are complete words, so match them exactly.
            Term exactTerm = new Term("title", tokenized[i]);
            bQuery.add(new TermQuery(exactTerm), BooleanClause.Occur.MUST);
        }
    }
    // Exclude the goals the user has already selected.
    for (LearningGoal exGoal : existingGoals) {
        Term omittedTerm = new Term("id", String.valueOf(exGoal.getId()));
        bQuery.add(new TermQuery(omittedTerm), BooleanClause.Occur.MUST_NOT);
    }
    org.hibernate.Query hibQuery = fullTextSession.createFullTextQuery(
            bQuery, LearningGoal.class);

4 Comments

Can you please add more explanation? I don't get it so far. Why are you using a different query for the last token? And please modify your example so that it is clear enough. Why is existingGoals necessary at all?
Let's say we have the title "Hibernate Search". When the user enters "Hibernate Se", the first token is "Hibernate", and we treat it as an exact term: since the user has already started typing another word, we know he entered the whole word he wanted. For the second word, "se", the user might not have finished typing, so we use a wildcard to cover the case where he is in the middle of the word, which is exactly the case here. So the query for the last word covers everything starting with "se", and all words entered before it are used as exact terms.
As for the second question (existingGoals), this is something very specific to my use case. I wanted to exclude from the search results the titles that the user has already added to his list of selected items, so these existingGoals are actually titles that should be ignored, and you might not need them in your case.
That does make a lot of sense here. I just used your loop for my use case. :) Thank you!
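
The rule described in the comments above (exact match for every completed word, prefix match for the last one, plus exclusion of already selected ids) can be sketched in plain Java. This is only an illustration under assumptions: `PrefixQueryPlan` and `plan` are hypothetical names, the clauses are written as strings in a notation resembling Lucene's query syntax rather than built with the real Lucene classes, and stop-word removal is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PrefixQueryPlan {

    // Build one clause per token: "+" marks a required clause, "-" an excluded one.
    static List<String> plan(String input, List<Long> excludedIds) {
        List<String> clauses = new ArrayList<>();
        String cleaned = input.trim().toLowerCase();
        if (!cleaned.isEmpty()) {
            String[] tokens = cleaned.split("\\s+");
            for (int i = 0; i < tokens.length; i++) {
                boolean last = (i == tokens.length - 1);
                // Only the last token gets a trailing wildcard; earlier ones are exact.
                clauses.add("+title:" + tokens[i] + (last ? "*" : ""));
            }
        }
        // Items the user has already selected must not appear in the results.
        for (Long id : excludedIds) {
            clauses.add("-id:" + id);
        }
        return clauses;
    }

    public static void main(String[] args) {
        List<String> clauses = plan("Hibernate Se", Arrays.asList(5L));
        System.out.println(String.join(" ", clauses)); // +title:hibernate +title:se* -id:5
    }
}
```

One consequence of this design: a single-word input is treated as the last token, so it is always matched as a prefix.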
-2

SQL uses different wildcards than any terminal. In SQL '%' replaces zero or more occurrences of any character (in the terminal you use '*' instead), and the underscore '_' replaces exactly one character (in the terminal you use '?' instead). Hibernate doesn't translate the wildcard characters.

So in the second line you have to replace matching(searchString + "*") with

  matching(searchString + "%")

3 Comments

Are you sure about this? With this change it doesn't give me any results, even without spaces in searchString. Previously (with *) I had some results until a space appeared in searchString. I don't see how Hibernate Search is related to SQL. It searches Lucene indexes, which are not stored in the database, so I'm not sure it uses SQL syntax.
For Hibernate + SQL I'm sure, but I don't use Lucene, and I don't know what the Lucene engine is doing with the input.
I see. You thought this was a regular database query. However, Hibernate Search uses Lucene queries to search over Lucene indexes, and their syntax is not the same as SQL: lucenetutorial.com/lucene-query-syntax.html
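
To make the difference between the two wildcard syntaxes concrete, here is a minimal sketch that rewrites a SQL LIKE pattern into the * and ? wildcards used in Lucene queries. The helper `likeToLuceneWildcard` is hypothetical and is not something Hibernate Search provides; it also does no escaping of literal *, ? or backslash characters.

```java
public class LikeToWildcard {

    // Hypothetical helper: map SQL LIKE wildcards to Lucene wildcards.
    // % (zero or more characters) becomes *, _ (exactly one character) becomes ?.
    static String likeToLuceneWildcard(String likePattern) {
        StringBuilder sb = new StringBuilder(likePattern.length());
        for (char c : likePattern.toCharArray()) {
            if (c == '%') {
                sb.append('*');
            } else if (c == '_') {
                sb.append('?');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(likeToLuceneWildcard("learn%"));    // learn*
        System.out.println(likeToLuceneWildcard("h_bernate")); // h?bernate
    }
}
```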
