
In my search, I want the space character to be optional in a filter request.
For example:
when I filter on "THE ONE", I see the corresponding document.
I want to see it even if I type "THEONE".
This is how my query is built today:

boolQueryBuilder.must(QueryBuilders.boolQuery()
     .should(QueryBuilders.wildcardQuery("description", "*" + 
         searchedWord.toLowerCase() + "*"))
     .should(QueryBuilders.wildcardQuery("id", "*" + 
         searchedWord.toUpperCase() + "*"))
     .should(QueryBuilders.wildcardQuery("label", "*" + 
         searchedWord.toUpperCase() + "*"))
     .minimumShouldMatch("1"));

What I want is to add this filter (taken from "Writing a space-ignoring autocompleter with ElasticSearch"):

"word_joiner": {
  "type": "word_delimiter",
  "catenate_all": true
}   

But I don't know how to do this using the API. Any idea?
Thanks!

EDIT: Following @raam86's suggestion, I added my own custom analyzer:

{
    "index": {
      "number_of_shards": 1,
      "analysis": {
        "filter": {
          "word_joiner": {
            "type": "word_delimiter",
            "catenate_all": true
          }
        },
        "analyzer": {
          "custom_analyzer": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": [
              "word_joiner"
            ]
          }
        }
      }
    }
}

And here is the document:

@Document(indexName = "cake", type = "pa")
@Setting(settingPath = "/elasticsearch/config/settings.json")
public class PaElasticEntity implements Serializable {
    @Field(type = FieldType.String, analyzer = "custom_analyzer")
    private String maker;
}

It still does not work...

1 Answer


You need a shingle token filter. Here is a simple example.

1. Create the index with these settings:

PUT joinword
{
    "settings": {
        "analysis": {
            "filter": {
                "word_joiner": {
                    "type": "shingle",
                    "output_unigrams": "true",
                    "token_separator": ""
                }
            },
            "analyzer": {
                "word_join_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": [
                        "lowercase",
                        "word_joiner"
                    ]
                }
            }
        }
    }
}

2. Check that the analyzer works as expected:

GET joinword/_analyze?pretty
{
  "analyzer": "word_join_analyzer",
  "text": "ONE TWO"
}

output:

{
  "tokens" : [ {
    "token" : "one",
    "start_offset" : 0,
    "end_offset" : 3,
    "type" : "<ALPHANUM>",
    "position" : 0
  }, {
    "token" : "onetwo",
    "start_offset" : 0,
    "end_offset" : 7,
    "type" : "shingle",
    "position" : 0
  }, {
    "token" : "two",
    "start_offset" : 4,
    "end_offset" : 7,
    "type" : "<ALPHANUM>",
    "position" : 1
  } ]
}

So now you can find this document by searching for one, two, or onetwo. Because the analyzer includes the lowercase filter, the search is case-insensitive.
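
To see why this works, the analyzer chain above can be approximated in plain Java. This is only a sketch, not Elasticsearch code: the class ShingleSketch and its analyze method are hypothetical names, the standard tokenizer is approximated by splitting on non-alphanumeric characters, and the shingle size is assumed to stay at the filter's default of 2 (two-word shingles only).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper that imitates the word_join_analyzer token output.
class ShingleSketch {

    // Approximates: standard tokenizer -> lowercase -> shingle filter with
    // output_unigrams=true, token_separator="" and shingle size 2 (assumed default).
    static List<String> analyze(String text) {
        List<String> words = new ArrayList<>();
        // Assumption: splitting on non-letter/non-digit runs stands in for the standard tokenizer.
        for (String w : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            tokens.add(words.get(i)); // unigram
            if (i + 1 < words.size()) {
                // two-word shingle joined with an empty separator
                tokens.add(words.get(i) + words.get(i + 1));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("ONE TWO")); // [one, onetwo, two]
        System.out.println(analyze("ONEtWO"));  // [onetwo]
    }
}
```

With a shingle size of 2, only adjacent pairs of words are joined, so a three-word title does not produce a single token containing all three words. Raising the shingle filter's max_shingle_size setting would be needed for that case.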

Working Spring example

Full project available on GitHub.

Entity:

@Document(indexName = "document", type = "document", createIndex = false)
@Setting(settingPath = "elasticsearch/document_index_settings.json")
public class DocumentES {
    @Id()
    private String id;
    @Field(type = FieldType.String, analyzer = "word_join_analyzer")
    private String title;

    public DocumentES() {
    }

    public DocumentES(java.lang.String title) {
        this.title = title;
    }

    public java.lang.String getId() {
        return id;
    }

    public void setId(java.lang.String id) {
        this.id = id;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String title) {
        this.title = title;
    }

    @Override
    public java.lang.String toString() {
        return "DocumentES{" +
                "id='" + id + '\'' +
                ", title='" + title + '\'' +
                '}';
    }
}

Main:

@SpringBootApplication
@EnableConfigurationProperties(value = {ElasticsearchProperties.class})
public class Application implements CommandLineRunner {
    @Autowired
    ElasticsearchTemplate elasticsearchTemplate;

    public static void main(String[] args) {
        SpringApplication.run(Application.class);
    }

    @Override
    public void run(String... args) throws Exception {
        elasticsearchTemplate.createIndex(DocumentES.class);
        elasticsearchTemplate.putMapping(DocumentES.class);

        elasticsearchTemplate.index(new IndexQueryBuilder()
                .withIndexName("document")
                .withType("document")
                .withObject(new DocumentES("ONE TWO")).build()
        );

        // Crude wait so the index can refresh and the new document becomes searchable
        Thread.sleep(2000);
        NativeSearchQuery query = new NativeSearchQueryBuilder()
                .withIndices("document")
                .withTypes("document")
                .withQuery(matchQuery("title", "ONEtWO"))
                .build();

        List<DocumentES> result = elasticsearchTemplate.queryForList(query, DocumentES.class);

        result.forEach(System.out::println);
    }
}
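
The match query in the example finds the document because the query text is analyzed too (by default with the same analyzer as the field), and a hit needs only one shared term. Below is a rough sketch of that comparison, assuming the indexed terms are exactly those from the _analyze output earlier in this answer. MatchSketch and matches are hypothetical names, and query analysis is reduced to lowercasing and splitting on whitespace.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of how a match query compares terms; not Elasticsearch code.
class MatchSketch {

    // Terms listed in the _analyze output for the indexed title "ONE TWO".
    static final Set<String> INDEXED_TERMS =
            new HashSet<>(Arrays.asList("one", "onetwo", "two"));

    // With the default OR operator, one shared term is enough for a hit.
    // Assumption: lowercasing plus whitespace splitting stands in for query-time analysis.
    static boolean matches(String queryText) {
        for (String term : queryText.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (INDEXED_TERMS.contains(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matches("ONEtWO")); // true
        System.out.println(matches("THREE"));  // false
    }
}
```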

19 Comments

Thanks for the answer! It still doesn't work with this analyzer :( But I couldn't do step two... I just tried this query: http://localhost:9200/cake/_search?q=ONETWO and it doesn't give me any results. What tool do you use to perform the second step?
@anna You can use curl, something like: curl -XGET http://localhost:9200/cake/_analyze?pretty -d { "analyzer": "word_join_analyzer", "text": "ONE TWO" }
All right, so I executed this curl -XGET "http://localhost:9200/cake/_analyze?analyzer=word_join_analyzer&pretty" -d 'ONE TWO' and I get the error: curl: (6) Could not resolve host: TWO'... Does it allow spaces? I tried using the character %20, but the results were totally wrong.
curl localhost:9200/joinword/_analyze?pretty -d '{"analyzer":"word_join_analyzer", "text": "ONE TWO"}'
Found the solution: escaping the space character as "\ ". So I get the right output! The one that is shown by @Nikita Klimov... So does this mean that this works?