I'm currently using Elasticsearch 5.5 to search through source files as fast as possible.

The thing is, I need to match exact phrases, but the search text may be only a fragment of a phrase, starting or ending in the middle of a word. I've searched extensively but couldn't find any similar cases.

For example, if the source file contains "public static class ElasticMethods", I need to be able to search for "ic static class Elast".

I'm not sure which analyzer to use. The standard analyzer breaks the text apart into words. That is a problem because I need to match the exact phrase: even if the source file contains the words public, static and class, it is not a match unless they appear in exactly the same order.
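
To illustrate the problem with word-based analysis, here is a small Python sketch. It only approximates the standard analyzer with a simple regular expression (the real one follows Unicode text segmentation rules), and `rough_tokens` is a hypothetical helper I made up, not part of any Elasticsearch client.

```python
import re

def rough_tokens(text):
    """Rough stand-in for the standard analyzer: lowercase, split on non-word characters."""
    return re.findall(r"\w+", text.lower())

source = "public static class ElasticMethods"
fragment = "ic static class Elast"

indexed = rough_tokens(source)   # whole words from the source file
wanted = rough_tokens(fragment)  # pieces of the search text

# Pieces of the search text that never appear as whole tokens in the index.
missing = [t for t in wanted if t not in indexed]
print(missing)  # ['ic', 'elast']
```

Because "ic" and "elast" are only parts of indexed words, a query built from whole tokens has nothing to match them against.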

I then tried the keyword analyzer, but the problem there is that a single term cannot be larger than 32 KB, and most of my source files are.
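
As far as I understand, the limit comes from Lucene, which rejects any single term longer than 32766 bytes of UTF-8. A quick way to check which files are affected is sketched below in Python; `exceeds_term_limit` is a hypothetical helper used only for illustration.

```python
# Lucene's maximum length for a single indexed term, in UTF-8 bytes.
MAX_TERM_BYTES = 32766

def exceeds_term_limit(content, limit=MAX_TERM_BYTES):
    """Return True if the content is too large to index as one keyword term."""
    return len(content.encode("utf-8")) > limit

small = "public static class ElasticMethods"
large = "x" * 40000  # stand-in for a big source file

print(exceeds_term_limit(small))  # False
print(exceeds_term_limit(large))  # True
```

The check uses the encoded byte length rather than the character count, since non-ASCII characters take more than one byte.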

If someone could help I'd appreciate it.

Here is one of the mappings that I've tried.

PUT /my_index?pretty
{
  "mappings": {
    "programa": {
      "properties": {
        "tFSPath": {
          "type": "text",
          "analyzer": "keyword"
        },
        "fileName": {
          "type": "text",
          "analyzer": "keyword"
        },
        "data": {
          "type": "text",
          "analyzer": "keyword"
        }
      }
    }
  }
}

A document example:

PUT /my_index/programa/2
{
    "tFSPath": "$/Projects/AuthAPI/AuthAPI/Controllers/HomeController.cs",
    "fileName": "HomeController.cs",
    "data": "using System.Web.Mvc; namespace AuthAPI.Controllers { public class HomeController : Controller { public ActionResult Index() { ViewBag.Title = \"Home Page\"; return View(); } } }"
}

And the query that got me closest to what I need:

POST my_index/_search
{
    "query": {
        "regexp":{
            "data": ".*\"public ActionResult Index\".*"
        }
    }
}
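
Wrapping the phrase in double quotes only works while the phrase itself contains no double quote, which source code often does. Below is a hedged Python sketch that backslash-escapes the characters I believe are reserved in the regexp query syntax and builds the same request body. `escape_lucene_regex` and `substring_query` are hypothetical helpers, and the reserved-character list is my reading of the documentation, so it should be checked against the version in use.

```python
import json

# Characters treated as operators by the regexp query (assumed list, including
# the optional operators that are enabled by default).
RESERVED = set('.?+*|{}[]()"\\#@&<>~')

def escape_lucene_regex(phrase):
    """Backslash-escape every reserved character so the phrase matches literally."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in phrase)

def substring_query(field, phrase):
    """Build a regexp query body that matches the phrase anywhere in the field."""
    return {"query": {"regexp": {field: ".*" + escape_lucene_regex(phrase) + ".*"}}}

body = substring_query("data", 'ViewBag.Title = "Home Page";')
print(json.dumps(body, indent=2))
```

This still depends on the whole file being indexed as a single term, so it does not get around the size limit of the keyword analyzer.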

In the example above, "ic ActionResult Ind" would be a match, but "Index ActionResult public" wouldn't. That's what I need.

  • Please add more information about the document. What type is the field you're searching on? Commented Dec 22, 2017 at 13:56
  • They're source files in multiple languages: Java, C#, JavaScript, etc. I'm using the text type for that. Commented Dec 22, 2017 at 15:15
  • What does your query look like? And please give an example of a document. Otherwise it's hard to understand what you are trying to do... Also check this: stackoverflow.com/questions/30517904/… and this: stackoverflow.com/questions/22093334/… Commented Dec 22, 2017 at 15:39
  • Hey aholbreich, thanks for the help. I've read that post; the problem is that I can't use the keyword analyzer because my files are larger than 32 KB. I've also edited the post with an example file and a query that worked, but only with the keyword analyzer. Commented Dec 22, 2017 at 16:16
