Let's say your input is this:
/mnt/qfs-X/Asset_Management/XG_Marketing_/Episodic-SG_1001_1233.jpg
What I am going to do is convert every forward slash and underscore into whitespace. The underscore matters because the standard tokenizer does not split on it.
So effectively your input would now look like this:
mnt qfs-X Asset Management XG Marketing Episodic-SG 1001 1233.jpg
Running that through the standard tokenizer and the lowercase token filter (the standard token filter is a no-op) gives the list of terms below. These are what end up stored in your inverted index and what your queries are matched against.
mnt qfs x asset management xg marketing episodic sg 1001 1233 jpg
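To make the three analysis steps concrete, here is a rough Python sketch of the same chain. It is only an approximation and not how Elasticsearch implements it: the `analyze` helper is a hypothetical name, and the regex tokenizer simply keeps runs of ASCII letters and digits, which happens to agree with the standard tokenizer for this particular input but not in general.

```python
import re

def analyze(text):
    """Rough stand-in for my_analyzer: char filter, tokenizer, lowercase."""
    # char_filter: pattern_replace turns every "/" and "_" into a space
    filtered = re.sub(r"[/_]", " ", text)
    # tokenizer: approximation only, keeps runs of ASCII letters and digits
    tokens = re.findall(r"[A-Za-z0-9]+", filtered)
    # filter: lowercase every token
    return [token.lower() for token in tokens]

path = "/mnt/qfs-X/Asset_Management/XG_Marketing_/Episodic-SG_1001_1233.jpg"
print(" ".join(analyze(path)))
# prints: mnt qfs x asset management xg marketing episodic sg 1001 1233 jpg
```

To see what Elasticsearch itself produces, the _analyze API on the index can be called with my_analyzer and the same text.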
Below is the sample mapping and query for the above:
Mapping
PUT mysampleindex
{
  "settings":{
    "analysis":{
      "analyzer":{
        "my_analyzer":{
          "tokenizer":"standard",
          "char_filter":[
            "my_char_filter"
          ],
          "filter":[
            "standard",
            "lowercase"
          ]
        }
      },
      "char_filter":{
        "my_char_filter":{
          "type":"pattern_replace",
          "pattern":"\\/|_",
          "replacement":" "
        }
      }
    }
  },
  "mappings":{
    "mydocs":{
      "properties":{
        "mytext":{
          "type":"text",
          "analyzer":"my_analyzer"
        }
      }
    }
  }
}
Sample Document
POST mysampleindex/mydocs/1
{
  "mytext": "/mnt/qfs-X/Asset_Management/XG_Marketing_/Episodic-SG_1001_1233.jpg"
}
Sample Query
POST mysampleindex/_search
{
  "query":{
    "match":{
      "mytext":"qfs episodic sg 1001 jpg"
    }
  }
}
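As a rough illustration of why this query finds the document, here is a small Python sketch. It assumes the default behaviour of a match query, where the query string is analyzed and the document matches if any resulting term is in the index. The `analyze` and `match_any` helpers are hypothetical names, the tokenizer is the same regex approximation of the standard tokenizer, and scoring is ignored entirely.

```python
import re

def analyze(text):
    """Approximate my_analyzer: replace / and _, tokenize, lowercase."""
    filtered = re.sub(r"[/_]", " ", text)
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", filtered)]

def match_any(indexed_text, query_text):
    """True if at least one analyzed query term is among the indexed terms."""
    index_terms = set(analyze(indexed_text))
    return any(term in index_terms for term in analyze(query_text))

# The path from the top of this answer, used as the indexed value
doc = "/mnt/qfs-X/Asset_Management/XG_Marketing_/Episodic-SG_1001_1233.jpg"

print(match_any(doc, "qfs episodic sg 1001 jpg"))  # True
print(match_any(doc, "nothing here"))              # False
```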
Keep in mind that when you send the above query, Elasticsearch takes the query string and applies the same analysis to it at search time. I'd suggest reading the Elasticsearch documentation on search-time analysis for more information. This is the reason why you would still get the document back with the query string below.
"mytext": "QFS EPISODIC SG 1001 jpg"
Now if you search for only part of a word, for example pisodic (from episodic) as in the query below, the search wouldn't return anything, because your inverted index doesn't store the words in that fashion. For such scenarios I'd suggest using the N-Gram Tokenizer, so that episodic is further broken into terms like episodi and pisodic, which are then stored in the inverted index.
POST mysampleindex/_search
{
  "query":{
    "match":{
      "mytext":"pisodic"
    }
  }
}
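To show what n-grams would add to the index, here is a minimal Python sketch that generates character n-grams for a single term. The `char_ngrams` helper and the sizes 3 and 7 are assumptions picked for illustration; they play the role of the min_gram and max_gram settings of the ngram tokenizer, and the order in which Elasticsearch emits the grams may differ.

```python
def char_ngrams(term, min_gram, max_gram):
    """Return every substring of term with length min_gram..max_gram."""
    grams = []
    for size in range(min_gram, max_gram + 1):
        # Slide a window of the current size across the term
        for start in range(len(term) - size + 1):
            grams.append(term[start:start + size])
    return grams

grams = char_ngrams("episodic", 3, 7)
print("pisodic" in grams)  # True
print("episodi" in grams)  # True
```

With grams like these stored in the inverted index, the pisodic query above would have a term to match. The trade-off is a larger index, and as far as I know newer Elasticsearch versions limit the gap between min_gram and max_gram through the index.max_ngram_diff setting, so a wide range like this one would need that setting raised.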
Also note that I have been using the text datatype and not keyword, since keyword values are indexed as-is without going through an analyzer.
I hope this helps!