
I'm new to Entity Framework. I need to search for a word such as 'jhonny' in text stored in the database. The text was extracted from PDF files, so each Text column contains many words.

Here is my code:

using (var d = new DatabaseContext())
{
    var l = d.Pages.Where(x => x.Text.ToLower().Contains(text.ToLower())).ToList();
}

So far the code is working.

But the requirement changed. If the user types in jhonny bravo, the program has to search for both words, jhonny and bravo, in the Text column. A row should match even when the two words are not adjacent, as in:

Jhonny is bravo
jhonny is not bravo

How can I solve this?

I came up with the idea of splitting the text and searching for each word:

using (var d = new DatabaseContext())
{
    var split = Regex.Split(text, @"\s+");

    if (split.Count() > 1)
    {
        var l = d.Pages.Where(x => x.Text.ToLower().Contains(text.ToLower())).ToList();
    }
}

But the code above still searches for the whole phrase. How can I build the search dynamically? What if the search term contains 6 words? How should I write the query? Thank you.

  • Try Shorpy (commented Apr 8 at 7:02)

2 Answers


You can build a chain of Where calls, one condition per word:

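using (var db = new DatabaseContext())
{
    // Requires System.Linq and System.Text.RegularExpressions.
    var words = Regex.Split(text, @"\s+");
    var query = db.Pages.AsQueryable();
    // Each Where narrows the previous result, so all words must be present.
    foreach (var word in words)
        query = query.Where(x => x.Text.ToLower().Contains(word.ToLower()));
    // The query is sent to the database only here.
    var answer = query.ToList();
}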
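Because each Where call filters the result of the previous one, the chain behaves like a logical AND over all the words. The following is a minimal sketch of that idea that runs without a database: the Page class and the PageSearch.ContainingAllWords helper are hypothetical names introduced for illustration, and a List<Page> exposed through AsQueryable() stands in for db.Pages. It runs as LINQ to Objects, so it shows the filtering logic only, not how Entity Framework translates the query to SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the entity behind db.Pages.
public class Page
{
    public string Text { get; set; } = "";
}

public static class PageSearch
{
    // Adds one Where clause per search word; a page must contain every word.
    public static IQueryable<Page> ContainingAllWords(IQueryable<Page> pages, string text)
    {
        // Split on whitespace and drop empty pieces from leading/trailing spaces.
        var words = Regex.Split(text, @"\s+").Where(w => w.Length > 0);
        foreach (var word in words)
        {
            var lowered = word.ToLower();
            pages = pages.Where(p => p.Text.ToLower().Contains(lowered));
        }
        return pages;
    }
}

public static class Demo
{
    public static void Main()
    {
        // In-memory data used instead of a real DbSet (assumption for the demo).
        var pages = new List<Page>
        {
            new Page { Text = "Jhonny is bravo" },
            new Page { Text = "jhonny is not bravo" },
            new Page { Text = "Only jhonny here" },
        }.AsQueryable();

        // Prints the first two pages; the third lacks "bravo".
        foreach (var page in PageSearch.ContainingAllWords(pages, "jhonny bravo"))
            Console.WriteLine(page.Text);
    }
}
```

The helper takes and returns IQueryable<Page>, so the same loop works for 2 words or 6 without changing the query code.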

3 Comments

Why do you use AsQueryable? dbContext.Pages already implements IQueryable<Page>, doesn't it?
@HaraldCoppoolse, yes, it does. But how are you going to assign an IQueryable<Page> to an IDbSet<Page> variable? Only the reverse is possible.
Do you really need IDbSet? Is that because you are using query syntax instead of method syntax? I always use method syntax, so I'm not really familiar with the ins and outs of query syntax. With method syntax, since you are only using extension methods of IQueryable, I would think that you don't need IDbSet.

Here I split the text on spaces, which gives a list of every word in the text.

I also use the Distinct() method to remove duplicate words. I'm not sure whether it improves performance, so you can remove it if you don't like it.

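var keywords = new[] { "john", "bravo", "hello" };
var l = d.Pages
         // A lambda with a statement body cannot be converted to an expression
         // tree, so the filter runs in memory after all pages are loaded.
         .AsEnumerable()
         .Where(page => {
                 var words = page.Text.ToLower().Split(' ').Distinct();

                 foreach (var keyword in keywords) {
                     if (!words.Contains(keyword.ToLower()))
                         return false;
                 }

                 return true;
             }
         )
         .ToList();

// This compares whole tokens, so "john," or "johnnxx" will not match "john"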
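Splitting on a single space leaves punctuation attached to the words, so a token such as "john," is not equal to "john". One possible way around this, sketched below under stated assumptions, is to split on runs of non-word characters and compare case-insensitively. WordMatcher.ContainsAllWords is a hypothetical helper name, and the check runs in memory on a plain string, so it would be applied to pages that have already been loaded from the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordMatcher
{
    // Returns true when every keyword occurs in the text as a whole word.
    public static bool ContainsAllWords(string text, IEnumerable<string> keywords)
    {
        // Split on runs of non-word characters so "john," yields "john";
        // the set removes duplicates and ignores case when comparing.
        var words = new HashSet<string>(
            Regex.Split(text, @"\W+").Where(w => w.Length > 0),
            StringComparer.OrdinalIgnoreCase);

        return keywords.All(k => words.Contains(k));
    }
}

public static class Demo
{
    public static void Main()
    {
        var keywords = new[] { "john", "bravo" };

        // Punctuation and case differences are ignored.
        Console.WriteLine(WordMatcher.ContainsAllWords("Hello, John. Bravo!", keywords));

        // "johnnxx" is a different word, so "john" is not found.
        Console.WriteLine(WordMatcher.ContainsAllWords("johnnxx bravo", keywords));
    }
}
```

The HashSet plays the role of Distinct() in the answer above and also makes each keyword lookup a constant-time operation on average.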
