0

I want the fastest way to remove every word in a given array from a given string.

So far I have this solution, which does not work:

const excludeWordList = ['the', 'in', 'a', 'an'];

run("the wall into")
run("paintings covered the wall another words into this")

function run(speech) {

    for(let a = 0; a < excludeWordList.length; a++) {
        speech = speech.replaceAll(excludeWordList[a], '');
    }
    console.log(speech);
}

As you can see, the code has three major issues:

  1. It removes matching characters inside other words, not just whole words.

  2. The result is not trimmed, and extra spaces are left between the remaining words.

  3. I don't think it is the most efficient approach, because it has to loop through the whole excludeWordList array and scan the string once per word.
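To make issues 1 and 2 concrete, here is a small sketch that runs the same loop on the first sample input. The helper name naiveRemove is made up for this illustration; the logic is the loop from the code above.

```javascript
// Same replaceAll loop as in the question, wrapped so the result can be inspected.
const words = ['the', 'in', 'a', 'an'];

function naiveRemove(speech) {
    for (const word of words) {
        // replaceAll matches substrings, so "in" is also removed from "into"
        // and "a" is also removed from "wall".
        speech = speech.replaceAll(word, '');
    }
    return speech;
}

// JSON.stringify makes the leftover leading space visible.
console.log(JSON.stringify(naiveRemove("the wall into"))); // prints " wll to"
```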

I added my own function to the benchmark as "my". As the results show, Gainza's function is the most efficient in this case:

[Image: benchmark results]

5 Answers

2

I'd use a filter and Set approach to minimize computation time. Set.has() is a constant-time lookup on average, whereas includes or indexOf scan the whole array for every word.

const excluded = new Set(['the', 'in', 'a', 'an']);


function run(speech) {
    return speech.split(' ')
           .filter(word => !excluded.has(word))
           .join(' ');
}


run("the wall into")
run("paintings covered the wall another words into this")
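If the input can contain repeated spaces or mixed case, a variation along these lines might help. This is only a sketch: the question does not say whether matching should be case-insensitive, so the toLowerCase() call is an assumption, and the names excludedWords and removeExcluded are invented for the example.

```javascript
const excludedWords = new Set(['the', 'in', 'a', 'an']);

function removeExcluded(speech) {
    return speech
        .trim()
        // Split on runs of whitespace so double spaces do not produce empty words.
        .split(/\s+/)
        // Assumption: comparison is case-insensitive, so "The" is removed as well.
        .filter(word => word !== '' && !excludedWords.has(word.toLowerCase()))
        .join(' ');
}

console.log(removeExcluded("  The   wall into ")); // prints "wall into"
```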

2

Here's a way that doesn't loop over the word list on every call: it replaces all excluded words with a single regex, then finishes with a trim and a cleanup of extra spaces. I'd expect this to be the fastest method, though it is worth measuring. You can map the array into a RegExp and preserve the word boundaries \b, whose backslash has to be escaped (\\b) when the pattern is built dynamically from a string.

const excludeWordList = ['the', 'in', 'a', 'an'];
const reg = new RegExp(excludeWordList.map(w => `\\b${w}\\b`).join('|'), 'g')

run("the wall into")
run("paintings covered the wall another words into this")

function run(speech) {
  speech = speech.replaceAll(reg, '').trim().replace(/\s\s+/g, ' ');
  console.log(speech);
}
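One caveat when building the pattern dynamically: if a word in the list ever contains regex metacharacters (for example "a.m."), they would be interpreted as pattern syntax. A minimal sketch of escaping them first is shown below; escapeRegExp, buildExcludeRegExp and removeWords are illustrative names, not part of any library.

```javascript
const excludeWords = ['the', 'in', 'a', 'an'];

// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build one alternation wrapped in word boundaries: \b(?:the|in|a|an)\b
function buildExcludeRegExp(wordsToExclude) {
    const alternatives = wordsToExclude.map(escapeRegExp).join('|');
    return new RegExp(`\\b(?:${alternatives})\\b`, 'g');
}

// Built once and reused for every call.
const excludeRegExp = buildExcludeRegExp(excludeWords);

function removeWords(speech) {
    // Remove the words, collapse whitespace runs, then trim the ends.
    return speech.replace(excludeRegExp, '').replace(/\s+/g, ' ').trim();
}

console.log(removeWords("the wall into")); // prints "wall into"
```

Note that \b only matches next to a letter, digit or underscore, so words that start or end with punctuation would need a different boundary rule.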

1

const excludeWordList = ['the', 'in', 'a', 'an'];

run("the wall into")
run("paintings covered the wall another words into this")

function run(speech) {

    const result = speech.split(' ').filter(word=>!excludeWordList.includes(word)).join(' ')
    console.log(result);
}

5 Comments

If the word list is large, it would be more efficient to convert it to a Set first, to avoid all the sequential searching in includes.
Agree w/ @Barmar. This includes is "heavy" on long lists and won't look words up in O(1).
Why does his solution have the best performance?
I used a for loop in my solution and our solutions are very similar. You used filter, but your solution does better almost every time. I thought a for loop had far better performance than the filter method. Am I wrong?
@SaraRee It took 2 years but today I can answer, my code is faster thanks to functional programming, since in a for loop you are occupying more memory space than an anonymous function.
1

You can try using split() and filter().

Edit: indexOf() complexity is O(N) because of linear search. Since we have a fixed set of words we want to exclude, converting to a set is ideal.

new Set() is also O(N), but since it is being done only once and your run() will be called more often it makes sense here. With set, .has() has O(1) complexity.

const excludeWordList = ['the', 'in', 'a', 'an'];

const excludeWordSet = new Set(excludeWordList);

run("the wall into")
run("paintings covered the wall another words into this")

function run(speech) {
    speech = speech.split(' ').filter((a) => {
        return !excludeWordSet.has(a);
    }).join(' ');

    console.log(speech);
}

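split(), filter() and join() each make a single O(N) pass over the input, and they run one after another rather than nested, so the whole function stays O(N). With indexOf() or includes() on an array of M excluded words, the filter step becomes O(N × M) instead.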
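Complexity estimates aside, the difference between includes() and Set.has() can be measured directly. The sketch below is a hypothetical micro-benchmark: the list size, input size and all names are invented for illustration, and it assumes an environment where performance.now() is available as a global (modern browsers and recent Node.js versions). Timings vary by engine and machine, so no numbers are claimed here.

```javascript
// Hypothetical data: 1000 excluded words and a 10000-word input string.
const wordList = Array.from({ length: 1000 }, (_, i) => `word${i}`);
const wordSet = new Set(wordList);
const input = Array.from({ length: 10000 }, (_, i) => `word${i % 2000}`).join(' ');

function withIncludes(speech) {
    // Linear scan of wordList for every word in the input.
    return speech.split(' ').filter(w => !wordList.includes(w)).join(' ');
}

function withSet(speech) {
    // Hash lookup for every word in the input.
    return speech.split(' ').filter(w => !wordSet.has(w)).join(' ');
}

function timeIt(label, fn) {
    const start = performance.now();
    const output = fn(input);
    const elapsed = performance.now() - start;
    console.log(`${label}: ${elapsed.toFixed(2)} ms`);
    return output;
}

const includesOutput = timeIt('includes', withIncludes);
const setOutput = timeIt('Set.has', withSet);
// Both variants should produce the same string; only the timing differs.
console.log(includesOutput === setOutput);
```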

2 Comments

indexOf is the same as includes and will consume much time on large sets
Thanks for that. Using set now.
1

The problem with using whitespace as the word separator is that it will most likely fail as soon as the text contains punctuation, for example:

'the, Beattles'.split(' ').includes('the')
//=> false

Instead you should use \b (word boundary):

const excludes = ['the', 'in', 'a', 'an'];
const re = new RegExp('\\b(?:'+excludes.join('|')+')\\b', 'g');

console.log("the.wall.into".replace(re, ''));
console.log("paintings covered the, wall another words into this".replace(re, ''));
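For a long word list, one possible alternative to a large alternation is to match every word with \w+ and check each match against a Set inside a replacer callback. This is only a sketch: excludedSet and stripExcluded are made-up names, and \w covers ASCII letters, digits and underscore only, so accented words would need a different pattern.

```javascript
const excludedSet = new Set(['the', 'in', 'a', 'an']);

function stripExcluded(text) {
    // Each run of word characters is looked up in the Set;
    // punctuation and spaces are left untouched.
    return text.replace(/\w+/g, word => (excludedSet.has(word) ? '' : word));
}

console.log(stripExcluded("the.wall.into")); // prints ".wall.into"
console.log(stripExcluded("paintings covered the, wall another words into this"));
// prints "paintings covered , wall another words into this"
```

As with the regex above, leftover punctuation and spaces still need a separate cleanup step if a tidy sentence is required.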
