0

I am having some trouble when I try to compare two strings. My first string is a word and my second string are some letters that form my word, or not, for example:

String 1, my word: "test"
String 2, my soup: "adhesljdtth"

In this case, I got all the characters of both strings, and start to process them, when I found some char that belongs to my word in my soup, I need to remove it from my soup, and go to the next element.

I found some ways to compare it and get the results using: StringBuilder, LinkedList, arrays and so on, all work with small strings, but when I get a string with a million of characters, I got some performance problems. I tried to use Binary Search in this case, but even this is taking so long to process my results.

I am using Array.sort function to sort both of my strings.

And to verify if soup has all the letters to form my word, I am doing this:

for (int i = 0; i < wordLenght; i++) {
    char key = wordCharList[i];
    int length = soupCharList.size();
    int low = 0;
    int high = length - 1;

    while (low <= high) {

        int mid = (low + high) >>> 1;
        char midVal = soupCharList.get(mid);

        if (midVal < key) {
            low = mid + 1;
        }
        else if (midVal > key) {
            high = mid - 1;
        }
        else if(midVal == key) {
            soupCharList.remove(mid);
            break;
        }
        if(high == -1) {   
            return false;
        }
    }
}
    return true;
}

Do you have any ideas how to compare it with a better performance?

2

1 Answer 1

0

I try to compare two strings

For comparing strings, use String#compare. Obviously, you're doing something else, so name it properly.

I found some ways to compare it and get the results using: StringBuilder, LinkedList, arrays and so on, all work with small strings, but when I get a string with a million of characters

None of these data structures have fast lookup. Use Set or Map for this.

  • When you want to know, if the soup contains all characters from the word, use Set#containsAll.
  • When you want to know, if the soup contains all characters from the word with a sufficient number of occurrences, use a Map<Character, Integer>.
  • For counting, Guava Multiset<Character> is easier to use.

As the number of characters are bounded to a small value, you can use arrays containing the counts. This is not very general, but it's extremely simple and extremely fast:

int[] wordCounts = makeCounts(word);
int[] soupCounts = makeCounts(soup);
for (int i=0; i<wordCounts.length; ++i) {
    if (wordCounts[i] > soupCount[i]) return false;
}
return true;

int[] makeCounts(String s) {
    int[] result = new int[Character.MAX_VALUE + 1];
    for (int i=0; i<s.length(); ++i) ++result[s.charAt(i)];
    return result;
}

As your strings don't use all characters, there are optimizations possible.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.