Permutation of string as substring of another

Question

Given a string A and another string B. Find whether any permutation of B exists as a substring of A.

For example,

if A = "encyclopedia"

if B="dep" then return true as ped is a permutation of dep and ped is a substring of A.

My solution->

if length(A)=n and length(B)=m

I did this in 0((n-m+1)*m) by sorting B and then checking A 
with window size of m each time.

I need to find a better and a faster solution.

That's already a good approach. A slightly faster approach is to simply count the frequency of each character in B, and then see whether those counts match the counts in each window of A that you consider. — j_random_hacker
– j_random_hacker, Commented Mar 29, 2014 at 21:57
You can easily update B's frequency vector in O(1) time per window with this approach -- just subtract a count of 1 for the outgoing character, and add a count of 1 for the incoming one. — j_random_hacker
– j_random_hacker, Commented Mar 29, 2014 at 21:58
1. Build up in freqB[i] the number of times character i appears in B. (E.g. in your example, freqB['d'] == freqB['e'] == freqB['p'] == 1, and freqB[i] == 0 for all other characters i.) 2. For each length-m window of A, do the same, but storing them in freqA[], and then check whether, for each character i, freqA[i] == freqB[i]. If so, you have a match. To move from the length-m window starting at position j to the next one, you'll need to do --freqA[A[j]] and ++freqA[A[j+m]]. — j_random_hacker
– j_random_hacker, Commented Mar 29, 2014 at 22:13

rsaha77 · Accepted Answer · 2022-10-19 18:48:16Z

12

There is a simpler solution to this problem.

Here: n = A.size (), m = B.size ()

The idea is to use hashing.

First we hash the characters of string B.

Suppose: B = "dep"

hash_B ['d'] = 1;
hash_B ['e'] = 1;
hash_B ['p'] = 1;

Now we run a loop over the string 'A' for each window of size 'm'.

Suppose: A = "encyclopedia"

First window of size 'm' will have characters {e, n, c}. We will hash them now.

win ['e'] = 1
win ['n'] = 1
win ['c'] = 1

Now we check if the frequency of each character from both the arrays (hash_B [] and win []) are same. Note: Maximum size of hash_B [] or win [] is 26, hence the complexity would be O(26 * N) ~ O(N).

If they are not same we shift our window.

After shifting the window we decrease the count of win ['e'] by 1 and increase the count of win ['y'] by 1.

win ['n'] = 1
win ['c'] = 1
win ['y'] = 1

During the seventh shift, the status of your win array is:

win ['p'] = 1;
win ['e'] = 1;
win ['d'] = 1;

which is same as the hash_B array. So, Print "SUCCESS" and exit.

edited Oct 19, 2022 at 18:48

answered Sep 9, 2014 at 22:43

rsaha77

1632 silver badges14 bronze badges

Sign up to request clarification or add additional context in comments.

4 Comments

Katie Over a year ago

Thank you. This is the clearest answer for me. It's the same window method as described above but with better examples. One question: Why 26 in the O calculation?

rsaha77 Over a year ago

@KatieSissons Thanks for pointing that out. Edited the answer :)

them Over a year ago

why you call it linear time? This is not linear time, the time here is O(|B| * |A|), there is another solution with O(|A| + |B|).

Vladimir Buskin Over a year ago

@them, yes and if B is a substring of A means smaller string, 0(|A| + |B|) is simplified to 0(|A|), just for fun

Ephraim · Accepted Answer · 2015-10-25 02:32:14Z

If I only have to worry about ASCII characters, it can be done in O(n) time with O(1) space. My code also prints the permutations out, but can be easily modified to simply return true at the first instance instead. The main part of the code is located in the printAllPermutations() method. Here is my solution:

Some Background

This is a solution that I came up with, it is somewhat similar to the idea behind the Rabin Karp Algorithm. Before I understanding the algorithm, I will explain the math behind it as follows:

Let S = {A_1, ..., A_n} be a multiset list of size N that contains only prime numbers. Let the sum of the numbers in S equal some integer Q. Then S is the only possible entirely prime multiset of size N, whose elements can sum to Q.

Because of this, we know we can map every character to a prime number. I propose a map as follows:

1 -> 1st prime
2 -> 2nd prime
3 -> 3rd prime
...
n -> nth prime

If we do this (which we can because ASCII only has 256 possible characters), then it becomes very easy for us to find each permutation in the larger string B.

The Algorithm:

We will do the following:

1: calculate the sum of the primes mapped to by each of the characters in A, let's call it smallHash.

2: create 2 indices (righti and lefti). righti is initialized to zero, and lefti is initialzed to the size of A.

ex:     |  |
        v  v
       "abcdabcd"
        ^  ^
        |  |

3: Create a variable currHash, and initialize it to the sum of the corresponding prime numbers mapped to by each of the characters in B, between (inclusive) righti, and lefti - 1.

4: Iterate both righti and lefti by 1, each time updating currHash by subtracting the prime mapped from the character that is no longer in the range (lefti - 1) and adding the prime corresponding to the character just added to the range (righti)

5: Each time currHash is equal to smallHash, the characters in the range must be a permutation. So we print them out.

6: Continue until we have reached the end of B. (When righti is equal to the length of B)

This solution runs in O(n) time complexity and O(1) space.

The Actual Code:

public class FindPermutationsInString {
    //This is an array containing the first 256 prime numbers
    static int primes[] = 
          {
            2,     3,     5,     7,    11,    13,    17,    19,    23,    29,
            31,    37,    41,    43,    47,    53,    59,    61,    67,    71,
            73,    79,    83,    89,    97,   101,   103,   107,   109,   113,
            127,   131,   137,   139,   149,   151,   157,   163,   167,   173,
            179,   181,   191,   193,   197,   199,   211,   223,   227,   229,
            233,   239,   241,   251,   257,   263,   269,   271,   277,   281,
            283,   293,   307,   311,   313,   317,   331,   337,   347,   349,
            353,   359,   367,   373,   379,   383,   389,   397,   401,   409,
            419,   421,   431,   433,   439,   443,   449,   457,   461,   463,
            467,   479,   487,   491,   499,   503,   509,   521,   523,   541,
            547,   557,   563,   569,   571,   577,   587,   593,   599,   601,
            607,   613,   617,   619,   631,   641,   643,   647,   653,   659,
            661,   673,   677,   683,   691,   701,   709,   719,   727,   733,
            739,   743,   751,   757,   761,   769,   773,   787,   797,   809,
            811,   821,   823,   827,   829,   839,   853,   857,   859,   863,
            877,   881,   883,   887,   907,   911,   919,   929,   937,   941,
            947,   953,   967,   971,   977,   983,   991,   997,  1009,  1013,
           1019,  1021,  1031,  1033,  1039,  1049,  1051,  1061,  1063,  1069,
           1087,  1091,  1093,  1097,  1103,  1109,  1117,  1123,  1129,  1151,
           1153,  1163,  1171,  1181,  1187,  1193,  1201,  1213,  1217,  1223,
           1229,  1231,  1237,  1249,  1259,  1277,  1279,  1283,  1289,  1291,
           1297,  1301,  1303,  1307,  1319,  1321,  1327,  1361,  1367,  1373,
           1381,  1399,  1409,  1423,  1427,  1429,  1433,  1439,  1447,  1451,
           1453,  1459,  1471,  1481,  1483,  1487,  1489,  1493,  1499,  1511,
           1523,  1531,  1543,  1549,  1553,  1559,  1567,  1571,  1579,  1583,
           1597,  1601,  1607,  1609,  1613,  1619
          };

    public static void main(String[] args) {
        String big = "abcdabcd";
        String small = "abcd";
        printAllPermutations(big, small);
    }

    static void printAllPermutations(String big, String small) {

        // If the big one is smaller than the small one,
        // there can't be any permutations, so return
        if (big.length() < small.length()) return;

        // Initialize smallHash to be the sum of the primes
        // corresponding to each of the characters in small.
        int smallHash = primeHash(small, 0, small.length());

        // Initialize righti and lefti.
        int lefti = 0, righti = small.length();

        // Initialize smallHash to be the sum of the primes
        // corresponding to each of the characters in big.
        int currentHash = primeHash(small, 0, righti);

        while (righti <= big.length()) {
            // If the current section of big is a permutation
            // of small, print it out.
            if (currentHash == smallHash)
                System.out.println(big.substring(lefti, righti));

            // Subtract the corresponding prime value in position
            // lefti. Then increment lefti
            currentHash -= primeHash(big.charAt(lefti++));

            if (righti < big.length()) // To prevent index out of bounds
                // Add the corresponding prime value in position righti.
                currentHash += primeHash(big.charAt(righti));

            //Increment righti.
            righti++;
        }

    }

    // Gets the sum of all the nth primes corresponding
    // to n being each of the characters in str, starting
    // from position start, and ending at position end - 1.
    static int primeHash(String str, int start, int end) {
        int value = 0;
        for (int i = start; i < end; i++) {
            value += primeHash(str.charAt(i));
        }
        return value;
    }

    // Get's the n-th prime, where n is the ASCII value of chr
    static int primeHash(Character chr) {
        return primes[chr];
    }
}

Keep in mind, however, that this solution only works when the characters can only be ASCII characters. If we are talking about unicode, we start getting into prime numbers that exceed the maximum size of an int, or even a double. Also, I'm not sure that there are 1,114,112 known primes.

I love this answer, except for the part where you say you add/subtract the times, it's multiply/divide. Sum of 2 primes are not unique, product of 2 primes is.
Could you tell me what is the name of the math you using? I found 2 words which are not permutation of each other having the same hash. Please take a look at: pastebin.com/c2TuXENY
Is this not O(|A|+|B|), not O of a single variable? Calculating smallHash is O(|B|), and iterating through A is O(|A|). I like this solution though, hashing using primes and their products is elegant.
This is very elegant but unfortunately have to agree with @TronyTr here. Runnable Python code to show the collision: pastebin.com/BuvWeGVV. I know the radix-2p hash can be used to determine if two strings are permutations of each other but it's not possible to compute the radix-2p hash in a chained manner like this as you do in step 4 (each character's 2p must be raised to a different power depending on its position in the string).

rici · Accepted Answer · 2014-03-29 23:38:08Z

7

Building a little on the algorithm presented by j_random_hacker in comments, it is possible to find the match in O(|A|+|B|), as follows: (Note: throughout, we use |A| to mean "the length of A".)

Create an integer array count whose domain is the size of the alphabet, initialized to all 0s.
Set distance to 0
For each character B_i in B:
- Decrement count[B_i].
- If the previous count of count[B_i] was 0, also increment distance.
For each character A_i in A:
- Increment count[A_i]
- If i is greater than |B| decrement count[A_i-|B|].
- For each of the two count values modified, if the previous value was 0, then increment distance and if the new value is 0 then decrement distance.
- If the result is that distance is 0 then a match has been found.

Note: The algorithm presented by j_random_hacker is also O(|A|+|B]) because the cost of comparing freqA with freqB is O(|alphabet|), which is a constant. However, the above algorithm reduces the comparison cost to a small constant. In addition, it is theoretically possible to make this work even if the alphabet is not a constant size by using the standard trick for uninitialized arrays.

edited Mar 29, 2014 at 23:38

answered Mar 29, 2014 at 23:26

rici

243k30 gold badges263 silver badges364 bronze badges

3 Comments

j_random_hacker Over a year ago

I like it! So distance is the number of characters whose counts disagree in the substrings (from A and from B) being considered at the moment. And I think it's fair to make alphabet size a parameter k, so that my algorithm is O(k|A| + |B|) and yours is O(|A| + |B| + k).

tzs Over a year ago

|B| <= |A|, so it is meaningless to say O(|A| + |B|). It is simply O(|A|).

rici Over a year ago

@tzs: "Meaningless" seems a bit overstated. "Redundant" would be a reasonable criticism.

Patrick Trentin · Accepted Answer · 2019-10-11 15:49:30Z

This answer proposes a fixed implementation of the idea proposed by @Ephraim in his own answer.

The original answer confuses the multiplication property of a given set of primes with addition. The fixed statement would be:

Let S = {A_1, ..., A_n} be a multiset list of size N that contains only prime numbers. Let the product of the numbers in S equal some integer Q. Then S is the only possible entirely prime multiset of size N, whose elements can multiply to Q.

In order to avoid numerical overflows, the implementation uses infinite precision arithmetic based on the C++ library libgmpxx.

Under the assumption that the comparison between two numbers is in O(1), then the solution is in O(|A| + |B|). The previous assumption might actually not be the case for inputs where |B| is large enough. When |B| > |A| the function returns in O(1).

Example:

#include <iostream>
#include <string>
#include <gmpxx.h>

static int primes[] =
          {
            2,     3,     5,     7,    11,    13,    17,    19,    23,    29,
            31,    37,    41,    43,    47,    53,    59,    61,    67,    71,
            73,    79,    83,    89,    97,   101,   103,   107,   109,   113,
            127,   131,   137,   139,   149,   151,   157,   163,   167,   173,
            179,   181,   191,   193,   197,   199,   211,   223,   227,   229,
            233,   239,   241,   251,   257,   263,   269,   271,   277,   281,
            283,   293,   307,   311,   313,   317,   331,   337,   347,   349,
            353,   359,   367,   373,   379,   383,   389,   397,   401,   409,
            419,   421,   431,   433,   439,   443,   449,   457,   461,   463,
            467,   479,   487,   491,   499,   503,   509,   521,   523,   541,
            547,   557,   563,   569,   571,   577,   587,   593,   599,   601,
            607,   613,   617,   619,   631,   641,   643,   647,   653,   659,
            661,   673,   677,   683,   691,   701,   709,   719,   727,   733,
            739,   743,   751,   757,   761,   769,   773,   787,   797,   809,
            811,   821,   823,   827,   829,   839,   853,   857,   859,   863,
            877,   881,   883,   887,   907,   911,   919,   929,   937,   941,
            947,   953,   967,   971,   977,   983,   991,   997,  1009,  1013,
           1019,  1021,  1031,  1033,  1039,  1049,  1051,  1061,  1063,  1069,
           1087,  1091,  1093,  1097,  1103,  1109,  1117,  1123,  1129,  1151,
           1153,  1163,  1171,  1181,  1187,  1193,  1201,  1213,  1217,  1223,
           1229,  1231,  1237,  1249,  1259,  1277,  1279,  1283,  1289,  1291,
           1297,  1301,  1303,  1307,  1319,  1321,  1327,  1361,  1367,  1373,
           1381,  1399,  1409,  1423,  1427,  1429,  1433,  1439,  1447,  1451,
           1453,  1459,  1471,  1481,  1483,  1487,  1489,  1493,  1499,  1511,
           1523,  1531,  1543,  1549,  1553,  1559,  1567,  1571,  1579,  1583,
           1597,  1601,  1607,  1609,  1613,  1619
          };

mpz_class prime_hash(std::string const &str, size_t start, size_t end)
{
    mpz_class hash(1);
    for (size_t i = start; i < end; ++i) {
        hash *= primes[(size_t) str.at(i)];
    }
    return hash;
}

void print_all_permutations(std::string const &big, std::string const &small)
{
    const size_t big_len = big.length();
    const size_t small_len = small.length();

    if (small_len <= 0 || big_len < small_len) {
        // no possible permutation!
        return;
    }

    // O(small_len)
    mpz_class small_hash = prime_hash(small, 0, small_len);
    mpz_class curr_hash = prime_hash(big, 0, small_len);

    // O(big_len)
    size_t left_idx = 0;
    do {

        if (curr_hash == small_hash) {
            std::cout << "Permutation `" << big.substr(left_idx, small_len)
                      << "' of `" << small
                      << "' at index " << left_idx
                      << " of `" << big
                      << "'." << std::endl;
        }

        curr_hash /= primes[(size_t) big.at(left_idx)];

        if (left_idx + small_len < big_len) {
            curr_hash *= primes[(size_t) big.at(left_idx + small_len)];
        }

        ++left_idx;

    } while (left_idx + small_len < big_len);
}

int main(int argc, char *argv[])
{
    // @user2826957's input
    print_all_permutations("encyclopedia", "dep");

    // @Ephraim's input
    print_all_permutations("abcdabcd", "abcd");

    // @Sam's input
    print_all_permutations("cbabadcbbabbc", "abbc");

    // @butt0s input
    print_all_permutations("", "");
    print_all_permutations("", "a");
    print_all_permutations("a", "");
    print_all_permutations("a", "a");

    return 0;
}

The example is compiled with:

~$ g++ permstr.cpp -lgmpxx -lgmp -o run
~$ ./run
Permutation `ped' of `dep' at index 7 of `encyclopedia'.
Permutation `abcd' of `abcd' at index 0 of `abcdabcd'.
Permutation `bcda' of `abcd' at index 1 of `abcdabcd'.
Permutation `cdab' of `abcd' at index 2 of `abcdabcd'.
Permutation `dabc' of `abcd' at index 3 of `abcdabcd'.
Permutation `cbab' of `abbc' at index 0 of `cbabadcbbabbc'.
Permutation `cbba' of `abbc' at index 6 of `cbabadcbbabbc'.
Permutation `a' of `a' at index 0 of `a'.

note that 1619 * 1619 * 1619 > int.Max; this way may face operation overflows error
@ChrisPhan libgmpxx features INFINITE precision arithmetic, hence no overflow. It might become slow pretty quickly though. This was noted in my answer.

Sam · Accepted Answer · 2018-08-26 16:17:31Z

1

My approach is first give yourself a big example such as

a: abbc b: cbabadcbbabbc Then literally go through and underline each permutation a: abbc b: cbabadcbbabbc '__' '__' '__' Therefore For i-> b.len: sub = b.substring(i,i+len) isPermuted ? Here is code in java

class Test {
  public static boolean isPermuted(int [] asciiA, String subB){
    int [] asciiB = new int[26];

    for(int i=0; i < subB.length();i++){
      asciiB[subB.charAt(i) - 'a']++;
    }
    for(int i=0; i < 26;i++){
        if(asciiA[i] != asciiB[i])
        return false;
    }
    return true;
  }
  public static void main(String args[]){
    String a = "abbc";
    String b = "cbabadcbbabbc";
    int len = a.length();
    int [] asciiA = new int[26];
    for(int i=0;i<a.length();i++){
      asciiA[a.charAt(i) - 'a']++;
    }
    int lastSeenIndex=0;
    for(int i=0;i<b.length()-len+1;i++){
      String sub = b.substring(i,i+len);
      System.out.printf("%s,%s\n",sub,isPermuted(asciiA,sub));
} }
}

answered Aug 26, 2018 at 16:17

Sam

6615 silver badges5 bronze badges

2 Comments

Priyanka Taneja Over a year ago

What will be the complexity of this solution?

Vladimir Buskin Over a year ago

Time complexity is O(A + B * (A + A)). 1st A - filling "asciiA", 2nd A - making substring "sub" (can be optimized by sending indexStart, indexEnd, so we don't need to copy), 3rd A - filling "asciiB". And 0(A + B * (A + A)) simplifies to O(B*A). A - length of a, B - length of b. Additional space complexity is O(1).

Adil Soomro · Accepted Answer · 2018-09-06 13:09:07Z

1

The below function will return true if the String B is a permuted substring of String A.

public boolean isPermutedSubstring(String B, String A){
    int[] arr = new int[26];

    for(int i = 0 ; i < A.length();++i){
        arr[A.charAt(i) - 'a']++;
    }
    for(int j=0; j < B.length();++j){
        if(--arr[B.charAt(j)-'a']<0) return false;
    }
    return true;
}

edited Sep 6, 2018 at 13:09

Adil Soomro

37.7k9 gold badges105 silver badges155 bronze badges

answered Jul 19, 2017 at 17:14

akuriako

4475 silver badges7 bronze badges

Comments

atertmac · Accepted Answer · 2020-04-17 13:45:52Z

1

The idea is clear in above talkings. An implementation with O(n) time complexity is below.

#include <stdio.h>
#include <string.h>

const char *a = "dep";
const char *b = "encyclopediaped";

int cnt_a[26];
int cnt_b[26];

int main(void)
{
    const int len_a = strlen(a);
    const int len_b = strlen(b);

    for (int i = 0; i < len_a; i++) {
            cnt_a[a[i]-'a']++;
            cnt_b[b[i]-'a']++;
    }

    int i;
    for (i = 0; i < len_b-len_a; i++) {
            if (memcmp(cnt_a, cnt_b, sizeof(cnt_a)) == 0)
                    printf("%d\n", i);
            cnt_b[b[i]-'a']--;
            cnt_b[b[i+len_a]-'a']++;
    }
    if (memcmp(cnt_a, cnt_b, sizeof(cnt_a)) == 0)
            printf("%d\n", i);

    return 0;
}

edited Apr 17, 2020 at 13:45

atertmac

33 bronze badges

answered Sep 16, 2016 at 5:12

cyb70289

777 bronze badges

Comments

gstanlo · Accepted Answer · 2017-11-22 04:46:12Z

Here's a solution that's pretty much rici's answer. https://wandbox.org/permlink/PdzyFvv8yDf3t69l It allocates a little more than 1k stack memory for the frequency table. O(|A| + |B|), no heap allocations.

#include <string>

bool is_permuted_substring(std::string_view input_string,
                           std::string_view search_string) {
  if (search_string.empty()) {
    return true;
  }

  if (search_string.length() > input_string.length()) {
    return false;
  }

  int character_frequencies[256]{};
  auto distance = search_string.length();
  for (auto c : search_string) {
    character_frequencies[(uint8_t)c]++;
  }

  for (auto i = 0u; i < input_string.length(); ++i) {
    auto& cur_frequency = character_frequencies[(uint8_t)input_string[i]];
    if (cur_frequency > 0) distance--;
    cur_frequency--;

    if (i >= search_string.length()) {
      auto& prev_frequency = ++character_frequencies[(
          uint8_t)input_string[i - search_string.length()]];
      if (prev_frequency > 0) {
        distance++;
      }
    }

    if (!distance) return true;
  }

  return false;
}

int main() {
  auto test = [](std::string_view input, std::string_view search,
                 auto expected) {
    auto result = is_permuted_substring(input, search);
    printf("%s: is_permuted_substring(\"%.*s\", \"%.*s\") => %s\n",
           result == expected ? "PASS" : "FAIL", (int)input.length(),
           input.data(), (int)search.length(), search.data(),
           result ? "T" : "F");
  };

  test("", "", true);
  test("", "a", false);
  test("a", "a", true);
  test("ab", "ab", true);
  test("ab", "ba", true);
  test("aba", "aa", false);
  test("baa", "aa", true);
  test("aacbba", "aab", false);
  test("encyclopedia", "dep", true);
  test("encyclopedia", "dop", false);

  constexpr char negative_input[]{-1, -2, -3, 0};
  constexpr char negative_search[]{-3, -2, 0};
  test(negative_input, negative_search, true);

  return 0;
}

Yawar Murtaza · Accepted Answer · 2018-12-23 21:17:46Z

I am late to this party...

The question is also discussed in the book named Cracking the Coding Interview, 6th Edition on page number 70. The auther says there is a possiblity of finding all permutations using O(n) time complexity (linear) but she doesnt write the algorithm so I thought I should give it a go.

Here is the C# solution just in case if someone was looking...

Also, I think (not 100% sure) it finds the count of permutations using O(n) time complexity.

public int PermutationOfPatternInString(string text, string pattern)
{
    int matchCount = 0;
    Dictionary<char, int> charCount = new Dictionary<char, int>();
    int patLen = pattern.Length;
    foreach (char c in pattern)
    {
        if (charCount.ContainsKey(c))
        {
            charCount[c]++;
        }
        else
        {
            charCount.Add(c, 1);
        }
    }

    var subStringCharCount = new Dictionary<char, int>();

    // loop through each character in the given text (string)....
    for (int i = 0; i <= text.Length - patLen; i++)
    {
        // check if current char and current + length of pattern-th char are in the pattern.
        if (charCount.ContainsKey(text[i]) && charCount.ContainsKey(text[i + patLen - 1]))
        {
            string subString = text.Substring(i, patLen);
            int j = 0;
            for (; j < patLen; j++)
            {
                // there is no point going on if this subString doesnt contain chars that are in pattern...
                if (charCount.ContainsKey(subString[j]))
                {
                    if (subStringCharCount.ContainsKey(subString[j]))
                    {
                        subStringCharCount[subString[j]]++;
                    }
                    else
                    {
                        subStringCharCount.Add(subString[j], 1);
                    }
                }
                else
                {
                    // if any of the chars dont appear in the subString that we are looking for
                    // break this loop and continue...
                    break;
                }
            }

            int x = 0;

            // this will be true only when we have current subString's permutation count
            // matched with pattern's.
            // we need this because the char count could be different 
            if (subStringCharCount.Count == charCount.Count)
            {
                for (; x < patLen; x++)
                {
                    if (subStringCharCount[subString[x]] != charCount[subString[x]])
                    {
                        // if any count dont match then we break from this loop and continue...
                        break;
                    }
                }
            }

            if (x == patLen)
            {
                // this means we have found a permutation of pattern in the text...
                // increment the counter.
                matchCount++;
            }

            subStringCharCount.Clear(); // clear the count map.
        }
    }

    return matchCount;
}

and here is the unit test method...

[TestCase("encyclopedia", "dep", 1)]
[TestCase("cbabadcbbabbcbabaabccbabc", "abbc", 7)]
[TestCase("xyabxxbcbaxeabbxebbca", "abbc", 2)]
public void PermutationOfStringInText(string text, string pattern, int expectedAnswer)
{
    int answer = runner.PermutationOfPatternInString(text, pattern);
    Assert.AreEqual(expectedAnswer, answer);
}

Eric Aya · Accepted Answer · 2019-02-22 08:34:38Z

Taking O(TEXT.length) runtime complexity.

Some of these solutions will not work when average of calculated text value matches average of calculated pattern value. Ex - uw and vv. Though they are not matching, some solutions above still returns TRUE.

public static void main(String[] args) {
    String pattern = "dep";
    String text = "encyclopedia";
    System.out.println(isPermutationAvailable(pattern, text));
}

public static boolean isPermutationAvailable(String pattern, String text) {
    if (pattern.length() > text.length()) {
        return false;
    }
    int[] patternMap = new int[26];
    int[] textMap = new int[26];
    for (int i = 0; i < pattern.length(); i++) {
        patternMap[pattern.charAt(i) - 'a']++;
        textMap[text.charAt(i) - 'a']++;
    }
    int count = 0;
    for (int i = 0; i < 26; i++) {
        if (patternMap[i] == textMap[i]) {
            count++;
        }
    }
    for (int i = 0; i < text.length() - pattern.length(); i++) {
        int r = text.charAt(i + pattern.length()) - 'a';
        int l = text.charAt(i) - 'a';
        if (count == 26) {
            return true;
        }

        textMap[r]++;
        if (textMap[r] == patternMap[r]) {
            count++;
        }
        else if (textMap[r] == patternMap[r] + 1) {
            count--;
        }

        textMap[l]--;
        if (textMap[l] == patternMap[l]) {
            count++;
        }
        else if (textMap[l] == patternMap[l] - 1) {
            count--;
        }
    }
    return count == 26;
}

user2937812 · Accepted Answer · 2020-12-09 12:58:17Z

Based on checking the longer string with a window size of short one. Since permutation only changes the position of a character, sorted string will always be the same.


        #include <iostream>
        #include <bits/stdc++.h>
        using namespace std;
        
        int main ()
        {
          string shortone =  "abbc";
          string longone ="cbabadcbbabbcbabaabccbabc";
        
          int s_length = shortone.length ();
          int l_length = longone.length ();
          string sub_string;
          string unsorted_substring; // only for printing
        
          // sort the short one
          sort (shortone.begin (), shortone.end ());
        
          if (l_length > s_length)
            {
                      for (int i = 0; i <= l_length - s_length; i++){
                    
                          sub_string = "";
                          
                          //Move the window
                          sub_string = longone.substr (i, s_length);
                          unsorted_substring=sub_string;
                          
                          // sort the substring window 
                          sort (sub_string.begin (), sub_string.end ());
                          if (shortone == sub_string)
                            {
                              cout << "substring is :" << unsorted_substring << '\t' <<
                            "match found at the position: " << i << endl;
                            }
                
                
                    }
            }
          return 0;
        }

Collectives™ on Stack Overflow

Permutation of string as substring of another

11 Answers 11

4 Comments

Some Background

The Algorithm:

The Actual Code:

5 Comments

3 Comments

2 Comments

My approach is first give yourself a big example such as

2 Comments

Comments

Comments

Comments

Comments

Comments

1 Comment

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

11 Answers 11

4 Comments

Some Background

The Algorithm:

The Actual Code:

5 Comments

3 Comments

2 Comments

My approach is first give yourself a big example such as

2 Comments

Comments

Comments

Comments

Comments

Comments

1 Comment

Your Answer

Sign up or log in

Post as a guest

Linked

Related