Skip to main content
Spelling and grammar
Source Link
Toby Speight
  • 88.7k
  • 14
  • 104
  • 327

So I've been working on this for a few days actually. I implemented a solution to the LeetCode problem: 567. Permutation in String

Given two strings s1(queryStr) and s2(sourceStr), return true if s2 contains a permutation of s1, or false otherwise.In other words, return true if one of s1's permutations is thea substring of s2.

  1. Make it as readable as possible, as if I were doing this at work (we prioritize readability over the typical 1 letter variable-letter-variable nonsense you see in LeetCode submissions).
  2. It needs to still be fast.
  3. If itsit's slow, learn why and optimize.

Okay so forFor my algorithm, I ended up using the sliding window method. If you're unfamiliar with that, basically you keep track of a "window" of letters you can actually see. You check if that window is a permutation, and then if not you just slide it up by 1.

  • queryString size = 38
    sourceStr size = 10^810⁸
    Average time to find permutation = 11832ms

  • queryString size = 38
    sourceStr size = 10^710⁷
    Average time to find permutation = 1235ms

  • queryString size = 38
    sourceStr size = 10^610⁶
    Average time to find permutation = 118ms

  • queryString size = 38
    sourceStr size = 10^410⁴
    Average time to find permutation = 1ms

  • queryString size = 8
    sourceStr size = 10^810⁸
    Average time to find permutation = 23ms

All of these test cases pass and the solution is accepted by LeetCode. But, according to LeetCode, my solution falls in the bottom 10% of solutions for CPU speed. And while their method of speed benchmarking is terrible, itsit's consistently at the bottom.

Given the data, its clear that time increases with the size of query string. This makes sense if you look at IsWindowPermutation. I'm iterating the size of window to check whether it is a permutation. This means for small windows itsit's very fast, but for larger windows itsit's slower.

So I've been working on this for a few days actually. I implemented a solution to the LeetCode problem: 567. Permutation in String

Given two strings s1(queryStr) and s2(sourceStr), return true if s2 contains a permutation of s1, or false otherwise.In other words, return true if one of s1's permutations is the substring of s2.

  1. Make it as readable as possible, as if I were doing this at work (we prioritize readability over the typical 1 letter variable nonsense you see in LeetCode submissions)
  2. It needs to still be fast.
  3. If its slow, learn why and optimize.

Okay so for my algorithm, I ended up using the sliding window method. If you're unfamiliar with that, basically you keep track of a "window" of letters you can actually see. You check if that window is a permutation, and then if not you just slide it up by 1.

  • queryString size = 38
    sourceStr size = 10^8
    Average time to find permutation = 11832ms

  • queryString size = 38
    sourceStr size = 10^7
    Average time to find permutation = 1235ms

  • queryString size = 38
    sourceStr size = 10^6
    Average time to find permutation = 118ms

  • queryString size = 38
    sourceStr size = 10^4
    Average time to find permutation = 1ms

  • queryString size = 8
    sourceStr size = 10^8
    Average time to find permutation = 23ms

All of these test cases pass and the solution is accepted by LeetCode. But, according to LeetCode, my solution falls in the bottom 10% of solutions for CPU speed. And while their method of speed benchmarking is terrible, its consistently at the bottom.

Given the data, its clear that time increases with the size of query string. This makes sense if you look at IsWindowPermutation. I'm iterating the size of window to check whether it is a permutation. This means for small windows its very fast, but for larger windows its slower.

I implemented a solution to the LeetCode problem: 567. Permutation in String

Given two strings s1(queryStr) and s2(sourceStr), return true if s2 contains a permutation of s1, or false otherwise.In other words, return true if one of s1's permutations is a substring of s2.

  1. Make it as readable as possible, as if I were doing this at work (we prioritize readability over the typical 1-letter-variable nonsense you see in LeetCode submissions).
  2. It needs to still be fast.
  3. If it's slow, learn why and optimize.

For my algorithm, I ended up using the sliding window method. If you're unfamiliar with that, basically you keep track of a "window" of letters you can actually see. You check if that window is a permutation, and then if not you just slide it up by 1.

  • queryString size = 38
    sourceStr size = 10⁸
    Average time to find permutation = 11832ms

  • queryString size = 38
    sourceStr size = 10⁷
    Average time to find permutation = 1235ms

  • queryString size = 38
    sourceStr size = 10⁶
    Average time to find permutation = 118ms

  • queryString size = 38
    sourceStr size = 10⁴
    Average time to find permutation = 1ms

  • queryString size = 8
    sourceStr size = 10⁸
    Average time to find permutation = 23ms

All of these test cases pass and the solution is accepted by LeetCode. But, according to LeetCode, my solution falls in the bottom 10% of solutions for CPU speed. And while their method of speed benchmarking is terrible, it's consistently at the bottom.

Given the data, its clear that time increases with the size of query string. This makes sense if you look at IsWindowPermutation. I'm iterating the size of window to check whether it is a permutation. This means for small windows it's very fast, but for larger windows it's slower.

added 221 characters in body
Source Link

The Problem

Given two strings s1(queryStr) and s2(sourceStr), return true if s2 contains a permutation of s1, or false otherwise.In other words, return true if one of s1's permutations is the substring of s2.

The Problem

Given two strings s1(queryStr) and s2(sourceStr), return true if s2 contains a permutation of s1, or false otherwise.In other words, return true if one of s1's permutations is the substring of s2.

added 54 characters in body
Source Link

Method 1 takes 100 nanoseconds to perform the each permutation check by indexing with an offset. Method 2 takes 0 - 100 nanoseconds.

So, I think what I learned is thatMaybe indexing into an array by performing some sortwith the output of some arithmetic is verysuper expensive. I'm going? Either way I submitted this optimization to try making the lookup table size 256 soLeetCode and am now faster than 97% of all submissions. So, I'd call that no offset is neededgood enough. ItI think the swings now are just means that you won't see agoing to be down to the way they measure speed gain until queryStrwhich is over 256 lettersnotoriously inaccurate.

Method 1 takes 100 nanoseconds to perform the each permutation check by indexing with an offset. Method 2 takes 0 nanoseconds.

So, I think what I learned is that indexing into an array by performing some sort of arithmetic is very expensive. I'm going to try making the lookup table size 256 so that no offset is needed. It just means that you won't see a speed gain until queryStr is over 256 letters.

Method 1 takes 100 nanoseconds to perform the each permutation check by indexing with an offset. Method 2 takes 0 - 100 nanoseconds.

Maybe indexing into array with the output of some arithmetic is super expensive? Either way I submitted this optimization to LeetCode and am now faster than 97% of all submissions. So, I'd call that good enough. I think the swings now are just going to be down to the way they measure speed which is notoriously inaccurate.

added 42 characters in body
Source Link
Loading
added 1976 characters in body
Source Link
Loading
added 238 characters in body
Source Link
Loading
added 16 characters in body; edited title
Source Link
Loading
added 16 characters in body; edited title
Source Link
Loading
Added time limit excedded tag.
Link
pacmaninbw
  • 26.2k
  • 13
  • 47
  • 114
Loading
Formatting improvements
Source Link
G. Sliepen
  • 69.5k
  • 3
  • 75
  • 180
Loading
Source Link
Loading