1

I'm struggling with understanding a concept of minimum supersequence of two given strings.

I understand how to print this supersequence from 2D dp table that we get after solving LCS problem for two given strings. I do not understand how we can prove that output we get is indeed minimum supersequence.

We need to prove that

  1. It's minimal - I guess it's obvious because we print all distinct elements of two strings and print elements of LCS only once, it's okay

  2. It's a supersequence - I do not understand why the output result we get mantains the correct order of supersequence

Can someone clarify it, please?

1 Answer 1

0

Since you don't specify the exact algorithm other than mentioning that it's DP table, I'm going to start from the beginning. Consider a procedure to check if S is a subsequence of T:

check(S, T):
    state = 0
    if length(S) == 0: return true
    for char c in T:
        if S[state] == c: state++
        if length(S) == state: return true
    return false

The internal state here is the length of the longest prefix of S that can be found as a subsequence of the currently processed prefix of T. It should be obvious why this is correct.

With 2 strings (S1, S2), we have pairs of states (state1, state2) and dynamic programming becomes: what is the shortest possible prefix of T such that the procedure above ends up with these state values when we run it independently on these 2 strings? Compared to listing all possible strings with a given length and calculating this pair of states for each of them, the DP only uses two optimisation ideas:

  • to find dp[state1][state2] from (prev_state1, prev_state2) for any previous state by adding any character, we only need any shortest prefix of T that gives (prev_state1, prev_state2), which is dp[prev_state1][prev_state2], because (look at check(S, T)) the next state doesn't depend on previous part of prefix of T, only on the previous state and the character being added
  • the only next characters c that are worth trying are S1[state1] and S2[state2], since anything else will give the same state with greater length

At the end, dp[length(S1)][length(S2)] gives the shortest prefix i.e. shortest superstring T for which both checks (Sx is a subsequence of T) say true. All that the DP transitions do is efficient simulation of these state-update loops for both strings.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.