2

We can reduce the Longest Common Subsequence problem to Longest Increasing Subsequence problem if at most one sequence has repetitions. The process of reducing the problem is explained here:

Suppose you have the sequences:

S1 = {D, B, A, C, E}
S2 = {E, Z, X, D, A, Y, C}

Then, create a sequence of integers S3, where you have to put the position of each element of S2 in S1 (if the element doesn't exists in S1, then ignore that element). In the example:

S3 = {4, 0, 2, 3}
// Z, X and Y in S2 where ignored

Then, just find the LIS in S3. To find the original elements, just use the integers in the LIS as indices of S1. For example, in this case the LIS of S3 is {0, 2, 3}, where that represents the sequence {D, A, C}.

How does this approach work? Why does this reduction solve the problem of finding the longest common subsequence?

2 Answers 2

2

Given how you build S3, you are guaranteed that the elements of S3 point to "only and all" the common elements of S1 and S2.

By using the positions and finding the longest increasing subsequence you make sure that what you find will be a subsequence of the original S1 and S2 and not just the number of elements they have in common:

  • It will be a subsequence of S1 because the numerical value of the elements in S3 encode the positions in S1, so increasing subsequence of S3 = subsequence of S1
  • It will be a subsequence of S2 because the elements encoded in the elements of S3 are in the same order as the elements of S2, so subsequence of S3 = subsequence of S2

Therefore, the longest increasing subsequence of S3 will "encode":

  • a subsequence of S1
  • a subsequence of S2
  • a sequence that contains only elements present in both S1 and S2
  • the longest of such subsequences

i.e. the longest common subsequence between S1 and S2

Which you can "decode" by using the process described, i.e. take the elements of S1 with index = element of S3.

NOTE: as pointed out in the link, this only works when at most one of the sequences has repetitions, and when building S3 you should take the sequence without repetitions as S1.

This seems like the inverse of the more common reduction of LIS to LCS.

Sign up to request clarification or add additional context in comments.

2 Comments

Could you please explain this part : "It will be a subsequence of S2 because the elements encoded in the elements of S3 are in the same order as the elements of S2, so subsequence of S3 = subsequence of S2". Thanks!
You build S3 by taking each element of S2, in order, and if it's also present in S1 then you add it to S3 (encoded). That means that S3 is just S2 with some elements removed (the non common ones), in encoded form; therefore each subsequence of S3 necessarily encodes a subsequence of S2. This does not hold for S1: a subsequence of S3 may not encode a subsequence of S1. That's why for S1 i highlighted the increasing part, while for S2 I highlighted the subsequence part of longest increasing subsequence.
0
i made a code working on all sequences whether both have repetitions or not.
import bisect

#Solving by replacing LCS with LIS
# Core idea: Save the character indices of text1, list the indices in order of text2, and calculate LIS

def longestCommonSubsequence(text1: str, text2: str) -> int:
# Step 1: Create a map to record all the character positions in text1# Reverse sort the index list for each character to ensure the correct order when applying LIS
pos_map = {}
for i, char in enumerate(text1):
    if char not in pos_map:
        pos_map[char] = []
        pos_map[char].append(i)
for char in pos_map:
    pos_map[char].reverse()
# step 2: Create an array of positions in text1 in the order of text2
sequence = []
for char in text2:
    if char in pos_map:
        sequence.extend(pos_map[char])

return lis(sequence)
# step3:Finding LIS from the positions array
def lis(arr: list) -> int:
    if not arr:
        return 0
    tails = [] 
    
    for num in arr:
        idx = bisect.bisect_left(tails, num)
        
        if idx == len(tails):
            tails.append(num)
        else:
            tails[idx] = num
           
    return len(tails)
print(longestCommonSubsequence('abcaaabcabc', 'abcaabc')) #7

1 Comment

Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.