4

I want to find an algorithm which can sort (or arrange) array of the strings such that if any string (say B) is sub-string of any other string (say ABAC) than that B should come after ABAC. e.g. :

suppose the string are :

abc
bc
zef
abcde

then order will be :

abcde, 
abc, 
bc 
and zef can come anywhere in the order.
4
  • 1
    What programming language do you like the code to be in? Also, in the event where, say, string "bc" is a substring of two string, where would you like "bc" to go. For example, there are 3 strings, "abc", "dbc", and "bc". Where would you like "bc" to go in that case? Commented Oct 2, 2019 at 5:28
  • "bc" could go anywhere after "abc" and "dbc" but "abc" and "dbc" order doesn't matter. Commented Oct 2, 2019 at 5:52
  • @KevinNg any programming language or pseudo code will work as long as i am able to understand how algorithm works like C,C++, python, Java, php etc. Commented Oct 2, 2019 at 6:03
  • Build a trie seems to be a good approach here. Commented Oct 2, 2019 at 6:19

1 Answer 1

1

Sort algorithms are based on comparing pairs of values. Often programming languages allow to provide the built-in sort-method with a comparator function, which should take two arguments, and return an integer value indicating their relative order (-1, 0 or 1).

So define the comparator as follows:

compare(a, b):
    if a is substring of b then return 1
    if b is substring of a then return -1
    if a < b then return -1
    if a > b then return 1
    return 0

This substring-test should first check the length of the two strings to potentially avoid a scan of the strings. Because when a.length > b.length, then a cannot be a substring of b. Or you could also explicitly write:

compare(a, b):
    if a.length <= b.length and a is substring of b then return 1
    if a.length >= b.length and b is substring of a then return -1
    if a < b then return -1
    if a > b then return 1
    return 0

If the target programming language does not offer this possibility, then you should write your own sorting function (like QuickSort), and make sure it can use such a comparator, so that (starting from a standard implementation) you would replace:

 if a < b

with:

 if compare(a, b) < 0

...etc.

Transitivity of the relationship

Let's assume for a moment that the relationship that is encoded in the compare function is not transitive, so that we could find three strings a, b and c for which:

  • compare(a, b) < 0
  • compare(b, c) < 0
  • but also: compare(c, a) <= 0

First, note what this says about the lengths of the three strings:

  • compare(a, b) < 0 implies that a.length >= b.length
  • compare(b, c) < 0 implies that b.length >= c.length
  • compare(c, a) <= 0 implies that c.length >= a.length

From the first two we conclude that a.length >= c.length, and combining that with the third, we can conclude all three strings have the same length.

So now we have:

  • compare(a, b) < 0 implies that a is alphabetically ordered before b
  • compare(b, c) < 0 implies that b is alphabetically ordered before c
  • compare(c, a) <= 0 implies that c is alphabetically ordered before a, or is equal to a.

This leads to a contradiction. And so we must conclude that the relationship is transitive.

Sign up to request clarification or add additional context in comments.

6 Comments

I don't think this will work as here you are using comparison sort which uses the transitive property of comparison which your comparative function doesn't follow.;
e.g. if three strings are A : "abc" , B : "bc" , C : "bcef" then from you comparison function : compare(A,B) < 0 compare(B,C) = 0 then sorting algorithm makes a assumption that compare(A,C) < 0 which clearly isn't true.
@Sagar, you may have a point, but I don't see it in your example. First of all, I had the return value reversed putting substrings first, which I have corrected now, so they come last. But so far I see no problem in the conclusion that the algorithm makes from earlier comparisons. compare(A, C) is really not determined by the OP's conditions: they could be ordered either way, since neither is a substring of the other. Do you have another counter example?
See the proof of transitivity I added to my answer.
sorry for the misunderstanding. i am also not able to find any counterexample so i have marked this as the solution.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.