24
$\begingroup$

I have a list and I want to find an appearance (in this particular case, the first) of any of several subsequences, possibly of different lengths. None of the subsequences is a subsequence of another. In my particular case I could do this by converting the list to a string and using StringPosition, but only because every element of my list was one character long. Before realizing this, I had implemented a far-from-one-liner that did the trick without resorting to strings. It didn't make any useless comparisons, but it did a lot of useless copying of the whole list, and it turned out to be 50 times slower than the StringPosition version. It could be improved to avoid that issue, at the cost of making it even less of a one-liner. The task seems too easy to describe to be this hard to program well... Is there an efficient way to do it in the general case? "Find the first appearance of one of many subsequences (possibly of different lengths, perhaps given as patterns, perhaps not) in a list."
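For reference, the string-based approach described above can be sketched roughly as follows. The sample `list` and `subs` are made up for illustration, and the trick only works when every element is a single-character string, since otherwise string positions no longer correspond to list positions.

```mathematica
(* hypothetical sample data: every element is a one-character string *)
list = {"a", "b", "c", "a", "b", "d"};
subs = {{"b", "c"}, {"a", "b", "d"}};

(* join the list and each subsequence into strings, then ask
   StringPosition for at most one occurrence (third argument 1) *)
StringPosition[StringJoin[list], StringJoin /@ subs, 1]
```

The result is a list of {start, end} character positions, which here coincide with positions in the original list.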

(Wow, I think I just thought of a good way. I'll give it a shot... If it works, I'll post it as a self-answer. But I'd still like your input; I'm afraid I'm missing some options.)

$\endgroup$
6
  • $\begingroup$ I asked something very similar here: stackoverflow.com/questions/8740033/… $\endgroup$ Commented Jan 29, 2012 at 14:40
  • $\begingroup$ @Szabolcs, thanks. I'll read it now. What should I do? Close this question? Or leave it because it hasn't been asked here? $\endgroup$ Commented Jan 29, 2012 at 14:42
  • $\begingroup$ @Rojo Leave it, people shouldn't be expected to check SO before posting. I posted my favourite solution as an answer, and credited the original answerer. $\endgroup$ Commented Jan 29, 2012 at 14:43
  • $\begingroup$ @Szabolcs, ok, for whatever reason I'll post my recent idea too, hehe, tell me what you think $\endgroup$ Commented Jan 29, 2012 at 14:51
  • 3
    $\begingroup$ For packed arrays, the fastest method I am aware of is the seqposC function from this answer: stackoverflow.com/questions/8364804/… $\endgroup$ Commented Jan 29, 2012 at 15:12

2 Answers

24
$\begingroup$

I asked the same question on Stack Overflow recently, and the answer that is now my favourite came from Jan Pöschko (shown here in modified form):

findSubsequence[list_, {ss__}] := 
  ReplaceList[list, {pre___, ss, ___} :> Length[{pre}] + 1]

This will find all positions of ss in list. Example:

findSubsequence[Range[50] ~Mod~ 17, {4, 5, 6}]

{4, 21, 38}

Despite using patterns, this solution runs very quickly, even for packed arrays. Please see the question I linked to for more possibilities.
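
Since the question asks for the first appearance of any of several subsequences, here is one possible way to build on findSubsequence. This is only a sketch: firstOfAny is a made-up helper name, not part of the original answer, and it computes all positions of every subsequence before taking the minimum, so it is not the most economical approach for long lists.

```mathematica
(* findSubsequence exactly as defined above, repeated so the example is self-contained *)
findSubsequence[list_, {ss__}] :=
  ReplaceList[list, {pre___, ss, ___} :> Length[{pre}] + 1]

(* firstOfAny (hypothetical name): earliest starting position among all
   candidate subsequences; Min flattens the per-subsequence position lists *)
firstOfAny[list_, subs : {__List}] :=
  Min[findSubsequence[list, #] & /@ subs]

firstOfAny[Range[50]~Mod~17, {{4, 5, 6}, {9, 10}}]
(* 4 *)
```

If none of the subsequences occurs, Min is applied to empty lists and returns Infinity, which can serve as a "not found" marker.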


A potentially useful generalization to other heads may be had with:

findSubsequence[list : h_[__], _[ss__]] :=
  ReplaceList[list, h[pre___, ss, ___] :> Length[{pre}] + 1]

Allowing such forms as:

x = Hold[1 + 1, 2 + 1, 3 + 1, 4 + 1, 2 + 1, 3 + 1, 1 + 1, 2 + 1, 3 + 1];

findSubsequence[x, Hold[2 + 1, 3 + 1]]

{2, 5, 8}

$\endgroup$
2
  • $\begingroup$ Very nice solution and interesting thread you linked to $\endgroup$ Commented Jan 29, 2012 at 15:07
  • 3
    $\begingroup$ Since you mentioned packed arrays: my function seqposC from this answer stackoverflow.com/questions/8364804/…, is 6-8 times faster than the method you described. The speed comparison is here: stackoverflow.com/questions/8740033/…. This is not to detract from the elegance of the latter. $\endgroup$ Commented Jan 29, 2012 at 15:10
22
$\begingroup$

Version 10.1 introduced a nice new function called SequencePosition. You can use it like this:

First /@ SequencePosition[Range[50]~Mod~17, {4, 5, 6}]

{4, 21, 38}

Time comparison:

t1 = findSubsequence[Range[5000]~Mod~17, {4, 5, 6}] // RepeatedTiming // First
t2 = First /@ SequencePosition[Range[5000]~Mod~17, {4, 5, 6}] // RepeatedTiming // First
t1/t2 (* about 28, i.e. SequencePosition is roughly 28 times faster here *)
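
For the original question (only the first appearance, and several candidate subsequences), something along these lines might work. It is a sketch relying on the optional third argument of SequencePosition, which limits the number of occurrences returned; the names list, subs and firstHits are just illustrative.

```mathematica
list = Range[50]~Mod~17;

(* third argument 1: return only the first occurrence, as {start, end} *)
SequencePosition[list, {4, 5, 6}, 1]
(* {{4, 6}} *)

(* several candidate subsequences: take the first hit of each,
   then keep the hit(s) with the smallest starting position *)
subs = {{4, 5, 6}, {9, 10}};
firstHits = Flatten[SequencePosition[list, #, 1] & /@ subs, 1];
MinimalBy[firstHits, First]
(* {{4, 6}} *)
```

If no subsequence occurs at all, firstHits is empty and MinimalBy returns an empty list.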
$\endgroup$
1
  • 6
    $\begingroup$ This should be the selected answer now. $\endgroup$ Commented Jun 16, 2015 at 14:01
