Clojure sub-sequence position in sequence

Question

Does Clojure provide any builtin way to find the position of a sub-sequence in a given sequence?

A. Webb · Accepted Answer · 2013-03-05 13:05:33Z


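Clojure's built-in Java interop makes this easy: java.util.Collections/indexOfSubList returns the index of the first occurrence of the sub-list, or -1 if it does not occur.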

(java.util.Collections/indexOfSubList '(a b c 5 6 :foo g h) '(5 6 :foo))
;=> 3
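If the interop call feels too verbose for application code, it can be hidden behind a small wrapper. The following is only a sketch: the name index-of-sub is made up for illustration, and the choice to return nil instead of -1 is an assumption about what callers would prefer. Note that vec realizes both arguments fully, so this is not suitable for infinite sequences.

```clojure
;; Hypothetical wrapper (the name index-of-sub is not part of Clojure or Java).
;; vec copies both arguments into vectors, which implement java.util.List.
(defn index-of-sub
  "Index of the first occurrence of sub in coll, or nil if sub does not occur."
  [coll sub]
  (let [i (java.util.Collections/indexOfSubList (vec coll) (vec sub))]
    ;; indexOfSubList signals 'not found' with -1; map that to nil
    (when-not (neg? i) i)))

(index-of-sub '(a b c 5 6 :foo g h) '(5 6 :foo)) ;=> 3
(index-of-sub [1 2 3] [4])                       ;=> nil
```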


5 Comments

Tudor Vintilescu Over a year ago

Thank you for your answer. That's what I'll use in the end, but I'm usually trying to avoid explicitly calling Java Interop from 'business' code, as I find it to be a little verbose. Thank you nevertheless.

NielsK Over a year ago

While this may work, be aware that a collection isn't a sequence.

A. Webb Over a year ago

@NielsK Philosophical notions aside, I think you'll find that seqs implement the java.util.List interface, and that the Java method takes a pair of java.util.Lists. As such, you can use this on lazy sequences (just be careful not to evaluate an infinite one), e.g. (java.util.Collections/indexOfSubList (range 10) (range 3 7)) ;=> 3, as well as on vectors. Maps and sorted-maps are not Lists themselves, so those need a seq call first.

NielsK Over a year ago

That's interesting to know. However, it seems it does work on lazy seqs, but the operation itself doesn't seem to be lazy. I tried both methods on (range 10000000) (range 999998 999999), and got a GC overhead limit on the Collections way, and the normal answer with my find-pos. So there must be more than purely philosophical notions.

A. Webb Over a year ago

@NielsK You are right. That does indeed appear to be trying to realize the entire range into memory when you disable the GC overhead check.

NielsK · Accepted Answer · 2013-03-05 13:58:40Z

A sequence is an abstraction, not a concretion. Certain concretions that you can use through the sequence abstraction have a way to find the position of a subsequence (strings and java collections, for instance), but sequences in general don't, because the underlying concretion doesn't have to have an index.

What you can do, however, is pair each element with its index (in effect, a juxt of the element identity and an index function). Have a look at map-indexed.
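As a quick illustration of that pairing, map-indexed with vector turns each element into an [index element] pair. The name indexed below is only there for the example:

```clojure
;; map-indexed calls (f index element) for each element, lazily
(def indexed (map-indexed vector [:a :b :c]))

indexed ;=> ([0 :a] [1 :b] [2 :c])
```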

Here's a naive implementation that will lazily find the position of (all) the subsequence(s) in a sequence. Just use first or take 1 to find only one:

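(defn find-pos
  "Returns a lazy seq of the indices at which sub occurs in sq."
  [sq sub]
  (->>
    (partition (count sub) 1 sq)   ; sliding windows the size of sub
    (map-indexed vector)           ; pair each window with its start index
    (filter #(= (second %) sub))   ; keep the windows equal to sub
    (map first)))                  ; return only the indices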

=> (find-pos  [:a :b \c 5 6 :foo \g :h]
                [\c 5 6 :foo])
(2)

=> (find-pos  "the quick brown fox"
                (seq "quick"))
(4)
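If only the first match is needed, the same sliding-window idea can be written with keep-indexed. This is a sketch of a variant, not a replacement for find-pos: the name first-pos is hypothetical, calling seq on sub (so a string can be passed directly) is a convenience I've assumed, and an empty sub simply yields nil. Because every step is lazy, it also terminates on an infinite input as long as a match exists.

```clojure
;; Hypothetical helper (first-pos is an illustrative name).
(defn first-pos
  "Index of the first occurrence of sub in sq, or nil if there is none."
  [sq sub]
  (let [sub (seq sub)   ; lets callers pass a string or vector directly
        n   (count sub)]
    (when (pos? n)      ; an empty sub yields nil
      (first
        ;; keep-indexed keeps only the non-nil results of (f index window)
        (keep-indexed (fn [i window] (when (= window sub) i))
                      (partition n 1 sq))))))

(first-pos "the quick brown fox" "quick") ;=> 4
(first-pos (range) [5 6 7])               ;=> 5
(first-pos [1 2 3] [9])                   ;=> nil
```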

Bear in mind that index-based algorithms generally aren't something you would reach for in a functional language. Unless there is a good reason to need the index in the final result, lavish use of index lookups is considered a code smell.
