6

Does Clojure provide any builtin way to find the position of a sub-sequence in a given sequence?

2 Answers 2

7

Clojure provides a builtin way for easy Java Interop.

(java.util.Collections/indexOfSubList '(a b c 5 6 :foo g h) '(5 6 :foo))
;=> 3
Sign up to request clarification or add additional context in comments.

5 Comments

Thank you for your answer. That's what I'll use in the end, but I'm usually trying to avoid explicitly calling Java Interop from 'business' code, as I find it to be a little verbose. Thank you nevertheless.
While this may work, be aware that a collection isn't a sequence.
@NielsK Philosophical notions aside, I think you'll find java.util.List as a superclass of a seq and that the java method is on pair of java.util.Lists. As such, you could use this on lazy sequences (just be careful not to evaluate an infinite one) (java.util.Collections/indexOfSubList (range 10) (range 3 7)) ;=> 3, vectors, sorted-maps, etc.
That's interesting to know. However, it seems it does work on lazy seqs, but the operation itself doesn't seem to be lazy. I tried both methods on (range 10000000) (range 999998 999999), and got a GC overhead limit on the Collections way, and the normal answer with my find-pos. So there must be more than purely philosophical notions.
@NielsK You are right. That does indeed appear to be trying to realize the entire range into memory when you disable the GC overhead check.
4

A sequence is an abstraction, not a concretion. Certain concretions that you can use through the sequence abstraction have a way to find the position of a subsequence (strings and java collections, for instance), but sequences in general don't, because the underlying concretion doesn't have to have an index.

What you can do however, is create a juxt of the element identity and an index function. Have a look at map-indexed.

Here's a naive implementation that will lazily find the position of (all) the subsequence(s) in a sequence. Just use first or take 1 to find only one:

(defn find-pos
  [sq sub]
  (->>
    (partition (count sub) 1 sq)
    (map-indexed vector)
    (filter #(= (second %) sub))
    (map first)))

=> (find-pos  [:a :b \c 5 6 :foo \g :h]
                [\c 5 6 :foo])
(2)

=> (find-pos  "the quick brown fox"
                (seq "quick"))
(4)

Take care that index-based algorithms generally aren't something you would do in a functional language. Unless there are good reasons you need the index in the final result, lavish use of index lookup is considered code smell.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.