
I'm trying to take user input and store it in a list, but instead of a list consisting of a single string, I want each word that is read in to be its own string. Example:

> (input)
This is my input. Hopefully this works

would return:

("this" "is" "my" "input" "hopefully" "this" "works")

Note that I don't want any spaces or punctuation in my final list.

Any input would be greatly appreciated.


5 Answers

23

split-sequence is the off-the-shelf solution.

You can also roll your own:

(defun my-split (string &key (delimiterp #'delimiterp))
  (loop :for beg = (position-if-not delimiterp string)
    :then (position-if-not delimiterp string :start (1+ end))
    :for end = (and beg (position-if delimiterp string :start beg))
    :when beg :collect (subseq string beg end)
    :while end))

where delimiterp checks whether you want to split on this character, e.g.

(defun delimiterp (c) (or (char= c #\Space) (char= c #\,)))

or

(defun delimiterp (c) (position c " ,.;/"))

PS. Looking at your expected return value, you seem to want to call string-downcase before my-split.
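A minimal sketch of that suggestion, assuming every non-alphanumeric character should count as a delimiter. The names split-words and word-delimiter-p are hypothetical; the loop is the same one used in my-split.

```lisp
(defun word-delimiter-p (c)
  "Hypothetical predicate: true for anything that is not a letter or digit."
  (not (alphanumericp c)))

(defun split-words (string &key (delimiterp #'word-delimiter-p))
  "Downcase STRING, then split it with the same loop as my-split."
  (let ((string (string-downcase string)))
    (loop :for beg = (position-if-not delimiterp string)
            :then (position-if-not delimiterp string :start (1+ end))
          :for end = (and beg (position-if delimiterp string :start beg))
          :when beg :collect (subseq string beg end)
          :while end)))

(split-words "This is my input. Hopefully this works")
;; => ("this" "is" "my" "input" "hopefully" "this" "works")
```

Treating every non-alphanumeric character as a delimiter also splits words such as "don't" in two; pass a narrower :delimiterp if that matters.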

PPS. You can easily modify my-split to accept :start, :end, etc. in addition to :delimiterp.
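One way such a variant might look, as a sketch only. The names my-split-bounded and space-or-comma-p are hypothetical, and :start and :end follow the usual bounding-index convention of the standard sequence functions.

```lisp
(defun space-or-comma-p (c)
  "Hypothetical default predicate: same test as the first delimiterp above."
  (or (char= c #\Space) (char= c #\,)))

(defun my-split-bounded (string &key (delimiterp #'space-or-comma-p)
                                     (start 0) end)
  "Like my-split, but only looks at STRING between START and END."
  (loop :with limit = (or end (length string))
        :for beg = (position-if-not delimiterp string :start start :end limit)
          :then (position-if-not delimiterp string
                                 :start (1+ tok-end) :end limit)
        :for tok-end = (and beg (position-if delimiterp string
                                             :start beg :end limit))
        ;; The last token has no closing delimiter, so it runs up to LIMIT.
        :when beg :collect (subseq string beg (or tok-end limit))
        :while tok-end))

(my-split-bounded "one, two, three four" :start 5 :end 14)
;; => ("two" "thre")
```

The loop variable is called tok-end here only because end is already taken by the keyword argument.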

PPPS. Sorry about bugs in the first two versions of my-split. Please consider that an indicator that one should not roll one's own version of this function, but use the off-the-shelf solution.


10 Comments

I find plenty of material on split-sequence, but apparently I need to import the cl-utilities package, which I just can't figure out how to do =/ #imanewb
@SeanEvans: Careful! import is a CL function, which you do not want here! What you need is to install the library using, e.g., Quicklisp: (ql:quickload "split-sequence")
@sds: Your edit broke your code (for instance, test with "" and "a").
To clarify, the first version can't handle strings that end with a delimiter (e.g. "abc "), and the second version usually fails to get the last token (e.g. "ab cd" -> ("ab")).
I think I fixed the code now. Sorry about the bugs.
12

For this task in Common Lisp I found (uiop:split-string str :separator " ") useful. The uiop package in general has a lot of utilities; take a look at the docs: https://common-lisp.net/project/asdf/uiop.html#index-split_002dstring.
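A rough sketch of how this could produce the list from the question, assuming an implementation that bundles ASDF 3 (and therefore UIOP), such as SBCL. The helper name split-words-uiop and the choice of separator characters are hypothetical. Because uiop:split-string returns an empty string between adjacent separators, those are removed afterwards.

```lisp
;; Assumption: ASDF 3 ships with the implementation, which makes UIOP available.
(require "asdf")

(defun split-words-uiop (string)
  "Hypothetical helper: downcase STRING and split on spaces and punctuation."
  (remove ""
          (uiop:split-string (string-downcase string)
                             :separator " .,;:!?")
          :test #'string=))

(split-words-uiop "This is my input. Hopefully this works")
;; => ("this" "is" "my" "input" "hopefully" "this" "works")
```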

2 Comments

uiop:split-string is nice, but it can’t split by newline, sadly.
@PhilippLudwig I don't believe that's true. I have a text file with a bunch of newlines and was able to run (uiop:split-string (uiop:read-file-string "input.txt") :separator uiop:+lf+). This returned a cons of strings, one for each line in the file.
5

There's cl-ppcre:split:

* (split "\\s+" "foo   bar baz
frob")
("foo" "bar" "baz" "frob")

* (split "\\s*" "foo bar   baz")
("f" "o" "o" "b" "a" "r" "b" "a" "z")

* (split "(\\s+)" "foo bar   baz")
("foo" "bar" "baz")

* (split "(\\s+)" "foo bar   baz" :with-registers-p t)
("foo" " " "bar" "   " "baz")

* (split "(\\s)(\\s*)" "foo bar   baz" :with-registers-p t)
("foo" " " "" "bar" " " "  " "baz")

* (split "(,)|(;)" "foo,bar;baz" :with-registers-p t)
("foo" "," NIL "bar" NIL ";" "baz")

* (split "(,)|(;)" "foo,bar;baz" :with-registers-p t :omit-unmatched-p t)
("foo" "," "bar" ";" "baz")

* (split ":" "a:b:c:d:e:f:g::")
("a" "b" "c" "d" "e" "f" "g")

* (split ":" "a:b:c:d:e:f:g::" :limit 1)
("a:b:c:d:e:f:g::")

* (split ":" "a:b:c:d:e:f:g::" :limit 2)
("a" "b:c:d:e:f:g::")

* (split ":" "a:b:c:d:e:f:g::" :limit 3)
("a" "b" "c:d:e:f:g::")

* (split ":" "a:b:c:d:e:f:g::" :limit 1000)
("a" "b" "c" "d" "e" "f" "g" "" "")

http://weitz.de/cl-ppcre/#split

For common cases there is the (new, "modern and consistent") cl-str string manipulation library:

(str:words "a sentence    with   spaces") ; split on whitespace, returns the list of words
(str:replace-all "," "" "a, sentence") ; takes the old substring, the new one, then the string; they are not treated as regexps (cl-ppcre treats them as regexps)

You have cl-slug to remove non-ascii characters and also punctuation:

 (asciify "Eu André!") ; => "Eu Andre!"

as well as str:remove-punctuation (that uses cl-change-case:no-case).


0
; AutoLISP usage: (splitStr "get off of my cloud" " ") prints (get off of my cloud)

(defun splitStr (src delim / word letter wordlist cnt)
  (setq wordlist (list))
  (setq cnt 1) ; substr positions are 1-based
  (while (<= cnt (strlen src))
    (setq word "")
    (setq letter (substr src cnt 1))
    ; collect characters up to the next delimiter; without the length
    ; check this would loop endlessly at the end of the string
    (while (and (/= letter delim) (<= cnt (strlen src)))
      (setq word (strcat word letter))
      (setq cnt (+ cnt 1))
      (setq letter (substr src cnt 1))
    ) ; while
    (setq cnt (+ cnt 1)) ; step over the delimiter
    (setq wordlist (append wordlist (list word)))
  )
  (princ wordlist)
  (princ)
) ;defun


-1
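(defun splitStr (src pat / wordlist len cnt letter word)
    (setq wordlist (list))
    (setq len (strlen pat))
    (setq letter 0) ; 0-based index where the next word starts
    ; vl-string-search returns the 0-based position of pat, or nil when there are no more matches
    (while (setq cnt (vl-string-search pat src letter))
        (setq word (substr src (1+ letter) (- cnt letter))) ; substr is 1-based
        (setq letter (+ cnt len))
        (setq wordlist (append wordlist (list word)))
    )
    ; append whatever follows the last match and return the list
    (setq wordlist (append wordlist (list (substr src (1+ letter)))))
)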

1 Comment

While this may answer the question, it is always good to provide an explanation of your code and any references that may be helpful. Check out How to Answer for details on answering questions.
