LISP - Splitting string with delimiter also included in new list

Question

I have a list of elements following

("(aviyon" "213" "flyingman" "no))") as list

What i want is that I want to split this list containing strings using parentheses as splitter but also want to include these parentheses in a new list without breaking the order

My desired output of new list(or same list modified)

("(" "aviyon" "213" "flyingman" "no" ")" ")")

I am coming from imperative languages and this would be 15 minute job in Java or C++. But here i'm stuck what to do. I know i have to

1- Get a element from list in a loop

I think this is done with (nth 1 '(listname) )

2- separate without removing delimiter put in to a new list

I found functions such as SPLIT-SEQUENCE but i can't do without removing it and without breaking original order.

Any help would be appreciated.

("(aviyon","213","flyingman","no))") is not a valid list in Common Lisp, since the commas as separators are allowed only in a string. So there is an external string which is not shown in the example? — Renzo
– Renzo, Commented Nov 25, 2018 at 8:10
have you actually tried to use Common Lisp yet? There are no commas in Lisp lists. Just start it up and you can interactively find out what works and what not. Stackoverflow questions are best when you have a real problem with a report what is not working, If you haven't actually learned Lisp yet, there are forums for help like reddit.com/r/learnlisp — Rainer Joswig
– Rainer Joswig, Commented Nov 25, 2018 at 8:11
Sorry i added commas for better reading experience yes in my actual list its same without commas. I fixed that in original message. — itsfreeandtakesminute
– itsfreeandtakesminute, Commented Nov 25, 2018 at 8:18

Alexander Artemenko · Accepted Answer · 2018-11-25 08:44:20Z

4

You can use cl-ppcre library to do the job.

For example:

CL-USER> (ql:quickload :cl-ppcre)

CL-USER> (cl-ppcre:split "([\\(\\)])" "(aviyon" :with-registers-p t)
("" "(" "aviyon")
CL-USER> (cl-ppcre:split "([\\(\\)])" "no))" :with-registers-p t)
("no" ")" "" ")")
CL-USER>

However, it makes empty-strings in a list. Use remove-if function to get rid of them:

CL-USER> (defun empty-string-p (s) (string= s ""))
EMPTY-STRING-P
CL-USER> (remove-if 'empty-string-p
                    (list "no" ")" "" ")"))
("no" ")" ")")

Finally, you can construct a function which does both, and run it in an imperative loop (yes, Common Lisp is not functional as many think):

CL-USER> (defun remove-empty-strings (l)
           (remove-if 'empty-string-p l))
REMOVE-EMPTY-STRINGS
CL-USER> (defun split (s)
           (cl-ppcre:split "([\\(\\)])"
                           s
                           :with-registers-p t))
SPLIT
CL-USER> (defparameter *the-list* '("(aviyon" "213" "flyingman" "no))"))
*THE-LIST*
CL-USER> (loop for item in *the-list*
               for splitted = (split item)
               for cleaned = (remove-empty-strings splitted)
               append cleaned)
("(" "aviyon" "213" "flyingman" "no" ")" ")")

answered Nov 25, 2018 at 8:44

Alexander Artemenko

23.2k8 gold badges42 silver badges38 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

Alexander Artemenko Over a year ago

Of cause, you can write this in another way without using external library and almost without "consing" – using additional memory.

itsfreeandtakesminute Over a year ago

Can you give me example on how "Alexeander Artemenko" because this seems awfully fixed for this situation only. My list could change only thing that wont change i want parantheses as delimeter and i dont want to lose them while splitting it.

Alexander Artemenko Over a year ago

You can do this by going through all list items and searching ( and ) in strings using position function and then collecting pieces of these strings cutting them with sebseq function.

coredump · Accepted Answer · 2018-11-26 10:32:21Z

Let's have another answer, without external libraries. Like you already did, we can split in the problem into smaller parts:

define a function which builds a list of tokens from a string, all-tokens
apply this function on all strings in your input list, and concatenate the result:
```
(mapcan #'all-tokens strings)
```

The first part, taking a state and building a list from it, looks like an unfold operation (anamorphism).

Fold (catamorphism), called reduce in Lisp, builds a value from a list of values and a function (and optionally an initial value). The dual operation, unfold, takes a value (the state), a function, and generate a list of values. In the case of unfold, the step function accepts a state and returns new state along with the resulting list.

Here, let's define a state as 3 values: a string, a starting position in the string, and a stack of tokens parsed so far. Our step function next-token returns the next state.

 ;; definition follows below
 (declare (ftype function next-token))

The main function which gets all tokens from a string just computes a fixpoint:

(defun all-tokens (string)
  (do (;; initial start value is 0
       (start 0)
       ;; initial token stack is nil
       (tokens))

      ;; loop until start is nil, then return the reverse of tokens
      ((not start) (nreverse tokens))

    ;; advance state
    (multiple-value-setq (string start tokens)
      (next-token string start tokens))))

We need an auxiliary function:

(defun parenthesisp (c)
  (find c "()"))

The step function is defined as follows:

(defun next-token (string start token-stack)
  (let ((search (position-if #'parenthesisp string :start start)))
    (typecase search
      (number
       ;; token from start to parenthesis
       (when (> search start)
         (push (subseq string start search) token-stack))
       ;; parenthesis
       (push (subseq string search (1+ search)) token-stack)
       ;; next state
       (values string (1+ search) token-stack))
      (null
       ;; token from start to end of string
       (when (< start (1- (length string)))
         (push (subseq string start) token-stack))
       ;; next-state
       (values string nil token-stack)))))

You can try with a single string:

(next-token "(aviyon" 0 nil)
"(aviyon"
1
("(")

If you take the resulting state values and reuse them, you have:

(next-token "(aviyon" 1 '("("))
"(aviyon"
NIL
("aviyon" "(")

And here, the second return value is NIL, which ends the generation process. Finally, you can do:

(mapcan #'all-tokens '("(aviyon" "213" "flyingman" "no))"))

Which gives:

("(" "aviyon" "213" "flyingman" "no" ")" ")")

The above code is not fully generic in the sense that all-tokens knows too much about next-token: you could rewrite it to take any kind of state. You could also handle sequences of strings using the same mechanism, by keeping more information in your state variable. Also, in a real lexer you would not want to reverse the whole list of tokens, you would use a queue to feed a parser.

Thanks this is really helpful considering i was not able to decipher all connections of which functions called on cl-pre libraries. Very helpful !

Gwang-Jin Kim · Accepted Answer · 2018-11-25 09:51:27Z

0

solution

Since you didn't understood Alexander's solution and since I anyway wrote my solution:

;; load two essential libraries for any common lisper
(ql:quickload :cl-ppcre)
(ql:quickload :alexandria)
;; see below to see how to install quicklisp for `ql:quickload` command
;; it is kind of pythons `import` and if not install `pip install`
;; in one command for common-lisp

(defun remove-empty-string (string-list) 
  (remove-if #'(lambda (x) (string= x "")) string-list))


(defun split-parantheses-and-preserve-them (strings-list)
  (remove-empty-string 
  (alexandria:flatten 
    (mapcar #'(lambda (el) (cl-ppcre:split "(\\(|\\))" 
                                           el 
                                           :with-registers-p t)) 
            strings-list))))

 ;; so now your example
 (defparameter *list* '("(aviyon" "213" "flyingman" "no))"))

 (split-parantheses-and-preserve-them *list*)
 ;; returns:
 ;; ("(" "aviyon" "213" "flyingman" "no" ")" ")")

how this works

(cl-ppcre:split "(\\(|\\))" a-string) splits the string by ( or ). Because in regex pattern ( or ) are used for capturing the match - like here too (the outer parantheses captures) - you have to escape them. \\( or \\). So with cl-ppcre:split you can split any string in common lisp by regex-pattern. Super cool and super efficient package written by Edi Weitz. He wrote several super sophisticated packages for common lisp - they are also called ediware or edicls in the community. By the way - cl-ppcre is even more efficient and faster than gold-standard for regex: the perl regex engine!

:with-regiesters-p t option then preserves the matched delimiter - which has to be captured by parentheses like this: (<pattern>) in the pattern.

mapcar this over the list to apply it on each string element in your string list.

However, what you got after that is a list of lists. (Each inner list containing the splitted result for each string-element of the list).

Flatten the list by alexandria:flatten. For many functions not in the standard of lisp, but which you think they are basic - like flatten a list - look always first in alexandria - mostly it has a function you desire - it is a huge library. That is why you need it anyway as a common lisper ;) .

But still, there will be empty strings around to be removed. That is why I wrote remove-empty-string which uses remove-if - which together with remove-if-not is the standard filtering function for lists. It takes a predicate function - here (lambda (x) (string= x "")) which gives T if string is an empty string and NIL if not. It removes all elements in the resulting flattened list in our function, which are empty strings. In other languages it will be named filter but yeah - sometimes function names in common-lisp are not very well chosen. Sometimes I think we should create alias names and move over to them and keep the old names for backward-compatibility. Clojure has nicer names for functions ... Maybe cl people should overtake clojure function names ...

quicklisp

@Alexander Artemenko wrote exactly my solution - he came first. I will add: If you are so new to common lisp, maybe you don't know how to use quicklisp. Do in terminal (linux or macos):

wget https://beta.quicklisp.org/quicklisp.lisp

Otherwise manually download in windows from the address.

I put it into ~/quicklisp folder.

Then in clisp or sbcl do:

(load "~/quicklisp/quicklisp.lisp") ;; just path to where downloaded
;; quicklisp.lisp file is!

;; then install quicklisp:
(quicklisp-quickstart:install)

;; then search for cl-ppcre
(ql:system-apropos "cl-ppcre")

;; then install cl-ppcre
(ql:quickload "cl-ppcre")

;; and to autoload everytime you start sbcl or clisp
;; in linux or mac - sorry I don't now windows that well
;; I have the opinion every programmer should us unix
;; as their OS
;; you have to let quicklisp be loaded when they start
;; by an entry into the init file
;; mostly located in ~/.sbclrc or ~/.clisprc.slip or such ...
;; respectively.
;; quicklisp does an entry automatically if you do:
(ql:add-to-init-file)

;; after installation do:
(quit)

;; If you then restart sbcl or clisp and try:
(ql:quickload :cl-ppcre)
;; it should work, - if not, you have to manually load
;; quicklisp first
(load "~/quicklisp/setup.lisp") ;; or wherever quicklisp's
;; setup.lisp file has been stored in your system!
;; and then you can do
(ql:quickload :cl-ppcre)

;; to install alexandria package then, do
(ql:quickload :alexandria) ;; or "alexandria"

;; ql:quickload installs the package from quicklisp repository,
;; if it cannot find package on your system.

;; learn more about quicklisp, since this is the package
;; manager of common lisp - like pip for python

edited Nov 25, 2018 at 9:51

answered Nov 25, 2018 at 9:15

Gwang-Jin Kim

11.1k20 silver badges39 bronze badges

6 Comments

itsfreeandtakesminute Over a year ago

Thanks i got this pretty great just one quick question. I see both of you used one or two function from these libraries hence they are open source can I just add implementation of "split" function from cl-ppcre and be done with it? It would be nicer if i didn't use additional library just wondering about it.

Gwang-Jin Kim Over a year ago

sure you can. But you need cl-ppcre and alexandria anyway. And especially cl-ppcre is optimized by hell. So it is recommended to use it at any way you can! Because simply: You can't write better code - rather you would introduce more bugs - especially as a beginner.

Gwang-Jin Kim Over a year ago

In addition - this problem requires not a normal split function but preserving the to be splitted thing. That would take me very likely more than 15 min to fiddle that out. And for matching - finding positions for "(" or ")" in the string, I would anyway use cl-ppcre :D . Any common lisper has to learn cl-ppcre and alexandria - because they contain essential functions for anyone dealing with strings and lists etc - and especially for someone wanting many of the functions from other imperative functions ready-to-use. They are super well tested. So please learn to use them!

Svante Over a year ago

Instead of using flatten with mapcar, you can use mapcan.

Svante Over a year ago

Instead of remove-if with a more verbose test, you can use (remove "" list :test #'string=).

|

Collectives™ on Stack Overflow

LISP - Splitting string with delimiter also included in new list

3 Answers 3

3 Comments

1 Comment

6 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

3 Comments

1 Comment

6 Comments

Your Answer

Sign up or log in

Post as a guest

Related