2

I have a string with underscores separating words (e.g. aaa_bbb_ccc)

I created a function to convert the string to camelCase (e.g. aaaBbbCcc).

I am wondering if there are some things that I am doing wrong which affect the performance. This is the code:

(defun underscore-to-camel (input)
        (defparameter input-clean-capitalized (remove #\_ (string-capitalize input)))
        (setf (aref input-clean-capitalized 0) (aref (string-downcase (aref input-clean-capitalized 0)) 0))
        input-clean-capitalized)

I also created a second variant, but it is ~25% slower (measured over 3 million executions using time):

(defun underscore-to-camel-v2 (input)
        (defparameter input-clean-capitalized (remove #\_ (string-capitalize input)))
        (concatenate 
            'string 
            (string-downcase (aref input-clean-capitalized 0)) 
            (subseq input-clean-capitalized 1)))
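
A harness along the following lines would reproduce that kind of measurement. This is only a sketch: the name bench and its parameters are made up for illustration, and absolute timings will differ between implementations and machines.

```lisp
;; Hypothetical helper, not part of the original code.
;; Calls FN on INPUT the given number of times and lets TIME report the cost.
(defun bench (fn input &optional (iterations 3000000))
  (time (dotimes (i iterations)
          (funcall fn input))))

;; Example calls, assuming both functions above are defined:
;; (bench #'underscore-to-camel "aaa_bbb_ccc")
;; (bench #'underscore-to-camel-v2 "aaa_bbb_ccc")
```

Passing the function as an argument keeps the loop overhead identical for every variant being compared.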
  • Why not simply use a regexp replace of \([a-z]\)_\([a-z]\) with \1\,(upcase \2)? Commented Aug 3, 2013 at 16:00
  • I am using lispbox (Clozure Common Lisp). Regular expressions are not available by default. Which library do you suggest that supports upcase? Commented Aug 3, 2013 at 16:51
  • You might want to indent the code properly. Also, defparameter is not meant to be used that way: it defines global variables. Commented Aug 3, 2013 at 18:08

4 Answers

2

First off, defparameter is not what you want to use here. You should refactor your code like this:

(defun underscore-to-camel (input)
  (let ((input-clean-capitalized (remove #\_ (string-capitalize input))))
    (setf (aref input-clean-capitalized 0)
          (aref (string-downcase (aref input-clean-capitalized 0)) 0))
    input-clean-capitalized))
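
To see why defparameter is the wrong tool here, the following sketch contrasts it with let. The names leaky, scoped, and *leaked* are invented for this illustration. A defparameter form inside a function body still defines a global special variable, and re-establishing it on every call is likely to add some overhead, although I have not measured how much.

```lisp
;; Hypothetical names, for illustration only.
(defun leaky (x)
  ;; DEFPARAMETER inside a function still defines a global special variable.
  (defparameter *leaked* x)
  (symbol-value '*leaked*))

(defun scoped (x)
  ;; LET creates a binding that exists only for the duration of the call.
  (let ((local x))
    local))

(leaky 42)
(boundp '*leaked*)   ; true: the variable outlives the call to LEAKY
(scoped 42)          ; leaves nothing behind after it returns
```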

Second: You could approach the problem like this:

(defun underscore-to-camel-eff (input)
  (declare (optimize (debug 1) (speed 3) (safety 1)))
  (loop :with length = (length input)
        :with i = 0
        :while (< i length)
        :for c = (aref input i)
        :if (or (= i (- length 1))
                (char/= c #\_))
        :collect (prog1 c (incf i)) :into result
        :else
        :collect (prog1
                   (char-upcase (aref input (+ i 1)))
                   (incf i 2))
        :into result
        :finally (return (concatenate 'string result))))

which runs, on my PC with SBCL, in half the time of your solution.

And here's a solution using a regular expression, although it is slower than any of the other solutions:

(defun underscore-to-camel-ppcre (input)
  (declare (optimize (debug 1) (speed 3) (safety 1)))
  (ppcre:regex-replace-all "_([a-z])"
                           input
                           (lambda (target-string
                                    start
                                    end
                                    match-start
                                    match-end
                                    reg-starts
                                    reg-ends)
                             (declare (ignore start
                                              end
                                              match-end
                                              reg-starts
                                              reg-ends))
                             (string
                              (char-upcase
                               (aref target-string (+ 1 match-start)))))))

The regular expression library is CL-PPCRE (its package is nicknamed ppcre). Once you have installed Quicklisp from http://www.quicklisp.org/beta/, you can load it with

(ql:quickload "cl-ppcre")

2

I would propose using character-level functions, whose names start with char-. With them you can get rid of STRING-DOWNCASE and CONCATENATE.

DEFPARAMETER is not used for local variables. Use LET.

But a simple version is this:

(defun underscore-to-camel (input)
  (string-downcase (remove #\_ (string-capitalize input))
                   :start 0
                   :end 1))
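
The character-level suggestion can be sketched as follows. The function name is invented for this example, and the copy-seq call is an extra precaution rather than something taken from the question: the standard allows string-capitalize and remove to return their argument unchanged when there is nothing to convert or remove, so modifying the result in place could otherwise alter the caller's string.

```lisp
;; Sketch of the character-level approach; the function name is made up.
(defun underscore-to-camel-char (input)
  ;; COPY-SEQ guarantees a fresh string: STRING-CAPITALIZE and REMOVE are
  ;; allowed to return their argument when no change is needed.
  (let ((result (copy-seq (remove #\_ (string-capitalize input)))))
    (when (plusp (length result))
      ;; CHAR-DOWNCASE works on a single character, so no temporary
      ;; one-character string has to be built.
      (setf (char result 0) (char-downcase (char result 0))))
    result))
```

Unlike the :end 1 version, this one also accepts an empty string, because the first character is only touched when it exists.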

2

One more way to do it:

(defun underscore-to-camel (input)
  (with-output-to-string (s)
    (loop
       :for c :across input
       :for upcase := (char= c #\_) :then (or upcase (char= c #\_)) :do
       (cond
         ((char= c #\_))
         (upcase (write-char (char-upcase c) s) (setf upcase nil))
         (t (write-char c s))))))

2

After experimenting for a while with SBCL, this is the fastest version I found:

(defun camelcase (s)
  (do* ((n (length s))
        (i 0 (the fixnum (1+ i)))
        (wp 0)
        (target (make-array n :element-type 'character)))
      ((>= i n) (subseq target 0 wp))
    (declare (fixnum n i wp)
             (string s))
    (if (and (< i (the fixnum (1- n)))
             (char= (char s i) #\_)
             (char>= (char s (the fixnum (1+ i))) #\a)
             (char<= (char s (the fixnum (1+ i))) #\z))
        (setf (aref target (1- (the fixnum (incf wp))))
              (code-char (- (char-code (char s (the fixnum (incf i)))) 32)))
        (setf (aref target (1- (the fixnum (incf wp))))
              (char s i)))))

Instead of calling #'char-upcase I'm just subtracting 32, because the character is known to be in the a-z range and I'm assuming ASCII encoding. This shaves off some cycles.
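
The arithmetic behind that shortcut can be checked in isolation. The helper names below are invented for this sketch, and it assumes ASCII-compatible character codes, where #\a is 97 and #\A is 65:

```lisp
;; Hypothetical helper showing the arithmetic; assumes ASCII-compatible
;; character codes, where lowercase and uppercase letters differ by 32.
(defun ascii-upcase (c)
  (if (char<= #\a c #\z)
      (code-char (- (char-code c) 32))
      c))

;; Within a-z the shortcut agrees with CHAR-UPCASE.
(defun ascii-upcase-matches-p ()
  (loop for code from (char-code #\a) to (char-code #\z)
        for c = (code-char code)
        always (char= (ascii-upcase c) (char-upcase c))))
```

Outside the a-z range the two differ: char-upcase also handles other lowercase letters the implementation knows about, while the subtraction is only valid for these 26 characters.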

Also, for some reason I don't understand, explicit array filling is faster than using vector-push.
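
For comparison, a vector-push variant might look like the sketch below. This is not the code that was measured; the function name is invented and the body simply mirrors the logic above with a fill pointer in place of the manual write index. One plausible but unverified explanation for the difference is that a vector with a fill pointer is not a simple array, so each push involves extra bookkeeping.

```lisp
;; Hypothetical variant for comparison only; not the code that was measured.
(defun camelcase-vector-push (s)
  (let ((n (length s))
        (target (make-array (length s)
                            :element-type 'character
                            :fill-pointer 0)))
    (do ((i 0 (1+ i)))
        ;; SUBSEQ returns a fresh simple string holding the active elements.
        ((>= i n) (subseq target 0))
      (if (and (< i (1- n))
               (char= (char s i) #\_)
               (char<= #\a (char s (1+ i)) #\z))
          ;; Skip the underscore and push the upcased next character.
          (vector-push (char-upcase (char s (incf i))) target)
          (vector-push (char s i) target)))))
```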
