How to split a string in Haskell?

Question

Is there a standard way to split a string in Haskell?

lines and words work great for splitting on a space or newline, but surely there is a standard way to split on a comma?

I couldn't find it on Hoogle.

To be specific, I'm looking for something where split "," "my,comma,separated,list" returns ["my","comma","separated","list"].

I would really like to see such a function in a future release of Data.List or even Prelude. It's so common, and it's nasty for code golf when it's not available.
– fuz, Commented Feb 12, 2011 at 15:08

Steve · Accepted Answer · 2011-02-12 23:18:22Z

198

Remember that you can look up the definition of Prelude functions!

http://www.haskell.org/onlinereport/standard-prelude.html

Looking there, the definition of words is,

words   :: String -> [String]
words s =  case dropWhile Char.isSpace s of
                      "" -> []
                      s' -> w : words s''
                            where (w, s'') = break Char.isSpace s'

So, change it for a function that takes a predicate:

wordsWhen     :: (Char -> Bool) -> String -> [String]
wordsWhen p s =  case dropWhile p s of
                      "" -> []
                      s' -> w : wordsWhen p s''
                            where (w, s'') = break p s'

Then call it with whatever predicate you want!

main = print $ wordsWhen (==',') "break,this,string,at,commas"
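
One thing to keep in mind: because wordsWhen is modelled on words, it skips over runs of delimiters, so empty fields disappear. A minimal sketch of the difference, with wordsWhen repeated from above and a hypothetical variant splitKeepEmpty (my own name, not a library function) that keeps empty fields:

```haskell
-- wordsWhen as defined above: runs of delimiters are collapsed,
-- so empty fields never show up in the result.
wordsWhen :: (Char -> Bool) -> String -> [String]
wordsWhen p s = case dropWhile p s of
  "" -> []
  s' -> w : wordsWhen p s''
    where (w, s'') = break p s'

-- Hypothetical variant (not from any library) that keeps empty fields,
-- including one after a trailing delimiter.
splitKeepEmpty :: (Char -> Bool) -> String -> [String]
splitKeepEmpty p s = case break p s of
  (w, [])       -> [w]
  (w, _ : rest) -> w : splitKeepEmpty p rest
```

For the input "a,,b," with the predicate (== ','), wordsWhen gives ["a","b"] while splitKeepEmpty gives ["a","","b",""]. Which one you want depends on whether empty fields carry meaning in your data.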

answered Feb 12, 2011 at 23:18

Steve


Alex · Accepted Answer · 2016-07-23 12:45:39Z

161

There is a package for this called split.

cabal install split

Use it like this:

ghci> import Data.List.Split
ghci> splitOn "," "my,comma,separated,list"
["my","comma","separated","list"]

It comes with a lot of other functions for splitting on matching delimiters or having several delimiters.

edited Jul 23, 2016 at 12:45

Alex


answered Feb 12, 2011 at 15:05

Jonno_FTW


7 Comments

gawi Over a year ago

Cool. I wasn't aware of this package. This is the ultimate split package, as it gives a lot of control over the operation (trim spaces in results, leave separators in the result, remove consecutive separators, etc.). There are so many ways of splitting lists that no single split function can answer every need; you really need that kind of package.

Emmanuel Touzery Over a year ago

otherwise if external packages are acceptable, MissingH also provides a split function: hackage.haskell.org/packages/archive/MissingH/1.2.0.0/doc/html/… That package also provides plenty of other "nice-to-have" functions and I find that quite some packages depend on it.

The Internet Over a year ago

The split package is now a part of the Haskell Platform as of the most recent release.

The Internet Over a year ago

import Data.List.Split (splitOn) and go to town. splitOn :: Eq a => [a] -> [a] -> [[a]]

expz Over a year ago

@RussAbbott the split package is included in the Haskell Platform when you download it (haskell.org/platform/contents.html), but it is not automatically loaded when building your project. Add split to the build-depends list in your cabal file, e.g. if your project is called hello, then in the hello.cabal file below the executable hello line put a line like ` build-depends: base, split` (note two space indent). Then build using the cabal build command. Cf. haskell.org/cabal/users-guide/…


Emmanuel Touzery · Accepted Answer · 2012-12-11 05:10:24Z

47

If you use Data.Text, there is splitOn:

http://hackage.haskell.org/packages/archive/text/0.11.2.0/doc/html/Data-Text.html#v:splitOn

This is included in the Haskell Platform.

So for instance:

import qualified Data.Text as T
main = print $ T.splitOn (T.pack " ") (T.pack "this is a test")

or:

{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Text as T
main = print $ T.splitOn " " "this is a test"
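
If the rest of your program works with String rather than Text, the values have to be converted at the boundaries, since T.splitOn takes and returns Text. A small sketch of such a wrapper follows; splitOnStr is a hypothetical name of my own, not something the text package provides:

```haskell
import qualified Data.Text as T

-- Hypothetical wrapper (not part of the text package): split a String on a
-- String delimiter by converting to Text and back.
-- Note: T.splitOn raises an error when the delimiter is empty.
splitOnStr :: String -> String -> [String]
splitOnStr delim = map T.unpack . T.splitOn (T.pack delim) . T.pack
```

With this, splitOnStr "," "my,comma,separated,list" returns ["my","comma","separated","list"]. If you split a lot of text, it is probably better to stay in Text throughout instead of converting back and forth.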

answered Dec 11, 2012 at 5:10

Emmanuel Touzery


2 Comments

Emmanuel Touzery Over a year ago

@RussAbbott you probably need to add a dependency on the text package or install it. That would belong in another question, though.

Andrew Koster Over a year ago

Couldn't match type ‘T.Text’ with ‘Char’ Expected type: [Char] Actual type: [T.Text]

antimatter · Accepted Answer · 2014-07-22 02:40:29Z

23

Use Data.List.Split, which is part of the split package:

[me@localhost]$ ghci
Prelude> import Data.List.Split
Prelude Data.List.Split> let l = splitOn "," "1,2,3,4"
Prelude Data.List.Split> :t l
l :: [[Char]]
Prelude Data.List.Split> l
["1","2","3","4"]
Prelude Data.List.Split> let { convert :: [String] -> [Integer]; convert = map read }
Prelude Data.List.Split> let l2 = convert l
Prelude Data.List.Split> :t l2
l2 :: [Integer]
Prelude Data.List.Split> l2
[1,2,3,4]

edited Jul 22, 2014 at 2:40

answered May 1, 2014 at 10:03

antimatter


1 Comment

danvk Nov 25 at 16:20

Note that this is part of the split package, which must be installed: stackoverflow.com/a/34175246/388951

fp_mora · Accepted Answer · 2018-04-09 21:06:58Z

20

Without importing anything, you can do a straight substitution of a space for the separator character, since the target separator for words is a space. Something like:

words [if c == ',' then ' ' else c|c <- "my,comma,separated,list"]

or

words $ let f ',' = ' '; f c = c in map f "my,comma,separated,list"

You can make this into a function with parameters. You can eliminate the character-to-match parameter by matching many characters, as in:

 [if elem c ";,.:-+@!$#?" then ' ' else c|c <-"my,comma;separated!list"]
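
A caveat with the substitution approach: words splits on any whitespace, so spaces already present in the input also act as separators, and empty fields are dropped. A sketch that wraps the idea in a function (splitViaWords is a hypothetical name, used here only for illustration):

```haskell
-- Hypothetical helper: replace the separator with a space, then reuse words.
-- Any whitespace already in the input will also split the string.
splitViaWords :: Char -> String -> [String]
splitViaWords sep s = words [if c == sep then ' ' else c | c <- s]
```

splitViaWords ',' "my,comma,separated,list" gives the four expected fields, but "my,comma separated,list" also comes back as four fields rather than three, because the original space is treated as a separator too.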

answered Apr 9, 2018 at 21:06

fp_mora


3 Comments

Yuri Kovalenko Over a year ago

That does not distinguish between new added spaces and spaces that were here originally, so for "my,comma separated,list" it will see 4 parts instead of 3 as intended.

fp_mora Over a year ago

@Yuri Kovalenko words does; try words [if c == ',' then ' ' else c|c <- "my, comma, separated, list "]

fp_mora Over a year ago

Yuri Kovalenko The question was a comma separated string. Are you referring to another question?

evilcandybag · Accepted Answer · 2011-02-12 17:49:51Z

19

In the module Text.Regex (part of the Haskell Platform), there is a function:

splitRegex :: Regex -> String -> [String]

which splits a string based on a regular expression. The API can be found at Hackage.

answered Feb 12, 2011 at 17:49

evilcandybag


2 Comments

Andrew Koster Over a year ago

Could not find module ‘Text.Regex’ Perhaps you meant Text.Read (from base-4.10.1.0)

codybartfast Over a year ago

It may be in the module regex-compat-tdfa (but I'm a haskell newb)

sshine · Accepted Answer · 2017-10-31 15:19:42Z

14

Try this one:

import Data.List (unfoldr)

separateBy :: Eq a => a -> [a] -> [[a]]
separateBy chr = unfoldr sep where
  sep [] = Nothing
  sep l  = Just . fmap (drop 1) . break (== chr) $ l

Only works for a single char, but should be easily extendable.
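
One possible extension to a multi-element delimiter, still built on unfoldr, is sketched below. The names separateOn and breakOn are my own, not library functions, and the sketch assumes the delimiter is non-empty (an empty delimiter would produce an endless list of empty pieces).

```haskell
import Data.List (stripPrefix, unfoldr)

-- Hypothetical helper: split at the first occurrence of the delimiter,
-- returning the part before it and the part after it (delimiter removed).
breakOn :: Eq a => [a] -> [a] -> ([a], [a])
breakOn _ [] = ([], [])
breakOn delim l@(x : xs) = case stripPrefix delim l of
  Just rest -> ([], rest)
  Nothing   -> let (before, after) = breakOn delim xs
               in (x : before, after)

-- Same shape as separateBy above, but with a list-valued delimiter.
-- Assumes the delimiter is non-empty.
separateOn :: Eq a => [a] -> [a] -> [[a]]
separateOn delim = unfoldr sep
  where
    sep [] = Nothing
    sep l  = Just (breakOn delim l)
```

For example, separateOn " and " "this and is and a and test" gives ["this","is","a","test"]. Like separateBy, it drops a single trailing delimiter instead of producing a final empty piece.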

edited Oct 31, 2017 at 15:19

sshine


answered Feb 12, 2011 at 15:04

fuz


Frank Meisschaert · Accepted Answer · 2014-07-17 04:45:26Z

13

split :: Eq a => a -> [a] -> [[a]]
split d [] = []
split d s = x : split d (drop 1 y) where (x,y) = span (/= d) s

E.g.

split ';' "a;bb;ccc;;d"
> ["a","bb","ccc","","d"]

A single trailing delimiter will be dropped:

split ';' "a;bb;ccc;;d;"
> ["a","bb","ccc","","d"]

edited Jul 17, 2014 at 4:45

answered Jul 16, 2014 at 22:51

Frank Meisschaert


mxs · Accepted Answer · 2020-03-07 23:25:25Z

9

I find this simpler to understand:

split :: Char -> String -> [String]
split c xs = case break (==c) xs of 
  (ls, "") -> [ls]
  (ls, x:rs) -> ls : split c rs

edited Mar 7, 2020 at 23:25

answered Mar 7, 2020 at 23:12

mxs


1 Comment

Jörg Brüggmann Over a year ago

...simpler than what? Which of the other answers is your solution better than? Background: there are already several other answers.

Robin Begbie · Accepted Answer · 2012-06-10 07:31:25Z

6

I started learning Haskell yesterday, so correct me if I'm wrong but:

split :: Eq a => a -> [a] -> [[a]]
split x y = func x y [[]]
    where
        func x [] z = reverse $ map (reverse) z
        func x (y:ys) (z:zs) = if y==x then 
            func x ys ([]:(z:zs)) 
        else 
            func x ys ((y:z):zs)

gives:

*Main> split ' ' "this is a test"
["this","is","a","test"]

or maybe you wanted

*Main> splitWithStr  " and " "this and is and a and test"
["this","is","a","test"]

which would be:

splitWithStr :: Eq a => [a] -> [a] -> [[a]]
splitWithStr x y = func x y [[]]
    where
        func x [] z = reverse $ map (reverse) z
        func x (y:ys) (z:zs) = if (take (length x) (y:ys)) == x then
            func x (drop (length x) (y:ys)) ([]:(z:zs))
        else
            func x ys ((y:z):zs)

answered Jun 10, 2012 at 7:31

Robin Begbie


2 Comments

Eric Wilson Over a year ago

I was looking for a built-in split, being spoiled by languages with well-developed libraries. But thanks anyway.

Tony Morris Over a year ago

You wrote this in June, so I assume you've moved on in your journey :) As an exercise, try rewriting this function without reverse or length, as use of these functions incurs an algorithmic complexity penalty and also prevents application to an infinite list. Have fun!

score 5 · Accepted Answer · 2012-08-21 13:48:36Z

5

I don’t know how to add a comment onto Steve’s answer, but I would like to recommend the GHC libraries documentation, and in there specifically the Sublist functions in Data.List, which are much better as a reference than just reading the plain Haskell report.

More generally, a fold with a rule for when to start a new sublist should solve it too.
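
The fold idea might look like the sketch below: a right fold that starts a new sublist each time it meets the delimiter. splitFold is a hypothetical name, and this is only one of several ways to phrase the fold.

```haskell
-- Sketch of a fold-based split: walk the list from the right and start a
-- new (empty) sublist whenever the delimiter is seen.
splitFold :: Eq a => a -> [a] -> [[a]]
splitFold delim = foldr step [[]]
  where
    step c acc@(cur : rest)
      | c == delim = [] : acc
      | otherwise  = (c : cur) : rest
    step _ [] = [[]]  -- unreachable: the accumulator is never empty
```

splitFold ',' "my,comma,separated,list" gives ["my","comma","separated","list"]. Empty fields are kept, and an empty input yields [""] rather than [].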

edited Aug 21, 2012 at 13:48

answered Apr 10, 2012 at 22:50

anon


Andrew · Accepted Answer · 2015-12-20 01:06:52Z

4

Example in the ghci:

> import qualified Text.Regex as R
> R.splitRegex (R.mkRegex "x") "2x3x777"
["2","3","777"]

edited Dec 20, 2015 at 1:06

answered Dec 20, 2015 at 0:58

Andrew


5 Comments

kirelagin Over a year ago

Please, don’t use regular expressions to split strings. Thank you.

Enlico Over a year ago

@kirelagin, why this comment? I'm learning Haskell, and I'd like to know the rationale behind your comment.

Enlico Over a year ago

@Andrey, is there a reason why I cannot even run the first line in my ghci?

kirelagin Over a year ago

@EnricoMariaDeAngelis Regular expressions are a powerful tool for string matching. It makes sense to use them when you are matching something non-trivial. If you just want to split a string on something as trivial as another fixed string, there is absolutely no need to use regular expressions – it will only make the code more complex and, likely, slower.

Andrew Koster Over a year ago

"Please, don’t use regular expressions to split strings." WTF, why not??? Splitting a string with a regular expression is a perfectly reasonable thing to do. There are lots of trivial cases where a string needs to be split but the delimiter isn't always exactly the same.

Irfan Hamid · Accepted Answer · 2014-12-11 02:21:26Z

In addition to the efficient and pre-built functions given in answers I'll add my own which are simply part of my repertory of Haskell functions I was writing to learn the language on my own time:

-- Correct but inefficient implementation
wordsBy :: String -> Char -> [String]
wordsBy s c = reverse (go s []) where
    go s' ws = case dropWhile (== c) s' of
        "" -> ws
        rest -> go (dropWhile (/= c) rest) (takeWhile (/= c) rest : ws)

-- Breaks up by predicate function to allow for more complex conditions (\c -> c == ',' || c == ';')
wordsByF :: String -> (Char -> Bool) -> [String]
wordsByF s f = reverse (go s []) where
    go s' ws = case dropWhile f s' of
        "" -> ws
        rest -> go (dropWhile (not . f) rest) (takeWhile (not . f) rest : ws)

The solutions are at least tail-recursive so they won't incur a stack overflow.

Microtribute · Accepted Answer · 2021-07-09 20:29:00Z

0

I am very late, but I would like to add this here for those interested, in case you're looking for a simple solution that doesn't rely on any bloated packages:

split :: String -> String -> [String]
split _ "" = []
split delim str =
  split' "" str []
  where
    dl = length delim

    split' :: String -> String -> [String] -> [String]
    split' h t f
      | dl > length t = f ++ [h ++ t]
      | delim == take dl t = split' "" (drop dl t) (f ++ [h])
      | otherwise = split' (h ++ take 1 t) (drop 1 t) f

edited Jul 9, 2021 at 20:29

answered Jul 9, 2021 at 3:14

Microtribute


3 Comments

Microtribute Over a year ago

Oh come on... Ultimately what matters is not that something is liked by thousands of people. I am NOT forcing you to use it. It's ONLY there for those interested. Sounds like you're none of them.

Eric Wilson Over a year ago

You say "liked by" -- I say "battle tested". It's fine if you enjoy sharing it. My question was for the standard way to do it, and that has been answered.

Microtribute Over a year ago

Haskell does not come with a split function out of the box. Remember you asked for a function that splits a string by a string (String -> String -> [String]), not by a char (Char->String->[String]). You have to install the split package, which is NOT a standard way EITHER. Installing the split package will also include a bunch of redundant functions. You only asked for a split function, and I gave exactly that to you and NO MORE.

Pavel.Zh · Accepted Answer · 2022-08-27 22:25:30Z

0

So many answers, but I don't like any of them. I don't actually know Haskell, but I wrote a much shorter and (I think) cleaner version in 5 minutes:

splitString :: Char -> [Char] -> [[Char]]
splitString _ [] = []
splitString sep str = 
    let (left, right) = break (==sep) str 
    in left : splitString sep (drop 1 right)

answered Aug 27, 2022 at 22:25

Pavel.Zh

