2

Is there some Library in Haskell that has similar function like this?

> function "bal bla hu bla" ["bla","bal"]
[(2,"bla"),(1,"bal")]

5 Answers 5

4

Data.Text provides a count function, however it works on Text rather than String so we have to use pack.

import Data.Text (pack, count)

function haystack needles = map go needles
     where 
        packed = pack haystack
        go needle = (count (pack needle) packed, needle)
Sign up to request clarification or add additional context in comments.

1 Comment

We should add that using Text instead of String is probably a good idea anyway...
3

Using Data.List.Split:

occs ∷ Eq a ⇒ [a] → [[a]] → [(Int, [a])]
occs str = map (count str &&& id)
  where
    count s x = length (splitOn x s) - 1

And

 >occs "bal bla hu bla" ["bla","bal"]
 [(2,"bla"),(1,"bal")]

UPD:

Parsec can be useful here too.

{-# LANGUAGE NoMonomorphismRestriction,
             FlexibleContexts
  #-}
import Control.Arrow ((&&&))
import Data.Either (partitionEithers)
import Text.Parsec

occs :: String -> [String] -> [(Int, String)]
occs s = map (countP s &&& id)

countP str substr = either (const 0) occsNumber $ parse (parseMany substr) "" str
  where
    occsNumber = length . snd . partitionEithers

parseSingle :: Stream s m Char => String -> ParsecT s u m (Either Char String)
parseSingle s = fmap Right (try (string s)) <|> fmap Left anyChar

parseMany :: Stream s m Char => String -> ParsecT s u m [Either Char String]
parseMany = many . parseSingle

Result is still the same:

> occs "bal bla hu bla" ["bla","bal"]
[(2,"bla"),(1,"bal")]

1 Comment

You don't need s parameter in count since it's closure and have access to str.
2
import Control.Arrow ((&&&))
import Data.List (isPrefixOf, tails)

yourFunction :: Eq a => [a] -> [[a]] -> [(Int, [a])]
yourFunction haystack = map (count &&& id)
  where count needle = length . filter (needle `isPrefixOf`) . tails $ haystack

(Untested.)

1 Comment

Works, except yourFunction needs to have (Eq a) as a constraint.
1

OMG OMG! Every solution given so far has quadratic computational complexity! (even the Data.Text one has a worst-case quadratic running time)

Obviously you need to roll your own string search algorithm!

Here is my take. I think this is a variation of KMP.

data Searcher = Found | Initial Searcher | Searching Char Searcher Searcher

runSearcher :: Searcher -> Char -> Searcher
runSearcher (Searching c suc fail) s | c == s = suc
 | otherwise = runSearcher fail s
runSearcher (Initial s) _ = s

mkSearcher pattern = initial where
  initial = go (Initial initial) pattern

  go fallback [] = Found
  go fallback (c:t) = Searching c (go (runSearcher fallback c) t) fallback

search :: String -> String -> Integer
search pat = go searcher where
  searcher = mkSearcher pat
  go Found s = 1 + go searcher s
  go src (c:t) = go (runSearcher src c) t
  go src [] = 0

Still, there is much space for improvement! Searching for multiple patterns can be done more efficiently than doing that one by one if we preprocess the input string by building a prefix tree or something like that...

1 Comment

Newer versions of text have average case O(m+n) time complexity.
0

The "train wreck" solution:

import Data.List

f txt ws = map freq $ filter isElem $ group $ sort $ words txt where
    isElem (w:_) = w `elem` ws
    freq xs@(x:_) = (length xs, x)

From there, you can go "monadic"

import Data.List
import Control.Monad

f txt ws = do
    xs@(x:_) <- group $ sort $ words txt 
    guard $ x `elem` ws
    return (length xs, x)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.