While learning Haskell, I'm writing a fuzzy menu. At the moment, my executable reads a 'dictionary' from stdin and ranks each word by how well it fuzzily matches a search pattern given as the first command-line argument. The idea of the fuzzy matching algorithm is to split the pattern at its delimiters, and then match each character against a prefix of a token, accumulating a score that represents the quality of the match.
My main module looks like this:
module Main where

import Data.List
import Fuzzy
import System.Environment
import Util (splitWord) -- Fuzzy only exports score, so splitWord must come from Util

main :: IO ()
main = do
  contents <- getContents
  let dict = lines contents
  args <- getArgs
  let pattern = splitWord (head args)
  let scored = map (\x -> (score (x, pattern), x)) dict
  print (sort scored)
I'm not sure whether or not I'm misusing the do block and/or some I/O primitives here: overall, I think it could be better but I don't know how to change it.
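One direction this could take, as a minimal sketch rather than a definitive answer: keep the ranking in a pure function and let the IO action only check the arguments, read stdin, and print. The names `rankWith`, `placeholderScore`, and `menuMain` are hypothetical. `placeholderScore` is a stand-in so the sketch compiles on its own; the real program would pass `\w -> score (w, splitWord rawPattern)` instead.

```haskell
import Data.List (sort)
import System.Environment (getArgs)
import System.Exit (die)

-- Pure core: pair every word with its score, then sort (score first, then
-- word), mirroring `sort scored` in the original main.
rankWith :: (String -> Float) -> [String] -> [(Float, String)]
rankWith scoreOf = sort . map (\w -> (scoreOf w, w))

-- Stand-in scorer (an assumption for this sketch): counts the characters of
-- the word that occur anywhere in the raw pattern.
placeholderScore :: String -> String -> Float
placeholderScore rawPattern = fromIntegral . length . filter (`elem` rawPattern)

-- IO shell: check the arguments first, then read stdin and print.
-- In the executable this action would be bound as `main = menuMain`.
menuMain :: IO ()
menuMain = do
  args <- getArgs
  case args of
    (rawPattern : _) -> do
      contents <- getContents
      print (rankWith (placeholderScore rawPattern) (lines contents))
    [] -> die "usage: <program> PATTERN"
```

Matching on `args` replaces `head args`, so a missing pattern produces a usage message instead of an exception.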
The Util module looks like this:
module Util
  ( splitWord
  , boolToFloat
  , nextChar
  ) where

splitWord :: String -> [String]
splitWord (l : '_' : r) = splitWord ([l, '-'] ++ r)
splitWord (l : '.' : r) = splitWord ([l, '-'] ++ r)
splitWord (l : ':' : r) = splitWord ([l, '-'] ++ r)
splitWord (l : '-' : r) = [[l]] ++ splitWord r
splitWord (c : [])      = [[c]]
splitWord []            = []
splitWord s = do
  let rest = splitWord (tail s)
  let first = (head s) : (head rest)
  return first ++ tail rest

boolToFloat :: Bool -> Float
boolToFloat True  = 1.0
boolToFloat False = 0.0

nextChar :: [String] -> [String]
nextChar s = case tail (head s) of
  [] -> tail s
  n  -> [n] ++ tail s
Especially in splitWord, I think the code here is somewhat repetitive, and again I don't really know how to make it simpler.
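One way the repetition could be reduced, sketched here rather than offered as a drop-in replacement: treat the four delimiters as a single predicate and split with `break` from the Prelude. The names `isDelimiter` and `splitTokens` are hypothetical. The behaviour also differs from `splitWord` in edge cases: this version drops leading and repeated delimiters, whereas `splitWord "a--b"` yields `["a","-b"]`.

```haskell
-- The delimiters that splitWord handles in separate clauses.
isDelimiter :: Char -> Bool
isDelimiter c = c `elem` "_.:-"

-- Split at every delimiter, discarding the delimiters and any empty tokens.
splitTokens :: String -> [String]
splitTokens s = case break isDelimiter s of
  (token, [])       -> keep token
  (token, _ : rest) -> keep token ++ splitTokens rest
  where
    -- An empty token arises from a leading or repeated delimiter.
    keep token = [token | not (null token)]
```

With this shape, supporting another delimiter means adding one character to the string in `isDelimiter` instead of writing another clause.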
And finally, the Fuzzy module is as follows:
module Fuzzy
  ( score
  ) where

import Util

score :: (String, [String]) -> Float
score ([], _ ) = 0
score (_ , []) = 0
score (s , t ) = boolToFloat (head s == head (head t))
               + max (score (tail s, t) * 0.8) (score (tail s, nextChar t))
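For comparison, the same recurrence can be written with curried arguments and pattern matching in place of `head` and `tail`. This is only a sketch: the name `scoreWord` is hypothetical, the advancing step of `nextChar` is inlined as `advanced`, and the clause for an empty token is an added safeguard for a case `splitWord` does not appear to produce.

```haskell
-- Curried variant of the scoring recurrence: word first, pattern tokens second.
scoreWord :: String -> [String] -> Float
scoreWord [] _ = 0
scoreWord _ [] = 0
-- Safeguard: skip an empty token instead of crashing on it.
scoreWord word ([] : tokens) = scoreWord word tokens
scoreWord (c : cs) tokens@((t : ts) : rest) =
  hit + max (0.8 * scoreWord cs tokens) (scoreWord cs advanced)
  where
    -- 1 when the current word character equals the current token character.
    hit = if c == t then 1 else 0
    -- Same step as nextChar: drop one character, or the whole token if spent.
    advanced
      | null ts   = rest
      | otherwise = ts : rest
```

As a worked case, `scoreWord "fo" ["foo"]` evaluates to 2.0: `'f'` matches and contributes 1, and the better branch then advances to `["oo"]`, where `'o'` matches and contributes another 1.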
This module (and its one function) is the one I have the fewest concerns about. Most of the problems I perceive in my code concern the I/O and the redundancy in splitWord's pattern matching. Thanks for any advice!