I am trying to write a program that reads a text file, splits the text into a list of words, and then pairs each word with the number of times it occurs in the text. I then need to be able to remove certain words from the list and print the final list.
I have tried different ways to filter Strings out of a list of Strings in Haskell, with no success. The filter function looks like the best fit for what I want to do, but I am not sure how to use it here.
The code I have so far splits the text read from a file into a list of Strings:
toWords :: String -> [String]
toWords s = words s
I then added this to remove specific Strings from the list:
toWords :: String -> [String]
toWords s = words s
toWords s = filter (`elem` "an")
toWords s = filter (`elem` "the")
toWords s = filter (`elem` "for")
I know this is wrong, but I am unsure how to do it correctly. Can anyone help me with this?
Here is my full code so far:
main = do
    contents <- readFile "testFile.txt"
    let lowContents = map toLower contents
    let outStr = countWords (lowContents)
    let finalStr = sortOccurrences (outStr)
    print outStr
-- Counts all the words.
countWords :: String -> [(String, Int)]
countWords fileContents = countOccurrences (toWords fileContents)
-- Splits words.
toWords :: String -> [String]
toWords s = words s
toWords s = filter (`elem` "an")
toWords s = filter (`elem` "the")
toWords s = filter (`elem` "for")
-- Counts how often each string in the given list appears.
countOccurrences :: [String] -> [(String, Int)]
countOccurrences xs = map (\xs -> (head xs, length xs)) . group . sort $ xs
-- Sort list in order of occurrences.
sortOccurrences :: [(String, Int)] -> [(String, Int)]
sortOccurrences sort = sortBy comparing snd
What should toWords do? Please give some sample input and output. The way it is now, toWords = words by the first line in toWords (the other lines are ignored), which makes no sense because then you could use words instead. BTW it won't even compile.
toWords splits a text file into a list of Strings. So 'hello my name is james' would be: [hello, my, name, is, james].
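One possible way to get the behaviour described above is sketched below. It rests on a few assumptions: the words to remove are exactly "an", "the" and "for" from the attempt in the question, and stopWords is a hypothetical name introduced here to hold them. Two things are worth noting about the original attempt. First, a function cannot be given several equations with the same argument pattern to apply them one after another, so the filtering has to be composed with words in a single definition. Second, (`elem` "an") tests whether a single Char occurs in the String "an", whereas the goal here is to test whether a whole word occurs in a list of words, and to keep the ones that do not, hence notElem. The sketch also writes the sort as sortBy (comparing snd), with the parentheses, and imports the library functions it uses.

```haskell
import Data.Char (toLower)
import Data.List (group, sort, sortBy)
import Data.Ord (comparing)

-- Hypothetical name: the words to drop, taken from the attempt in the question.
stopWords :: [String]
stopWords = ["an", "the", "for"]

-- Split on whitespace, then keep only the words that are not stop words.
toWords :: String -> [String]
toWords = filter (`notElem` stopWords) . words

-- Pair each distinct word with the number of times it appears.
-- group never yields an empty list, so the (w:_) pattern always matches.
countOccurrences :: [String] -> [(String, Int)]
countOccurrences xs = [(w, length ws) | ws@(w:_) <- group (sort xs)]

-- Order the pairs by count, smallest count first.
sortOccurrences :: [(String, Int)] -> [(String, Int)]
sortOccurrences = sortBy (comparing snd)

-- Lower-case the text, split and filter it, then count the remaining words.
countWords :: String -> [(String, Int)]
countWords = countOccurrences . toWords . map toLower
```

With these definitions, toWords "hello the name is for an james" gives ["hello","name","is","james"]. In main, the file contents could then be passed through sortOccurrences (countWords contents) and printed. Keeping the stop words in one list means further words can be added without touching toWords.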