I'm quite new to Perl and I'm having immense difficulty writing a Perl script that will successfully parse a structured text file.
I have a collection of files that look like this:
name:
	John Smith
occupation:
	Electrician
date of birth:
	2/6/1961
hobbies:
	Boating
	Camping
	Fishing
And so on. The field name is always followed by a colon, and all the data associated with those fields is always indented by a single tab (\t).
I would like to create a hash that will directly associate the field contents with the field name, like this:
$contents{name} = "John Smith";
$contents{hobbies} = "Boating, Camping, Fishing";
Or something along those lines.
So far I've been able to get all the field names into a hash by themselves, but I've not had any luck wrangling the field data into a form that can be nicely stored in a hash. Clearly substituting/splitting on newlines followed by tabs won't work (I've tried, somewhat naively). I've also tried a crude lookahead where I create a duplicate array of the file's lines and use that to figure out where the field boundaries are, but it holds a second copy of the file in memory, which isn't great.
FWIW, currently I'm going through the file line by line, but I'm not entirely convinced that this is the best solution. Is there any way to do this parsing in a straightforward manner?
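For what it's worth, a single line-by-line pass may not need any lookahead at all: remembering the most recent field name and appending each tab-indented line to it could be enough. Below is a minimal sketch under the assumptions stated above (field names end in a colon, data lines start with one tab). The name `parse_record` and the in-memory sample are hypothetical and only there for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read one record from a filehandle and return a hash
# mapping each field name to its values joined with ", ".
sub parse_record {
    my ($fh) = @_;
    my (%values, $field);

    while (my $line = <$fh>) {
        chomp $line;
        $line =~ s/\r\z//;    # tolerate Windows line endings

        if ($line =~ /^\t(.*)/) {
            # Tab-indented line: data for the most recent field name.
            next unless defined $field;    # skip data seen before any field
            push @{ $values{$field} }, $1;
        }
        elsif ($line =~ /^(.+):\s*$/) {
            # Unindented line ending in a colon: start of a new field.
            $field = $1;
            $values{$field} = [];
        }
    }

    # Collapse each list of values into a single comma-separated string.
    my %contents;
    $contents{$_} = join(', ', @{ $values{$_} }) for keys %values;
    return %contents;
}

# Demo on an in-memory copy of the sample (assumed layout).
my $sample = "name:\n\tJohn Smith\noccupation:\n\tElectrician\n"
           . "date of birth:\n\t2/6/1961\n"
           . "hobbies:\n\tBoating\n\tCamping\n\tFishing\n";

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my %contents = parse_record($fh);
close $fh;

print "$_ = $contents{$_}\n" for sort keys %contents;
```

Only the current line and the accumulated values are held in memory, so no duplicate array of lines is needed. If the individual values matter more than the joined string, keeping the array references instead of joining them would be a reasonable variation.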