I am trying to recover the attachments from an email file that was salvaged after my hard disk crashed. The file is essentially a concatenation of messages, including the attachments in base-64 encoding.
One search here suggested that munpack could be used to convert a block of text to the relevant file, and in a test that worked fine. But it doesn't seem able to take the entire mess of a file and go through it, extracting blocks as it comes across them. That is what I want to figure out how to do.
This post, "Extract text between two specific lines", seemed to suggest a way to pull out the relevant blocks of base-64 encoded text and put them in a file.
My attempt was to use this line
cat test.txt | sed -n "/Content-Type: image/,/--=/p" > test2.txt
and then to run munpack on test2.txt.
Depending on the trailing delimiter I use (-- or --= or --=_), I get either only the first image in the file or just a duplicate of test2.txt.
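For what it's worth, sed prints every matching range into the same output file, so all the image blocks end up back to back in test2.txt. A minimal sketch of one alternative is to have awk write each base-64 body to its own file and then decode each one with base64. It assumes that every attachment starts at a "Content-Type: image" line, that its headers end at the first blank line, and that the body ends at the next line starting with "--". The function name split_images, the part_N file names, and the sample input are made up for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so the sketch does not touch real files.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Hypothetical helper: write the base-64 body of every image part found in
# the file "$1" to part_1.b64, part_2.b64, ... in the current directory.
split_images() {
    awk '
        { sub(/\r$/, "") }                    # tolerate CRLF line endings
        /^Content-Type: image/ {              # a new image part starts here
            if (out != "") close(out)
            n++; out = "part_" n ".b64"; state = 1
            next
        }
        state && /^--/ { state = 0; next }    # a boundary line ends the part
        state == 1 { if ($0 == "") state = 2; next }   # skip the part headers
        state == 2 { print > out }            # base-64 body line
    ' "$1"
}

# Tiny stand-in for the recovered file (two fake "images").
cat > sample.txt <<'EOF'
From: someone
--=_abc
Content-Type: image/png; name="a.png"
Content-Transfer-Encoding: base64

aGVsbG8=
--=_abc
Content-Type: image/gif
Content-Transfer-Encoding: base64

d29y
bGQ=
--=_abc--
EOF

split_images sample.txt

# Decode every extracted block (base64 -d as in GNU coreutils).
for f in part_*.b64; do
    base64 -d "$f" > "${f%.b64}.bin"
done
```

The sketch ignores the file name and image type given in the part headers, so the decoded files would still need to be renamed by hand (or identified with the file command).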
Other searches here, as well as Google searches, have turned up nothing.
I would think someone else has come across this problem before. Can anyone point me to the solution?
TIA,
Matt
PS I tried importing the file into Thunderbird and it comes in as only one message. So that approach has already been tried and fails miserably.
Update 1: This is a kludge, admittedly, but it kinda-sorta works:
cat test.txt | sed -n '/^Content-Type: image/,/--.*/ p' > test1.txt
cat test1.txt | sed 's/--.*/--_31415927/' > test2.txt
cat emailfmt2.eml test2.txt > test_images.eml
Where:
test.txt is the original text file with emails embedded,
"_31415927" is a customized boundary,
emailfmt2.eml is the header of an email:
Date: Sun, 01 Jul 2027 00:00:00 +0000
Subject: Test format
From: [email protected]
To: [email protected]
Content-Type: multipart/mixed; boundary="_31415927"

--_31415927
Content-Type: text/plain; charset="UTF-8"; format=flowed; delsp=yes

test format...
--_31415927
So what I'm doing here is (ideally) collecting everything between "image" tags and then replacing the different boundary markers with my customized one. That result is then appended to a template for an email (emailfmt2.eml).
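The three commands could probably be collapsed into a single pass. The following is only a sketch under a few assumptions: boundary lines start with "--" (so the pattern is anchored with ^, unlike the commands above), the last extracted line is a boundary, and the files test.txt and emailfmt2.eml are replaced here by tiny made-up stand-ins. It also turns the final boundary into "--_31415927--", since a multipart body is supposed to end with the boundary followed by two dashes.

```shell
#!/bin/sh
# Work in a scratch directory so the sketch does not touch real files.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Stand-in for the email template (the real one is emailfmt2.eml).
cat > emailfmt2.eml <<'EOF'
Subject: Test format
Content-Type: multipart/mixed; boundary="_31415927"

--_31415927
Content-Type: text/plain; charset="UTF-8"

test format...
--_31415927
EOF

# Stand-in for the recovered file (the real one is test.txt).
cat > test.txt <<'EOF'
Some message text
--=_abc
Content-Type: image/png; name="a.png"
Content-Transfer-Encoding: base64

aGVsbG8=
--=_abc
Content-Type: text/plain

ignored text
--=_abc--
EOF

{
    cat emailfmt2.eml
    # Print each image part and rewrite its closing boundary in one step.
    sed -n '/^Content-Type: image/,/^--/{
        s/^--.*/--_31415927/
        p
    }' test.txt
} | sed '$ s/^--_31415927$/&--/' > test_images.eml   # close the multipart body
```

If the recovered file is truncated in the middle of an attachment, the last line will not be a boundary and the closing marker would have to be appended by hand.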
When the resulting .eml file is brought into Thunderbird, it automatically parses everything into attachments, which I can at least deal with in T-bird. (As an aside, it looks like T-bird doesn't display GIF files, or at least I don't have it set up to deal with them nicely.)
I haven't tried figuring out how munpack (could?) work, but it's a start.