Extracting attachments from a standalone mailbox

on 23.04.2008 23:41:55 by Howard Kaikow

Both Eudora and Thunderbird store messages in a format that seems to be
compatible with the format used by sendmail.

Is there a standalone program that will process a mailbox directly, outside
of a mail program, and save all the attachments in the mailbox?

If not, which RFC defines the format of mailbox files?

Re: Extracting attachments from a standalone mailbox

on 24.04.2008 00:25:33 by Mark Crispin

On Wed, 23 Apr 2008, Howard Kaikow posted:
> If not, which RFC defines the format of mailbox files?

There is no such RFC; nor is the format of a mailbox file in scope for any
RFC. RFCs describe protocols, not file formats.

The most common mailbox file format is one in which all messages are in a
single file, delimited with lines which start with "From ", e.g.,
From user@example.com Wed Apr 23 15:18:20 2008

There are numerous variations on the format of this "From " line,
particularly in the syntax of the date/time; but this format and its
variations are collectively known as "traditional UNIX mailbox format" or
"mbox format".

Any message text which happens to look like a "From " line should have a
">" character inserted in front of it to prevent that line from looking
like a message delimiter line. Some mail programs do this for all lines
that happen to start with "From ", e.g.,
>From the above text, you can see ...
and others only do it with lines that match the syntax.
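
A minimal Python sketch of the two quoting behaviours just described. The
function names and the FROM_LINE pattern are hypothetical and only for
illustration; since the date syntax varies between programs, the pattern is
deliberately loose and is not taken from any particular mail program.

```python
import re

# Loose guess at a delimiter line: "From ", an address-like token, then
# anything that ends in a four-digit year. This is an assumption, not a spec.
FROM_LINE = re.compile(r"^From \S+ +\S.*\d{4}\s*$")

def quote_all(body: str) -> str:
    """Prefix '>' to every body line that starts with 'From '."""
    return "\n".join(
        ">" + line if line.startswith("From ") else line
        for line in body.split("\n")
    )

def quote_matching(body: str) -> str:
    """Prefix '>' only to body lines that look like a delimiter line."""
    return "\n".join(
        ">" + line if FROM_LINE.match(line) else line
        for line in body.split("\n")
    )
```

A reader that wants the original text back would strip one leading ">" from
lines starting with ">From ", which is only lossless if the writer quoted
consistently.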

There may be documents in various UNIX standards and distributions that
define this format in further detail, but the above more or less says
what you need to know.

Note that there are many other mailbox formats, all of which have their
proponents (and detractors). Not all mailbox formats store all the
messages in a single file; for example a common variation stores each
message in its own file.

Nonetheless, traditional UNIX format is the closest there is to a
"universal format". It is inferior and problematic in many ways, but most
software on most systems is capable of reading it (and converting it to
the preferred format, whatever that is).

-- Mark --

http://staff.washington.edu/mrc
Science does not emerge from voting, party politics, or public debate.
Si vis pacem, para bellum.

Re: Extracting attachments from a standalone mailbox

on 24.04.2008 00:54:32 by Howard Kaikow

Thanx.

Then the question becomes "Where are the mailbox formats defined?"

About 12 years ago, I created a program that generated a mailbox that could be
used either by Eudora or Sendmail.
Now, I've modified that program so the mailbox will work with Thunderbird.

However, none of those include attachments. I might want to add attachments.

At this time, I'm looking for a program that will extract attachments from
an already existing mailbox.
The mailbox would likely be Thunderbird format.

If I can find the spec, I can write the code.

Re: Extracting attachments from a standalone mailbox

on 24.04.2008 02:43:03 by Sam

Howard Kaikow writes:

> Thanx.
>
> Then the question becomes "Where are the mailbox formats defined?"

By whatever program creates or reads the mailbox.

> About 12 years ago, I created a program that generated a mailbox that could be
> used either by Eudora or Sendmail.
> Now, I've modified that program so the mailbox will work with Thunderbird.
>
> However, none of those include attachments. I might want to add attachments.
>
> At this time, I'm looking for a program that will extract attachments from
> an already existing mailbox.

There's no such thing as "attachments" in a given mailbox. Generally, a
mailbox contains messages, and nothing else. Each individual message may or
may not have attachments.

Generally, attachments in E-mail messages are MIME-formatted. Of course, a
given E-mail client may strip off the attachments from each message, and
store them using some other mechanism. It's entirely up to the E-mail
program.

> The mailbox would likely be Thunderbird format.

Thunderbird generally uses the common mbox format, and leaves the individual
messages as MIME-formatted.

> If I can find the spec, I can write the code.

MIME is specified by the following documents:

http://www.faqs.org/rfcs/rfc2045.html
http://www.faqs.org/rfcs/rfc2046.html
http://www.faqs.org/rfcs/rfc2047.html
http://www.faqs.org/rfcs/rfc2048.html
http://www.faqs.org/rfcs/rfc2049.html

There are also other documents that cover other minor parts of message
formatting, such as http://www.faqs.org/rfcs/rfc2231.html, but you probably
don't need to be concerned with those.

So, basically, you begin by extracting individual messages from the mailbox,
then take each individual message and parse its MIME content, then choose
what to do with the message's individual MIME components.
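
The three steps above can be sketched in Python with the standard library's
mailbox and email modules. This is only an illustration under some
assumptions: the function name extract_attachments and the output naming
scheme are made up here, an "attachment" is taken to mean any non-multipart
MIME part that carries a filename, and the mailbox is assumed to be a plain
mbox file.

```python
import mailbox
import os

def extract_attachments(mbox_path, out_dir):
    """Save every MIME part that has a filename; return the paths written."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    # create=False: fail instead of silently creating an empty mailbox.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        # Step 1: pull the individual messages out of the mailbox.
        for msg_no, message in enumerate(box):
            # Step 2: walk the MIME tree of each message.
            for part_no, part in enumerate(message.walk()):
                if part.is_multipart():
                    continue  # containers hold other parts, not data
                filename = part.get_filename()
                if not filename:
                    continue  # assumption: no filename means not an attachment
                # Step 3: undo the transfer encoding (base64 etc.) and save.
                payload = part.get_payload(decode=True)
                if payload is None:
                    continue
                # Never trust a path supplied by the sender.
                safe = os.path.basename(filename.replace("\\", "/")) or "unnamed"
                target = os.path.join(out_dir, f"{msg_no:05d}_{part_no:02d}_{safe}")
                with open(target, "wb") as handle:
                    handle.write(payload)
                saved.append(target)
    finally:
        box.close()
    return saved
```

Calling extract_attachments("Inbox", "attachments") on a copy of a mailbox
file would write one file per attachment, prefixed with the message and part
numbers so that identical filenames do not overwrite each other.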


Re: Extracting attachments from a standalone mailbox

on 24.04.2008 09:02:46 by Howard Kaikow

Thanx.