procmail conversion utf-8 to ISO-8859-1 with iconv

on 16.06.2006 16:49:24 by raffe

Hi,

I have a problem with my procmail conversion. Emails sent from an HP
printer-scanner arrive encoded in UTF-8. I need to convert all of them
to ISO-8859-1, so I found the procmail recipe shown at the bottom of
this message.
But it doesn't work: my emails stay in UTF-8. When I run iconv from the
command line like this:

iconv --from=UTF-8 --to=ISO-8859-1 emailfile1 > emailfile-convert

I get the same result: no conversion.

Does anyone have an idea?

Thanks in advance,

Raphael


PATH=/bin:/usr/bin:/usr/local/bin
LOGFILE=/var/log/procmail.log

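# Convert UTF-8 mail to Latin-1 (ISO-8859-1).
# Match messages whose top-level Content-Type declares UTF-8 text.
:0
* ^Content-Type: text/(plain|html); .*charset=.?utf-8
{
# Filter the body (b) through iconv.
:0 fbw
|iconv -f UTF-8 -t ISO-8859-1//TRANSLIT

# Rewrite the header (h) so the declared charset matches, for text/plain ...
:0 fhw
* ^Content-Type: text/plain
|formail -c -i "Content-Type: text/plain; charset=ISO-8859-1"

# ... or else (E), for text/html.
:0 Efhw
* ^Content-Type: text/html
|formail -c -i "Content-Type: text/html; charset=ISO-8859-1"
}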

Re: procmail conversion utf-8 to ISO-8859-1 with iconv

on 16.06.2006 23:04:03 by Sam

RaFFe writes:

> But it doesn't work: my emails stay in UTF-8. When I run iconv from the
> command line like this:
>
> iconv --from=UTF-8 --to=ISO-8859-1 emailfile1 > emailfile-convert
>
> I get the same result: no conversion.
>
> Does anyone have an idea?

iconv does not understand the quoted-printable MIME encoding that UTF-8
messages are likely to use. You cannot do this with iconv and procmail
alone.
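
The effect can be reproduced in a few lines. The following is only an
illustrative sketch in Python (the language and the sample word "café" are
assumptions, not something taken from the original message). It shows that
a quoted-printable body is plain ASCII on the wire, so a UTF-8 to
ISO-8859-1 transcoding has nothing to change until the transfer encoding
is undone.

```python
import quopri

# What a mail client typically puts on the wire for the UTF-8 text "café".
wire = quopri.encodestring("café".encode("utf-8"))
print(wire)  # b'caf=C3=A9'

# The encoded body is pure ASCII, so transcoding it from UTF-8 to
# ISO-8859-1 (which is what iconv does) returns the same bytes.
transcoded = wire.decode("utf-8").encode("iso-8859-1")
unchanged = transcoded == wire
print(unchanged)  # True

# Undoing quoted-printable first exposes the real UTF-8 bytes,
# which can then be transcoded.
latin1 = quopri.decodestring(wire).decode("utf-8").encode("iso-8859-1")
print(latin1)  # b'caf\xe9'
```

This would explain why both the recipe and the command-line test appear to
do nothing, assuming the scanner's messages are quoted-printable (base64
bodies would behave the same way).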

You will need to write a custom script that actually parses MIME headers
properly, instead of doing a stupid plain-text search for the "Content-Type"
header. Your parser will need to examine each MIME section individually,
select each MIME section that you want to transcode to a different character
set, decode the MIME section, if it uses quoted-printable or base64
encoding, perform the transcoding from UTF-8 to ISO-8859-1, reencode the
MIME section back to quoted-printable, if you so choose, and reassemble the
MIME-formatted message.
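
The steps above could be sketched roughly as follows, using the `email`
package from the Python standard library. This is a minimal sketch under
several assumptions, not a tested production filter: the function name
`transcode_to_latin1` is hypothetical, only text parts that declare UTF-8
are touched, characters with no ISO-8859-1 equivalent are replaced with
"?", and the result is re-encoded as quoted-printable (the library's
default body encoding for ISO-8859-1).

```python
from email import message_from_bytes


def transcode_to_latin1(raw: bytes) -> bytes:
    """Return the message with its UTF-8 text parts converted to ISO-8859-1."""
    msg = message_from_bytes(raw)

    # walk() visits the message itself and every nested MIME section.
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue  # skip multipart containers, images, attachments, ...
        if part.get_content_charset() not in ("utf-8", "utf8"):
            continue  # only transcode sections that declare UTF-8

        # decode=True undoes quoted-printable or base64, giving raw UTF-8.
        payload = part.get_payload(decode=True)
        text = payload.decode("utf-8", errors="replace")

        # Replace characters that ISO-8859-1 cannot represent with "?".
        safe = text.encode("iso-8859-1", errors="replace").decode("iso-8859-1")

        # Drop the old transfer encoding so set_payload() picks a new one
        # and updates the charset parameter of Content-Type.
        del part["Content-Transfer-Encoding"]
        part.set_payload(safe, charset="iso-8859-1")

    return msg.as_bytes()
```

To use something like this from procmail, one would wrap it in a small
script that reads the message from `sys.stdin.buffer`, writes the result to
`sys.stdout.buffer`, and is called from a filtering recipe (`:0 fw`). Note
that rewriting a signed part invalidates its signature.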

You might, just might get away with reencoding the entire message from
quoted-printable to 8bit, do a stupid, brute-force, raw transcoding of the
entire blob from UTF-8 to ISO-8859-1, manually hack the headers, then cross
your fingers and pray that the end result ends up vaguely resembling a MIME
message.


