handling utf8 characters
on 31.08.2006 21:44:17 by tagsense
hello,
I am trying to write a bot to download Wikipedia articles using WWW::Wikipedia, a subclass of LWP::UserAgent. Pages returned by the Wikipedia server contain UTF-8 characters such as LATIN CAPITAL LETTER O WITH DIAERESIS. However, the LWP module does not seem to handle the search results as UTF-8 encoded text: the character Ö is treated as three individual bytes rather than a single character. How do I specify that the LWP user agent must handle UTF-8 characters?
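To make the symptom concrete, here is a minimal sketch of the byte-versus-character difference. It uses only the core Encode module and no network access. The variable names and the hard-coded byte string are assumptions for illustration, not output captured from WWW::Wikipedia. Note that Ö (U+00D6) encodes to two bytes in UTF-8 (0xC3 0x96), so the byte count seen in practice may depend on how the data is displayed.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for a raw response body: the UTF-8 bytes for
# LATIN CAPITAL LETTER O WITH DIAERESIS (U+00D6).
my $bytes = "\xC3\x96";

# Undecoded, Perl treats the string as a sequence of individual bytes.
my $byte_len = length($bytes);

# After decoding, the same data is a single character.
my $chars    = decode('UTF-8', $bytes);
my $char_len = length($chars);

print "bytes: $byte_len, characters: $char_len\n";
printf "code point: U+%04X\n", ord($chars);
```

When a full HTTP::Response object is available, its decoded_content method performs this kind of decoding based on the charset in the Content-Type header. Whether WWW::Wikipedia exposes that response object is not something this sketch assumes.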
thanks in advance,
dave