The Table Of Contents for Windows CHM files.

on 27.06.2011 18:29:29 by Richard Quadling

Hello all.

The Windows CHM files are built weekly, and all output from the various
processes is logged accordingly.

When building the fa (Persian) and ro (Romanian) translations, there are
some issues with the code that generates the table of contents for the
CHM file.

In essence the UTF-8 text being incorporated into the TOC has to be
translated to a particular codepage (Windows-1254 for fa and
Windows-1250 for ro).

Unfortunately, these codepages don't contain all the symbols required
for the original text and a log entry is created saying ...

"[08:45:59 - E_NOTICE ]
C:\pear\phd-trunk\phpdotnet\phd\Package\PHP\CHM.php:263
iconv(): Detected an illegal character in input string".

This is as expected based upon
http://docs.php.net/manual/en/function.iconv.php (out_charset
parameter) ...

"The output charset.

If you append the string //TRANSLIT to out_charset transliteration is
activated. This means that when a character can't be represented in
the target charset, it can be approximated through one or several
similarly looking characters. If you append the string //IGNORE,
characters that cannot be represented in the target charset are
silently discarded. Otherwise, str is cut from the first illegal
character and an E_NOTICE is generated."
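
To make the three behaviours concrete, here is a minimal sketch of the
same idea. It is written in Python rather than PHP purely so that the
result does not depend on the local iconv implementation; it is not the
PhD code. The sample title, the function names and the approximation
table are all assumptions made for illustration. Romanian text that uses
the comma-below letters (U+0219, U+021B) cannot be encoded as
Windows-1250, which only has the cedilla forms.

```python
# Illustrative only: PhD itself calls PHP's iconv(), not this code.
# Sample Romanian title using comma-below letters U+021B and U+0219.
TITLE = "Func\u021bii \u0219i \u0219iruri"

# Hypothetical approximation table (a stand-in for //TRANSLIT):
# comma-below letters mapped to the cedilla forms that Windows-1250 has.
APPROXIMATIONS = str.maketrans({
    "\u0218": "\u015e", "\u0219": "\u015f",
    "\u021a": "\u0162", "\u021b": "\u0163",
})


def convert_strict(text, codepage):
    """Return (bytes, None) on success, or (None, index of first bad char)."""
    try:
        return text.encode(codepage), None
    except UnicodeEncodeError as exc:
        return None, exc.start


def convert_ignore(text, codepage):
    """Drop characters the codepage cannot represent (like //IGNORE)."""
    return text.encode(codepage, errors="ignore")


def convert_approximate(text, codepage):
    """Substitute look-alike characters first, then drop whatever is left."""
    return text.translate(APPROXIMATIONS).encode(codepage, errors="ignore")


strict_result, first_bad = convert_strict(TITLE, "cp1250")
ignored = convert_ignore(TITLE, "cp1250").decode("cp1250")
approximated = convert_approximate(TITLE, "cp1250").decode("cp1250")
```

With this input the strict conversion fails at the first comma-below
letter, the ignoring variant silently loses letters, and only the
approximating variant keeps the title readable.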


I think the codepage for fa is wrong (based upon
http://msdn.microsoft.com/en-us/goglobal/bb896001.aspx) and that it
should be Windows-1256, but switching to it made no difference to the
notices in the output.
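
One way to compare candidate codepages is to list the characters in a
title that each one cannot represent. The helper below is a hypothetical
diagnostic sketch in Python (the function name and the sample word are
assumptions, and it is not part of PhD); it relies only on the codepage
tables shipped with Python's standard codecs.

```python
def unrepresentable(text, codepage):
    """Return the distinct characters of text that codepage cannot encode,
    in order of first appearance."""
    bad = []
    for ch in text:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            if ch not in bad:
                bad.append(ch)
    return bad


# Sample Persian word (five distinct letters), written with escapes.
SAMPLE_FA = "\u0641\u0627\u0631\u0633\u06cc"

missing_1254 = unrepresentable(SAMPLE_FA, "cp1254")  # Turkish codepage
missing_1256 = unrepresentable(SAMPLE_FA, "cp1256")  # Arabic codepage
```

Running this over the real TOC titles would show whether Windows-1256
covers everything, or whether a few characters still fall outside it and
keep triggering the notice.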

Is there a more appropriate codepage to be used? (I don't think so)

Should we be using //TRANSLIT//IGNORE? (I don't think so, but we may
have to.)

Or, for these 2 languages, should we use the English data (purely for
the CHM TOC, search and index tabs)?
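
The English-fallback option could look roughly like the following. This
is a sketch in Python under stated assumptions, not the PhD
implementation: the function names and the idea of passing two parallel
lists of titles are hypothetical.

```python
def can_encode(text, codepage):
    """True if every character of text exists in the codepage."""
    try:
        text.encode(codepage)
    except UnicodeEncodeError:
        return False
    return True


def choose_toc_titles(translated, english, codepage):
    """Use the translated titles only if all of them fit the codepage;
    otherwise fall back to the English titles for the whole TOC."""
    if all(can_encode(title, codepage) for title in translated):
        return translated
    return english
```

Falling back for the whole list rather than per entry avoids a TOC that
mixes two languages, at the cost of losing translated titles that would
have converted cleanly.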

If you have a CHM file in your native language (for any purpose) whose
TOC displays that language correctly, then I could examine how it was
put together and make appropriate corrections (if possible).

As things stand though, it looks like the Windows CHM viewer is limited
in which codepages it supports for the TOC, search and index.

Richard.

--
Richard Quadling
Twitter : EE : Zend : PHPDoc
@RQuadling : e-e.com/M_248814.html : bit.ly/9O8vFY : bit.ly/lFnVea