Usage of strlen(utf8_decode()) and "/u" regex modifier

on 21.09.2009 02:09:38 by GoForThisWorld

Hello,

As shown below, strlen(utf8_decode()) and the "/u" regex modifier
do not behave the way I expect.

1) What is my misunderstanding?


<?php
$the_string = 'Марина Орлова';
echo "author (85 bytes): $the_string,"
    . strlen( $the_string ) . ','
    . strlen( utf8_decode( $the_string ) ) . ','
    . strlen( utf8_decode( utf8_encode( $the_string ) ) ) . ",\n";
// all the numbers echoed are 85; I expected at least one to be 13


$max_length = 20;
$is_short = preg_match( '/^.{1,$max_length}$/u', utf8_encode( $the_string ) );
// I expect the above to return 1

$max_length = 10;
$is_short = preg_match( '/^.{1,$max_length}$/u', utf8_encode( $the_string ) );
// I expect the above to return 0

?>
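
For comparison, here is a minimal sketch of the counts one would expect if the string really were UTF-8. It also shows one possible (unconfirmed) explanation for the 85: the same text written as HTML numeric character references takes 7 bytes per letter, and 12 letters plus one space come to 85. The $entities value and the fits() helper are hypothetical names introduced for illustration, the file is assumed to be saved as UTF-8, and the mbstring extension is assumed to be loaded. The pattern in the snippet above is single-quoted, so PHP does not substitute $max_length into it; the sketch builds the pattern by concatenation instead.

```php
<?php
// Assumption: this source file is saved as UTF-8, so the literal is UTF-8.
$utf8 = 'Марина Орлова';

// strlen() counts bytes: 12 Cyrillic letters x 2 bytes + 1 space.
echo strlen($utf8), "\n";               // 25
// mb_strlen() counts code points (requires the mbstring extension).
echo mb_strlen($utf8, 'UTF-8'), "\n";   // 13

// Hypothetical: the same text as HTML numeric character references.
// Each reference such as "&#1052;" is 7 ASCII bytes.
$entities = '&#1052;&#1072;&#1088;&#1080;&#1085;&#1072; '
          . '&#1054;&#1088;&#1083;&#1086;&#1074;&#1072;';
echo strlen($entities), "\n";           // 85

// Decoding the references yields the UTF-8 text again.
$decoded = html_entity_decode($entities, ENT_QUOTES, 'UTF-8');
var_dump($decoded === $utf8);           // bool(true)

// Hypothetical helper: true if $s has between 1 and $max code points.
// The pattern is built by concatenation because variables are not
// interpolated inside single-quoted strings.
function fits(string $s, int $max): bool
{
    return preg_match('/^.{1,' . $max . '}$/su', $s) === 1;
}

var_dump(fits($utf8, 20));              // bool(true)
var_dump(fits($utf8, 10));              // bool(false)
```

If the input is already UTF-8, passing it through utf8_encode() first would encode it a second time and change the character count, so the sketch matches against the string directly.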

More generally, given a string $the_string:

2) How can I determine which encoding it uses?

3) How can I determine the number of visible characters?

4) If it has more than N visible characters, how can I
truncate it after N visible characters?
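
One way these three questions could be approached is sketched below, assuming the mbstring extension is loaded and the text is meant to be UTF-8. The helper names (looks_like_utf8, char_count, truncate_chars) are hypothetical. Note that an encoding generally cannot be determined with certainty from the bytes alone; the sketch only checks whether the bytes are valid UTF-8. It also counts code points, which can differ from visible characters when combining marks are present (the intl extension's grapheme_strlen() and grapheme_substr() are an option in that case).

```php
<?php
// Assumptions: mbstring is loaded; this file is saved as UTF-8.

// Question 2 (partial answer): is the byte sequence valid UTF-8?
function looks_like_utf8(string $s): bool
{
    return mb_check_encoding($s, 'UTF-8');
}

// Question 3: number of code points in a UTF-8 string.
function char_count(string $s): int
{
    return mb_strlen($s, 'UTF-8');
}

// Question 4: keep at most the first $n code points.
function truncate_chars(string $s, int $n): string
{
    return mb_substr($s, 0, $n, 'UTF-8');
}

$s = 'Марина Орлова';

var_dump(looks_like_utf8($s));            // bool(true)
var_dump(looks_like_utf8("\xCC\xE0"));    // bool(false): not valid UTF-8
echo char_count($s), "\n";                // 13
echo truncate_chars($s, 6), "\n";         // Марина
```

Passing the encoding explicitly to each mb_* call avoids depending on the configured internal encoding.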

Thanks!


--
PHP General Mailing List (http://www.php.net/)