Special Character

on 16.11.2006 14:10:56 by David Skyers

Hi,

We have a problem storing data in our Oracle database when the text is
copied from Microsoft Word. A typical example is a hyphen: entered
directly in an input box it is stored fine, but copied from Microsoft
Word it is displayed as a question mark.

Has anyone else experienced this and are there any resolutions?

Thanks

David

Re: Special Character

on 16.11.2006 14:26:25 by Niel Archer

Hi

Is it really a hyphen, or is it an en or em dash? The latter two can
get mangled in some character sets.

Niel

--
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
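
One way to answer that question is to look at the raw bytes PHP receives. The following is a minimal sketch, not part of the original thread; `describe_bytes` is a hypothetical helper name. It assumes the pasted text arrives either as Windows-1252 (where the en dash is the single byte 0x96) or as UTF-8 (where it is the three bytes e2 80 93).

```php
<?php
// Hypothetical helper: show each byte of a string as hex so that a plain
// hyphen can be told apart from an en or em dash.
function describe_bytes(string $input): string
{
    // bin2hex() yields two hex digits per byte; split them into pairs.
    return implode(' ', str_split(bin2hex($input), 2));
}

echo describe_bytes("The -"), "\n";            // 54 68 65 20 2d
echo describe_bytes("The \x96"), "\n";         // 54 68 65 20 96
echo describe_bytes("The \xE2\x80\x93"), "\n"; // 54 68 65 20 e2 80 93
```

If the last byte is 2d, the character really is a hyphen; anything else points to a dash that Word substituted.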

RE: Special Character

on 16.11.2006 15:27:26 by David Skyers

Hi Niel

That's the problem: I don't know what it is, but I do know that I press
the hyphen/minus key on my keyboard. If I type 'The -' in Word, press
Enter, and then copy and paste the text from Word, it gets displayed as
a question mark in Oracle.

David


Re: Special Character

on 16.11.2006 15:45:40 by Niel Archer

Hi David

What you describe sounds like Word auto-replacing hyphens with either
en or em dashes. This is a configurable option in Word that is often on
by default. Try typing double quotes: if they get switched to 66- and
99-style (curly) quotes, then that is likely the problem. I no longer
use MS Office, for these and other reasons, so I cannot tell you how to
switch off this formatting, but it can be switched off somewhere within
it.

The only other option I can think of would be to change your DB
character set to one that can accept these extended characters. That
might also mean changing some of Windows' or Word's behaviour (to use
UTF-8, for example).

Niel
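
As a rough illustration of the second option, the sketch below converts pasted text to UTF-8 before it is stored. It is not from the thread: `cp1252_to_utf8` is a hypothetical helper, and it assumes the browser submits the form as Windows-1252, which would need to be confirmed for a real application.

```php
<?php
// Hypothetical helper: convert text assumed to be Windows-1252 into UTF-8.
function cp1252_to_utf8(string $input): string
{
    $converted = iconv('Windows-1252', 'UTF-8', $input);
    if ($converted === false) {
        // iconv() returns false for bytes that Windows-1252 does not define.
        throw new RuntimeException('Input could not be converted to UTF-8');
    }
    return $converted;
}

// 0x97 is the em dash in Windows-1252; in UTF-8 it becomes e2 80 94.
$pasted = "The \x97 end";
echo bin2hex(cp1252_to_utf8($pasted)), "\n";
```

The converted text only survives if the database and the Oracle client session also use a Unicode character set. With the OCI8 extension, the fourth argument of `oci_connect()` names the client character set (for example `AL32UTF8`).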


Re: Special Character

on 16.11.2006 16:12:59 by Dan Shirah

To turn off the auto formatting of hyphens:

In Microsoft Word 2003:

Open a new document.
Go to Tools > AutoCorrect Options.
Select the "AutoFormat As You Type" tab.
Deselect the "Hyphens (--) with Dash (-)" option.

Even though it says it will replace a double hyphen (--) with a dash
(an em dash), it also does the same thing for a single hyphen, depending
on the sentence structure.

Hope this helps!

Dan



RE: Special Character

on 16.11.2006 16:32:24 by David Skyers

Thanks,

The problem is that we will have hundreds of users using Microsoft Word,
and we cannot switch it off for all of them. Ideally I need some type of
string-replace function, so that no matter what they enter, it gets
trapped and replaced.

It's not the normal hyphens that cause a problem but the long hyphens.

Regards

David


Re: Special Character

on 16.11.2006 16:56:36 by Niel Archer

Hi David

Off the top of my head, the best I can suggest is using PHP's multibyte
functions (look up mb_string) to recognise these characters and then
convert them to hyphens.

Niel


Re: Special Character

on 16.11.2006 17:01:31 by Niel Archer

Hi

Doh... that's 'mbstring'

Niel
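
A minimal sketch of that idea, assuming the input reaches PHP as UTF-8 and that the mbstring extension is loaded. The function name `dashes_to_hyphens` is made up for illustration; mbstring is used only for the recognition step, and the replacement itself is a plain byte-sequence swap.

```php
<?php
// Hypothetical helper: replace en and em dashes in UTF-8 text with hyphens.
function dashes_to_hyphens(string $input): string
{
    // Recognition step: reject anything that is not valid UTF-8, because
    // the byte sequences below only mean "dash" in that encoding.
    if (!mb_check_encoding($input, 'UTF-8')) {
        throw new InvalidArgumentException('Expected UTF-8 input');
    }

    $map = [
        "\xE2\x80\x93" => '-', // U+2013 EN DASH
        "\xE2\x80\x94" => '-', // U+2014 EM DASH
    ];

    // strtr() with an array replaces every occurrence of each key.
    return strtr($input, $map);
}

echo dashes_to_hyphens("pages 10\xE2\x80\x9320"), "\n"; // pages 10-20
```

Input in another encoding, such as Windows-1252, would need converting first; the sketch deliberately throws rather than guessing.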


Re: Special Character

on 16.11.2006 17:44:58 by Kevin Murphy

A solution I use is to apply this function to all POST data collected.
The key here for your problem is chr(150) and chr(151) (the en and em
dash in Windows-1252), which are replaced with one and two normal
hyphens respectively. This also takes care of MS Word's Smart Quotes.
If there are other MS characters you need to convert, just add their
numeric codes to the pattern array, and then add the matching
replacement at the same position in the replacements array.

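function sanitize($data)
{
    // Windows-1252 bytes: 145/146 curly single quotes, 147/148 curly
    // double quotes, 150 en dash, 151 em dash.
    $pattern = array(chr(145), chr(146), chr(147), chr(148), chr(150), chr(151));
    $replacements = array("'", "'", '"', '"', '-', '--');
    $data = str_replace($pattern, $replacements, $data);
    $data = trim($data);
    // Collapse runs of spaces into a single space.
    $data = preg_replace("/ +/", " ", $data);
    // Escape quotes and backslashes before the value is used in a query.
    $data = addslashes($data);
    return $data;
}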
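
One caveat about the last step, given that the target here is Oracle: `addslashes()` escapes a quote with a backslash, whereas an Oracle string literal escapes a single quote by doubling it. The sketch below is not from the thread. It shows what `addslashes()` produces and, as an alternative, a hypothetical `insert_title` routine that passes the value as a bind variable so no manual escaping is needed. The table and column names are invented, and the routine needs the OCI8 extension and an open connection to run.

```php
<?php
// addslashes() prefixes the quote with a backslash, which Oracle does not
// treat as an escape character inside a string literal.
echo addslashes("l'Age"), "\n"; // l\'Age

// Hypothetical insert routine using a bind variable. The table "titles"
// and column "title" are made up for illustration only.
function insert_title($connection, string $title): bool
{
    $statement = oci_parse(
        $connection,
        'INSERT INTO titles (title) VALUES (:title)'
    );
    // The value travels separately from the SQL text, so quotes in it
    // cannot break the statement.
    oci_bind_by_name($statement, ':title', $title);
    $ok = oci_execute($statement);
    oci_free_statement($statement);
    return $ok;
}
```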



--
Kevin Murphy
Webmaster: Information and Marketing Services
Western Nevada Community College
www.wncc.edu
775-445-3326



RE: Special Character

on 22.11.2006 11:41:23 by David Skyers

Hi Kevin,

Thanks for this, it got rid of my special characters. However, I now
have a problem with foreign characters.

Example:

'De l'Âge du fer au haut Moyen Âge.'

gets inserted into Oracle as

De l'&Acirc;ge du fer au haut Moyen &Acirc;ge.

I have an Oracle procedure that inserts the data. If I run the
procedure directly in Oracle, it inserts the special characters fine.
The problem seems to be with the way PHP executes the procedure.

Any ideas?

David


Re: Special Character

on 22.11.2006 22:17:12 by Chris

Are you calling htmlentities or htmlspecialchars before calling the
procedure? That looks like what's happening.

--
Postgresql & php tutorials
http://www.designmagick.com/
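
For reference, a small sketch of the effect being described. It is not from the thread, and it writes the accented letter as explicit UTF-8 bytes so the result does not depend on how the source file is saved.

```php
<?php
// "Âge" spelled with explicit UTF-8 bytes (c3 82 is the letter Â).
$word = "\xC3\x82ge";

// htmlentities() replaces the accented letter with a named HTML entity.
$encoded = htmlentities($word, ENT_QUOTES, 'UTF-8');
echo $encoded, "\n"; // &Acirc;ge

// html_entity_decode() reverses the conversion.
$decoded = html_entity_decode($encoded, ENT_QUOTES, 'UTF-8');
echo bin2hex($decoded), "\n"; // c3826765
```

If the entity form is what ends up in the table, the conversion is happening somewhere before the insert.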


RE: Special Character

on 23.11.2006 07:39:00 by Vincent DUPONT

If you use a SQL adapter, like ADO, it could be applying htmlentities
itself...

However, if you would like to change this, note that Oracle treats some
characters as 'special characters', such as (if I remember correctly)
the &. To insert a string containing these characters, you need to
search for them and replace them with their ASCII value (using the
Oracle function chr()). This is especially true when inserting URLs.

vincent



RE: Special Character

on 23.11.2006 10:29:24 by David Skyers

Hi Chris,

Yes, I was using htmlentities. I have now removed that, and now

'De l'Âge du fer au haut Moyen Âge.'

gets inserted into Oracle as

'De l'Bge du fer au haut Moyen Bge.'

Any ideas?

David
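
One possible reading of the 'B', offered as an assumption rather than a diagnosis: in ISO-8859-1 the letter Â is the byte 0xC2, and clearing its top bit leaves 0x42, which is 'B'. That is what would be expected if the text were squeezed into a 7-bit character set somewhere between PHP and the database. The sketch below only demonstrates the arithmetic; `strip_high_bits` is a hypothetical name.

```php
<?php
// Hypothetical helper: clear the top bit of every byte, mimicking a
// conversion to a 7-bit character set.
function strip_high_bits(string $input): string
{
    $output = '';
    for ($i = 0, $n = strlen($input); $i < $n; $i++) {
        // Keep only the low 7 bits of each byte.
        $output .= chr(ord($input[$i]) & 0x7F);
    }
    return $output;
}

// \xC2 is "Â" in ISO-8859-1.
echo strip_high_bits("De l'\xC2ge du fer"), "\n"; // De l'Bge du fer
```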

RE: Special Character

on 23.11.2006 10:53:45 by Vincent DUPONT

Hi,

You may have a mismatch between a UTF-8 web server and an ASCII (Latin
or whatever) database. Do you know the default_charset of your web
server (check php.ini), the charset of your database, and the charset
of your Oracle client? All three play a role.

This: 'De l'Bge du fer au haut Moyen Bge.' really looks like a
UTF-8/Unicode character being inserted into an ASCII (Latin, ...)
database.

I think the best solution would be to keep htmlentities when storing
the characters in the database, no? That would ensure you insert only
'simple' characters, at least as long as you do not enter Chinese or
other foreign characters on your website.

vincent
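
The three settings mentioned above could be collected roughly as follows. This is a sketch, not from the thread: `php_side_charsets` is a hypothetical helper, and it assumes the Oracle client character set is configured through the `NLS_LANG` environment variable.

```php
<?php
// Hypothetical diagnostic: report the character-set settings visible to PHP.
function php_side_charsets(): array
{
    return [
        // Charset PHP sends in its default Content-Type header.
        'default_charset' => ini_get('default_charset'),
        // Oracle client setting, in the form LANGUAGE_TERRITORY.CHARSET;
        // getenv() returns false when the variable is not set.
        'NLS_LANG' => getenv('NLS_LANG'),
    ];
}

// Query for the database character set, to be run with any Oracle client.
const DB_CHARSET_SQL =
    "SELECT value FROM nls_database_parameters WHERE parameter = 'NLS_CHARACTERSET'";

print_r(php_side_charsets());
echo DB_CHARSET_SQL, "\n";
```

Comparing the three values shows where a conversion between character sets takes place.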


