Mix of English and Cyrillic Characters

on 02.04.2011 15:56:46 by Barry-Home

Hi,

I am working on a script where I have strings that contain an English
string followed by the Cyrillic translation. For now, I am looking for a
way to strip out the Cyrillic characters and leave the English ones.
I have tried a simple regular expression such as:

$text =~ s/Surname.+/Surname/g;

but it doesn't seem to match.

Any help is appreciated.

Barry


--
To unsubscribe, e-mail: beginners-unsubscribe@perl.org
For additional commands, e-mail: beginners-help@perl.org
http://learn.perl.org/

Re: Mix of English and Cyrillic Characters

on 03.04.2011 05:27:05 by Brian Fraser

I don't really know the first thing about Cyrillic, so you'll probably have
to play around with this before making it work like you want it to. It makes
use of Unicode character properties, which you can start learning from
perluniprops[0]:

$text =~ s/[\p{Cyrillic}\p{Block: Cyrillic}\p{Block: Cyrillic_Extended_A}\p{Block: Cyrillic_Extended_B}\p{Block: Cyrillic_Supplement}]+//g;

Brian.

[0] http://perldoc.perl.org/perluniprops.html
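For anyone who wants to try the idea in isolation, here is a minimal sketch. The sample string and the $english variable are made up for illustration, and it assumes the script itself is saved as UTF-8. It uses only the script property; the block properties in the expression above additionally cover code points assigned to those blocks.

```perl
use strict;
use warnings;
use utf8;                               # this source file contains UTF-8 literals
binmode(STDOUT, ':encoding(UTF-8)');    # encode characters on output

# Hypothetical sample: an English label followed by its Cyrillic translation.
my $text = "Surname Фамилия";

# Remove every run of Cyrillic-script characters, then trim trailing whitespace.
(my $english = $text) =~ s/\p{Script=Cyrillic}+//g;
$english =~ s/\s+\z//;

print "$english\n";
```

The property only matches when $text holds decoded characters rather than raw bytes, so real input may need decoding first.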

Re: Mix of English and Cyrillic Characters

on 04.04.2011 17:31:07 by merlyn

>>>>> "Barry-Home" == Barry-Home writes:

Barry-Home> I am working on a script where I have strings that contain
Barry-Home> an English string followed by the Cyrillic translation.

"perldoc perluniintro" would be a good start, since you're gonna be
knee-deep in unicode issues. And if you have it, "perlunitut" and
"perlunifaq".
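
One of the first issues those documents cover is the difference between bytes and characters. As a rough sketch only, assuming the data arrives as UTF-8 bytes (the thread does not say what encoding is in use, and the sample bytes and variable names here are invented), decoding before matching might look like this:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: the raw UTF-8 bytes of "Surname " followed by a
# Cyrillic word, as they might arrive from a filehandle with no encoding layer.
my $bytes = "Surname \xD0\xA4\xD0\xB0\xD0\xBC\xD0\xB8\xD0\xBB\xD0\xB8\xD1\x8F";

# Turn bytes into characters so that \p{...} sees Cyrillic letters, not bytes.
my $chars = decode('UTF-8', $bytes);

# Drop each Cyrillic run together with any whitespace directly before it.
$chars =~ s/\s*\p{Cyrillic}+//g;

# Encode back to bytes before printing.
print encode('UTF-8', $chars), "\n";
```

If the data is in some other encoding (KOI8-R or Windows-1251, say), the name passed to decode would have to change accordingly.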

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095

Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.posterous.com/ for Smalltalk discussion
