FAQ 6.5 How do I substitute case insensitively on the LHS while preserving case on the RHS?

FAQ 6.5 How do I substitute case insensitively on the LHS while preserving case on the RHS?

am 13.10.2007 15:03:06 von PerlFAQ Server

This is an excerpt from the latest version perlfaq6.pod, which
comes with the standard Perl distribution. These postings aim to
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

------------------------------------------------------------ --------

6.5: How do I substitute case insensitively on the LHS while preserving case on the RHS?



Here's a lovely Perlish solution by Larry Rosler. It exploits properties
of bitwise xor on ASCII strings.

$_= "this is a TEsT case";

$old = 'test';
$new = 'success';

s{(\Q$old\E)}
{ uc $new | (uc $1 ^ $1) .
(uc(substr $1, -1) ^ substr $1, -1) x
(length($new) - length $1)
}egi;

print;

And here it is as a subroutine, modeled after the above:

sub preserve_case($$) {
my ($old, $new) = @_;
my $mask = uc $old ^ $old;

uc $new | $mask .
substr($mask, -1) x (length($new) - length($old))
}

$a = "this is a TEsT case";
$a =~ s/(test)/preserve_case($1, "success")/egi;
print "$a\n";

This prints:

this is a SUcCESS case

As an alternative, to keep the case of the replacement word if it is
longer than the original, you can use this code, by Jeff Pinyan:

sub preserve_case {
my ($from, $to) = @_;
my ($lf, $lt) = map length, @_;

if ($lt < $lf) { $from = substr $from, 0, $lt }
else { $from .= substr $to, $lf }

return uc $to | ($from ^ uc $from);
}

This changes the sentence to "this is a SUcCess case."

Just to show that C programmers can write C in any programming language,
if you prefer a more C-like solution, the following script makes the
substitution have the same case, letter by letter, as the original. (It
also happens to run about 240% slower than the Perlish solution runs.)
If the substitution has more characters than the string being
substituted, the case of the last character is used for the rest of the
substitution.

# Original by Nathan Torkington, massaged by Jeffrey Friedl
#
sub preserve_case($$)
{
my ($old, $new) = @_;
my ($state) = 0; # 0 = no change; 1 = lc; 2 = uc
my ($i, $oldlen, $newlen, $c) = (0, length($old), length($new));
my ($len) = $oldlen < $newlen ? $oldlen : $newlen;

for ($i = 0; $i < $len; $i++) {
if ($c = substr($old, $i, 1), $c =~ /[\W\d_]/) {
$state = 0;
} elsif (lc $c eq $c) {
substr($new, $i, 1) = lc(substr($new, $i, 1));
$state = 1;
} else {
substr($new, $i, 1) = uc(substr($new, $i, 1));
$state = 2;
}
}
# finish up with any remaining new (for when new is longer than old)
if ($newlen > $oldlen) {
if ($state == 1) {
substr($new, $oldlen) = lc(substr($new, $oldlen));
} elsif ($state == 2) {
substr($new, $oldlen) = uc(substr($new, $oldlen));
}
}
return $new;
}



------------------------------------------------------------ --------

The perlfaq-workers, a group of volunteers, maintain the perlfaq. They
are not necessarily experts in every domain where Perl might show up,
so please include as much information as possible and relevant in any
corrections. The perlfaq-workers also don't have access to every
operating system or platform, so please include relevant details for
corrections to examples that do not work on particular platforms.
Working code is greatly appreciated.

If you'd like to help maintain the perlfaq, see the details in
perlfaq.pod.