Regular Expression change

on 15.07.2011 17:42:26 by David.Wagner

I have the following map:

map{[$_,(/^\d/ ? 1 : 0) . /^([^;]+)/,
/[^;]+;[^;]*;[^;]+;[^;]+;([^;]+);/]}

I had a failure during the night because some data fields had a
semicolon in the data. So what I need is a predefined data separator
that would not normally appear in the data. What I have selected and
have been using is ;'; . I was going to make this change until I got
down to this map, and I am unsure how to change ([^;]+) or [^;]+ to use
;'; as the separator of my fields. What I am doing is running reports
and scraping the data, collecting it, and then reformatting it to send
out as emails.

Any thoughts on what could be done?

Thanks for any insights you might have on this...

Wags ;)
David R. Wagner
Senior Programmer Analyst
FedEx Services
1.719.484.2097 Tel
1.719.484.2419 Fax
1.408.623.5963 Cell
http://Fedex.com/us

--
To unsubscribe, e-mail: beginners-unsubscribe@perl.org
For additional commands, e-mail: beginners-help@perl.org
http://learn.perl.org/

Re: Regular Expression change

on 17.07.2011 09:11:57 by rvtol+usenet

On 2011-07-15 17:42, Wagner, David --- Sr Programmer Analyst --- CFS wrote:

> I have the following map:
>
> map{[$_,(/^\d/ ? 1 : 0) . /^([^;]+)/,
> /[^;]+;[^;]*;[^;]+;[^;]+;([^;]+);/]}
>
> I had a failure during the night because some data field(s) had
> a semi-colon in the data. So what I have is a pre-defined data separator
> that would not normally appear in data. What I have selected and have
> been using is ;'; . I was going to do this, until I got down to this
> map and I am unsure how to change ([^;]+) or [^;]+ to have ;'; as the
> separator of my fields.

The easiest way is to pick a single-character separator that doesn't
occur in your data. For example TAB, or $; (the subscript separator
variable), which by default is chr(28).
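
A minimal sketch of the single-character approach, assuming chr(28)
never occurs in the data. The record contents and variable names are
made up for illustration:

```perl
use strict;
use warnings;

# Assumption: chr(28) (the default value of $;) never occurs in the data.
my $sep = chr(28);

# Hypothetical record whose second field contains a plain semicolon.
my @record = ('42', 'Smith; John', 'Denver');

# Writing: join the fields with the separator.
my $line = join $sep, @record;

# Reading: split on the same separator. \Q...\E quotes it for the regex,
# and the -1 limit keeps trailing empty fields.
my @fields = split /\Q$sep\E/, $line, -1;

print scalar(@fields), " fields, second is '$fields[1]'\n";
```

The embedded semicolon survives the round trip because it no longer
has any special meaning to the split.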

Better yet, use a module that guards the data for you, like Text::CSV.
http://search.cpan.org/perldoc?Text::CSV

--
Ruud


Re: Regular Expression change

on 20.07.2011 18:25:55 by Rob Dixon

On 15/07/2011 16:42, David Wagner wrote:
>
> I have the following map:
>
> map{[$_,(/^\d/ ? 1 : 0) . /^([^;]+)/, /[^;]+;[^;]*;[^;]+;[^;]+;([^;]+);/]}
>
> I had a failure during the night because some data field(s) had
> a semi-colon in the data. So what I have is a pre-defined data separator
> that would not normally appear in data. What I have selected and have
> been using is ;'; . I was going to do this, until I got down to this
> map and I am unsure how to change ([^;]+) or [^;]+ to have ;'; as the
> separator of my fields. What I am doing is reports and scraping the
> data, collecting and then reformatting to send out as emails.
>
> Any thoughts on what could be done?
>
> Thanks for any insights you might have on this...
>
> Wags ;)

Hello David.

First of all, setting aside your embedded field separators, may I make
some comments on your code?

- I find it a little impenetrable, and think you could make it more
readable by adding some whitespace.

- The second element of your anonymous array seems a little strange, but
it looks like you want the first field in the data, preceded by '1' or
'0' according to whether it starts with a digit. But your regex is in
scalar context so, instead of extracting the first field, you will get
'1' or '' according to the success of the match. To extract the value of
the field itself you must apply list context - something like

(/^\d/ ? 1 : 0) . (/^([^;]+)/)[0]

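- The regex generating the third field can be written more readably as
follows, provided the second field is never empty (your original
allowed an empty second field with [^;]*):

/ (?: [^;]+ ;){4} ([^;]+); /x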

So as a first improvement I suggest

map { [
    $_,
    (/^\d/ ? 1 : 0) . (/^([^;]+)/)[0],
    / (?: [^;]+ ;){4} ([^;]+); /x
] }
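
The difference between scalar and list context described above can be
checked with a small standalone script. The sample line is an
assumption about the data layout, not taken from the real reports:

```perl
use strict;
use warnings;

# Hypothetical sample line; the real data layout is an assumption.
my $line = '7abc;x;y;z;wanted;rest';

# Scalar context: concatenation makes the match return true (1) or
# false (''), not the captured text.
my $scalar = (($line =~ /^\d/) ? 1 : 0) . ($line =~ /^([^;]+)/);

# List context: the list slice makes the match return its captures,
# and [0] picks the first one.
my $list = (($line =~ /^\d/) ? 1 : 0) . ($line =~ /^([^;]+)/)[0];

print "scalar context: $scalar\n";   # prints 11
print "list context:   $list\n";     # prints 17abc
```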

But I think it would be best to use split rather than regexes to first
separate the data into fields and then manipulate them individually.

map {
    my @fields = split /;/;
    [
        $_,
        ($fields[0] =~ /^\d/ ? 1 : 0) . $fields[0],
        $fields[4]
    ]
}
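
Because split takes a pattern, the same approach should also work with
the three-character ;'; separator from the original question. This is
only a sketch: the record below is invented, and it assumes ;'; never
occurs inside a field:

```perl
use strict;
use warnings;

# Hypothetical record using ;'; as the field separator. Note the plain
# semicolon inside the second field.
my $sep  = q{;';};
my $line = join $sep, '42', 'Smith; John', 'a', 'b', 'wanted', 'tail';

# \Q...\E treats every character of the separator literally, and the
# -1 limit keeps trailing empty fields.
my @fields = split /\Q$sep\E/, $line, -1;

my $row = [
    $line,
    ($fields[0] =~ /^\d/ ? 1 : 0) . $fields[0],
    $fields[4],
];

print "$row->[1] $row->[2]\n";   # prints 142 wanted
```

With this layout there is no need to rewrite each [^;]+ in the
original regexes; the fields are separated once and then addressed by
index.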

Finally, to handle the embedded semicolons properly, simply replace the
split with a call to Text::CSV as Ruud recommends. Without knowing how
your data distinguishes between separators and data I cannot be sure how
this should be coded, but by default the module assumes double-quotes
around fields that must not be split.

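use Text::CSV;

my $csv = Text::CSV->new({sep_char => ';'});

map {
    # Concatenation puts error_diag in scalar context, giving a readable message
    $csv->parse($_) or die "Cannot parse line: " . $csv->error_diag;
    my @fields = $csv->fields;
    [
        $_,
        ($fields[0] =~ /^\d/ ? 1 : 0) . $fields[0],
        $fields[4]
    ]
}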

One last thought - I think map is probably a poor choice in this case,
but I cannot tell from only a fragment of your code. I would prefer to
see a 'foreach' or a 'while' loop iterating over the source data, with
the corresponding translation pushed onto a target array.
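
As a rough sketch of that loop form, with invented input lines and
hypothetical variable names (@lines, @rows), and assuming plain
semicolon-separated data:

```perl
use strict;
use warnings;

# Hypothetical input lines; in the real program these would come from
# the report data.
my @lines = (
    '42;x;y;z;first;',
    'abc;x;y;z;second;',
);

my @rows;
foreach my $line (@lines) {
    my @fields = split /;/, $line;
    next unless @fields >= 5;    # skip lines with too few fields

    push @rows, [
        $line,
        ($fields[0] =~ /^\d/ ? 1 : 0) . $fields[0],
        $fields[4],
    ];
}

print "$_->[1] $_->[2]\n" for @rows;
```

A loop like this leaves room for per-line error handling, such as the
skip on short lines, which is awkward to express inside a map block.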

I hope this helps,

Rob
