csv split, field with embedded comma

on 23.06.2005 20:59:10 by Mark Tutt

I've got what I'm sure is an old problem, but my Perl has gotten rusty
since I originally wrote these scripts.

I have a set of scripts that process various CSV files to generate
reports from a Unix-based POS system. Recently a few sites started
reporting strange results, and after some investigation I discovered
that in one user-defined string field in a few records, someone had
been entering strings that contained commas. With my simplistic split
on commas, this shifted every field after that one over by one,
royally screwing up the reports.
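
To make the failure concrete, here is a minimal sketch using a made-up
record (the real file layout is not shown in this post, so the field
names and values are assumptions):

```perl
use strict;
use warnings;

# Hypothetical record: id, quoted free-text field, amount.
my $line = '1001,"Smith, John",19.99';

# A plain split on commas knows nothing about the quotes,
# so the comma inside the quoted field also acts as a separator.
my @fields = split /,/, $line;

print scalar(@fields), " fields\n";   # 4 fields instead of the intended 3
print "[$_]\n" for @fields;
```

The amount ends up in the fourth slot instead of the third, which is
exactly the one-field shift described above.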

The field in question is enclosed in quotes, so I believe I should be
able to work around this by modifying the split, or with a regex to
substitute out the comma in the quote-delimited field, but a simple
solution is escaping me.

Has anyone got a quick fix?

Thanks!

Re: csv split, field with embedded comma

on 23.06.2005 21:21:54 by Paul Lalli

Mark Tutt wrote:
> I've got what I'm sure is an old problem, but my perl has gotten rusty
> since I originally wrote these scripts.
>
> I have a set of scripts that process various CSV files to generate
> reports from a Unix based POS system. Recently a few sites started
> reporting problems with strange results, and after some investigation
> I discovered that in one user defined string field in a few records,
> someone had been entering strings that contained commas. With my
> simplistic split on commas, this shifted all of the fields over by
> one, royally screwing up the reports.
>
> The field in question is enclosed in quotes, so I believe I should be
> able to work around this by modifying the split or with a regex to
> substitute out the comma in the quote delimited field, but a simple
> solution is escaping me.
>
> Has anyone got a quick fix?

Hello Mark.

This Question is Asked Frequently, and so the answer comes
pre-installed with standard distributions of Perl. You can read it
in the Perl FAQ by typing:

    perldoc -q delimited

at your console.

There are a couple of different suggestions contained therein. One is
to replace your "split" with a relatively complex regexp match. The
other is to use a module: either Text::CSV (from CPAN) or
Text::ParseWords (standard) should do nicely.
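
As a rough sketch of the regexp route, something along these lines
should work. The pattern is in the spirit of the FAQ entry, but check
the output of perldoc -q delimited for the exact version shipped with
your Perl; the sample record is made up:

```perl
use strict;
use warnings;

my $line = '1001,"Smith, John",19.99';   # hypothetical record

my @fields;
# $+ holds whichever capture group matched on this iteration.
push @fields, $+ while $line =~ m{
      "([^"\\]*(?:\\.[^"\\]*)*)",?   # a quoted field, quotes dropped
    | ([^,]+),?                      # an unquoted, non-empty field
    | ,                              # an empty field (pushes undef)
}gx;
# A trailing comma means one more empty field at the end.
push @fields, undef if substr($line, -1, 1) eq ',';

print join('|', @fields), "\n";
```

Note that empty fields come back as undef here, so test for
definedness before printing them under "use warnings".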

Paul Lalli

Re: csv split, field with embedded comma

on 25.06.2005 09:19:52 by Dave Cross

On Thu, 23 Jun 2005 18:59:10 +0000, Mark Tutt wrote:

> I've got what I'm sure is an old problem, but my perl has gotten rusty
> since I originally wrote these scripts.
>
> I have a set of scripts that process various CSV files to generate
> reports from a Unix based POS system. Recently a few sites started
> reporting problems with strange results, and after some investigation
> I discovered that in one user defined string field in a few records,
> someone had been entering strings that contained commas. With my
> simplistic split on commas, this shifted all of the fields over by
> one, royally screwing up the reports.
>
> The field in question is enclosed in quotes, so I believe I should be
> able to work around this by modifying the split or with a regex to
> substitute out the comma in the quote delimited field, but a simple
> solution is escaping me.
>
> Has anyone got a quick fix?

Don't use split. Instead use Text::ParseWords. It's a standard part of the
Perl distribution.
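
A minimal sketch of what that might look like, assuming one record per
line and a made-up sample record:

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

my $line = '1001,"Smith, John",19.99';   # hypothetical record

# Arguments: delimiter pattern, keep-quotes flag, the line itself.
# A false second argument strips the surrounding quotes.
my @fields = parse_line(',', 0, $line);

print join('|', @fields), "\n";   # 1001|Smith, John|19.99
```

One thing to watch for: parse_line also treats single quotes and
backslashes as special, so data containing a stray apostrophe may need
extra care.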

Dave...