changing $/ new line default character

on 18.12.2006 17:03:39 by hollyhawkins

I have a script that reads every file in a directory and writes the
contents of each one to a single output file. The input files are HL7
messages. The segments in each file used to be separated by hex "0D",
but now they are separated by hex "0A". As a result, the Perl script
reads each segment as a separate line and writes hex "0D0A" at the end
of it.


=> QUESTION: How can I read in each individual file with all the data,
and write it out as one chunk, as opposed to reading each segment
ending with x"0A" and writing out each line to the output? Any help
would be greatly appreciated.

#!/usr/local/bin/perl

#This script will read a directory containing individual records, write their
#data contents to a single output file for input to elink. This was developed
#for the transactions from QUEST to Logician. Sept 2005 Holly Hawkins

#define the file directory paths

$datainpath = "C:\\YNHH_Files\\Quest\\Quest_IN";
$dataoutpath = "C:\\YNHH_Files\\Quest\\Quest_OUT";
$tempdir = "C:\\YNHH_Files\\Quest\\temp";
$archivedir = "C:\\YNHH_Files\\Quest\\archive";

#open the directory that has the input files

opendir THISDIR, "$datainpath" or die "Serious Error: $!";

#read the names of the individual files into an array "@allfiles"

@allfiles = grep !/^\.\.?$/, readdir THISDIR;

closedir THISDIR;
# the following line of code was entered by REV to prevent this script
# from running if the dummyrec is the only file existent in the directory.
if (@allfiles <= 1) {exit}

#print "size of array: " . @allfiles . ".\n";

#writes the names from the directory to a file in tempdir

open RECORDNAMES, ">$tempdir\\recnames.txt" or die "Serious Error: $!";

foreach $allfiles (@allfiles) {
# if ($allfiles != '99999dummyrec.txt')
{print RECORDNAMES "$allfiles\n"};
# print "$allfiles\n";
}

#fileout has the names of the records to be deleted
close RECORDNAMES;


# now take the record names in RECORDNAMES,
#and write the contents of the each of the records to another file.

#open dataout.txt for the data from the records
#the >> will open the file if it does not exist, or append to it
#if it is there already

#reopen filenames file as input

open (RECORDNAMES, "$tempdir\\recnames.txt") or die "Serious Error: $!";
open (OUTFILE, ">>$dataoutpath\\questout.txt") or die "cannot open questout.txt.\n";

#write to the dataout.txt file

select (OUTFILE);

# Read the file of record names
# open each file
#write contents of file to the dataout.txt OUTFILE
while (<RECORDNAMES>) {
$filename = "$datainpath\\$_";
open (X, "$filename");
while (<X>)
### THIS IS WHERE I GET INTO TROUBLE NOW, EACH RECORD IS TREATED AS A 'FILE'
### I HAVE A SAMPLE INPUT FILE AT THE END OF THIS
# {if ($filename != "$datainpath\\99999dummyrec.txt")
{print "\x0B$_\x1C\x0D"};


# }
}
# just wrote all the contents of the files, close the output file

HERE IS AN INPUT FILE: (it seems the editor inserts an extra x"0D"
after the x"0A" - the input file the script is reading just has the
x"0A")
MSH|^~\&|LAB|QWA||226964|2006REC 1
10449||ORU^R01|20061208578891130000|P|2.3|||||||
PID|1|1923975|VD441550||PESTANA^ALMA^G||19500506|F|||||||||| 2269640000212|047441174||||||||||||
NTE|1|TX|NON-FASTING |

Re: changing $/ new line default character

on 18.12.2006 18:42:19 by Paul Lalli

hollyhawkins wrote:

.... a question multi-posted to at least comp.lang.perl.misc, possibly
others.
http://groups.google.com/group/comp.lang.perl.misc/browse_frm/thread/92940372814d8fe9/d1af7babd1ce3eee#d1af7babd1ce3eee

Please do not multi-post!!

If you *need* to post to more than one group, cross-post. If you don't
know the difference between the two, look up those two terms.

Paul Lalli

Re: changing $/ new line default character

on 19.12.2006 02:17:37 by paduille.4060.mumia.w

On 12/18/2006 10:03 AM, hollyhawkins wrote:
> I have a script that reads a directory of all files, and creates an
> output file of one file with the contents of each individual file from
> the directory. The input is an HL7 file. In each individual file, the
> segments were separated with hex "0D", but now the segments have hex
> "0A" as the separator. This is causing the perl script to read each
> segment as a new line, and insert hex "0D0A" at the end.
>
>
> => QUESTION: How can I read in each individual file with all the data,
> and write it out as one chunk, as opposed to reading each segment
> ending with x"0A" and writing out each line to the output??? Any help
> would be greatly appreciated
> [...]

Set $/ to undef and put the file handle into "raw" mode:

.... do some stuff ...
local $/ = undef;
open (my $X, '<', $filename) or die("Horribly: $!\n");
binmode($X, ':raw');
.... do more stuff ...
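
Filled out with core Perl only, the fragment above might look like the
following. This is a minimal sketch: the helper name slurp_raw is
hypothetical and does not appear in the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): return the whole
# file as one string, with no line-ending translation.
sub slurp_raw {
    my ($filename) = @_;
    open(my $fh, '<', $filename) or die "Cannot open $filename: $!\n";
    binmode($fh, ':raw');   # read the bytes as they are, even on Windows
    local $/ = undef;       # no record separator: one read returns everything
    my $data = <$fh>;
    close($fh);
    return $data;
}
```

Because $/ is localized inside the sub, the rest of the script still
reads line by line as before.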

Or you could just use File::Slurp as «perldoc -q "all at once"» suggests:

This code is UNTESTED!

use strict;
use warnings;
use File::Slurp;

#define the file directory paths

my $datainpath = "C:\\YNHH_Files\\Quest\\Quest_IN";
my $dataoutpath = "C:\\YNHH_Files\\Quest\\Quest_OUT";
my $tempdir = "C:\\YNHH_Files\\Quest\\temp";
my $archivedir = "C:\\YNHH_Files\\Quest\\archive";

#read the names of the individual files into an array "@allfiles"
#(read_dir already leaves out "." and "..")

my @allfiles = read_dir($datainpath);

#leave out the dummy record, and stop if nothing else is there

@allfiles = grep !/99999dummyrec\.txt/, @allfiles;
if (@allfiles < 1) { exit }

#writes the names from the directory to a file in tempdir, one per line

write_file "$tempdir\\recnames.txt", map "$_\n", @allfiles;

unlink "$dataoutpath\\questout.txt";

# Open questout.txt for the data from the records.
# append => 1 will cause data to be appended.
# binmode => ':raw' will prevent spurious \x0D characters
# from being written to the file.

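foreach my $filename (@allfiles) {
    # read_dir returns bare names, so put the directory back in front
    my $data = read_file "$datainpath\\$filename", binmode => ':raw';
    write_file "$dataoutpath\\questout.txt",
        { append => 1, binmode => ':raw' }, $data;
}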

# What, we're done already?
# You DID post to comp.lang.perl.modules .
# WARNING: The above is untested code.
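
If installing File::Slurp is not an option, the same job can be sketched
with core Perl. This version also keeps the \x0B ... \x1C\x0D framing
that the print statement in the original script puts around its output,
but applies it once per file instead of once per line. The name
concat_framed and its three parameters are hypothetical, and the sketch
assumes the output file lives outside the input directory.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): append every plain
# file in $indir to $outfile, each one wrapped as \x0B <contents> \x1C\x0D.
# $skip is an optional file name to leave out (e.g. the dummy record).
sub concat_framed {
    my ($indir, $outfile, $skip) = @_;
    $skip = '' unless defined $skip;

    opendir(my $dh, $indir) or die "Cannot open $indir: $!\n";
    my @names = sort grep { -f "$indir/$_" && $_ ne $skip } readdir($dh);
    closedir($dh);

    open(my $out, '>>', $outfile) or die "Cannot open $outfile: $!\n";
    binmode($out, ':raw');                  # write the bytes exactly as given
    for my $name (@names) {
        open(my $in, '<', "$indir/$name") or die "Cannot open $name: $!\n";
        binmode($in, ':raw');
        my $data = do { local $/; <$in> };  # slurp the whole file at once
        close($in);
        print {$out} "\x0B", $data, "\x1C\x0D";
    }
    close($out) or die "Cannot close $outfile: $!\n";
    return scalar @names;                   # number of files appended
}
```

A call such as concat_framed($datainpath, "$dataoutpath\\questout.txt",
'99999dummyrec.txt') would stand in for the read loop. The forward
slashes used inside the helper are accepted by Perl on Windows.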



--
paduille.4060.mumia.w@earthlink.net
http://home.earthlink.net/~mumia.w.18.spam/