combining files
on 07.09.2007 14:58:51 by ginger.m.griffin
Hello
I have several text files, and in many of them the ending
records overlap with the beginning records of another file. I'd like
to combine two of the files so that the records are continuous, and
this means that the overlap in one of the files needs to be removed.
What's a good way to do this?
For instance, if I have the two text files
file1.txt
09/06/07 01:23:49 PM,1189113829,0,0,000170
09/06/07 01:25:29 PM,1189113929,100,1.66,000138
09/06/07 01:25:44 PM,1189113944,115,1.91,000135
09/06/07 01:26:04 PM,1189113964,135,2.25,000148
09/06/07 01:27:19 PM,1189114039,210,3.50,000116
file2.txt
09/06/07 01:25:44 PM,1189113944,115,1.91,000135
09/06/07 01:26:04 PM,1189113964,135,2.25,000148
09/06/07 01:27:19 PM,1189114039,210,3.50,000116
09/06/07 01:27:42 PM,1189114062,233,3.88,000114
09/06/07 01:27:52 PM,1189114072,243,4.05,000119
09/06/07 01:29:26 PM,1189114166,337,5.61,000105
They overlap in the last three lines of file1.txt and the first three
lines of file2.txt. I'd like to bandage these two together to get:
09/06/07 01:23:49 PM,1189113829,0,0,000170
09/06/07 01:25:29 PM,1189113929,100,1.66,000138
09/06/07 01:25:44 PM,1189113944,115,1.91,000135
09/06/07 01:26:04 PM,1189113964,135,2.25,000148
09/06/07 01:27:19 PM,1189114039,210,3.50,000116
09/06/07 01:27:42 PM,1189114062,233,3.88,000114
09/06/07 01:27:52 PM,1189114072,243,4.05,000119
09/06/07 01:29:26 PM,1189114166,337,5.61,000105
thanks!
Re: combining files
on 07.09.2007 15:02:00 by Miles
On Sep 7, 7:58 am, gin_g wrote:
> Hello
>
> I have several text files, and in many of them the ending
> records overlap with the beginning records of another file. I'd like
> to combine two of the files so that the records are continuous, and
> this means that the overlap in one of the files needs to be removed.
> What's a good way to do this?
>
> For instance, if I have the two text files
> file1.txt
> 09/06/07 01:23:49 PM,1189113829,0,0,000170
> 09/06/07 01:25:29 PM,1189113929,100,1.66,000138
> 09/06/07 01:25:44 PM,1189113944,115,1.91,000135
> 09/06/07 01:26:04 PM,1189113964,135,2.25,000148
> 09/06/07 01:27:19 PM,1189114039,210,3.50,000116
>
> file2.txt
> 09/06/07 01:25:44 PM,1189113944,115,1.91,000135
> 09/06/07 01:26:04 PM,1189113964,135,2.25,000148
> 09/06/07 01:27:19 PM,1189114039,210,3.50,000116
> 09/06/07 01:27:42 PM,1189114062,233,3.88,000114
> 09/06/07 01:27:52 PM,1189114072,243,4.05,000119
> 09/06/07 01:29:26 PM,1189114166,337,5.61,000105
>
> They overlap in the last three lines of file1.txt and the first three
> lines of file2.txt. I'd like to bandage these two together to get:
>
> 09/06/07 01:23:49 PM,1189113829,0,0,000170
> 09/06/07 01:25:29 PM,1189113929,100,1.66,000138
> 09/06/07 01:25:44 PM,1189113944,115,1.91,000135
> 09/06/07 01:26:04 PM,1189113964,135,2.25,000148
> 09/06/07 01:27:19 PM,1189114039,210,3.50,000116
> 09/06/07 01:27:42 PM,1189114062,233,3.88,000114
> 09/06/07 01:27:52 PM,1189114072,243,4.05,000119
> 09/06/07 01:29:26 PM,1189114166,337,5.61,000105
>
> thanks!
cat file1 file2 | sort -u
09/06/07 01:23:49 PM,1189113829,0,0,000170
09/06/07 01:25:29 PM,1189113929,100,1.66,000138
09/06/07 01:25:44 PM,1189113944,115,1.91,000135
09/06/07 01:26:04 PM,1189113964,135,2.25,000148
09/06/07 01:27:19 PM,1189114039,210,3.50,000116
09/06/07 01:27:42 PM,1189114062,233,3.88,000114
09/06/07 01:27:52 PM,1189114072,243,4.05,000119
09/06/07 01:29:26 PM,1189114166,337,5.61,000105
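A note on ordering: sort compares whole lines as text, so the output above is chronological only because every sample record sorts the same way as text and as time. A 12-hour clock can break that, since "12:59:59 PM" sorts after "01:00:00 PM" as text. A minimal sketch, assuming the second comma-separated field is always a Unix timestamp as in the sample; the two records and the names a.txt and b.txt are made up for illustration and are not from the original post:

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmpdir=$(mktemp -d)

# Two hypothetical records (not from the original post) that straddle 1 PM.
printf '%s\n' '09/06/07 12:59:59 PM,1189112399,0,0,000170' > "$tmpdir/a.txt"
printf '%s\n' '09/06/07 01:00:00 PM,1189112400,1,0.01,000169' > "$tmpdir/b.txt"

# Plain text sort: "01:..." sorts before "12:...", so the later record comes first.
text_order=$(LC_ALL=C sort -u "$tmpdir/a.txt" "$tmpdir/b.txt")

# Numeric sort on field 2 (the timestamp) gives chronological order.
# With -u and a key, lines count as duplicates when their keys compare equal,
# so this assumes each timestamp identifies exactly one record.
time_order=$(LC_ALL=C sort -t, -k2,2n -u "$tmpdir/a.txt" "$tmpdir/b.txt")

printf 'text order:\n%s\n\ntime order:\n%s\n' "$text_order" "$time_order"
rm -rf "$tmpdir"
```

If the timestamp field is not guaranteed to be unique per record, the keyed -u would drop records it should keep, and one of the order-preserving approaches is the safer choice.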
Re: combining files
on 07.09.2007 15:06:09 by Jeroen van Nieuwenhuizen
On Fri, 07 Sep 2007 05:58:51 -0700
somebody claiming to be gin_g wrote:
>
> Hello
>
> I have several text files, and in many of them the ending
> records overlap with the beginning records of another file. I'd like
> to combine two of the files so that the records are continuous, and
> this means that the overlap in one of the files needs to be removed.
> What's a good way to do this?
>
> For instance, if I have the two text files
> file1.txt
> 09/06/07 01:23:49 PM,1189113829,0,0,000170
> 09/06/07 01:25:29 PM,1189113929,100,1.66,000138
> 09/06/07 01:25:44 PM,1189113944,115,1.91,000135
> 09/06/07 01:26:04 PM,1189113964,135,2.25,000148
> 09/06/07 01:27:19 PM,1189114039,210,3.50,000116
>
> file2.txt
> 09/06/07 01:25:44 PM,1189113944,115,1.91,000135
> 09/06/07 01:26:04 PM,1189113964,135,2.25,000148
> 09/06/07 01:27:19 PM,1189114039,210,3.50,000116
> 09/06/07 01:27:42 PM,1189114062,233,3.88,000114
> 09/06/07 01:27:52 PM,1189114072,243,4.05,000119
> 09/06/07 01:29:26 PM,1189114166,337,5.61,000105
>
> They overlap in the last three lines of file1.txt and the first three
> lines of file2.txt. I'd like to bandage these two together to get:
>
> 09/06/07 01:23:49 PM,1189113829,0,0,000170
> 09/06/07 01:25:29 PM,1189113929,100,1.66,000138
> 09/06/07 01:25:44 PM,1189113944,115,1.91,000135
> 09/06/07 01:26:04 PM,1189113964,135,2.25,000148
> 09/06/07 01:27:19 PM,1189114039,210,3.50,000116
> 09/06/07 01:27:42 PM,1189114062,233,3.88,000114
> 09/06/07 01:27:52 PM,1189114072,243,4.05,000119
> 09/06/07 01:29:26 PM,1189114166,337,5.61,000105
sort file1.txt file2.txt | uniq
Kind regards,
Jeroen.
--
ir. Jeroen van Nieuwenhuizen
Email: jnieuwen [at] jeroen [dot] se
I know I'm not perfect but I can smile
Re: combining files
on 07.09.2007 15:53:02 by Glenn Jackman
At 2007-09-07 08:58AM, "gin_g" wrote:
> I have several text files, and in many of them the ending
> records overlap with the beginning records of another file. I'd like
> to combine two of the files so that the records are continuous, and
> this means that the overlap in one of the files needs to be removed.
> What's a good way to do this?
At 2007-09-07 09:02AM, "Miles" wrote:
> cat file1 file2 | sort -u
At 2007-09-07 09:06AM, "Jeroen van Nieuwenhuizen" wrote:
> sort file1.txt file2.txt | uniq
How about:
sort -u file1 file2
--
Glenn Jackman
"You can only be young once. But you can always be immature." -- Dave Barry
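For the sample data, the three commands suggested so far should give the same result; they differ only in how many processes are started. A quick sketch that checks this, assuming a sort that accepts -u and file operands; the scratch directory and the variable names are choices made here, not part of the thread:

```shell
tmpdir=$(mktemp -d)
f1=$tmpdir/file1.txt
f2=$tmpdir/file2.txt

# Sample records from the original post.
cat > "$f1" <<'EOF'
09/06/07 01:23:49 PM,1189113829,0,0,000170
09/06/07 01:25:29 PM,1189113929,100,1.66,000138
09/06/07 01:25:44 PM,1189113944,115,1.91,000135
09/06/07 01:26:04 PM,1189113964,135,2.25,000148
09/06/07 01:27:19 PM,1189114039,210,3.50,000116
EOF
cat > "$f2" <<'EOF'
09/06/07 01:25:44 PM,1189113944,115,1.91,000135
09/06/07 01:26:04 PM,1189113964,135,2.25,000148
09/06/07 01:27:19 PM,1189114039,210,3.50,000116
09/06/07 01:27:42 PM,1189114062,233,3.88,000114
09/06/07 01:27:52 PM,1189114072,243,4.05,000119
09/06/07 01:29:26 PM,1189114166,337,5.61,000105
EOF

# The three variants from the thread, captured for comparison.
out_cat=$(cat "$f1" "$f2" | sort -u)
out_uniq=$(sort "$f1" "$f2" | uniq)
out_direct=$(sort -u "$f1" "$f2")

if [ "$out_cat" = "$out_uniq" ] && [ "$out_uniq" = "$out_direct" ]; then
    echo "all three commands produce the same output"
else
    echo "outputs differ"
fi
rm -rf "$tmpdir"
```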
Re: combining files
on 07.09.2007 16:57:35 by Jeroen van Nieuwenhuizen
On 7 Sep 2007 13:53:02 GMT
somebody claiming to be Glenn Jackman wrote:
> At 2007-09-07 08:58AM, "gin_g" wrote:
>> I have several text files, and in many of them the ending
>> records overlap with the beginning records of another file. I'd like
>> to combine two of the files so that the records are continuous, and
>> this means that the overlap in one of the files needs to be removed.
>> What's a good way to do this?
>
> At 2007-09-07 09:02AM, "Miles" wrote:
>> cat file1 file2 | sort -u
>
> At 2007-09-07 09:06AM, "Jeroen van Nieuwenhuizen" wrote:
>> sort file1.txt file2.txt | uniq
>
> How about:
> sort -u file1 file2
Does not work on Solaris 9, for example. That's why I always use
the | uniq construct.
Kind regards,
Jeroen.
Re: combining files
on 07.09.2007 17:34:35 by Stephane CHAZELAS
2007-09-07, 14:57(+00), Jeroen van Nieuwenhuizen:
[...]
>> How about:
>> sort -u file1 file2
>
> Does not work on Solaris 9, for example. That's why I always use
> the | uniq construct.
[...]
It should work on Solaris; it's just that, as with many other things,
you need to make sure you're in a POSIX environment. The
Unix/POSIX-conformant sort is in /usr/xpg4/bin on Solaris.
--
Stéphane
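One way to act on this in a script is to prefer the XPG4 sort when it exists and fall back to whatever is on PATH otherwise. A minimal sketch; the variable name SORT is just a convention chosen here, and the check changes nothing on systems that have no /usr/xpg4/bin:

```shell
# Prefer the POSIX-conformant sort that Solaris ships in /usr/xpg4/bin.
if [ -x /usr/xpg4/bin/sort ]; then
    SORT=/usr/xpg4/bin/sort
else
    SORT=sort
fi

# Quick check: duplicate lines collapse and the output comes back sorted.
merged=$(printf '%s\n' b a b | "$SORT" -u)
printf '%s\n' "$merged"
```

Putting /usr/xpg4/bin at the front of PATH for the whole script is an alternative when several utilities need their POSIX variants, not just sort.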
Re: combining files
on 07.09.2007 18:27:16 by Dummy
gin_g wrote:
>
> I have several text files, and in many of them the ending
> records overlap with the beginning records of another file. I'd like
> to combine two of the files so that the records are continuous, and
> this means that the overlap in one of the files needs to be removed.
> What's a good way to do this?
>
> For instance, if I have the two text files
> file1.txt
> 09/06/07 01:23:49 PM,1189113829,0,0,000170
> 09/06/07 01:25:29 PM,1189113929,100,1.66,000138
> 09/06/07 01:25:44 PM,1189113944,115,1.91,000135
> 09/06/07 01:26:04 PM,1189113964,135,2.25,000148
> 09/06/07 01:27:19 PM,1189114039,210,3.50,000116
>
> file2.txt
> 09/06/07 01:25:44 PM,1189113944,115,1.91,000135
> 09/06/07 01:26:04 PM,1189113964,135,2.25,000148
> 09/06/07 01:27:19 PM,1189114039,210,3.50,000116
> 09/06/07 01:27:42 PM,1189114062,233,3.88,000114
> 09/06/07 01:27:52 PM,1189114072,243,4.05,000119
> 09/06/07 01:29:26 PM,1189114166,337,5.61,000105
>
> They overlap in the last three lines of file1.txt and the first three
> lines of file2.txt. I'd like to bandage these two together to get:
>
> 09/06/07 01:23:49 PM,1189113829,0,0,000170
> 09/06/07 01:25:29 PM,1189113929,100,1.66,000138
> 09/06/07 01:25:44 PM,1189113944,115,1.91,000135
> 09/06/07 01:26:04 PM,1189113964,135,2.25,000148
> 09/06/07 01:27:19 PM,1189114039,210,3.50,000116
> 09/06/07 01:27:42 PM,1189114062,233,3.88,000114
> 09/06/07 01:27:52 PM,1189114072,243,4.05,000119
> 09/06/07 01:29:26 PM,1189114166,337,5.61,000105
perl -ne'$x{$_}++||print' file1.txt file2.txt
John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall
Re: combining files
on 07.09.2007 18:47:48 by William James
On Sep 7, 7:58 am, gin_g wrote:
> Hello
>
> I have several text files, and in many of them the ending
> records overlap with the beginning records of another file. I'd like
> to combine two of the files so that the records are continuous, and
> this means that the overlap in one of the files needs to be removed.
> What's a good way to do this?
>
> For instance, if I have the two text files
> file1.txt
> 09/06/07 01:23:49 PM,1189113829,0,0,000170
> 09/06/07 01:25:29 PM,1189113929,100,1.66,000138
> 09/06/07 01:25:44 PM,1189113944,115,1.91,000135
> 09/06/07 01:26:04 PM,1189113964,135,2.25,000148
> 09/06/07 01:27:19 PM,1189114039,210,3.50,000116
>
> file2.txt
> 09/06/07 01:25:44 PM,1189113944,115,1.91,000135
> 09/06/07 01:26:04 PM,1189113964,135,2.25,000148
> 09/06/07 01:27:19 PM,1189114039,210,3.50,000116
> 09/06/07 01:27:42 PM,1189114062,233,3.88,000114
> 09/06/07 01:27:52 PM,1189114072,243,4.05,000119
> 09/06/07 01:29:26 PM,1189114166,337,5.61,000105
>
> They overlap in the last three lines of file1.txt and the first three
> lines of file2.txt. I'd like to bandage these two together to get:
>
> 09/06/07 01:23:49 PM,1189113829,0,0,000170
> 09/06/07 01:25:29 PM,1189113929,100,1.66,000138
> 09/06/07 01:25:44 PM,1189113944,115,1.91,000135
> 09/06/07 01:26:04 PM,1189113964,135,2.25,000148
> 09/06/07 01:27:19 PM,1189114039,210,3.50,000116
> 09/06/07 01:27:42 PM,1189114062,233,3.88,000114
> 09/06/07 01:27:52 PM,1189114072,243,4.05,000119
> 09/06/07 01:29:26 PM,1189114166,337,5.61,000105
>
> thanks!
awk '!a[$0]++' file1.txt file2.txt
or
ruby -e 'puts ARGF.to_a.uniq' file1.txt file2.txt
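Both one-liners keep the input order, but they drop every repeated line wherever it occurs. If identical records can legitimately appear more than once inside a file, a stricter approach removes only the overlap at the junction: find the longest run of lines that ends the first file and also starts the second, then print the first file followed by the rest of the second. A rough sketch, assuming both files fit in memory and an awk that supports the ?: operator; splice is a hypothetical helper name, not a standard tool:

```shell
# splice FILE1 FILE2: print FILE1, then FILE2 minus the leading lines
# that repeat the tail of FILE1.
splice() {
    awk '
        FILENAME == ARGV[1] { a[++n1] = $0; next }   # lines of the first file
        { b[++n2] = $0 }                             # lines of the second file
        END {
            max = (n1 < n2) ? n1 : n2
            # Try the longest possible overlap first, then shrink.
            for (k = max; k > 0; k--) {
                ok = 1
                for (i = 1; i <= k; i++)
                    # Appending "" forces a string comparison.
                    if ((a[n1 - k + i] "") != (b[i] "")) { ok = 0; break }
                if (ok) break
            }
            # k is now the overlap length (0 when there is none).
            for (i = 1; i <= n1; i++) print a[i]
            for (i = k + 1; i <= n2; i++) print b[i]
        }
    ' "$1" "$2"
}

tmpdir=$(mktemp -d)

# Sample records from the original post.
cat > "$tmpdir/file1.txt" <<'EOF'
09/06/07 01:23:49 PM,1189113829,0,0,000170
09/06/07 01:25:29 PM,1189113929,100,1.66,000138
09/06/07 01:25:44 PM,1189113944,115,1.91,000135
09/06/07 01:26:04 PM,1189113964,135,2.25,000148
09/06/07 01:27:19 PM,1189114039,210,3.50,000116
EOF
cat > "$tmpdir/file2.txt" <<'EOF'
09/06/07 01:25:44 PM,1189113944,115,1.91,000135
09/06/07 01:26:04 PM,1189113964,135,2.25,000148
09/06/07 01:27:19 PM,1189114039,210,3.50,000116
09/06/07 01:27:42 PM,1189114062,233,3.88,000114
09/06/07 01:27:52 PM,1189114072,243,4.05,000119
09/06/07 01:29:26 PM,1189114166,337,5.61,000105
EOF

spliced=$(splice "$tmpdir/file1.txt" "$tmpdir/file2.txt")
printf '%s\n' "$spliced"
rm -rf "$tmpdir"
```

The trade-off is that this only handles an overlap at the boundary between the two files; duplicates elsewhere are left alone on purpose.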
Re: combining files
on 07.09.2007 21:15:21 by Jeroen van Nieuwenhuizen
On Fri, 07 Sep 2007 15:34:35 GMT
somebody claiming to be Stephane CHAZELAS wrote:
> 2007-09-07, 14:57(+00), Jeroen van Nieuwenhuizen:
> [...]
>>> How about:
>>> sort -u file1 file2
>>
>> Does not work on Solaris 9, for example. That's why I always use
>> the | uniq construct.
> [...]
>
> It should work on Solaris; it's just that, as with many other things,
> you need to make sure you're in a POSIX environment. The
> Unix/POSIX-conformant sort is in /usr/xpg4/bin on Solaris.
You're absolutely right that it can be done on a Solaris
installation, but not without making assumptions about the environment.
I should of course have stated that, instead of saying Solaris 9 does
not support it.
Kind regards,
Jeroen.