FAQ 5.3 How do I count the number of lines in a file?

on 19.04.2008 09:03:02 by PerlFAQ Server

This is an excerpt from the latest version of perlfaq5.pod, which
comes with the standard Perl distribution. These postings aim to
reduce the number of repeated questions as well as allow the community
to review and update the answers. The latest version of the complete
perlfaq is at http://faq.perl.org .

--------------------------------------------------------------------

5.3: How do I count the number of lines in a file?


One fairly efficient way is to count newlines in the file. The following
program uses a feature of tr///, as documented in perlop. If your text
file doesn't end with a newline, then it's not really a proper text
file, so this may report one fewer line than you expect.

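    my $lines = 0;
    # Three-argument open with a lexical filehandle
    open(my $fh, '<', $filename) or die "Can't open '$filename': $!";
    # Read 4096-byte chunks; tr/\n// counts newlines without changing $buffer
    while (sysread $fh, my $buffer, 4096) {
        $lines += ($buffer =~ tr/\n//);
    }
    close $fh;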

This assumes no funny games with newline translations.
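If the text after the last newline should also count as a line, the same chunked approach can remember the last byte it read. The following is only a sketch under that assumption; the name count_lines is made up for illustration and is not part of the FAQ, and it assumes "\n" is the only line terminator.

```perl
use strict;
use warnings;

# Hypothetical helper: count "\n" bytes in 4096-byte chunks, then add one
# if the file ends with an unterminated fragment.
sub count_lines {
    my ($filename) = @_;
    open(my $fh, '<', $filename) or die "Can't open '$filename': $!";
    binmode $fh;                      # raw bytes, no newline translation
    my $lines  = 0;
    my $last   = "\n";                # an empty file has no partial line
    my $buffer = '';
    while (sysread $fh, $buffer, 4096) {
        $lines += ($buffer =~ tr/\n//);
        $last = substr($buffer, -1);  # last byte of the chunk just read
    }
    close $fh;
    $lines++ if $last ne "\n";        # count the unterminated tail
    return $lines;
}

print count_lines($ARGV[0]), "\n" if @ARGV;
```

A file containing "a\nb" is then reported as two lines, while the plain newline count would report one.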



--------------------------------------------------------------------

The perlfaq-workers, a group of volunteers, maintain the perlfaq. They
are not necessarily experts in every domain where Perl might show up,
so please include as much information as possible and relevant in any
corrections. The perlfaq-workers also don't have access to every
operating system or platform, so please include relevant details for
corrections to examples that do not work on particular platforms.
Working code is greatly appreciated.

If you'd like to help maintain the perlfaq, see the details in
perlfaq.pod.

Re: FAQ 5.3 How do I count the number of lines in a file?

on 19.04.2008 09:26:54 by szr

PerlFAQ Server wrote:
[...]
> 5.3: How do I count the number of lines in a file?
>
>
> One fairly efficient way is to count newlines in the file. The
> following program uses a feature of tr///, as documented in
> perlop. If your text file doesn't end with a newline, then it's
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Shouldn't this be "If the lines in your text file do not end with a
newline", or something to that effect? A text file need not have a "\n"
at the very end :)

> not really a proper text file, so this may report one fewer line
> than you expect.
[...]

--
szr

Re: FAQ 5.3 How do I count the number of lines in a file?

on 19.04.2008 12:08:39 by jurgenex

"szr" wrote:
>PerlFAQ Server wrote:
>[...]
>> 5.3: How do I count the number of lines in a file?
>>
>>
>> One fairly efficient way is to count newlines in the file. The
>> following program uses a feature of tr///, as documented in
>> perlop. If your text file doesn't end with a newline, then it's
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
>Shouldn't this be "If the lines in your text file do not end with a
>newline", or something to that effect?

No. It is a different way of saying "If there is stuff after the last
newline then that stuff doesn't constitute a proper line..."

>A text file need not have a "\n"
>at the very end :)
>
>> not really a proper text file, so this may report one fewer line
>> than you expect.

"... and therefore it is not counted as a line."

jue

Re: FAQ 5.3 How do I count the number of lines in a file?

on 20.04.2008 00:05:49 by szr

Jürgen Exner wrote:
> "szr" wrote:
>> PerlFAQ Server wrote:
>> [...]
>>> 5.3: How do I count the number of lines in a file?
>>>
>>>
>>> One fairly efficient way is to count newlines in the file. The
>>> following program uses a feature of tr///, as documented in
>>> perlop. If your text file doesn't end with a newline, then it's
>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>
>> Shouldn't this be "If the lines in your text file do not end with a
>> newline", or something to that effect?
>
> No. It is a different way of saying "If there is stuff after the last
> newline then that stuff doesn't constitute a proper line..."

I don't recall there being any sort of rule saying a text file's last
character /has/ to be a line terminator, though I can see how it would
seem better formed if it does.

--
szr

Re: FAQ 5.3 How do I count the number of lines in a file?

on 20.04.2008 22:07:55 by brian d foy

In article , szr
wrote:

> Jürgen Exner wrote:

> > No. It is a different way of saying "If there is stuff after the last
> > newline then that stuff doesn't constitute a proper line..."
>
> I don't recall there being any sort of rule saying a text file's last
> character /has/ to be a line terminator, though I can see how it would
> seem better formed if it does.

There's no rule for it, but a lot of things expect it. Just Google
"no newline at end of file" to see for yourself.

Re: FAQ 5.3 How do I count the number of lines in a file?

on 20.04.2008 23:47:05 by Martijn Lievaart

On Sun, 20 Apr 2008 15:07:55 -0500, brian d foy wrote:

> In article , szr
> wrote:
>
>> Jürgen Exner wrote:
>
>> > No. It is a different way of saying "If there is stuff after the last
>> > newline then that stuff doesn't constitute a proper line..."
>>
>> I don't recall there being any sort of rule saying a text file's last
>> character /has/ to be a line terminator, though I can see how it would
>> seem better formed if it does.
>
> There's no rule for it, but a lot of things expect it. Just Google "no
> newline at end of file" to see for yourself.

As I understand it, this has always been the rule in unixy environments.
I even think that Dos/Windows is the exception to the rule, and (almost)
all other environments need a line terminator after the last line, but
I'm not too sure about that.

And it makes sense: it's a line terminator, not a line separator....

M4

Re: FAQ 5.3 How do I count the number of lines in a file?

on 21.04.2008 01:37:53 by 1usa

brian d foy wrote in
news:200420081507550257%brian.d.foy@gmail.com:

> In article , szr
> wrote:
>
>> Jürgen Exner wrote:
>
>> > No. It is a different way of saying "If there is stuff after
>> > the last newline then that stuff doesn't constitute a proper
>> > line..."
>>
>> I don't recall there being any sort of rule saying a text file's
>> last character /has/ to be a line terminator, though I can see
>> how it would seem better formed if it does.
>
> There's no rule for it, but a lot of things expect it. Just
> Google "no newline at end of file" to see for yourself.

AFAIK, the C standard requires lines in text files to be terminated
with the platform specific newline character sequence.

Sinan

--
A. Sinan Unur <1usa@llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)

comp.lang.perl.misc guidelines on the WWW:
http://www.rehabitation.com/clpmisc/

Re: FAQ 5.3 How do I count the number of lines in a file?

on 21.04.2008 02:14:33 by jurgenex

"szr" wrote:
>I don't recall there being any sort of rule saying a text file's last
>character /has/ to be a line terminator,

Well, it's really a question of whether the newline is a separator or a
terminator.
If you consider it to be a line separator, then you don't need a closing
newline.
If you consider it to be a line terminator, then a line without its
terminator wouldn't be well formed.
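
The two readings give different counts for the same data. The following is a small sketch of that difference, not anything from the FAQ; the sub names are invented for illustration, and "\n" is assumed to be the newline.

```perl
use strict;
use warnings;

# Terminator view: every line ends in "\n", so the line count is the
# number of newlines; anything after the last one is not a line.
sub lines_as_terminated {
    my ($text) = @_;
    return scalar($text =~ tr/\n//);
}

# Separator view: "\n" sits between lines, so n newlines separate
# n + 1 pieces (the last piece may be empty).
sub lines_as_separated {
    my ($text) = @_;
    return ($text =~ tr/\n//) + 1;
}

for my $text ("one\ntwo", "one\ntwo\n") {
    printf "terminated=%d separated=%d\n",
        lines_as_terminated($text), lines_as_separated($text);
}
```

For "one\ntwo" the terminator view gives 1 and the separator view gives 2; for "one\ntwo\n" they give 2 and 3, the third piece being an empty string after the final newline.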

> though I can see how it would seem better formed if it does.

There are tools and even editors which won't work properly if the last
line is not terminated properly.

jue

Re: FAQ 5.3 How do I count the number of lines in a file?

on 24.04.2008 14:57:40 by hjp-usenet2

On 2008-04-19 22:05, szr wrote:
> I don't recall there being any sort of rule saying a text file's last
> character /has/ to be a line terminator, though I can see how it would
> seem better formed if it does.

IEEE Standard 1003.1 (aka POSIX) defines "line" as:

| 3.205 Line
|
| A sequence of zero or more non-<newline>s plus a terminating
| <newline>.

And "text file" as:

| 3.392 Text File
|
| A file that contains characters organized into one or more lines. The
| lines do not contain NUL characters and none can exceed {LINE_MAX}
| bytes in length, including the <newline>. Although IEEE Std
| 1003.1-2001 does not distinguish between text files and binary files
| (see the ISO C standard), many utilities only produce predictable or
| meaningful output when operating on text files. The standard utilities
| that have such restrictions always specify "text files" in their STDIN
| or INPUT FILES sections.

Sinan already mentioned the C standard, but that doesn't specify how
files are represented on disk - a text file could be a series of fixed
or variable length records without any "newline characters" at all, as
long as the stdio library performs the necessary conversions.
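
On a system where a text file is a plain byte stream with "\n" terminators, one way to see whether a file ends in an incomplete line in the sense quoted above is to look at its last byte. This is only a sketch under that assumption; has_unterminated_tail is a made-up name, and it does not check the other parts of the definition (NUL characters, {LINE_MAX}).

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_END);

# Hypothetical helper: returns 1 when a non-empty file's last byte is
# not "\n", i.e. when data follows the final line terminator.
sub has_unterminated_tail {
    my ($filename) = @_;
    open(my $fh, '<', $filename) or die "Can't open '$filename': $!";
    binmode $fh;                      # raw bytes, no newline translation
    my $last = "\n";                  # an empty file has no partial line
    if (-s $fh) {
        seek($fh, -1, SEEK_END) or die "Can't seek in '$filename': $!";
        my $got = read($fh, $last, 1);
        die "Can't read '$filename': $!" unless $got;
    }
    close $fh;
    return $last ne "\n" ? 1 : 0;
}

print has_unterminated_tail($ARGV[0]) ? "incomplete last line\n"
                                      : "ends with newline\n" if @ARGV;
```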

hp