Problems with working with large text files

on 01.06.2007 01:35:20 by Adam Niedzwiedzki

Hi all,

I have a simple PHP script that I'm running from the command line. It opens up an
HTTP web log, processes it, then zips it when done.
If the HTTP log is under 200MB (approx.) this all hums along nicely, but as soon
as the file is over 300MB PHP falls over.

Fatal error: Out of memory (allocated 378535936) (tried to allocate
381131220 bytes)

I'm running PHP 5.2.2 on Windows 2003 64-bit Enterprise.
I have my php.ini memory_limit set to -1, and in my scripts I set the
following:

;;;;;;;;;;;;;;;;;;;
; Resource Limits ;
;;;;;;;;;;;;;;;;;;;

max_execution_time = 30 ; Maximum execution time of each script, in seconds
max_input_time = 60     ; Maximum amount of time each script may spend parsing request data
memory_limit = -1       ; Maximum amount of memory a script may consume (128MB)

I have this inline in my code:

ini_set("memory_limit",-1);
set_time_limit(0);

It seems to fall over on either fopen() or gzcompress(), or both, if the
file is over 300MB.
Anyone know of another option to tell PHP to just be unlimited in its RAM
usage?
The machine it's running on has 8GB of RAM and over 3GB free (it's a
quad Opteron box).

Anyone have any clues to help me out? :(

Cheers
Ad


Re: Problems with working with large text files

on 01.06.2007 04:29:34 by Stut

Adam Niedzwiedzki wrote:
> I have a simple PHP script that I'm running from the command line. It opens up an
> HTTP web log, processes it, then zips it when done.
> If the HTTP log is under 200MB (approx.) this all hums along nicely, but as soon
> as the file is over 300MB PHP falls over.
>
> Fatal error: Out of memory (allocated 378535936) (tried to allocate
> 381131220 bytes)
>
> I'm running PHP 5.2.2 on Windows 2003 64-bit Enterprise.
> I have my php.ini memory_limit set to -1, and in my scripts I set the
> following:
>
> ;;;;;;;;;;;;;;;;;;;
> ; Resource Limits ;
> ;;;;;;;;;;;;;;;;;;;
>
> max_execution_time = 30 ; Maximum execution time of each script, in seconds
> max_input_time = 60     ; Maximum amount of time each script may spend parsing request data
> memory_limit = -1       ; Maximum amount of memory a script may consume (128MB)
>
> I have this inline in my code:
>
> ini_set("memory_limit",-1);
> set_time_limit(0);
>
> It seems to fall over on either fopen() or gzcompress(), or both, if the
> file is over 300MB.
> Anyone know of another option to tell PHP to just be unlimited in its RAM
> usage?
> The machine it's running on has 8GB of RAM and over 3GB free (it's a
> quad Opteron box).
>
> Anyone have any clues to help me out? :(

Yeah, don't load the whole frickin' log into memory at the same time.
Refactor your code so it can process the log line by line and you'll
save yourself many, many headaches in the future. I've never come across
a good reason to load a large file into memory all at once just to
"process" it.
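
Something along these lines (just a sketch - process_line() is a
hypothetical stand-in for whatever per-line work the script actually does):

$fp = fopen($logfile, 'r');
if ($fp) {
    while (($line = fgets($fp)) !== false) {
        process_line($line); // only one line is ever in memory
    }
    fclose($fp);
}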

-Stut


RE: Problems with working with large text files

on 01.06.2007 05:14:17 by Adam Niedzwiedzki

Hi Stut,

(Yeah, OK... but that still doesn't explain why PHP ain't letting me - I'm
thinking BUG :P)

Anyways this is how I'm handling the file...

if ($fp = fopen($logfile, 'r')) {
    debug_log("$file has ".count(file($logfile))." lines to process");
    while (!feof($fp)) {
        $line = fgets($fp);

You just made me realise I'm calling file() for the line count (that's a big
hit). Any other way of doing it?
And YES, I want a line count BEFORE I start looping through it...

But I can't read line by line to do the gzcompress; I have to load the whole
file up to compress it, don't I?

gzcompress($data, 9)

$data being the string (the whole file) of text I need to compress.
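
(Roughly the pattern in question - assuming the data is read in with
something like file_get_contents(). Both the input string and the
compressed result have to fit in memory at once, which matches the two
~360MB allocations in the error above:)

$data = file_get_contents($logfile); // whole ~360MB file in one string
$compressed = gzcompress($data, 9);  // needs a second, similar-sized buffer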

Cheers
Ad

-----Original Message-----
From: Stut [mailto:stuttle@gmail.com]
Sent: Friday, 1 June 2007 12:30 PM
To: Adam Niedzwiedzki
Cc: php-windows@lists.php.net
Subject: Re: [PHP-WIN] Problems with working with large text files

Adam Niedzwiedzki wrote:
> I have a simple PHP script that I'm running from the command line. It
> opens up an HTTP web log, processes it, then zips it when done.
> If the HTTP log is under 200MB (approx.) this all hums along nicely,
> but as soon as the file is over 300MB PHP falls over.
>
> Fatal error: Out of memory (allocated 378535936) (tried to allocate
> 381131220 bytes)
>
> I'm running PHP 5.2.2 on Windows 2003 64-bit Enterprise.
> I have my php.ini memory_limit set to -1, and in my scripts I set the
> following:
>
> ;;;;;;;;;;;;;;;;;;;
> ; Resource Limits ;
> ;;;;;;;;;;;;;;;;;;;
>
> max_execution_time = 30 ; Maximum execution time of each script, in seconds
> max_input_time = 60     ; Maximum amount of time each script may spend parsing request data
> memory_limit = -1       ; Maximum amount of memory a script may consume (128MB)
>
> I have this inline in my code:
>
> ini_set("memory_limit",-1);
> set_time_limit(0);
>
> It seems to fall over on either fopen() or gzcompress(), or both, if
> the file is over 300MB.
> Anyone know of another option to tell PHP to just be unlimited in its
> RAM usage?
> The machine it's running on has 8GB of RAM and over 3GB free
> (it's a quad Opteron box).
>
> Anyone have any clues to help me out? :(

Yeah, don't load the whole frickin' log into memory at the same time.
Refactor your code so it can process the log line by line and you'll save
yourself many, many headaches in the future. I've never come across a good
reason to load a large file into memory all at once just to "process" it.

-Stut


Re: Problems with working with large text files

on 01.06.2007 21:21:26 by Stut

Please turn off read notifications when posting to a mailing list -
they're very annoying.

Adam Niedzwiedzki wrote:
> (Yeah, OK... but that still doesn't explain why PHP ain't letting me - I'm
> thinking BUG :P)
>
> Anyways this is how I'm handling the file...
>
> if ($fp = fopen($logfile, 'r')) {
>     debug_log("$file has ".count(file($logfile))." lines to process");
>     while (!feof($fp)) {
>         $line = fgets($fp);
>
> You just made me realise I'm calling file() for the line count (that's a big
> hit). Any other way of doing it?
> And YES, I want a line count BEFORE I start looping through it...

Why?

Actually, why doesn't matter. If you really, really need a line count
before you start processing, you'll have to run through the while(!feof())
loop twice: once to count the lines and a second time to process them. If
you were on a UNIX platform I would suggest calling out to the wc utility -
it's possible there's a version of it compiled for Windows somewhere, and
it would definitely be more efficient.
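
A minimal counting pass along those lines (a sketch - it reuses one handle
for both passes via rewind()):

$lines = 0;
$fp = fopen($logfile, 'r');
while (fgets($fp) !== false) {
    $lines++;
}
rewind($fp); // second pass: now process the lines from the start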

> But I can't read line by line to do the gzcompress; I have to load the whole
> file up to compress it, don't I?
>
> gzcompress($data, 9)
>
> $data being the string (the whole file) of text I need to compress.

Two possible options... 1) write it out uncompressed and compress it
afterwards, or 2) use the gz output buffer handler to do it (although
that may have the same effect on memory usage).
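
There's arguably a third option: since gzcompress() works for you, the zlib
extension is clearly loaded, and its gzopen()/gzwrite() functions can stream
the compressed output line by line, so only one line is ever in memory at a
time. A sketch:

$in  = fopen($logfile, 'r');
$out = gzopen($logfile . '.gz', 'wb9'); // 9 = same compression level as before
while (($line = fgets($in)) !== false) {
    // ... per-line processing here ...
    gzwrite($out, $line);
}
gzclose($out);
fclose($in);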

What is this script actually doing? There may be a better way[tm].

-Stut

--
PHP Windows Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php