Hello: not so much php, more like HTML..?

am 31.08.2007 14:14:53 von Courtney

Ok. I need a little guidance here.

Environment.
===========

Apache2/PHP5/Mysql/debian etch.

Application
===========

I want data files to be stored somewhere apache can't get to them
directly, for security, but be available for download via a PHP script
(after various authentication stuff that seems to work so far) by
clicking on web page button, this to invoke:-

(a new page, that is actually a file download HTML thingie?)

Now it seems that if the opened URL is say a GET type form that takes
some form of file ID, and is a PHP program itself, all I need to do is a
mix of http_send_file() type stuff to push the data down a new browser
'window'

I.e conceptually if the button is a link to say :-

or whatever (never mind the syntax:
Thats what manuals are for) then essentially what my 'download_php'
wants to do is:-

- validate the user (REMOTE_USER) has rights to access the file, in case
of spoofing by manually typing the above command..

- send a load of header data (this is where I am unclear)
- send the file
- go back home.

Now in most cases these are files that do not require an application to
open them.

I want to ensure they get downloaded to disk,and only if the local
browser recognizes them, should they get opened by a local app.

Now the standard blurb shows this fragment as most of what I want, I think:

<?php
http_send_content_disposition("document.pdf", true);
http_send_content_type("application/pdf");
http_throttle(0.1, 2048);
http_send_file("/my_inaccessible_to_apache_path/report.pdf") ;
?>

But I need someone to confirm that this is the general approach that
will get me what I want, and enlighten me as to what mime types I
should use..and how this will interact with the browser, and its
understanding of MIME types.
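
For concreteness, something along these lines is what I imagine
download.php ending up as (the lookup and permission helpers here are
invented, just to sketch the shape of it):

<?php
// download.php?id=1234 -- sketch only, helper functions are made up
$id = (int)$_GET['id'];

// map the ID to a real path outside the docroot, and check the
// authenticated user against whatever ACL logic applies
$path = lookup_file_path($id);
if ($path === false || !user_may_download($_SERVER['REMOTE_USER'], $id)) {
    header('HTTP/1.0 403 Forbidden');
    exit;
}

header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename="' . basename($path) . '"');
header('Content-Length: ' . filesize($path));
readfile($path);
exit;
?>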

TIA

Re: Hello: not so much php, more like HTML..?

am 01.09.2007 05:52:18 von luiheidsgoeroe

On Fri, 31 Aug 2007 14:14:53 +0200, The Natural Philosopher wrote:

> Ok. I need a little guidance here.
>
> Environment.
> ===========
>
> Apache2/PHP5/Mysql/debian etch.
>
> Application
> ===========
>
> I want data files to be stored somewhere apache can't get to them
> directly, for security, but be available for download via a PHP script

You mean Apache CAN get them, it's just not available with a straight
HTTP request (allthough it might not be an issue here there is a serious
difference).

> (after various authentication stuff that seems to work so far) by
> clicking on web page button, this to invoke:-
>
> (a new page, that is actually a file download HTML thingie?)
>
> Now it seems that if the opened URL is say a GET type form that takes
> some form of file ID, and is a PHP program itself, all I need to do is
> a mix of http_send_file() type stuff to push the data down a new
> browser 'window'

If you have the HTTP extention enabled (off be default)
http://nl3.php.net/manual/en/ref.http.php

>
> I.e conceptually if the button is a link to say :-
>
> or whatever (never mind the syntax:
> Thats what manuals are for) then essentially what my 'download_php'
> wants to do is:-
>
> - validate the user (REMOTE_USER) has rights to access the file, in
> case of spoofing by manually typing the above command..

Check.

> - send a load of header data (this is where I am unclear)

What 'load'? What is you intended behaviour?

> - send the file
> - go back home.

'Go back home' is not really an option PHP provides. If you however
click a link that would start a download a lot of UA's default behaviour
is to keep displaying the current page if the link just triggers a
download instead of a new page. If you want more (complex), javascript
(or any client side solution, java and flash could also be used) is the
way to go.

>
> Now in most cases these are files that do not require an application
> to open them.
>
> I want to ensure they get downloaded to disk,and only if the local
> browser recognizes them, should they get opened by a local app.

Well, you cannot really control a browser. People might have preferences
overriding your intended behaviour.

> Now the standard blurb shows this fragment as most of what I want, I
> think:
>
> <?php
> http_send_content_disposition("document.pdf", true);
> http_send_content_type("application/pdf");
> http_throttle(0.1, 2048);
> http_send_file("/my_inaccessible_to_apache_path/report.pdf") ;
> ?>
>
> But I need someone to confirm that this is the general approach that
> will get me what I want,

I'm not really familiar with the HTTP extention, a normal PHP approach
would be:

<?php
header('Content-type: application/pdf');
header('Content-Disposition: attachment; filename="document.pdf"');
readfile('/path/to/file');
?>

If you really want a chunked upload check out the user comments at
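
From memory, the usual chunked variant looks something like this (the 8K
buffer size is arbitrary, tune to taste):

<?php
header('Content-Type: application/pdf');
header('Content-Disposition: attachment; filename="document.pdf"');
header('Content-Length: ' . filesize('/path/to/file'));

$fp = fopen('/path/to/file', 'rb');
while (!feof($fp)) {
    echo fread($fp, 8192);   // send the file in 8K chunks rather than one readfile()
    flush();                 // push each chunk out to the client as it is read
}
fclose($fp);
?>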

> and enlighten me as to what mime types I should use..

The real mime-type as far as possible.

> and how this will interact with the browser, and its understanding of
> MIME types.

Which is configurable by the user in most current UA's. For the most
part you'll just have to trust your users to have their browser
configured according to their wishes. Offcourse, for some major browsers
there are some 'tricks', but they are just that: tricks, not a reliable,
certain configuration.
--
Rik Wasmus

My new ISP's newsserver sucks. Anyone recommend a good one? Paying for
quality is certainly an option.

Re: Hello: not so much php, more like HTML..?

am 01.09.2007 11:42:59 von Courtney

Rik Wasmus wrote:
> On Fri, 31 Aug 2007 14:14:53 +0200, The Natural Philosopher wrote:
>
>> Ok. I need a little guidance here.
>>
>> Environment.
>> ===========
>>
>> Apache2/PHP5/Mysql/debian etch.
>>
>> Application
>> ===========
>>
>> I want data files to be stored somewhere apache can't get to them
>> directly, for security, but be available for download via a PHP script
>
> You mean Apache CAN get them, it's just not available with a straight
> HTTP request (allthough it might not be an issue here there is a serious
> difference).
>
>> (after various authentication stuff that seems to work so far) by
>> clicking on web page button, this to invoke:-
>>
>> (a new page, that is actually a file download HTML thingie?)
>>
>> Now it seems that if the opened URL is say a GET type form that takes
>> some form of file ID, and is a PHP program itself, all I need to do is
>> a mix of http_send_file() type stuff to push the data down a new
>> browser 'window'
>
> If you have the HTTP extention enabled (off be default)
> http://nl3.php.net/manual/en/ref.http.php
>
>>
>> I.e conceptually if the button is a link to say :-
>>
>>
>> or whatever (never mind the
>> syntax: Thats what manuals are for) then essentially what my
>> 'download_php' wants to do is:-
>>
>> - validate the user (REMOTE_USER) has rights to access the file, in
>> case of spoofing by manually typing the above command..
>
> Check.
>
>> - send a load of header data (this is where I am unclear)
>
> What 'load'? What is you intended behaviour?
>
>> - send the file
>> - go back home.
>
> 'Go back home' is not really an option PHP provides. If you however
> click a link that would start a download a lot of UA's default behaviour
> is to keep displaying the current page if the link just triggers a
> download instead of a new page. If you want more (complex), javascript
> (or any client side solution, java and flash could also be used) is the
> way to go.
>
>>
>> Now in most cases these are files that do not require an application
>> to open them.
>>
>> I want to ensure they get downloaded to disk,and only if the local
>> browser recognizes them, should they get opened by a local app.
>
> Well, you cannot really control a browser. People might have preferences
> overriding your intended behaviour.
>
>> Now the standard blurb shows this fragment as most of what I want, I
>> think:
>>
>> <?php
>> http_send_content_disposition("document.pdf", true);
>> http_send_content_type("application/pdf");
>> http_throttle(0.1, 2048);
>> http_send_file("/my_inaccessible_to_apache_path/report.pdf") ;
>> ?>
>>
>> But I need someone to confirm that this is the general approach that
>> will get me what I want,
>
> I'm not really familiar with the HTTP extention, a normal PHP approach
> would be:
>
> <?php
> header('Content-type: application/pdf');
> header('Content-Disposition: attachment; filename="document.pdf"');
> readfile('/path/to/file');
> ?>
>
> If you really want a chunked upload check out the user comments at
>
>

Are you dyslexic, I want to DOWNLOAD. Upload I have done already.

>> and enlighten me as to what mime types I should use..
>
> The real mime-type as far as possible.
>
>> and how this will interact with the browser, and its understanding of
>> MIME types.
>
> Which is configurable by the user in most current UA's. For the most
> part you'll just have to trust your users to have their browser
> configured according to their wishes. Offcourse, for some major browsers
> there are some 'tricks', but they are just that: tricks, not a reliable,
> certain configuration.

I found most of what I needed..the extra headers to more finely control
the download.

will piss around with it more when the next form is written..

Only issue left is whether I should split the output into time-outable
chunks. 99% of the downloads will be over the local ethernet, so I
probably won't bother.

Re: Hello: not so much php, more like HTML..?

am 01.09.2007 13:13:33 von luiheidsgoeroe

On Sat, 01 Sep 2007 11:42:59 +0200, The Natural Philosopher wrote:
>> If you really want a chunked upload check out the user comments at
>>
>>
>
> Are you dyslexic, I want to DOWNLOAD. Upload I have done already.


bah, 't was late last night... I meant to say download, allthough it
depends wether you think usercentric or usercentric which is which :P
--
Rik Wasmus

My new ISP's newsserver sucks. Anyone recommend a good one? Paying for
quality is certainly an option.

Re: Hello: not so much php, more like HTML..?

am 01.09.2007 13:28:13 von luiheidsgoeroe

On Sat, 01 Sep 2007 13:13:33 +0200, Rik Wasmus
wrote:

> On Sat, 01 Sep 2007 11:42:59 +0200, The Natural Philosopher
> wrote:
>>> If you really want a chunked upload check out the user comments at
>>>
>>>
>>
>> Are you dyslexic, I want to DOWNLOAD. Upload I have done already.
>
>
> bah, 't was late last night... I meant to say download, allthough it
> depends wether you think usercentric or usercentric which is which :P

Hmmmz, haven't made a full recovery yet I see, "usercentric or
servercentric"...

--
Rik Wasmus

My new ISP's newsserver sucks. Anyone recommend a good one? Paying for
quality is certainly an option.

Re: Hello: not so much php, more like HTML..?

am 01.09.2007 20:39:06 von Courtney

Rik Wasmus wrote:
> On Sat, 01 Sep 2007 13:13:33 +0200, Rik Wasmus
> wrote:
>
>> On Sat, 01 Sep 2007 11:42:59 +0200, The Natural Philosopher
>> wrote:
>>>> If you really want a chunked upload check out the user comments at
>>>>
>>>>
>>>
>>> Are you dyslexic, I want to DOWNLOAD. Upload I have done already.
>>
>>
>> bah, 't was late last night... I meant to say download, allthough it
>> depends wether you think usercentric or usercentric which is which :P
>
> Hmmmz, haven't made a full recovery yet I see, "usercentric or
> servercentric"...
>
its Saturday, Get a bottle of decent spirits and relax. ;-)

Anyway I have enough onfo op spec out that part of e jobn. Muy necxt
problem is wheher its more efficient to have 5000 files all called
00001, 00002...05000 in one directory, or whether to split them up over
several..and whether to keep their names and extensions intact, or just
be lazy, knowing the data base knows what they were called. hey ho. ;-)

Re: Hello: not so much php, more like HTML..?

am 01.09.2007 21:26:46 von luiheidsgoeroe

On Sat, 01 Sep 2007 20:39:06 +0200, The Natural Philosopher wrote:

> Rik Wasmus wrote:
>> On Sat, 01 Sep 2007 13:13:33 +0200, Rik Wasmus
>> wrote:
>>
>>> On Sat, 01 Sep 2007 11:42:59 +0200, The Natural Philosopher
>>> wrote:
>>>>> If you really want a chunked upload check out the user comments at
>>>>>
>>>>>
>>>>
>>>> Are you dyslexic, I want to DOWNLOAD. Upload I have done already.
>>>
>>>
>>> bah, 't was late last night... I meant to say download, allthough it
>>> depends wether you think usercentric or usercentric which is which :P
>> Hmmmz, haven't made a full recovery yet I see, "usercentric or
>> servercentric"...
>>
> its Saturday, Get a bottle of decent spirits and relax. ;-)
>
> Anyway I have enough onfo op spec out that part of e jobn. Muy necxt
> problem is wheher

Hehe, taking you own advice? :P

> its more efficient to have 5000 files all called 00001, 00002...05000 in
> one directory, or whether to split them up over several..and whether to
> keep their names and extensions intact, or just be lazy, knowing the
> data base knows what they were called.

Hmm, depends on the file system, not really an expert there. Jerry would
tell you to just serve them up straight from the database, and forget
about the filesystem, I'm not so sure :). You can do a little testing
offcourse, see what works best.
--
Rik Wasmus

My new ISP's newsserver sucks. Anyone recommend a good one? Paying for
quality is certainly an option.

Re: Hello: not so much php, more like HTML..?

am 01.09.2007 21:33:33 von Jerry Stuckle

Rik Wasmus wrote:
> On Sat, 01 Sep 2007 20:39:06 +0200, The Natural Philosopher wrote:
>
>> Rik Wasmus wrote:
>>> On Sat, 01 Sep 2007 13:13:33 +0200, Rik Wasmus
>>> wrote:
>>>
>>>> On Sat, 01 Sep 2007 11:42:59 +0200, The Natural Philosopher
>>>> wrote:
>>>>>> If you really want a chunked upload check out the user comments
>>>>>> at
>>>>>>
>>>>>
>>>>> Are you dyslexic, I want to DOWNLOAD. Upload I have done already.
>>>>
>>>>
>>>> bah, 't was late last night... I meant to say download, allthough it
>>>> depends wether you think usercentric or usercentric which is which :P
>>> Hmmmz, haven't made a full recovery yet I see, "usercentric or
>>> servercentric"...
>>>
>> its Saturday, Get a bottle of decent spirits and relax. ;-)
>>
>> Anyway I have enough onfo op spec out that part of e jobn. Muy necxt
>> problem is wheher
>
> Hehe, taking you own advice? :P
>
>> its more efficient to have 5000 files all called 00001, 00002...05000
>> in one directory, or whether to split them up over several..and
>> whether to keep their names and extensions intact, or just be lazy,
>> knowing the data base knows what they were called.
>
> Hmm, depends on the file system, not really an expert there. Jerry would
> tell you to just serve them up straight from the database, and forget
> about the filesystem, I'm not so sure :). You can do a little testing
> offcourse, see what works best.

Rik,

Yep, it works great - and the more images you have, the better it works.
Databases are far better at massive numbers of elements than file systems.

But he's already decided I don't know what I'm talking about because I
told him he should have E_NOTICE enabled.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================

Re: Hello: not so much php, more like HTML..?

am 01.09.2007 23:22:11 von unknown

Post removed (X-No-Archive: yes)

Re: Hello: not so much php, more like HTML..?

am 02.09.2007 00:58:09 von Courtney

Rik Wasmus wrote:

>> its more efficient to have 5000 files all called 00001, 00002...05000
>> in one directory, or whether to split them up over several..and
>> whether to keep their names and extensions intact, or just be lazy,
>> knowing the data base knows what they were called.
>
> Hmm, depends on the file system, not really an expert there. Jerry would
> tell you to just serve them up straight from the database, and forget
> about the filesystem, I'm not so sure :).

I was wondering about that. It has a certain appeal..Would MySQL handle
60Mbyte Bitmap? I suppose so, when all is said and done..a LONGBLOB
would be enough..I suppose my fear is that the database might get
corrupted..at least with the files stashed on disk one could have some
chance of recovery.. Hmm.

Also performance..how long does it take to take the uploaded temporary
file and insert it into the database?

To move its directory and change its name is trivial..

> You can do a little testing
> offcourse, see what works best.


mmm.

Will php allow you to read a 60Mbyte file into a string, and toss it
around like a 16 byte customer name?

If I could guarantee all files were small, I'd go that route.

But I just know someone is going to want to upload a whole movie one day
:-) :-)

Re: Hello: not so much php, more like HTML..?

am 02.09.2007 01:14:55 von Courtney

Gary L. Burnore wrote:
> On Sat, 01 Sep 2007 21:26:46 +0200, "Rik Wasmus"
> wrote:
>
>> On Sat, 01 Sep 2007 20:39:06 +0200, The Natural Philosopher wrote:
>>
>>> Rik Wasmus wrote:
>>>> On Sat, 01 Sep 2007 13:13:33 +0200, Rik Wasmus
>>>> wrote:
>>>>
>>>>> On Sat, 01 Sep 2007 11:42:59 +0200, The Natural Philosopher
>>>>> wrote:
>>>>>>> If you really want a chunked upload check out the user comments at
>>>>>>>
>>>>>>>
>>>>>> Are you dyslexic, I want to DOWNLOAD. Upload I have done already.
>>>>>
>>>>> bah, 't was late last night... I meant to say download, allthough it
>>>>> depends wether you think usercentric or usercentric which is which :P
>>>> Hmmmz, haven't made a full recovery yet I see, "usercentric or
>>>> servercentric"...
>>>>
>>> its Saturday, Get a bottle of decent spirits and relax. ;-)
>>>
>>> Anyway I have enough onfo op spec out that part of e jobn. Muy necxt
>>> problem is wheher
>> Hehe, taking you own advice? :P
>>
>>> its more efficient to have 5000 files all called 00001, 00002...05000 in
>>> one directory, or whether to split them up over several..and whether to
>>> keep their names and extensions intact, or just be lazy, knowing the
>>> data base knows what they were called.
>> Hmm, depends on the file system, not really an expert there. Jerry would
>> tell you to just serve them up straight from the database, and forget
>> about the filesystem, I'm not so sure :). You can do a little testing
>> offcourse, see what works best.
>
> The more files you store in one directory, the harder the OS has to
> work to list through them.
Ah, but the more subdirectories you have in a file system, the more work...

;-)

I.e. I think that the case where each file is in its own subdirectory is
of similar order to no subdirs at all.

I suspect the answer is that for n files, us the pth root of n as the
number of subdirs, where p is the depth of the subdirs...but a lot
depends on caching algorithms in the directories. AND the way the OS
searches them.

I don't know what a directory entry is these days..256 bytes? well
10,000 of them is only 2.56Mytes or so. Should be well within cache
range. let's say it has to do maybe 1000 machine cycles for every
byte...thats 500K bytes a second searched at 500Mhz..4 seconds for a
linear search to find the last one. Mm. That's significant.

whereas a two level hierearchy? a 100 dirs in the first, and 100 files
in each? 80 milliseconds. Give or take...starting to get into disk
latency times anyway..

well I have no idea how the OS searches the directory, but it seems to
me that a hundred dirs of a hundred files each has to be the way to go.

Shouldn't be that hard either: just use an autoincrement on each tag
record, and every time it gets to modulo one hundred create a new directory.
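
Something like this is what I have in mind (the path and the
hundred-per-directory split are arbitrary):

<?php
// Place file number $id (the autoincrement) into a two-level layout,
// e.g. id 4321 -> /data/files/43/04321
$id     = 4321;                                   // from the tag record
$subdir = sprintf('/data/files/%02d', (int)($id / 100) % 100);
$path   = sprintf('%s/%05d', $subdir, $id);

if (!is_dir($subdir)) {
    mkdir($subdir, 0750, true);                   // create the bucket on demand
}
// move_uploaded_file($_FILES['userfile']['tmp_name'], $path);
?>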

Re: Hello: not so much php, more like HTML..?

am 02.09.2007 03:24:18 von Courtney

The Natural Philosopher wrote:
> Gary L. Burnore wrote:
>> On Sat, 01 Sep 2007 21:26:46 +0200, "Rik Wasmus"
>> wrote:
>>
>>> On Sat, 01 Sep 2007 20:39:06 +0200, The Natural Philosopher
>>> wrote:
>>>
>>>> Rik Wasmus wrote:
>>>>> On Sat, 01 Sep 2007 13:13:33 +0200, Rik Wasmus
>>>>> wrote:
>>>>>
>>>>>> On Sat, 01 Sep 2007 11:42:59 +0200, The Natural Philosopher
>>>>>> wrote:
>>>>>>>> If you really want a chunked upload check out the user comments
>>>>>>>> at
>>>>>>>>
>>>>>>> Are you dyslexic, I want to DOWNLOAD. Upload I have done already.
>>>>>>
>>>>>> bah, 't was late last night... I meant to say download, allthough
>>>>>> it depends wether you think usercentric or usercentric which is
>>>>>> which :P
>>>>> Hmmmz, haven't made a full recovery yet I see, "usercentric or
>>>>> servercentric"...
>>>>>
>>>> its Saturday, Get a bottle of decent spirits and relax. ;-)
>>>>
>>>> Anyway I have enough onfo op spec out that part of e jobn. Muy
>>>> necxt problem is wheher
>>> Hehe, taking you own advice? :P
>>>
>>>> its more efficient to have 5000 files all called 00001,
>>>> 00002...05000 in one directory, or whether to split them up over
>>>> several..and whether to keep their names and extensions intact, or
>>>> just be lazy, knowing the data base knows what they were called.
>>> Hmm, depends on the file system, not really an expert there. Jerry
>>> would tell you to just serve them up straight from the database, and
>>> forget about the filesystem, I'm not so sure :). You can do a little
>>> testing offcourse, see what works best.
>>
>> The more files you store in one directory, the harder the OS has to
>> work to list through them.
> Ah, but the more subdirectories you have in a file system, the more work...
>
> ;-)
>
> I.e. I think that the case where each file is in its own subdirectory is
> of similar order to no subdirs at all.
>
> I suspect the answer is that for n files, us the pth root of n as the
> number of subdirs, where p is the depth of the subdirs...but a lot
> depends on caching algorithms in the directories. AND the way the OS
> searches them.
>
> I don't know what a directory entry is these days..256 bytes? well
> 10,000 of them is only 2.56Mytes or so. Should be well within cache
> range. let's say it has to do maybe 1000 machine cycles for every
> byte...thats 500K bytes a second searched at 500Mhz..4 seconds for a
> linear search to find the last one. Mm. That's significant.
>
> whereas a two level hierearchy? a 100 dirs in the first, and 100 files
> in each? 80 milliseconds. Give or take...starting to get into disk
> latency times anyway..
>
> well I have no idea how the OS searches the directory, but it seems to
> me that a hundred dirs of a hundred files each has to be the way to go.
>
> Shouldn't be that hard either: just use an autoincrement on each tag
> record, and every time it gets to modulo one hundred create a new
> directory.
>
>
>
>
Further info: after some research, EXT3 filesystems can have 'dir_index'
set, which adds a hashed Btree into the equation. Mine appear to have
this, i.e. a lot of what splitting the big directory into smaller ones
would do is already done by this mechanism.

I suspect this means that retrieval time using the database itself, and
an ext3 hashed file structure, are pretty similar.

Ergo the answers would seem to be

- On a hashed treed filesystem there is little advantage to going to
subdirs provided you KNOW THE FILE YOU WANT.
- Access to a database LONGBLOB is probably no faster, and not much slower.
- however the real issue is the file storage time into the database
itself. And how PHP might cope with super large 'strings'

Here is an interesting fragment..complete with at least one error..they
don't delete the temporary file..

http://www.php-mysql-tutorial.com/php-mysql-upload.php

I shudder to think how long adding slashes to a 60Mbyte binary image
might take, or indeed how much virtual memory holding it in a 60Mbyte
php string might use..

Anyone ever handled BLOB objects that large, into a Mysql database?

I guess it's test time...it WOULD be kinda nice to store it all in the
database.

Lets see..whats an A0 image..48x24 inches - at 24 bit 300dpi? Hmm
Uncompressed, about 60Mbyte..

I guess I could set a limit, and force them to compress down to 10Mbyte
or so..

PHP appears to set no intrinsic limit on string sizes, so reading the
whole thing into a string is at least feasible in theory. What worries
me is the addslashes bit. all of which is then removed by the mysql
API..yuk. so instead of a byte for byte copy, there has to be a 'read
file into ram, create new slashed string and deallocate memory from
first, give to sql, create de-slashed BLOB in RAM, write to ISAM file,
deallocate memory,do garbage..etc etc..

What would be better is to write to a BLOB from a file directly.

LOAD DATA might work..not quite sure how..insert a record, and load data
into just the blob?

The MySQL server also has restrictions on the maximum data size it can
accept over the socket ..hmm. Mine's latest and is 16Mbyte..so thats
OK..configurable..upwards..

Oh..I think this may work and avoid all this addslashing nonsense

$query = "UPDATE file_table SET
          file_contents = LOAD_FILE('" . $tmpfilname . "') WHERE id = $id";
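
One caveat I can see: LOAD_FILE() runs on the MySQL server side, so it
needs the FILE privilege, a path readable by the mysqld process, and it
is still limited by max_allowed_packet. The other slash-free route would
be mysqli's send_long_data(), feeding the blob in chunks from the
temporary file, roughly like this ($id and $tmpfilname as above, the
connection details invented):

<?php
$db = new mysqli('localhost', 'user', 'pass', 'mydb');

// 'b' marks the first placeholder as a blob; its data is supplied in chunks below
$stmt = $db->prepare('UPDATE file_table SET file_contents = ? WHERE id = ?');
$null = null;
$stmt->bind_param('bi', $null, $id);

$fp = fopen($tmpfilname, 'rb');
while (!feof($fp)) {
    // keep each chunk comfortably under max_allowed_packet
    $stmt->send_long_data(0, fread($fp, 524288));
}
fclose($fp);

$stmt->execute();
$stmt->close();
?>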

Any comments?

Re: Hello: not so much php, more like HTML..?

am 04.09.2007 09:48:20 von Toby A Inkster

The Natural Philosopher wrote:

> I suspect the answer is that for n files, us the pth root of n as the
> number of subdirs, where p is the depth of the subdirs...but a lot
> depends on caching algorithms in the directories. AND the way the OS
> searches them.

Look at the way mail queue and news spools work. This is software that has
to maintain a large number of smallish files, which have to be created and
deleted a lot, but very rarely modified. That sounds like roughly what you
want to do.

For a piece of mail queued with ID A1B2C3D4 it would store it in:

/var/spool/postfix/deferred/A/A1B2C3D4

That is, it stores it in sub-directories based on the first character of
the queue ID. For larger numbers of files, you could go even further:

/var/spool/postfix/deferred/A/A1/A1B2C3D4

An important aspect of this is that queue IDs are eight-digit codes
assigned pseudo-randomly rather than sequentially, so you don't end up with
a directory called "/1/" brimming with files, and a directory called "/2/"
which is almost empty.
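
A sketch of the same idea in PHP (the spool root, the two-level depth
and the ID generation are just for illustration):

<?php
// Spool-style layout: /var/spool/myapp/A/A1/A1B2C3D4
function spool_path($root, $id) {
    return $root . '/' . $id[0] . '/' . substr($id, 0, 2) . '/' . $id;
}

// pseudo-random 8-character ID, so the buckets fill evenly
$id   = strtoupper(substr(md5(uniqid('', true)), 0, 8));
$path = spool_path('/var/spool/myapp', $id);

if (!is_dir(dirname($path))) {
    mkdir(dirname($path), 0750, true);   // create both directory levels at once
}
// file_put_contents($path, $data);      // or move_uploaded_file() etc.
?>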

--
Toby A Inkster BSc (Hons) ARCS
[Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux]
[OS: Linux 2.6.12-12mdksmp, up 75 days, 11:19.]

TrivialEncoder/0.2
http://tobyinkster.co.uk/blog/2007/08/19/trivial-encoder/

Re: Hello: not so much php, more like HTML..?

am 04.09.2007 13:19:32 von Courtney

Toby A Inkster wrote:
> The Natural Philosopher wrote:
>
>> I suspect the answer is that for n files, us the pth root of n as the
>> number of subdirs, where p is the depth of the subdirs...but a lot
>> depends on caching algorithms in the directories. AND the way the OS
>> searches them.
>
> Look at the way mail queue and news spools work. This is software that has
> to maintain a large number of smallish files, which have to be created and
> deleted a lot, but very rarely modified. That sounds like roughly what you
> want to do.
>
> For a piece of mail queued with ID A1B2C3D4 it would store it in:
>
> /var/spool/postfix/deferred/A/A1B2C3D4
>
> That is, it stores it in sub-directories based on the first character of
> the queue ID. For larger numbers of files, you could go even further:
>
> /var/spool/postfix/deferred/A/A1/A1B2C3D4
>
> An important aspect of this is that queue IDs are eight-digit codes
> assigned pseudo-randomly rather than sequentially, so you don't end up with
> a directory called "/1/" brimming with files, and a directory called "/2/"
> which is almost empty.
>
Well thanks for the reply..I discovered that

1/. my ext3 filesystem uses a hashed btree index

2/. so does my database

3/. ergo, it was easier to stuff them direct in the database, or would
have been but for a weird issue that kept me up all last night. see
other thread.. I thought it was something simple, but it looks like deep
weirdness at the core of PHP to me..but then everything does until you
find the explanation.