web-site database speed

on 24.02.2006 22:00:23 by Justin C

I'm automating the website for work (automating in that the pages are
not static, they're created depending on the clicked link - yeah, we
are so behind the times, everyone has been doing it this way forever).
We've about 3500 images (growing by about 50-100 per week) that can be
grouped in different ways - about 400 different sub-sets, and each
image could be in up to 12 sub-sets. Rather than create static pages
to cope with that, the pages have to be generated on the fly. The
problem is the data for each of the images: it's all in one Excel
spreadsheet (well, I say Excel; Excel edits it, but it's saved as CSV)
and the spreadsheet is 1.5 MB. My concern is the time taken to open
and parse this file each time someone clicks a link (it's all being
held/run on our ISP's web-servers).

I've seen that Perl has built-in database functionality (it's covered
in the back of the Llama). If I extract the data from the CSV file
before upload and create a Perl database file, will it run quicker?
How about if the data gets put into the page-creating script itself?

I know a 1.5MB file isn't *that* big but if someone is browsing the site
and wants to look at 100 images, that's 100 times it's opened and
closed, there's gotta be a performance hit there somewhere.

Thanks for any replies/suggestions.

Justin.

PS. The reason I've not given the Perl database stuff more of a look
(I wish I hadn't left the Llama on my desk - I'd look smarter calling
it by its name) is that my Perl coding is slow, and just the
page-generating stuff is taking me a lot of time (I've been using Perl
for a while, but very infrequently). I'm hoping to get some feedback
to steer me in the right direction before I invest too much time in
something that may be a waste of effort.

--
Justin C, by the sea.

Re: web-site database speed

on 24.02.2006 23:01:46 by Matt Garrish

"Justin C" wrote in message
news:7e3c.43ff73e7.a543f@stigmata...
>
> I'm automating the website for work (automating in that the pages are
> not static, they're created depending on clicked link - yeah, we are so
> behind the times, everyone has been doing it this way forever). We've
> about 3500 images (growing by about 50 - 100 per week) that can be
> grouped in different ways - about 400 different sub-sets and each image
> could be in up to 12 sub-sets. Rather than create static pages to cope
> with that the pages have to be generated on the fly. The problem is the
> data for each of the images; it's in one Excel spreadsheet (well, I say
> Excel, Excel edits it but it's saved as CSV) and the spreadsheet is 1.5
> MB. My concern is the time taken to open and parse this file each time
> someone clicks a link (it's all being held/run on our ISPs web-servers).
>
> I've seen that Perl has built in database functionality (it's covered in
> the back of the Llama). If I, before up-load, extract the data from the
> CSV file and create a Perl database file, will the running be quicker?
> How about if the data gets put into the page-creating script itself?
>
> I know a 1.5MB file isn't *that* big but if someone is browsing the site
> and wants to look at 100 images, that's 100 times it's opened and
> closed, there's gotta be a performance hit there somewhere.
>

I assume you mean DBM? (As per the dbmopen/dbmclose functions in perlfunc?)
There's no such thing as a Perl database file that I've ever heard of. All
you'd be doing is converting to a defined db format. You'll still have to
load that file every time your script runs, and I don't see that it would be
faster than loading a CSV.

If you read the docs, you'll see that the dbm functions have been
superseded by tie/untie, and that's what I'd suggest so you can avoid
changing formats. Take a look at Tie::Handle::CSV - it's probably all
you need.

Don't forget that 1.5 MB *is* tiny, and reading the file from disk
will probably be faster than using a real database and making a
connection through the DBI. Nothing beats a benchmark, though, so you
ought to write a simple little script that ties the CSV file and spits
an image to the browser, and then check how many times it will run per
second. I suspect you'll find it will more than adequately handle your
traffic needs for some time to come (and it shouldn't be more than a
few lines to write and maintain).

Matt
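A minimal sketch of the baseline Matt is describing: scan the CSV on
every request and emit HTML for one image. The file name, column
layout, and query key are all invented for the sketch, and the plain
split() assumes no fields contain embedded commas or quotes (real
quoted CSV would want Tie::Handle::CSV or Text::CSV instead):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical data file and lookup key for the sketch.
my $csv_file = 'images.csv';
my $wanted   = 'img42.jpg';

# Build a stand-in for the real spreadsheet export, one row per image.
open my $out, '>', $csv_file or die "write $csv_file: $!";
print {$out} "img$_.jpg,Caption $_,subset" . ($_ % 12) . "\n" for 1 .. 3500;
close $out;

# Per-request work: scan the file line by line until the row turns up.
my $html = '';
open my $in, '<', $csv_file or die "read $csv_file: $!";
while (my $line = <$in>) {
    chomp $line;
    my ($name, $caption, $subset) = split /,/, $line;
    if ($name eq $wanted) {
        $html = qq{<img src="$name" alt="$caption"> ($subset)};
        last;
    }
}
close $in;
print "$html\n";
```

Wrapping something like this in Benchmark's timethese() would give the
requests-per-second figure Matt suggests measuring.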

Re: web-site database speed

on 26.02.2006 20:25:55 by Joe Smith

Matt Garrish wrote:

>> I've seen that Perl has built in database functionality (it's covered in
>> the back of the Llama). If I, before up-load, extract the data from the
>> CSV file and create a Perl database file, will the running be quicker?
>> How about if the data gets put into the page-creating script itself?
>
> I assume you mean DBM? (As per the dbmopen/dbmclose functions in perlfunc?)
> There's no such thing as a Perl database file that I've ever heard of. All
> you'd be doing is converting to a defined db format. You'll still have to
> load that file every time your script runs, and I don't see that it would be
> faster than loading a CSV.

You are mistaken. Perl has five native database formats.

Because the database has an index, the entire file does not have to be
loaded when looking at a subset of its keys.
-Joe

linux% perldoc AnyDBM_File
AnyDBM_File(3) User Contributed Perl Documentation AnyDBM_File(3)

NAME
AnyDBM_File - provide framework for multiple DBMs

NDBM_File, DB_File, GDBM_File, SDBM_File, ODBM_File - various DBM
implementations

SYNOPSIS
use AnyDBM_File;

DESCRIPTION
This module is a "pure virtual base class"--it has nothing of its own.
It's just there to inherit from one of the various DBM packages. It
prefers ndbm for compatibility reasons with Perl 4, then Berkeley DB
(See DB_File), GDBM, SDBM (which is always there--it comes with Perl),
and finally ODBM. This way old programs that used to use NDBM via
dbmopen() can still do so, but new ones can reorder @ISA:

BEGIN { @AnyDBM_File::ISA = qw(DB_File GDBM_File NDBM_File) }
use AnyDBM_File;

Having multiple DBM implementations makes it trivial to copy database
formats:

use POSIX; use NDBM_File; use DB_File;
tie %newhash, 'DB_File', $new_filename, O_CREAT|O_RDWR;
tie %oldhash, 'NDBM_File', $old_filename, 1, 0;
%newhash = %oldhash;

DBM Comparisons

Here's a partial table of features the different packages offer:

odbm ndbm sdbm gdbm bsd-db
---- ---- ---- ---- ------
Linkage comes w/ perl yes yes yes yes yes
Src comes w/ perl no no yes no no
Comes w/ many unix os yes yes[0] no no no
Builds ok on !unix ? ? yes yes ?
Code Size ? ? small big big
Database Size ? ? small big? ok[1]
Speed ? ? slow ok fast
FTPable no no yes yes yes
Easy to build N/A N/A yes yes ok[2]
Size limits 1k 4k 1k[3] none none
Byte-order independent no no no no yes
Licensing restrictions ? ? no yes no

[0] on mixed universe machines, may be in the bsd compat library, which
is often shunned.

[1] Can be trimmed if you compile for one access method.

[2] See DB_File. Requires symbolic links.

[3] By default, but can be redefined.

SEE ALSO
dbm(3), ndbm(3), DB_File(3), perldbmfilter

perl v5.8.6 2005-12-14 AnyDBM_File(3)
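A sketch of Joe's point using SDBM_File, which ships with Perl: a
one-time conversion before upload turns each CSV row into a key/value
pair, and a per-request lookup then ties the file and fetches one key
without reading the whole database. File names and the record layout
are invented for the example:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(O_CREAT O_RDWR O_RDONLY);
use SDBM_File;

# One-time conversion, run locally before upload: each row becomes a
# key/value pair keyed on the image name.
my %build;
tie %build, 'SDBM_File', 'images_db', O_CREAT | O_RDWR, 0644
    or die "tie (create): $!";
$build{"img$_.jpg"} = "Caption $_,subset" . ($_ % 12) for 1 .. 3500;
untie %build;

# Per-request lookup: tie read-only and fetch a single key; the DBM
# index means the file is never loaded into memory in full.
my %db;
tie %db, 'SDBM_File', 'images_db', O_RDONLY, 0644
    or die "tie (read): $!";
my ($caption, $subset) = split /,/, $db{'img42.jpg'};
print "caption=$caption subset=$subset\n";
untie %db;
```

Note the size limit in the table above: SDBM caps each entry at about
1k by default, which is fine for short per-image records but worth
checking against the real data.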

Re: web-site database speed

on 26.02.2006 21:40:10 by Matt Garrish

"Joe Smith" wrote in message
news:3YCdnUmr6OVUnZ_ZRVn-pQ@comcast.com...
> Matt Garrish wrote:
>
>>> I've seen that Perl has built in database functionality (it's covered in
>>> the back of the Llama). If I, before up-load, extract the data from the
>>> CSV file and create a Perl database file, will the running be quicker?
>>> How about if the data gets put into the page-creating script itself?
>>
>> I assume you mean DBM? (As per the dbmopen/dbmclose functions in
>> perlfunc?) There's no such thing as a Perl database file that I've ever
>> heard of. All you'd be doing is converting to a defined db format. You'll
>> still have to load that file every time your script runs, and I don't see
>> that it would be faster than loading a CSV.
>
> You are mistaken. Perl has five native database formats.
>

Perl *supports* five database formats natively, which is what I found
confusing about the OP's post, having not read Learning Perl. They
aren't particular to Perl, and the OP would still have to convert the
Excel/CSV file to one of those formats before uploading it to the
server.

> Because the database has an index, the entire file does not have to be
> loaded when looking at a subset of its keys.

That was certainly bad on my part. I shouldn't have said load the
file; a lookup will still have to occur (and Perl will have to tie the
dbm file). I couldn't say how much faster this will be than using
Tie::Handle::CSV, though. That's left as an exercise for the person
with the problem... : )

Matt
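The exercise Matt leaves open can be sketched with the core Benchmark
module: build matching CSV and DBM copies of the same invented data
set, then time a full-file scan against a keyed DBM fetch (including
the cost of tying the file on every request):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use Fcntl qw(O_CREAT O_RDWR O_RDONLY);
use SDBM_File;

# Invented test data: the same 3500 rows in both formats.
open my $out, '>', 'bench.csv' or die "write: $!";
print {$out} "img$_.jpg,Caption $_\n" for 1 .. 3500;
close $out;

my %build;
tie %build, 'SDBM_File', 'bench_db', O_CREAT | O_RDWR, 0644 or die "tie: $!";
$build{"img$_.jpg"} = "Caption $_" for 1 .. 3500;
untie %build;

# Sanity-check the DBM copy before timing anything.
my %check;
tie %check, 'SDBM_File', 'bench_db', O_RDONLY, 0644 or die "tie: $!";
my $fetched = $check{'img3000.jpg'};
untie %check;

cmpthese(-1, {
    # Scan the CSV line by line until the wanted row turns up.
    csv_scan => sub {
        open my $in, '<', 'bench.csv' or die "read: $!";
        while (<$in>) { last if /^img3000\.jpg,/ }
        close $in;
    },
    # Tie the DBM file and fetch the row by key.
    dbm_fetch => sub {
        my %h;
        tie %h, 'SDBM_File', 'bench_db', O_RDONLY, 0644 or die "tie: $!";
        my $v = $h{'img3000.jpg'};
        untie %h;
    },
});
```

The relative numbers will depend on where in the file the wanted row
sits and on the server's disk cache, which is exactly why measuring
beats guessing here.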

Re: web-site database speed

on 28.02.2006 21:58:04 by Justin C

On 2006-02-26, Matt Garrish wrote:
>
> "Joe Smith" wrote in message
> news:3YCdnUmr6OVUnZ_ZRVn-pQ@comcast.com...
>> Matt Garrish wrote:
>>
>>>> I've seen that Perl has built in database functionality (it's covered in
>>>> the back of the Llama). If I, before up-load, extract the data from the
>>>> CSV file and create a Perl database file, will the running be quicker?
>>>> How about if the data gets put into the page-creating script itself?
>>>
>>> I assume you mean DBM? (As per the dbmopen/dbmclose functions in
>>> perlfunc?) There's no such thing as a Perl database file that I've ever
>>> heard of. All you'd be doing is converting to a defined db format. You'll
>>> still have to load that file every time your script runs, and I don't see
>>> that it would be faster than loading a CSV.
>>
>> You are mistaken. Perl has five native database formats.
>>
>
> Supports five database formats natively, which is what I found confusing
> about the OP's post, having not read Learning Perl. They aren't particular
> to Perl, and the OP would still have to change the excel/csv file to one of
> those formats prior to loading the file to the server.
>
>> Because the database has an index, the entire file does not have to be
>> loaded when looking at a subset of its keys.
>
> That was certainly bad on my part. I shouldn't have said load the file, but
> a lookup will still have to occur (and Perl will have to tie the dbm file).
> I couldn't say how much faster this will be than using Tie-Handle-CSV,
> though. That's left as an exercise for the person with the problem... : )

Thanks for the replies.

While Perl has no trouble talking to databases, there will be no
database back-end on my ISP's server. I was just looking to speed up
the opening/reading of a 1.5 MB file... but, like you say, it's *only*
1.5 MB - on our ISP's server, that's gonna be nothing. Still, I'll
give it all a go and, if I need to revise it, I'll think on the
comments here.

Thanks.

Justin.

--
Justin C, by the sea.