RE: optimizing PHP for microseconds

on 26.03.2010 03:07:38 by Daevid Vincent

> -----Original Message-----
> From: Robert Cummings [mailto:robert@interjinn.com]
> Sent: Thursday, March 25, 2010 6:25 AM
> To: Per Jessen
> Cc: php-general@lists.php.net
> Subject: Re: [PHP] Will PHP ever "grow up" and have threading?
>
> Per Jessen wrote:
> > Tommy Pham wrote:
> >
> >> (I remember a list member, not mentioning his name, does optimization
> >> of PHP coding for just microseconds. Do you think how much more he'd
> >> benefit from this?)
> >
> > Anyone who optimizes PHP for microseconds has lost touch with reality -
> > or at least forgotten that he or she is using an interpreted language.
>
> But sometimes it's just plain fun to do it here on the list with
> everyone further optimizing the last optimized snippet :)
>
> Cheers,
> Rob.

Was that someone me? I do that. And if you don't, then you're the kind of
person I would not hire (not saying that to sound mean). I use single
quotes instead of double where applicable. I use -- instead of ++. I use
$boolean = !$boolean to alternate (instead of mod() or other incrementing
solutions). I use "LIMIT 1" on select, update, delete where appropriate. I
use the session to cache the user and even query results. I don't use
bloated frameworks (like Symfony or Zend or Cake or whatever else tries to
be one-size-fits-all). The list goes on.
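
For illustration, the session-cache idea mentioned above usually looks
something like this (table, column and session key names are invented, and
it assumes an already-open mysql connection):

<?php
session_start();

// Cache the logged-in user's row in the session so it isn't re-fetched
// from the database on every request. Names here are illustrative.
if (!isset($_SESSION['user'])) {
    $res = mysql_query(
        'SELECT id, name, role FROM users WHERE id = '
        . (int)$_SESSION['user_id'] . ' LIMIT 1'
    );
    $_SESSION['user'] = mysql_fetch_assoc($res);
}
$user = $_SESSION['user'];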

I would counter and say that if you are NOT optimizing every little drop of
performance from your scripts, then you're either not running a site
sufficiently large enough to matter, or you're doing your customers a
disservice.

I come from the video game world where gaining a frame or two of animation
per second matters. It makes your game feel less choppy and more fluid and
therefore more fun to play.

If I have to wait 3 seconds for a page to render, that wait is noticeable.
Dumb users will click refresh, and since (unbelievably in this day and age)
PHP and mySQL don't know the user clicked 'stop' or 'refresh', and
therefore mySQL will execute the same query a second time. That's an
entirely different thread I've already ranted on about.

If you can shave off 0.1s from each row of a query result, after only 10
rows, you've saved the user 1 full second. But realistically, you are most
likely displaying hundreds (or in my case, thousands) of rows. Now I've
just saved this user 10s to 100s (that's a minute and a half!)

I'm dealing with TB databases with billions of rows and complex queries
that would make you (and often times me too) cringe in fright. Sure, if
you're dealing with your who-gives-a-shit "blog" website and all 20 entries
of crap-nobody-cares-about, then do whatever you want. But if you're doing
professional, enterprise level work, or have real customers who expect
performance, then you sure as hell better be considering all the ways to
speed up your page. They don't run in a vacuum. They don't just have a
single query.

d



Re: RE: optimizing PHP for microseconds

on 26.03.2010 03:15:59 by Robert Cummings

Daevid Vincent wrote:
>
> If I have to wait 3 seconds for a page to render, that wait is noticeable.
> Dumb users will click refresh, and since (unbelievably in this day and age)
> PHP and mySQL don't know the user clicked 'stop' or 'refresh', and
> therefore mySQL will execute the same query a second time. That's an
> entirely different thread I've already ranted on about.

You may find the following enlightening:

http://www.php.net/manual/en/function.ignore-user-abort.php
http://www.php.net/manual/en/function.connection-aborted.php
http://www.php.net/manual/en/function.connection-status.php
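
For illustration, a minimal sketch of how these functions might be combined
around a slow query (connection details, query and table name are made up,
and mysqli is assumed):

<?php
// Keep the script alive if the browser disconnects, so we can decide
// what to do rather than being killed mid-query.
ignore_user_abort(true);

$db = mysqli_connect('localhost', 'user', 'pass', 'example');

// Hypothetical long-running report query.
$result = mysqli_query($db, 'SELECT * FROM big_table WHERE expensive = 1');

// connection_aborted() is only updated after output is attempted,
// so push something to the client before checking.
echo ' ';
flush();

if (connection_aborted()) {
    // The user hit stop or refresh: log it and skip rendering.
    error_log('client went away, discarding result set');
    mysqli_free_result($result);
    exit;
}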

Cheers,
Rob.
--
http://www.interjinn.com
Application and Templating Framework for PHP


RE: RE: optimizing PHP for microseconds

on 26.03.2010 03:58:52 by Daevid Vincent

> -----Original Message-----
> From: Robert Cummings [mailto:robert@interjinn.com]
> Sent: Thursday, March 25, 2010 7:16 PM
>
> Daevid Vincent wrote:
> >
> > If I have to wait 3 seconds for a page to render, that wait
> is noticeable.
> > Dumb users will click refresh, and since (unbelievably in
> this day and age)
> > PHP and mySQL don't know the user clicked 'stop' or 'refresh', and
> > therefore mySQL will execute the same query a second time. That's an
> > entirely different thread I've already ranted on about.
>
> You may find the following enlightening:
>
> http://www.php.net/manual/en/function.ignore-user-abort.php
> http://www.php.net/manual/en/function.connection-aborted.php
> http://www.php.net/manual/en/function.connection-status.php
>

Except there is no way to tell mySQL "cancel that last request/query".
Well, no graceful way.

We actually have a script that runs on a crontab and seeks and destroys
"long running" queries. As you may have guessed, just because a query takes
a long time, it's difficult to know if it's actually hung or just really
taking that long. So we do some smarts to compare against others and see if
it seems like the same one and stuff like that. Not great, but sure stops
the load from shooting through the roof.
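
A stripped-down sketch of that kind of watchdog (credentials, the 300-second
threshold and the SELECT-only rule are placeholders):

<?php
// Cron-run watchdog: kill SELECTs that have run longer than $maxSeconds.
$maxSeconds = 300;
$db = mysqli_connect('localhost', 'watchdog', 'secret');

$procs = mysqli_query($db, 'SHOW FULL PROCESSLIST');
while ($row = mysqli_fetch_assoc($procs)) {
    $isSelect = stripos((string) $row['Info'], 'SELECT') === 0;
    if ($isSelect && $row['Command'] === 'Query' && $row['Time'] > $maxSeconds) {
        // KILL QUERY aborts the statement but leaves the connection alive.
        mysqli_query($db, 'KILL QUERY ' . (int) $row['Id']);
        error_log("killed query {$row['Id']} after {$row['Time']}s");
    }
}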

Again, not going into the rant I've done before. Look in the archives
2009-06-02 for "Why doesn't mySQL stop a query when the browser tab is
closed" for that thread and even more indepth info on the
mysql@lists.mysql.com archives (same date and subject).



Re: RE: optimizing PHP for microseconds

on 26.03.2010 04:15:59 by Robert Cummings

Daevid Vincent wrote:
>
>
>> -----Original Message-----
>> From: Robert Cummings [mailto:robert@interjinn.com]
>> Sent: Thursday, March 25, 2010 7:16 PM
>>
>> Daevid Vincent wrote:
>>> If I have to wait 3 seconds for a page to render, that wait
>> is noticeable.
>>> Dumb users will click refresh, and since (unbelievably in
>> this day and age)
>>> PHP and mySQL don't know the user clicked 'stop' or 'refresh', and
>>> therefore mySQL will execute the same query a second time. That's an
>>> entirely different thread I've already ranted on about.
>> You may find the following enlightening:
>>
>> http://www.php.net/manual/en/function.ignore-user-abort.php
>> http://www.php.net/manual/en/function.connection-aborted.php
>> http://www.php.net/manual/en/function.connection-status.php
>>
>
> Except there is no way to tell mySQL "cancel that last request/query".
> Well, no graceful way.
>
> We actually have a script that runs on a crontab and seeks and destroys
> "long running" queries. As you may have guessed, just because a query takes
> a long time, it's difficult to know if it's actually hung or just really
> taking that long. So we do some smarts to compare against others and see if
> it seems like the same one and stuff like that. Not great, but sure stops
> the load from shooting through the roof.
>
> Again, not going into the rant I've done before. Look in the archives
> 2009-06-02 for "Why doesn't mySQL stop a query when the browser tab is
> closed" for that thread and even more indepth info on the
> mysql@lists.mysql.com archives (same date and subject).

That's a good point about MySQL, and in fact PHP would probably keep
running too until MySQL returned.

Cheers,
Rob.
--
http://www.interjinn.com
Application and Templating Framework for PHP

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

Re: RE: optimizing PHP for microseconds

on 26.03.2010 07:05:36 by Tommy Pham

On Thu, Mar 25, 2010 at 8:15 PM, Robert Cummings wrote:
> Daevid Vincent wrote:
>>
>>
>>>
>>> -----Original Message-----
>>> From: Robert Cummings [mailto:robert@interjinn.com] Sent: Thursday, March
>>> 25, 2010 7:16 PM
>>>
>>> Daevid Vincent wrote:
>>>>
>>>> If I have to wait 3 seconds for a page to render, that wait
>>>
>>> is noticeable.
>>>>
>>>> Dumb users will click refresh, and since (unbelievably in
>>>
>>> this day and age)
>>>>
>>>> PHP and mySQL don't know the user clicked 'stop' or 'refresh', and
>>>> therefore mySQL will execute the same query a second time. That's an
>>>> entirely different thread I've already ranted on about.
>>>
>>> You may find the following enlightening:
>>>
>>> http://www.php.net/manual/en/function.ignore-user-abort.php
>>> http://www.php.net/manual/en/function.connection-aborted.php
>>> http://www.php.net/manual/en/function.connection-status.php
>>>
>>
>> Except there is no way to tell mySQL "cancel that last request/query".
>> Well, no graceful way.
>>
>> We actually have a script that runs on a crontab and seeks and destroys
>> "long running" queries. As you may have guessed, just because a query
>> takes
>> a long time, it's difficult to know if it's actually hung or just really
>> taking that long. So we do some smarts to compare against others and see
>> if
>> it seems like the same one and stuff like that. Not great, but sure stops
>> the load from shooting through the roof.
>>
>> Again, not going into the rant I've done before. Look in the archives
>> 2009-06-02 for "Why doesn't mySQL stop a query when the browser tab is
>> closed" for that thread and even more indepth info on the
>> mysql@lists.mysql.com archives (same date and subject).
>
> That's a good point about MySQL, and in fact PHP would probably keep running
> too until MySQL returned.
>
> Cheers,
> Rob.
> --
> http://www.interjinn.com
> Application and Templating Framework for PHP
>

What about 'SHOW FULL PROCESSLIST' and look through the 'INFO' for
that last matching query statement and kill the process?


Re: RE: optimizing PHP for microseconds

on 26.03.2010 07:17:13 by Robert Cummings

Tommy Pham wrote:
> On Thu, Mar 25, 2010 at 8:15 PM, Robert Cummings wrote:
>> Daevid Vincent wrote:
>>>
>>>> -----Original Message-----
>>>> From: Robert Cummings [mailto:robert@interjinn.com] Sent: Thursday, March
>>>> 25, 2010 7:16 PM
>>>>
>>>> Daevid Vincent wrote:
>>>>> If I have to wait 3 seconds for a page to render, that wait
>>>> is noticeable.
>>>>> Dumb users will click refresh, and since (unbelievably in
>>>> this day and age)
>>>>> PHP and mySQL don't know the user clicked 'stop' or 'refresh', and
>>>>> therefore mySQL will execute the same query a second time. That's an
>>>>> entirely different thread I've already ranted on about.
>>>> You may find the following enlightening:
>>>>
>>>> http://www.php.net/manual/en/function.ignore-user-abort.php
>>>> http://www.php.net/manual/en/function.connection-aborted.php
>>>> http://www.php.net/manual/en/function.connection-status.php
>>>>
>>> Except there is no way to tell mySQL "cancel that last request/query".
>>> Well, no graceful way.
>>>
>>> We actually have a script that runs on a crontab and seeks and destroys
>>> "long running" queries. As you may have guessed, just because a query
>>> takes
>>> a long time, it's difficult to know if it's actually hung or just really
>>> taking that long. So we do some smarts to compare against others and see
>>> if
>>> it seems like the same one and stuff like that. Not great, but sure stops
>>> the load from shooting through the roof.
>>>
>>> Again, not going into the rant I've done before. Look in the archives
>>> 2009-06-02 for "Why doesn't mySQL stop a query when the browser tab is
>>> closed" for that thread and even more indepth info on the
>>> mysql@lists.mysql.com archives (same date and subject).
>> That's a good point about MySQL, and in fact PHP would probably keep running
>> too until MySQL returned.
>>
>> Cheers,
>> Rob.
>> --
>> http://www.interjinn.com
>> Application and Templating Framework for PHP
>>
>
> What about 'SHOW FULL PROCESSLIST' and look through the 'INFO' for
> that last matching query statement and kill the process?

This is possible, but then you don't know whose query you are killing:
it could belong to a terminated PHP process or to an actively running
one with a connected user awaiting output. However, you could track PHP
process IDs and MySQL process IDs (via mysql_thread_id()) to know whose
MySQL process you are killing.
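
A rough sketch of that tracking idea (the query_registry table and its
columns are invented for illustration):

<?php
// At the start of a request, record which MySQL thread belongs to this
// PHP process so an external watchdog can kill the right query.
$link = mysql_connect('localhost', 'app', 'secret');
mysql_select_db('example', $link);

$mysqlThread = mysql_thread_id($link);   // MySQL connection/thread id
$phpPid      = getmypid();               // PHP process id

mysql_query(sprintf(
    'REPLACE INTO query_registry (php_pid, mysql_thread, started_at) VALUES (%d, %d, NOW())',
    $phpPid, $mysqlThread
), $link);

// ... run the heavy query here ...

// On clean completion, remove the entry so the watchdog leaves it alone.
mysql_query('DELETE FROM query_registry WHERE php_pid = ' . $phpPid, $link);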

Cheers,
Rob.
--
http://www.interjinn.com
Application and Templating Framework for PHP


Re: RE: optimizing PHP for microseconds

on 26.03.2010 08:33:55 by Per Jessen

Daevid Vincent wrote:

> Was that someone me? I do that. And if you don't, then you're the kind
> of person I would not hire (not saying that to sound mean).

If you do, I would be careful about hiring you. To me, optimizing for
microseconds in PHP means loss of focus.

> I use single quotes instead of double where applicable. I use --
> instead of ++. I use $boolean = !$boolean to alternate (instead of
> mod() or other incrementing solutions). I use "LIMIT 1" on select,
> update, delete where appropriate. I use the session to cache the user
> and even query results.

Most of that is just sound practice, not optimizing, imho. Optimizing
is what you do later.

> I come from the video game world where gaining a frame or two of
> animation per second matters. It makes your game feel less choppy and
> more fluid and therefore more fun to play.

Well, if you were writing PHP video games, I can totally appreciate
optimizing for microseconds.



--
Per Jessen, Zürich (11.4°C)



Re: optimizing PHP for microseconds

on 28.03.2010 18:04:11 by Nathan Rixham

mngghh, okay, consider me baited.

Daevid Vincent wrote:
>> Per Jessen wrote:
>>> Tommy Pham wrote:
>>>
>>>> (I remember a list member, not mentioning his name, does optimization
>>>> of PHP coding for just microseconds. Do you think how much more he'd
>>>> benefit from this?)
>>> Anyone who optimizes PHP for microseconds has lost touch with reality -
>>> or at least forgotten that he or she is using an interpreted language.
>> But sometimes it's just plain fun to do it here on the list with
>> everyone further optimizing the last optimized snippet :)
>>
>> Cheers,
>> Rob.
>
> Was that someone me? I do that. And if you don't, then you're the kind of
> person I would not hire (not saying that to sound mean). I use single
> quotes instead of double where applicable. I use -- instead of ++. I use
> $boolean = !$boolean to alternate (instead of mod() or other incrementing
> solutions). I use "LIMIT 1" on select, update, delete where appropriate. I
> use the session to cache the user and even query results. I don't use
> bloated frameworks (like Symfony or Zend or Cake or whatever else tries to
> be one-size-fits-all). The list goes on.

That's not optimization; at best it's just an awareness of PHP syntax
and a vague awareness of how the syntax will ultimately be interpreted.

Using "LIMIT 1" is not optimizing, it's just saying you only want one
result returned; the SQL query could still take five hours to run if
there are no indexes, a poorly normalised database, wrong datatypes, and
joins all over the place.

Using the session to cache "the user" is the only thing that comes
anywhere near to application optimisation in all you've said; and
frankly I would take that to be pretty obvious and basic stuff (yet
pointless in most scenarios where you have to cater for possible bans
and de-authorisations). Storing query results in a session cache is only
ever useful in one distinct scenario: when the results of that query are
only valid for the owner of the session, and only for the duration of
that session, nothing more, nothing less. This is a one in a million
scenario.

Bloated frameworks, most of the time they are not bloated, especially
when you use them properly and only include what you need on a need to
use basis; then the big framework can only be considered a class or two.
Sure the codebase seems more bloated, but at runtime it's easily
negated. You can use these frameworks for any size project, enterprise
included, provided you appreciated the strengths and weaknesses of the
full tech stack at your disposal. Further, especially on enterprise
projects it makes sense to drop development time by using a common
framework, and far more importantly, to have a code base developers know
well and can "hit the ground running" with.

Generally unless you have unlimited learning time and practically zero
budget constraints frameworks like the ones you mentioned should always
be used for large team enterprise applications, although perhaps
something more modular like Zend is suited. They also cover your own
back when you are the lead developer, because on the day when a more
experienced developer than yourself joins the project and points out all
your mistakes, you're going to feel pretty shite and odds are very high
that the project will go sour, get fully re-written or you'll have to
leave due to "stress" (of being wrong).

> I would counter and say that if you are NOT optimizing every little drop of
> performance from your scripts, then you're either not running a site
> sufficiently large enough to matter, or you're doing your customers a
> disservice.

Or you have no grasp of the tech stack available and certainly aren't
utilizing it properly; I'm not suggesting that knowing how to use your
language of choice well is a bad thing, it's great; knock yourself out.
However, suggesting that optimising a php script for microseconds will
boost performance in large sites (nay, any site) shows such a loss of
focus that it's hard to comprehend.

By also considering other posts from yourself (in reply to this and
other threads) I can firmly say the above is true of you.

Optimisation comes down to running the least amount of code possible,
and only when really needed. If you are running a script / query /
process which provides the same output more than once then you are not
optimising. This will be illustrated further down this reply perfectly.

The web itself is the ultimate scalable distributed application known to
man, and has been guided and created by those far more knowledgeable
than you or I (Berners-Lee, Fielding, Gödel, Turing et al), everything
you need is right there (and specifically in HTTP). Failing to leverage
this is where a lack of focus and scope comes in to play, especially
with large scale sites, and means you are doing your customers a disservice.

For anything where the output can be used more than once, (at a granular
level), the output should be cached.

For example, if you run SELECT / UPDATE/INSERT queries at a ratio any
higher than 1 SELECT per UPDATE/INSERT then you *will* get a sizeable
performance upgrade by caching the output. Another less granular example
would be a simple "blog": you can generate the page every time, or you
can only "publish" the page each time the post is updated or a comment
is added; and thus you can leverage the file system caches which most
operating systems have now, plus HTTP server caching, and HTTP caching
itself by utilizing Last-Modified and ETags and having 304 Not Modified
returned for any repeat requests.
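
As a rough sketch of the Last-Modified / ETag / 304 idea for a pre-published
page (the cache path is illustrative):

<?php
// Serve a pre-generated page and honour conditional requests.
$file  = '/var/cache/blog/post-42.html';   // hypothetical published page
$mtime = filemtime($file);
$etag  = '"' . md5($file . $mtime) . '"';

header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $mtime) . ' GMT');
header('ETag: ' . $etag);

$since = isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) ? $_SERVER['HTTP_IF_MODIFIED_SINCE'] : '';
$match = isset($_SERVER['HTTP_IF_NONE_MATCH']) ? $_SERVER['HTTP_IF_NONE_MATCH'] : '';

if (($match && $match === $etag) || ($since && strtotime($since) >= $mtime)) {
    header('HTTP/1.1 304 Not Modified');   // client copy is still good
    exit;
}

readfile($file);   // full response only when the client has nothing fresh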

> I come from the video game world where gaining a frame or two of animation
> per second matters. It makes your game feel less choppy and more fluid and
> therefore more fun to play.

Many lessons can be learned from the video game (and flash) worlds, but
these are generally just how to code well; most of the real
optimizations come from how you serialize data, minimise the amount of
output data + frequency at which it is sent; and moreover by compiler or
bytecode optimisations - some of this can cross over in to PHP world,
but not much since it's interpreted rather than compiled, and even less
since the same code isn't run hundreds of times per second - and if it
is, you are normally doing something wrong (in all but the most specific
of cases).

> If I have to wait 3 seconds for a page to render, that wait is noticeable.
> Dumb users will click refresh, and since (unbelievably in this day and age)
> PHP and mySQL don't know the user clicked 'stop' or 'refresh', and
> therefore mySQL will execute the same query a second time. That's an
> entirely different thread I've already ranted on about.

Render time is a totally different subject, since css/images/javascript
and more come in to play, not to mention the user's browser and machine
spec. This is usually improved by including image width and height in
your html (omit these and the user agent has to "sniff" all images to
get their dimensions before layout can be calculated and later
rendered), using static shared stylesheets which can be returned as 304
Not Modified, and including client-side scripts as deferred or after the
main body of content (hence why google analytics specifies placing
their javascript just before the closing </body> tag).

Now if there was one sentence in all of the recent posts which conveys
the amount of misunderstanding at play here, it's this one: "Dumb users
will click refresh, and since (unbelievably in this day and age) PHP and
mySQL don't know the user clicked 'stop' or 'refresh', and therefore
mySQL will execute the same query a second time."

No no no no no! Unbelievably in this day and age developers are still
creating systems where the "same queries" (implying the same output) can
be executed a second time (and indeed multiple times) by something as
mundane as a refresh.

If you learn anything from this, learn that this is the crux of the
failings: the output of that query, at the very least, should be cached
- thankfully your RDBMS is partially saving your ass half the time by
using its own cache.

PHP and MySQL are not being dumb here, you are in *full* control of what
happens in your application, and if you have it set up so that the same
things, producing the same results, are being run time after time, then
more fool you. That output should be saved, in memory or file, and used
the second time; ideally that full view (if accessed generally more than
once) should be persisted so that it can be served statically until part
of the view needs updated; then regenerate and repeat.
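
A very small file-cache sketch along those lines (cache location, TTL and
the rebuild function are made up):

<?php
// Return cached output if it is fresh, otherwise rebuild and store it.
function cached_fragment($key, $ttl, $rebuild)
{
    $path = '/tmp/cache_' . md5($key);     // illustrative cache location

    if (is_file($path) && (time() - filemtime($path)) < $ttl) {
        return file_get_contents($path);   // cache hit: no query, no rebuild
    }

    $html = $rebuild();                    // run the expensive bit once
    file_put_contents($path, $html, LOCK_EX);
    return $html;
}

// The closure only runs when the cache is cold or stale.
echo cached_fragment('front-page-rows', 60, function () {
    return build_front_page_rows();        // hypothetical heavy function
});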

> If you can shave off 0.1s from each row of a query result, after only 10
> rows, you've saved the user 1 full second. But realistically, you are most
> likely displaying hundreds (or in my case, thousands) of rows. Now I've
> just saved this user 10s to 100s (that's a minute and a half!)


O.M.G. am I reading these numbers correctly? shave off 0.1 seconds from
each row? saving the user 10-100 seconds? Just how are you coding these
applications!


In my world, if a "heavy" script is taking any more than 0.1 seconds to
run in its entirety we have a problem; honestly, I'm unsure what to
write here - the only constructive thought I have is: why don't we have
a "PHP week" on the list, where a standard application is created, then
we optimise the hell out of it and catalogue what was done for all to see.

We'd need:
2 temporary servers (one web, one db : any spec)
1 donated "application" w/ data

I'd be up for it; and would be interested to see just how quick we can
make the thing between us all.

Would suggest a few test scripts were made to call a series of
operations, user paths as it were, then run it through ab and get some
numbers.

> I'm dealing with TB databases with billions of rows and complex queries
> that would make you (and often times me too) cringe in fright. Sure, if
> you're dealing with your who-gives-a-shit "blog" website and all 20 entries
> of crap-nobody-cares-about, then do whatever you want. But if you're doing
> professional, enterprise level work, or have real customers who expect
> performance, then you sure as hell better be considering all the ways to
> speed up your page. They don't run in a vacuum. They don't just have a
> single query.

no comment; I'm doing the same and have done for years; and the words
you are coming out with just don't add up - if you are on TB datasets
why the hell are you using an RDBMS and php/mysql?? you need to be on to
non-relational databases, and considering the Hadoops of the world.

Suffice to say, if you have a complex query - something is vastly wrong
with the full architecture and system design.

all from experience.

Finally, reading through the list posts from the last week or two I've
become rather concerned about just how much disinformation and lack of
understanding is floating about. Many of the long time posters on this
list who do know better have either kept quiet or not covered the points
properly, whilst many more have been baited in to discussing questions
and points which have no answer, because they are the wrong questions to
be asking in the first place.

Times like this call for a smart-ass, and today I'll be that smart-ass;
not because I want to be labelled as such, but so that the other
knowledgeable people on the list can hook up on anything I've got wrong
and challenge it; and hopefully, ultimately, we'll have a full positive
thread that all can read and gain positive insight from as to how to use
PHP and leverage the full stack of technologies we have available to
address most (if not all) the points raised recently.

And Daevid, specifically, don't think for a minute these aren't learning
curves many of us have taken - skip back a couple of years, look through
the posts, and you'll find another developer banging on about threads in
php and optimising for micro-seconds ;)

Many Regards,

Nathan


Re: Re: optimizing PHP for microseconds

on 29.03.2010 10:28:53 by Bastien Helders


I have a question as a relatively novice PHP developer.

Let's say you have this Intranet web application, that deals with the
generation of file bundles that could become quite large (let's say around
800 MB) after some kind of selection process. It should be available to many
users on this Intranet, but shouldn't require any installation. Would it be
a case where optimizing for microseconds would be recommended? Or would PHP
not be the language of choice?

I'm not asking to prove that there could be corner cases where it could be
useful, but I am genuinely interested as I am in the development of such a
project, and increasing the performance of this web application is one of my
goals.

2010/3/28 Nathan Rixham

> [snip]


--
haXe - an open source web programming language
http://haxe.org


Re: Re: optimizing PHP for microseconds

on 29.03.2010 10:45:55 by Per Jessen

Bastien Helders wrote:

> I have a question as a relatively novice PHP developer.
>
> Let's say you have this Intranet web application, that deals with the
> generation of file bundles that could become quite large (let's say around
> 800 MB) after some kind of selection process. It should be
> available to many users on this Intranet, but shouldn't require any
> installation. Would it be a case where optimizing for microseconds
> would be recommended? Or would PHP not be the language of choice?

Not enough data. However, given that it will undoubtedly take seconds
to assemble one such bundle, microseconds are probably not important.

Depends on how many of those bundles you expect to be able to produce
per minute/hour/day as well as what is supposed to happen with them
after they've been assembled.



--
Per Jessen, Zürich (10.8°C)



Re: Re: optimizing PHP for microseconds

on 29.03.2010 10:49:18 by Peter Lind

That's impossible to answer given the brief layout of what you've described.

However, rule of thumb: optimizing for microseconds only makes sense
when the microseconds together make up a significant amount of time.
An example might be in order:

for ($i = 0; $i < count($stuff); $i++)
{
// do other stuffs
}

The above loop is NOT optimal (as most people will tell you) because
you'll be doing a count() on every iteration. However, there's an enormous
difference between doing 100 counts and 1,000,000 counts. Microseconds
only count when there's enough of them to make up seconds.

The best thing to do is adopt the normal good coding standards: don't
call functions in loop conditions like the above, for instance.
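
For instance, hoisting the count out of the condition (a trivial sketch):

<?php
$stuff = range(1, 100000);

// Take the count once instead of on every iteration.
$total = count($stuff);
for ($i = 0; $i < $total; $i++) {
    // do other stuffs
}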

However, be skeptical about tips: single quotes are not faster than
double quotes, for instance.

Regards
Peter

On 29 March 2010 10:28, Bastien Helders wrote:
> I have a question as a relatively novice PHP developer.
>
> Let's say you have this Intranet web application, that deals with the
> generation of file bundles that could become quite large (let's say around
> 800 MB) after some kind of selection process. It should be available to many
> users on this Intranet, but shouldn't require any installation. Would it be
> a case where optimizing for microseconds would be recommended? Or would PHP
> not be the language of choice?
>
> I'm not asking to prove that there could be corner cases where it could be
> useful, but I am genuinely interested as I am in the development of such a
> project, and increasing the performance of this web application is one of my
> goals.
>
> 2010/3/28 Nathan Rixham
>
>> [snip]



--

WWW: http://plphp.dk / http://plind.dk
LinkedIn: http://www.linkedin.com/in/plind
Flickr: http://www.flickr.com/photos/fake51
BeWelcome: Fake51
Couchsurfing: Fake51



Re: Re: optimizing PHP for microseconds

on 29.03.2010 11:19:02 by Nathan Rixham

Bastien Helders wrote:
> I have a question as a relatively novice PHP developer.
>
> Let's say you have this Intranet web application, that deals with the
> generation of file bundles that could become quite large (let's say around
> 800 MB) after some kind of selection process. It should be available to many
> users on this Intranet, but shouldn't require any installation. Would it be
> a case where optimizing for microseconds would be recommended? Or would PHP
> not be the language of choice?
>
> I'm not asking to prove that there could be corner cases where it could be
> useful, but I am genuinely interested as I am in the development of such a
> project, and increasing the performance of this web application is one of my
> goals.

Hi Bastien,

A good question, and a good use-case.

Firstly, to clarify (generally speaking), "optimising for microseconds"
is a real thing, but not how it's been conveyed previously. There is a
big difference between knowing your language / target of choice (e.g.
creating fast code); and the real "optimising for microseconds" which is
shaving off every microsecond possible once all other routes of
optimisation have been taken (and where it is needed).

There is always a case for creating code that executes quickly, that is
a big part of our job - but worrying about microseconds and completely
disregarding forms of optimisation in the full tech stack that shave off
hours of runtime per day isn't the best course of action :)

On to your specific use-case. It's all relative and without all the
details I can't really give an accurate opinion!

PHP can easily be leveraged to use the file system in order to create
the bundle too: move files over to a temp directory, tar/gzip everything
up, and then redirect the user to (or store) the location of said file.
Taking an approach like this will considerably lower the amount of
resources consumed by PHP and the web server and thus keep the system
speedy for all (and is more than likely quicker). And obviously all
future hits to said bundle won't need to touch PHP.
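
One possible shape of that, assuming the selected files are already on disk
(paths, archive name and the shell-out to tar are all illustrative; PharData
or ZipArchive would work too):

<?php
// Build the bundle on the filesystem and hand back a URL, so PHP only
// orchestrates and the web server streams the finished archive.
$files   = array('/data/docs/a.pdf', '/data/docs/b.pdf');   // from the selection step
$workDir = sys_get_temp_dir() . '/bundle_' . uniqid();
$archive = '/var/www/html/bundles/' . uniqid('bundle_') . '.tar.gz';

mkdir($workDir, 0700, true);
foreach ($files as $f) {
    copy($f, $workDir . '/' . basename($f));
}

// Let tar/gzip do the heavy lifting instead of pushing 800 MB through PHP.
exec('tar -czf ' . escapeshellarg($archive) . ' -C ' . escapeshellarg($workDir) . ' .', $out, $rc);

if ($rc === 0) {
    header('Location: /bundles/' . basename($archive));   // redirect to the static file
    exit;
}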

If you don't want to keep the user waiting then you could save the
instructions needed and have a cron job or daemon pick up on them and
then notify the user(s) when the file has been created.

Many, many approaches - generally speaking this is the best advice I can
give:

1: always look for the tech that has been designed to do the job you
need, and then use it - if possible.

2: test, time & measure. Try different ways of dealing with the
"heaviest" bit, get the numbers and take a note of what processes it
impacts - then compare.

For example, if 3 users run the script at the same time with PHP doing
the heavy lifting, will it max out the processor or push the web server
in to using swap memory? If something fails (a known / handled
exception) then does the process take a double hit and need to run a
second time?
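
A bare-bones timing harness for that kind of comparison (the two candidate
functions are placeholders):

<?php
// Time a candidate implementation over many runs and report the average.
function time_it($label, $runs, $fn)
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    $elapsed = microtime(true) - $start;
    printf("%-12s %8.2f ms total, %0.4f ms/run\n",
        $label, $elapsed * 1000, ($elapsed * 1000) / $runs);
}

// Hypothetical candidates for the "heavy" step.
time_it('approach A', 100, function () { build_bundle_in_php(); });
time_it('approach B', 100, function () { build_bundle_via_tar(); });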

You are your own best friend with this one: take time out, set PHP aside
for a moment and just think of the fastest way to do what you need (all
things considered), then go try it out - PHP can still be the controller
for any factored-out processes :)

Do hope that helps,

Nathan
