Performance (lots of tables / databases...)

Performance (lots of tables / databases...)

am 27.09.2008 16:53:19 von Martin Zvarik

Hi,
I am working on a blog system and I am currently thinking of what would
be the best DB approach.

I have read lots about wordpress and other blog's optimizations and DB
structure, but I have not found any mention of having separate database
for each blog/user.

So, my question is, which one is performance better (talking about 1000
blogs):

a) 1000 blogs * 5 (let's say we will have tables like comments, post...
for each blog) = 5000 tables in one database
.... this is Wordpress default

b) 1000 databases (for each blog) each having 5 tables

c) 5 databases by 1000 tables - in this case, won't this be an issue
when SELECTing like this: [db_comments].testblog, [db_posts].testblog ?


Is that a controversial topic? :-/

Thanks for ideas,
Martin

--
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

Re: Performance (lots of tables / databases...)

am 27.09.2008 20:47:40 von danaketh

--------------030802080208010902080507
Content-Type: text/plain; charset=ISO-8859-2; format=flowed
Content-Transfer-Encoding: 8bit

Hi,

the first choice is probably the best for you. When you think about
second solution, it will be a nightmare when you have 1000+ databases
and have to administrate them from one central system (if you're about
to do it like this). The third solution looks little complicated to me -
have one DB for comments, one for items etc.

But you can do it also in one database and six tables. Make one
table 'blogs' where the blogs names and ids will be stored. Then you can
just add one more field 'blog_id' to every table and identify items,
categories, whatever on this. However in your situation (1000+ blogs) it
may be not the best solution.


Martin Zvarík napsal(a):
> Hi,
> I am working on a blog system and I am currently thinking of what
> would be the best DB approach.
>
> I have read lots about wordpress and other blog's optimizations and DB
> structure, but I have not found any mention of having separate
> database for each blog/user.
>
> So, my question is, which one is performance better (talking about
> 1000 blogs):
>
> a) 1000 blogs * 5 (let's say we will have tables like comments,
> post... for each blog) = 5000 tables in one database
> ... this is Wordpress default
>
> b) 1000 databases (for each blog) each having 5 tables
>
> c) 5 databases by 1000 tables - in this case, won't this be an issue
> when SELECTing like this: [db_comments].testblog, [db_posts].testblog ?
>
>
> Is that a controversial topic? :-/
>
> Thanks for ideas,
> Martin
>

--------------030802080208010902080507--

Re: Performance (lots of tables / databases...)

am 27.09.2008 21:19:40 von Lester Caine

danaketh wrote:
> Hi,
>
> the first choice is probably the best for you. When you think about
> second solution, it will be a nightmare when you have 1000+ databases
> and have to administrate them from one central system (if you're about
> to do it like this). The third solution looks little complicated to me -
> have one DB for comments, one for items etc.
>
> But you can do it also in one database and six tables. Make one table
> 'blogs' where the blogs names and ids will be stored. Then you can just
> add one more field 'blog_id' to every table and identify items,
> categories, whatever on this. However in your situation (1000+ blogs) it
> may be not the best solution.

That alternate option is the easiest to manage, but since you make no mention
of WHICH database engine which is best would be affected by that choice. And
how you are hosting the site(s) may also limit your options. Properly indexed
tables will be fast and a single connection accessing that data will be faster
than having to make multiple connections to different databases, and if it
only has to manage a small number of table most links would probably be cached.

--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/lsces/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk//
Firebird - http://www.firebirdsql.org/index.php

--
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

Re: Performance (lots of tables / databases...)

am 28.09.2008 00:42:11 von Martin Zvarik

--------------010805020508000703010907
Content-Type: text/plain; charset=ISO-8859-2; format=flowed
Content-Transfer-Encoding: 8bit

Hello,

the solution you mentioned came up on my mind too, but as you also said
- it doesn't seem efficient on high load.

Why do you think having 1000+ databases would be a nightmare?
I think it would be easy to backup, fast to read/write... although I
don't know what would that cause to the system - trying to imagine 1000
folders... is it a problem?

I supposed that on the third solution is based the optimized wordpress
(wordpress.com) - it does seem complicated, but better than having it
all in one database.

Martin


danaketh napsal(a):
> Hi,
>
> the first choice is probably the best for you. When you think
> about second solution, it will be a nightmare when you have 1000+
> databases and have to administrate them from one central system (if
> you're about to do it like this). The third solution looks little
> complicated to me - have one DB for comments, one for items etc.
>
> But you can do it also in one database and six tables. Make one
> table 'blogs' where the blogs names and ids will be stored. Then you
> can just add one more field 'blog_id' to every table and identify
> items, categories, whatever on this. However in your situation (1000+
> blogs) it may be not the best solution.
>
>
> Martin Zvarík napsal(a):
>> Hi,
>> I am working on a blog system and I am currently thinking of what
>> would be the best DB approach.
>>
>> I have read lots about wordpress and other blog's optimizations and
>> DB structure, but I have not found any mention of having separate
>> database for each blog/user.
>>
>> So, my question is, which one is performance better (talking about
>> 1000 blogs):
>>
>> a) 1000 blogs * 5 (let's say we will have tables like comments,
>> post... for each blog) = 5000 tables in one database
>> ... this is Wordpress default
>>
>> b) 1000 databases (for each blog) each having 5 tables
>>
>> c) 5 databases by 1000 tables - in this case, won't this be an issue
>> when SELECTing like this: [db_comments].testblog, [db_posts].testblog ?
>>
>>
>> Is that a controversial topic? :-/
>>
>> Thanks for ideas,
>> Martin
>>

--------------010805020508000703010907--

Re: Performance (lots of tables / databases...)

am 28.09.2008 00:46:10 von Martin Zvarik

Lester Caine:
> danaketh wrote:
>> Hi,
>>
>> the first choice is probably the best for you. When you think about
>> second solution, it will be a nightmare when you have 1000+ databases
>> and have to administrate them from one central system (if you're about
>> to do it like this). The third solution looks little complicated to me
>> - have one DB for comments, one for items etc.
>>
>> But you can do it also in one database and six tables. Make one
>> table 'blogs' where the blogs names and ids will be stored. Then you
>> can just add one more field 'blog_id' to every table and identify
>> items, categories, whatever on this. However in your situation (1000+
>> blogs) it may be not the best solution.
>
> That alternate option is the easiest to manage, but since you make no
> mention of WHICH database engine which is best would be affected by that
> choice. And how you are hosting the site(s) may also limit your options.
> Properly indexed tables will be fast and a single connection accessing
> that data will be faster than having to make multiple connections to
> different databases, and if it only has to manage a small number of
> table most links would probably be cached.
>


Sorry, it's MySQL.

I am talking about ONE server => one connection
and then switching between databases.

Example: SELECT * FROM [db_comments].testblog, [db_posts].testblog
Would this be a performance issue?


Thanks for joining,
Marti

--
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

Re: Performance (lots of tables / databases...)

am 28.09.2008 01:22:21 von Jack van Zanen

------=_Part_38303_21793263.1222557741233
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

If it were Oracle I'd go with one database and separate schema for each
blog.
For Mysql I think I'd go for a database each blog.

Jack

2008/9/28 Martin Zvar=EDk

> Hi,
> I am working on a blog system and I am currently thinking of what would b=
e
> the best DB approach.
>
> I have read lots about wordpress and other blog's optimizations and DB
> structure, but I have not found any mention of having separate database f=
or
> each blog/user.
>
> So, my question is, which one is performance better (talking about 1000
> blogs):
>
> a) 1000 blogs * 5 (let's say we will have tables like comments, post... f=
or
> each blog) =3D 5000 tables in one database
> ... this is Wordpress default
>
> b) 1000 databases (for each blog) each having 5 tables
>
> c) 5 databases by 1000 tables - in this case, won't this be an issue when
> SELECTing like this: [db_comments].testblog, [db_posts].testblog ?
>
>
> Is that a controversial topic? :-/
>
> Thanks for ideas,
> Martin
>
> --
> PHP Database Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php
>
>


--=20
J.A. van Zanen

------=_Part_38303_21793263.1222557741233--

Re: Performance (lots of tables / databases...)

am 28.09.2008 06:45:12 von Tim Hawkins

If you are looking at future expansion then the separate DB per blog =20
is defiantly the way to go.

Some notes:

1) Avoid joins like the plague, in fact most operations on a blog =20
application would consist of getting primary record and then =20
decorating it with secondary data, for example getting a blog post and =20=

then getting the comments associated with it. Do two queries for this, =20=

dont join them.

2) Consider putting some intelligence into the database selection, =20
have a master table/db that holds the database/server details for each =20=

blog, It will allow you to spread your blog application across =20
multiple DB servers, as load goes up you can reassign blogs to =20
different machines. You can memcache the data from the master db on a =20=

per blog basis, so you wont take a hit on accessing it, but it gives =20
you great flexibility of redistributing data to different machines/=20
clusters.

3) Use memcache, its a life saver.

4) If user registrations are common across all blogs, have a separate =20=

db for the users, again you can "shard" this, use a hashing algorithm =20=

to allow sections of the user database to be split across multiple =20
user databases on multiple db servers. Again memcache the hell out of =20=

the user lookup, its a fixed id=3D>db/datarecord mapping so its great =20=

for using memcache against as the mapping never changes.

5) Use an external indexer for any search functionality such as sphinx =20=

(http://www.sphinxsearch.com/), sphinx can index separate databases =20
and join the indices together to form a single distributed search, it =20=

also supports incremental indexing. Dont be tempted to use the mysql =20=

query system for searches.


On 28 Sep 2008, at 00:22, Jack van Zanen wrote:

> If it were Oracle I'd go with one database and separate schema for =20
> each
> blog.
> For Mysql I think I'd go for a database each blog.
>
> Jack
>
> 2008/9/28 Martin Zvar=EDk
>
>> Hi,
>> I am working on a blog system and I am currently thinking of what =20
>> would be
>> the best DB approach.
>>
>> I have read lots about wordpress and other blog's optimizations and =20=

>> DB
>> structure, but I have not found any mention of having separate =20
>> database for
>> each blog/user.
>>
>> So, my question is, which one is performance better (talking about =20=

>> 1000
>> blogs):
>>
>> a) 1000 blogs * 5 (let's say we will have tables like comments, =20
>> post... for
>> each blog) =3D 5000 tables in one database
>> ... this is Wordpress default
>>
>> b) 1000 databases (for each blog) each having 5 tables
>>
>> c) 5 databases by 1000 tables - in this case, won't this be an =20
>> issue when
>> SELECTing like this: [db_comments].testblog, [db_posts].testblog ?
>>
>>
>> Is that a controversial topic? :-/
>>
>> Thanks for ideas,
>> Martin
>>
>> --
>> PHP Database Mailing List (http://www.php.net/)
>> To unsubscribe, visit: http://www.php.net/unsub.php
>>
>>
>
>
> --=20
> J.A. van Zanen


--
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

Re: Performance (lots of tables / databases...)

am 28.09.2008 11:05:00 von Martin Zvarik

Thanks Tim, this helped me a lot.

Martin


Tim Hawkins napsal(a):
> If you are looking at future expansion then the separate DB per blog is
> defiantly the way to go.
>
> Some notes:
>
> 1) Avoid joins like the plague, in fact most operations on a blog
> application would consist of getting primary record and then decorating
> it with secondary data, for example getting a blog post and then getting
> the comments associated with it. Do two queries for this, dont join them.
>
> 2) Consider putting some intelligence into the database selection, have
> a master table/db that holds the database/server details for each blog,
> It will allow you to spread your blog application across multiple DB
> servers, as load goes up you can reassign blogs to different machines.
> You can memcache the data from the master db on a per blog basis, so you
> wont take a hit on accessing it, but it gives you great flexibility of
> redistributing data to different machines/clusters.
>
> 3) Use memcache, its a life saver.
>
> 4) If user registrations are common across all blogs, have a separate db
> for the users, again you can "shard" this, use a hashing algorithm to
> allow sections of the user database to be split across multiple user
> databases on multiple db servers. Again memcache the hell out of the
> user lookup, its a fixed id=>db/datarecord mapping so its great for
> using memcache against as the mapping never changes.
>
> 5) Use an external indexer for any search functionality such as sphinx
> (http://www.sphinxsearch.com/), sphinx can index separate databases and
> join the indices together to form a single distributed search, it also
> supports incremental indexing. Dont be tempted to use the mysql query
> system for searches.
>
>
> On 28 Sep 2008, at 00:22, Jack van Zanen wrote:
>
>> If it were Oracle I'd go with one database and separate schema for each
>> blog.
>> For Mysql I think I'd go for a database each blog.
>>
>> Jack
>>
>> 2008/9/28 Martin Zvarík
>>
>>> Hi,
>>> I am working on a blog system and I am currently thinking of what
>>> would be
>>> the best DB approach.
>>>
>>> I have read lots about wordpress and other blog's optimizations and DB
>>> structure, but I have not found any mention of having separate
>>> database for
>>> each blog/user.
>>>
>>> So, my question is, which one is performance better (talking about 1000
>>> blogs):
>>>
>>> a) 1000 blogs * 5 (let's say we will have tables like comments,
>>> post... for
>>> each blog) = 5000 tables in one database
>>> ... this is Wordpress default
>>>
>>> b) 1000 databases (for each blog) each having 5 tables
>>>
>>> c) 5 databases by 1000 tables - in this case, won't this be an issue
>>> when
>>> SELECTing like this: [db_comments].testblog, [db_posts].testblog ?
>>>
>>>
>>> Is that a controversial topic? :-/
>>>
>>> Thanks for ideas,
>>> Martin
>>>
>>> --
>>> PHP Database Mailing List (http://www.php.net/)
>>> To unsubscribe, visit: http://www.php.net/unsub.php
>>>
>>>
>>
>>
>> --
>> J.A. van Zanen
>

--
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php