Message3949
On Sun, Jan 03, 2010 at 02:25:06PM +0000, Stefan Seefeld wrote:
> > - lots of sql queries
> > - memory allocation for the cache
> >
> > only the first will be addressed by the proposed patch...
> > Maybe we return an iterator from getnodes?
>
> I agree that an iterator is a good idea. Though the bottleneck we
> observed (and which led me to propose a change in the API) is due to IPC
> overhead caused by many individual queries, instead of few fused ones.
I've also done some measurements. Roundup uses an internal node cache of
100 items by default (this is a config option). I've done some
experiments with a large query (which returns about 140000 items):
Cache size   Time (minutes)
         2   1:54-2:36
        10   1:29-1:35 (only 2 measurements)
       100   1:14-1:37
    100000   1:23-1:35 (usually at the lower end)
So we see diminishing returns from larger cache sizes inside Roundup.
Going from a cache size of 100 to 100000 reduces the number of queries
from 240000 to 140000 (!) because the same nodes are no longer re-queried
(I'm using some of the nodes more than once in the computation). The time
requirement is memory-bound.
In all cases the initial SQL query that returns all the IDs takes less
than 1 second real-time.
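The drop in query count between the small and the large cache can be reproduced with a toy model. What follows is only a sketch, not Roundup's actual cache code: the class `CountingNodeCache`, the function `count_queries`, and the access pattern (one pass over all nodes, then a second pass over some of them) are assumptions made for illustration, scaled down by a factor of 100 from the numbers above.

```python
from collections import OrderedDict


class CountingNodeCache:
    """Toy LRU node cache that counts how often the backend is queried."""

    def __init__(self, size, fetch):
        self.size = size            # maximum number of cached nodes
        self.fetch = fetch          # callable standing in for one SQL query
        self.cache = OrderedDict()  # nodeid -> node, oldest entry first
        self.queries = 0

    def getnode(self, nodeid):
        if nodeid in self.cache:
            # Cache hit: mark as most recently used, no query needed.
            self.cache.move_to_end(nodeid)
            return self.cache[nodeid]
        # Cache miss: one backend query per node.
        self.queries += 1
        node = self.fetch(nodeid)
        self.cache[nodeid] = node
        if len(self.cache) > self.size:
            self.cache.popitem(last=False)  # evict least recently used
        return node


def count_queries(cache_size, n_nodes=1400, reused=1000):
    """Visit every node once, then revisit the first `reused` nodes."""
    cache = CountingNodeCache(cache_size, fetch=lambda nodeid: {"id": nodeid})
    for nodeid in range(n_nodes):
        cache.getnode(nodeid)
    for nodeid in range(reused):
        cache.getnode(nodeid)
    return cache.queries


print(count_queries(100))     # small cache: revisited nodes are fetched again
print(count_queries(100000))  # large cache: every node is fetched only once
```

With this made-up access pattern the small cache has already evicted every node by the time it is needed again, so each revisit costs another query, while the large cache answers all revisits from memory.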
So I guess my example can really profit from using an iterator --
especially since the whole database is in RAM because I configured the
PostgreSQL cache large enough. In these cases the database thread and
the roundup thread consuming the query data can run in parallel.
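To make the iterator idea concrete, here is a minimal sketch of a generator that fetches nodes in batches, so the consumer sees one node at a time while the backend answers a few fused queries instead of one query per node. This is not the proposed patch: the function name `iter_nodes`, the table `issue`, and the column `title` are hypothetical, and the example uses an in-memory SQLite database instead of PostgreSQL only so that it runs on its own.

```python
import sqlite3


def iter_nodes(conn, ids, batch_size=500):
    """Yield (id, title) rows for `ids`, using one query per batch of ids.

    Assumes `ids` is sorted, so ordering each batch by id keeps the
    overall order of the result.
    """
    for start in range(0, len(ids), batch_size):
        chunk = ids[start:start + batch_size]
        placeholders = ",".join("?" * len(chunk))
        sql = ("SELECT id, title FROM issue WHERE id IN (%s) ORDER BY id"
               % placeholders)
        for row in conn.execute(sql, chunk):
            yield row


def make_demo_db(n_rows=2000):
    """Create a throwaway in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE issue (id INTEGER PRIMARY KEY, title TEXT)")
    conn.executemany(
        "INSERT INTO issue VALUES (?, ?)",
        [(i, "issue%d" % i) for i in range(1, n_rows + 1)],
    )
    return conn


conn = make_demo_db()
# The initial query returns only the ids, as in the measurements above.
ids = [row[0] for row in conn.execute("SELECT id FROM issue ORDER BY id")]
for nodeid, title in iter_nodes(conn, ids):
    pass  # the per-node computation would go here
```

With 2000 ids and a batch size of 500 this issues four SELECT statements rather than 2000. The batch size is a tunable assumption, chosen here to stay well below SQLite's limit on bound parameters.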
Ralf
--
Dr. Ralf Schlatterbeck Tel: +43/2243/26465-16
Open Source Consulting Fax: +43/2243/26465-23
Reichergasse 131 www: http://www.runtux.com
A-3411 Weidling email: office@runtux.com
osAlliance member email: rsc@osalliance.com

Date                | User          | Action | Args
2010-01-03 15:13:33 | schlatterbeck | set    | recipients: + schlatterbeck, richard, stefan, rawler
2010-01-03 15:13:33 | schlatterbeck | link   | issue2550514 messages
2010-01-03 15:13:32 | schlatterbeck | create |