Roundup Tracker - Issues

Issue 1012781

classification
Extremely inefficient full text search query
Type: Severity: normal
Components: Database Versions:
process
Status: closed fixed
:
: richard : marlonvdb, richard
Priority: normal :

Created on 2004-08-20 09:14 by anonymous, last changed 2004-10-08 01:29 by richard.

Messages
msg1428 Author: [hidden] (anonymous) Date: 2004-08-20 09:14
Sometimes the full text search creates queries of the
following form (1):

select distinct(id) from _issue, issue_files,
issue_messages where __retired__ <> %s and
(((id=issue_files.nodeid and issue_files.linkid in
(%s))) or ((id=issue_messages.nodeid and
issue_messages.linkid in (%s,...,%s))))

and sometimes of this form (2):

select distinct(id) from _issue, issue_messages where
__retired__ <> %s and id=issue_messages.nodeid and
issue_messages.linkid in (%s,...,%s))

Form (1) queries are extremely inefficient and can take
minutes to execute (at least with the sqlite backend).
Here is an example query that can be tested using the
sqlite command line:

select distinct(id) from _issue, issue_files,
issue_messages where __retired__ <> 1 and
(((id=issue_files.nodeid and issue_files.linkid in
('4'))) or ((id=issue_messages.nodeid and
issue_messages.linkid in ('1789'))));

The query is slow because it is fetching the cartesian
product of issue_files and issue_messages which takes a
lot of time.

However, when I split it into two queries:

select id from _issue, issue_files where __retired__ <>
1 and (((id=issue_files.nodeid and issue_files.linkid
in ('4'))));

select id from _issue, issue_messages where __retired__
<> 1 and (((id=issue_messages.nodeid and
issue_messages.linkid in ('1789'))));

I can leave the "distinct" away and both queries return
in practically zero time.

Using "union", we can put the two queries together into
a single efficient one again:

select id from _issue, issue_files where __retired__ <>
1 and (((id=issue_files.nodeid and issue_files.linkid
in ('4')))) union select id from _issue, issue_messages
where __retired__ <> 1 and (((id=issue_messages.nodeid
and issue_messages.linkid in ('1789'))));
msg1429 Author: [hidden] (marlonvdb) Date: 2004-09-02 13:25
Logged In: YES 
user_id=1080231

We're experiencing something like this too. Our backend is 
MySQL (on WinNT 4), and if we do an 'All text' search, the 
mysql server will claim 99% of the CPU usage. It looks like it 
has entered an endless loop too. After 1 hour, it was still 
busy with the query and we needed to reboot the server.

Regards,
Marlon
msg1430 Author: [hidden] (richard) Date: 2004-10-08 01:29
Logged In: YES 
user_id=6405

Fixed, thanks for the detective work. 
History
Date User Action Args
2004-08-20 09:14:26anonymouscreate