Roundup Tracker - Issues

Issue 2550725

classification
VARCHAR in multilink tables on MySQL
Type: resource usage Severity: major
Components: Database Versions: 1.4
process
Status: new
:
: : richard, rouilj, schlatterbeck, whunger
Priority: :

Created on 2011-09-06 12:02 by whunger, last changed 2019-10-28 20:27 by whunger.

Messages
msg4416 Author: [hidden] (whunger) Date: 2011-09-06 12:02
We are using Roundup with the MySQL backend for a tracker instance with
almost half a million messages and about 32.000 issues. With this amount
of data, we saw a degradation of performance over time, mainly because
MySQL was unable to answer more complex filter queries without resorting
to table scans and on-disk sorting.

Wondering, why this should be necessary after all, i found the problem
to be the column data types in multilink tables - they are declared as
varchar(255) instead of integer. This, for sure, makes many queries
quite expensive, as e. g. joins require on-the-fly type conversions and
can't make use of the sort order in indexes and the table itself.

In our tracker instance, i changed the type to integer for all the
existing multilink tables, and the performance improvement really makes
a big difference.

Now i would like to ask: Is there any good reason for having linkid and
nodeid declared as a varchar(255) on MySQL and not going with the
default in rdbms_common?

    def create_multilink_table(self, spec, ml):
        sql = '''CREATE TABLE `%s_%s` (linkid VARCHAR(255),
            nodeid VARCHAR(255)) ENGINE=%s'''%(spec.classname, ml,
                self.mysql_backend)
        self.sql(sql)
        self.create_multilink_table_indexes(spec, ml)
msg4417 Author: [hidden] (schlatterbeck) Date: 2011-09-07 07:00
Richard, I guess the varchar in multilink tables is has historical
reasons (ID types used to be strings for some time). Do you remember the
reason for special-casing this in the mysql backend? Do you see things
that would break if we change this?

Ralf
msg4418 Author: [hidden] (richard) Date: 2011-09-07 07:40
Yes, ids stored as strings is a historical artefact going all the way back to when Roundup was 
implemented over Berkley DB style databases. I can think of no reason why the mysql backend 
should be special-cased like it is. Perhaps the mysql backend changelog might shed some light?
msg5650 Author: [hidden] (rouilj) Date: 2016-06-27 02:23
Open question if we make this change it breaks all mysql installations
right?

Is there some way we can change this for new installs and keep it varchar
for existing installs? (With the option for the existing installs
to do the conversion and get the newer schema).

Would an update to the migrate code in roundup-admin migrate be able to
do the conversion?
msg5680 Author: [hidden] (whunger) Date: 2016-06-28 21:50
> Open question if we make this change it breaks all mysql installations
> right?

If i remember correctly, the change would not break existing systems,
because it's only the multilink association tables that have the VARCHAR
type. I would almost go so far to call this a bug that is hidden by
automatic type conversion in MySQL.

I did not check again by looking at the current code, though.
msg5825 Author: [hidden] (rouilj) Date: 2016-07-10 23:43
Looked at the hg history for changes to lines in that function.
Looks like it has always been varchar. To get some
stuff working it required the bdb engine at first.
Maybe that's why it's a varchar?

In any case at this point we should only be using innodb.

Werner, was this a "simple":

ALTER TABLE tablename_of_mutillink MODIFY linkid INTEGER;
ALTER TABLE tablename_of_mutillink MODIFY nodeid INTEGER;

followed by a call to self.create_multilink_table_indexes(spec, ml)
or maybe just a reindex for the l_idx tables at the mysql level?
msg6702 Author: [hidden] (rouilj) Date: 2019-10-06 23:36
Werner, it may be too long ago for you to answer my question, but

====
was this a "simple":

ALTER TABLE tablename_of_mutillink MODIFY linkid INTEGER;
ALTER TABLE tablename_of_mutillink MODIFY nodeid INTEGER;

followed by a call to self.create_multilink_table_indexes(spec, ml)
or maybe just a reindex for the l_idx tables at the mysql level?
====

-- rouilj
msg6708 Author: [hidden] (whunger) Date: 2019-10-07 10:40
This is indeed long ago. I think i did it all directly in MySQL, i can't remember calling create_multilink_table_indexes().

I will try and see if i can find what i used when back at the office in a few days. The ALTER statements look very familiar in any case.
msg6768 Author: [hidden] (rouilj) Date: 2019-10-25 00:13
Hi Werner,

Were you able to track down what you did?

Have a great week. 

-- rouilj
msg6781 Author: [hidden] (whunger) Date: 2019-10-28 20:27
Thanks for reminding me!

> Were you able to track down what you did?

No, there's no SQL file or log left, sorry. I think i dropped the
indexes, altered the columns manually, then recreated the indexes, all
directly in MySQL.

This is what we are using in back_mysql.py since 2011:

    def create_multilink_table(self, spec, ml):
        sql = '''CREATE TABLE `%s_%s` (linkid INTEGER,
            nodeid INTEGER) ENGINE=%s'''%(spec.classname, ml,
                self.mysql_backend)
        self.sql(sql)
        self.create_multilink_table_indexes(spec, ml)

I went through our backend code changes, and this one seems to be the
only one regarding the linkid/nodeid types.
History
Date User Action Args
2019-10-28 20:27:27whungersetmessages: + msg6781
2019-10-25 00:13:23rouiljsetmessages: + msg6768
2019-10-07 10:40:31whungersetmessages: + msg6708
2019-10-06 23:36:17rouiljsetmessages: + msg6702
2016-07-10 23:43:06rouiljsetmessages: + msg5825
2016-06-28 21:50:06whungersetmessages: + msg5680
2016-06-27 02:23:36rouiljsetnosy: + rouilj
messages: + msg5650
2011-09-07 07:40:56richardsetmessages: + msg4418
2011-09-07 07:00:55schlatterbecksetnosy: + richard, schlatterbeck
messages: + msg4417
2011-09-06 12:02:14whungercreate