Roundup Tracker - Issues

Issue 2551245

classification
Add index creation hint to schema definition
Type: resource usage Severity: normal
Components: Database, API Versions: devel
process
Status: new
:
: : rouilj, schlatterbeck
Priority: normal : Effort-Medium

Created on 2022-11-26 09:47 by schlatterbeck, last changed 2024-02-28 12:20 by schlatterbeck.

Messages
msg7687 Author: [hidden] (schlatterbeck) Date: 2022-11-26 09:47
Currently roundup decides to produce indexes automatically. In some cases it makes sense that the user can specify indexes for some tables.

Proposed is the following parameter to the property definition that requests an index (proposed by John P. Rouillard):

 propname = String(addindex=True)

e.g.
msg = FileClass(db, "msg", ..., messageid=String(addindex=True), ...)

The implementation could check if all indexes exist and create missing ones if needed. Note that we do not want to remove unexpected indexes: The user might have added additional indexes by hand.
msg7688 Author: [hidden] (rouilj) Date: 2022-11-26 20:56
As a workaround a call can be made in initial_data.py:

     if hasattr (db, 'sql'):
        db.sql ('create index _issue_status_idx on _issue (_status);')
        ...

but this only works when the tracker is initialized. If any column is dropped
it will wipe out the index. For a single column this isn't an issue (there is
nothing to index). But it is unknown what happens to a multi-column index that
references a missing column.

Also the proposed addition only handle a single column index.
Multi-column indexes which might be needed for some use-cases (e.g to speed
up searches).
 
Ideally we would hook custom indexing into the schema update/create method.

However I wonder if we can provide a hook to run on schema change to
make this easier.

import roundup.backends.rdbms_common

class Database:

   create_supplimentary_indexes(classname, added_props[], ...):
       admin does his sql here.
       self.sql( <insert sql statement here> )

   drop_supplimentary_indexes(classname, dropped_props[], ...):
       admin does his sql here.
       self.sql( <insert sql statement here> )

and they gets called as part of update_class() and
create_class_table_indicies()/drop_class_table_indicies()?
msg7944 Author: [hidden] (rouilj) Date: 2024-02-19 23:59
Should this be a tri-value:

  True - add an index

  Unique - add an index with a unique constraint (issue 2551113)

  False - default don't add an index on the column

How should this look at the calling level? Strings "true", "unique" "false" isn't clean.
It allows misspelling compared to boolean True/False values enforced by the interpreter.
Maybe some type of enumeration (maybe python 3 only??) the interpreter can catch?
msg7949 Author: [hidden] (schlatterbeck) Date: 2024-02-28 12:20
I had replied by email to this but it bounced:
<issues@roundup-tracker.org>: host bugs.python.org[167.71.181.142] said: 454
    4.7.1 <issues@roundup-tracker.org>: Relay access denied (in reply to RCPT
    TO command)

Well here it goes:

On Mon, Feb 19, 2024 at 11:59:34PM +0000, John Rouillard wrote:
>
> Should this be a tri-value:
>
>   True - add an index
>   Unique - add an index with a unique constraint (issue 2551113)
>   False - default don't add an index on the column
>
> How should this look at the calling level? Strings "true", "unique"
> "false" isn't clean.  It allows misspelling compared to boolean
> True/False values enforced by the interpreter.  Maybe some type of
> enumeration (maybe python 3 only??) the interpreter can catch?

How about 'Yes, 'Unique', 'No'?
looks cleaner than two booleans plus 'Unique'?

Hmm, if we're thinking about index creation: How about indeces over
multiple columns? Maybe make this an attribute of a class not of a
property?

I'm usually putting these into initial_data.py in the top-level
directory of a tracker, something along the lines of

if hasattr(db, 'sql'):
    db.sql('create index _it_issue_status_idx on _it_issue (_status);')

But I *do* have more complicated ones like

    db.sql \
        ('create index _measurement_date_idx_ '
         'on _measurement using btree '
         '( _sensor'
         ', __retired__'
         ', (_measurement._date is not NULL) desc'
         ', _date desc'
         ', id'
         ');'
        )

I'm not sure we can generically formulate the latter one...

We might document index creation in sql as a permitted use-case...

Sometimes when tracking down a performance problem I'm experimenting
with indeces and most often document the result in initial_data.py but
there it is almost never really tested... because I'm dynamically
creating the index on an installed tracker.

Ralf
History
Date User Action Args
2024-02-28 12:20:45schlatterbecksetmessages: + msg7949
2024-02-19 23:59:34rouiljsetmessages: + msg7944
2023-11-06 03:39:05rouiljsettitle: Add index creation to schema definition -> Add index creation hint to schema definition
2022-11-26 20:56:24rouiljsetkeywords: + Effort-Medium
messages: + msg7688
2022-11-26 09:47:38schlatterbeckcreate