Roundup Tracker - Issues

Message7261

Author rouilj
Recipients rouilj
Date 2021-06-07.13:36:17
Message-id <1623072978.21.0.923795997756.issue2551142@roundup.psfhosted.org>
In-reply-to
A user who is upgrading and changing databases is importing an export
file. The import fails with:

   IntegrityError: UNIQUE constraint failed: _user.__retired__, _user._username

They have multiple retired users with the same username.

As a workaround, sorting the input user.csv file by (username, retired)
so that all retired=true values are first for a given username works.
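
A minimal sketch of that workaround, operating on rows that have already
been parsed from user.csv. The column positions (key_idx, retired_idx) and
the sample rows are hypothetical; the sketch only assumes that the retired
column holds the text True or False, as in the export lines shown further
down.

```python
def sort_retired_first(rows, key_idx, retired_idx):
    """Return rows ordered by key, with retired rows first within each key."""
    # A retired row is assumed to carry the literal text 'True' in its
    # retired column. The comparison yields False for retired rows, and
    # False sorts before True, so retired rows come first per key.
    return sorted(rows, key=lambda r: (r[key_idx], r[retired_idx] != 'True'))


# Hypothetical, heavily trimmed rows: (id, username, retired).
rows = [
    ["'24'", "'duplicate'", "False"],
    ["'22'", "'duplicate'", "True"],
    ["'3'", "'admin'", "False"],
]
ordered = sort_retired_first(rows, key_idx=1, retired_idx=2)
```

Because sorted() is stable, rows that share both the key and the retired
flag keep their original relative order.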

The import has two steps in rdbms based systems (this issue
doesn't happen in anydbm). The key is username.

  1. create a new node that sets the unique composite index (key,
     __retired__) where __retired__ has the default value of 0.
  2. retire it by updating the unique composite index (key,
     __retired__) setting __retired__ to the id.

If I have an export file ordered like:

  (2021, 6, 5, 22, 17, 28.615, 0, 0, 0):'1':'dupe@example.com':...'22':...:'duplicate':True

  (2021, 6, 5, 22, 20, 5.85, 0, 0, 0):'1':'dupl@example.com':...:'24':...:'duplicate':False

it will import correctly, because the unique index will see:

   duplicate, 0  (id 22)
   duplicate, 22 (id 22)
   duplicate, 0  (id 24)

However if the active entry is imported first:

  (2021, 6, 5, 22, 20, 5.85, 0, 0, 0):'1':'dupl@example.com':...:'24':...:'duplicate':False

  (2021, 6, 5, 22, 17, 28.615, 0, 0, 0):'1':'dupe@example.com':...'22':...:'duplicate':True

the unique index sees:

   duplicate, 0  (id 24)
   duplicate, 0  (id 22)  # conflict

and we get the error. But how do we fix things for the future? I
think reusing a username is an edge case (and confusing), but we
should handle this better.
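
The ordering dependence can be reproduced outside Roundup with an
in-memory SQLite database. This is only a sketch: the table layout, the
index name and the helper functions are stand-ins that mimic the two
import steps described above, not Roundup's actual schema or import code.

```python
import sqlite3


def make_db():
    """Create a stand-in _user table with the composite unique index."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE _user (id INTEGER PRIMARY KEY, _username TEXT, "
        "__retired__ INTEGER DEFAULT 0)"
    )
    db.execute(
        "CREATE UNIQUE INDEX _user_key_retired_idx "
        "ON _user (__retired__, _username)"
    )
    return db


def import_user(db, nodeid, username, retired):
    # Step 1: create the node; __retired__ takes its default value of 0.
    db.execute(
        "INSERT INTO _user (id, _username) VALUES (?, ?)", (nodeid, username)
    )
    # Step 2: retire the node by setting __retired__ to its own id.
    if retired:
        db.execute(
            "UPDATE _user SET __retired__ = ? WHERE id = ?", (nodeid, nodeid)
        )


def try_import(entries):
    """Import entries in order; return 'ok' or the IntegrityError text."""
    db = make_db()
    try:
        for entry in entries:
            import_user(db, *entry)
    except sqlite3.IntegrityError as exc:
        return str(exc)
    return "ok"


retired_first = try_import([(22, "duplicate", True), (24, "duplicate", False)])
active_first = try_import([(24, "duplicate", False), (22, "duplicate", True)])
```

With the retired entry first the import completes; with the active entry
first, step 1 for id 22 collides with the (0, duplicate) entry that id 24
already holds.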

I can change the export to sort by (id, retired). Sorting by id
alone assumes that the active entry is the newest entry, which seems
dangerous, so sort on the tuple. That should fix it for the future.

But this doesn't allow importing an unsorted/missorted export.
To read these:

1. Import could read the entire csv and sort it properly (possibly
   taking a large amount of memory). Not a great idea IMO.

2. Handle a retry when the exception is triggered. On exception,
   change the non-retired index entry from (key1, 0) to (key1, -1),
   then retry the failing insert.

   When the retry succeeds, update the index for key1 back to 0. If
   -1 doesn't work for some reason, use 10000 or some other sentinel
   number (that we hope is not a valid value for a retired user).

   Or we could leave the -1 (sentinel) value until all entries are
   fully imported and do one update of the index changing -1 to
   0. That probably performs better.

3. The code could be rewritten to set the __retired__ property on
   initial node creation, but that looks to need pretty invasive
   changes.
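
Option 2, in its "restore the sentinel once at the end" variant, could be
sketched as follows. Everything here is an assumption made for
illustration: the SQLite stand-in table, the SENTINEL name and the
import_with_retry helper are hypothetical and are not part of Roundup.

```python
import sqlite3

SENTINEL = -1  # assumed never to be a valid __retired__ value


def import_with_retry(db, entries):
    """Import (id, username, retired) tuples in any order."""
    insert = "INSERT INTO _user (id, _username) VALUES (?, ?)"
    for nodeid, username, retired in entries:
        try:
            db.execute(insert, (nodeid, username))
        except sqlite3.IntegrityError:
            # An active entry already holds (username, 0). Park it on the
            # sentinel value, then retry the failing insert.
            db.execute(
                "UPDATE _user SET __retired__ = ? "
                "WHERE _username = ? AND __retired__ = 0",
                (SENTINEL, username),
            )
            db.execute(insert, (nodeid, username))
        if retired:
            # Retire the node by setting __retired__ to its own id.
            db.execute(
                "UPDATE _user SET __retired__ = ? WHERE id = ?",
                (nodeid, nodeid),
            )
    # One pass at the end: put every parked entry back to 0. This still
    # fails if the export really contains two active users with one name.
    db.execute(
        "UPDATE _user SET __retired__ = 0 WHERE __retired__ = ?", (SENTINEL,)
    )


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE _user (id INTEGER PRIMARY KEY, _username TEXT, "
    "__retired__ INTEGER DEFAULT 0)"
)
db.execute("CREATE UNIQUE INDEX _user_idx ON _user (__retired__, _username)")

# Active entry first: the order that fails with the plain two-step import.
import_with_retry(db, [(24, "duplicate", False), (22, "duplicate", True)])
rows = db.execute("SELECT id, __retired__ FROM _user ORDER BY id").fetchall()
```

SQLite rolls back only the failing statement, so the retry can run in the
same transaction. On PostgreSQL a failed statement aborts the enclosing
transaction, so the retry would need a savepoint around the first insert.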


Full thread at: https://sourceforge.net/p/roundup/mailman/roundup-devel/thread/20210606192127.CF6986A0020%40pe15.cs.umb.edu/#msg37297018

Initial report/triage in IRC.
History
Date                 User    Action  Args
2021-06-07 13:36:18  rouilj  set     recipients: + rouilj
2021-06-07 13:36:18  rouilj  set     messageid: <1623072978.21.0.923795997756.issue2551142@roundup.psfhosted.org>
2021-06-07 13:36:18  rouilj  link    issue2551142 messages
2021-06-07 13:36:17  rouilj  create