Roundup Tracker - Issues

Issue 2550839

classification
Title: Xapian, DatabaseLockError: Unable to get write lock on db/text-index: already locked
Type: crash Severity: normal
Components: Versions: devel, 1.5
process
Status: fixed Resolution: fixed
Assigned To: rouilj Nosy List: ThomasAH, ber, ced, rouilj
Priority: normal Keywords: patch

Created on 2014-04-29 11:04 by ThomasAH, last changed 2016-07-13 23:47 by rouilj.

Messages
msg5092 Author: [hidden] (ThomasAH) Date: 2014-04-29 11:04
I got an error similar to the reports on
http://sourceforge.net/p/roundup/mailman/message/32025543/
https://mail.python.org/pipermail/python-dev/2012-April/118498.html
with Roundup devel (changeset e56047711df2 from April 2014).

Traceback (most recent call last):
  File "/home/roundup/current/lib/python/roundup/cgi/client.py", line 469, in inner_main
    html = self.handle_action()
  File "/home/roundup/current/lib/python/roundup/cgi/client.py", line 1243, in handle_action
    return action_klass(self).execute()
  File "/home/roundup/current/lib/python/roundup/cgi/actions.py", line 39, in execute
    return self.handle()
  File "/home/roundup/current/lib/python/roundup/cgi/actions.py", line 620, in handle
    message = self._editnodes(props, links)
  File "/home/roundup/current/lib/python/roundup/cgi/actions.py", line 463, in _editnodes
    newid = self._createnode(cn, props)
  File "/home/roundup/current/lib/python/roundup/cgi/actions.py", line 518, in _createnode
    return cl.create(**props)
  File "/home/roundup/current/lib/python/roundup/backends/rdbms_common.py", line 2981, in create
    content, mime_type)
  File "/home/roundup/current/lib/python/roundup/backends/indexer_xapian.py", line 58, in add_text
    database = self._get_database()
  File "/home/roundup/current/lib/python/roundup/backends/indexer_xapian.py", line 20, in _get_database
    return xapian.WritableDatabase(index, xapian.DB_CREATE_OR_OPEN)
  File "/usr/lib/python2.7/dist-packages/xapian/__init__.py", line 4303, in __init__
    _xapian.WritableDatabase_swiginit(self,_xapian.new_WritableDatabase(*args))
DatabaseLockError: Unable to get write lock on /path/to/db/text-index: already locked

I used the tracker via the web interface to enter a single-line comment and
change the assignedto attribute. I did not get an error message in the web
interface, and the mails to the other people on the nosy list were sent
without a problem, too.

Adding a message to the same issue shortly before and a little bit later
worked fine.

Searching for a single word that appears in this issue only in this one
message using the "All text" field works.

The system is Debian wheezy x86, relevant packages are:
python-xapian 1.2.12-2
libxapian22 1.2.12-2
python2.7 2.7.3-6+deb7u2
postgresql-9.1 9.1.12-0wheezy1
msg5093 Author: [hidden] (ThomasAH) Date: 2014-04-29 11:08
It seems another user tried to submit changes in the same second (without
success, as the error 500 indicates):

[29/Apr/2014:11:02:44 +0200] "POST /thetracker/issue2362 HTTP/1.0" 302 603
[29/Apr/2014:11:02:44 +0200] "POST /thetracker/issue2355 HTTP/1.0" 500 401
[29/Apr/2014:11:02:44 +0200] "GET /thetracker/issue2362?@ok_message=msg%2020537%20created%0Aissue%202362%20assignedto%2C%20messages%20edited%20ok&@template=item HTTP/1.0" 200 4802
msg5094 Author: [hidden] (ThomasAH) Date: 2014-04-29 11:15
I asked the other user: She is not sure, but thinks she got an "Internal
Server Error" page, not a roundup error box.

I did not find an error message in access.log or error.log of apache2
for this incident.
msg5157 Author: [hidden] (ced) Date: 2014-11-26 10:07
I get the same issue from time to time, especially with the mail gateway
when roundup receives many emails at the same time.
msg5836 Author: [hidden] (rouilj) Date: 2016-07-11 23:40
I am planning on committing the following patch:

     def _get_database(self):
         index = os.path.join(self.db_path, 'text-index')
-        return xapian.WritableDatabase(index, xapian.DB_CREATE_OR_OPEN)
+        for n in range(10):
+            try:
+                # if successful return
+                return xapian.WritableDatabase(index, xapian.DB_CREATE_OR_OPEN)
+            except xapian.DatabaseLockError as e:
+                # adaptive sleep. Get longer as count increases.
+                time_to_sleep = 0.01 * (2 << min(5, n))
+                time.sleep(time_to_sleep)
+                # we are back to the for loop
+
+        # Get here only if we dropped out of the for loop.
+        raise xapian.DatabaseLockError("Unable to get lock after 10 retries on %s."%index)

Basically, on a lock error it retries up to 10 times, backing off a bit
longer each time. If every attempt fails, it raises a DatabaseLockError.

Thoughts?

-- rouilj
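
For reference, the back-off schedule in the patch above can be worked out with a small standalone sketch. It does not import xapian: LockError is a hypothetical stand-in for xapian.DatabaseLockError, and backoff_delays and open_with_retry are illustrative names that do not exist in Roundup. Only the delay formula and the loop shape are taken from the patch.

```python
import time


class LockError(Exception):
    """Hypothetical stand-in for xapian.DatabaseLockError."""


def backoff_delays(retries=10, base=0.01, cap_shift=5):
    """Return the sleep times produced by the formula in the patch."""
    # 2 << min(5, n) doubles from 2 up to 64 and then stays at 64.
    return [base * (2 << min(cap_shift, n)) for n in range(retries)]


def open_with_retry(opener, retries=10, sleep=time.sleep):
    """Call opener() until it succeeds or the retries are used up."""
    for delay in backoff_delays(retries):
        try:
            # If successful, return immediately.
            return opener()
        except LockError:
            # Wait a little longer after each failed attempt.
            sleep(delay)
    # Reached only if every attempt raised LockError.
    raise LockError("Unable to get lock after %d retries." % retries)


if __name__ == "__main__":
    delays = backoff_delays()
    print(["%.2f" % d for d in delays])
    print("worst-case total wait: %.2f s" % sum(delays))
```

With the defaults the delays run 0.02, 0.04, 0.08, 0.16, 0.32 and then 0.64 for the remaining five attempts, so a request that never gets the lock waits roughly 3.8 seconds in total before the error is raised.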
msg5846 Author: [hidden] (rouilj) Date: 2016-07-13 23:47
Committed the patch.

See: 93832cec4c31

The best I could do for testing was to run:

./run_tests.py -k Xapian test/test_indexer.py &
./run_tests.py -k Xapian test/test_indexer.py

and confirmed that one of the processes seemed to hang on a test and
then threw a lock failure error.

If anybody knows how to test this and do controlled parallel
process invocation under pytest, I would love to figure out
how to code it.
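
One possible approach, sketched here, avoids parallel processes altogether: the lock contention is simulated with a fake opener that fails a fixed number of times, and the sleep function is injected so the test runs instantly. This swaps real concurrency for a deterministic simulation, so it exercises only the retry logic, not Xapian's locking. FakeLockError, get_database and FlakyOpener are hypothetical names for illustration; in the real code base the same idea would presumably mean patching xapian.WritableDatabase and time.sleep, which is not shown.

```python
class FakeLockError(Exception):
    """Hypothetical stand-in for xapian.DatabaseLockError."""


def get_database(opener, sleep, retries=10):
    """Same retry shape as the patch, with opener and sleep injected."""
    for n in range(retries):
        try:
            return opener()
        except FakeLockError:
            sleep(0.01 * (2 << min(5, n)))
    raise FakeLockError("Unable to get lock after %d retries" % retries)


class FlakyOpener:
    """Raise FakeLockError for the first `failures` calls, then succeed."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= self.failures:
            raise FakeLockError("already locked")
        return "database-handle"


def test_succeeds_after_transient_lock():
    slept = []  # records requested delays instead of sleeping
    opener = FlakyOpener(failures=3)
    assert get_database(opener, slept.append) == "database-handle"
    assert opener.calls == 4
    assert len(slept) == 3


def test_gives_up_after_ten_attempts():
    slept = []
    opener = FlakyOpener(failures=100)
    try:
        get_database(opener, slept.append)
    except FakeLockError:
        pass
    else:
        raise AssertionError("expected FakeLockError")
    assert opener.calls == 10
    assert len(slept) == 10
```

Both functions follow the test_ naming convention, so pytest should collect them as they are. A test with two real processes would still be needed to confirm the behaviour against an actual Xapian lock.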

I decided not to add a user-configurable delay or number of cycles.
Hopefully the 10 cycles in the code will do the trick. If not,
I suspect we need a better way to handle the error than a retry.
Simply increasing the number of cycles is not the way to do it.
History
Date                 User      Action  Args
2016-07-13 23:47:51  rouilj    set     status: new -> fixed
                                       resolution: fixed
                                       messages: + msg5846
2016-07-11 23:40:56  rouilj    set     keywords: + patch
                                       assignee: rouilj
                                       messages: + msg5836
                                       nosy: + rouilj
2014-12-11 09:00:14  ber       set     nosy: + ber
                                       title: DatabaseLockError: Unable to get write lock on db/text-index: already locked -> Xapian, DatabaseLockError: Unable to get write lock on db/text-index: already locked
2014-11-26 10:07:24  ced       set     nosy: + ced
                                       messages: + msg5157
2014-04-29 11:15:20  ThomasAH  set     messages: + msg5094
2014-04-29 11:08:48  ThomasAH  set     messages: + msg5093
2014-04-29 11:04:11  ThomasAH  create