Issue 2550839: Xapian, DatabaseLockError: Unable to get write lock on db/text-index: already locked - Roundup tracker

classification

Title:	Xapian, DatabaseLockError: Unable to get write lock on db/text-index: already locked
Type:	crash	Severity:	normal
Components:		Versions:	devel, 1.5

process

Status:	fixed	Resolution:	fixed
Dependencies		Superseder:
Assigned To:	rouilj	Nosy List:	ThomasAH, ber, ced, rouilj
Priority:	normal	Keywords:	patch

Created on 2014-04-29 11:04 by ThomasAH, last changed 2016-07-13 23:47 by rouilj.

Messages
msg5092	Author: [hidden] (ThomasAH)	Date: 2014-04-29 11:04
I got an error similar to the reports on
http://sourceforge.net/p/roundup/mailman/message/32025543/
https://mail.python.org/pipermail/python-dev/2012-April/118498.html
with Roundup devel (changeset e56047711df2 from April 2014).

Traceback (most recent call last):
  File "/home/roundup/current/lib/python/roundup/cgi/client.py", line 469, in inner_main
    html = self.handle_action()
  File "/home/roundup/current/lib/python/roundup/cgi/client.py", line 1243, in handle_action
    return action_klass(self).execute()
  File "/home/roundup/current/lib/python/roundup/cgi/actions.py", line 39, in execute
    return self.handle()
  File "/home/roundup/current/lib/python/roundup/cgi/actions.py", line 620, in handle
    message = self._editnodes(props, links)
  File "/home/roundup/current/lib/python/roundup/cgi/actions.py", line 463, in _editnodes
    newid = self._createnode(cn, props)
  File "/home/roundup/current/lib/python/roundup/cgi/actions.py", line 518, in _createnode
    return cl.create(*props)
  File "/home/roundup/current/lib/python/roundup/backends/rdbms_common.py", line 2981, in create
    content, mime_type)
  File "/home/roundup/current/lib/python/roundup/backends/indexer_xapian.py", line 58, in add_text
    database = self._get_database()
  File "/home/roundup/current/lib/python/roundup/backends/indexer_xapian.py", line 20, in _get_database
    return xapian.WritableDatabase(index, xapian.DB_CREATE_OR_OPEN)
  File "/usr/lib/python2.7/dist-packages/xapian/__init__.py", line 4303, in __init__
    _xapian.WritableDatabase_swiginit(self,_xapian.new_WritableDatabase(args))
DatabaseLockError: Unable to get write lock on /path/to/db/text-index: already locked

I used the tracker via the web interface to enter a single-line comment and change the assignedto attribute. I did not get an error message in the web interface, and mails to other people on the nosy list were sent without a problem, too. Adding a message to the same issue shortly before and a little bit later worked fine.

Searching for a single word that appears in this issue only in this one message using the "All text" field works.

The system is Debian wheezy x86, relevant packages are:
python-xapian 1.2.12-2
libxapian22 1.2.12-2
python2.7 2.7.3-6+deb7u2
postgresql-9.1 9.1.12-0wheezy1
msg5093	Author: [hidden] (ThomasAH)	Date: 2014-04-29 11:08
It seems another user tried to submit changes in the same second (without success, as the error 500 indicates):

[29/Apr/2014:11:02:44 +0200] "POST /thetracker/issue2362 HTTP/1.0" 302 603
[29/Apr/2014:11:02:44 +0200] "POST /thetracker/issue2355 HTTP/1.0" 500 401
[29/Apr/2014:11:02:44 +0200] "GET /thetracker/issue2362?@ok_message=msg%2020537%20created%0Aissue%202362%20assignedto%2C%20messages%20edited%20ok&@template=item HTTP/1.0" 200 4802
msg5094	Author: [hidden] (ThomasAH)	Date: 2014-04-29 11:15
I asked the other user: She is not sure, but thinks she got an "Internal Server Error" page, not a roundup error box. I did not find an error message in access.log or error.log of apache2 for this incident.
msg5157	Author: [hidden] (ced)	Date: 2014-11-26 10:07
I get the same issue from time to time, especially with the mail gateway when Roundup receives many emails at the same time.
msg5836	Author: [hidden] (rouilj)	Date: 2016-07-11 23:40
I am planning on committing the following patch:

     def _get_database(self):
         index = os.path.join(self.db_path, 'text-index')
-        return xapian.WritableDatabase(index, xapian.DB_CREATE_OR_OPEN)
+        for n in range(10):
+            try:
+                # if successful return
+                return xapian.WritableDatabase(index, xapian.DB_CREATE_OR_OPEN)
+            except xapian.DatabaseLockError as e:
+                # adaptive sleep. Get longer as count increases.
+                time_to_sleep = 0.01 * (2 << min(5, n))
+                time.sleep(time_to_sleep)
+                # we are back to the for loop
+
+        # Get here only if we dropped out of the for loop.
+        raise xapian.DatabaseLockError("Unable to get lock after 10 retries on %s."%index)

Basically, on a lock error, retry up to 10 times, backing off between attempts. If every attempt fails, raise a DatabaseLockError.

Thoughts?

-- rouilj
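To make the timing of the proposed retry loop easier to reason about, here is a minimal sketch of the same back-off schedule that runs without Xapian. The names `LockError`, `backoff_delays`, and `open_with_retry` are hypothetical and exist only for illustration; `LockError` stands in for `xapian.DatabaseLockError`, and the `opener` and `sleep` parameters are assumptions added so the loop can be exercised without a real index.

```python
import time


class LockError(Exception):
    """Hypothetical stand-in for xapian.DatabaseLockError."""


def backoff_delays(retries=10, base=0.01, cap_shift=5):
    """Delays (in seconds) used by the patch: base * (2 << min(cap_shift, n))."""
    return [base * (2 << min(cap_shift, n)) for n in range(retries)]


def open_with_retry(opener, retries=10, sleep=time.sleep):
    """Call opener() until it succeeds or the retries are used up.

    opener is any callable that raises LockError while the lock is held.
    sleep is injectable so the loop can be tested without real waiting.
    """
    for delay in backoff_delays(retries):
        try:
            return opener()
        except LockError:
            # Back off; the delay doubles until it is capped.
            sleep(delay)
    # Reached only when every attempt failed.
    raise LockError("Unable to get lock after %d retries." % retries)
```

With the defaults, the delays start at 0.02 s, double on each attempt up to 0.64 s, and stay there, so a caller that never gets the lock sleeps for about 3.8 seconds in total before the error is raised.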
msg5846	Author: [hidden] (rouilj)	Date: 2016-07-13 23:47
Committed the patch. See: 93832cec4c31

The best I could do for testing was to run:

./run_tests.py -k Xapian test/test_indexer.py & ./run_tests.py -k Xapian test/test_indexer.py

and confirm that one of the processes seemed to hang on a test and then threw a lock failure error.

If anybody knows how to test this with controlled parallel process invocation under pytest, I would love to figure out how to code it.

I decided not to add a user-configurable delay or number of cycles. Hopefully the 10 cycles in the code will do the trick. If not, I suspect we need a better way to handle the error than a retry. Simply increasing the number of cycles is not the way to do it.
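One possible approach to the controlled parallel invocation asked about above is sketched here, under loudly stated assumptions: it uses a plain `fcntl.flock` lock file as a stand-in for the Xapian write lock (it does not touch Xapian or Roundup), it is Unix-only, and the names `HOLDER`, `try_lock`, and `contention_scenario` are hypothetical. A child process takes the lock, reports that it holds it, and keeps it until the parent closes its stdin, so the parent knows exactly when the lock is held.

```python
import fcntl
import os
import subprocess
import sys
import tempfile

# Child process: take an exclusive lock, report it, hold it until stdin closes.
HOLDER = """
import fcntl, sys
with open(sys.argv[1], "w") as fh:
    fcntl.flock(fh, fcntl.LOCK_EX)
    print("locked", flush=True)
    sys.stdin.read()  # blocks until the parent closes our stdin
"""


def try_lock(path):
    """Return an open, locked file handle, or None if the lock is held."""
    fh = open(path, "w")
    try:
        fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        fh.close()
        return None
    return fh


def contention_scenario():
    """Return (got_lock_while_held, got_lock_after_release)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "text-index.lock")
        holder = subprocess.Popen(
            [sys.executable, "-c", HOLDER, path],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
        try:
            # Wait until the child confirms that it holds the lock.
            assert holder.stdout.readline().strip() == "locked"
            during = try_lock(path)
        finally:
            holder.stdin.close()   # tells the child to release and exit
            holder.wait()
            holder.stdout.close()
        after = try_lock(path)
        result = (during is not None, after is not None)
        for fh in (during, after):
            if fh is not None:
                fh.close()
        return result
```

A pytest test could call `contention_scenario()` and assert that the first attempt fails and the second succeeds. Replacing the lock file with a real index and `try_lock` with the indexer's open call is left open here, since how Xapian behaves under this setup is not verified in this sketch.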

History
Date	User	Action	Args
2016-07-13 23:47:51	rouilj	set	status: new -> fixed resolution: fixed messages: + msg5846
2016-07-11 23:40:56	rouilj	set	keywords: + patch assignee: rouilj messages: + msg5836 nosy: + rouilj
2014-12-11 09:00:14	ber	set	nosy: + ber title: DatabaseLockError: Unable to get write lock on db/text-index: already locked -> Xapian, DatabaseLockError: Unable to get write lock on db/text-index: already locked
2014-11-26 10:07:24	ced	set	nosy: + ced messages: + msg5157
2014-04-29 11:15:20	ThomasAH	set	messages: + msg5094
2014-04-29 11:08:48	ThomasAH	set	messages: + msg5093
2014-04-29 11:04:11	ThomasAH	create