Roundup Tracker - Issues

Issue 2550955

classification
Title: Roundup commits although a Reject exception is raised
Type: behavior Severity: critical
Components: Database, API Versions: devel
process
Status: new Resolution:
Dependencies: Superseder:
Assigned To: schlatterbeck Nosy List: rouilj, schlatterbeck
Priority: urgent Keywords: Blocker

Created on 2017-10-12 17:47 by schlatterbeck, last changed 2017-11-04 10:11 by schlatterbeck.

Messages
msg6028 Author: [hidden] (schlatterbeck) Date: 2017-10-12 17:47
In auditors we can raise a 'Reject' exception.
This should have the effect that changes are *NOT* committed to the
database.

In the current development version such an exception will commit changes
to the database anyway although an exception is raised.

I've gone back to the version e1e40674a0bc from end of 2016 -- I'm
running that version on a production system -- which doesn't show this
behaviour.

I haven't triaged this yet to find out when this change was introduced.

IMO this is release-critical as many auditors depend on the ability to
prevent erroneous changes to the database.

--
Ralf
msg6029 Author: [hidden] (rouilj) Date: 2017-10-13 00:02
Ralf, I am not seeing this with a working checkout of
5017c3422334 which is the tip.

I created an auditor that checks the title of the issue class.
If the word rejectme is in the title, it raises Reject.

I see the rejection in the web interface and I get the
bad title (with rejectme) redisplayed like I expect.

If I use the goto box to reload the same issue, I see the
original title and no history changes in the history.

Also I have an auditor for msg that check for more than 2 links in the 
body of the message. If I put in three, I get the reject and I do not 
have the message saved nor do I find the body of the message in the db 
directory.

How are you seeing the committed data that should be rejected?
I agree 100% what you describe is a critical failure.
msg6030 Author: [hidden] (schlatterbeck) Date: 2017-10-13 06:07
On Fri, Oct 13, 2017 at 12:02:43AM +0000, John Rouillard wrote:
> 
> Ralf, I am not seeing this with a working checkout of
> 5017c3422334 which is the tip.
> 
> I created an auditor that checks the title of the issue class.
> If the word rejectme is in the title, it raises Reject.
> 
> I see the rejection in the web interface and I get the
> bad title (with rejectme) redisplayed like I expect.
> 
> If I use the goto box to reload the same issue, I see the
> original title and no history changes in the history.
> 
> Also I have an auditor for msg that check for more than 2 links in the 
> body of the message. If I put in three, I get the reject and I do not 
> have the message saved nor do I find the body of the message in the db 
> directory.
> 
> How are you seeing the committed data that should be rejected?
> I agree 100% what you describe is a critical failure.

I have an application that has a "sign&send" button that calls a custom
web action. This does a commit(). Then several auditors are called which
should prevent the commit from going through. I will have to check the
exact path where this happens, I think I'm doing some actions on other
classes in the process.
The effect is that on the next sign&send I'm getting an assertion
violation because one change that shouldn't have made it through the
Reject was made.

I need to investigate this further, I've created the issue so that it
doesn't get lost. I feared it would not be reproduceable with a standard
setup...

I've already investigated to the point I'm sure this is a change in the
database (not just browser state). To get back to the expected state I
had to change a flag in the DB using psql...

Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   http://www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg6031 Author: [hidden] (rouilj) Date: 2017-10-13 12:32
Hi Ralf:

In message <20171013060722.4ntvgkuw42hcpup2@runtux.com>,
Ralf Schlatterbeck writes:
>Ralf Schlatterbeck added the comment:
>On Fri, Oct 13, 2017 at 12:02:43AM +0000, John Rouillard wrote:
>> 
>> Ralf, I am not seeing this with a working checkout of
>> 5017c3422334 which is the tip.
>I have an application that has a "sign&send" button that calls a custom
>web action. This does a commit(). Then several auditors are called which
>should prevent the commit from going through. I will have to check the
>exact path where this happens, I think I'm doing some actions on other
>classes in the process.
>The effect is that on the next sign&send I'm getting an assertion
>violation because one change that shouldn't have made it through the
>Reject was made.
>
>I need to investigate this further, I've created the issue so that it
>doesn't get lost. I feared it would not be reproduceable with a standard
>setup...
>
>I've already investigated to the point I'm sure this is a change in the
>database (not just browser state). To get back to the expected state I
>had to change a flag in the DB using psql...

Hmm, what back end DB and what isolation level?
msg6032 Author: [hidden] (schlatterbeck) Date: 2017-10-13 14:33
On Fri, Oct 13, 2017 at 12:32:59PM +0000, John Rouillard wrote:
> Hmm, what back end DB and what isolation level?
Postgres, 
solation_level = read committed

But the config didn't change between versions. So something definitely
changed. I'm currently implementing a new feature in my tracker, once
that's finished I'll look more closely into this. Hopefully before
weekend :-)

Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   http://www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg6036 Author: [hidden] (rouilj) Date: 2017-10-15 14:15
Ralf:

I just went through CHANGES.txt and

 issue2550826 - Capture some exceptions from auditors/reactors and
  raise a DetectorError instead.

 issue2550755 - exceptions.NotFound(msg) msg is not reported to user
  in cgi.

seem to touch exceptions directly:

 also I am not sure if:

  issue1408570 - Finally fix that form values are lost on edit
      exceptions.

may have changed something in the flow.

  Add RejectRaw exception to allow unescaped HTML error messages to be
  displayed to the user 

  Pythons cgi form code can return a TypeError, we now guard for this
  condition.

 also there were a bunch of changes to move from the:
    raise Exception, explanation to
    raise Exception(explanation)
and inherit from BaseException and not Exception for python 2 and 3 
support. This started at: 5248:bc16d91b7a50

maybe revert to bc16d91b7a50 and see if that is a quick fix

-- rouilj
msg6038 Author: [hidden] (schlatterbeck) Date: 2017-10-19 09:57
I've investigated a little further. In the working version calling the
db.close in main (around line 386 in cgi/client.py) will *not* commit
the wrong value that was set before a Reject occurs, while the latest
tip will commit the erroneous value at that point.

So it looks like this *only* occurs in the web-interface. I guess we
don't have too much test coverage in that area ;-)

Ralf
msg6039 Author: [hidden] (schlatterbeck) Date: 2017-10-19 18:02
I've triaged this now.
The bug happens first in 
5202:47bd81998ddc
interestingly it happens there only after the *second* sign&send call in
my application. This version has grown several db.commit() calls for
one-time key handling. Then 
5203:9f490cc0effe
grows another db.commit() and in that version the bug happens every time
for me.

My suggestion (forward recovery :-) is to add a new cursor for one-time
key and session handling. I've wanted to do this *forever* because I see
 *lots* of failing transactions due to colliding session updates in a
busy tracker. It also happens with some browsers when the browser is
restarted with lots of open windows and expired session info. So it's a
good idea to separate session and otk handling from data updates to the DB.

But first I want to create a test that reproduces the problem after I've
found out what's going on and why this doesn't happen in our tests.

My testing "framework" is quite simple: I have the database open in psql
and press "sign&send" which raises a Reject. Then I press "sign&send"
again. If this raises an assertion error a change was made to the DB
that shouldn't have gone through (Bug present). Then I undo the change
in psql and I'm ready for the next round :-)
msg6040 Author: [hidden] (rouilj) Date: 2017-10-20 23:44
Hi Ralf:

Ok, you have pinpointed the problem in my csrf code. It's probably
the commit's I am doing when creating the nonce or deleting the nonce 
after it is used in the templating frontend.
I thought those code paths were run and committed before the data was 
processed from the web interface.

I wonder if the problem happens when I generate the new nonce
for the reply the user. Maybe if a reject is raised, the database
session holds the rejected data and when I commit the nonce, I also
commit the rejected data?

Having a different session only for OTK/nonce db table updates
sounds like a good way to separate the two flows.

Alternatively if it is the commit for the creation of the new OTK,
maybe that creation and commit could be moved before any of the user 
data is processed.

-- rouilj
msg6041 Author: [hidden] (rouilj) Date: 2017-10-20 23:47
Actually forget moving the OTK. The OTK lifetime isn't known until
the response template is processed. This happens after the reject is 
raised.

So creating an OTK at the same time as the old OTK is deleted (before 
data is processed) isn't possible because the template gets to choose
the OTK lifetime. 8-(.
msg6047 Author: [hidden] (schlatterbeck) Date: 2017-11-03 10:29
I can finally reproduce this. When I have a web action with a handle()
method like 

    def handle(self):
        cl = self.db.getclass(self.classname)
        cl.set(self.nodeid, status = self.db.status.lookup("feedback"))
        cl.set(self.nodeid, title = "Just a test")
        self.db.commit()

and we have an auditor (or reactor) that prevents the title change with
a Reject exception, then the first change (setting the status) will be
committed. Note that the commit() in handle() is never reached because
the second set() call will raise Reject. Note that the two changes above
will not usually be for the same class (they could be combined into a
single set call which is what the web-interface does when changing
several values).
Next step is to try to come up with a regression test for this.
msg6048 Author: [hidden] (rouilj) Date: 2017-11-04 00:05
Hi Ralf:

In message
<1509704965.32.0.213398074469.issue2550955@psf.upfronthosting.co.za> ,
Ralf Schlatterbeck writes:
>Ralf Schlatterbeck added the comment:
>I can finally reproduce this. When I have a web action with a handle()
>method like 
>
>    def handle(self):
>        cl = self.db.getclass(self.classname)
>        cl.set(self.nodeid, status = self.db.status.lookup("feedback"))
>        cl.set(self.nodeid, title = "Just a test")
>        self.db.commit()
>
>and we have an auditor (or reactor) that prevents the title change with
>a Reject exception

Umm, I thought raising reject in a reactor was useless? By the time
the reactor runs the data is committed to the database.

>then the first change (setting the status) will be committed. Note
>that the commit() in handle() is never reached because
>the second set() call will raise Reject.

Agreed.

>Note that the two changes above will not usually be for the same
>class (they could be combined into a single set call which is what
>the web-interface does when changing several values).

Right you expect a transaction that is either atomically applied
across multiple tables/classes or rolled back. No split changes like
you describe.
msg6049 Author: [hidden] (schlatterbeck) Date: 2017-11-04 10:11
On Sat, Nov 04, 2017 at 12:05:35AM +0000, John Rouillard wrote:
> 
> Umm, I thought raising reject in a reactor was useless? By the time
> the reactor runs the data is committed to the database.
No, this was never the case. Conceptionally the auditor runs before the
commit but if something fails in the reactor the change is still rolled
back.

> Right you expect a transaction that is either atomically applied
> across multiple tables/classes or rolled back. No split changes like
> you describe.
Yes. And this was the case until we added additional commit calls due to
otk handling.

BTW: I currently have a hard time writing a test for that :-)

Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   http://www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
History
Date User Action Args
2017-11-04 10:11:46schlatterbecksetmessages: + msg6049
2017-11-04 00:05:35rouiljsetmessages: + msg6048
2017-11-03 10:29:25schlatterbecksetmessages: + msg6047
2017-10-27 22:01:38rouiljsetkeywords: + Blocker
2017-10-20 23:47:49rouiljsetmessages: + msg6041
2017-10-20 23:44:07rouiljsetmessages: + msg6040
2017-10-19 18:02:15schlatterbecksetmessages: + msg6039
2017-10-19 09:57:11schlatterbecksetmessages: + msg6038
2017-10-15 14:15:16rouiljsetassignee: schlatterbeck
messages: + msg6036
2017-10-13 14:33:06schlatterbecksetmessages: + msg6032
2017-10-13 12:32:59rouiljsetmessages: + msg6031
2017-10-13 06:07:28schlatterbecksetmessages: + msg6030
2017-10-13 00:02:43rouiljsetnosy: + rouilj
messages: + msg6029
2017-10-12 17:47:26schlatterbeckcreate