Roundup Tracker - Issues

Issue 2551062

classification
Title: AddPermission doesn't validate property names.
Type: behavior Severity: minor
Components: Database Versions: 2.0.0alpha
process
Status: open Resolution:
Dependencies: Superseder:
Assigned To: rouilj Nosy List: joseph_myers, rouilj
Priority: Keywords: Effort-Low

Created on 2019-10-05 15:59 by rouilj, last changed 2019-10-06 01:40 by rouilj.

Messages
msg6685 Author: [hidden] (rouilj) Date: 2019-10-05 15:59
See issue2551032. The class was defined using organisation while the
Edit permission for the class was defined with organization.

db.security.addPermission doesn't validate the list of permissions.

There is a chicken and egg issue here. Calls to addPermision occur
while loading the schema. The db can't be opened until after the
schema is loaded, so there is no way to see what classes/properties
are defined in the db.  Also the db may still need to be modified
based on changes to the schema.

I can validate the data on first use in security::hasPermission.
The plan is:

  add new prop: _properties_valid to the Permission class/object
  add new method: validate_properties to the Permission class

  In security.hasPermission check each permission. If
  _properties_valid is not True, see if the permission has
  properties. In this case call a property validator method on the
  permission, passing the (now valid) db from the security
  object. Then set the _properties_valid to True if all properties are
  valid, otherwise raise ValueError.

Recording the validation state in the object should reduce the impact
of validating at use time rather than creation time.

However this is a breaking change. E.G. this very issue tracker uses:

  for cl in ('file', 'msg'):
    p = db.security.addPermission(name='View', klass=cl,
       description="allowed to see metadata object regardless of spam status",
       properties=('id', 'creation', 'activity',
                   'creator', 'actor',
                   'name', 'spambayes_score',
                   'spambayes_misclassified',
                   'author', 'recipients',
                   'date', 'files', 'messageid',
                   'inreplyto', 'type',
                   'description', ))

would break since author is a property of msg not file and name is a
prop of file the class not msg.

I have code that implements this, but should I put it behind a tracker
config option to enable/disable validating the properties as there may
be trackers that use this unitended mechanism?

Thoughts?
msg6686 Author: [hidden] (rouilj) Date: 2019-10-05 16:00
Another possibility would be to add property checking as a feature of the 
roundup-admin security command.
msg6687 Author: [hidden] (rouilj) Date: 2019-10-05 16:39
I committed property checking as part of roundup-admin security command.
Also updated docs a tad rev:5897:d0aebd4aec72.

Still not sure if doing it inline is a good idea or not....
msg6690 Author: [hidden] (joseph_myers) Date: 2019-10-05 17:33
I think the natural time to validate this is immediately after loading the 
schema (or some such time in the Roundup startup process), for all 
permissions that have been created at that point, so you always get an 
immediate error for this mistake rather than needing either (a) to run a 
special command to detect such problems or (b) to wait for someone to use 
the particular action etc. that checks a particular permission, which 
might not happen for some time.
msg6692 Author: [hidden] (rouilj) Date: 2019-10-05 23:55
Hi Joseph:

In message <alpine.DEB.2.21.1910051730310.6859@digraph.polyomino.org.uk>,
Joseph Myers writes:
>I think the natural time to validate this is immediately after loading the 
>schema (or some such time in the Roundup startup process), for all 
>permissions that have been created at that point, so you always get an 
>immediate error for this mistake [...]

I hadn't considered putting in the valiadtion at that point. IIRC
there is something that can be set to run after the schema loads. That
could be used to implement the validation maybe??

Previous work had indicated that the permissions application code can
be a bottleneck (i.e. linear search through all permission items even
if they are not associated with the class you are trying to validate.)
My idea was to validate only the permissions that were actually used
to reduce the load of implementing a complex permission schema.

>rather than needing either (a) to run a special command to detect
>such problems or (b) to wait for someone to use the particular action
>etc. that checks a particular permission, which might not happen for
>some time.

As you say, my suggestions do have the downside, of leaving
unvalidated perms in the config.

Validating all perms in my tracker means that > 2/3 of the validation
would be thrown away on every rest/xmlrpc/web page hit. E.G. If the
user's role is User, there is no need to validate the Agent,
Provisional User, Anonymous, or Admin roles of permissions every
single time.

Hmm, there are two code paths depending on whether the instance is
optimized or not. IIRC it compiles the schema. I wonder if there is a
path to create an optimized instance that will run validation only
during schema compilation. Then future requests will use the compiled
copy bypassing the validation.

Do you have any ideas on how to validate the permissions in the schema
only when it changes? Is there a pre-existing code path that could be
used?

Thanks for your feedback.
msg6693 Author: [hidden] (joseph_myers) Date: 2019-10-06 00:03
On Sat, 5 Oct 2019, John Rouillard wrote:

> Validating all perms in my tracker means that > 2/3 of the validation
> would be thrown away on every rest/xmlrpc/web page hit. E.G. If the
> user's role is User, there is no need to validate the Agent,
> Provisional User, Anonymous, or Admin roles of permissions every
> single time.

I was supposing Roundup is being used in a form where the schema is loaded 
once at startup, not once for each page hit.  E.g. in the normal case 
roundup-server uses roundup.instance.open (which is where the check would 
go, somewhere after loading the schema) only once for each named tracker, 
before starting listening for http and forking for each request (and 
thus if you change the schema you have to restart roundup-server).
msg6694 Author: [hidden] (rouilj) Date: 2019-10-06 00:31
In message <alpine.DEB.2.21.1910052358030.28559@digraph.polyomino.org.uk>,
Joseph Myers writes:
>On Sat, 5 Oct 2019, John Rouillard wrote:
>
>> Validating all perms in my tracker means that > 2/3 of the validation
>> would be thrown away on every rest/xmlrpc/web page hit. E.G. If the
>> user's role is User, there is no need to validate the Agent,
>> Provisional User, Anonymous, or Admin roles of permissions every
>> single time.
>
>I was supposing Roundup is being used in a form where the schema is loaded 
>once at startup, not once for each page hit.  E.g. in the normal case 
>roundup-server uses roundup.instance.open (which is where the check would 
>go, somewhere after loading the schema) only once for each named tracker,

I am running a tracker using demo.py (basicaly roundup_server) and I
put:

  print("In open")

at the top of roundup/instance.py Tracker::open which I think is the
source of your roundup.instance.open

I get this output:

  then restart demo. If you want to change backend types, you must use "nuke".

  In open
  127.0.0.1 - - [06/Oct/2019 00:19:37] "GET /demo/ HTTP/1.1" 200 -
  127.0.0.1 - - [05/Oct/2019 20:19:38] "GET /favicon.ico HTTP/1.1" 200 -
  In open
  127.0.0.1 - - [06/Oct/2019 00:19:42] "GET /demo/ HTTP/1.1" 200 -
  127.0.0.1 - - [05/Oct/2019 20:19:42] "GET /favicon.ico HTTP/1.1" 200 -
  In open
  127.0.0.1 - - [06/Oct/2019 00:19:48] "GET /demo/issue?status=-1%2C1%2C2%2C4&@startwith=0&@filter=status&@search_text=&@dispname=Show%20All&@sort=-activity&@columns=id%2Cactivity%2Ctitle%2Ccreator%2Cassignee%2Cstatus&@group=priority&@pagesize=50 HTTP/1.1" 200 -
  In open
  127.0.0.1 - - [06/Oct/2019 00:20:02] "GET /demo/issue?@template=item HTTP/1.1" 200 -

it looks to me like open is getting called on every web hit.

I think this is why the schema is compiled in the optimize path of
Tracker::__init__ since instance.open() evaluates the schema everytime
using:

        if self.optimize:
            # execute preloaded schema object
            self._exec(self.schema, env)

(or similar in the non-optimize path). But I may be reading this wrong.

However the comment at the top of open:

        # load the database schema
        # we cannot skip this part even if self.optimize is set
        # because the schema has security settings that must be
        # applied to each database instance

seems to imply I am interpreting this correctly.

Comments?
msg6695 Author: [hidden] (joseph_myers) Date: 2019-10-06 00:58
OK, this is the opendb calls from roundup.cgi.client.determine_user 
reopening the database.  And we'd like the permissions somehow to be 
checked only on the load of the schema that happens at roundup-server 
initialization time and not on this reopening.  Although really checking 
the permissions ought to be very efficient, and a lot quicker than the 
rest of loading the schema.
msg6696 Author: [hidden] (rouilj) Date: 2019-10-06 01:40
In message <alpine.DEB.2.21.1910060052090.28559@digraph.polyomino.org.uk>,
Joseph Myers writes:
>OK, this is the opendb calls from roundup.cgi.client.determine_user 
>reopening the database.

Ah ok. I didn't know where the call came from. Thanks.

>And we'd like the permissions somehow to be 
>checked only on the load of the schema that happens at roundup-server 
>initialization time and not on this reopening.

Agreed. But I am not sure that Tracker::__init__ actually loads (as
opposed to compiles) the schema. Tracker::__init__ is only called once
AFAICT by roundup-server and I assume other persistence methods (wsgi
etc.).

>Although really checking the permissions ought to be very efficient,
>and a lot quicker than the rest of loading the schema.

The code I have doing the realtime check is short. The problem was
finding a valid initialized open database handle when code from
security.py is called. Much of it is caled with a weak reference to
the db that is incompletely initialized. The code below is called from
hasPermission. self.db in hasPermission is the db variable.

    def validate_properties(self, db):
        cl = db.getclass(self.klass)
        class_props = cl.getprops(protected=True)
        for p in self.properties:
            if p in class_props:
                continue
            d = dict(property=p, klass=self.klass, permission=self)
            raise ValueError(
                'In permision %(permission)s %(property)s is not a property of class %(klass)s'% d)
        self._properties_valid = True

This is doing a database getclass and getprops for every permission. I
hope these are cheap calls, but I'm not sure. With 110 permissions, I
am not seing a difference in running roundup-admin security with and
without property checks, so I assume cheap.
History
Date User Action Args
2019-10-06 01:40:01rouiljsetmessages: + msg6696
2019-10-06 00:58:07joseph_myerssetmessages: + msg6695
2019-10-06 00:31:04rouiljsetmessages: + msg6694
2019-10-06 00:03:45joseph_myerssetmessages: + msg6693
2019-10-05 23:55:22rouiljsetmessages: + msg6692
2019-10-05 17:33:52joseph_myerssetnosy: + joseph_myers
messages: + msg6690
title: AddPermission doesn't validate property names. (validate props in roundup-admin security) -> AddPermission doesn't validate property names.
2019-10-05 16:40:01rouiljsettitle: AddPermission doesn't validate property names. -> AddPermission doesn't validate property names. (validate props in roundup-admin security)
2019-10-05 16:39:39rouiljsetstatus: new -> open
messages: + msg6687
versions: + 2.0.0alpha
2019-10-05 16:00:40rouiljsetmessages: + msg6686
2019-10-05 15:59:48rouiljcreate