Roundup Tracker - Issues

Issue 2551330

classification
Improving permission performance - RBAC, ABAC push to db layer from hyperdb
Type: security Severity: normal
Components: Database, Web interface, API Versions:
process
Status: new
:
: : rouilj, schlatterbeck
Priority: :

Created on 2024-04-01 19:52 by rouilj, last changed 2024-04-29 18:24 by schlatterbeck.

Messages
msg7973 Author: [hidden] (schlatterbeck) Date: 2024-04-01 09:13
Hmm: When there are permissions with check methods on a Class, we are checking permissions *in python* not in the database. The consequence is that we do not know at SQL time which items will be displayed.

I do have several classes with permission methods in my trackers. I've often thought about this, if it would make sense to implement permissions that can be checked at SQL time so that we do not need to filter in python. On some of my trackers filtering in python is a real performance killer because the tables are quite big...

So in a first step we may want to use your new optimized implementation only if there are no check methods.

Also the check methods are not sorted: Some users may see everything but we are sometimes checking the low-performance (method-based) checks before the lightweight checks that may permit everything for a certain role. An optimized version of the permission checks that sorts permissions by role and tries permissions without check methods first would be nice here.

Sorry for making this issue into a wishlist :-)

Thanks
Ralf
msg7974 Author: [hidden] (rouilj) Date: 2024-04-01 15:24
Hi Ralf:

I committed changeset: 7853:03c1b7ae3a68. While I was composing
this response, it passed CI and all code paths are checked for
what that's worth.

Your performance concern can be addressed by
reducing the number of rows returned from the SQL (or
anydbm) db using the method incorporated in this changeset.
See admin_guide.txt look for max_response_row_size.

Reduce the returned data set to 100 or 1000 items. The only
down side is that the total size returned will be set to -1
indicating the total size is unknown. Which is an improvement
from the @total_size hallucination that used to be returned 8-).

Currently the max size is a constant for all classes. You
could experiment with making max size a function that takes
a class/classname and @page_size and returns a different
limit value for different inputs. I didn't do it because
YAGNI but if YNI.

> When there are permissions with check methods on a Class,
> we are checking permissions *in python* not in the
> database. The consequence is that we do not know at SQL
> time which items will be displayed.

AFAICT all permission checks are in Python. Regardless of a
check method.

This bit of code:

        for item_id in obj_list:
            r = {}
            if self.db.security.hasPermission(
                'View', uid, class_name, itemid=item_id, property='id',
            ):
                r = {'id': item_id, 'link': class_path + item_id}
            if display_props:
                # format_item does the permission checks
                r.update(self.format_item(class_obj.getnode(item_id),
                    item_id, props=display_props, verbose=verbose))
            if r:
                result['collection'].append(r)

in rest.py filters obj_list (returned by the database) by
permission.

The permissions generated from the user's Role is
checked in Python even if a permission has no check method.

> I do have several classes with permission methods in my
> trackers. I've often thought about this, if it would make
> sense to implement permissions that can be checked at SQL
> time so that we do not need to filter in python. On some
> of my trackers filtering in python is a real performance
> killer because the tables are quite big...

Pushing permissions down to the DB backend is an interesting
idea. Sadly I can't think of a way to make it happen. With
the check method, Roundup's access control changes from an
RBAC system to something more powerful. In theory I can use
a check method on a permission to deny access based on time
of day.

This makes it more of an ABAC (Attribute based access
control) system. I think that expressiveness makes it
difficult if not impossible to push permissions to the
database layer and keeps it in the hyperdb.

> So in a first step we may want to use your new optimized
> implementation only if there are no check methods.

As I stated above, I think the Permissions processing
happens in Python in all cases. Changing the max size of a
returned set to a number 1 more than @page_size should fix
the bug that triggered this issue with no performance impact.

> Also the check methods are not sorted: Some users may see
> everything but we are sometimes checking the low-performance
> (method-based) checks before the lightweight checks that may
> permit everything for a certain role. An optimized version
> of the permission checks that sorts permissions by role and
> tries permissions without check methods first would be nice
> here.

AFAIK, permissions aren't sorted by role. The role is used
to generate a list of permissions and then is discarded. Did
you mean permission type/name?

I agree with optimizing the application of the permissions
list. I think I proposed something of the sort many years
ago. IIRC all permissions are in a single list. So you could
have (args are type, class, properties, check, props_only):

  Permission('Edit', user, [name, phonenumber], during_daytime)  (1)
  Permission('View', user, [name, phonenumber], during_daytime)  (2)
  Permission('Edit', issue, [nosy,], None, props_only=True)      (3)
  Permission('View', issue,)                                     (4)
  ...
  Permission('View', issue, [title, nosy, priority,])            (5)

in a single list. if you are checking to see if somebody can
view the issue, you have to iterate over 1-3 for each item
in the database before reaching 4.

Grouping the permission list by type/class:

   P[edit][user] = [
       Permission('edit', user, [name, phonenumber], during_daytime)
   ]

   P[view][issue] = [Permission('view', issue,),
            Permission('view', issue, [title, nosy, priority,])]

   ...

makes selecting all permissions by type and class an O(1)
rather than O(n) cost. It should have no effect on the
expressed permissions. This should reduce the number of
permissions that have to be iterated over. But, I have no
idea if traversal of the list is a major issue.

Sorting the permissions can help as well. I am concerned
that sorting could result in changing the applied
permissions from the unsorted state. AFAIK, unlike detectors
permissions don't have a method to set ordering which plays
a role in what permission is expressed.

> Sorry for making this issue into a wishlist :-)

I agree permissions and their application can be a
bottleneck. I wish I had better developer documentation on
how they work. IIRC when I added props_only it was because a
class level check was being allowed by a property level
permission. Documenting the various check and permission
types and how they play together would be helpful.

I think further discussion of this should be on a new ticket.
Agreed?
msg7975 Author: [hidden] (rouilj) Date: 2024-04-01 19:52
Linked the first two messages from a discussion on issue2551328.
Ralf noted that performance when retrieving a lot of data tanks
due to permission checks in python at the hyperdb layer.
msg7977 Author: [hidden] (schlatterbeck) Date: 2024-04-02 08:03
I'm cut&pasting my email from yesterday here, my MTA reports

<issues@roundup-tracker.org>: host bugs.python.org[167.71.181.142] said: 454
    4.7.1 <issues@roundup-tracker.org>: Relay access denied (in reply to RCPT
    TO command)

So it seems we can no longer reply to messages from our bug tracker.

On Mon, Apr 01, 2024 at 03:24:35PM +0000, John Rouillard wrote:
> > When there are permissions with check methods on a Class,
> > we are checking permissions *in python* not in the
> > database. The consequence is that we do not know at SQL
> > time which items will be displayed.
>
> AFAICT all permission checks are in Python. Regardless of a
> check method.

Yes, I know, I was highlighting this because it's slow :-)

> in rest.py filters obj_list (returned by the database) by
> permission.
>
> The permissions generated from the user's Role is
> checked in Python even if a permission has no check method.

Yes, and the permissions are checked in arbitrary order. My proposal
would be to group permissions by role (so that we need to check only the
permissions where we know the user has that role) *and* order the easy
ones (without a check method) first. That way we can
- Check only the permissions for the roles we have
- Check the easy ones first. On first hit we stop. That way in cases
  where a user has permissions that are easy to check we stop early
  without touching the performance-problematic ones

>
> Pushing permissions down to the DB backend is an interesting
> idea. Sadly I can't think of a way to make it happen. With
> the check method, Roundup's access control changes from an
> RBAC system to something more powerful. In theory I can use
> a check method on a permission to deny access based on time
> of day.

Sure. I was thinking of having less expressive permissions than we
currently have with check methods. We may find that for many use-cases
we can formulate permissions that are checkable in SQL.

> AFAIK, permissions aren't sorted by role. The role is used
> to generate a list of permissions and then is discarded. Did
> you mean permission type/name?

I meant that as a proposed change, see above.

> Grouping the permission list by type/class:
>
>    P[edit][user] = [
>        Permission('edit', user, [name, phonenumber], during_daytime)
>    ]
>
>    P[view][issue] = [Permission('view', issue,),
>             Permission('view', issue, [title, nosy, priority,])]
>
>    ...
>
> makes selecting all permissions by type and class an O(1)
> rather than O(n) cost. It should have no effect on the
> expressed permissions. This should reduce the number of
> permissions that have to be iterated over. But, I have no
> idea if traversal of the list is a major issue.

The permissions with check methods are heavy on performance, we may want
to check the ones without method first and only if we found no
permission check the others. Would improve performance a lot for some of
my use-cases.

>
> Sorting the permissions can help as well. I am concerned
> that sorting could result in changing the applied
> permissions from the unsorted state. AFAIK, unlike detectors
> permissions don't have a method to set ordering which plays
> a role in what permission is expressed.

I do not think the semantics change. If a person has multiple
permissions on a Class (and/or properties) this should be equivalent to
just one permission.

> I think further discussion of this should be on a new ticket.
> Agreed?
Umm, that maybe should have been at the beginning :-)

But, yes, lets start a new ticket.

[Well, the last comment is obsolete because the mail never made it]
msg7978 Author: [hidden] (rouilj) Date: 2024-04-05 02:25
> So it seems we can no longer reply to messages from our bug
> tracker.

I have a ticket open with psf. It appears this has been broken
since their OS upgrade at the begining of th year.

> On Mon, Apr 01, 2024 at 03:24:35PM +0000, John Rouillard wrote:
> > AFAICT all permission checks are in Python. Regardless of a
> > check method.
>
> Yes, I know, I was highlighting this because it's slow :-)

Gotcha.

> > in rest.py filters obj_list (returned by the database) by
> > permission.
> >
> > The permissions generated from the user's Role is checked in
> > Python even if a permission has no check method.
> 
> Yes, and the permissions are checked in arbitrary order.

Reordering the permissions can help here.

> My proposal would be to group permissions by role (so that we
> need to check only the permissions where we know the user has
> that role) *and* order the easy ones (without a check method)
> first. That way we can - Check only the permissions for the
> roles we have

How would that work? Would you need to replace a
hasPermission('View', 'issue' ...) in the core code with a
hasRole() check? If so how would the code know what Role to use?
The same permission could be in multiple roles.

Also I wonder (see below) if the permissions are grouped by role
by default.

There is a hasRole() in templating.py, but I don't think that's
used outside of the templating system. Also as I remember it
checks for Anonymous and maybe Admin roles only. These are the
only two roles with predefined meanings.

> - Check the easy ones first. On first hit we stop. That way in
>   cases where a user has permissions that are easy to check we
>   stop early without touching the performance-problematic ones

This is my understanding of how things work, do you agree?

1) user authenticates
2) user's roles are looked up
3) security object is populated with the permissions associated
   with the user's roles
4) hasPermission(X) calls are made to see if the user can do X

Sorting to put "easy ones" first makes sense. See below on my
questions about ordering/reordering.

> > Pushing permissions down to the DB backend is an interesting
> > idea. Sadly I can't think of a way to make it happen. With
> > the check method, Roundup's access control changes from an
> > RBAC system to something more powerful. In theory I can use a
> > check method on a permission to deny access based on time of
> > day.
> 
> Sure. I was thinking of having less expressive permissions than
> we currently have with check methods. We may find that for many
> use-cases we can formulate permissions that are checkable in
> SQL.

AFAIK access is denied only if no permission matches. A
permission can't return a "deny", it only says "I don't
match". Is that your understanding as well?

I can see a way to do some of this if we define relationships
like:

  owner
  actor

which are done via check commands currently. These map to a
database check on 'creator' and 'actor'. But expressing a
permission like:

  If the issue is marked private the user needs to be on one of
     two access lists (nosy and verynosy in my case) to see the
     issue

requires a like check for the id number in the two fields. Not
insurmountable. But you would need to define some sort of method
to implement this using a join (maybe, my SQL isn't great).

> > AFAIK, permissions aren't sorted by role. The role is used to
> > generate a list of permissions and then is discarded. Did you
> > mean permission type/name?
> 
> I meant that as a proposed change, see above.

Ok. It's possible that the role does have an impact on
ordering. See below.

> > Grouping the permission list by type/class:
> >
> >    P[edit][user] = [
> >        Permission('edit', user, [name, phonenumber], during_daytime)
> >    ]
> >
> >    P[view][issue] = [Permission('view', issue,),
> >             Permission('view', issue, [title, nosy, priority,])]
> >
> >    ...
> >
> > makes selecting all permissions by type and class an O(1)
> > rather than O(n) cost. It should have no effect on the
> > expressed permissions. This should reduce the number of
> > permissions that have to be iterated over. But, I have no
> > idea if traversal of the list is a major issue.
> 
> The permissions with check methods are heavy on performance, we
> may want to check the ones without method first and only if we
> found no permission check the others. Would improve performance
> a lot for some of my use-cases.

Since no permission can cause a 'deny' action, I think sorting it
is safe, but I am not willing to put money on that.

Adding grouping by class (or None for tracker level permissions)
to the existing grouping by permission name I think is a
win. Then sorting within those buckets by:

   no properties assigned no check function
   no properties assigned check function
   properties assigned and properties defined
      without prop_only no check
   properties assigned and properties defined
      without prop_only with check
   properties assigned and defined with prop_only no check function
   properties assigned and defined with prop_only and check function

might be faster. I think this ordering can be expressed using a tuple
with one position for each criteria. So a permission list item for the
first bucket is:

   (properties=0, check=0, prop-only=0, permission object)

and the last bucket is

   (properties=1, check=1, prop-only=1, permission object)

So sorting should be doable.

I am not sure what the default order of permisions is. I think the
order of permissions assigned to a role is in the order it's assigned
in schema.py. I would expect the order for a user with roles User,
Role1, Role2 would be:

  User permissions in order they were added in schema.py
  Role1 permissions in order they were added in schema.py
  Role2 permissions in order they were added in schema.py

Reordering into buckets as above could help.

> > Sorting the permissions can help as well. I am concerned that
> > sorting could result in changing the applied permissions from
> > the unsorted state. AFAIK, unlike detectors permissions don't
> > have a method to set ordering which plays a role in what
> > permission is expressed.
> 
> I do not think the semantics change. If a person has multiple
> permissions on a Class (and/or properties) this should be
> equivalent to just one permission.

I think you are corect but it's because a permission can't result
in a deny. It can only result in a "I don't match" so the
permission checker moves on to the next item.

Permissions defined with properties (e.g. View) match a request
for View permissions (without properties) on the class. So:

  havePermission('View', 'issue')

matches:

  addPermission(name='View', klass='issue', property=['title']).

This may or may not make a difference. It's possible that the
order I outlined above might be more efficient if all permisions
with a check were moved after the permissions without
checks. Otherwise the bucket order stays as above.

Thoughts?
msg8004 Author: [hidden] (rouilj) Date: 2024-04-22 15:43
Ralf, do you have more thoughts on this?
msg8015 Author: [hidden] (schlatterbeck) Date: 2024-04-29 11:02
On Mon, Apr 22, 2024 at 03:43:23PM +0000, John Rouillard wrote:
> Ralf, do you have more thoughts on this?

I had, yes, but was hit again by my reply being bounced. My reply was April 5th. Hmpf. My workflow here really depends on being able to reply to an email. That's what I wrote a month ago:

On Fri, Apr 05, 2024 at 02:25:49AM +0000, John Rouillard wrote:
> > Yes, and the permissions are checked in arbitrary order.
>
> Reordering the permissions can help here.

Yes, that's the idea.

>
> > My proposal would be to group permissions by role (so that we
> > need to check only the permissions where we know the user has
> > that role) *and* order the easy ones (without a check method)
> > first. That way we can - Check only the permissions for the
> > roles we have
>
> How would that work? Would you need to replace a
> hasPermission('View', 'issue' ...) in the core code with a
> hasRole() check? If so how would the code know what Role to use?
> The same permission could be in multiple roles.

No the hasPermission would stay. It just would check the permissions in
an order that has least effort first. And it would not check permissions
for roles the user doesn't have.

> Also I wonder (see below) if the permissions are grouped by role
> by default.

Yes, the implementation would store permissions grouped by role (e.g. in
a dict). In each dict we would have containers of permissions.
Permissions that are for the whole Class (not individual properties of a
class) *without* check methods would be in the first container, followed
by permissions on individual attributes (without check methods) followed
by everything with check methods, probably in different containers. So
we could check for each role the first entry (for the whole Class
without check methods), then each role the second entry, etc. The search
-- as it is now -- terminates when a matching entry is found.

> This is my understanding of how things work, do you agree?
>
> 1) user authenticates
> 2) user's roles are looked up
> 3) security object is populated with the permissions associated
>    with the user's roles
> 4) hasPermission(X) calls are made to see if the user can do X

Hmm, yes, the 3) will build data structures that can be more effectively
searched, see above.

> AFAIK access is denied only if no permission matches. A
> permission can't return a "deny", it only says "I don't
> match". Is that your understanding as well?

Yes.

> I can see a way to do some of this if we define relationships
> like:
>
>   owner
>   actor
>
> which are done via check commands currently. These map to a
> database check on 'creator' and 'actor'. But expressing a
> permission like:
>
>   If the issue is marked private the user needs to be on one of
>      two access lists (nosy and verynosy in my case) to see the
>      issue
>
> requires a like check for the id number in the two fields. Not
> insurmountable. But you would need to define some sort of method
> to implement this using a join (maybe, my SQL isn't great).

Yes, the owner checks are easy (but also require a join I think) and
other checks may be more challenging or even impossible. I think the
challenge is to come up with a user-understandable formulation of
sql-checkable permissions. So that permissions can be defined in a way
that the user understands *and* that can be translated to an SQL query.

> Since no permission can cause a 'deny' action, I think sorting it
> is safe, but I am not willing to put money on that.

The order currently is arbitrary and depends on the order of definition.
I think we would have seen a lot of bugs if this was order-dependent.

So, yes, I would put some money on that :-)

> Adding grouping by class (or None for tracker level permissions)
> to the existing grouping by permission name I think is a
> win. Then sorting within those buckets by:
>
>    no properties assigned no check function
>    no properties assigned check function
>    properties assigned and properties defined
>       without prop_only no check
>    properties assigned and properties defined
>       without prop_only with check
>    properties assigned and defined with prop_only no check function
>    properties assigned and defined with prop_only and check function

What do you mean be
 properties assigned and properties defined without prop_only
etc.?

I would put the ones with check functions last. See above where I
outlined this. The check functions are the problematic part concerning
performance.

> might be faster. I think this ordering can be expressed using a tuple
> with one position for each criteria. So a permission list item for the
> first bucket is:
>
>    (properties=0, check=0, prop-only=0, permission object)
>
> and the last bucket is
>
>    (properties=1, check=1, prop-only=1, permission object)
>
> So sorting should be doable.

I agree the principle with buckets, just the order of the checks would
be different checking permissions with check functions last.

> I am not sure what the default order of permisions is. I think the
> order of permissions assigned to a role is in the order it's assigned
> in schema.py. I would expect the order for a user with roles User,
> Role1, Role2 would be:
>
>   User permissions in order they were added in schema.py
>   Role1 permissions in order they were added in schema.py
>   Role2 permissions in order they were added in schema.py
>
> Reordering into buckets as above could help.

Yes to both, order is order of definition in schema.py and reordering
could help :-)

> > > Sorting the permissions can help as well. I am concerned that
> > > sorting could result in changing the applied permissions from
> > > the unsorted state. AFAIK, unlike detectors permissions don't
> > > have a method to set ordering which plays a role in what
> > > permission is expressed.
> >
> > I do not think the semantics change. If a person has multiple
> > permissions on a Class (and/or properties) this should be
> > equivalent to just one permission.
>
> I think you are corect but it's because a permission can't result
> in a deny. It can only result in a "I don't match" so the
> permission checker moves on to the next item.

Yes. It iterates over permissions until a match or until end of
permissions.

> Permissions defined with properties (e.g. View) match a request
> for View permissions (without properties) on the class. So:
>
>   havePermission('View', 'issue')
>
> matches:
>
>   addPermission(name='View', klass='issue', property=['title']).
>
> This may or may not make a difference. It's possible that the
> order I outlined above might be more efficient if all permisions
> with a check were moved after the permissions without
> checks. Otherwise the bucket order stays as above.

Yes. We should definitely check entries with a check function last.

Thanks for looking into this!

Ralf
msg8016 Author: [hidden] (rouilj) Date: 2024-04-29 17:40
Hi Ralf:

In message <1714388568.86.0.772511523784.issue2551330@roundup.psfhosted.org>,
Ralf Schlatterbeck writes:
>Ralf Schlatterbeck added the comment:
>
>On Mon, Apr 22, 2024 at 03:43:23PM +0000, John Rouillard wrote:
>> Ralf, do you have more thoughts on this?
>
>I had, yes, but was hit again by my reply being bounced. My reply was
> April 5th. Hmpf. My workflow here really depends on being able to
> reply to an email. That's what I wrote a month ago:

EWDurbin fixed this on April 15th.

  https://github.com/python/psf-salt/issues/352

Email has been working for me since.

>On Fri, Apr 05, 2024 at 02:25:49AM +0000, John Rouillard wrote:
>> > My proposal would be to group permissions by role (so that we
>> > need to check only the permissions where we know the user has
>> > that role) *and* order the easy ones (without a check method)
>> > first. That way we can - Check only the permissions for the
>> > roles we have
>>
>> How would that work? Would you need to replace a
>> hasPermission('View', 'issue' ...) in the core code with a
>> hasRole() check? If so how would the code know what Role to use?
>> The same permission could be in multiple roles.
>
>No the hasPermission would stay. It just would check the permissions in
>an order that has least effort first. And it would not check permissions
>for roles the user doesn't have.

I'm still not getting it.

>> Also I wonder (see below) if the permissions are grouped by role
>> by default.
>
>Yes, the implementation would store permissions grouped by role (e.g. in
>a dict). In each dict we would have containers of permissions.
>Permissions that are for the whole Class (not individual properties of a
>class) *without* check methods would be in the first container, followed
>by permissions on individual attributes (without check methods) followed
>by everything with check methods, probably in different containers. So
>we could check for each role the first entry (for the whole Class
>without check methods), then each role the second entry, etc. The search
>-- as it is now -- terminates when a matching entry is found.

Hang on. By 'role' do you mean Permission like "View" or Role as in
the user class: Admin, User, Anonymous? I am interpreting Role as
Admin ....

>> This is my understanding of how things work, do you agree?
>>
>> 1) user authenticates
>> 2) user's roles are looked up
>> 3) security object is populated with the permissions associated
>>    with the user's roles
>> 4) hasPermission(X) calls are made to see if the user can do X
>
>Hmm, yes, the 3) will build data structures that can be more effectively
>searched, see above.

Ok, but there is no role lookup (Admin, Anonymous) at any time in
4. Roles just don't exist there.

>> AFAIK access is denied only if no permission matches. A
>> permission can't return a "deny", it only says "I don't
>> match". Is that your understanding as well?
>
>Yes.
>
>> I can see a way to do some of this if we define relationships
>> like:
>>
>>   owner
>>   actor
>>
>> which are done via check commands currently. These map to a
>> database check on 'creator' and 'actor'. But expressing a
>> permission like:
>>
>>   If the issue is marked private the user needs to be on one of
>>      two access lists (nosy and verynosy in my case) to see the
>>      issue
>>
>> requires a like check for the id number in the two fields. Not
>> insurmountable. But you would need to define some sort of method
>> to implement this using a join (maybe, my SQL isn't great).
>
>Yes, the owner checks are easy (but also require a join I think) and
>other checks may be more challenging or even impossible. I think the
>challenge is to come up with a user-understandable formulation of
>sql-checkable permissions. So that permissions can be defined in a way
>that the user understands *and* that can be translated to an SQL query.

This has to be an optimization unless we want to drop support for
non-sql databases. I don't like dropping support for non-sql databases
from the hyperdb.

>> Since no permission can cause a 'deny' action, I think sorting it
>> is safe, but I am not willing to put money on that.
>
>The order currently is arbitrary and depends on the order of definition.
>I think we would have seen a lot of bugs if this was order-dependent.
>
>So, yes, I would put some money on that :-)

Good.

>> Adding grouping by class (or None for tracker level permissions)
>> to the existing grouping by permission name I think is a
>> win. Then sorting within those buckets by:
>>
>>    no properties assigned no check function
>>    no properties assigned check function
>>    properties assigned and properties defined
>>       without prop_only no check
>>    properties assigned and properties defined
>>       without prop_only with check
>>    properties assigned and defined with prop_only no check function
>>    properties assigned and defined with prop_only and check function
>
>What do you mean be
> properties assigned and properties defined without prop_only
>etc.?

If I have a rule "View access to username in User class" it implicitly
allows "View User". So:

   db.security.addPermission(name='View', klass='user',
    properties=('activity', 'id', 'organisation', 'phone', 'realname',
            'timezone', 'username'),
            description="User can have limited view of other users",
            props_only=False)

causes hasPermission("View", "1", "user") to return true. This is a
case of "no prop_only". The permission:

   db.security.addPermission(name='View', klass='user',
    properties=('activity', 'id', 'organisation', 'phone', 'realname',
            'timezone', 'username'),
            description="User can have limited view of other users",
            props_only=True)

would not satisfy hasPermission("View", "1", "user"). It would satisfy
hasPermission("View", "1", "user", property="phone"). This last is an
example of a permission with prop_only.

>I would put the ones with check functions last. See above where I
>outlined this. The check functions are the problematic part concerning
>performance.

Agreed.

>> might be faster. I think this ordering can be expressed using a tuple
>> with one position for each criteria. So a permission list item for the
>> first bucket is:
>>
>>    (properties=0, check=0, prop-only=0, permission object)
>>
>> and the last bucket is
>>
>>    (properties=1, check=1, prop-only=1, permission object)
>>
>> So sorting should be doable.
>
>I agree the principle with buckets, just the order of the checks would
>be different checking permissions with check functions last.

Fair enough. IIRC, the check functions aren't called unless there is
an exact match. So check function is called for:

  hasPermission("View", "1", "keyword")

given:

  db.security.addPermission(name='View', klass='keyword',
      check=own_keyword,
      description="User is allowed to edit what they created")

but not for:

  db.security.addPermission(name='View', klass='keyword',
      check=own_keyword, properties="name")

because the check doesn't specify any properties (but still allows
hasPermission to pass). I am a little hazy on this but I think I'm
correct. I think something like this was what led to adding
props_only.

>> I am not sure what the default order of permissions is. I think the
>> order of permissions assigned to a role is in the order it's assigned
>> in schema.py. I would expect the order for a user with roles User,
>> Role1, Role2 would be:
>>
>>   User permissions in order they were added in schema.py
>>   Role1 permissions in order they were added in schema.py
>>   Role2 permissions in order they were added in schema.py
>>
>> Reordering into buckets as above could help.
>
>Yes to both, order is order of definition in schema.py and reordering
>could help :-)

As long as we preserve the ordering that sounds good to me.

>> > > Sorting the permissions can help as well. I am concerned that
>> > > sorting could result in changing the applied permissions from
>> > > the unsorted state. AFAIK, unlike detectors permissions don't
>> > > have a method to set ordering which plays a role in what
>> > > permission is expressed.
>> >
>> > I do not think the semantics change. If a person has multiple
>> > permissions on a Class (and/or properties) this should be
>> > equivalent to just one permission.
>>
>> I think you are correct but it's because a permission can't result
>> in a deny. It can only result in a "I don't match" so the
>> permission checker moves on to the next item.
>
>Yes. It iterates over permissions until a match or until end of
>permissions.

Understood.

>> Permissions defined with properties (e.g. View) match a request
>> for View permissions (without properties) on the class. So:
>>
>>   havePermission('View', 'issue')
>>
>> matches:
>>
>>   addPermission(name='View', klass='issue', property=['title']).
>>
>> This may or may not make a difference. It's possible that the
>> order I outlined above might be more efficient if all permisions
>> with a check were moved after the permissions without
>> checks. Otherwise the bucket order stays as above.
>
>Yes. We should definitely check entries with a check function last.

As I noted above I think a permission with a check can still cause a
check to pass even without running the check.

I still find:

   db.security.addPermission(name='View', klass='user',
    properties=('activity', 'id', 'organisation', 'phone', 'realname',
            'timezone', 'username'),
            description="User can have limited view of other users")

matching hasPermission("View", "1", "user") unexpected. It makes sense
that I would need View rights on user (traversal rights) to access the
phone but...

>Thanks for looking into this!

Yeah something to look at for 2.5. I think improving the permission
data structure is doable. I am less optimistic about being able to
push this down to SQL.
msg8017 Author: [hidden] (schlatterbeck) Date: 2024-04-29 18:24
On Mon, Apr 29, 2024 at 05:40:48PM +0000, John Rouillard wrote:
> >I had, yes, but was hit again by my reply being bounced. My reply was
> > April 5th. Hmpf. My workflow here really depends on being able to
> > reply to an email. That's what I wrote a month ago:
> 
> EWDurbin fixed this on April 15th.
...
> Email has been working for me since.

Great, thanks for pushing this!

> >> Also I wonder (see below) if the permissions are grouped by role
> >> by default.
> >
> >Yes, the implementation would store permissions grouped by role (e.g. in
> >a dict). In each dict we would have containers of permissions.
> >Permissions that are for the whole Class (not individual properties of a
> >class) *without* check methods would be in the first container, followed
> >by permissions on individual attributes (without check methods) followed
> >by everything with check methods, probably in different containers. So
> >we could check for each role the first entry (for the whole Class
> >without check methods), then each role the second entry, etc. The search
> >-- as it is now -- terminates when a matching entry is found.
> 
> Hang on. By 'role' do you mean Permission like "View" or Role as in
> the user class: Admin, User, Anonymous? I am interpreting Role as
> Admin ....

Yes role in the user class. My use-cases involve lots of roles.
And we're seeing large search times even for users that have full
permission on a class because apparently permissions of other roles
(which the current user *also* has) are checked first. So the idea would
be to group permission by role.

> 
> >> This is my understanding of how things work, do you agree?
> >>
> >> 1) user authenticates
> >> 2) user's roles are looked up
> >> 3) security object is populated with the permissions associated
> >>    with the user's roles
> >> 4) hasPermission(X) calls are made to see if the user can do X
> >
> >Hmm, yes, the 3) will build data structures that can be more effectively
> >searched, see above.
> 
> Ok, but there is no role lookup (Admin, Anonymous) at any time in
> 4. Roles just don't exist there.

They do, from roundup/security.py:

    def hasPermission(self, permission, userid, classname=None,
                      property=None, itemid=None):
        '''Look through all the Roles, and hence Permissions, and
           see if "permission" exists given the constraints of
           classname, property, itemid, and props_only.
[...]
        for rolename in self.db.user.get_roles(userid):
            if not rolename or (rolename not in self.role):
                continue
            # for each of the user's Roles, check the permissions
            for perm in self.role[rolename].permissions:

I propose to change the order of that iterations:
- First loop over the 'easy' targets, for each role check only the
  permissions without a permission method
Only if nothing is found:
- Loop over roles again this time using the costly checks *with* check
  methods

> >Yes, the owner checks are easy (but also require a join I think) and
> >other checks may be more challenging or even impossible. I think the
> >challenge is to come up with a user-understandable formulation of
> >sql-checkable permissions. So that permissions can be defined in a way
> >that the user understands *and* that can be translated to an SQL query.
> 
> This has to be an optimization unless we want to drop support for
> non-sql databases. I don't like dropping support for non-sql databases
> from the hyperdb.

I want them *in addition* to the current methods.
And if we come up with a suitable formulation this can be implemented
for non-sql backends, too. But for non-sql databases there is probably
no performance improvement.

> >What do you mean be
> > properties assigned and properties defined without prop_only
> >etc.?
> 
> If I have a rule "View access to username in User class" it implicitly
> allows "View User". So:
> 
>    db.security.addPermission(name='View', klass='user',
>     properties=('activity', 'id', 'organisation', 'phone', 'realname',
>             'timezone', 'username'),
>             description="User can have limited view of other users",
>             props_only=False)
> 
> causes hasPermission("View", "1", "user") to return true. This is a
> case of "no prop_only". The permission:
> 
>    db.security.addPermission(name='View', klass='user',
>     properties=('activity', 'id', 'organisation', 'phone', 'realname',
>             'timezone', 'username'),
>             description="User can have limited view of other users",
>             props_only=True)
> 
> would not satisfy hasPermission("View", "1", "user"). It would satisfy
> hasPermission("View", "1", "user", property="phone"). This last is an
> example of a permission with prop_only.

Thanks I forgot we had that additional prop_only flag.

> >I agree the principle with buckets, just the order of the checks would
> >be different checking permissions with check functions last.
> 
> Fair enough. IIRC, the check functions aren't called unless there is
> an exact match. So check function is called for:
> 
>   hasPermission("View", "1", "keyword")
> 
> given:
> 
>   db.security.addPermission(name='View', klass='keyword',
>       check=own_keyword,
>       description="User is allowed to edit what they created")
> 
> but not for:
> 
>   db.security.addPermission(name='View', klass='keyword',
>       check=own_keyword, properties="name")
> 
> because the check doesn't specify any properties (but still allows
> hasPermission to pass). I am a little hazy on this but I think I'm
> correct. I think something like this was what led to adding
> props_only.

Yes, this may be the case.
But I think when displaying the results of a search (e.g. in an index
template of the web interface) all properties are checked explicitly.

> >> I am not sure what the default order of permissions is. I think the
> >> order of permissions assigned to a role is in the order it's assigned
> >> in schema.py. I would expect the order for a user with roles User,
> >> Role1, Role2 would be:
> >>
> >>   User permissions in order they were added in schema.py
> >>   Role1 permissions in order they were added in schema.py
> >>   Role2 permissions in order they were added in schema.py
> >>
> >> Reordering into buckets as above could help.
> >
> >Yes to both, order is order of definition in schema.py and reordering
> >could help :-)
> 
> As long as we preserve the ordering that sounds good to me.

I do not think we need to preserve the ordering: If two permissions
(maybe even for different roles held by the same user) specify that a
user has access it doesn't matter which one triggers first.

> As I noted above I think a permission with a check can still cause a
> check to pass even without running the check.
> 
> I still find:
> 
>    db.security.addPermission(name='View', klass='user',
>     properties=('activity', 'id', 'organisation', 'phone', 'realname',
>             'timezone', 'username'),
>             description="User can have limited view of other users")
> 
> matching hasPermission("View", "1", "user") unexpected. It makes sense
> that I would need View rights on user (traversal rights) to access the
> phone but...

Yes, me too. I would call it a bug. And I *think* the props_only flag
was added for backwards compatibility when it was discovered that the
checks are un-untuitive (or should I say buggy). And I also think that
roundup core doesn't contain any permissions on properties without the
props_only flag, at least I think I remember this from the far past.

> Yeah something to look at for 2.5. I think improving the permission
> data structure is doable. I am less optimistic about being able to
> push this down to SQL.

Yes I think so, too. If I had an idea of how permission with check
methods could be specified more SQL-friendly I would already have
proposed it :-)

Thanks
Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
History
Date User Action Args
2024-04-29 18:24:14schlatterbecksetmessages: + msg8017
2024-04-29 17:40:48rouiljsetmessages: + msg8016
2024-04-29 11:02:48schlatterbecksetmessages: + msg8015
2024-04-22 15:43:22rouiljsetmessages: + msg8004
2024-04-05 02:25:49rouiljsetmessages: + msg7978
2024-04-02 08:03:40schlatterbecksetmessages: + msg7977
2024-04-01 19:55:35adminsetmessages: + msg7973, msg7974
2024-04-01 19:53:00rouiljcreate