Roundup Tracker - Issues

Issue 2551330

classification
Improving permission performance - RBAC, ABAC push to db layer from hyperdb
Type: security Severity: normal
Components: Database, Web interface, API Versions:
process
Status: fixed fixed
:
: schlatterbeck : rouilj, schlatterbeck
Priority: : patch

Created on 2024-04-01 19:52 by rouilj, last changed 2024-11-26 20:35 by rouilj.

Files
File name Uploaded Description Edit Remove
security.perf_log.patch rouilj, 2024-10-17 03:32
security.perf_memo.patch rouilj, 2024-10-17 14:14
Messages
msg7973 Author: [hidden] (schlatterbeck) Date: 2024-04-01 09:13
Hmm: When there are permissions with check methods on a Class, we are checking permissions *in python* not in the database. The consequence is that we do not know at SQL time which items will be displayed.

I do have several classes with permission methods in my trackers. I've often thought about this, if it would make sense to implement permissions that can be checked at SQL time so that we do not need to filter in python. On some of my trackers filtering in python is a real performance killer because the tables are quite big...

So in a first step we may want to use your new optimized implementation only if there are no check methods.

Also the check methods are not sorted: Some users may see everything but we are sometimes checking the low-performance (method-based) checks before the lightweight checks that may permit everything for a certain role. An optimized version of the permission checks that sorts permissions by role and tries permissions without check methods first would be nice here.

Sorry for making this issue into a wishlist :-)

Thanks
Ralf
msg7974 Author: [hidden] (rouilj) Date: 2024-04-01 15:24
Hi Ralf:

I committed changeset: 7853:03c1b7ae3a68. While I was composing
this response, it passed CI and all code paths are checked for
what that's worth.

Your performance concern can be addressed by
reducing the number of rows returned from the SQL (or
anydbm) db using the method incorporated in this changeset.
See admin_guide.txt look for max_response_row_size.

Reduce the returned data set to 100 or 1000 items. The only
down side is that the total size returned will be set to -1
indicating the total size is unknown. Which is an improvement
from the @total_size hallucination that used to be returned 8-).

Currently the max size is a constant for all classes. You
could experiment with making max size a function that takes
a class/classname and @page_size and returns a different
limit value for different inputs. I didn't do it because
YAGNI but if YNI.

> When there are permissions with check methods on a Class,
> we are checking permissions *in python* not in the
> database. The consequence is that we do not know at SQL
> time which items will be displayed.

AFAICT all permission checks are in Python. Regardless of a
check method.

This bit of code:

        for item_id in obj_list:
            r = {}
            if self.db.security.hasPermission(
                'View', uid, class_name, itemid=item_id, property='id',
            ):
                r = {'id': item_id, 'link': class_path + item_id}
            if display_props:
                # format_item does the permission checks
                r.update(self.format_item(class_obj.getnode(item_id),
                    item_id, props=display_props, verbose=verbose))
            if r:
                result['collection'].append(r)

in rest.py filters obj_list (returned by the database) by
permission.

The permissions generated from the user's Role is
checked in Python even if a permission has no check method.

> I do have several classes with permission methods in my
> trackers. I've often thought about this, if it would make
> sense to implement permissions that can be checked at SQL
> time so that we do not need to filter in python. On some
> of my trackers filtering in python is a real performance
> killer because the tables are quite big...

Pushing permissions down to the DB backend is an interesting
idea. Sadly I can't think of a way to make it happen. With
the check method, Roundup's access control changes from an
RBAC system to something more powerful. In theory I can use
a check method on a permission to deny access based on time
of day.

This makes it more of an ABAC (Attribute based access
control) system. I think that expressiveness makes it
difficult if not impossible to push permissions to the
database layer and keeps it in the hyperdb.

> So in a first step we may want to use your new optimized
> implementation only if there are no check methods.

As I stated above, I think the Permissions processing
happens in Python in all cases. Changing the max size of a
returned set to a number 1 more than @page_size should fix
the bug that triggered this issue with no performance impact.

> Also the check methods are not sorted: Some users may see
> everything but we are sometimes checking the low-performance
> (method-based) checks before the lightweight checks that may
> permit everything for a certain role. An optimized version
> of the permission checks that sorts permissions by role and
> tries permissions without check methods first would be nice
> here.

AFAIK, permissions aren't sorted by role. The role is used
to generate a list of permissions and then is discarded. Did
you mean permission type/name?

I agree with optimizing the application of the permissions
list. I think I proposed something of the sort many years
ago. IIRC all permissions are in a single list. So you could
have (args are type, class, properties, check, props_only):

  Permission('Edit', user, [name, phonenumber], during_daytime)  (1)
  Permission('View', user, [name, phonenumber], during_daytime)  (2)
  Permission('Edit', issue, [nosy,], None, props_only=True)      (3)
  Permission('View', issue,)                                     (4)
  ...
  Permission('View', issue, [title, nosy, priority,])            (5)

in a single list. if you are checking to see if somebody can
view the issue, you have to iterate over 1-3 for each item
in the database before reaching 4.

Grouping the permission list by type/class:

   P[edit][user] = [
       Permission('edit', user, [name, phonenumber], during_daytime)
   ]

   P[view][issue] = [Permission('view', issue,),
            Permission('view', issue, [title, nosy, priority,])]

   ...

makes selecting all permissions by type and class an O(1)
rather than O(n) cost. It should have no effect on the
expressed permissions. This should reduce the number of
permissions that have to be iterated over. But, I have no
idea if traversal of the list is a major issue.

Sorting the permissions can help as well. I am concerned
that sorting could result in changing the applied
permissions from the unsorted state. AFAIK, unlike detectors
permissions don't have a method to set ordering which plays
a role in what permission is expressed.

> Sorry for making this issue into a wishlist :-)

I agree permissions and their application can be a
bottleneck. I wish I had better developer documentation on
how they work. IIRC when I added props_only it was because a
class level check was being allowed by a property level
permission. Documenting the various check and permission
types and how they play together would be helpful.

I think further discussion of this should be on a new ticket.
Agreed?
msg7975 Author: [hidden] (rouilj) Date: 2024-04-01 19:52
Linked the first two messages from a discussion on issue2551328.
Ralf noted that performance when retrieving a lot of data tanks
due to permission checks in python at the hyperdb layer.
msg7977 Author: [hidden] (schlatterbeck) Date: 2024-04-02 08:03
I'm cut&pasting my email from yesterday here, my MTA reports

<issues@roundup-tracker.org>: host bugs.python.org[167.71.181.142] said: 454
    4.7.1 <issues@roundup-tracker.org>: Relay access denied (in reply to RCPT
    TO command)

So it seems we can no longer reply to messages from our bug tracker.

On Mon, Apr 01, 2024 at 03:24:35PM +0000, John Rouillard wrote:
> > When there are permissions with check methods on a Class,
> > we are checking permissions *in python* not in the
> > database. The consequence is that we do not know at SQL
> > time which items will be displayed.
>
> AFAICT all permission checks are in Python. Regardless of a
> check method.

Yes, I know, I was highlighting this because it's slow :-)

> in rest.py filters obj_list (returned by the database) by
> permission.
>
> The permissions generated from the user's Role is
> checked in Python even if a permission has no check method.

Yes, and the permissions are checked in arbitrary order. My proposal
would be to group permissions by role (so that we need to check only the
permissions where we know the user has that role) *and* order the easy
ones (without a check method) first. That way we can
- Check only the permissions for the roles we have
- Check the easy ones first. On first hit we stop. That way in cases
  where a user has permissions that are easy to check we stop early
  without touching the performance-problematic ones

>
> Pushing permissions down to the DB backend is an interesting
> idea. Sadly I can't think of a way to make it happen. With
> the check method, Roundup's access control changes from an
> RBAC system to something more powerful. In theory I can use
> a check method on a permission to deny access based on time
> of day.

Sure. I was thinking of having less expressive permissions than we
currently have with check methods. We may find that for many use-cases
we can formulate permissions that are checkable in SQL.

> AFAIK, permissions aren't sorted by role. The role is used
> to generate a list of permissions and then is discarded. Did
> you mean permission type/name?

I meant that as a proposed change, see above.

> Grouping the permission list by type/class:
>
>    P[edit][user] = [
>        Permission('edit', user, [name, phonenumber], during_daytime)
>    ]
>
>    P[view][issue] = [Permission('view', issue,),
>             Permission('view', issue, [title, nosy, priority,])]
>
>    ...
>
> makes selecting all permissions by type and class an O(1)
> rather than O(n) cost. It should have no effect on the
> expressed permissions. This should reduce the number of
> permissions that have to be iterated over. But, I have no
> idea if traversal of the list is a major issue.

The permissions with check methods are heavy on performance, we may want
to check the ones without method first and only if we found no
permission check the others. Would improve performance a lot for some of
my use-cases.

>
> Sorting the permissions can help as well. I am concerned
> that sorting could result in changing the applied
> permissions from the unsorted state. AFAIK, unlike detectors
> permissions don't have a method to set ordering which plays
> a role in what permission is expressed.

I do not think the semantics change. If a person has multiple
permissions on a Class (and/or properties) this should be equivalent to
just one permission.

> I think further discussion of this should be on a new ticket.
> Agreed?
Umm, that maybe should have been at the beginning :-)

But, yes, lets start a new ticket.

[Well, the last comment is obsolete because the mail never made it]
msg7978 Author: [hidden] (rouilj) Date: 2024-04-05 02:25
> So it seems we can no longer reply to messages from our bug
> tracker.

I have a ticket open with psf. It appears this has been broken
since their OS upgrade at the begining of th year.

> On Mon, Apr 01, 2024 at 03:24:35PM +0000, John Rouillard wrote:
> > AFAICT all permission checks are in Python. Regardless of a
> > check method.
>
> Yes, I know, I was highlighting this because it's slow :-)

Gotcha.

> > in rest.py filters obj_list (returned by the database) by
> > permission.
> >
> > The permissions generated from the user's Role is checked in
> > Python even if a permission has no check method.
> 
> Yes, and the permissions are checked in arbitrary order.

Reordering the permissions can help here.

> My proposal would be to group permissions by role (so that we
> need to check only the permissions where we know the user has
> that role) *and* order the easy ones (without a check method)
> first. That way we can - Check only the permissions for the
> roles we have

How would that work? Would you need to replace a
hasPermission('View', 'issue' ...) in the core code with a
hasRole() check? If so how would the code know what Role to use?
The same permission could be in multiple roles.

Also I wonder (see below) if the permissions are grouped by role
by default.

There is a hasRole() in templating.py, but I don't think that's
used outside of the templating system. Also as I remember it
checks for Anonymous and maybe Admin roles only. These are the
only two roles with predefined meanings.

> - Check the easy ones first. On first hit we stop. That way in
>   cases where a user has permissions that are easy to check we
>   stop early without touching the performance-problematic ones

This is my understanding of how things work, do you agree?

1) user authenticates
2) user's roles are looked up
3) security object is populated with the permissions associated
   with the user's roles
4) hasPermission(X) calls are made to see if the user can do X

Sorting to put "easy ones" first makes sense. See below on my
questions about ordering/reordering.

> > Pushing permissions down to the DB backend is an interesting
> > idea. Sadly I can't think of a way to make it happen. With
> > the check method, Roundup's access control changes from an
> > RBAC system to something more powerful. In theory I can use a
> > check method on a permission to deny access based on time of
> > day.
> 
> Sure. I was thinking of having less expressive permissions than
> we currently have with check methods. We may find that for many
> use-cases we can formulate permissions that are checkable in
> SQL.

AFAIK access is denied only if no permission matches. A
permission can't return a "deny", it only says "I don't
match". Is that your understanding as well?

I can see a way to do some of this if we define relationships
like:

  owner
  actor

which are done via check commands currently. These map to a
database check on 'creator' and 'actor'. But expressing a
permission like:

  If the issue is marked private the user needs to be on one of
     two access lists (nosy and verynosy in my case) to see the
     issue

requires a like check for the id number in the two fields. Not
insurmountable. But you would need to define some sort of method
to implement this using a join (maybe, my SQL isn't great).

> > AFAIK, permissions aren't sorted by role. The role is used to
> > generate a list of permissions and then is discarded. Did you
> > mean permission type/name?
> 
> I meant that as a proposed change, see above.

Ok. It's possible that the role does have an impact on
ordering. See below.

> > Grouping the permission list by type/class:
> >
> >    P[edit][user] = [
> >        Permission('edit', user, [name, phonenumber], during_daytime)
> >    ]
> >
> >    P[view][issue] = [Permission('view', issue,),
> >             Permission('view', issue, [title, nosy, priority,])]
> >
> >    ...
> >
> > makes selecting all permissions by type and class an O(1)
> > rather than O(n) cost. It should have no effect on the
> > expressed permissions. This should reduce the number of
> > permissions that have to be iterated over. But, I have no
> > idea if traversal of the list is a major issue.
> 
> The permissions with check methods are heavy on performance, we
> may want to check the ones without method first and only if we
> found no permission check the others. Would improve performance
> a lot for some of my use-cases.

Since no permission can cause a 'deny' action, I think sorting it
is safe, but I am not willing to put money on that.

Adding grouping by class (or None for tracker level permissions)
to the existing grouping by permission name I think is a
win. Then sorting within those buckets by:

   no properties assigned no check function
   no properties assigned check function
   properties assigned and properties defined
      without prop_only no check
   properties assigned and properties defined
      without prop_only with check
   properties assigned and defined with prop_only no check function
   properties assigned and defined with prop_only and check function

might be faster. I think this ordering can be expressed using a tuple
with one position for each criteria. So a permission list item for the
first bucket is:

   (properties=0, check=0, prop-only=0, permission object)

and the last bucket is

   (properties=1, check=1, prop-only=1, permission object)

So sorting should be doable.

I am not sure what the default order of permisions is. I think the
order of permissions assigned to a role is in the order it's assigned
in schema.py. I would expect the order for a user with roles User,
Role1, Role2 would be:

  User permissions in order they were added in schema.py
  Role1 permissions in order they were added in schema.py
  Role2 permissions in order they were added in schema.py

Reordering into buckets as above could help.

> > Sorting the permissions can help as well. I am concerned that
> > sorting could result in changing the applied permissions from
> > the unsorted state. AFAIK, unlike detectors permissions don't
> > have a method to set ordering which plays a role in what
> > permission is expressed.
> 
> I do not think the semantics change. If a person has multiple
> permissions on a Class (and/or properties) this should be
> equivalent to just one permission.

I think you are corect but it's because a permission can't result
in a deny. It can only result in a "I don't match" so the
permission checker moves on to the next item.

Permissions defined with properties (e.g. View) match a request
for View permissions (without properties) on the class. So:

  havePermission('View', 'issue')

matches:

  addPermission(name='View', klass='issue', property=['title']).

This may or may not make a difference. It's possible that the
order I outlined above might be more efficient if all permisions
with a check were moved after the permissions without
checks. Otherwise the bucket order stays as above.

Thoughts?
msg8004 Author: [hidden] (rouilj) Date: 2024-04-22 15:43
Ralf, do you have more thoughts on this?
msg8015 Author: [hidden] (schlatterbeck) Date: 2024-04-29 11:02
On Mon, Apr 22, 2024 at 03:43:23PM +0000, John Rouillard wrote:
> Ralf, do you have more thoughts on this?

I had, yes, but was hit again by my reply being bounced. My reply was April 5th. Hmpf. My workflow here really depends on being able to reply to an email. That's what I wrote a month ago:

On Fri, Apr 05, 2024 at 02:25:49AM +0000, John Rouillard wrote:
> > Yes, and the permissions are checked in arbitrary order.
>
> Reordering the permissions can help here.

Yes, that's the idea.

>
> > My proposal would be to group permissions by role (so that we
> > need to check only the permissions where we know the user has
> > that role) *and* order the easy ones (without a check method)
> > first. That way we can - Check only the permissions for the
> > roles we have
>
> How would that work? Would you need to replace a
> hasPermission('View', 'issue' ...) in the core code with a
> hasRole() check? If so how would the code know what Role to use?
> The same permission could be in multiple roles.

No the hasPermission would stay. It just would check the permissions in
an order that has least effort first. And it would not check permissions
for roles the user doesn't have.

> Also I wonder (see below) if the permissions are grouped by role
> by default.

Yes, the implementation would store permissions grouped by role (e.g. in
a dict). In each dict we would have containers of permissions.
Permissions that are for the whole Class (not individual properties of a
class) *without* check methods would be in the first container, followed
by permissions on individual attributes (without check methods) followed
by everything with check methods, probably in different containers. So
we could check for each role the first entry (for the whole Class
without check methods), then each role the second entry, etc. The search
-- as it is now -- terminates when a matching entry is found.

> This is my understanding of how things work, do you agree?
>
> 1) user authenticates
> 2) user's roles are looked up
> 3) security object is populated with the permissions associated
>    with the user's roles
> 4) hasPermission(X) calls are made to see if the user can do X

Hmm, yes, the 3) will build data structures that can be more effectively
searched, see above.

> AFAIK access is denied only if no permission matches. A
> permission can't return a "deny", it only says "I don't
> match". Is that your understanding as well?

Yes.

> I can see a way to do some of this if we define relationships
> like:
>
>   owner
>   actor
>
> which are done via check commands currently. These map to a
> database check on 'creator' and 'actor'. But expressing a
> permission like:
>
>   If the issue is marked private the user needs to be on one of
>      two access lists (nosy and verynosy in my case) to see the
>      issue
>
> requires a like check for the id number in the two fields. Not
> insurmountable. But you would need to define some sort of method
> to implement this using a join (maybe, my SQL isn't great).

Yes, the owner checks are easy (but also require a join I think) and
other checks may be more challenging or even impossible. I think the
challenge is to come up with a user-understandable formulation of
sql-checkable permissions. So that permissions can be defined in a way
that the user understands *and* that can be translated to an SQL query.

> Since no permission can cause a 'deny' action, I think sorting it
> is safe, but I am not willing to put money on that.

The order currently is arbitrary and depends on the order of definition.
I think we would have seen a lot of bugs if this was order-dependent.

So, yes, I would put some money on that :-)

> Adding grouping by class (or None for tracker level permissions)
> to the existing grouping by permission name I think is a
> win. Then sorting within those buckets by:
>
>    no properties assigned no check function
>    no properties assigned check function
>    properties assigned and properties defined
>       without prop_only no check
>    properties assigned and properties defined
>       without prop_only with check
>    properties assigned and defined with prop_only no check function
>    properties assigned and defined with prop_only and check function

What do you mean be
 properties assigned and properties defined without prop_only
etc.?

I would put the ones with check functions last. See above where I
outlined this. The check functions are the problematic part concerning
performance.

> might be faster. I think this ordering can be expressed using a tuple
> with one position for each criteria. So a permission list item for the
> first bucket is:
>
>    (properties=0, check=0, prop-only=0, permission object)
>
> and the last bucket is
>
>    (properties=1, check=1, prop-only=1, permission object)
>
> So sorting should be doable.

I agree the principle with buckets, just the order of the checks would
be different checking permissions with check functions last.

> I am not sure what the default order of permisions is. I think the
> order of permissions assigned to a role is in the order it's assigned
> in schema.py. I would expect the order for a user with roles User,
> Role1, Role2 would be:
>
>   User permissions in order they were added in schema.py
>   Role1 permissions in order they were added in schema.py
>   Role2 permissions in order they were added in schema.py
>
> Reordering into buckets as above could help.

Yes to both, order is order of definition in schema.py and reordering
could help :-)

> > > Sorting the permissions can help as well. I am concerned that
> > > sorting could result in changing the applied permissions from
> > > the unsorted state. AFAIK, unlike detectors permissions don't
> > > have a method to set ordering which plays a role in what
> > > permission is expressed.
> >
> > I do not think the semantics change. If a person has multiple
> > permissions on a Class (and/or properties) this should be
> > equivalent to just one permission.
>
> I think you are corect but it's because a permission can't result
> in a deny. It can only result in a "I don't match" so the
> permission checker moves on to the next item.

Yes. It iterates over permissions until a match or until end of
permissions.

> Permissions defined with properties (e.g. View) match a request
> for View permissions (without properties) on the class. So:
>
>   havePermission('View', 'issue')
>
> matches:
>
>   addPermission(name='View', klass='issue', property=['title']).
>
> This may or may not make a difference. It's possible that the
> order I outlined above might be more efficient if all permisions
> with a check were moved after the permissions without
> checks. Otherwise the bucket order stays as above.

Yes. We should definitely check entries with a check function last.

Thanks for looking into this!

Ralf
msg8016 Author: [hidden] (rouilj) Date: 2024-04-29 17:40
Hi Ralf:

In message <1714388568.86.0.772511523784.issue2551330@roundup.psfhosted.org>,
Ralf Schlatterbeck writes:
>Ralf Schlatterbeck added the comment:
>
>On Mon, Apr 22, 2024 at 03:43:23PM +0000, John Rouillard wrote:
>> Ralf, do you have more thoughts on this?
>
>I had, yes, but was hit again by my reply being bounced. My reply was
> April 5th. Hmpf. My workflow here really depends on being able to
> reply to an email. That's what I wrote a month ago:

EWDurbin fixed this on April 15th.

  https://github.com/python/psf-salt/issues/352

Email has been working for me since.

>On Fri, Apr 05, 2024 at 02:25:49AM +0000, John Rouillard wrote:
>> > My proposal would be to group permissions by role (so that we
>> > need to check only the permissions where we know the user has
>> > that role) *and* order the easy ones (without a check method)
>> > first. That way we can - Check only the permissions for the
>> > roles we have
>>
>> How would that work? Would you need to replace a
>> hasPermission('View', 'issue' ...) in the core code with a
>> hasRole() check? If so how would the code know what Role to use?
>> The same permission could be in multiple roles.
>
>No the hasPermission would stay. It just would check the permissions in
>an order that has least effort first. And it would not check permissions
>for roles the user doesn't have.

I'm still not getting it.

>> Also I wonder (see below) if the permissions are grouped by role
>> by default.
>
>Yes, the implementation would store permissions grouped by role (e.g. in
>a dict). In each dict we would have containers of permissions.
>Permissions that are for the whole Class (not individual properties of a
>class) *without* check methods would be in the first container, followed
>by permissions on individual attributes (without check methods) followed
>by everything with check methods, probably in different containers. So
>we could check for each role the first entry (for the whole Class
>without check methods), then each role the second entry, etc. The search
>-- as it is now -- terminates when a matching entry is found.

Hang on. By 'role' do you mean Permission like "View" or Role as in
the user class: Admin, User, Anonymous? I am interpreting Role as
Admin ....

>> This is my understanding of how things work, do you agree?
>>
>> 1) user authenticates
>> 2) user's roles are looked up
>> 3) security object is populated with the permissions associated
>>    with the user's roles
>> 4) hasPermission(X) calls are made to see if the user can do X
>
>Hmm, yes, the 3) will build data structures that can be more effectively
>searched, see above.

Ok, but there is no role lookup (Admin, Anonymous) at any time in
4. Roles just don't exist there.

>> AFAIK access is denied only if no permission matches. A
>> permission can't return a "deny", it only says "I don't
>> match". Is that your understanding as well?
>
>Yes.
>
>> I can see a way to do some of this if we define relationships
>> like:
>>
>>   owner
>>   actor
>>
>> which are done via check commands currently. These map to a
>> database check on 'creator' and 'actor'. But expressing a
>> permission like:
>>
>>   If the issue is marked private the user needs to be on one of
>>      two access lists (nosy and verynosy in my case) to see the
>>      issue
>>
>> requires a like check for the id number in the two fields. Not
>> insurmountable. But you would need to define some sort of method
>> to implement this using a join (maybe, my SQL isn't great).
>
>Yes, the owner checks are easy (but also require a join I think) and
>other checks may be more challenging or even impossible. I think the
>challenge is to come up with a user-understandable formulation of
>sql-checkable permissions. So that permissions can be defined in a way
>that the user understands *and* that can be translated to an SQL query.

This has to be an optimization unless we want to drop support for
non-sql databases. I don't like dropping support for non-sql databases
from the hyperdb.

>> Since no permission can cause a 'deny' action, I think sorting it
>> is safe, but I am not willing to put money on that.
>
>The order currently is arbitrary and depends on the order of definition.
>I think we would have seen a lot of bugs if this was order-dependent.
>
>So, yes, I would put some money on that :-)

Good.

>> Adding grouping by class (or None for tracker level permissions)
>> to the existing grouping by permission name I think is a
>> win. Then sorting within those buckets by:
>>
>>    no properties assigned no check function
>>    no properties assigned check function
>>    properties assigned and properties defined
>>       without prop_only no check
>>    properties assigned and properties defined
>>       without prop_only with check
>>    properties assigned and defined with prop_only no check function
>>    properties assigned and defined with prop_only and check function
>
>What do you mean be
> properties assigned and properties defined without prop_only
>etc.?

If I have a rule "View access to username in User class" it implicitly
allows "View User". So:

   db.security.addPermission(name='View', klass='user',
    properties=('activity', 'id', 'organisation', 'phone', 'realname',
            'timezone', 'username'),
            description="User can have limited view of other users",
            props_only=False)

causes hasPermission("View", "1", "user") to return true. This is a
case of "no prop_only". The permission:

   db.security.addPermission(name='View', klass='user',
    properties=('activity', 'id', 'organisation', 'phone', 'realname',
            'timezone', 'username'),
            description="User can have limited view of other users",
            props_only=True)

would not satisfy hasPermission("View", "1", "user"). It would satisfy
hasPermission("View", "1", "user", property="phone"). This last is an
example of a permission with prop_only.

>I would put the ones with check functions last. See above where I
>outlined this. The check functions are the problematic part concerning
>performance.

Agreed.

>> might be faster. I think this ordering can be expressed using a tuple
>> with one position for each criteria. So a permission list item for the
>> first bucket is:
>>
>>    (properties=0, check=0, prop-only=0, permission object)
>>
>> and the last bucket is
>>
>>    (properties=1, check=1, prop-only=1, permission object)
>>
>> So sorting should be doable.
>
>I agree the principle with buckets, just the order of the checks would
>be different checking permissions with check functions last.

Fair enough. IIRC, the check functions aren't called unless there is
an exact match. So check function is called for:

  hasPermission("View", "1", "keyword")

given:

  db.security.addPermission(name='View', klass='keyword',
      check=own_keyword,
      description="User is allowed to edit what they created")

but not for:

  db.security.addPermission(name='View', klass='keyword',
      check=own_keyword, properties="name")

because the check doesn't specify any properties (but still allows
hasPermission to pass). I am a little hazy on this but I think I'm
correct. I think something like this was what led to adding
props_only.

>> I am not sure what the default order of permissions is. I think the
>> order of permissions assigned to a role is in the order it's assigned
>> in schema.py. I would expect the order for a user with roles User,
>> Role1, Role2 would be:
>>
>>   User permissions in order they were added in schema.py
>>   Role1 permissions in order they were added in schema.py
>>   Role2 permissions in order they were added in schema.py
>>
>> Reordering into buckets as above could help.
>
>Yes to both, order is order of definition in schema.py and reordering
>could help :-)

As long as we preserve the ordering that sounds good to me.

>> > > Sorting the permissions can help as well. I am concerned that
>> > > sorting could result in changing the applied permissions from
>> > > the unsorted state. AFAIK, unlike detectors permissions don't
>> > > have a method to set ordering which plays a role in what
>> > > permission is expressed.
>> >
>> > I do not think the semantics change. If a person has multiple
>> > permissions on a Class (and/or properties) this should be
>> > equivalent to just one permission.
>>
>> I think you are correct but it's because a permission can't result
>> in a deny. It can only result in a "I don't match" so the
>> permission checker moves on to the next item.
>
>Yes. It iterates over permissions until a match or until end of
>permissions.

Understood.

>> Permissions defined with properties (e.g. View) match a request
>> for View permissions (without properties) on the class. So:
>>
>>   havePermission('View', 'issue')
>>
>> matches:
>>
>>   addPermission(name='View', klass='issue', property=['title']).
>>
>> This may or may not make a difference. It's possible that the
>> order I outlined above might be more efficient if all permisions
>> with a check were moved after the permissions without
>> checks. Otherwise the bucket order stays as above.
>
>Yes. We should definitely check entries with a check function last.

As I noted above I think a permission with a check can still cause a
check to pass even without running the check.

I still find:

   db.security.addPermission(name='View', klass='user',
    properties=('activity', 'id', 'organisation', 'phone', 'realname',
            'timezone', 'username'),
            description="User can have limited view of other users")

matching hasPermission("View", "1", "user") unexpected. It makes sense
that I would need View rights on user (traversal rights) to access the
phone but...

>Thanks for looking into this!

Yeah something to look at for 2.5. I think improving the permission
data structure is doable. I am less optimistic about being able to
push this down to SQL.
msg8017 Author: [hidden] (schlatterbeck) Date: 2024-04-29 18:24
On Mon, Apr 29, 2024 at 05:40:48PM +0000, John Rouillard wrote:
> >I had, yes, but was hit again by my reply being bounced. My reply was
> > April 5th. Hmpf. My workflow here really depends on being able to
> > reply to an email. That's what I wrote a month ago:
> 
> EWDurbin fixed this on April 15th.
...
> Email has been working for me since.

Great, thanks for pushing this!

> >> Also I wonder (see below) if the permissions are grouped by role
> >> by default.
> >
> >Yes, the implementation would store permissions grouped by role (e.g. in
> >a dict). In each dict we would have containers of permissions.
> >Permissions that are for the whole Class (not individual properties of a
> >class) *without* check methods would be in the first container, followed
> >by permissions on individual attributes (without check methods) followed
> >by everything with check methods, probably in different containers. So
> >we could check for each role the first entry (for the whole Class
> >without check methods), then each role the second entry, etc. The search
> >-- as it is now -- terminates when a matching entry is found.
> 
> Hang on. By 'role' do you mean Permission like "View" or Role as in
> the user class: Admin, User, Anonymous? I am interpreting Role as
> Admin ....

Yes role in the user class. My use-cases involve lots of roles.
And we're seeing large search times even for users that have full
permission on a class because apparently permissions of other roles
(which the current user *also* has) are checked first. So the idea would
be to group permission by role.

> 
> >> This is my understanding of how things work, do you agree?
> >>
> >> 1) user authenticates
> >> 2) user's roles are looked up
> >> 3) security object is populated with the permissions associated
> >>    with the user's roles
> >> 4) hasPermission(X) calls are made to see if the user can do X
> >
> >Hmm, yes, the 3) will build data structures that can be more effectively
> >searched, see above.
> 
> Ok, but there is no role lookup (Admin, Anonymous) at any time in
> 4. Roles just don't exist there.

They do, from roundup/security.py:

    def hasPermission(self, permission, userid, classname=None,
                      property=None, itemid=None):
        '''Look through all the Roles, and hence Permissions, and
           see if "permission" exists given the constraints of
           classname, property, itemid, and props_only.
[...]
        for rolename in self.db.user.get_roles(userid):
            if not rolename or (rolename not in self.role):
                continue
            # for each of the user's Roles, check the permissions
            for perm in self.role[rolename].permissions:

I propose to change the order of that iterations:
- First loop over the 'easy' targets, for each role check only the
  permissions without a permission method
Only if nothing is found:
- Loop over roles again this time using the costly checks *with* check
  methods

> >Yes, the owner checks are easy (but also require a join I think) and
> >other checks may be more challenging or even impossible. I think the
> >challenge is to come up with a user-understandable formulation of
> >sql-checkable permissions. So that permissions can be defined in a way
> >that the user understands *and* that can be translated to an SQL query.
> 
> This has to be an optimization unless we want to drop support for
> non-sql databases. I don't like dropping support for non-sql databases
> from the hyperdb.

I want them *in addition* to the current methods.
And if we come up with a suitable formulation this can be implemented
for non-sql backends, too. But for non-sql databases there is probably
no performance improvement.

> >What do you mean be
> > properties assigned and properties defined without prop_only
> >etc.?
> 
> If I have a rule "View access to username in User class" it implicitly
> allows "View User". So:
> 
>    db.security.addPermission(name='View', klass='user',
>     properties=('activity', 'id', 'organisation', 'phone', 'realname',
>             'timezone', 'username'),
>             description="User can have limited view of other users",
>             props_only=False)
> 
> causes hasPermission("View", "1", "user") to return true. This is a
> case of "no prop_only". The permission:
> 
>    db.security.addPermission(name='View', klass='user',
>     properties=('activity', 'id', 'organisation', 'phone', 'realname',
>             'timezone', 'username'),
>             description="User can have limited view of other users",
>             props_only=True)
> 
> would not satisfy hasPermission("View", "1", "user"). It would satisfy
> hasPermission("View", "1", "user", property="phone"). This last is an
> example of a permission with prop_only.

Thanks I forgot we had that additional prop_only flag.

> >I agree the principle with buckets, just the order of the checks would
> >be different checking permissions with check functions last.
> 
> Fair enough. IIRC, the check functions aren't called unless there is
> an exact match. So check function is called for:
> 
>   hasPermission("View", "1", "keyword")
> 
> given:
> 
>   db.security.addPermission(name='View', klass='keyword',
>       check=own_keyword,
>       description="User is allowed to edit what they created")
> 
> but not for:
> 
>   db.security.addPermission(name='View', klass='keyword',
>       check=own_keyword, properties="name")
> 
> because the check doesn't specify any properties (but still allows
> hasPermission to pass). I am a little hazy on this but I think I'm
> correct. I think something like this was what led to adding
> props_only.

Yes, this may be the case.
But I think when displaying the results of a search (e.g. in an index
template of the web interface) all properties are checked explicitly.

> >> I am not sure what the default order of permissions is. I think the
> >> order of permissions assigned to a role is in the order it's assigned
> >> in schema.py. I would expect the order for a user with roles User,
> >> Role1, Role2 would be:
> >>
> >>   User permissions in order they were added in schema.py
> >>   Role1 permissions in order they were added in schema.py
> >>   Role2 permissions in order they were added in schema.py
> >>
> >> Reordering into buckets as above could help.
> >
> >Yes to both, order is order of definition in schema.py and reordering
> >could help :-)
> 
> As long as we preserve the ordering that sounds good to me.

I do not think we need to preserve the ordering: If two permissions
(maybe even for different roles held by the same user) specify that a
user has access it doesn't matter which one triggers first.

> As I noted above I think a permission with a check can still cause a
> check to pass even without running the check.
> 
> I still find:
> 
>    db.security.addPermission(name='View', klass='user',
>     properties=('activity', 'id', 'organisation', 'phone', 'realname',
>             'timezone', 'username'),
>             description="User can have limited view of other users")
> 
> matching hasPermission("View", "1", "user") unexpected. It makes sense
> that I would need View rights on user (traversal rights) to access the
> phone but...

Yes, me too. I would call it a bug. And I *think* the props_only flag
was added for backwards compatibility when it was discovered that the
checks are un-untuitive (or should I say buggy). And I also think that
roundup core doesn't contain any permissions on properties without the
props_only flag, at least I think I remember this from the far past.

> Yeah something to look at for 2.5. I think improving the permission
> data structure is doable. I am less optimistic about being able to
> push this down to SQL.

Yes I think so, too. If I had an idea of how permission with check
methods could be specified more SQL-friendly I would already have
proposed it :-)

Thanks
Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg8128 Author: [hidden] (schlatterbeck) Date: 2024-10-15 16:37
Concerning speed-up via SQL:
The main speed problem is for index templates: After the SQL has returned it will check all issues
*in python* for visibility. And I think for every issue all permissions are checked again in the same
order.
There are fast permission checks that pass almost instantly like when the user has access to the whole class.
There are slow permission checks that require execution of a check method in python. A permission check method in python might make further queries slowing down things even more.
The idea outlined in earlier messages of sorting permissions by speed (i.e. the ones with check
methods would be sorted last) would fix the problem for users having permissions without a check
method because the first check that passes terminates the permission check.

For users *only* having permissions that require execution of a check-method we could speed this up if
- we can formulate a fast permission check (either in SQL or in something similar to the current
  query language of the REST API), this would be ANDed in SQL with the generated user query. If
  such a method is present we would *not* filter issues in python but instead rely on SQL to return
  only visible issues
- A permission on the whole class without a check method would have a trivial "True" here
- This would be *in addition* to the normal visibility checks but after retrieval of the list of
  issues from SQL we would *not* run checks in python to determine which issues to show
- If there are several permissions with check methods this would only work if all of them have the
  fast permission check set. The fast permission checks for several issues would need to be ORed in
  SQL.

The reason why this would speed up things is that we do a first visibility check in SQL. Only later will we check each issue for visibility but this will occur only for a handful of issues displayed in one batch of the index template.

Let me know what you think...
msg8129 Author: [hidden] (rouilj) Date: 2024-10-17 01:38
Hi Ralf:

In message <1729010267.65.0.710188892934.issue2551330@roundup.psfhosted.org>,
Ralf Schlatterbeck writes:
>Concerning speed-up via SQL:
>The main speed problem is for index templates: After the SQL has
>returned it will check all issues *in python* for visibility. And I
>think for every issue all permissions are checked again in the same
>order.

I agree with this. What the permission order is, I am not quite
sure. I think it's the same order that the permissions are assigned in
schema.py.

So for an issue index, you are scanning and discarding checks for the
user, keyword ... classes even though you only care about issues.

>There are fast permission checks that pass almost instantly like when
>the user has access to the whole class.  There are slow permission
>checks that require execution of a check method in python. A
>permission check method in python might make further queries slowing
>down things even more.  The idea outlined in earlier messages of
>sorting permissions by speed (i.e. the ones with check methods would
>be sorted last) would fix the problem for users having permissions
>without a check method because the first check that passes terminates
>the permission check.

I think this is right. IIRC the evaluation algorithm is:

   if permission test passes, return allow
   if permission test fails, go to next permission
   if no more permissions available return deny

If you can satify yourself that this is correct, then I claim order
doesn't matter.

(Note search operations check against both view and search
permissions, which I think is unique to have one operation check two
different permissions types. This might complicate things.)

>For users *only* having permissions that require execution of a
>check-method we could speed this up if
>- we can formulate a fast permission check (either in SQL or in
>  something similar to the current query language of the REST API),
>  this would be ANDed in SQL with the generated user query. If such a
>  method is present we would *not* filter issues in python but instead
>  rely on SQL to return only visible issues

So the check method would return SQL that can be pushed to the DB level?

Would these examples match what you are proposing?

   if userid == issue.creator allow

   results in SQL query:

      AND issue.creator == `<user id>`

  if creator has manager role, allow:

  results in python query (or subselect)

      manager_set = select id where user.role == `manager`

  with
       AND issue.creator in {manager set}

>  - A permission on the whole class without a check method would have a
>    trivial "True" here

and could be removed from the SQL query.

>  - This would be *in addition* to the normal visibility checks but
>    after retrieval of the list of issues from SQL we would *not* run
>    checks in python to determine which issues to show

Right if we pushed a check down to the SQL db level, don't rerun it in
Python.

>  - If there are several permissions with check methods this would only
>    work if all of them have the fast permission check set. The fast
>    permission checks for several issues would need to be ORed in SQL.

Right so the row is returned if any one of the checks pass.

>The reason why this would speed up things is that we do a first
>visibility check in SQL. Only later will we check each issue for
>visibility but this will occur only for a handful of issues displayed
>in one batch of the index template.

The hope is that pushing permission rules into the SQL will return
fewer rows that need to be filtered in Python.

I like the idea. A few questions about your tracker:

  1) how many permissions are tested in python before a row is
     allowed? (similar stats for denied would also be good)

  2) how many permissions for the wrong class are tested before a row
     is allowed/denied?

  3) how many permissions of the wrong type (e.g. view vs edit) are
     tested before a row is allowed/denied.

If 2 is 50% less than 1, we can restructure the permissions from

   permissions = [(permission tuple)+]

to

   permissions = { 'klass': [(permission tuple)+], 'klass2': ...}

This is a win regardless of the back end.

If 3 is also some large percentage of 1, maybe make permissions:

   permissions = { 'klass': { 'perm_name': [(permission tuple)+], 'pn2': ...},
                   'klass2': ...}

I would suggest finding this data first before we start pushing down
to the SQL level. I was going to add capturing:

  4) how many rows are returned from the db that are discarded by
     python filtering 

however if we have allow and deny stats for 1-3 I think we can
calculate 4.

Also if we agree that sort order doesn't matter, sorting (probably at
insertion time) in the order:

  only klass/name specified (includes klass = None (e.g. 'Web Access'))
  properties defined (with props_only=true or false, unordered by props_only)
  check defined

for each [(permission tuple)+] should help as well.

Also memoizing hasPermission() might help as well. If a call to
hasPermission doesn't include the itemid, we might be able to reduce
dozens of calls to a simple lookup.

Also you may have explained this, but I'm not recognizing it from
earlier in this ticket. The user's permissions only include
permissions defined for that user's roles. So why would we use Roles
as a selector (permissions['Role']['class']['perm_name']

Also another thought about including roles in the selection
structure. If role A and role B overlap such that both of them have
the same permission, does the permissions structure get two copies of
the same permission? If so then adding a filter/iterator over Roles
may or may not make sense. If the same permission was added by Role A
and one by Role B so we care which permission is used to allow the
item?

Personally I would prefer that the duplicate rule not be added at all,
but that seems difficult to determine and is not low lying fruit.

Also if somebody has 30 roles, adding roles to the tuple could mean 29
lookups and possible iterations over the permissions tuple list. Can
the admin define the order that Roles are returned? I seem to recall
the roles are returned in alphabetic sorted order but....

Thanks for prompting me on this.
msg8130 Author: [hidden] (rouilj) Date: 2024-10-17 03:32
Hi Ralf:

I did a quick instrumentation of hasPermission in my custom tracker.
I displayed an issue index page logged in as admin with Admin role.

The index page has the following columns:

  ID, Activity, Title, Status, Creator, Assigned To

and groups by Priority. Using:

  grep hasPermission roundup.log| sed 's/^.*DEBUG: //' | sort | uniq -c | sort -n

I see 1241 calls of which 533 are not issue class calls. One in
particular stands out. Stats are:

     (name, userid, class, property, itemid) (roles scanned, perms scanned):

    426 hasPermission: (Web Access, 1, None, None, None) (1, 6) allow

So this check alone is 2556 permission evaulations. Just removing that
from the permissions list gets rid of 5*426 permission checks and 30%
of the calls to hasPermissions. Memoizing this would be a big win if
we can't eliminate/cache the 'Web Access' call at a higher level.

I have 15 sets of other non-issue calls that look like:

      1 hasPermission: (View, 1, query, klass, 4) (1, 5) allow
      1 hasPermission: (View, 1, query, url, 4) (1, 5) allow
      3 hasPermission: (View, 1, query, name, 4) (1, 5) allow

which makes sense as they are queries showing up in my sidebar. (Not
quite sure about the origin for the klass property check). I also have
the following 12 times:

      1 hasPermission: (View, 1, keyword, None, 10) (1, 5) allow

one for each keyword I own. Again not sure where that is triggered.

Then some randoms:

      1 hasPermission: (Create, 1, user, None, None) (1, 1) allow

Create user link

      2 hasPermission: (Create, 1, keyword, None, None) (1, 1) allow

display the create keyword link inefficiently (the second call should
be replaced by a predefined template variable 8-)).

      1 hasPermission: (Edit, 1, None, None, None) (1, 2) allow
      1 hasPermission: (Edit, 1, keyword, None, None) (1, 2) allow

display the edit existing keyword link (db.keyword.list is called as
well which is probably the trigger for the keyword prop check, if I
don't own any keywords I can't edit them)

      1 hasPermission: (Register, 1, user, None, None) (1, 14) deny

display the register link if the user has permission.

      1 hasPermission: (View, 1, user, username, 1) (1, 5) allow

to display my login name (hello admin)

      2 hasPermission: (View, 1, user, None, None) (1, 5) allow

user admin links.

      2 hasPermission: (View, 1, user, theme, 1) (1, 5) allow

loading a theme for the tracker.

Then we have:

      4 hasPermission: (Create, 1, issue, None, None) (1, 1) allow

I am not sure about the origin. I only have two create issue
links in the template. Also:

      4 hasPermission: (View, 1, issue, None, None) (1, 5) allow

not sure why there are 4.

Also this seems to indicate that View permissions are lower in the
permission list than create or edit. It seems they are satisfied on
the fifth permission check. But View also get used more often. So
there may be some additional optimization we can do there to reduce
the number of permissions checked to get an allow response.

Then for each issue (50 total) it looks like:

      1 hasPermission: (View, 1, issue, activity, 210) (1, 5) allow
      1 hasPermission: (View, 1, issue, assignedto, 210) (1, 5) allow
      1 hasPermission: (View, 1, issue, creator, 210) (1, 5) allow
      1 hasPermission: (View, 1, issue, priority, 210) (1, 5) allow
      1 hasPermission: (View, 1, issue, title, 210) (1, 5) allow
  (*) 2 hasPermission: (View, 1, issue, None, 210) (1, 5) allow
      2 hasPermission: (View, 1, issue, status, 210) (1, 5) allow

The * log is present for every issue even those that are NOT
displayed. I only show 50 of the 173 displayed.

So this is another thing to look at. We fetch and check perms on a lot
more items for an issue index even if they aren't displayed.  Since
none of these rows are displayed, only issue item checks without
property permission checks are done. I wonder if this is a major cause
of your slowdown?

Maybe we can stop checking permissions once we have page_size number
of rows?

We could also fetch fewer rows similar to the hack/patch discussed in
issue 2551328.But this patch uses the offset and page number to
calculate the database offset. If there are a lot of discarded
entries, the patch can refetch rows that were already shown in a prior
page. Also the count (issues 50 of 435) is not valid since the db
fetch may be smaller than the total number of permitted rows. I don't
have a good idea for addressing the potential miscount.
msg8131 Author: [hidden] (schlatterbeck) Date: 2024-10-17 11:26
On Thu, Oct 17, 2024 at 03:32:58AM +0000, John Rouillard wrote:
> 
> I did a quick instrumentation of hasPermission in my custom tracker.
> I displayed an issue index page logged in as admin with Admin role.

Cool, I'll try running this for my tracker.

Concerning memoization: 
In method 'batch' in roundup/cgi/templating.py we have:

    def batch(self, permission='View'):
        check = self._client.db.security.hasPermission
        ...
        # filter for visibility
        allowed = [id for id in klass.filter(matches, filterspec, sort, group)
                   if check(permission, userid, self.classname, itemid=id)]

So we're calling the check method for *each issue found* (or whatever
table we're currently searching).
I would argue that if we find any permission on the given 'klass' that
does not have a check method, we do not need to do this: The check will
always tell us that we can see at least one property of *each* item in
the klass. So we do not need to filter and consequently do not need to
loop over all issues found when we establish that fact before the loop.

When rendering the HTML, the check is performed again for each property.
If the property should not be shown it is rendered as '[Hidden]'.

So no need for memoization.

If we find only permissions *with* a check method we *need* to check
*each of them* in the loop. And we cannot memoize things because the
check is different for each item found.

Now my second proposal comes into play: If we have check methods with
SQL (or some other mechanism that ultimately compiles to SQL) we can
*OR* all the SQL for each check method and put that with an AND into the
[sql generated by the] filter call. Then the loop would again only
return valid items and again we would not have to do it in python in the
loop.

Of course this applies only to View permissions. And only when
retrieving a list of issues (or other classes), either in the batch
method cited above or in the REST (and probably XMLRPC) api.

Note that I think the performance hog lies only in searching, not in
other permission checks like creation or editing. And it probably
applies only to 'View' permissions (I don't think batch is ever called
with an explicit permission other than 'View')

Thanks for looking into this
Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg8132 Author: [hidden] (rouilj) Date: 2024-10-17 14:14
Hi Ralf:

In message <20241017112611.hn2rkqcg5ez2kcbo@runtux.com>,
Ralf Schlatterbeck writes:
>On Thu, Oct 17, 2024 at 03:32:58AM +0000, John Rouillard wrote:
>> 
>> I did a quick instrumentation of hasPermission in my custom tracker.
>> I displayed an issue index page logged in as admin with Admin role.
>
>Cool, I'll try running this for my tracker.

Thanks.

>Concerning memoization: 
>In method 'batch' in roundup/cgi/templating.py we have:
>
>    def batch(self, permission='View'):
>        check = self._client.db.security.hasPermission
>        ...
>        # filter for visibility
>        allowed = [id for id in klass.filter(matches, filterspec, sort, group)
>                   if check(permission, userid, self.classname, itemid=id)]
>
>So we're calling the check method for *each issue found* (or whatever
>table we're currently searching).

Correct. In my followup it looks like this would correspond to the

   2 hasPermission: (View, 1, issue, None, 210) (1, 5) allow

logs lines. That would also seem to be the point where every item
(even those that willl never be displayed because they are larger than
the batch size) is included.

>I would argue that if we find any permission on the given 'klass' that
>does not have a check method, we do not need to do this: The check will
>always tell us that we can see at least one property of *each* item in
>the klass. So we do not need to filter and consequently do not need to
>loop over all issues found when we establish that fact before the loop.

Your logic sounds right. The itemid AFAIK is only used in a check method.

Would the following code (properly indented) do the trick:

    if check(permission, userid, self.classname):
      allowed = [id for id in klass.filter(matches, filterspec, sort, group)]
    else:
      allowed = [id for id in klass.filter(matches, filterspec, sort, group)
                 if check(permission, userid, self.classname, itemid=id)]

The check(permission, userid, self.classname) should fail if all
permissions for 'permission' have a check command or properties with a
props_only=true.

>When rendering the HTML, the check is performed again for each property.
>If the property should not be shown it is rendered as '[Hidden]'.
>
>So no need for memoization.

I did a fast memoization change (patch attached) for hasPermission if
the check command and item command are missing. That at least seems to
be used for the 'Web Access' check in templating.py
HTMLClass::is_X_ok() where X is view, edit, restore ... and in the
HTMLClass::list() method.

>If we find only permissions *with* a check method we *need* to check
>*each of them* in the loop. And we cannot memoize things because the
>check is different for each item found.

Correct. Also it looks like the item check is done only once in most
cases so memoization of that check wouldn't help.

>Now my second proposal comes into play: If we have check methods with
>SQL (or some other mechanism that ultimately compiles to SQL) we can
>*OR* all the SQL for each check method and put that with an AND into the
>[sql generated by the] filter call. Then the loop would again only
>return valid items and again we would not have to do it in python in the
>loop.

We still need to figure out a method to skip the python permission
checks only for those checks that are pushed down to the DB.

>Of course this applies only to View permissions. And only when
>retrieving a list of issues (or other classes), either in the batch
>method cited above or in the REST (and probably XMLRPC) api.

Agreed.

>Note that I think the performance hog lies only in searching, not in
>other permission checks like creation or editing.

By searching do you mean finding issues to display on an index page?

Also I have a spreadsheet like editing interface for issues where I
mix an index with edit checks. So it might not always be a View
permission.

>And it probably
>applies only to 'View' permissions (I don't think batch is ever called
>with an explicit permission other than 'View')

I'm not sure that batch is only called with view permissions. A quick
grep for 'batch(' doesn't return any calls to batch() in the code. I
assume it's triggered from templates. Is there any reason to assume
that a template can't generate a batch(..., permission="edit") call?

If View is the only permission used in a batch call, why does the
batch() method have a permission parameter?

Thoughts?
msg8133 Author: [hidden] (schlatterbeck) Date: 2024-10-17 14:54
On Thu, Oct 17, 2024 at 03:32:58AM +0000, John Rouillard wrote:
> 
> I did a quick instrumentation of hasPermission in my custom tracker.
> I displayed an issue index page logged in as admin with Admin role.

I'm getting things like this:

roundup.hyperdb - DEBUG - SQL 'select _time_project.id,lower(_time_project._name),(lower(_time_project._name) is not NULL) from _time_project   where _time_project.__retired__=0 order by (lower(_time_project._name) is not NULL),lower(_time_project._name),_time_project.id' ()
roundup.security - DEBUG - hasPermission: (View, 44, time_project, None, 762) (1, 5) allow
roundup.security - DEBUG - hasPermission: (View, 44, time_project, None, 199) (1, 5) allow
roundup.security - DEBUG - hasPermission: (View, 44, time_project, None, 664) (1, 5) allow
[250 more lines like this]

roundup.hyperdb - DEBUG - SQL 'select _purchase_request.id,_purchase_request._delivery_deadline,(_purchase_request._delivery_deadline is not NULL) from _purchase_request   where _purchase_request.__retired__=0 order by (_purchase_request._delivery_deadline is not NULL) desc,_purchase_request._delivery_deadline desc,_purchase_request.id' ()
roundup.security - DEBUG - hasPermission: (View, 44, purchase_request, None, 6381) (1, 5) allow
roundup.security - DEBUG - hasPermission: (View, 44, purchase_request, None, 6398) (1, 5) allow
[> 6k lines like this]

This is a user with many permissions. For another user I'm getting many of
roundup.security - DEBUG - hasPermission: (View, 833, purchase_request, None, 5946) (2, 99) deny

But this is intermixed with other SQL queries, probably because the check methods run
other sql checks.

The current problem is that users without many permissions run into uwsgi-configured
timeouts of 10 minutes for about 3000 lines.

More verbose during the purchase_request query:
roundup.hyperdb - DEBUG - SQL 'select _activity,_actor,_charge_to,_continuous_obligation,_contract_term,_creation,_creator,_date_approved,_date_ordered,_date_progress,_delivery_address,_delivery_deadline,_department,_frame_purchase,_frame_purchase_end,_gl_account,_infosec_level,_infosec_project,_intended_duration,_internal_order,_issue_ids,_organisation,_part_of_budget,_payment_type,_pr_currency,_pr_ext_resource,_pr_justification,_pr_risks,_psp_element,_purchase_risk_type,_purchase_type,_renegotiations,_renew_until,_requester,_responsible,_safety_critical,_sap_cc,_sap_reference,_status,_termination_date,_terms_conditions,_time_project,_title,_total_cost from _purchase_request where id=%s' ('5946',)
roundup.hyperdb - DEBUG - SQL 'select _o_permission.id from _o_permission   where _o_permission._user=%s and _o_permission.__retired__=0 order by _o_permission.id' ('833',)
roundup.hyperdb - DEBUG - SQL 'select _o_permission.id from _o_permission   where _o_permission._user=%s and _o_permission.__retired__=0 order by _o_permission.id' ('833',)
roundup.hyperdb - DEBUG - SQL 'select _pr_approval.id from _pr_approval   where _pr_approval._purchase_request=%s and _pr_approval.__retired__=0 order by _pr_approval.id' ('5946',)
roundup.hyperdb - DEBUG - SQL 'select _activity,_actor,_by,_creation,_creator,_date,_deputy,_deputy_gets_mail,_description,_msg,_order,_purchase_request,_role,_role_id,_status,_user from _pr_approval where id=%s' ('28897',)
roundup.hyperdb - DEBUG - SQL 'select linkid from purchase_request_nosy where nodeid=%s' ('5946',)
roundup.hyperdb - DEBUG - SQL 'select id from _pr_status where _name=%s and __retired__=%s' ('open', 0)
roundup.security - DEBUG - hasPermission: (View, 833, purchase_request, None, 5946) (2, 99) deny
roundup.hyperdb - DEBUG - SQL 'select _activity,_actor,_charge_to,_continuous_obligation,_contract_term,_creation,_creator,_date_approved,_date_ordered,_date_progress,_delivery_address,_delivery_deadline,_department,_frame_purchase,_frame_purchase_end,_gl_account,_infosec_level,_infosec_project,_intended_duration,_internal_order,_issue_ids,_organisation,_part_of_budget,_payment_type,_pr_currency,_pr_ext_resource,_pr_justification,_pr_risks,_psp_element,_purchase_risk_type,_purchase_type,_renegotiations,_renew_until,_requester,_responsible,_safety_critical,_sap_cc,_sap_reference,_status,_termination_date,_terms_conditions,_time_project,_title,_total_cost from _purchase_request where id=%s' ('5947',)
roundup.hyperdb - DEBUG - SQL 'select _o_permission.id from _o_permission   where _o_permission._user=%s and _o_permission.__retired__=0 order by _o_permission.id' ('833',)
roundup.hyperdb - DEBUG - SQL 'select _activity,_actor,_allow_gl_account,_confidential,_creation,_creator,_description,_name,_order,_valid from _purchase_type where id=%s' ('6',)
roundup.hyperdb - DEBUG - SQL 'select linkid from purchase_type_pr_edit_roles where nodeid=%s' ('6',)
roundup.hyperdb - DEBUG - SQL 'select linkid from purchase_type_pr_roles where nodeid=%s' ('6',)
roundup.hyperdb - DEBUG - SQL 'select linkid from purchase_type_pr_forced_roles where nodeid=%s' ('6',)
roundup.hyperdb - DEBUG - SQL 'select _o_permission.id from _o_permission   where _o_permission._user=%s and _o_permission.__retired__=0 order by _o_permission.id' ('833',)
roundup.hyperdb - DEBUG - SQL 'select linkid from purchase_type_pr_view_roles where nodeid=%s' ('6',)
roundup.hyperdb - DEBUG - SQL 'select _pr_approval.id from _pr_approval   where _pr_approval._purchase_request=%s and _pr_approval.__retired__=0 order by _pr_approval.id' ('5947',)
roundup.hyperdb - DEBUG - SQL 'select _activity,_actor,_by,_creation,_creator,_date,_deputy,_deputy_gets_mail,_description,_msg,_order,_purchase_request,_role,_role_id,_status,_user from _pr_approval where id=%s' ('28898',)
roundup.hyperdb - DEBUG - SQL 'select linkid from purchase_request_nosy where nodeid=%s' ('5947',)
roundup.hyperdb - DEBUG - SQL 'select id from _pr_status where _name=%s and __retired__=%s' ('open', 0)
roundup.hyperdb - DEBUG - SQL 'select _o_permission.id from _o_permission   where _o_permission._user=%s and _o_permission.__retired__=0 order by _o_permission.id' ('833',)
roundup.hyperdb - DEBUG - SQL 'select _o_permission.id from _o_permission   where _o_permission._user=%s and _o_permission.__retired__=0 order by _o_permission.id' ('833',)
roundup.security - DEBUG - hasPermission: (View, 833, purchase_request, None, 5947) (2, 99) deny

[repeated a lot of times]

Thanks
Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg8134 Author: [hidden] (schlatterbeck) Date: 2024-10-17 15:32
On Thu, Oct 17, 2024 at 02:14:27PM +0000, John Rouillard wrote:
> >I would argue that if we find any permission on the given 'klass' that
> >does not have a check method, we do not need to do this: The check will
> >always tell us that we can see at least one property of *each* item in
> >the klass. So we do not need to filter and consequently do not need to
> >loop over all issues found when we establish that fact before the loop.
> 
> Your logic sounds right. The itemid AFAIK is only used in a check method.
> 
> Would the following code (properly indented) do the trick:
> 
>     if check(permission, userid, self.classname):
>       allowed = [id for id in klass.filter(matches, filterspec, sort, group)]
>     else:
>       allowed = [id for id in klass.filter(matches, filterspec, sort, group)
>                  if check(permission, userid, self.classname, itemid=id)]

Hmm, I think for the case that where we find a check on the whole class
we would simply return the result of filter without iterating, i.e.

ok = find_a_check_on_whole_class (permission, userid, self.classname)
if ok:
    allowed = klass.filter(matches, filterspec, sort, group)
elif check(permission, userid, self.classname):
    allowed = [id for id in klass.filter(matches, filterspec, sort, group)
                if check(permission, userid, self.classname, itemid=id)]

I think we need to add a method for getting a check on the whole class
to roundup/security.py

> The check(permission, userid, self.classname) should fail if all
> permissions for 'permission' have a check command or properties with a
> props_only=true.

Yes, should do the trick. But 'check' at that point in the code is the
hasPermission check, see some lines above:

check = self._client.db.security.hasPermission

So we want to group the checks into two categories, the ones with and
the others without a check method. At the point of filtering we would
completely ignore the ones with props_only=True. And we would first
check the ones in the first category and not filter at all in python
when we find any that apply.

> >When rendering the HTML, the check is performed again for each property.
> >If the property should not be shown it is rendered as '[Hidden]'.
> >
> >So no need for memoization.
> 
> I did a fast memoization change (patch attached) for hasPermission if
> the check command and item command are missing. That at least seems to
> be used for the 'Web Access' check in templating.py
> HTMLClass::is_X_ok() where X is view, edit, restore ... and in the
> HTMLClass::list() method.

I don't care much about HTMLClass: My templates edit only a handful of
things at a time. This may be different for your use-case with an
Excel-like editing form.

> >Now my second proposal comes into play: If we have check methods with
> >SQL (or some other mechanism that ultimately compiles to SQL) we can
> >*OR* all the SQL for each check method and put that with an AND into the
> >[sql generated by the] filter call. Then the loop would again only
> >return valid items and again we would not have to do it in python in the
> >loop.
> 
> We still need to figure out a method to skip the python permission
> checks only for those checks that are pushed down to the DB.

Yes.

> >Note that I think the performance hog lies only in searching, not in
> >other permission checks like creation or editing.
> 
> By searching do you mean finding issues to display on an index page?

Yes, see my log in the other mail: the permission checks are
interleaved with a *lot* of SQL.

> Also I have a spreadsheet like editing interface for issues where I
> mix an index with edit checks. So it might not always be a View
> permission.

Yes, how many do you have on a page?
Because I think *building* the page and selecting what to display may
be the performance point.

> 
> >And it probably
> >applies only to 'View' permissions (I don't think batch is ever called
> >with an explicit permission other than 'View')
> 
> I'm not sure that batch is only called with view permissions. A quick
> grep for 'batch(' doesn't return any calls to batch() in the code. I
> assume it's triggered from templates. Is there any reason to assume
> that a template can't generate a batch(..., permission="edit") call?

Hmm, yes, it's probably called from templating.

> If View is the only permission used in a batch call, why does the
> batch() method have a permission parameter?

I don't know .-)
This comes from ZTUtils so it is inherited code that has seen a lot of
use-cases which probably are not all in roundup.

Thanks
Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg8136 Author: [hidden] (schlatterbeck) Date: 2024-10-18 13:20
Hi John,
I've started implementing this (on a separate branch).
But in the original I have failing tests. In particular the whole
liveserver stuff doesn't work for me. Some tests are skipped (or xfail)
but all the others fail.

Tests are still running as of this writing, more info will follow.
A first hunch is that one of the early failing tests (the first one
probably) leaves something behind that makes the other tests fail.

Have you seen something like this? The liveserver tests are new to me...

Thanks
Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg8137 Author: [hidden] (schlatterbeck) Date: 2024-10-18 13:34
On Fri, Oct 18, 2024 at 01:20:16PM +0000, Ralf Schlatterbeck wrote:
> 
> Tests are still running as of this writing, more info will follow.
> A first hunch is that one of the early failing tests (the first one
> probably) leaves something behind that makes the other tests fail.

Looks like the error message about port 9001 is misleading.
I've verified with netstat that nothing has port 9001 open.

XFAIL test/test_liveserver.py::BaseTestCases::test__generic_item_template_editbad
XFAIL test/test_liveserver.py::TestFeatureFlagCacheTrackerOff::test__generic_item_template_editbad
FAILED test/test_liveserver.py::BaseTestCases::test__generic_item_template_editok - OSError: Ports 9001-9001 are all already in use

My first guess was that the XFAIL might leave something behind but
running the test__generic_item_template_editok alone also fails with the
same message.

Thanks
Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg8138 Author: [hidden] (schlatterbeck) Date: 2024-10-18 13:40
On Fri, Oct 18, 2024 at 03:34:09PM +0200, Ralf Schlatterbeck wrote:
> On Fri, Oct 18, 2024 at 01:20:16PM +0000, Ralf Schlatterbeck wrote:
> > 
> > Tests are still running as of this writing, more info will follow.
> > A first hunch is that one of the early failing tests (the first one
> > probably) leaves something behind that makes the other tests fail.
> 
> Looks like the error message about port 9001 is misleading.
> I've verified with netstat that nothing has port 9001 open.

Oops, verified on wrong machine.
On that machine I'm running an uwsgi service.

Can we please make that port configurable? Looks like it is hardcoded
currently.

I cannot run the tests via an NFS mount on another machine because the
cleanup tries to delete an open file which results in a .nfs... file in
the test directory which makes the remove fail. Our tests cannot
currently be run on an NFS mount...

Thanks
Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg8139 Author: [hidden] (schlatterbeck) Date: 2024-10-18 14:59
I've pushed my changes to the new branch permission-performance.

This breaks the tests
XPASS test/test_liveserver.py::BaseTestCases::test__generic_item_template_editbad
XPASS test/test_liveserver.py::TestFeatureFlagCacheTrackerOff::test__generic_item_template_editbad
FAILED test/test_liveserver.py::BaseTestCases::test__generic_item_template_editok - AssertionError: b'done-cbb' not found in b'<!DOCTYPE HTML PUBLIC "-//W3C//D...
FAILED test/test_liveserver.py::TestFeatureFlagCacheTrackerOff::test__generic_item_template_editok - AssertionError: b'done-cbb' not found in b'<!DOCTYPE HTML PUBLIC "-//W3C//D...

So two tests now pass which are marked xfail while the reverse test (as
admin) fail.
It looks like the real test in
test__generic_item_template_editok
*does* work. But before the test for editability of status7 another test
is performed (where the string 'Hello...' is searched for) which fails
for admin. Since the test was sort-of broken before this needs further
investigation.

Note that I don't have tested postgres and mysql backends and no
indexer tests that need specific software installed.

Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg8140 Author: [hidden] (schlatterbeck) Date: 2024-10-18 16:26
Hi John,
I think I have an idea how to formulate the sql checks for better query
performance.
Let's implement this as a method that ultimately returns parameters of a
filter call of the class.

Then when computing a batch we can run the found IDs through another
filter call similar to

new_ids = cls.filter (found_ids, <other filter args computed by method>)

This of course would only work if *all* permissions with check method of
a class do have a method for computing filter args. The downside is that
we repeatedly pass a possibly huge set of IDs to SQL. The upside is that
it works also with non-sql backends.

We would have something like:

new_ids = set (new_ids) # from klass.filter(matches, filterspec, sort, group)
confirmed = set ()

# We *OR* all the result from filtermethods
for perm in permissions:
    result = filter (list (new_ids), *perm.filtermethod (...))
    new_ids = new_ids - result
    confirmed.update (result)
allowed = list (confirmed)

The permissions above would have to be retrieved from *all* roles the
current user has.

Kind regards
Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg8147 Author: [hidden] (schlatterbeck) Date: 2024-10-21 15:42
I had to modify this slightly: Since the 'filter' method cannot filter on OR clauses on individual properties (e.g. prop1=value1 OR prop2=value2), I'm now returning a list of filter arguments from a Permission.filter method. This also solves the case that the filter detects early on that the user should not have any permissions which happens in one or two of my new filter methods in my application.

Preliminary performance measure: A query that ran for about 10 Minutes before (and often longer depending on the user running into an uwsgi timeout of 10 minutes occasionally) now runs in 3 seconds. It still does a lot of filtering but this is in SQL now.
msg8149 Author: [hidden] (schlatterbeck) Date: 2024-10-21 16:27
The latest version is pushed to branch permission-performance.
msg8158 Author: [hidden] (rouilj) Date: 2024-10-22 17:08
Hi Ralf:

In message <20241021162755.4s6uag2wukr6va22@runtux.com>,
Ralf Schlatterbeck writes:
>The latest version is pushed to branch permission-performance.

I am just starting to look at these changes. What were you seeing
that prompted:

   changeset:   8131:8e9181dfc9fa
   Fix searchable checks in roundup/security.py

You moved the test:

        if self.check:
            return 0

from the last test in the searchable() method to before the property
checks. From the original method description, this seems correct.

Then you added a test:

      if self.properties is None:
          return 1

that used to be handled by fall through to the return 1 at the end of
the method. I think "not self._properties_dict[property]" would be
False for every property if self.properties is None.

Is this just an optimization to not hit the final return 1 in the
method?  If so is that final "return 1" ever hit? Maybe add a
log.warning to see if it is called and under what circumstances?

Also, I think a search permissions like:

  addPermission("Search", "user", properties=("username", "id"),
                check=somecheck)

would have returned 1 with the original code when called as:

   permission.searchable("Search", "user", "username")

But would return 0 with this changeset.

Is this correct? If so we need a section in upgrading.txt because this
is an incompatible change.

Also adding a test of this case to the test suite is needed. Maybe a
new testSearch in tests/test_security.py? It could be thrown in with
testTransitiveSearchPermissions, but that is already a large test.

Any idea why Search permissions can not have a check function? The
docstring for it indicates checks are not allowed. I don't see this
documented anywhere in the permissions section of
doc/references.txt. (Granted that section is kind of light on
details/examples.) No mention in customizing.txt either.

I don't have a use case for a Search permission with a check
command. However IIRC a View permission will use/run a check command
and a Search permission is called as a fallback in search scenarios
only if a View permission did not allow access.

I am sure there is more to follow.
msg8160 Author: [hidden] (rouilj) Date: 2024-10-23 14:58
Hi Ralf,

A thought occurred to me. Permissions are some of the most complex things in Roundup.
I have spent a fair amount of time debugging them with people.

Am I correct that you implemented a new pathway pushing some permissions checks
down to the database level and still have the older python only filtering (for dbm)
intact?

Would it be possible to turn off the SQL permission checks using a parameter?

Maybe something in interfaces.py as it should be used for debugging only.
My concern is that the SQL checks may return a different result for some set of
permissions. We don't have very complex sets of permissions in the test suite.
But even if we add some more complex tests and data, it will still be far from
what is in the wild.

I would like a way in the field to be able to compare the original filtering
against the new filtering. This is similar to how we had a flag to enable/disable
the wsgi performance improvement in 2.2. 

In this case, SQL filtering will be turned on by default, but can be turned off
by importing/setting a flag. Turning it off would return to the original python only
filter. This way the user can do the same query with or without SQL permission checks
to see if there is an issue.

I really need to make roundup-admin work correctly with the -u parameter (issue2551246).
This will allow a search from the CLI under a user's permission set. This would also
make it easier for admins to test their permissions.

Thoughts?
msg8161 Author: [hidden] (schlatterbeck) Date: 2024-10-23 15:14
On Wed, Oct 23, 2024 at 02:58:57PM +0000, John Rouillard wrote:
> 
> A thought occurred to me. Permissions are some of the most complex
> things in Roundup.  I have spent a fair amount of time debugging them
> with people.

Yes, I can imagine.
I don't think this is specific to roundup, though...

> Am I correct that you implemented a new pathway pushing some
> permissions checks down to the database level and still have the older
> python only filtering (for dbm) intact?

The current default schemata do not implement any filter functions for
permissions (that are the ones that return filter parameters to be
applied during filtering the things returned from the db).

I *did* implement one for "query" in the new tests I wrote. These could
be used in the templates we have -- but I don't think many installations
will run into performance problems with the number of queries returned
for a user ...

> Would it be possible to turn off the SQL permission checks using a parameter?

Yes, I think we could make that configurable.

> Maybe something in interfaces.py as it should be used for debugging only.

We could make it a debug setting in config. Similar to the email debug
settings.

> My concern is that the SQL checks may return a different result for
> some set of permissions. We don't have very complex sets of
> permissions in the test suite.  But even if we add some more complex
> tests and data, it will still be far from what is in the wild.

Yes, this will not happen for the templates we distribute, these simply
do not have complex enough checks that warrant a filter function in
permissions.

But it would be very useful for debugging filter functions.
Because the user writing the check method and the filter function is
responsible for making them return the same set of results.
And I already discovered -- when modifying an existing tracker -- that
it is non-trivial to rewrite checks as filter functions.

> I would like a way in the field to be able to compare the original
> filtering against the new filtering. This is similar to how we had a
> flag to enable/disable the wsgi performance improvement in 2.2. 

We could make a flag that can turn off the usage of the filter functions
in permission checks. The user would be responsible for running it with
and without filter to compare results.

> In this case, SQL filtering will be turned on by default, but can be
> turned off by importing/setting a flag. Turning it off would return to
> the original python only filter. This way the user can do the same
> query with or without SQL permission checks to see if there is an
> issue.
> 
> I really need to make roundup-admin work correctly with the -u
> parameter (issue2551246).  This will allow a search from the CLI under
> a user's permission set. This would also make it easier for admins to
> test their permissions.

Yes, that should be easy now: Just replace the call to filter with the
call to filter_with_permissions when run as non-admin.

Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg8162 Author: [hidden] (schlatterbeck) Date: 2024-10-23 15:47
On Wed, Oct 23, 2024 at 03:14:33PM +0000, Ralf Schlatterbeck wrote:
> 
> On Wed, Oct 23, 2024 at 02:58:57PM +0000, John Rouillard wrote:

> > Would it be possible to turn off the SQL permission checks using a parameter?
> 
> Yes, I think we could make that configurable.
> 
> > Maybe something in interfaces.py as it should be used for debugging only.
> 
> We could make it a debug setting in config. Similar to the email debug
> settings.

I've implemented this, let me know if this works for you.

KR
Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg8163 Author: [hidden] (schlatterbeck) Date: 2024-10-23 16:41
On Tue, Oct 22, 2024 at 05:08:53PM +0000, John Rouillard wrote:
> 
> In message <20241021162755.4s6uag2wukr6va22@runtux.com>,
> Ralf Schlatterbeck writes:
> >The latest version is pushed to branch permission-performance.
> 
> I am just starting to look at these changes. What were you seeing
> that prompted:
> 
>    changeset:   8131:8e9181dfc9fa
>    Fix searchable checks in roundup/security.py
> 
> You moved the test:
> 
>         if self.check:
>             return 0
> 
> from the last test in the searchable() method to before the property
> checks. From the original method description, this seems correct.
> 
> Then you added a test:
> 
>       if self.properties is None:
>           return 1
> 
> that used to be handled by fall through to the return 1 at the end of
> the method. I think "not self._properties_dict[property]" would be
> False for every property if self.properties is None.

The check should not fail if the permission does not define *any*
properties. In that case the _properties_dict is {None: True} and the
test
if not self._properties_dict[property]
fails. I have no idea why this ever worked.

> Is this just an optimization to not hit the final return 1 in the
> method?  If so is that final "return 1" ever hit? Maybe add a
> log.warning to see if it is called and under what circumstances?
> 
> Also, I think a search permissions like:
> 
>   addPermission("Search", "user", properties=("username", "id"),
>                 check=somecheck)
> 
> would have returned 1 with the original code when called as:
> 
>    permission.searchable("Search", "user", "username")
> 
> But would return 0 with this changeset.

Yes, this is a bug-fix. A Search permission must never have a check
method. It doesn't make sense because we don't have any items to check
against *before* we start searching.

> Is this correct? If so we need a section in upgrading.txt because this
> is an incompatible change.

No, a Search permission with a check method doesn't make sense. The
check method would never have been called.

> Any idea why Search permissions can not have a check function? The
> docstring for it indicates checks are not allowed.

The check function gets an item. We don't have one before we start
searching.

> I don't see this documented anywhere in the permissions section of
> doc/references.txt. (Granted that section is kind of light on
> details/examples.) No mention in customizing.txt either.

Hmm, what would be the semantics of a check function for search? IMO
there is None.

Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg8164 Author: [hidden] (rouilj) Date: 2024-10-23 17:14
In message <20241023164057.k7m2yvmftzgdq55o@runtux.com>,
Ralf Schlatterbeck writes:
>On Tue, Oct 22, 2024 at 05:08:53PM +0000, John Rouillard wrote:
>> 
>> In message <20241021162755.4s6uag2wukr6va22@runtux.com>,
>> Ralf Schlatterbeck writes:
>> >The latest version is pushed to branch permission-performance.
>> 
>> I am just starting to look at these changes. What were you seeing
>> that prompted:
>> 
>>    changeset:   8131:8e9181dfc9fa
>>    Fix searchable checks in roundup/security.py
>> 
>> You moved the test:
>> 
>>         if self.check:
>>             return 0
>> 
>> from the last test in the searchable() method to before the property
>> checks. From the original method description, this seems correct.
>> 
>> Then you added a test:
>> 
>>       if self.properties is None:
>>           return 1
>> 
>> that used to be handled by fall through to the return 1 at the end of
>> the method. I think "not self._properties_dict[property]" would be
>> False for every property if self.properties is None.
>
>The check should not fail if the permission does not define *any*
>properties. In that case the _properties_dict is {None: True} and the
>test
>    if not self._properties_dict[property]
>fails. I have no idea why this ever worked.

_properties_dict should be a TruthDict. How are you getting a regular
dictionary? It should be a TruthDict instance.

  class Permission:
    def __init__():
        ...
        self.properties = properties
        self._properties_dict = support.TruthDict(properties)
        self.check = check

When TruthDict() is passed an empty properties list, the returned
object doesn't define a keys attribute. This causes
TruthDict.__getitem__(key) or TruthDict[key] to return True for
any/all keys. (It took me a while to figure out how it worked. Had to
run it under the debugger.)

>> Also, I think a search permissions like:
>> 
>>   addPermission("Search", "user", properties=("username", "id"),
>>                 check=somecheck)
>> 
>> would have returned 1 with the original code when called as:
>> 
>>    permission.searchable("Search", "user", "username")
>> 
>> But would return 0 with this changeset.
>
>Yes, this is a bug-fix. A Search permission must never have a check
>method. It doesn't make sense because we don't have any items to check
>against *before* we start searching.

Ah good point. That needs to be documented somewhere....

>> I don't see this documented anywhere in the permissions section of
>> doc/references.txt. (Granted that section is kind of light on
>> details/examples.) No mention in customizing.txt either.
>
>Hmm, what would be the semantics of a check function for search? IMO
>there is None.

I agree, but it's not documented anywhere that Search is special.
I'll find a suitable spot to document that.
msg8165 Author: [hidden] (rouilj) Date: 2024-10-23 18:10
In message <20241023154741.sej32etpaprcqwy5@runtux.com>,
Ralf Schlatterbeck writes:
>On Wed, Oct 23, 2024 at 03:14:33PM +0000, Ralf Schlatterbeck wrote:
>> 
>> On Wed, Oct 23, 2024 at 02:58:57PM +0000, John Rouillard wrote:
>
>> > Would it be possible to turn off the SQL permission checks using a parameter?
>> 
>> Yes, I think we could make that configurable.
>> 
>> > Maybe something in interfaces.py as it should be used for debugging only.
>> 
>> We could make it a debug setting in config. Similar to the email debug
>> settings.
>
>I've implemented this, let me know if this works for you.

That works.

I still think a change to interfaces.py would be faster to access but
I guess one more debug seting won't add to the clutter in config.ini
too much.

Also is this correct?

+    ), "Most settings in this section (except for backend and debug_filter)\n"
+       "are used by RDBMS backends only.",

does debug_filter have any use if you are using dbm?

Have a great evening.
--
				-- rouilj
John Rouillard
===========================================================================
My employers don't acknowledge my existence much less my opinions.
msg8166 Author: [hidden] (schlatterbeck) Date: 2024-10-24 06:33
On Wed, Oct 23, 2024 at 06:10:33PM +0000, John Rouillard wrote:
> Also is this correct?
> 
> +    ), "Most settings in this section (except for backend and debug_filter)\n"
> +       "are used by RDBMS backends only.",
> 
> does debug_filter have any use if you are using dbm?

Sure!

The whole point of using hyperdb.filter for the filtering of search
results is that it also works for non-rdbms backends. For the anydbm
backend using a filter method will probably not improve performance
since filtering still happens in python. But you *can* debug your
permission filter methods with an anydbm backend and later move to
rdbms.

Thanks for looking into this!
Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg8167 Author: [hidden] (schlatterbeck) Date: 2024-10-24 07:25
On Wed, Oct 23, 2024 at 05:14:45PM +0000, John Rouillard wrote:
> >> Then you added a test:
> >> 
> >>       if self.properties is None:
> >>           return 1
> >> 
> >> that used to be handled by fall through to the return 1 at the end of
> >> the method. I think "not self._properties_dict[property]" would be
> >> False for every property if self.properties is None.
> >
> >The check should not fail if the permission does not define *any*
> >properties. In that case the _properties_dict is {None: True} and the
> >test
> >    if not self._properties_dict[property]
> >fails. I have no idea why this ever worked.

Ah. I was debugging the failing liveserver test when I made that change.
And of course I didn't understand what TruthDict does at the time.
The symptom was that admin didn't have any permission (and the admin
permissions also lack a classname in addition to lacking properties).
Turns out I needed to check for an empty class, too, in Role.searchable.

At the time I was still trying to find out what the failing test was
supposed to do -- by unfortunate coincidence it was just that test that
caught a bug in my refactoring of permissions :-)

I've reverted that change and added a comment.

Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg8168 Author: [hidden] (rouilj) Date: 2024-10-24 14:58
We also need a couple of examples of using the filter method in customizing.txt. Reference
them from reference.txt and also add index entries for:

  filter: defining permissions
       defining permission
       example use


where defining permissions links to target in reference.txt, 'example use' links to targets
in customizing.txt.
msg8169 Author: [hidden] (schlatterbeck) Date: 2024-10-27 06:44
On Thu, Oct 24, 2024 at 02:58:48PM +0000, John Rouillard wrote:
> 
> We also need a couple of examples of using the filter method in
> customizing.txt. Reference them from reference.txt and also add index
> entries for:

Proposal for examples:
We already have a check function on query (at least in the classic
template) that permits users to see their own and public (private_for
empty) queries.

def view_query(db, userid, itemid):
    private_for = db.query.get(itemid, 'private_for')
    if not private_for:
        return True
    return userid == private_for

Augmenting this with a filter function:

    def filter_query(db, userid, klass):
        return [dict(filterspec = dict(private_for=['-1', userid]))]

Now if we also want to check if the user is the creator of a query and
permit access we would modify these to::

    def view_query(db, userid, itemid):
        q = db.query.getnode(itemid)
        if not q.private_for or userid == q.private_for:
            return True
        if userid == q.creator:
            return True

    def filter_query(db, userid, klass):
        f1 = dict(filterspec = dict(private_for=['-1', userid]))
        f2 = dict(filterspec = dict(creator=userid))
        return [f1, f2]

Note how we need a list with more than one entry here to account for the
OR-condition.

Another example would be the following: Consider we have a class
"organization". A user has a link to organization and the issue has a
link to organization. Users can see only issues of their own
organisation.

A check function would be::

    def view_issue(db, userid, itemid):
        user  = db.user.getnode(userid)
        if not user.organisation:
            return False
        issue = db.issue.getnode(itemid)
        if user.organisation == issue.organisation:
            return True

The corresponding filter function::

    def filter_issue(db, userid, klass):
        user = db.user.getnode(userid)
        if not user.organisation:
            return []
        return [dict(filterspec = dict(organisation=user.organisation))]

Note how the filter fails early by returning an empty list of filter
arguments when the user has no organisation.

I think this documents most of the use-cases.
Note that so far I've never needed the 'klass' argument to the filter
function. It might be needed when a filter applies to more than one
class, though.

Let me know what you think!

KR
Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg8170 Author: [hidden] (rouilj) Date: 2024-10-27 17:13
Hi Ralf:

In message <20241027064407.uzxiynqywsbcypby@runtux.com>,
Ralf Schlatterbeck writes:
>Proposal for examples:
>We already have a check function on query (at least in the classic
>template) that permits users to see their own and public (private_for
>empty) queries.
>
>def view_query(db, userid, itemid):
>    private_for = db.query.get(itemid, 'private_for')
>    if not private_for:
>        return True
>    return userid == private_for
>
>Augmenting this with a filter function:
>
>    def filter_query(db, userid, klass):
>        return [dict(filterspec = dict(private_for=['-1', userid]))]

Would:

   return [{"filterspec": {"private_for": ['-1', userid]}}]

be the same? If so we should probably have an example without the use
of dict() if possible. IIRC dict is slower than the definition using
{} (I can't remember the name for the {}'ed definition.  Also I think
the static definition is more familiar to users/admins as well.

In the 'private_for' list, I assume -1 means unset and that items in
the list are or'ed together.

We also need a reference link to the filter syntax. References.txt has
a reference for filter in the hyperdb wrapper class. Is that good
enough? I have always found it a bit difficult to understand and there
aren't a lot of great examples that show the input filter dictionary
and describes how it gets applied. IIRC you had a similar issue a
while back and added some more documentation.

I also have an uncommited section in my user_guide.txt titled:

  Advanced Searching with Property Expressions

that talks about using RPN notation in the URL for filtering. I
haven't committed it because I haven't verified all the examples I
mention. IIRC I wrote this before 2.4.0 was released and didn't commit
it then because one of my examples failed. Also I am not sure that
this location would be proper doc for the filterspec syntax.

>Now if we also want to check if the user is the creator of a query and
>permit access we would modify these to::
>
>    def view_query(db, userid, itemid):
>        q = db.query.getnode(itemid)
>        if not q.private_for or userid == q.private_for:
>            return True
>        if userid == q.creator:
>            return True
>
>    def filter_query(db, userid, klass):

>        return [f1, f2]
>
>Note how we need a list with more than one entry here to account for the
>OR-condition.

Understood.

Do we need AND-condition or NOT-condition examples as well?

I assume (a nonsense) AND is:

   f1 = dict(filterspec = dict(private_for=['-1', userid],
                               dict(creator=userid))

I still haven't traced through the code enough to understand how these
end up being applied at the database level. What do the WHERE clauses
look like when the filter is pushed down to the SQL level?

>Another example would be the following: Consider we have a class
>"organization". A user has a link to organization and the issue has a
>link to organization. Users can see only issues of their own
>organisation.
>
>A check function would be::
>
>    def view_issue(db, userid, itemid):
>        user  = db.user.getnode(userid)
>        if not user.organisation:
>            return False
>        issue = db.issue.getnode(itemid)
>        if user.organisation == issue.organisation:
>            return True
>
>The corresponding filter function::
>
>    def filter_issue(db, userid, klass):
>        user = db.user.getnode(userid)
>        if not user.organisation:
>            return []
>        return [dict(filterspec = dict(organisation=user.organisation))]
>
>Note how the filter fails early by returning an empty list of filter
>arguments when the user has no organisation.
>

The filter function here exactly duplicates the check function
correct? If so, does this make the check function redundant? Can the
permission be defined with only the filter function and no check
function?

Also how would that be modified to include people with an admin role?
Are filter's ignored if the user is an Admin?

I view adding 'filter' to permissions as the equivalent to the
hyperdb's filter_sql() method. Do you agree?  (Side note filter_sql is
missing examples.)

If we are using 'filter' on a non-sql database (dbm, or future mongo
etc.), how is the 'filter' applied? You said that the filter_function
could be debugged on a dbm style database.

>I think this documents most of the use-cases.
>Note that so far I've never needed the 'klass' argument to the filter
>function. It might be needed when a filter applies to more than one
>class, though.

I assume this would happen only if the value you are checking is
equivalent in the two classes but has different names.

Consider a filter checking to see if the owner of the item is the
current owner.  However you have help desk people creating issues and
users. So the creator property is a random help desk person, not the
actual creator. So you have a user_id property added to the user class
and a requestr_id property added to the issue class to record the id
for the actual person.

So you would generate a different filter against a different field
based on the klass.

Is this an example when applying a filter to multiple classes?

>Let me know what you think!

This looks very promising.

Thanks for tackling it.
msg8171 Author: [hidden] (schlatterbeck) Date: 2024-10-27 19:17
On Sun, Oct 27, 2024 at 05:13:17PM +0000, John Rouillard wrote:
> 
> In message <20241027064407.uzxiynqywsbcypby@runtux.com>,
> Ralf Schlatterbeck writes:
> >Proposal for examples:
> >We already have a check function on query (at least in the classic
> >template) that permits users to see their own and public (private_for
> >empty) queries.
> >
> >def view_query(db, userid, itemid):
> >    private_for = db.query.get(itemid, 'private_for')
> >    if not private_for:
> >        return True
> >    return userid == private_for
> >
> >Augmenting this with a filter function:
> >
> >    def filter_query(db, userid, klass):
> >        return [dict(filterspec = dict(private_for=['-1', userid]))]
> 
> Would:
> 
>    return [{"filterspec": {"private_for": ['-1', userid]}}]
> 
> be the same? If so we should probably have an example without the use
> of dict() if possible. IIRC dict is slower than the definition using
> {} (I can't remember the name for the {}'ed definition.  Also I think
> the static definition is more familiar to users/admins as well.

Fine with me.

> In the 'private_for' list, I assume -1 means unset and that items in
> the list are or'ed together.

This is a normal Class.filter call. So, yes, -1 means unset. We search
for 'unset or userid'.

> We also need a reference link to the filter syntax. References.txt has
> a reference for filter in the hyperdb wrapper class. Is that good
> enough? I have always found it a bit difficult to understand and there
> aren't a lot of great examples that show the input filter dictionary
> and describes how it gets applied. IIRC you had a similar issue a
> while back and added some more documentation.

Hmm, yes we should add that link and make a separate ticket for
improving the filter docs. I'm not very good at writing documentation
not being a native speaker... Maybe we can get some AI to improve on
that matter? (Honestly, this *could* be an idea :-)

> I also have an uncommited section in my user_guide.txt titled:
> 
>   Advanced Searching with Property Expressions
> 
> that talks about using RPN notation in the URL for filtering. I
> haven't committed it because I haven't verified all the examples I
> mention. IIRC I wrote this before 2.4.0 was released and didn't commit
> it then because one of my examples failed. Also I am not sure that
> this location would be proper doc for the filterspec syntax.

Fine with me, I also don't know a better place.

> >Now if we also want to check if the user is the creator of a query and
> >permit access we would modify these to::
> >
> >    def view_query(db, userid, itemid):
> >        q = db.query.getnode(itemid)
> >        if not q.private_for or userid == q.private_for:
> >            return True
> >        if userid == q.creator:
> >            return True
> >
> >    def filter_query(db, userid, klass):
> 
> >        return [f1, f2]
> >
> >Note how we need a list with more than one entry here to account for the
> >OR-condition.
> 
> Understood.
> 
> Do we need AND-condition or NOT-condition examples as well?

The AND is automagically taken care of by filter (filterspec entries for
different properties are ANDed). I'm not sure we're very good at not
conditions using hyperdb.filter.

> 
> I assume (a nonsense) AND is:
> 
>    f1 = dict(filterspec = dict(private_for=['-1', userid],
>                                dict(creator=userid))

Yes.
But I don't think we need a separate example as this is covered by
filter.

> I still haven't traced through the code enough to understand how these
> end up being applied at the database level. What do the WHERE clauses
> look like when the filter is pushed down to the SQL level?

We're filtering in a set of

results = original-filter-call

for filter_args in all the filters from filter function:
    results = klass.filter (**filter_args)

Except that I'm optimizing the number of results to pass through the
filter calls by keeping two sets.

At the end I'm again filtering with the original sort/group arguments to
get the correct order.

> 
> >Another example would be the following: Consider we have a class
> >"organization". A user has a link to organization and the issue has a
> >link to organization. Users can see only issues of their own
> >organisation.
> >
> >A check function would be::
> >
> >    def view_issue(db, userid, itemid):
> >        user  = db.user.getnode(userid)
> >        if not user.organisation:
> >            return False
> >        issue = db.issue.getnode(itemid)
> >        if user.organisation == issue.organisation:
> >            return True
> >
> >The corresponding filter function::
> >
> >    def filter_issue(db, userid, klass):
> >        user = db.user.getnode(userid)
> >        if not user.organisation:
> >            return []
> >        return [dict(filterspec = dict(organisation=user.organisation))]
> >
> >Note how the filter fails early by returning an empty list of filter
> >arguments when the user has no organisation.
> >
> 
> The filter function here exactly duplicates the check function
> correct? If so, does this make the check function redundant? Can the
> permission be defined with only the filter function and no check
> function?

Yes, the filter function *always* duplicates the check function.
I don't know if this makes the check function redundant and if we could
automagically derive a check function from a filter function. I guess it
could be done by passing the current itemid through the filter,
something along the lines of

def check (db, userid, itemid, klass, filter_function):
    args = filter_function (db, userid, klass)
    if klass.filter ([itemid], **args):
        return True

Note that the check function currently has no knowledge of the klass and
the filter function. But it would certainly be possible to write a
factory that can return a check function from a given filter function,
and klass according to the recipe above. Like so:

def check_factory (klass, filter_function):
    def check (db, userid, itemid):
        args = filter_function (db, userid, klass)
        if klass.filter ([itemid], **args):
            return True
    return check

Untested. But I think this would work. It passes a single ID through the
filter mechanism filtering the single-item again with the standard
hyperdb.filter mechanism. If it returns the item we may see it. If it
doesn't it will return an empty list of results.

> Also how would that be modified to include people with an admin role?
> Are filter's ignored if the user is an Admin?

No the admin is not special. There is code that adds a permission entry
for the role admin. These do not have a check function and therefore do
not need a filter function.

Same for normal users that have access to a klass via a permission
*without* a check function: These are taken care of by the first if
condition in hyperdb.filter_with_permissions:

        if check(permission, userid, cn, skip_permissions_with_check = True):
            allowed = item_ids

If this check passes (it does when we have a permission for a klass
without a check function) no further checks are performed at all.

> I view adding 'filter' to permissions as the equivalent to the
> hyperdb's filter_sql() method. Do you agree?  (Side note filter_sql is
> missing examples.)

I wasn't aware we have that. (and it's not in hyperdb but in
backends/rdbms_common.py)

And, no I don't think this is correct, filter uses the standard
mechanisms of hyperdb. Nothing specific to SQL.

> If we are using 'filter' on a non-sql database (dbm, or future mongo
> etc.), how is the 'filter' applied? You said that the filter_function
> could be debugged on a dbm style database.

It is calling the standard hyperdb.filter. Line 1835 in
roundup/hyperdb.py in method filter_with_permissions.

The method filter_with_permissions falls back to previous behavior
(calling the check function for every item found) if not for every
permission with a check function we also have a filter function. The
debug setting simply enforces that fallback:

if not debug and sec.is_filterable(permission, userid, cn):

The else-part of that has the comment "Last resort: filter in python"

> 
> >I think this documents most of the use-cases.
> >Note that so far I've never needed the 'klass' argument to the filter
> >function. It might be needed when a filter applies to more than one
> >class, though.
> 
> I assume this would happen only if the value you are checking is
> equivalent in the two classes but has different names.

Yes. And I think it is a good idea to have klass in the signature.

> Consider a filter checking to see if the owner of the item is the
> current owner.  However you have help desk people creating issues and
> users. So the creator property is a random help desk person, not the
> actual creator. So you have a user_id property added to the user class
> and a requestr_id property added to the issue class to record the id
> for the actual person.
> 
> So you would generate a different filter against a different field
> based on the klass.
> 
> Is this an example when applying a filter to multiple classes?

No that would be a filter for different properties of the same class.
My idea would be that you might have two different issue-like classes.
In that case you could apply the same filter function to both and using
the klass argument to differentiate e.g. between different property
names or similar.

Thanks for taking the time looking into this!
kind regards
Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg8172 Author: [hidden] (schlatterbeck) Date: 2024-10-28 14:30
On Sun, Oct 27, 2024 at 07:17:04PM +0000, Ralf Schlatterbeck wrote:
> Yes, the filter function *always* duplicates the check function.
> I don't know if this makes the check function redundant and if we could
> automagically derive a check function from a filter function. I guess it
> could be done by passing the current itemid through the filter,
> something along the lines of
> 
> def check (db, userid, itemid, klass, filter_function):
>     args = filter_function (db, userid, klass)
>     if klass.filter ([itemid], **args):
>         return True

This would need to be a loop:
def check (db, userid, itemid, klass, filter_function):
    args = filter_function (db, userid, klass)
    for a in args:
        if klass.filter ([itemid], **a):
            return True

> 
> Note that the check function currently has no knowledge of the klass and
> the filter function. But it would certainly be possible to write a
> factory that can return a check function from a given filter function,
> and klass according to the recipe above. Like so:
> 
> def check_factory (klass, filter_function):
>     def check (db, userid, itemid):
>         args = filter_function (db, userid, klass)
>         if klass.filter ([itemid], **args):
>             return True
>     return check

Same here, need to loop over filter args:
def check_factory (klass, filter_function):
    def check (db, userid, itemid):
        args = filter_function (db, userid, klass)
        for a in args:
            if klass.filter ([itemid], **a):
                return True
    return check

KR, Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg8174 Author: [hidden] (schlatterbeck) Date: 2024-11-11 13:52
Hi John,
I've updated the docs on the new filter function in doc/reference.txt

and I've implemented the manufacturing of a check function when only a
filter function is specified for a Permission object. This also has a
test. Thanks for that idea!

Let me know if you think we can merge this.

Do you know how to rebase the feature branch to 'default' before
merging? I'm familiar with git but never caught on with mercurial...

Thanks
Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg8175 Author: [hidden] (rouilj) Date: 2024-11-11 16:18
Hi Ralf:

In message <20241111135208.35dkrpr4eo3qkfcv@runtux.com>,
Ralf Schlatterbeck writes:
>I've updated the docs on the new filter function in doc/reference.txt

That looks reasonable.

>and I've implemented the manufacturing of a check function when only a
>filter function is specified for a Permission object. This also has a
>test. Thanks for that idea!

Sure.

>Let me know if you think we can merge this.

If you are claiming this is complete, let me try running with it on
one of my trackers and set up a check and filter.

>Do you know how to rebase the feature branch to 'default' before
>merging? I'm familiar with git but never caught on with mercurial...

You can rebase with a mercurial extension, but normal workflows are to
do a merge.

When looking at your code, I have been merging the default branch and
then diffing against the default branch.

  hg up permission-performance

  hg merge -r default

  hg diff -r default <-- only shows differences
                         added in permission-performance branch

If I need to merge more changes from permission-performance, I do an:

  hg merge --abort

  hg up

  hg merge -r default

to get back to baseline.

With the permission-performance branch checked out, running 'hg merge
-r default' and 'hg commit' will commit the default branch changes to
you performance branch.

Then:

  hg up default
  hg merge -r permission-performance
  hg commit

should merge the performance branch into the default branch.

After this, 'hg log -G' should look something like:

@    changeset:   8157:da712d7823f8
|\   tag:         tip
| |  parent:      8153:18a391e63202
| |  parent:      8156:0835d0ce2555
| |  user:        John Rouillard <rouilj@>
| |  date:        Mon Nov 11 10:27:06 2024 -0500
| |  description: to defaut
| |
| o  changeset:   8156:0835d0ce2555
|/|  branch:      permission-performance
| |  parent:      8155:e9af08743759
| |  parent:      8153:18a391e63202
| |  user:        John Rouillard <rouilj@>
| |  date:        Mon Nov 11 10:26:10 2024 -0500
| |  files:       doc/reference.txt
| |  description: test commit
| |
| o  changeset:   8155:e9af08743759
| |  branch:      permission-performance
| |  user:        Ralf Schlatterbeck <rsc@>
| |  date:        Mon Nov 11 14:32:25 2024 +0100
| |  files:       roundup/security.py test/db_test_base.py
| |  description: Add check_factory

(I edited so only first line is shown for descriptions, email
addresses de-domained)

8155 is your last commit on the branch. 8156 is the merge from the
default branch to the permission-performance branch. 8157 is the merge
and close of the permission performance branch back to main. If I 'hg
diff -c 8157' I only see the changes from the permission-performance
branch.

From my bookmarks, rebase is answered by:

  https://stackoverflow.com/questions/2672351/hg-how-to-do-a-rebase-like-gits-rebase
msg8177 Author: [hidden] (rouilj) Date: 2024-11-11 21:32
Ralf, it needs a CHANGES.txt entry.

Also can you try the perf_memo patch attached to this ticket and see if that helps
speed things up more.
msg8180 Author: [hidden] (schlatterbeck) Date: 2024-11-12 06:09
On Mon, Nov 11, 2024 at 04:18:38PM +0000, John Rouillard wrote:
> >Let me know if you think we can merge this.
> 
> If you are claiming this is complete, let me try running with it on
> one of my trackers and set up a check and filter.

Fine!
Let me know what you find.

> >Do you know how to rebase the feature branch to 'default' before
> >merging? I'm familiar with git but never caught on with mercurial...
> 
> You can rebase with a mercurial extension, but normal workflows are to
> do a merge.

Yes, but I prefer seeing what the real changes were when merging a
feature branch, not artefacts from merging. Same for merging in the
changes in the default branch to the feature branch: In that case
we have commits on the feature branch that dilute what really 
happened.

> When looking at your code, I have been merging the default branch and
> then diffing against the default branch.

That would be just a hg diff if properly rebased before merging.

> >From my bookmarks, rebase is answered by:
> 
>   https://stackoverflow.com/questions/2672351/hg-how-to-do-a-rebase-like-gits-rebase

I'll try that when you've looked through the changes.

Thanks!
Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg8197 Author: [hidden] (rouilj) Date: 2024-11-26 20:35
Merged by Ralf in changeset:   8166:25950b620246 with msg8196 recorded on an associated issue.

Note that developer documentation on how this code change works (actually pushing the
filter items in the list to the database with a list of valid id's) is still pending.
History
Date User Action Args
2024-11-26 20:35:07rouiljsetstatus: open -> fixed
resolution: fixed
messages: + msg8197
2024-11-12 06:09:49schlatterbecksetmessages: + msg8180
2024-11-11 21:32:56rouiljsetmessages: + msg8177
2024-11-11 16:18:38rouiljsetmessages: + msg8175
2024-11-11 13:52:18schlatterbecksetmessages: + msg8174
2024-10-28 14:30:08schlatterbecksetmessages: + msg8172
2024-10-27 19:17:04schlatterbecksetmessages: + msg8171
2024-10-27 17:13:17rouiljsetmessages: + msg8170
2024-10-27 06:44:20schlatterbecksetmessages: + msg8169
2024-10-24 14:58:47rouiljsetstatus: new -> open
assignee: schlatterbeck
messages: + msg8168
2024-10-24 07:25:01schlatterbecksetmessages: + msg8167
2024-10-24 06:33:38schlatterbecksetmessages: + msg8166
2024-10-23 18:10:33rouiljsetmessages: + msg8165
2024-10-23 17:14:45rouiljsetmessages: + msg8164
2024-10-23 16:41:02schlatterbecksetmessages: + msg8163
2024-10-23 15:47:44schlatterbecksetmessages: + msg8162
2024-10-23 15:14:33schlatterbecksetmessages: + msg8161
2024-10-23 14:58:57rouiljsetmessages: + msg8160
2024-10-22 17:08:53rouiljsetmessages: + msg8158
2024-10-21 16:27:58schlatterbecksetmessages: + msg8149
2024-10-21 15:42:34schlatterbecksetmessages: + msg8147
2024-10-18 16:26:27schlatterbecksetmessages: + msg8140
2024-10-18 14:59:26schlatterbecksetmessages: + msg8139
2024-10-18 13:40:15schlatterbecksetmessages: + msg8138
2024-10-18 13:34:12schlatterbecksetmessages: + msg8137
2024-10-18 13:20:16schlatterbecksetmessages: + msg8136
2024-10-17 15:32:12schlatterbecksetmessages: + msg8134
2024-10-17 14:54:16schlatterbecksetmessages: + msg8133
2024-10-17 14:14:26rouiljsetfiles: + security.perf_memo.patch
messages: + msg8132
2024-10-17 11:26:25schlatterbecksetmessages: + msg8131
2024-10-17 03:32:58rouiljsetfiles: + security.perf_log.patch
keywords: + patch
messages: + msg8130
2024-10-17 01:38:11rouiljsetmessages: + msg8129
2024-10-15 16:37:47schlatterbecksetmessages: + msg8128
2024-04-29 18:24:14schlatterbecksetmessages: + msg8017
2024-04-29 17:40:48rouiljsetmessages: + msg8016
2024-04-29 11:02:48schlatterbecksetmessages: + msg8015
2024-04-22 15:43:22rouiljsetmessages: + msg8004
2024-04-05 02:25:49rouiljsetmessages: + msg7978
2024-04-02 08:03:40schlatterbecksetmessages: + msg7977
2024-04-01 19:55:35adminsetmessages: + msg7973, msg7974
2024-04-01 19:53:00rouiljcreate