Issue 2551392
Created on 2025-01-19 05:52 by rouilj, last changed 2025-01-21 19:49 by rouilj.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | Remove |
memoize_security::hasPermission.patch | rouilj, 2025-01-21 19:49 | First cut of memoization to reduce walking over Roles structures |
Messages | |||
---|---|---|---|
msg8299 | Author: [hidden] (rouilj) | Date: 2025-01-19 05:52 | |
Hi Ralf (et al): Regarding issue 2551330 on improving permission performance, I think we could benefit from memoizing simple calls to hasPermission with just permission/username/classname. If property or itemid is passed, don't memoize the call. In one issue index display, I saw thousands of Web Access checks. It looks like every is_X_ok() call (where X is view, edit, retire ...) ends with: return perm('Web Access', self._client.userid) and perm( "X permission", self._client.userid, self._classname) where the second part of the and can be class level or issue/check level. Some stats viewing an issue.index page with 50 issues. The following are the memoize cache hits: 1392 returning memoize, ('Web Access', '1', None) 5 returning memoize, ('View', '1', 'issue') 2 returning memoize, ('Create', '1', 'issue') 1 returning memoize, ('View', '1', 'user') 1 returning memoize, ('Create', '1', 'keyword') the view/create issue is for the "Create New" link and other sidebar links. The misses are a superset of the hits since some of the sidebar checks are hit only once: 1 memoized ('Web Access', '1', None) 1 memoized ('View', '1', 'user') 1 memoized ('View', '1', 'query') 1 memoized ('View', '1', 'issue') 1 memoized ('Register', '1', 'user') 1 memoized ('Edit', '1', 'query') 1 memoized ('Edit', '1', None) 1 memoized ('Edit', '1', 'keyword') 1 memoized ('Create', '1', 'user') 1 memoized ('Create', '1', 'keyword') 1 memoized ('Create', '1', 'issue') there are 1457 non-memoized lookups compared to 1413 misses+hits. The top few by number of queries: 167 calling no memoize, ('View', '1', 'user', 'username', '1') 72 calling no memoize, ('View', '1', 'user', 'username', '8') 24 calling no memoize, ('View', '1', 'user', 'username', '6') 20 calling no memoize, ('View', '1', 'user', 'username', '3') 14 calling no memoize, ('View', '1', 'user', 'username', '5') 13 calling no memoize, ('View', '1', 'issue', 'title', '3') 12 calling no memoize, ('View', '1', 'user', 'username', '7') 9 calling no memoize, ('View', '1', 'status', 'abbreviation', '6') 9 calling no memoize, ('View', '1', 'status', 'abbreviation', '2') 9 calling no memoize, ('View', '1', 'issue', 'title', '5') 5 calling no memoize, ('View', '1', 'issue', 'title', '22') 5 calling no memoize, ('View', '1', 'issue', 'title', '2') 5 calling no memoize, ('View', '1', 'issue', 'title', '10') 4 calling no memoize, ('View', '1', 'status', 'abbreviation', '3') 4 calling no memoize, ('View', '1', 'issue', 'title', '201') 3 calling no memoize, ('View', '1', 'query', 'name', '8') for 357 calls. So memoize is handling about 49% of the hasPermission lookups. Any ideas on how/where to instrument this to see if/how much faster it is than non-memoized? issue.item views are less affected. On a page view I saw 6 HITS: 3 returning memoize, ('View', '1', 'keyword') 1 returning memoize, ('Rest Access', '1', None) 1 returning memoize, ('Register', '1', 'user') 3 misses, one per hit above and 19 hits like: 1 calling no memoize, ('View', '1', 'keyword', 'name', '9') where the final number ranges over all 21 keywords with 2 retired keywords. Looking at the user.index page (18 or so user's), Hits: 166 returning memoize, ('Web Access', '1', None) 14 returning memoize, ('Retire', '1', 'user') 6 returning memoize, ('View', '1', 'user') 1 returning memoize, ('Create', '1', 'user') 1 returning memoize, ('Create', '1', 'keyword') 1 returning memoize, ('Create', '1', 'issue') about 12 misses and 179 skips with 16 calls repeated three times, 2 calls repeated twice and 127 only seen once. We have hits+misses/hits+misses+skips of 52%. Viewing one of my user.item pages, I see a 31% ratio of memoizable calls to all calls. Any quips, comments, evasions, questions, or answers? Thanks. -- rouilj |
|||
msg8318 | Author: [hidden] (rouilj) | Date: 2025-01-21 19:49 | |
This is the first cut of a simple memoization for the Security::hasPermission call. It has logging enabled at error level so it should make it to a log file with default log level settings. I use the following commands on the log file to determine HIT/MISS/SKIP counts. sed -ne 's/^.*HIT//p' <logfile> | sort | uniq -c | sort -nr | less to see the permission HITs in order of most hit. Change HIT to SKIP or MISS to see the other stats. Replace the first 'sort' with wc -l to get total numbers. I also have code to count the Role::hasPermission calls. I am seeing a decrease of 75% on some pages. Probably because the permission I am looking for is at the end of a stack of them. I run under roundup-server in fork mode. So the cache gets collected on every connection. As a result I have not tried limiting the cache size. https://medium.com/@ireneziwang/solving-lru-cache-with-pythons-built-in-ordereddict-class-d76e0c82d3b6 is a workable proposal that can be done with a regular dict (in Python 3.7 or newer) if other ways of running Roundup cause a lot of memory use in the dict. Using: self.cache[cache_key_tuple] = self.cache.pop(cache_key_tuple) on a hit should keep the most recently used key at the end and the lru keys at the beginning of the dict. This replaces the move_to_end() method on an OrderedDict. There is also a patch to test_mailgw.py. One of the tests changes the permissions inside the test. So the hasPermission cache needs to be wiped for each subtest scenario. It would be good to get this tested under other long running operational scenarios: wsgi, Zope, mod_python, roundup-server in threads mode probably cover the likely scenarios. |
History | |||
---|---|---|---|
Date | User | Action | Args |
2025-01-21 19:49:51 | rouilj | set | keywords:
+ patch assignee: rouilj status: new -> open messages: + msg8318 files: + memoize_security::hasPermission.patch |
2025-01-19 05:52:50 | rouilj | set | resolution: remind |
2025-01-19 05:52:38 | rouilj | create |