Issue 2551212
Created on 2022-06-27 08:31 by marcus.priesch, last changed 2022-07-04 08:27 by schlatterbeck.
Files | ||||
---|---|---|---|---|
File name | Uploaded | Description | Edit | Remove |
wsgi_handler_speedup.patch | marcus.priesch, 2022-06-27 08:31 | |||
wsgi_cache_tracker.patch | rouilj, 2022-06-28 01:54 | Added feature flags around Marcus' patch. Added docs, CHANGES.txt, updated test. |
Messages | |||
---|---|---|---|
msg7586 | Author: [hidden] (marcus.priesch) | Date: 2022-06-27 08:31 | |
Hi There, on our last set of patches, the speed up patch was not applied to wsgi_handler.py upon trying what is currently in the repo i found that my rest-powered frontend again is rather slow. timing the requests gives 230-240ms per request with the current code, whereas with the attached patch applied i come down to 70-80ms per request ! i remember we discussed that the code currently in use reloads the tracker/templates on every request, which was decided to be a good thing. discussing with ralf we came to the conclusion, we could do this optionally via a config option. my personal opinion rather would be to apply the patch and to have the "template reload" feature only when using roundup-server and not in the wsgi backend - as this (in my opinion) mainly should be used in production environments where speed matters. however speaking about automatic code reload, django also does this (and i think web.py also, which has a smaller code base), maybe we should take a look at how they are doing things - with the current behaviour only the trackers auditors and templates get reloaded but i dont think changes in roundup core get reloaded also ... but i could be wrong here ... regards, marcus. |
|||
msg7587 | Author: [hidden] (rouilj) | Date: 2022-06-27 14:10 | |
Hi Marcus: How well have you tested your changes to make sure that the lack of reinitialization can't cause security issues? It looks like a new client is created on every connect which I think should guarantee that database access by a client is only done with proper access restrictions but the comment in instance.open that says: # load the database schema # we cannot skip this part even if self.optimize is set # because the schema has security settings that must be # applied to each database instance could a lack of a call to instance.open result in an incorrectly applied schema being used? Ralf ideas/concerns about applying this patch without tests less than three weeks before the 2.2.0 release? Ideas on how to test this? We do have test_live_server but I am not sure how do test any side effects from this patch. Also what is the effect of: with self.get_tracker() as tracker: as a context handler in the original code? Does it do any close/cleanup/release of resources? It looks like preload() just calls get_tracker(). Is it's role simply to prime the pump and preload/precompile the tracker before the call to: with self.get_tracker() as tracker: so that the cost isn't paid when the first connection is made? -- rouilj |
|||
msg7588 | Author: [hidden] (schlatterbeck) | Date: 2022-06-27 15:18 | |
On Mon, Jun 27, 2022 at 02:10:06PM +0000, John Rouillard wrote: > > John Rouillard added the comment: > > Hi Marcus: > > How well have you tested your changes to make sure that the lack of > reinitialization can't cause security issues? I don't think this will cause security implications. Only the instance is cached, not the database. Note how the client calls instance.open with a username in Client.opendb. This means we do not reload all reactors and actions for each request. But we *do* open the database with a new user in each request. > It looks like a new client is created on every connect which I think > should guarantee that database access by a client is only done with > proper access restrictions but the comment in > instance.open that says: > > # load the database schema > # we cannot skip this part even if self.optimize is set > # because the schema has security settings that must be > # applied to each database instance > > could a lack of a call to instance.open result in an incorrectly > applied schema being used? See above: instance.open seems to be called for each request. Of course we should test this, so technically the patch *is* lacking a test that makes sure that for each request an instance.open is called. I'm currently not sure how to test the wsgi in a test, though. > Ralf ideas/concerns about applying this patch without tests less than > three weeks before the 2.2.0 release? I didn't apply that particular patch at the time because I didn't fully understand it (and the i18n stuff was unrelated and worked without the patch). But I'm fairly sure it will not cause major breakage. But we can put it in the next release, I'm running with intermediate versions all the time. Then I'll probably have experience with a big deployment for the next release :-) > Ideas on how to test this? We do have test_live_server > but I am not sure how do test any side effects from this patch. > > Also what is the effect of: > > with self.get_tracker() as tracker: > > as a context handler in the original code? Does it do any > close/cleanup/release of resources? I think that was the original intention. > It looks like preload() just calls get_tracker(). Is it's role simply > to prime the pump and preload/precompile the tracker before the call > to: > > with self.get_tracker() as tracker: > > so that the cost isn't paid when the first connection is made? This might have been the intention but I don't think this has ever worked (for wsgi!) because the instance is re-created on every request. Which means the template instantiation is *not* retained for the next request independent of the setting of 'optimize' in the instance. You might also want to compare Marcus' change to the wsgi with the implementation of roundup_server in scripts/roundup_server.py: This *does also* cache the instance, see how the get_tracker method of RoundupRequestHandler caches and re-uses the instance in the TRACKERS (line 200). Its a little complicated how TRACKERS is only initialized in a sub-class but looks like the same functionality to me. Ralf -- Dr. Ralf Schlatterbeck Tel: +43/2243/26465-16 Open Source Consulting www: www.runtux.com Reichergasse 131, A-3411 Weidling email: office@runtux.com |
|||
msg7589 | Author: [hidden] (rouilj) | Date: 2022-06-27 18:05 | |
In message <20220627151828.45dnn3em4uv3sedi@runtux.com>, Ralf Schlatterbeck writes: >On Mon, Jun 27, 2022 at 02:10:06PM +0000, John Rouillard wrote: >> >> John Rouillard added the comment: >> How well have you tested your changes to make sure that the lack of >> reinitialization can't cause security issues? > >I don't think this will cause security implications. Only the instance >is cached, not the database. Note how the client calls instance.open >with a username in Client.opendb. >This means we do not reload all reactors and actions for each request. >But we *do* open the database with a new user in each request. Ok, that's what I expected from the Client initialization. >> It looks like a new client is created on every connect which I think >> should guarantee that database access by a client is only done with >> proper access restrictions but the comment in >> instance.open that says: >> >> # load the database schema >> # we cannot skip this part even if self.optimize is set >> # because the schema has security settings that must be >> # applied to each database instance >> >> could a lack of a call to instance.open result in an incorrectly >> applied schema being used? > >See above: instance.open seems to be called for each request. Of course >we should test this, so technically the patch *is* lacking a test that >makes sure that for each request an instance.open is called. I'm >currently not sure how to test the wsgi in a test, though. At the very least I can test wsgi startup using the test_liveserver.py module. Exposing that is a mystery to me too. >> Ralf ideas/concerns about applying this patch without tests less than >> three weeks before the 2.2.0 release? > >I didn't apply that particular patch at the time because I didn't fully >understand it (and the i18n stuff was unrelated and worked without the >patch). But I'm fairly sure it will not cause major breakage. But we can >put it in the next release, I'm running with intermediate versions all >the time. Then I'll probably have experience with a big deployment for >the next release :-) I am considering adding this with a feature switch. Since this feature deals with startup, I think the only place the feature switch can go is as an argument to RequestDispatcher(). AFAIK, the invocation of RequestDispatcher is in the wsgi wapper and is always locally coded so this should be doable. But I am not sure if is is a good idea/allowed. I would change RequestDispatcher::__init__ to read: def __init__(self, home, debug=False, timing=False, lang=None, feature=None) where feature is an object. So calling: app = RequestDispatcher(tracker_home, feature={"wsgi_single_load": true}) would set self.feature_flags and if wsgi_single_load is set would enable Marcus' 2 code path changes. If it was not set the current code paths would be run. Thoughts? This would allow yuo to use it in production but it would be disabled unless opted into. >> Also what is the effect of: >> >> with self.get_tracker() as tracker: >> >> as a context handler in the original code? Does it do any >> close/cleanup/release of resources? > >I think that was the original intention. Does it actually clean anything up? AFAICT it doesn't right? >> It looks like preload() just calls get_tracker(). Is it's role simply >> to prime the pump and preload/precompile the tracker before the call >> [...] >This might have been the intention but I don't think this has ever >worked (for wsgi!) because the instance is re-created on every request. >Which means the template instantiation is *not* retained for the next >request independent of the setting of 'optimize' in the instance. Yeah I was always a little confused how/if optimize worked. >You might also want to compare Marcus' change to the wsgi with the >implementation of roundup_server in scripts/roundup_server.py: This >*does also* cache the instance, see how the get_tracker method of >RoundupRequestHandler caches and re-uses the instance in the TRACKERS >(line 200). Its a little complicated how TRACKERS is only initialized in >a sub-class but looks like the same functionality to me. It certainly does look the same. Hence my thought to feature flag it. |
|||
msg7590 | Author: [hidden] (rouilj) | Date: 2022-06-28 01:54 | |
I am attaching a new patch. Marcus can you apply this and verify that using the feature flag as documented enables the same speedup you reported with your tests. In my testing I used test_liveserver.py and ran it 6 times in a row. I recorded the time pythonX -m pytest for the last 5 runs. I ran the loop 4 times under: python2 with feature_flags={} python3 with feature_flags={} python2 with with feature_flags={"cache_tracker": ""} python3 with with feature_flags={"cache_tracker": ""} The cached tracker does seem to be faster. With python2 test_liveserver.py completes about 4% faster on average. For python3 it completes about 16% faster. I'm not seeing your x3 improvement but that may be because the time to run the test is dominated by pytest and setup overhead. Also I am running anydbm as the backend that may add to the overhead. If this looks ok, I'll get it committed and then work on how to test. I think I might be able to set up an i/o buffer for the wsgi server to write to. Passing the i/o buffer in via a feature flag could work. Then I should be able to get the output just from the wsgi handler from there in the test harness. -- rouilj |
|||
msg7591 | Author: [hidden] (rouilj) | Date: 2022-06-30 23:53 | |
Marcus, have you tested my latest patch and verified that it works for you if you set the feature flag as documented in doc/upgrading.txt? Release is on 13 July so less than 2 weeks out. -- rouilj |
|||
msg7592 | Author: [hidden] (schlatterbeck) | Date: 2022-07-01 06:43 | |
On Thu, Jun 30, 2022 at 11:53:23PM +0000, John Rouillard wrote: > > Marcus, have you tested my latest patch and verified that it works for you > if you set the feature flag as documented in doc/upgrading.txt? He's on vacation, I'll look into your patch and test it here. Note that I think that the feature flag default should be *on* eventually: We may want to set it to *off* now and warn users in the release note that they should test it with the feature turned on because it will change in the next release? What do you think? Ralf -- Dr. Ralf Schlatterbeck Tel: +43/2243/26465-16 Open Source Consulting www: www.runtux.com Reichergasse 131, A-3411 Weidling email: office@runtux.com |
|||
msg7593 | Author: [hidden] (schlatterbeck) | Date: 2022-07-01 08:07 | |
On Tue, Jun 28, 2022 at 01:54:03AM +0000, John Rouillard wrote: > > The cached tracker does seem to be faster. With python2 > test_liveserver.py completes about 4% faster on average. For python3 > it completes about 16% faster. I'm not seeing your x3 improvement but > that may be because the time to run the test is dominated by pytest > and setup overhead. Also I am running anydbm as the backend that may > add to the overhead. This is probably also because the tracker in question has *a lot* more detectors and actions and *a lot* more templates. Ralf -- Dr. Ralf Schlatterbeck Tel: +43/2243/26465-16 Open Source Consulting www: www.runtux.com Reichergasse 131, A-3411 Weidling email: office@runtux.com |
|||
msg7594 | Author: [hidden] (schlatterbeck) | Date: 2022-07-01 08:18 | |
On Tue, Jun 28, 2022 at 01:54:03AM +0000, John Rouillard wrote: > > I am attaching a new patch. Marcus can you apply this and verify that > using the feature flag as documented enables the same speedup you > reported with your tests. John, I've just looked at your patch and it seems to require that the user changes some things in the guts of roundup? What do you think about making this a config option that for now is off by default and will later becom on by default (next release)? We can warn in the release notes that people should test the option and that it boosts performance (especially if there are many detectors and/or templates). Ralf -- Dr. Ralf Schlatterbeck Tel: +43/2243/26465-16 Open Source Consulting www: www.runtux.com Reichergasse 131, A-3411 Weidling email: office@runtux.com |
|||
msg7595 | Author: [hidden] (schlatterbeck) | Date: 2022-07-01 09:26 | |
I should read the documentation that comes with a patch more closely: John, I'm fine with your changes, in fact the way you use the feature flags is nice, the user has to come up with a WSGI script anyway so changing this is fine. No need for a config option and we can make the feature the default later on. With your patch turned on I'm observing a speedup from about 270ms down to about 80-85ms on the same tracker Marcus was using (see https://github.com/time-track-tool/time-track-tool). So from my side: Lets go for it. I'll keep you posted when we have rolled the changes including the feature-flag in production. |
|||
msg7596 | Author: [hidden] (rouilj) | Date: 2022-07-01 11:18 | |
In message <20220701064342.5gxgebwj2mq5ksh6@runtux.com>, Ralf Schlatterbeck writes: >Ralf Schlatterbeck added the comment: > >On Thu, Jun 30, 2022 at 11:53:23PM +0000, John Rouillard wrote: >> >> Marcus, have you tested my latest patch and verified that it works for you >> if you set the feature flag as documented in doc/upgrading.txt? > >He's on vacation, I'll look into your patch and test it here. >Note that I think that the feature flag default should be *on* >eventually: If it still oooks fine by the time 2.3.0 is released, the feature flag will become a no-op and only the patched version will be present. I only threw the feature flag in because it came in at the last minute and I wanted to make it available for testing. >We may want to set it to *off* now and warn users in the >release note that they should test it with the feature turned on because >it will change in the next release? What do you think? That's my intent for the 2.3.0 release. The docs include the following lines (the patch attached tothe issue has a typo 2.0.2 in place of 2.2.0) : Note that this is experimental and was added during the 2.2.0 beta period, so it is hidden behind a feature flag. If you use this and it works better for you please followup with an email on the roundup-users at lists.sourceforge.net. Thanks. |
|||
msg7597 | Author: [hidden] (rouilj) | Date: 2022-07-01 11:21 | |
In message <20220701081837.wrp2izp3rmo7h2kn@runtux.com>, Ralf Schlatterbeck writes: > > >Ralf Schlatterbeck added the comment: > >On Tue, Jun 28, 2022 at 01:54:03AM +0000, John Rouillard wrote: >> >> I am attaching a new patch. Marcus can you apply this and verify that >> using the feature flag as documented enables the same speedup you >> reported with your tests. > >John, I've just looked at your patch and it seems to require that the >user changes some things in the guts of roundup? >What do you think about making this a config option that for now is off >by default and will later becom on by default (next release)? We can >warn in the release notes that people should test the option and that it >boosts performance (especially if there are many detectors and/or >templates). Npe the only things that should change to enable it is the wsgi.py wrapper written by the user or retreived from the frontends directory. No internal code changes need to be done by the user to enable it. At the point where the patch's action has to happen in roundup/cgi/wsgi_hander.py there no tracker is open, so no way to read the config file. |
|||
msg7598 | Author: [hidden] (rouilj) | Date: 2022-07-02 18:08 | |
Committed with doc changes, testing with feature enabled/disabled. Also enabled by default in frontends/wsgi.py. Thanks Marcus and Ralf. changeset: 6747:d32d43e4a5ba (Also I noticed that wsgi.py wasn't being installed like roundup.cgi. I fixed this as well on changeset: 6745:3c198cd7c24e) |
|||
msg7599 | Author: [hidden] (schlatterbeck) | Date: 2022-07-04 08:27 | |
On Sat, Jul 02, 2022 at 06:08:06PM +0000, John Rouillard wrote: > > Committed with doc changes, testing with feature enabled/disabled. > > Also enabled by default in frontends/wsgi.py. Thanks Marcus and Ralf. Thanks John! Ralf -- Dr. Ralf Schlatterbeck Tel: +43/2243/26465-16 Open Source Consulting www: www.runtux.com Reichergasse 131, A-3411 Weidling email: office@runtux.com |
History | |||
---|---|---|---|
Date | User | Action | Args |
2022-07-04 08:27:50 | schlatterbeck | set | messages: + msg7599 |
2022-07-02 18:08:06 | rouilj | set | priority: normal status: open -> fixed resolution: fixed messages: + msg7598 |
2022-07-01 11:21:41 | rouilj | set | messages: + msg7597 |
2022-07-01 11:18:05 | rouilj | set | messages: + msg7596 |
2022-07-01 09:26:02 | schlatterbeck | set | messages: + msg7595 |
2022-07-01 08:18:39 | schlatterbeck | set | messages: + msg7594 |
2022-07-01 08:07:25 | schlatterbeck | set | messages: + msg7593 |
2022-07-01 06:43:45 | schlatterbeck | set | messages: + msg7592 |
2022-06-30 23:53:23 | rouilj | set | messages: + msg7591 |
2022-06-28 01:54:03 | rouilj | set | status: new -> open assignee: rouilj messages: + msg7590 files: + wsgi_cache_tracker.patch |
2022-06-28 01:47:01 | admin | set | files: + wsgi_handler_speedup.patch |
2022-06-28 01:42:21 | rouilj | set | files: - wsgi_handler_speedup.patch |
2022-06-27 18:05:29 | rouilj | set | messages: + msg7589 |
2022-06-27 17:15:49 | mbehrle | set | nosy: + mbehrle |
2022-06-27 15:18:31 | schlatterbeck | set | messages: + msg7588 |
2022-06-27 14:10:06 | rouilj | set | messages: + msg7587 |
2022-06-27 08:31:58 | marcus.priesch | create |