Roundup Tracker - Issues

Issue 2551112

classification
Trigger preloading of modules and template compilation with WSGI
Type: resource usage Severity: normal
Components: Web interface Versions: 2.1.0
process
Status: fixed fixed
:
: schlatterbeck : cmeerw, rouilj, schlatterbeck, tttech-klonner
Priority: normal : patch

Created on 2021-02-02 14:45 by tttech-klonner, last changed 2021-02-12 18:41 by rouilj.

Messages
msg7059 Author: [hidden] (tttech-klonner) Date: 2021-02-02 14:45
As discussed with Ralf Schlatterbeck we should add the tracker 
initialization to the init function of the class RoundupDispatcher. 
Otherwise compilation of TAL templates and loading of python modules 
and libraries happends during the first request to Roundup.

This issue poped up during experimentation with pre-loading and lazy 
loading of WSGI processes with Apache. With this patch the request time 
for a rest api call from the first request dropped down from 2 seconds 
to 450ms.

diff --git a/roundup/cgi/wsgi_handler.py b/roundup/cgi/wsgi_handler.py
index 767221ac..ef4547a1 100644
--- a/roundup/cgi/wsgi_handler.py
+++ b/roundup/cgi/wsgi_handler.py
@@ -84,6 +84,9 @@ class RequestDispatcher(object):
                 tracker_home=home)
         else:
             self.translator = None
+        # trigger pre-loading of imports and templates
+        with self.get_tracker():
+            pass
 
     def __call__(self, environ, start_response):
         """Initialize with `apache.Request` object"""
msg7060 Author: [hidden] (rouilj) Date: 2021-02-03 03:49
I see self.get_tracker() invoked in __call__ as well.
Could adding this call in __init__ have undesirable side
effects?
msg7062 Author: [hidden] (tttech-klonner) Date: 2021-02-03 13:00
You are right, I did not consider that.

We should probably do this in a way that self.get_tracker() is invoked in 
the __init__() and just used in __call__().
msg7063 Author: [hidden] (schlatterbeck) Date: 2021-02-03 13:44
I do not think there are any unwanted side-effects, the get_tracker method is called for each request to the tracker and returns an initialized tracker object which is discarded after the request.

In the first call in __init__ there is no request but that's fine.

The noteable side-effect of that call is pulling in all the imports *and* initializing all the detectors and actions. Which in the timetracker case is more than a second that is *not* seen by the first real request.
msg7064 Author: [hidden] (rouilj) Date: 2021-02-03 14:06
Hi Ralf and Robert:

Ralf addresses my concern. I expect the results of __init__ to be
long lived and __call__ to create a per user/connection instance so the 
different users don't get each other's environment/settings.

Ralf do you want to do the commit and changelog?

-- rouilj
msg7065 Author: [hidden] (schlatterbeck) Date: 2021-02-03 16:55
I've pushed the change 6321:979cecdb70f8 closing this issue.
msg7070 Author: [hidden] (cmeerw) Date: 2021-02-09 17:54
I would prefer to have this customizable in some way - the idea behind 
having a separate get_tracker method was to allow additional 
customisation (e.g. inherit from RequestDispatcher and override 
get_tracker or add additional initialisation in the derived class). This 
hardcoded call from __init__ makes it a lot harder to customise the 
behaviour (because your overriding get_tracker relies on additional 
initialization from your __init__).
msg7071 Author: [hidden] (rouilj) Date: 2021-02-09 20:48
Hi Christof:

In message <1612893294.13.0.394483134398.issue2551112@roundup.psfhosted.org>,
Christof Meerwald writes:
>Christof Meerwald added the comment:
>
>I would prefer to have this customizable in some way - the idea behind 
>having a separate get_tracker method was to allow additional 
>customisation (e.g. inherit from RequestDispatcher and override 
>get_tracker or add additional initialisation in the derived class). This 
>hardcoded call from __init__ makes it a lot harder to customise the 
>behaviour (because your overriding get_tracker relies on additional 
>initialization from your __init__).

How would overriding work? If I understand the code flow properly
interfaces.py is loaded after the call to get_tracker. So
interfaces.py can't be used to override get_tracker.

I can only see it being overridden from a wsgi.py wrapper like:

    import sys
    # obtain the WSGI request dispatcher
    from roundup.cgi.wsgi_handler import RequestDispatcher

    tracker_home = '/home/trackers/demo'

    def My_get_tracker(self):
        # do other stuff
        RequestDispatcher.get_tracker(self)

    RequestDispatcher.get_tracker = My_get_tracker

    # runs _init_ with My_get_Tracker right?
    app =  RequestDispatcher(tracker_home)

Did I get something wrong here?
msg7072 Author: [hidden] (cmeerw) Date: 2021-02-09 21:03
In your wsgi wrapper you can do something like:

from roundup.cgi.wsgi_handler import RequestDispatcher

class MyRequestDispatcher(RequestDispatcher):
    def __init__(self, home):
        RequestDispatcher.__init__(self, home)

    @contextmanager
    def get_tracker(self):
        tracker = ...
        yield tracker

tracker_home = '/path/to/tracker/home'
app = MyRequestDispatcher(tracker_home)
msg7073 Author: [hidden] (rouilj) Date: 2021-02-09 21:35
Hi Christof:

In message <1612904631.59.0.383569670816.issue2551112@roundup.psfhosted.org>,
Christof Meerwald writes:
>In your wsgi wrapper you can do something like:
>
>from roundup.cgi.wsgi_handler import RequestDispatcher
>
>class MyRequestDispatcher(RequestDispatcher):
>    def __init__(self, home):
>        RequestDispatcher.__init__(self, home)
>
>    @contextmanager
>    def get_tracker(self):
>        tracker = ...
>        yield tracker
>
>tracker_home = '/path/to/tracker/home'
>app = MyRequestDispatcher(tracker_home)

Ok, that makes sense. Does the change to __init__ to call
get_tracker() break this in some way? I am still trying to get my head
around your concern.

> ... allow additional customisation (e.g. inherit from
> RequestDispatcher and override=20 get_tracker or add additional
> initialisation in the derived class). This hardcoded call from
> __init__ makes it a lot harder to customise the behaviour
> (because your overriding get_tracker relies on additional=20
> initialization from your __init__).

It seems that this still allows for customization.

Note that in the __init__ call the result from get_tracker is
discarded, so it is basically a noop as I understand it.  It is
invoked only for its side effects.

Any customization of get_tracker() would require that the programmer
make sure that any side effects are ok to invoke multiple times. But I
don't consider this to be "a lot harder" but I may be missing
something.
msg7074 Author: [hidden] (cmeerw) Date: 2021-02-10 07:37
The main complication now is that get_tracker is called before __init__ 
has finished, so a simple implementation that would use a single 
tracker instance (maybe because you know you WSGI server is single-
threaded anyway) doesn't work any more:

class MyRequestDispatcher(RequestDispatcher):
    def __init__(self, home):
        RequestDispatcher.__init__(self, home)
        self._tracker = roundup.instance.open(self.home, not 
self.debug)

    @contextmanager
    def get_tracker(self):
        yield self._tracker
msg7076 Author: [hidden] (rouilj) Date: 2021-02-11 23:08
Hi Christof:

Ralf do you have feedback here?

Christof, I think you are saying that the problem is with inheriting 
RequestDispatcher and replacing the __init__ method right?

If that is true, then I think this patch can stand. The new
class: MyRequestDispatcher (in your example) with the simple __init__
will have a delay on the first request by not pre-compiling
templates by calling get_tracker in __init__ but I don't see that as 
show stopper.

You mentioned you wanted this configurable, what would be the
advantage of adding another config.ini option? If you were going
to write a new __init__ method anyway that can remove the get_tracker
dependency from __init__ anyway.

Also are you inheriting RequestDispatcher in your current trackers
so this change actually breaks something?

-- rouilj
msg7077 Author: [hidden] (schlatterbeck) Date: 2021-02-12 09:27
On Thu, Feb 11, 2021 at 11:08:46PM +0000, John Rouillard wrote:
> Ralf do you have feedback here?

John, Christof,
what do you think about factoring out the call to get_tracker into a
preload method, something along the lines of (untested, if you agree
I'll implement this):

class RequestDispatcher(object):
    def __init__(self, home, debug=False, timing=False, lang=None):
    ...

        self.preload()

    def preload(self):
        """ trigger pre-loading of imports and templates """
        with self.get_tracker():
            pass

Then derived classes may easily override the preload method if needed
and we still have preloading the default.

Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg7078 Author: [hidden] (cmeerw) Date: 2021-02-12 15:18
Yes, moving it to a "preload" method would definitely be an improvement. 
Another option could be to have a base class you can derive from that 
doesn't have the "preload" functionality. Either option would work for 
me.
msg7079 Author: [hidden] (rouilj) Date: 2021-02-12 17:42
In message <1613143109.77.0.35134070202.issue2551112@roundup.psfhosted.org>,
Christof Meerwald writes:
>Yes, moving it to a "preload" method would definitely be an improvement.

I am fine with that. Ralf any ideas on how to add this file to the
test suite? We have 0 lines of coverage there, so anything would be
good.

>Another option could be to have a base class you can derive from that 
>doesn't have the "preload" functionality. Either option would work for 
>me.

I don't think cluttering the code with a new RequestDispatcherBase
class and then creating:

  RequestDispatcher(RequestDispatcherBase):

     def __init__(...):

     def preload(...):

with just these two (or just __init__) methods is warranted.
msg7080 Author: [hidden] (schlatterbeck) Date: 2021-02-12 17:49
On Fri, Feb 12, 2021 at 03:18:29PM +0000, Christof Meerwald wrote:
> 
> Yes, moving it to a "preload" method would definitely be an improvement. 
> Another option could be to have a base class you can derive from that 
> doesn't have the "preload" functionality. Either option would work for 
> me.

OK, I've pushed the change.

On Fri, Feb 12, 2021 at 05:42:46PM +0000, John Rouillard wrote:
> I am fine with that. Ralf any ideas on how to add this file to the
> test suite? We have 0 lines of coverage there, so anything would be
> good.

No good idea, but I'm regularly testing with wsgi (manually). I'm not
aware of any framework that can test wsgi outside of a web browser.

Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
msg7081 Author: [hidden] (rouilj) Date: 2021-02-12 18:41
In message <20210212174951.2p2s4rwzhifk53dd@runtux.com>,
Ralf Schlatterbeck writes:
>On Fri, Feb 12, 2021 at 03:18:29PM +0000, Christof Meerwald wrote:
>On Fri, Feb 12, 2021 at 05:42:46PM +0000, John Rouillard wrote:
>> I am fine with that. Ralf any ideas on how to add this file to the
>> test suite? We have 0 lines of coverage there, so anything would be
>> good.
>No good idea, but I'm regularly testing with wsgi (manually).

Well that's something at least.

>I'm not aware of any framework that can test wsgi outside of a web browser.

Could you see https://github.com/jerrykan/wsgi-liveserver referenced
in https://issues.roundup-tracker.org/issue2551101 as helpful? If so
we can followup on 2551101.
History
Date User Action Args
2021-02-12 18:41:43rouiljsetstatus: pending -> fixed
2021-02-12 18:41:24rouiljsetmessages: + msg7081
2021-02-12 17:49:53schlatterbecksetmessages: + msg7080
2021-02-12 17:42:46rouiljsetmessages: + msg7079
2021-02-12 15:18:29cmeerwsetmessages: + msg7078
2021-02-12 09:27:39schlatterbecksetmessages: + msg7077
2021-02-11 23:08:46rouiljsetstatus: closed -> pending
messages: + msg7076
2021-02-10 07:37:23cmeerwsetmessages: + msg7074
2021-02-09 21:35:53rouiljsetmessages: + msg7073
2021-02-09 21:03:51cmeerwsetmessages: + msg7072
2021-02-09 20:48:11rouiljsetmessages: + msg7071
2021-02-09 17:54:54cmeerwsetnosy: + cmeerw
messages: + msg7070
2021-02-03 16:55:59schlatterbecksetstatus: open -> closed
resolution: fixed
messages: + msg7065
2021-02-03 14:06:41rouiljsetpriority: normal
assignee: schlatterbeck
status: new -> open
messages: + msg7064
components: + Web interface, - Infrastructure
2021-02-03 13:44:26schlatterbecksetnosy: + schlatterbeck
messages: + msg7063
2021-02-03 13:00:24tttech-klonnersetmessages: + msg7062
2021-02-03 03:49:03rouiljsetnosy: + rouilj
messages: + msg7060
2021-02-02 17:37:37rouiljsetkeywords: + patch
2021-02-02 14:45:21tttech-klonnercreate