Issue 2551250
Created on 2022-12-21 18:36 by schlatterbeck, last changed 2023-02-02 01:21 by rouilj.
Messages | |||
---|---|---|---|
msg7693 | Author: [hidden] (schlatterbeck) | Date: 2022-12-21 18:36 | |
I've recently moved a production tracker from python2 to python3. Now I'm getting *occasionally* a Traceback TypeError: '<' not supported between instances of 'function' and 'function' when sorting detectors (auditors and reactors). These are using a PrioList object which sorts on demand before iterating over auditors and/or reactors. The idea there is that detectors can have a priority. The objects in the list are 3-tuples with prio, name, function. This should never try to look at the function unless we have two entries with the same prio (the default is 100 and this is seldomly set explicitly so they may all be the same) and name. Since this is quite urgent for me I'll try to fix this by allowing a key function to be passed to the PrioList. This would then be something like "lambda x: x[:2]" so that we never compare function objects. This is probably the right thing to do. What I find very confusing is that this should happen everytime when sorting the detectors. It doesn't. |
|||
msg7694 | Author: [hidden] (schlatterbeck) | Date: 2022-12-21 19:17 | |
I've pushed a fix that fixes the sorting issue should there ever be two detectors for the same event with the same name and prio. But: In my tracker I do not have duplicate detector names. In fact I'm currently unable to reproduce the problem. Is it possible that under certain circumstances we add detectors twice? Note that this occurred also in the mailgw interface, maybe when processing more than one mail this can occur? |
|||
msg7695 | Author: [hidden] (rouilj) | Date: 2022-12-21 20:01 | |
Hi Ralf: In message <1671647765.44.0.499935488614.issue2551250@roundup.psfhosted.org>, Ralf Schlatterbeck writes: >I've recently moved a production tracker from python2 to python3. >Now I'm getting *occasionally* a Traceback > >TypeError: '<' not supported between instances of 'function' and 'function' > >when sorting detectors (auditors and reactors). >These are using a PrioList object which sorts on demand before >iterating over auditors and/or reactors. [...] >The objects in the list are 3-tuples with prio, name, function. >This should never try to look at the function unless we have two >entries with the same prio (the default is 100 and this is seldomly >set explicitly so they may all be the same) and name. I would expect the prio to be the same but not the name. Could this be a string vs bytestring issue on the name? Are any of the names in German? >Since this is quite urgent for me I'll try to fix this by allowing a >key function to be passed to the PrioList. This would then be >something like "lambda x: x[:2]" so that we never compare function >objects. This is probably the right thing to do. Agreed. I wonder if this may be an issue in other places we use Priolist. Can you dump the list it is sorting as well so we have a clue where the first two elements of the tuple are identical. This data dump would also serve as a test case for PrioList. >What I find very confusing is that this should happen everytime when >sorting the detectors. It doesn't. To make this happen the name also has to compare the same right? So I would not expect it to happen. Any idea how the name is ending up the same? Maybe something is being doubly added under certain circumstances? I seem to recall some changes to python internals regarding iteration orders for dicts, but it shouldn't affect tuples (i.e. the function gets sorted before the prio/name). |
|||
msg7696 | Author: [hidden] (schlatterbeck) | Date: 2022-12-22 07:43 | |
On Wed, Dec 21, 2022 at 08:01:46PM +0000, John Rouillard wrote: > I would expect the prio to be the same but not the name. Could this be > a string vs bytestring issue on the name? Are any of the names in > German? No, the names are function names in python, they're plain ascii. And I've checked, I don't have any duplicate names. > Agreed. I wonder if this may be an issue in other places we use > Priolist. Can you dump the list it is sorting as well so we have a > clue where the first two elements of the tuple are identical. I think the PrioList is currently only used for auditors/reactors. I think I implemented this a decade or two ago :-) And, yes, I've dumped the list here and found no duplicates. But then I also could not reproduce the error so far. > >What I find very confusing is that this should happen everytime when > >sorting the detectors. It doesn't. > > To make this happen the name also has to compare the same right? So I > would not expect it to happen. Any idea how the name is ending up the > same? Maybe something is being doubly added under certain > circumstances? Yes, the name would have to be the same, too. And I've checked, I do not have any reactors/auditors with duplicate names. So my current hypothesis is, too, that we get some duplication in some circumstances. > I seem to recall some changes to python internals regarding iteration > orders for dicts, but it shouldn't affect tuples (i.e. the function > gets sorted before the prio/name). Yes, and in our case we explicitly sort the list. So iteration order should be the same in every invocation. I've rolled out my change yesterday evening so I think I will not see the error anymore. But I'll try to find out if we get duplication somewhere. Ralf -- Dr. Ralf Schlatterbeck Tel: +43/2243/26465-16 Open Source Consulting www: www.runtux.com Reichergasse 131, A-3411 Weidling email: office@runtux.com |
|||
msg7703 | Author: [hidden] (rouilj) | Date: 2022-12-25 04:51 | |
Hi Ralf: In message <1671650263.38.0.386812066679.issue2551250@roundup.psfhosted.org>, Ralf Schlatterbeck writes: >I've pushed a fix that fixes the sorting issue should there ever >be two detectors for the same event with the same name and prio. Should two entries with the same name/prio even be allowed? Since list.sort() is stable, identical entries will preserve the original relative order of identical keys, but that's not documented anywhere that I can see. In fact we don't document that identically prioritied and named items can be added. Given that it used to cause a crash, I assume it's not being used. >But: In my tracker I do not have duplicate detector names. >In fact I'm currently unable to reproduce the problem. I assume you're trying to reproduce without the patch. Do you have any debugging statements in place that may perturb things? Also what version of Python are you running? I wonder if we should raise an exception when adding an element with the same name and priority as an existing element. Detecting this at the source should identify where the duplicates are coming from. This might expose another bug. I am concerned about the performance issue but we could add the key (priority, name) to a dict and just compare the key to determine membership. Ideally we would detect the issue at time of creation, but wrapping: self.list.sort() in try/except to capture the exception and dump self.list might provide some clues. I have been looking for changes between python 2 and python 3 that could trigger this but I got nothing. >Is it possible that under certain circumstances we add detectors twice? I suggested that as a possibility in my response to your initial report. I have no idea how that would happen (intermittently) though. >Note that this occurred also in the mailgw interface, maybe when >processing more than one mail this can occur? Maybe. I assume you are fetching email via imap from cron? Might: https://github.com/bamthomas/aioimaplib/blob/master/aioimaplib/tests/imapserver.py or https://github.com/antespi/docker-imap-devel or some other imap mock server be useful for testing? |
|||
msg7704 | Author: [hidden] (schlatterbeck) | Date: 2022-12-27 14:19 | |
On Sun, Dec 25, 2022 at 04:51:14AM +0000, John Rouillard wrote: > > In message <1671650263.38.0.386812066679.issue2551250@roundup.psfhosted.org>, > Ralf Schlatterbeck writes: > >I've pushed a fix that fixes the sorting issue should there ever > >be two detectors for the same event with the same name and prio. > > Should two entries with the same name/prio even be allowed? I don't see any good reason to specifically check for this. > Since list.sort() is stable, identical entries will preserve the > original relative order of identical keys, but that's not documented > anywhere that I can see. In fact we don't document that identically > prioritied and named items can be added. Given that it used to cause a > crash, I assume it's not being used. Yes I think this is a minor issue. In most use-cases you don't even assign an explicit prio to detectors. > >But: In my tracker I do not have duplicate detector names. > >In fact I'm currently unable to reproduce the problem. > > I assume you're trying to reproduce without the patch. Do you have > any debugging statements in place that may perturb things? Also what > version of Python are you running? This is 3.8 on the production machine. Note that this will not occur on python2 because there comparison of functions is allowed. I currently do not yet have debug code in place (I just wanted to get out a fix before christmas :-) Also note: I'm *newly* getting the same error message on a tracker that is running on python3 for quite some time (also 3.8 afair). So is it possible that recently something has been introduce that duplicates detectors? On the other hand this tracker is quite low-volume... so it may well be that we never hat multiple incoming emails in a single mailgw run yet. > I wonder if we should raise an exception when adding an element with > the same name and priority as an existing element. Detecting this at > the source should identify where the duplicates are coming from. This > might expose another bug. I am concerned about the performance issue > but we could add the key (priority, name) to a dict and just compare > the key to determine membership. or a set() > > Ideally we would detect the issue at time of creation, but wrapping: > > self.list.sort() > > in try/except to capture the exception and dump self.list might > provide some clues. I have been looking for changes between python 2 > and python 3 that could trigger this but I got nothing. I don't think there were changes when introducing py3, the comparison (for sorting) of functions works fine in python2 but not in python3: : ralf@bee:3; python3 >>> def f1 (): ... pass ... >>> def f2 (): ... pass ... >>> f1 < f2 Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: '<' not supported between instances of 'function' and 'function' : ralf@bee:4; python2 ... >>> f1 < f2 True > >Is it possible that under certain circumstances we add detectors twice? > > I suggested that as a possibility in my response to your initial > report. I have no idea how that would happen (intermittently) though. > >Note that this occurred also in the mailgw interface, maybe when > >processing more than one mail this can occur? > > Maybe. I assume you are fetching email via imap from cron? It might occur when processing more than one email in the mailgw if there is a bug that repeatedly adds detectors for each mail that is processed. That's my current hypothesis... And, yes, I'm using imap. > Might: > > https://github.com/bamthomas/aioimaplib/blob/master/aioimaplib/tests/imapserver.py > > or > > https://github.com/antespi/docker-imap-devel > > or some other imap mock server be useful for testing? Yes, unless we can reproduce it with some other mailgw method, it might not be related to imap. If we don't write each other, I wish you 'einen guten Rutsch ins Neue Jahr' (roughly translates as 'a good slide to the new year' :-) Ralf -- Dr. Ralf Schlatterbeck Tel: +43/2243/26465-16 Open Source Consulting www: www.runtux.com Reichergasse 131, A-3411 Weidling email: office@runtux.com |
|||
msg7705 | Author: [hidden] (schlatterbeck) | Date: 2022-12-27 15:33 | |
On Tue, Dec 27, 2022 at 02:19:33PM +0000, Ralf Schlatterbeck wrote: > Also note: I'm *newly* getting the same error message on a tracker that > is running on python3 for quite some time (also 3.8 afair). Turns out that was a different error (comparison between NoneType and float in one of my detectors). Ralf -- Dr. Ralf Schlatterbeck Tel: +43/2243/26465-16 Open Source Consulting www: www.runtux.com Reichergasse 131, A-3411 Weidling email: office@runtux.com |
|||
msg7710 | Author: [hidden] (rouilj) | Date: 2023-02-02 01:21 | |
Ralf any luck chasing this down? |
History | |||
---|---|---|---|
Date | User | Action | Args |
2023-02-02 01:21:43 | rouilj | set | status: new -> open messages: + msg7710 |
2022-12-27 15:33:18 | schlatterbeck | set | messages: + msg7705 |
2022-12-27 14:19:33 | schlatterbeck | set | messages: + msg7704 |
2022-12-25 04:51:14 | rouilj | set | messages: + msg7703 |
2022-12-22 07:43:38 | schlatterbeck | set | messages: + msg7696 |
2022-12-21 20:01:46 | rouilj | set | messages: + msg7695 |
2022-12-21 19:17:43 | schlatterbeck | set | messages: + msg7694 |
2022-12-21 18:36:05 | schlatterbeck | create |