Issue 2551092
Created on 2020-10-08 14:51 by ced, last changed 2020-10-27 00:59 by rouilj.
msg6967 |
Author: [hidden] (ced) |
Date: 2020-10-08 14:51 |
|
I got this traceback on an email whose subject uses Q encoding:
Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/roundup/mailgw.py", line 1511, in handle_Message
    return self.handle_message(message)
  File "/usr/lib/python3.7/site-packages/roundup/mailgw.py", line 1583, in handle_message
    return self._handle_message(message)
  File "/usr/lib/python3.7/site-packages/roundup/mailgw.py", line 1594, in _handle_message
    self.parsed_message = self.parsed_message_class(self, message)
  File "/usr/lib/python3.7/site-packages/roundup/mailgw.py", line 509, in __init__
    self.subject = message.get_header('subject', '')
  File "/usr/lib/python3.7/site-packages/roundup/mailgw.py", line 252, in get_header
    return self._decode_header(value.replace('\n', ''))
  File "/usr/lib/python3.7/site-packages/roundup/mailgw.py", line 219, in _decode_header
    for part, encoding in decode_header(hdr):
  File "/usr/lib/python3.7/site-packages/roundup/anypy/email_.py", line 124, in decode_header
    last_word += word
TypeError: can only concatenate str (not "bytes") to str
I found that roundup.anypy.email_.decode_header, which is a copy of the stdlib email.header.decode_header, is missing the line that converts word into bytes if it is a str (on Python 3).
Attached is a patch that applies the change only for Python 3.
I also think the fix for issue2551008 could be removed, since decode_header will now always return bytes.
It would be even better to remove roundup.anypy.email_ completely once Python 2 is no longer supported (issue2550879).
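For illustration, here is a minimal sketch of the failure mode. It is not the attached patch. It assumes the collapsing loop in roundup.anypy.email_.decode_header mirrors the stdlib one, and the names collapse and convert are hypothetical, used only in this sketch. In the stdlib, a Q-encoded word decodes to str while a B-encoded word decodes to bytes, so two adjacent words in the same charset can only be joined if the str is converted first:

```python
def collapse(words, convert):
    """Join decoded words roughly the way the tail of decode_header does.

    `collapse` and `convert` are hypothetical names for this sketch only.
    """
    last_word = None
    for word in words:
        if convert and isinstance(word, str):
            # The step reported as missing: normalise str to bytes.
            word = bytes(word, "raw-unicode-escape")
        if last_word is None:
            last_word = word
        else:
            last_word += word
    return last_word


# A Q-decoded word (str) followed by a B-decoded word (bytes), same charset.
words = ["test encodi", b"ng \xc3\xa9"]

try:
    collapse(words, convert=False)
    failed = False
except TypeError:
    failed = True  # str += bytes is rejected on Python 3

fixed = collapse(words, convert=True)
print(failed, fixed.decode("utf-8"))
```

With the conversion in place every word is bytes before it is appended, which is why the result can be decoded with the charset afterwards.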
|
msg6968 |
Author: [hidden] (rouilj) |
Date: 2020-10-10 04:36 |
|
Hi Cédric:
Do you have a test for this? It looks like the right place for a
test case is:
class HeaderRoundupMessageTests(TestCase):
in test/test_mailgw_roundupmessage.py.
I spent about half an hour trying to put a test together from
your description, but I couldn't get a quoted-printable subject
encoding to fail.
Re rolling back the patch for issue2551008: it doesn't look like
we have any valid tests for that case, so I am uncomfortable
with reverting that patch. IIUC that patch and this patch are
compatible, so nothing bad would happen if it is not rolled back.
-- rouilj
|
msg6975 |
Author: [hidden] (rouilj) |
Date: 2020-10-22 16:01 |
|
Ping Cédric: do you have an example subject line (in python format)
I can use in testing?
|
msg6977 |
Author: [hidden] (ced) |
Date: 2020-10-22 16:37 |
|
I no longer have the email that was breaking it.
But from memory it was a subject with a non-ASCII character like "é".
|
msg6978 |
Author: [hidden] (rouilj) |
Date: 2020-10-22 18:24 |
|
Hi =?utf-8?q?C=C3=A9dric_Krier?=: (hope the encoding works correctly.)
In message <1603384621.38.0.248728051864.issue2551092@roundup.psfhosted.org>,
=?utf-8?q?C=C3=A9dric_Krier?= writes:
>I no longer have the email that was breaking it.
>But from memory it was a subject with a non-ASCII character like "é".
Do you remember if it was a literal utf-8 character or an encoded
entity in the subject? For example, in the tests I have:
Subject: [issue] Test (=?utf-8?b?w4TDlsOc?=) umlauts
and that passes fine. isinstance(word, str) is False because word (under
Python 3) is a bytes object (b''), not a normal string.
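A quick check of why that subject passes, as a sketch using only the stdlib (the assumption being that roundup's copy decodes individual words the same way). Base64 decoding yields bytes while quoted-printable header decoding yields str. In this subject the single encoded word also sits between plain-text chunks whose charset is None, so no two chunks share a charset and nothing is concatenated:

```python
import email.base64mime
import email.quoprimime
from email.header import decode_header

# B-encoded payloads come back as bytes ...
b_word = email.base64mime.decode("w4TDlsOc")
# ... while Q-encoded payloads come back as str (one code point per byte).
q_word = email.quoprimime.header_decode("C=C3=A9dric_Krier")

print(type(b_word).__name__, type(q_word).__name__)

# The stdlib decode_header keeps the three chunks separate because the
# charset changes at each boundary (None, 'utf-8', None).
chunks = decode_header("[issue] Test (=?utf-8?b?w4TDlsOc?=) umlauts")
for part, charset in chunks:
    print(part, charset)
```

A subject needs two adjacent encoded words in the same charset, one Q and one B, before the str and bytes values meet in a concatenation.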
|
msg6980 |
Author: [hidden] (ced) |
Date: 2020-10-22 19:23 |
|
On 2020-10-22 18:24, John Rouillard wrote:
> Hi =?utf-8?q?C=C3=A9dric_Krier?=: (hope the encoding works correctly.)
>
> In message <1603384621.38.0.248728051864.issue2551092@roundup.psfhosted.org>,
> =?utf-8?q?C=C3=A9dric_Krier?= writes:
> >I no longer have the email that was breaking it.
> >But from memory it was a subject with a non-ASCII character like "é".
>
> Do you remember if it was a literal utf-8 character or an encoded
> entity in the subject?
I do not remember, but I know it was sent via mutt (like this email).
|
msg6981 |
Author: [hidden] (ced) |
Date: 2020-10-22 19:26 |
|
On 2020-10-22 18:24, John Rouillard wrote:
> In message <1603384621.38.0.248728051864.issue2551092@roundup.psfhosted.org>,
> =?utf-8?q?C=C3=A9dric_Krier?= writes:
> >I no longer have the email that was breaking it.
> >But from memory it was a subject with a non-ASCII character like "é".
>
> Do you remember if it was a literal utf-8 character or an encoded
> entity in the subject? For example, in the tests I have:
>
> Subject: [issue] Test (=?utf-8?b?w4TDlsOc?=) umlauts
It looked more like:
test_encodi?==?utf-8?B?bmcgw6k=?=
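A test along these lines might exercise a subject of that shape. This is only a sketch under stated assumptions: the leading "=?utf-8?q?" is a guess at the part missing from the fragment above, the class name is hypothetical, and the stdlib decode_header stands in for roundup.anypy.email_.decode_header, which a real test would call through the mail gateway instead:

```python
import unittest
from email.header import decode_header  # stand-in for roundup's copy


class MixedEncodedWordTest(unittest.TestCase):  # hypothetical test class
    def test_q_word_followed_by_b_word(self):
        # Assumed subject: a Q-encoded word immediately followed by a
        # B-encoded word in the same charset.
        subject = "=?utf-8?q?test_encodi?==?utf-8?B?bmcgw6k=?="

        parts = decode_header(subject)

        # Both words share a charset, so they collapse into one bytes chunk.
        self.assertEqual(parts, [(b"test encoding \xc3\xa9", "utf-8")])
        text = "".join(p.decode(cs or "ascii") for p, cs in parts)
        self.assertEqual(text, "test encoding \u00e9")


if __name__ == "__main__":
    unittest.main()
```

Against a decode_header without the str to bytes conversion, the same input would be expected to hit the concatenation from the traceback rather than return a value.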
|
msg7001 |
Author: [hidden] (rouilj) |
Date: 2020-10-27 00:46 |
|
Applied suggested patch in: rev 6278:f21ec1414591
I wasn't able to come up with a test case that exercised it.
|
Date | User | Action | Args |
2020-10-27 00:59:48 | rouilj | set | resolution: duplicate -> fixed |
2020-10-27 00:46:12 | rouilj | set | status: open -> fixed; resolution: duplicate; messages: + msg7001 |
2020-10-22 19:26:02 | ced | set | messages: + msg6981; title: TypeError: can only concatenate str (not "bytes") to str test encoding é -> TypeError: can only concatenate str (not "bytes") to str |
2020-10-22 19:23:31 | ced | set | messages: + msg6980; title: TypeError: can only concatenate str (not "bytes") to str -> TypeError: can only concatenate str (not "bytes") to str test encoding é |
2020-10-22 18:24:26 | rouilj | set | messages: + msg6978 |
2020-10-22 16:37:01 | ced | set | messages: + msg6977 |
2020-10-22 16:01:33 | rouilj | set | messages: + msg6975 |
2020-10-10 04:36:17 | rouilj | set | priority: high; assignee: rouilj; status: new -> open; messages: + msg6968; nosy: + rouilj |
2020-10-08 14:51:36 | ced | create | |