Roundup Tracker - Issues

Message6504

Author rouilj
Recipients ezio.melotti, joseph_myers, rouilj
Date 2019-06-02.21:16:23
Message-id <1559510183.72.0.583721604963.issue2551008@roundup.psfhosted.org>
In-reply-to
Additional info from offline discussion with Ezio:

Crash is:

Traceback (most recent call last):
  File "/home/roundup/lib/python2.6/site-packages/roundup/mailgw.py", line 1519, in handle_Message
    return self.handle_message(message)
  File "/home/roundup/lib/python2.6/site-packages/roundup/mailgw.py", line 1590, in handle_message
    return self._handle_message(message)
  File "/home/roundup/lib/python2.6/site-packages/roundup/mailgw.py", line 1601, in _handle_message
    self.parsed_message = self.parsed_message_class(self, message)
  File "/home/roundup/lib/python2.6/site-packages/roundup/mailgw.py", line 555, in __init__
    self.subject = message.getheader('subject', '')
  File "/home/roundup/lib/python2.6/site-packages/roundup/mailgw.py", line 285, in getheader
    return self._decode_header_to_utf8(hdr)
  File "/home/roundup/lib/python2.6/site-packages/roundup/mailgw.py", line 273, in _decode_header_to_utf8
    return ''.join([s.encode('utf-8') for s in l])
UnicodeDecodeError: 'ascii' codec can't decode byte 0x85 in position 33: ordinal not in range(128)

s is a non-ASCII byte string. In py2, when you try to encode a byte
string, it first tries to automatically decode it using the default
encoding (ascii) in order to get unicode, and then tries to encode the
unicode string using the specified encoding. This fails for non-ASCII
byte strings.
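
A minimal sketch of that two-step behaviour, written for Python 3 so the
implicit step is spelled out. The helper name py2_style_encode is
hypothetical and is not part of Roundup; it only emulates what py2 does
behind the scenes when .encode() is called on a byte string:

```python
# Emulates Python 2's behaviour for byte_string.encode('utf-8'):
# an implicit decode with the default codec (ascii), then the encode.
# py2_style_encode is a hypothetical helper, not Roundup code.

def py2_style_encode(raw: bytes, target: str = "utf-8") -> bytes:
    """Decode with ascii (the implicit py2 step), then encode to target."""
    text = raw.decode("ascii")  # raises UnicodeDecodeError on bytes >= 0x80
    return text.encode(target)


ascii_subject = b"plain subject"
non_ascii_subject = "ßðèé".encode("utf-8")

# ASCII-only bytes survive the round trip unchanged.
print(py2_style_encode(ascii_subject))

# Non-ASCII bytes fail in the implicit decode, as in the traceback above.
try:
    py2_style_encode(non_ascii_subject)
except UnicodeDecodeError as exc:
    print("decode step failed:", exc.reason)
```

The error comes from the decode step, not from the utf-8 encode that the
code asked for, which is why the traceback names the 'ascii' codec.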

Failure may be provoked by setting subject to:

  'ßðèé'.encode('utf-8')

Patch alternate URL:

  https://bitbucket.org/python/roundup/commits/d7454b42b914a69e6d1e1de99fe79fa6c8d6d982


At line 273, all the parts in l are encoded. In order to be encoded,
they must all be unicode. However, at line 266 a part is decoded only
if the encoding is specified (i.e. if the condition of the if is true);
otherwise it is left as it is. In order to be decoded at line 266, that
part must be bytes, so if the condition is false, the part is appended
to l as bytes, and the encoding at line 273 fails.

Also note that on Python 2 this might work as long as the part is
ASCII-only, due to the implicit conversion between str and unicode, but
on Python 3, line 273 will always fail if the encoding at line 265 is
not specified.
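
One possible way to avoid the mixed bytes/unicode list is to decode every
part before joining, with a fallback for parts that carry no charset.
This is only a sketch and not the patch in the commit linked above: the
function name join_header_parts, the (value, charset) input shape, and
the iso-8859-1 fallback are all assumptions made for illustration.

```python
# Hypothetical sketch, not the Roundup implementation: decode every
# header part to text first, then encode the joined result once.

def join_header_parts(parts, fallback="iso-8859-1"):
    """Join (value, charset) pairs into a single UTF-8 byte string.

    parts    -- list of (value, charset) pairs; charset may be None
    fallback -- assumed codec for parts with no (or an unknown) charset
    """
    decoded = []
    for value, charset in parts:
        if isinstance(value, bytes):
            try:
                # Use the declared charset when there is one.
                value = value.decode(charset or fallback, "replace")
            except LookupError:
                # Unknown charset name: fall back instead of crashing.
                value = value.decode(fallback, "replace")
        decoded.append(value)
    # Every part is text now, so the final encode cannot hit the
    # implicit ascii decode (py2) or a missing bytes.encode (py3).
    return "".join(decoded).encode("utf-8")


parts = [
    (b"Re: ", None),                      # no charset, ASCII only
    ("ßðèé".encode("utf-8"), "utf-8"),    # charset declared
    (b" caf\xe9", None),                  # no charset, non-ASCII byte
]
print(join_header_parts(parts).decode("utf-8"))
```

iso-8859-1 is used as the fallback here only because it maps every byte
to a character, so the decode cannot fail; whether that is the right
guess for an undeclared charset is a separate design question.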
History
Date User Action Args
2019-06-02 21:16:23 rouilj set messageid: <1559510183.72.0.583721604963.issue2551008@roundup.psfhosted.org>
2019-06-02 21:16:23 rouilj set recipients: + rouilj, ezio.melotti, joseph_myers
2019-06-02 21:16:23 rouilj link issue2551008 messages
2019-06-02 21:16:23 rouilj create