Roundup Tracker - Issues

Issue 2551008

classification
Title: Wrong header encoding handling in mailgw.py
Type: behavior
Severity: normal
Components: Mail interface
Versions: devel, 1.6, 1.5
process
Status: new
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: ezio.melotti, joseph_myers
Priority: normal
Keywords: Effort-Low

Created on 2018-10-10 01:07 by ezio.melotti, last changed 2018-10-28 19:40 by joseph_myers.

Messages
msg6276 Author: [hidden] (ezio.melotti) Date: 2018-10-10 01:07
RoundupMessage._decode_header (formerly known as
Message._decode_header_to_utf8) is incorrect when the encoding is missing:

    def _decode_header(self, hdr):
        parts = []
        for part, encoding in decode_header(hdr):
            if encoding:
                part = part.decode(encoding)
            parts.append(part)
        return ''.join([u2s(p) for p in parts])

If the encoding is specified, the part is decoded to a unicode string;
if it isn't, the part stays a byte string.  In the latter case, u2s()
will fail to encode the byte string on Python 2 if it contains non-ASCII
characters, and it will always fail on Python 3, since byte strings
don't have an .encode() method.
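
To see the two kinds of parts described above, here is a small sketch that only inspects what decode_header() returns. It assumes Python 3 and uses two made-up sample headers; it does not touch Roundup itself.

```python
from email.header import decode_header

# Hypothetical sample headers, for illustration only.
plain = decode_header('plain subject')
mixed = decode_header('=?utf-8?q?caf=C3=A9?= plain')

for label, pairs in (('plain', plain), ('mixed', mixed)):
    for part, encoding in pairs:
        # Show the Python type of each part and its declared charset.
        print(label, type(part).__name__, repr(part), encoding)
```

On Python 3 the header with an encoded word yields byte-string parts, and the trailing unencoded part comes back with an encoding of None, which is the case the current code leaves undecoded.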

I fixed this downstream by attempting the decoding using utf-8 first and
falling back on iso-8859-1 if that fails:
* https://hg.python.org/tracker/roundup/rev/d7454b42b914
* http://psf.upfronthosting.co.za/roundup/meta/issue668

The code on 1.5 is slightly different, but the logic is the same.
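
For readers who don't want to follow the links, the utf-8-then-iso-8859-1 fallback described above can be sketched roughly as follows. This is not the downstream patch itself: the names part_to_text and decode_header_lenient are hypothetical, the sketch assumes Python 3, and it returns text directly instead of going through u2s().

```python
from email.header import decode_header


def part_to_text(part, encoding):
    """Turn one (part, encoding) pair from decode_header() into text."""
    if not isinstance(part, bytes):
        # Python 3 returns str when the header has no encoded words at all.
        return part
    if encoding:
        # A charset was declared, so trust it.
        return part.decode(encoding)
    try:
        # No declared charset: guess utf-8 first.
        return part.decode('utf-8')
    except UnicodeDecodeError:
        # iso-8859-1 maps every byte to a code point, so this cannot fail.
        return part.decode('iso-8859-1')


def decode_header_lenient(hdr):
    """Decode a header value to text, guessing when the charset is missing."""
    return ''.join(part_to_text(p, e) for p, e in decode_header(hdr))
```

The iso-8859-1 fallback never raises, so undecodable input ends up as possibly wrong characters rather than an exception; whether that trade-off is acceptable is a design choice for the mail gateway.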
msg6295 Author: [hidden] (joseph_myers) Date: 2018-10-28 19:40
I think this fix is appropriate to apply (a test case would be nice to
have in the test suite, but I don't know how hard that is to write).
History
Date                 User          Action  Args
2018-10-28 19:40:44  joseph_myers  set     nosy: + joseph_myers; messages: + msg6295
2018-10-10 01:07:54  ezio.melotti  create