Roundup Tracker - Issues

Issue 2550998

classification
Title: Consider removing 8-bit character set support
Type: behavior
Severity: normal
Components: Web interface
Versions: devel

process
Status: new
Resolution:
Dependencies:
Superseder:
Assigned To:
Nosy List: joseph_myers, rouilj
Priority:
Keywords:

Created on 2018-09-02 19:11 by joseph_myers, last changed 2018-09-02 21:52 by rouilj.

Messages
msg6225 Author: [hidden] (joseph_myers) Date: 2018-09-02 19:11
doc/customizing.txt has a section "8-bit character set support in Web
interface" describing how a character set other than UTF-8 can be used
for sending web pages to the client, given an @charset form variable
which may result in a roundup_charset cookie being set.

The stated reason is "Unfortunately, some older browsers do not work
properly with utf-8-encoded pages (e.g. Netscape Navigator 4 displays
wrong characters in form fields).".  That text was added in 2004 - that
is, those browsers were "older" even then.  I think it would be
reasonable, in 2018, to say that those browsers are now completely
irrelevant and web pages sent to the client should always be sent in
UTF-8, which would simplify the code.

(I noticed this when fixing how actions returning binary data were
handled with Python 3; the re-encoding would break such actions for
Python 2, if anyone ever tried to use them together with the @charset
feature, which probably no-one ever did.)
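
The failure mode described in the parenthetical above can be sketched
in a few lines. This is only an illustration under stated assumptions,
not Roundup's actual code: recode_for_client is a hypothetical helper,
and the sketch is written for Python 3. It shows that recoding a UTF-8
text page to an 8-bit charset works, while the same step applied to
binary output fails because the bytes are not valid UTF-8.

```python
STORAGE_CHARSET = "utf-8"


def recode_for_client(payload: bytes, client_charset: str) -> bytes:
    """Hypothetical helper: recode a UTF-8 payload to the client's charset."""
    if client_charset.lower() == STORAGE_CHARSET:
        # Nothing to do when the client already expects UTF-8.
        return payload
    # Decode from the storage charset, then encode for the client.
    return payload.decode(STORAGE_CHARSET).encode(client_charset, "replace")


# A text page survives the round trip into an 8-bit charset.
text_page = "café".encode(STORAGE_CHARSET)
print(recode_for_client(text_page, "iso-8859-1"))

# Binary data (here, the start of a PNG file) is not valid UTF-8,
# so the decode step raises instead of producing usable output.
binary_payload = b"\x89PNG\r\n\x1a\n"
try:
    recode_for_client(binary_payload, "iso-8859-1")
except UnicodeDecodeError as exc:
    print("binary payload cannot be recoded:", exc)
```

Always sending UTF-8 removes the recoding step entirely, so binary
action output could be passed through untouched.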
msg6226 Author: [hidden] (rouilj) Date: 2018-09-02 21:25
Hi Joseph:

In message <1535915481.24.0.56676864532.issue2550998@psf.upfronthosting.co.za>,
Joseph Myers writes:
>New submission from Joseph Myers:
>
>doc/customizing.txt has a section "8-bit character set support in Web
>interface" describing how a character set other than UTF-8 can be used
>for sending web pages to the client, given an @charset form variable
>which may result in a roundup_charset cookie being set.

Does this mean that the code that supports roundup_charset would be
removed?

If so, is it possible that that could have another use?

If not, then I say remove it and the customizing.txt reference. There
will need to be a section in upgrading.txt that addresses how to
remove the customizing code in the tracker if it's in use.
msg6227 Author: [hidden] (joseph_myers) Date: 2018-09-02 21:47
On Sun, 2 Sep 2018, John Rouillard wrote:

> Does this mean that the code that supports roundup_charset would be
> removed?

Yes.  And all the code dealing with conversions when self.charset !=
self.STORAGE_CHARSET, in both client.py and actions.py.

> If so, is it possible that that could have another use?

I don't see a use for it.

As far as I can tell, unless the form specifies @charset, a 
roundup_charset cookie will never be set in the first place.  And unless 
either the form specifies @charset or there is a roundup_charset cookie, 
the client's charset attribute is set to self.STORAGE_CHARSET (i.e. 
'utf-8') in __init__ and never changed afterwards, so all the code dealing 
with recoding to/from another charset is dead in that case.  And since 
Roundup always generates forms in UTF-8 unless @charset or roundup_charset 
are used, and does not use the accept-charset attribute on forms, form 
responses will always be submitted in UTF-8 so no recoding on input is 
needed.
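
The reasoning above can be modelled roughly as follows. This is a
minimal sketch, not the code in client.py: select_charset and
needs_recoding are hypothetical names, the form and cookies are
simplified to plain dicts, and the precedence of @charset over the
cookie is an assumption. Only STORAGE_CHARSET, @charset and
roundup_charset come from the discussion.

```python
STORAGE_CHARSET = "utf-8"


def select_charset(form: dict, cookies: dict) -> str:
    """Hypothetical model of how the client charset is chosen."""
    # Assumption: an explicit @charset form variable takes precedence.
    if "@charset" in form:
        return form["@charset"]
    # Otherwise fall back to a previously set roundup_charset cookie.
    if "roundup_charset" in cookies:
        return cookies["roundup_charset"]
    # With neither present, the charset stays at the storage charset.
    return STORAGE_CHARSET


def needs_recoding(form: dict, cookies: dict) -> bool:
    """Recoding paths are reachable only if the charset differs."""
    return select_charset(form, cookies).lower() != STORAGE_CHARSET


# No @charset and no cookie: the recoding code is never reached.
print(needs_recoding({}, {}))

# Only an explicit opt-in makes the recoding paths live.
print(needs_recoding({"@charset": "koi8-r"}, {}))
print(needs_recoding({}, {"roundup_charset": "iso-8859-1"}))
```

Under this model, a tracker that never uses @charset never takes the
recoding branches, which is why they can be treated as dead code.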
msg6228 Author: [hidden] (rouilj) Date: 2018-09-02 21:52
In message <alpine.DEB.2.21.1809022133560.1285@digraph.polyomino.org.uk>,
Joseph Myers writes:
>On Sun, 2 Sep 2018, John Rouillard wrote:
>> If so, is it possible that that could have another use?
>
>I don't see a use for it.
>
>As far as I can tell, unless the form specifies @charset, a 
>roundup_charset cookie will never be set in the first place.  And unless 
>either the form specifies @charset or there is a roundup_charset cookie, 
>the client's charset attribute is set to self.STORAGE_CHARSET (i.e. 
>'utf-8') in __init__ and never changed afterwards, so all the code dealing 
>with recoding to/from another charset is dead in that case.  And since 
>Roundup always generates forms in UTF-8 unless @charset or roundup_charset 
>are used, and does not use the accept-charset attribute on forms, form 
>responses will always be submitted in UTF-8 so no recoding on input is 
>needed.

Works for me. Nuke it.
History
Date                 User          Action  Args
2018-09-02 21:52:40  rouilj        set     messages: + msg6228
2018-09-02 21:47:05  joseph_myers  set     messages: + msg6227
2018-09-02 21:25:02  rouilj        set     nosy: + rouilj; messages: + msg6226
2018-09-02 19:11:21  joseph_myers  create