Roundup Tracker - Issues

Message5125

Author schlatterbeck
Recipients ezio.melotti, rouilj, schlatterbeck
Date 2014-07-21.15:43:31
Message-id <20140721154327.GB8553@runtux.com>
In-reply-to <1405842264.52.0.339327978945.issue2550848@psf.upfronthosting.co.za>

Thanks John for creating the issue.

On Sun, Jul 20, 2014 at 07:44:24AM +0000, Ezio Melotti wrote:
> 
> FWIW we have been using this detector to force text/plain on (x)html
> documents:
> http://hg.python.org/tracker/python-dev/file/6f1b863bd1d8/detectors/no_texthtml.py
> 
> > Browsers seem to be interpreting *any* content-type without a '/' as 
> > html. This is actively used by spammers.
> 
> I'm not sure if our detector covers this case, but so far it's been
> working fine for us.

Your detector will *not* work for the case we were discussing. Browsers
accept any content-type without a '/' as html. And search engines will
happily index stuff as html. So as soon as someone sets the type to
'anything' or 'wrzlbrmft' your detector will fail and the file will be
interpreted as html.
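To illustrate the gap, here is a minimal sketch of a blacklist-style check. It is not the code of no_texthtml.py; the names BLOCKED_TYPES and blacklist_rewrite are hypothetical and only show why matching on known HTML types misses a type that has no '/'.

```python
# Hypothetical blacklist, in the spirit of a detector that forces
# text/plain on (x)html uploads. Not the actual no_texthtml.py code.
BLOCKED_TYPES = {"text/html", "application/xhtml+xml"}


def blacklist_rewrite(content_type):
    """Return text/plain for known HTML types, else the type unchanged."""
    if content_type.strip().lower() in BLOCKED_TYPES:
        return "text/plain"
    return content_type


# The known HTML type is caught by the blacklist.
caught = blacklist_rewrite("text/html")

# A made-up type without a '/' passes through untouched, so it would
# still be served with a content-type that browsers treat as html.
missed = blacklist_rewrite("wrzlbrmft")
```

Any blacklist has this problem: the set of strings without a '/' is unbounded, so it cannot be enumerated.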

Note that the current fix is very similar to your detector, and I didn't
know browsers were doing this until it came up here.

We are shipping text/html attachments as application/octet-stream unless
a config-option is set. This needs to be extended to content types that
don't contain a '/'. A whitelist feature would be even better
(configurable list of content-types that are *not* mangled when shipping
via the web server).
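A rough sketch of how such a whitelist could look at serving time follows. This is not Roundup's implementation: the function served_content_type, the SAFE_TYPES list and the idea of passing the list in as a parameter are all assumptions made for illustration. A real version would read the list from a config option.

```python
# Hypothetical whitelist of content types that are shipped unmodified.
# In a real tracker this would come from a configuration option.
SAFE_TYPES = frozenset({
    "text/plain",
    "image/png",
    "image/jpeg",
    "application/pdf",
})

FALLBACK = "application/octet-stream"


def served_content_type(stored_type, whitelist=SAFE_TYPES):
    """Pick the content type to send for a stored attachment.

    Anything without a '/' or not on the whitelist is mangled to
    application/octet-stream so browsers do not render it as html.
    """
    # Drop parameters such as "; charset=utf-8" and normalise case.
    mime = (stored_type or "").split(";", 1)[0].strip().lower()
    if "/" not in mime:
        return FALLBACK
    if mime not in whitelist:
        return FALLBACK
    return mime
```

With this approach 'text/html', 'anything' and 'wrzlbrmft' all end up as application/octet-stream, while 'text/plain' passes through. The sketch drops parameters like charset from the returned value; whether to keep them is a separate design decision.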

Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   http://www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
allmenda.com member                     email: rsc@allmenda.com
History
Date                 User           Action  Args
2014-07-21 15:43:32  schlatterbeck  set     recipients: + schlatterbeck, rouilj, ezio.melotti
2014-07-21 15:43:32  schlatterbeck  link    issue2550848 messages
2014-07-21 15:43:31  schlatterbeck  create