Roundup Tracker - Issues

Issue 2550848

classification
HTML attachments should not be served as text/html
Type: security Severity: normal
Components: Web interface Versions: 1.5
process
Status: fixed Resolution: fixed
Nosy list: ber, ezio.melotti, rouilj, schlatterbeck, techtonik
Priority: urgent

Created on 2014-07-20 03:37 by rouilj, last changed 2015-12-02 20:10 by ber.

Files
File name Uploaded Description
mimetypes.log.txt techtonik, 2015-01-17 18:15
Messages
msg5120 Author: [hidden] (rouilj) Date: 2014-07-20 03:37
From the mailing list, opening an issue as requested by Ralf.

- HTML attachments should not be served as text/html, see discussion on
  roundup-users under the title "Spam attack, observations, how to
  repair". I've committed a partial fix but this needs more work:

  Browsers seem to be interpreting *any* content-type without a '/' as
  html. This is actively used by spammers. Since we don't currently have
  nofollow set for attachments, search engines happily index these
  pages.

  There is no issue yet, if someone has time and wants to contribute,
  making an issue for this would be a welcome contribution. If you do
  this, please set me on the nosy list.

The email thread Ralf referred to is at:

http://permalink.gmane.org/gmane.comp.bug-tracking.roundup.user/10733
msg5123 Author: [hidden] (ezio.melotti) Date: 2014-07-20 07:44
FWIW we have been using this detector to force text/plain on (x)html
documents:
http://hg.python.org/tracker/python-dev/file/6f1b863bd1d8/detectors/no_texthtml.py

> Browsers seem to be interpreting *any* content-type without a '/' as 
> html. This is actively used by spammers.

I'm not sure if our detector covers this case, but so far it's been
working fine for us.
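
For readers unfamiliar with Roundup detectors, an auditor of this kind can be sketched roughly as follows. This is only an illustration, not the python-dev detector linked above: the function name `force_plain_text` and the contents of `HTML_TYPES` are assumptions. Only the `(db, cl, nodeid, newvalues)` auditor signature and the `audit()` registration in `init()` follow Roundup's detector interface.

```python
# Hypothetical sketch of a Roundup auditor; NOT the linked no_texthtml.py.
HTML_TYPES = {"text/html", "application/xhtml+xml"}  # assumed list of types


def force_plain_text(db, cl, nodeid, newvalues):
    """Rewrite (x)html content types to text/plain before the file is stored."""
    mime_type = newvalues.get("type")
    if mime_type and mime_type.strip().lower() in HTML_TYPES:
        newvalues["type"] = "text/plain"


def init(db):
    # Run the auditor whenever a file is created or its properties are changed.
    db.file.audit("create", force_plain_text)
    db.file.audit("set", force_plain_text)
```

A check like this only matches the types it lists, so an arbitrary type such as 'wrzlbrmft' passes through unchanged.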
msg5125 Author: [hidden] (schlatterbeck) Date: 2014-07-21 15:43
Thanks John for creating the issue.

On Sun, Jul 20, 2014 at 07:44:24AM +0000, Ezio Melotti wrote:
> 
> FWIW we have been using this detector to force text/plain on (x)html
> documents:
> http://hg.python.org/tracker/python-dev/file/6f1b863bd1d8/detectors/no_texthtml.py
> 
> > Browsers seem to be interpreting *any* content-type without a '/' as 
> > html. This is actively used by spammers.
> 
> I'm not sure if our detector covers this case, but so far it's been
> working fine for us.

Your detector will *not* work for the case we were discussing. Browsers
accept any content-type without a '/' as html. And search engines will
happily index stuff as html. So as soon as someone sets the type to
'anything' or 'wrzlbrmft' your detector will fail and the file will be
interpreted as html.

Note that the current fix is very similar to your detector and I didn't
know browsers are doing this until it came up here.

We are shipping text/html attachments as application/octet-stream unless
a config-option is set. This needs to be extended to content types that
don't contain a '/'. A whitelist feature would be even better
(configurable list of content-types that are *not* mangled when shipping
via the web server).

Ralf
-- 
Dr. Ralf Schlatterbeck                  Tel:   +43/2243/26465-16
Open Source Consulting                  www:   http://www.runtux.com
Reichergasse 131, A-3411 Weidling       email: office@runtux.com
allmenda.com member                     email: rsc@allmenda.com
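
The serving-side idea described in the message above (anything not explicitly allowed goes out as application/octet-stream, including types without a '/') could look roughly like the sketch below. The names `SAFE_TYPES`, `FALLBACK_TYPE`, and `served_content_type` are hypothetical, and the whitelist contents are placeholders rather than the list Roundup ships.

```python
# Hypothetical sketch; whitelist contents and all names are assumptions.
SAFE_TYPES = {"text/plain", "image/png", "image/jpeg", "image/gif"}
FALLBACK_TYPE = "application/octet-stream"


def served_content_type(stored_type, whitelist=SAFE_TYPES):
    """Return the Content-Type value to send for a stored attachment."""
    # Drop any parameters (e.g. '; charset=utf-8') and normalize case.
    normalized = (stored_type or "").split(";", 1)[0].strip().lower()
    # Types without a '/' (e.g. 'wrzlbrmft') are never on the whitelist,
    # so they end up with the download-only fallback like text/html does.
    if normalized in whitelist:
        return normalized
    return FALLBACK_TYPE
```

Because the decision is "allowed or fallback", no separate check for a missing '/' is needed; a blacklist of known HTML types would have to anticipate every such variant.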
msg5130 Author: [hidden] (ber) Date: 2014-08-05 13:10
This is a release blocker.
msg5179 Author: [hidden] (ber) Date: 2015-01-05 15:50
There were discussions about this in December 2014 on the devel mailing list.

Ralf wrote in the end:
"""
So we should
- check for valid mime-types on incoming attachments (either via
  web-interface or via mail)
  Can be realized as an auditor so that users can change the policy
  here. We should only rewrite clearly invalid mime-types at that point.
- have a whitelist of attachments that can safely be shipped to the
  browser. All mime-types not in the whitelist are shipped as
  application/octet-stream. My tests indicate that browsers will not
  display these attachments with this content-type, they only offer to
  download the file. The original code by Richard attempted this but
  failed on invalid mime-types for reasons indicated above.

I think the hardest part is coming up with a decent whitelist that
doesn't miss too many content-types in use out there.
But users can reconfigure the whitelist (and give feedback) so we can
converge to something usable.
"""

Should we make separate issues out of this?
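
The first point in the quoted plan (an auditor that rewrites only clearly invalid mime-types on incoming attachments) might be sketched like this. It is an illustration under assumptions: the regular expression is loosely modelled on the "type/subtype" name syntax of RFC 6838, and the names `is_valid_mime_type` and `audit_mime_type` are hypothetical, not existing Roundup functions.

```python
import re

# One name component: starts with a letter or digit, at most 127 characters.
_NAME = r"[A-Za-z0-9][A-Za-z0-9!#$&^_.+-]{0,126}"
_MIME_RE = re.compile(_NAME + "/" + _NAME)


def is_valid_mime_type(value):
    """Return True if value looks like 'type/subtype' (parameters ignored)."""
    if not isinstance(value, str):
        return False
    base = value.split(";", 1)[0].strip()
    return _MIME_RE.fullmatch(base) is not None


def audit_mime_type(db, cl, nodeid, newvalues):
    """Hypothetical auditor: replace syntactically invalid types only."""
    if "type" in newvalues and not is_valid_mime_type(newvalues["type"]):
        newvalues["type"] = "application/octet-stream"


def init(db):
    db.file.audit("create", audit_mime_type)
    db.file.audit("set", audit_mime_type)
```

Syntactically valid types such as text/html are left alone at this stage; deciding what is safe to show in a browser is the job of the whitelist applied when the file is served.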
msg5185 Author: [hidden] (techtonik) Date: 2015-01-17 12:26
On Sun, Jul 20, 2014 at 6:37 AM, John Rouillard
<issues@roundup-tracker.org> wrote:
>
> I've committed a partial fix but this needs more work:

Where is the commit(s) for the history?

>   Browsers seem to be interpreting *any* content-type without a '/' as
>   html.

Browser Security Handbook confirms this:
https://code.google.com/p/browsersec/wiki/Part2#Survey_of_content_sniffing_behaviors
msg5186 Author: [hidden] (rouilj) Date: 2015-01-17 15:24
Hi Anatoly:

In message <CAPkN8xJ91nCz_OW-Z3mo4-u2-Q3cuQBHFcETJ5BrwPs-xv406A@mail.gmail.com>,
anatoly techtonik writes:
>On Sun, Jul 20, 2014 at 6:37 AM, John Rouillard
><issues@roundup-tracker.org> wrote:
>>
>> I've committed a partial fix but this needs more work:
>
>Where is the commit(s) for the history?

I opened the issue on Jul 20, 2014 at Ralf's request.

So the commit comment originated from Ralf, not me. Maybe checking the
Mercurial log for his commits, rather than looking for my name, will
turn up the commit?
msg5187 Author: [hidden] (techtonik) Date: 2015-01-17 17:34
>>> I've committed a partial fix but this needs more work:
>>Where is the commit(s) for the history?

Found it. 48d93e98be7b or
http://sourceforge.net/p/roundup/code/ci/48d93e98be7b3428785e1087495be7ec2ee81512/

Committing a better fix now.
msg5188 Author: [hidden] (techtonik) Date: 2015-01-17 18:15
Commit 63c31b18b955 fixes this issue:
http://sourceforge.net/p/roundup/code/ci/63c31b18b95593865fd8bbd932b0030d0e2110be/

It adds a whitelist composed from analysis of the attached file and
https://mail.python.org/pipermail/tracker-discuss/2015-January/003988.html

The whitelist is hardcoded because adding another config option would require
a major version bump: it would lead to the removal of allow_html_file, new
upgrading.txt docs and so on. Given the limited time that we all have, I don't
want to delay a release.
msg5394 Author: [hidden] (ber) Date: 2015-12-02 20:10
Creating issue2550897 for tracking a better solution,
closing the urgent issue here, because it seems to be resolved.
(Testing reports appreciated.)
History
Date                 User           Action  Args
2015-12-02 20:10:03  ber            set     status: new -> fixed; resolution: fixed; messages: + msg5394
2015-01-17 18:15:02  techtonik      set     files: + mimetypes.log.txt; messages: + msg5188
2015-01-17 17:34:38  techtonik      set     messages: + msg5187
2015-01-17 15:24:21  rouilj         set     messages: + msg5186
2015-01-17 12:26:52  techtonik      set     nosy: + techtonik; messages: + msg5185
2015-01-05 15:52:44  ber            link    issue2550863 dependencies
2015-01-05 15:50:38  ber            set     messages: + msg5179
2014-12-18 14:44:43  techtonik      set     priority: high -> urgent
2014-08-05 13:10:34  ber            set     messages: + msg5130
2014-08-05 13:04:08  ber            set     nosy: + ber
2014-07-21 15:43:32  schlatterbeck  set     messages: + msg5125
2014-07-20 07:44:24  ezio.melotti   set     nosy: + ezio.melotti; messages: + msg5123
2014-07-20 03:37:18  rouilj         create