Roundup Tracker - Issues

Issue 562686

classification
email attachments from outlook express
Type: Severity: normal
Components: Mail interface Versions:
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: richard Nosy List: anthonybaxter, dirckb, richard
Priority: normal Keywords:

Created on 2002-05-31 00:59 by dirckb, last changed 2002-07-22 22:02 by richard.

Files
File name Uploaded Description
diff.txt dirckb, 2002-05-31 01:02 patch for mailgw.py
Messages
msg241 Author: [hidden] (dirckb) Date: 2002-05-31 00:59
If "rich text" formatted email is sent to the email 
gateway from outlook express, and the email also has 
attachments, the actual text of the message is dropped.

Outlook express does something like this:

Content-Type: multipart/mixed;
        Content-Type: multipart/alternative;
                Content-Type: text/plain;
                Content-Type: text/html;
        Content-Type: 4whatevers/attached;
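
For reference, here is a minimal sketch that rebuilds this layout with Python's standard `email` package (not the 0.4.1 mailgw `Message` class) and lists the nesting. The subject, the bodies, and the `blob.bin` attachment are made-up placeholders.

```python
from email.message import EmailMessage


def build_outlook_style_message():
    """Build a message shaped like the Outlook Express layout above."""
    msg = EmailMessage()
    msg["Subject"] = "example"  # placeholder header
    msg.set_content("plain body\n")  # starts out as text/plain
    # Adding an alternative wraps the body in multipart/alternative.
    msg.add_alternative("<p>html body</p>\n", subtype="html")
    # Adding an attachment wraps everything in multipart/mixed.
    msg.add_attachment(b"\x00\x01", maintype="application",
                       subtype="octet-stream", filename="blob.bin")
    return msg


def outline(part, depth=0):
    """Return one indented content-type line per MIME part."""
    lines = ["    " * depth + part.get_content_type()]
    if part.is_multipart():
        for sub in part.iter_parts():
            lines.extend(outline(sub, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(outline(build_outlook_style_message())))
```

The text/plain body sits two levels down, which is why a gateway that only inspects the top-level parts never sees it.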

Here is a diff for the patch
===================
.\..\ru0.4.1\roundup\mailgw.py
--- mailgw.py   Thu Mar 14 23:59:24 2002
+++ ..\..\ru0.4.1\roundup\mailgw.py     Fri May 31 00:09:10 2002
@@ -547,6 +547,23 @@
                     break
                 # parse it
                 subtype = part.gettype()
+
+                # outlook express patch
+                # if multipart/alternative,
+                #   get the first text/plain
+                if subtype == 'multipart/alternative':
+                    # skip over the intro to the first boundary
+                    p = part.getPart()
+                    while 1:
+                        p = part.getPart()
+                        if p is None:
+                            break
+                        st = p.gettype()
+                        if st == 'text/plain':
+                            subtype = st
+                            part = p
+                            break
+
                 if subtype == 'text/plain' and not content:
                     # The first text/plain part is the message content.
                     content = self.get_part_data_decoded(part)
msg242 Author: [hidden] (dirckb) Date: 2002-05-31 01:02
That diff doesn't look very good; here's a text file.
msg243 Author: [hidden] (richard) Date: 2002-05-31 01:30
This is a symptom of a larger problem.

The multipart/alternative you look at here may be found in
multipart/mixed and multipart/related. Ultimately, we really
need to perform a depth-first search into multipart/xxx and
message/rfc822 to find the first text/plain.
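
A rough sketch of that depth-first search, written against today's standard `email` package rather than the mailgw `Message` class in the patch. It assumes the parser has already turned multipart/xxx and message/rfc822 (the MIME type for an embedded message) into nested message objects; `first_text_plain` and the `demo` message are hypothetical names for illustration.

```python
from email.message import EmailMessage


def first_text_plain(part):
    """Depth-first search for the first text/plain leaf, or None."""
    if part.get_content_type() == "text/plain":
        return part
    if part.is_multipart():
        # For multipart/* and message/rfc822 the payload is a list
        # of nested message objects, in document order.
        for sub in part.get_payload():
            found = first_text_plain(sub)
            if found is not None:
                return found
    return None


# Demo layout: mixed[ alternative[ html, plain ], text attachment ]
demo = EmailMessage()
demo.set_content("<p>html body</p>\n", subtype="html")
demo.add_alternative("inner plain body\n")
demo.add_attachment("attached text\n", filename="notes.txt")
```

On `demo` the search returns the plain part nested inside multipart/alternative, not the later top-level text attachment, because it descends into each container before moving on to its siblings.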
   
What we do in terms of extracting files out usefully is a whole
other nasty, messy question. I guess it'd be good enough to
just pull out the top-level attachments from a
multipart/[mixed|related] and just whack them in as issue file
attachments. I'm fairly certain that roundup won't bounce
Content-Type: multipart/xxx correctly - it doesn't remember the
boundary (Message.gettype() doesn't return it).
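
One possible reading of "pull out the top-level attachments", sketched with the standard `email` package. The rule used here (skip nested multipart containers and unnamed inline text parts, keep everything else) is an assumption made for illustration, not roundup's behaviour, and `top_level_attachments` is a hypothetical helper.

```python
from email.message import EmailMessage


def top_level_attachments(msg):
    """Return (filename, content type, bytes) for direct children only."""
    if msg.get_content_type() not in ("multipart/mixed", "multipart/related"):
        return []
    found = []
    for part in msg.get_payload():
        if part.get_content_maintype() == "multipart":
            continue  # nested container: holds the body, not a file
        if part.get_content_maintype() == "text" and part.get_filename() is None:
            continue  # assumed to be inline message text
        # decode=True undoes base64 / quoted-printable transfer encoding
        found.append((part.get_filename(), part.get_content_type(),
                      part.get_payload(decode=True)))
    return found


# Demo: mixed[ alternative[ plain, html ], application/octet-stream ]
sample = EmailMessage()
sample.set_content("plain body\n")
sample.add_alternative("<p>html body</p>\n", subtype="html")
sample.add_attachment(b"\x00\x01", maintype="application",
                      subtype="octet-stream", filename="blob.bin")
```

Only the direct children of the outer container are inspected, so anything buried deeper is deliberately ignored.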
 
We need to feed Anthony's MIME-parser-killer mailbox into
roundup's mailgw and see how many places it breaks ;)
msg244 Author: [hidden] (anthonybaxter) Date: 2002-06-11 06:54
Ideally you'd want to use something like the non-strict parser mode at
http://www.python.org/sf/565183
then walk the structure pulling out the text/[whatever] bits. I can supply
code to do this in a sane way.
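
A minimal sketch of that walk using the current standard `email` package, whose parser records problems in a `defects` list instead of raising. This is not the code offered above; `text_parts` and the `SAMPLE` message are hypothetical.

```python
from email import message_from_string

# Hand-written sample with the Outlook Express style nesting.
SAMPLE = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="outer"\n'
    "\n"
    "--outer\n"
    'Content-Type: multipart/alternative; boundary="inner"\n'
    "\n"
    "--inner\n"
    "Content-Type: text/plain\n"
    "\n"
    "plain body\n"
    "--inner\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>html body</p>\n"
    "--inner--\n"
    "--outer\n"
    "Content-Type: application/octet-stream\n"
    "\n"
    "data\n"
    "--outer--\n"
)


def text_parts(raw):
    """Parse raw message text and collect every text/* part in order."""
    msg = message_from_string(raw)
    return [(part.get_content_type(), part.get_payload(decode=True))
            for part in msg.walk()  # walk() visits parts depth-first
            if part.get_content_maintype() == "text"]


if __name__ == "__main__":
    for content_type, body in text_parts(SAMPLE):
        print(content_type, body)
```

The caller can then pick text/plain when it is present and fall back to another text/* part otherwise.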

[Richard: the MIME test messages are on
devhost1:/cvsroot/voicemail/mimetests - there's a bunch of
ekit-specific stuff in them, so I'm not comfortable checking
it into sf as-is.]
msg245 Author: [hidden] (richard) Date: 2002-07-22 22:02
This patch has been applied to the 0.4 maintenance branch. 0.5 will 
include a new email part discovery mechanism which will handle even 
more cases. 
 
History
Date User Action Args
2002-05-31 00:59:27 dirckb create