Roundup Tracker - Issues

Message6027

Author rouilj
Recipients ber, marlowa, rouilj
Date 2017-10-10.23:16:25
Message-id <1507677386.87.0.213398074469.issue2550799@psf.upfronthosting.co.za>
In-reply-to
I am adapting the patch at: 

https://sourceforge.net/u/iippolitov/roundup/ci/2ee03ad0b0a5edbb8e68763fbf03a1032cf8a83d/

from Igor Ippolitov, which uses Beautiful Soup 4 to do the HTML
processing.

I can't get the Debian python-bs4 package to work right, so I am merging his
patch with the html2text code in dehtml.py attached to this issue.

The patch currently attempts to load Beautiful Soup and, if it gets an
ImportError, falls back to using dehtml.py.
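
The fallback described above could look roughly like the sketch below. It is
an illustration only, not the patch itself: the names html_to_text and
_TextExtractor are hypothetical, the code is written for Python 3, and the
standard-library HTMLParser stands in for the dehtml.py code attached to
this issue.

```python
# Sketch only: prefer Beautiful Soup 4, fall back to a simpler converter.
try:
    from bs4 import BeautifulSoup  # shipped by Debian as python-bs4

    def html_to_text(html):
        # get_text() returns the document text with the tags removed.
        return BeautifulSoup(html, "html.parser").get_text()

except ImportError:
    # Hypothetical stand-in for dehtml.py, built on the standard library.
    from html.parser import HTMLParser

    class _TextExtractor(HTMLParser):
        def __init__(self):
            super().__init__()
            self.chunks = []

        def handle_data(self, data):
            # Called for text between tags; collect it in order.
            self.chunks.append(data)

    def html_to_text(html):
        parser = _TextExtractor()
        parser.feed(html)
        parser.close()
        return "".join(parser.chunks)
```

Either branch defines the same html_to_text name, so the calling code does
not need to know which converter was loaded.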

I am currently working on the test cases, and so far all existing tests
now pass. The new test cases I have (an email with one text/html part, and
a multipart email with text/csv and text/html parts) seem to work for ASCII.
I am having issues with character representations for international
characters.

Does anybody have some time to test this code and see whether it
at least doesn't break anything and may be useful for turning HTML
into text?

I still need to add a three-valued config option to select the converter
or turn the conversion off:

  beautifulsoup, dehtml, none

before I do the full commit.
History
Date                 User    Action  Args
2017-10-10 23:16:26  rouilj  set     messageid: <1507677386.87.0.213398074469.issue2550799@psf.upfronthosting.co.za>
2017-10-10 23:16:26  rouilj  set     recipients: + rouilj, ber, marlowa
2017-10-10 23:16:26  rouilj  link    issue2550799 messages
2017-10-10 23:16:25  rouilj  create