Message6027
I am adapting the patch at:
https://sourceforge.net/u/iippolitov/roundup/ci/2ee03ad0b0a5edbb8e68763fbf03a1032cf8a83d/
from Igor Ippolitov, which uses Beautiful Soup 4 to do the HTML
processing.
I can't get the Debian python-bs4 package to work right, so I am merging his
patch with the html2text code in dehtml.py attached to this issue.
The patch currently attempts to load beautiful soup and if it gets an
import error will fall back to using dehtml.py.
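The import fallback described above might look roughly like the sketch below. It is not the code from the patch: the names html_to_text and CONVERTER are made up for illustration, the regex branch is only a crude stand-in for what dehtml.py does, and it is written for Python 3.

```python
import re
from html import unescape

try:
    # Optional dependency: only used when it can be imported.
    from bs4 import BeautifulSoup

    def html_to_text(html):
        # Let Beautiful Soup parse the markup and return only the text nodes.
        return BeautifulSoup(html, "html.parser").get_text()

    CONVERTER = "beautifulsoup"  # hypothetical marker, not from the patch
except ImportError:

    def html_to_text(html):
        # Crude stand-in for dehtml.py: drop tags, then decode entities.
        return unescape(re.sub(r"<[^>]+>", "", html))

    CONVERTER = "fallback"  # hypothetical marker, not from the patch
```

Either branch exposes the same html_to_text() name, so the mail gateway code calling it would not need to know which converter is active.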
I am currently working on the test cases, and so far all existing tests
now pass. The new test cases I have (an email with one text/html part, and
a multipart email with text/csv and text/html parts) seem to work for ASCII.
I am having issues with character representations for international
chars.
Does anybody have some time to test this code and see if it
at least doesn't break anything and may be useful for turning HTML
into text?
I still need to add a three-valued config option to choose the
converter:
beautifulsoup, dehtml, none
before I do the full commit.
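One way the three-valued option could be resolved is sketched below. The function name pick_converter and its have_bs4 argument are hypothetical, and falling back to dehtml when Beautiful Soup is missing is my assumption based on the import fallback described earlier, not something that is decided yet.

```python
VALID_SETTINGS = ("beautifulsoup", "dehtml", "none")


def pick_converter(setting, have_bs4):
    """Return the converter name to use, or None to leave HTML untouched."""
    setting = setting.strip().lower()
    if setting not in VALID_SETTINGS:
        raise ValueError("unknown converter setting: %r" % setting)
    if setting == "none":
        return None
    if setting == "beautifulsoup" and not have_bs4:
        # Assumed behaviour: degrade to dehtml rather than fail.
        return "dehtml"
    return setting
```

For example, pick_converter("beautifulsoup", have_bs4=False) returns "dehtml", and pick_converter("none", have_bs4=True) returns None.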
Date | User | Action | Args
2017-10-10 23:16:26 | rouilj | set | messageid: <1507677386.87.0.213398074469.issue2550799@psf.upfronthosting.co.za>
2017-10-10 23:16:26 | rouilj | set | recipients: + rouilj, ber, marlowa
2017-10-10 23:16:26 | rouilj | link | issue2550799 messages
2017-10-10 23:16:25 | rouilj | create |