ox.html: fix sanitizing whitespace-only strings #2860

Closed
opened 2015-11-24 18:31:00 +00:00 by wjt · 5 comments

`ox.html.sanitize_html(u' ')` fails with an exception from lxml. Here is a fix, and some tests.

Take your pick of the patches attached or this branch: https://gitlab.com/wjt/python-ox/compare/master...sanitize_html
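
For readers without access to the patches, here is a minimal sketch of one way a whitespace-only guard can work. It is not the code from the attached patches. `fragile_parse` and `sanitize_guarded` are hypothetical names, the parser is a stand-in rather than lxml, and the `ValueError` is an assumption, since the report does not say which exception lxml raises.

```python
def fragile_parse(html):
    """Hypothetical stand-in for a parser that rejects blank input."""
    if not html.strip():
        # Assumed failure mode; the real exception type is not given in the report.
        raise ValueError("Document is empty")
    return html.strip()


def sanitize_guarded(html, parse=fragile_parse):
    """Return whitespace-only input unchanged instead of handing it to the parser."""
    if not html.strip():
        # Nothing to sanitize, so skip the parser entirely.
        return html
    return parse(html)
```

With this guard, `sanitize_guarded(u' ')` returns the input as-is, while non-blank input still goes through the parser.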

j added the python-ox label 2015-11-24 18:31:00 +00:00
j added this to the 14.04 milestone 2015-11-24 18:31:00 +00:00
j self-assigned this 2015-11-24 18:31:00 +00:00
j added the normal, defect labels 2015-11-24 18:31:00 +00:00
Author

Attachment 0001-ox.html.sanitize_html-fix-existing-tests.patch (1520 bytes) added

Author

Attachment 0002-ox.html.sanitize_fragment-documentation-tests.patch (1164 bytes) added

Author

Attachment 0003-ox.html-fix-sanitizing-whitespace-only-strings.patch (2019 bytes) added

Owner

hm gitlab tries hard to hide that https://gitlab.com/wjt/python-ox.git is what I want.

Added your repo as a remote and merged your changes into master https://wiki.0x2620.org/changeset/cbcef39/python-ox

j added the fixed label 2015-11-24 18:58:06 +00:00
j closed this issue 2015-11-24 18:58:06 +00:00
Author

Replying to [j]comment:1:

> hm gitlab tries hard to hide that https://gitlab.com/wjt/python-ox.git is what I want.

Hmm yes, I can't find any way to link to a page with both the clone URL and the name of a specific branch. Great.

> Added your repo as a remote and merged your changes into master https://wiki.0x2620.org/changeset/cbcef39/python-ox

Thanks!

Reference: 0x2620/pandora#2860