This is a partial alternative to #122 (open since April) for more
selective escaping of some special characters.
Here, we fix the test function naming (as noted in that PR) so the
tests are actually run (and fix some incorrect test assertions so they
pass). We also make escaping of `-#.)` (the most common cases of
unnecessary escaping in my use case) more selective, while still being
conservatively safe in escaping all cases of those characters that
might have Markdown significance (including in the presence of
wrapping, unlike in #122). (Being conservatively safe doesn't include
the cases where `.` or `)` start a fragment, where the existing code
already was not conservatively safe.)
There are certainly more cases where the code could also be made more
selective while remaining conservatively safe (including in the
presence of wrapping), so this is not a complete replacement for #122,
but by fixing some of the most common cases in a safe way, and getting
the tests actually running, I hope this allows progress to be made
where the previous attempt appears to have stalled, while still
allowing further incremental progress with appropriately safe logic
for other characters where useful.
Allow different strings before / after `<sub>` / `<sup>` content
In particular, this allows setting `sub_symbol='<sub>'`,
`sup_symbol='<sup>'`, to use raw HTML in the output when
converting subscripts and superscripts.
* Escape all characters with Markdown significance
There are many punctuation characters that sometimes have significance
in Markdown; more systematically escape them all (based on a new
escape_misc configuration option).
A limited attempt is made to limit the escaping of '.' and ')' to the
context where they might have Markdown significance (after a number,
where they can indicate an ordered list item); no such attempt is made
for the other characters (and even that limiting of '.' and ')' may
not be entirely safe in all cases, as it's possible the HTML could
have the number outside the block being escaped in one go,
e.g. `<span>1</span>.`.
---------
Co-authored-by: AlexVonB <AlexVonB@users.noreply.github.com>
* Avoid inline styles inside `<code>` / `<pre>` conversion
The check used for this is analogous to that used to avoid escaping
potential markup characters inside such tags.
Fixes#103
---------
Co-authored-by: AlexVonB <AlexVonB@users.noreply.github.com>
instead of manually walking down the dom tree
in a table, we now rely on the main descent loop
and just implement conversion for rows and cells
correctly. this enables the use of html inside a
table cell.