Fix logic for indentation inside list items

This fixes problems with the markdownify logic for indentation inside
list items.

This PR uses a branch that builds on those for #120, #150 and #151, so
those three PRs should be merged before this one.

There is limited logic in markdownify for handling indentation in the
case of nested lists.  There are two major problems with this logic:

* It lives in `convert_list`, which indents a list when it is inside
  another list.  Nothing adds indentation for other elements that may
  be found inside list items, such as paragraphs, `<pre>` or
  `<blockquote>`, so such elements are wrongly left unindented and
  terminate the list in the output.

* It uses a fixed indentation of one tab.  Following CommonMark, a tab
  in Markdown is considered equivalent to four spaces, which is not
  sufficient indentation in ordered list items with a number of three
  or more digits (the marker `100. ` is already five columns wide).

Fix both of these issues by making `convert_li` handle indentation for
the contents of `<li>`, based on the length of the list item marker,
rather than doing it in `convert_list` at all.
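
As a rough illustration of the approach described above, the sketch
below reproduces the new `convert_li` logic as a standalone function.
The name `format_list_item` is hypothetical and the definition of
`line_beginning_re` is an assumption (a multiline `^` pattern); neither
is taken from this diff.

```python
import re

# Assumption: a pattern matching the start of every line.
line_beginning_re = re.compile(r'^', re.MULTILINE)


def indent(text, columns):
    """Prefix every line of text with `columns` spaces."""
    return line_beginning_re.sub(' ' * columns, text) if text else ''


def format_list_item(marker, text):
    """Hypothetical standalone version of the new convert_li body."""
    bullet = marker + ' '
    # Indent every line by the width of the marker plus its space.
    text = indent((text or '').strip(), len(bullet))
    if text:
        # Overwrite the first line's indentation with the marker itself.
        text = bullet + text[len(bullet):]
    return '%s\n' % text


print(repr(format_list_item('*', 'a\nb')))
# '* a\n  b\n'
print(repr(format_list_item('100.', 'first\n\nsecond')))
# '100. first\n     \n     second\n'
```

With a three-digit marker the continuation lines get five columns of
indentation, which a single tab (four columns) would not provide.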
Joseph Myers
2024-10-03 21:04:40 +00:00
parent 340aecbe98
commit c13bdd5c14
2 changed files with 14 additions and 7 deletions

@@ -244,8 +244,8 @@ class MarkdownConverter(object):
             text = text.replace('_', r'\_')
         return text

-    def indent(self, text, level):
-        return line_beginning_re.sub('\t' * level, text) if text else ''
+    def indent(self, text, columns):
+        return line_beginning_re.sub(' ' * columns, text) if text else ''

     def underline(self, text, pad_char):
         text = (text or '').rstrip()
@@ -346,7 +346,7 @@ class MarkdownConverter(object):
             el = el.parent
         if nested:
             # remove trailing newline if nested
-            return '\n' + self.indent(text, 1).rstrip()
+            return '\n' + text.rstrip()
         return '\n\n' + text + ('\n' if before_paragraph else '')

     convert_ul = convert_list
@@ -368,7 +368,12 @@ class MarkdownConverter(object):
                 el = el.parent
             bullets = self.options['bullets']
             bullet = bullets[depth % len(bullets)]
-        return '%s %s\n' % (bullet, (text or '').strip())
+        bullet = bullet + ' '
+        text = (text or '').strip()
+        text = self.indent(text, len(bullet))
+        if text:
+            text = bullet + text[len(bullet):]
+        return '%s\n' % text

     def convert_p(self, el, text, convert_as_inline):
         if convert_as_inline: