Back To Pandoc
So, Python Markdown is a bottomless pit of horrors: crippling parsing bugs that appear
out of nowhere, and missing features. It's definitely much faster than
Pandoc, but Pandoc doesn't fall apart when a fenced code block contains a regex
that happens to match markdown elements.
Also added some ugly post-conversion string replacements to make Pandoc's fenced code output work
with Prism:
instead of Pandoc's <pre class="codelang"><code>, Prism wants
<pre><code class="language-codelang">, so I added a regex sub, because it's 00:32.
2018-08-04 00:28:55 +01:00
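The regex substitution mentioned in the commit message is not part of this file. A minimal sketch of what it could look like follows, assuming Pandoc emits exactly `<pre class="LANG"><code>` for a fenced block tagged with a language; the names `PRE_CODE` and `prismify` are hypothetical and only for illustration.

```python
import re

# Hypothetical pattern: matches the opening tags Pandoc is described as
# producing for a fenced code block, capturing the language name.
PRE_CODE = re.compile(r'<pre class="([^"]+)"><code>')


def prismify(html):
    """Move the language from the <pre> class to a Prism-style class on <code>."""
    return PRE_CODE.sub(r'<pre><code class="language-\1">', html)
```

Blocks without a language class are left untouched, since the pattern only matches a `<pre>` that carries a `class` attribute directly followed by `<code>`.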
import subprocess
import logging


class Pandoc(str):
    def __new__(cls, text):
        # TODO: cache?
        # import hashlib
        # print(hashlib.md5("whatever your string is".encode('utf-8')).hexdigest())
        """ Pandoc command line call with piped in- and output """
        cmd = (
            'pandoc',
            '-o-',
            '--from=markdown+%s' % (
                '+'.join([
                    'footnotes',
                    'pipe_tables',
                    'strikeout',
                    #'superscript',
                    #'subscript',
                    'raw_html',
                    'definition_lists',
                    'backtick_code_blocks',
                    'fenced_code_attributes',
                    'shortcut_reference_links',
                    'lists_without_preceding_blankline',
                    'autolink_bare_uris',
                ])
            ),
            '--to=html5',
            '--quiet',
            '--no-highlight'
        )
        p = subprocess.Popen(
            cmd,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )

        stdout, stderr = p.communicate(input=text.encode('utf-8'))
        if stderr:
            logging.warning(
                "Error during pandoc convert:\n\t%s\n\t%s",
                cmd,
                stderr
            )
        r = stdout.decode('utf-8').strip()
        return str.__new__(cls, r)
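
The `# TODO: cache?` comment in `__new__` hints at keying converted output on an MD5 hash of the input text. One way that might be sketched is shown below; this is an assumption about the intent, not something the file implements. The function `cached_convert`, its `convert` callable, and the on-disk layout (one `.html` file per hash in `cache_dir`) are all hypothetical.

```python
import hashlib
import os


def cached_convert(text, convert, cache_dir):
    """Return convert(text), reusing a result previously stored in cache_dir.

    Hypothetical helper: `convert` is any callable mapping a markdown
    string to an HTML string, for example `lambda t: str(Pandoc(t))`.
    """
    # The cache key is the MD5 hex digest of the UTF-8 encoded input.
    key = hashlib.md5(text.encode('utf-8')).hexdigest()
    path = os.path.join(cache_dir, key + '.html')
    if os.path.exists(path):
        with open(path, encoding='utf-8') as f:
            return f.read()
    result = convert(text)
    os.makedirs(cache_dir, exist_ok=True)
    with open(path, 'w', encoding='utf-8') as f:
        f.write(result)
    return result
```

Because the key depends only on the input text, a change to the Pandoc flags in `cmd` would not invalidate old entries; a real cache would probably need to fold the command line into the hash as well.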