Not Another Static Generator
Peter Molnar d3fbf2e51f Back To Pandoc
So, Python Markdown is a bottomless pit of horrors, including crippling parsing bugs,
random   out of nowhere, and missing features. It's definitely much faster than
Pandoc, but Pandoc doesn't fall apart when there's a regex in a fenced code block
that happens to be a regex for markdown elements.

Also added some ugly post-processing string replacements to make Pandoc's fenced code
output work with Prism:
instead of Pandoc's <pre class="codelang"><code>, Prism wants
<pre><code class="language-codelang">, so I added a regex sub, because it's 00:32.
2018-08-04 00:28:55 +01:00
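
The regex substitution described in the commit message could look roughly like the sketch below. It assumes Pandoc's output has exactly the `<pre class="codelang"><code>` shape quoted above; the function name `pandoc_to_prism` is hypothetical and not taken from `nasg.py` or `pandoc.py`.

```python
import re

# Opening tags as quoted in the commit message: <pre class="LANG"><code>
# (assumption: a single class name and no other attributes on <pre>)
PANDOC_CODE_OPEN = re.compile(r'<pre class="([^"]+)"><code>')


def pandoc_to_prism(html: str) -> str:
    """Move the language class from <pre> to <code>, prefixed with 'language-'."""
    return PANDOC_CODE_OPEN.sub(r'<pre><code class="language-\1">', html)
```

Only the opening tags are rewritten, so the code content and the closing `</code></pre>` pass through unchanged.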
File              Last commit                                                      Date
.venv             JSON-LD and JSONFeed, you know what? Nope.                       2018-07-27 14:55:21 +01:00
templates         Back To Pandoc                                                   2018-08-04 00:28:55 +01:00
.gitignore        - simplifications                                                2018-07-25 13:24:31 +01:00
__init__.py       v4.0a1                                                           2018-07-20 16:47:25 +01:00
exiftool.py       v4.0a                                                            2018-07-20 16:45:42 +01:00
LICENSE           re-licencing to LGPLv3                                           2018-06-24 20:02:57 +01:00
nasg.py           Back To Pandoc                                                   2018-08-04 00:28:55 +01:00
pandoc.py         Back To Pandoc                                                   2018-08-04 00:28:55 +01:00
README.md         update in readme and requirements                                2018-07-23 11:04:13 +01:00
requirements.txt  Back To Pandoc                                                   2018-08-04 00:28:55 +01:00
run               Back on prismjs <https://prismjs.com/> for syntax highlighting.  2018-08-02 22:47:49 +01:00
settings.py       JSON-LD and JSONFeed, you know what? Nope.                       2018-07-27 14:55:21 +01:00

NASG - not another static generator...

Near full circle: from static to full dynamic to semi-static

Nearly 20 years ago I built my very first website with a thing called Microsoft FrontPage. It was static, but assembled by FrontPage from HTML parts (footer, nav, etc.), then uploaded to a free webhost, and I was very happy with it: it was extremely simple to edit and to maintain.

Years passed. First I wrote a CMS that used text files as storage, based on a PHP library that stored serialized objects in text files: basically JSON, before JSON even existed. Then I moved to MySQL, then dropped the whole thing for WordPress, which I loved, up until Gutenberg was announced. At that point I realized how nastily I had already tinkered with and altered my WordPress to use it with Markdown, to make image handling a tiny bit better, etc.

So I dropped it and made a static generator, only to realize there are things I can't make static. At that point, in 2017, these were:

  • search
  • proper redirect and gone entry handling (you can't set HTTP headers from an HTML file)
  • receiving webmentions

Because I wanted to learn Python, the static generator is coded in Python, so I decided to run a Python web service with Sanic. It took me 3 iterations to realize I was doing it wrong, because the one and only thing that is available on nearly any webhost (think of plain old Apache) is PHP.

So the abomination I'm doing right now is to generate some near-static PHP files from the Python code, which handle:

  • search, still with SQLite, but due to PHP versions, with FTS4 instead of FTS5; populated from Python, read by PHP
  • gone (HTTP 410) and redirect (HTTP 301)
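
As a rough illustration of the "populated from Python" half of the search setup, here is a minimal sketch of filling an FTS4 table with the standard `sqlite3` module. The table name `search`, the columns `url`, `title` and `content`, and both function names are assumptions made for this example, not the schema NASG actually uses. It also assumes the SQLite library behind Python was built with FTS4 enabled.

```python
import sqlite3


def build_search_index(db_path, entries):
    """Recreate an FTS4 table and fill it with (url, title, content) tuples.

    The schema is hypothetical and chosen only for this sketch.
    """
    conn = sqlite3.connect(db_path)
    with conn:  # commits on success, rolls back on error
        conn.execute("DROP TABLE IF EXISTS search")
        conn.execute("CREATE VIRTUAL TABLE search USING fts4(url, title, content)")
        conn.executemany(
            "INSERT INTO search (url, title, content) VALUES (?, ?, ?)",
            entries,
        )
    return conn


def search(conn, query):
    """Full-text query across all columns; a PHP reader would run the same SQL."""
    rows = conn.execute(
        "SELECT url, title FROM search WHERE search MATCH ?", (query,)
    )
    return rows.fetchall()
```

Because the database is a single file, the Python build can write it and the generated PHP can open it read-only with the same `MATCH` query.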

As for webmentions, as much as I try to avoid external dependencies, I came to realize a very simple fact: webmentions are external as well. So I started using webmention.io to receive them, and I query it at build time.
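
A build-time query against webmention.io might look something like the sketch below. The endpoint, the `domain` and `token` parameters, and the `wm-source`, `wm-target` and `wm-received` field names are written from memory of the service's JF2 API and should be checked against its documentation. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the webmention.io JF2 API
API = "https://webmention.io/api/mentions.jf2"


def mentions_url(domain, token):
    """Build the query URL for all mentions of a domain."""
    return API + "?" + urlencode({"domain": domain, "token": token})


def parse_mentions(payload):
    """Group the mentions in a JF2 feed by the target URL they point at."""
    feed = json.loads(payload)
    by_target = {}
    for item in feed.get("children", []):
        target = item.get("wm-target")
        if target is None:
            continue  # skip entries without a target
        by_target.setdefault(target, []).append(
            {"source": item.get("wm-source"), "received": item.get("wm-received")}
        )
    return by_target


def fetch_mentions(domain, token):
    """Network call, meant to run once per build."""
    with urlopen(mentions_url(domain, token), timeout=30) as resp:
        return parse_mentions(resp.read().decode("utf-8"))
```

Keeping the parsing separate from the network call means a build can also be rerun from a cached response if the service is unreachable.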

Now, about the generator itself

The content is StrictYAML + MultiMarkdown. Except for exiftool, there are no non-Python dependencies any more, but exiftool is the only thing that parses lens data for photos.
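
For context, calling exiftool from Python can be as small as the sketch below, which is not the contents of `exiftool.py`. It assumes the `exiftool` binary is on `PATH` and relies on its `-json` output, a JSON array with one object per file. The helper names and the order in which lens tags are tried are assumptions.

```python
import json
import subprocess


def parse_exif_json(raw):
    """Return the first record of exiftool's JSON array, or {} if it is empty."""
    records = json.loads(raw)
    return records[0] if records else {}


def read_exif(path):
    """Run exiftool on one file and return its tags as a dict."""
    result = subprocess.run(
        ["exiftool", "-json", str(path)],
        capture_output=True,
        text=True,
        check=True,  # raise if exiftool exits with an error
    )
    return parse_exif_json(result.stdout)


def lens_of(exif):
    """Pick a lens description; which tag is present depends on the camera."""
    for key in ("LensID", "LensModel", "Lens"):
        if exif.get(key):
            return str(exif[key])
    return None
```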

Python libraries used:

Most of the processing relies on how my data is structured:

  • whatever is not a directory in the root folder of the contents will be copied as is
  • directories mean category
  • 1 sub-directory per entry within the category, named as the post slug
  • index.md as main file
  • timestamp-sanitizedurl.md for webmentions and comments
  • if there is a .jpg, named the same as the post directory name, the post is a photo
  • all markdown image entries are replaced with <figure>, with visible EXIF data added if they match the criteria for being my photos: namely, a regular expression matches their EXIF Copyright or Artist field, which is set by my camera
  • all images will be downsized and, if matched as photo, watermarked on build
/
├── about.html
├── category-1
│   ├── article-1
│   │   ├── index.md
│   │   ├── extra-file.mp4
│   │   └── 1509233601-domaincomentrytitle.md
│   ├── fancy-photo
│   │   ├── index.md
│   │   └── fancy-photo.jpg
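
The layout rules above translate fairly directly into a directory walk. The following is a minimal sketch of that idea, not the code in `nasg.py`: the function names, the returned dict keys and the comment-file regex are all assumptions made for illustration.

```python
import re
from pathlib import Path

# "timestamp-sanitizedurl.md", e.g. 1509233601-domaincomentrytitle.md
COMMENT_RE = re.compile(r"^\d+-.+\.md$")


def classify_entry(entry_dir: Path) -> dict:
    """Describe one post directory according to the naming rules."""
    names = {p.name for p in entry_dir.iterdir() if p.is_file()}
    return {
        "slug": entry_dir.name,
        "category": entry_dir.parent.name,
        "has_index": "index.md" in names,
        # a .jpg named after the directory marks the post as a photo
        "is_photo": f"{entry_dir.name}.jpg" in names,
        "comments": sorted(n for n in names if COMMENT_RE.match(n)),
    }


def scan(content_root: Path):
    """Split the content root into files to copy as is and post entries."""
    copy_as_is, entries = [], []
    for item in sorted(content_root.iterdir()):
        if not item.is_dir():
            copy_as_is.append(item.name)  # non-directories are copied verbatim
            continue
        # item is a category; each sub-directory in it is one entry
        for entry_dir in sorted(p for p in item.iterdir() if p.is_dir()):
            entries.append(classify_entry(entry_dir))
    return copy_as_is, entries
```

Run against the example tree, `scan` would report `about.html` as a file to copy, and `article-1` and `fancy-photo` as entries, with only the latter flagged as a photo.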

Mostly that's all.