Comments
EvitanRelta OP t1_j36l6kf wrote
Oh have I compared this to Pandoc?
No actually, I've never tried Pandoc before.
I'm not sure if Pandoc can be configured to do the same adaptive-preserving of HTML in markdown, like converting this HTML:
<h1><i>Italicised heading</i></h1>
<h1 align="center">
<i>Centered italicised heading</i>
</h1>
to this markdown:
# _Italicised heading_
<h1 align="center">
<i>Centered italicised heading</i>
</h1>
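A rough sketch of that "adaptive" rule, in Python for illustration only. This is not htmlarkdown's actual code; `convert_heading` and its parameters are hypothetical, and the rule assumed here is simply "any attribute on the heading forces the HTML form".

```python
from html import escape

def convert_heading(level, attrs, inner_html, inner_markdown):
    """Return Markdown for a plain heading, or keep HTML if it has attributes.

    Assumption: any attribute (e.g. align) cannot be expressed in plain
    Markdown heading syntax, so the heading stays as an HTML tag.
    """
    if attrs:
        attr_text = "".join(
            f' {name}="{escape(value, quote=True)}"' for name, value in attrs.items()
        )
        return f"<h{level}{attr_text}>\n{inner_html}\n</h{level}>"
    return f"{'#' * level} {inner_markdown}"

# Plain heading: converted to Markdown.
print(convert_heading(1, {}, "<i>Italicised heading</i>", "_Italicised heading_"))
# Heading with align: kept as HTML.
print(convert_heading(
    1,
    {"align": "center"},
    "<i>Centered italicised heading</i>",
    "_Centered italicised heading_",
))
```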
Anyone here with Pandoc experience who's tried this?
seanpuppy t1_j37gis3 wrote
I have some experience with it… basically it converts markup languages to an AST and you can convert that AST to lots of things. It does not preserve everything.
Eg: I write markdown with * as bullet points but if I convert from markdown -> ast -> markdown it will be formatted a little differently
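A toy illustration of why that round trip changes the formatting (this is not pandoc itself; `parse_bullets` and `render_bullets` are made-up names): the tree only records that there is a bullet list with some items, not which marker character was typed, so the writer has to pick its own.

```python
def parse_bullets(markdown):
    """Parse a simple bullet list into a tiny AST; the marker is discarded."""
    items = []
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped[:2] in ("* ", "- ", "+ "):
            items.append(stripped[2:])
    return {"t": "BulletList", "items": items}

def render_bullets(ast, marker="-"):
    """Write the list back out using whatever marker the writer prefers."""
    return "\n".join(f"{marker} {item}" for item in ast["items"])

source = "* cure cancer\n* take a nap"
# The items survive the round trip, the original "*" markers do not.
print(render_bullets(parse_bullets(source)))
```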
I've been working on a side project to extend / modify said AST to be able to “insert” markdown into existing markdown.
I'm on mobile right now, half a cup of coffee in, and I've got a meeting in a few mins, but I can try and show an example command later if I remember
edit: Work is over and I found an online pandoc tool
EvitanRelta OP t1_j3ac48y wrote
Oh, in the link that u gave, the "align" attribute was transformed to:
{#centered-italicised-heading align="center"}
Is it possible to customise this to instead output it as an HTML tag attribute? like:
<h1 align="center">
...
</h1>
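One direction this could take is a pandoc JSON filter that rewrites such headings into raw HTML blocks. The sketch below only shows the per-block transformation and assumes pandoc's JSON AST shape for `Header` (`[level, [id, classes, key-values], inlines]`) and `RawBlock`; the function names are hypothetical, only three inline types are handled, and it has not been run against a real pandoc document.

```python
from html import escape

def inlines_to_html(inlines):
    """Render the few inline node types this sketch knows about."""
    parts = []
    for node in inlines:
        kind = node["t"]
        if kind == "Str":
            parts.append(escape(node["c"]))
        elif kind == "Space":
            parts.append(" ")
        elif kind == "Emph":
            parts.append("<i>" + inlines_to_html(node["c"]) + "</i>")
        else:
            raise ValueError(f"unsupported inline: {kind}")
    return "".join(parts)

def header_to_raw_html(block):
    """Replace a Header that carries key-value attributes with raw HTML."""
    if block["t"] != "Header":
        return block
    level, (_ident, _classes, keyvals), inlines = block["c"]
    if not keyvals:
        return block  # plain headings stay as they are
    attrs = "".join(f' {k}="{escape(v, quote=True)}"' for k, v in keyvals)
    html = f"<h{level}{attrs}>\n{inlines_to_html(inlines)}\n</h{level}>"
    return {"t": "RawBlock", "c": ["html", html]}

header = {
    "t": "Header",
    "c": [
        1,
        ["centered-italicised-heading", [], [["align", "center"]]],
        [{"t": "Emph", "c": [
            {"t": "Str", "c": "Centered"}, {"t": "Space"},
            {"t": "Str", "c": "italicised"}, {"t": "Space"},
            {"t": "Str", "c": "heading"},
        ]}],
    ],
}
print(header_to_raw_html(header)["c"][1])
```

A complete filter would still need to read the JSON document from stdin, apply this to every block, and write the result back out.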
what do u mean by "insert markdown into existing markdown"? could u give an example?
seanpuppy t1_j3aeh8f wrote
It most likely can't keep the html tag as html. An <h1> tag literally is a header with one # in markdown. Markdown was meant to be a more human-readable/writable form of HTML that translates directly to HTML.
For "inserting markdown" I wish I had a better example ready, I haven't open-sourced this thing yet (or finished it) but it started as an idea to make using my existing note system more powerful / easy to use without actually opening a file.
All my notes from day to day are in markdown, and let's say I use a template with something like below, so that I have a dedicated note file for every single day. (I use a cool VS Code extension called Dendron which is similar to Obsidian for markdown notes)
# TODO
* cure cancer
* take a nap
# Ideas
* turn water to wine
# Meetings
## Meeting 1
blah blah
## Meeting 2...
blah
I want to be able to quickly jot down an idea or todo item, but I don't want to have to actually do the mental context switch of switching windows and finding the daily file. My work computer is a Mac laptop, and I've found Alfred to be a very powerful and flexible tool to do basically anything from any context.
So ideally I could have an Alfred command for "ideas" or "todos" etc... that would insert a string of text into my daily notes into the right spot. So in this case something like `inst $todo email joe about that thing` would insert "email joe about that thing" into a list block under a header tag called 'todo'
with an output like:
# TODO
- cure cancer
- take a nap
- email joe about that thing
# Ideas
- turn water to wine
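A minimal sketch of that insert step done directly on the text, without any AST. `insert_item` is a hypothetical helper, not part of the unreleased project; it assumes headings start with `#` at the beginning of a line and ignores edge cases like `#` inside code fences.

```python
def insert_item(markdown, header, item):
    """Append a list item to the section under the heading named `header`."""
    lines = markdown.splitlines()
    target = None
    for i, line in enumerate(lines):
        if line.startswith("#") and line.lstrip("#").strip().lower() == header.lower():
            target = i
            break
    if target is None:
        raise ValueError(f"no heading named {header!r}")
    # The section ends at the next heading (of any level) or at end of file.
    end = len(lines)
    for j in range(target + 1, len(lines)):
        if lines[j].startswith("#"):
            end = j
            break
    # Step back over trailing blank lines so the item joins the existing list.
    while end > target + 1 and not lines[end - 1].strip():
        end -= 1
    # Reuse the marker of the last existing item, defaulting to "-".
    marker = "-"
    prev = lines[end - 1]
    if end - 1 > target and prev[:2] in ("* ", "- ", "+ "):
        marker = prev[0]
    lines.insert(end, f"{marker} {item}")
    return "\n".join(lines)

notes = "# TODO\n* cure cancer\n* take a nap\n# Ideas\n* turn water to wine"
print(insert_item(notes, "todo", "email joe about that thing"))
```

An Alfred command would then only need to pass the heading name and the text to a script like this, plus the path of today's note file.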
But that got me thinking, there's potential for a powerful / flexible system of converting markdown into a tree-like AST syntax that would let me reference different levels of the note similar to how one could reference nested JSON.
So I started exploring pandoc, which converts all sorts of things into an AST (abstract syntax tree), which is almost what I want, except it's flat. No hierarchy except bulleted lists. To me, an <h2> below an <h1> is a second level in - BUT pandoc would treat it as a flat list of different markdown elements.
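To make the flat-versus-nested difference concrete, here is a sketch that groups a flat block list into a tree by heading level. The block dicts are a simplified, made-up shape (not pandoc's real JSON), and `nest` is a hypothetical name.

```python
def nest(blocks):
    """Group a flat block list into a tree keyed by heading level."""
    root = {"title": None, "level": 0, "content": [], "children": []}
    stack = [root]  # path from the root to the section currently being filled
    for block in blocks:
        if block["t"] == "Header":
            level = block["level"]
            # Close every open section at the same or a deeper level.
            while stack[-1]["level"] >= level:
                stack.pop()
            node = {"title": block["text"], "level": level,
                    "content": [], "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        else:
            # Non-heading blocks belong to the innermost open section.
            stack[-1]["content"].append(block)
    return root

flat = [
    {"t": "Header", "level": 1, "text": "TODO"},
    {"t": "Para", "text": "cure cancer"},
    {"t": "Header", "level": 1, "text": "Meetings"},
    {"t": "Header", "level": 2, "text": "Meeting 1"},
    {"t": "Para", "text": "blah blah"},
    {"t": "Header", "level": 2, "text": "Meeting 2"},
    {"t": "Para", "text": "blah"},
]
tree = nest(flat)
print([child["title"] for child in tree["children"]])
print([child["title"] for child in tree["children"][1]["children"]])
```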
I started out trying to write a Python pandoc filter (see https://pandoc.org/filters.html ) but realized its intended design couldn't do what I want. That doesn't really matter, though, as pandoc can handle reading/writing from/to a pandoc AST, so any Python script that reads in and spits out a compatible tree will work fine.
So I created a Python script that can handle SOME Markdown aspects, turn it into a nested tree, and spit back out a flat tree, which can then be run through pandoc again to get markdown back out. Once I have that tree, I can start to design a syntax for specifying a part of the tree, and text I want to add, resulting in a modified nested tree, which can still be ultimately converted back to markdown.
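A sketch of the second half of that pipeline, under the same kind of assumptions: the nested tree is a simplified, made-up dict shape, a "path" is just a list of heading titles, and `find` / `flatten` are hypothetical names rather than anything from the unreleased script.

```python
def find(node, path):
    """Follow a list of heading titles down the tree, e.g. ["Meetings", "Meeting 1"]."""
    for title in path:
        matches = [c for c in node["children"] if c["title"].lower() == title.lower()]
        if not matches:
            raise KeyError(title)
        node = matches[0]
    return node

def flatten(node):
    """Walk the tree depth-first and emit the flat block list again."""
    blocks = []
    if node["level"] > 0:  # the synthetic root has no heading of its own
        blocks.append({"t": "Header", "level": node["level"], "text": node["title"]})
    blocks.extend(node["content"])
    for child in node["children"]:
        blocks.extend(flatten(child))
    return blocks

tree = {"title": None, "level": 0, "content": [], "children": [
    {"title": "TODO", "level": 1,
     "content": [{"t": "Item", "text": "take a nap"}], "children": []},
    {"title": "Ideas", "level": 1, "content": [], "children": []},
]}

# Address a section by path, modify it, then go back to a flat list.
find(tree, ["todo"])["content"].append({"t": "Item", "text": "email joe about that thing"})
for block in flatten(tree):
    print(block)
```

The flat list that comes out is what would then be converted back into whatever the markdown writer expects.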
Unfortunately I haven't open-sourced it yet. I haven't finished it, but realistically it's got enough functionality to be worth sharing as a WIP. I hope all this made sense, I'm not sure if I've explained this project to anyone in this much detail yet.
EvitanRelta OP t1_j463a01 wrote
sflr!
Then i guess my converter library has that edge over Pandoc. Specifically, this library can preserve the HTML better than pandoc
So, what im getting is that:
- u want to make a tasklist app
- that stores the notes in markdown
- with a command-line function that inserts a string as a list item under a specific header
sounds like it can be done by just using a bash script to parse the markdown file, find the headers, and just insert the listitem.
for the nested AST idea, im not sure what itd be useful for.
EvitanRelta OP t1_j328vdy wrote
Here's the repo if u wanna check it out or contribute! :D — https://github.com/EvitanRelta/htmlarkdown
You can try it out urself on this demo: https://evitanrelta.github.io/htmlarkdown/
It's my first (hopefully industry-standard) library so I'd love some feedback! (and any contributions, im the only contributor so far so pls send help)
djsnipa1 t1_j3dyrgn wrote
This looks nice so far!
I’ve been wanting to input a URL and get Readability-like output in markdown (where it only keeps the main article and not all the nav, header, footer, etc.). I haven’t found one I like, so if I end up building my own, I’d like to try to implement this with it.
EvitanRelta OP t1_j461czu wrote
Ayyy thanks! :D
also, sflr! Do drop the repo an issue or a pull request!
TomSwirly t1_j32kzaw wrote
"HTML-to-Markdown converter that adaptively preserve HTML when needed (eg. when center-aligning, or resizing images)"
should be "preserves". :-)
Looks very promising, I starred it!
EvitanRelta OP t1_j32la16 wrote
Ayy thanks for the star! :D
oh shit ure right, thanks ill correct the description!
seanpuppy t1_j3305h6 wrote
Have you compared how pandoc does it?