How would you go about parsing markdown?

Posted by John Leidegren on Stack Overflow
Published on 2009-03-03T07:27:37Z Indexed on 2010/05/03 8:18 UTC


You can find the syntax here.

The thing is, the source that comes with the download is written in Perl, which I have no intention of honoring. It is riddled with regexes and it relies on MD5 hashes to escape certain characters. Something is just wrong about that!

I'm about to hand-code a parser for Markdown and I'm wondering if anyone has experience with this?

Edit:

If you don't have anything meaningful to say about the actual parsing of Markdown, spare me the time. (This might sound harsh, but yes, I'm looking for insight, not a ready-made solution, i.e. a third-party library.)

To help a bit with the answers: regexes are meant to identify patterns, NOT to parse an entire grammar. That people consider doing so is foobar.

  • If you think about Markdown, it's fundamentally based around the concept of paragraphs.
  • As such, a reasonable approach might be to split the input into paragraphs.
  • There are many kinds of paragraphs, e.g. heading, text, list, blockquote, code.
  • The challenge is thus to identify these paragraphs and the context in which they occur.
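
As a rough illustration of the idea above, here is a minimal sketch in Python (my choice of language, not part of the original question). The names split_blocks and classify are hypothetical, and the rules cover only a small subset of the syntax. It splits on blank lines and labels each block by inspecting its lines, without regexes. It deliberately ignores context such as nesting, which is the hard part.

```python
def split_blocks(text):
    """Split the input into blocks (lists of lines) separated by blank lines."""
    blocks, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


def classify(block):
    """Guess the kind of a block from its lines (simplified, assumed rules)."""
    first = block[0]
    stripped = first.lstrip()
    # Code: every line is indented by four spaces or a tab.
    if all(line.startswith("    ") or line.startswith("\t") for line in block):
        return "code"
    # ATX heading: the block starts with '#'.
    if first.startswith("#"):
        return "heading"
    # Setext heading: a second line made up only of '=' or only of '-'.
    if len(block) == 2 and set(block[1].strip()) in ({"="}, {"-"}):
        return "heading"
    if stripped.startswith(">"):
        return "blockquote"
    # Unordered list: a bullet marker followed by a space.
    if stripped[:2] in ("* ", "- ", "+ "):
        return "list"
    # Ordered list: digits followed by ". ".
    head, sep, _ = stripped.partition(". ")
    if sep and head.isdigit():
        return "list"
    return "text"


sample = "# Title\n\nSome text\nmore text\n\n* one\n* two\n\n> quote\n\n    code here\n"
for block in split_blocks(sample):
    print(classify(block), block)
```

A real parser would have to go further: a blank line does not always end a list item or a blockquote, so the classification has to be aware of the enclosing block.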

I'll be back with a solution once I have one that's worth sharing.

© Stack Overflow or respective owner
