AST reference

Specification for every node type the parser can produce. Use this chapter when you write a client renderer.

Document (root)

{
  "type":     "document",
  "version":  "1.0",
  "warnings": [ {"code": "W001", "message": "...", "context": "..."} ],
  "children": [ /* block nodes */ ],
  "meta":     {}
}
FieldTypeNotes
versionstringBumped on breaking shape changes. Currently "1.0".
warningsarrayList of diagnostic dicts. Always present (may be empty).
childrenarrayBlock nodes. Always present (may be empty).
metaobjectOpen-ended bag for transforms (TOC, slug map, …). May be absent.

Block nodes

heading

{ "type": "heading", "level": 2, "children": [ /* inline */ ], "id": "section-id" }
  • level ∈ 1–6.
  • id is added by the slugify transform; absent otherwise.
  • Inline children are restricted (rule W001). Images become alt text.

paragraph

{ "type": "paragraph", "children": [ /* inline */ ] }

blockquote

{ "type": "blockquote", "children": [ /* block */ ] }

code_block

{
  "type":            "code_block",
  "language":        "python",
  "value":           "print('hi')",
  "filename":        "main.py",
  "highlight_lines": [1, 3, 4, 5]
}

filename and highlight_lines are omitted when absent.

image (block-level)

{ "type": "image", "src": "https://...", "alt": "description", "title": null }

Produced when a Markdown source line contains exactly one image and nothing else. Mixed lines yield inline_image instead.

list

{
  "type":     "list",
  "ordered":  false,
  "start":    1,
  "children": [ /* list_item */ ]
}

start is only present on ordered lists.

list_item

{
  "type":     "list_item",
  "checked":  true,
  "children": [ /* block or inline */ ]
}

checked is true/false for GFM tasklist items, absent for plain items.

table, table_row, table_cell

{
  "type": "table",
  "head": { "type": "table_head", "rows": [ /* row */ ] },
  "body": { "type": "table_body", "rows": [ /* row */ ] }
}
{
  "type":      "table_cell",
  "is_header": true,
  "align":     "center",
  "children":  [ /* inline */ ]
}

align"left" | "center" | "right" | null. Block content inside cells is forbidden (rule W002).

divider

{ "type": "divider" }

widget

{
  "type":   "widget",
  "widget": "tip",
  "props":  { "title": "Pro tip" },
  "slots":  {
    "default": [ /* block nodes */ ],
    "footer":  [ /* block nodes */ ]
  }
}

slots always contains a "default" key (possibly an empty array). See the Widgets chapter.

html_block

{ "type": "html_block", "value": "<div>raw html</div>" }

The parser does not interpret HTML — it passes blocks through verbatim and emits a W007 informational diagnostic.

footnote_def

{ "type": "footnote_def", "label": "1", "children": [ /* block */ ] }

Inline nodes

TypeFields
textvalue (string)
boldchildren
italicchildren
bold_italicchildren
code_inlinevalue
strikethroughchildren
underlinechildren
linkhref, title (or null), children
inline_imagesrc, alt, title
softbreak
hardbreak
footnote_reflabel

JSON Schema export

The markast.json_schema() function returns a JSON-Schema describing all of the above. Drop the output into a JSON-Schema validator on your client to enforce shape compatibility, or feed it into a code generator for typed models.

import json
from markast import json_schema

with open("ast.schema.json", "w") as f:
    json.dump(json_schema(), f, indent=2)