MarkBooks Format Specification

Version 1.0 · April 2026

Status: Draft

Editor: Morten Skærsø (markant.md)

Repository: github.com/markbooks-format

Licence: CC0 1.0 Universal

This document specifies MarkBooks, a self-contained document format based on Markdown. A MarkBook packages one or more Markdown pages with their associated assets, supporting data, and optional integrity metadata into a single portable archive. The format is designed for portability, human readability, trivial implementability, and machine generatability. It addresses the gap between raw Markdown files — which lack portability when images or multi-page structure are involved — and heavyweight formats such as PDF, EPUB, and DOCX, which sacrifice editability and transparency.

This is a draft specification. It is published for review and comment by the developer and academic communities. Feedback should be directed to the GitHub repository at github.com/markbooks-format/spec.

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.

1. Introduction

Markdown has become the de facto output format of the AI era. Large language models, documentation generators, note-taking applications, and academic writing tools all produce Markdown natively. However, sharing Markdown documents that include images, diagrams, or multi-page structure remains fragile: relative paths break across systems, assets are separated from prose, and authors resort to converting to PDF — sacrificing editability, transparency, and the lightweight character that made Markdown attractive in the first place.

MarkBooks resolve this by packaging Markdown content into a single, self-contained, portable file. The format is a ZIP archive with the extension .mkb containing Markdown pages and their assets. At its simplest, a valid MarkBook is a single index.md file inside a ZIP. At its most capable, a MarkBook can carry multi-chapter documents with illustrations, supporting datasets, cryptographic signatures from multiple authors, and blockchain-anchored publication timestamps.

Every feature beyond index.md in a ZIP is optional. The format is designed so that a minimal reader can be implemented in an afternoon, while a full-featured reader can offer a rich scholarly experience. Readers compete on rendering quality; the specification guarantees portability, legibility, and verifiability.

1.1 Design Principles

  1. Trivially implementable. A developer can build a conforming reader in an afternoon.
  2. Machine-generatable. A large language model can produce a valid MarkBook in a single response.
  3. Human-readable inside. Unzipping a MarkBook yields plain Markdown files editable in any text editor.
  4. Self-contained. No network access is required to render a MarkBook.
  5. Gracefully degrading. Readers render what they support and display the rest as plain text.
  6. Open. This specification is released under CC0. No licence fees. No proprietary tooling required.

2. Identifiers

| Property | Value |
|----------|-------|
| File extension | .mkb |
| MIME type | application/vnd.markant.markbook+zip |
| Container format | ZIP (ISO/IEC 21320-1 or any compliant implementation) |
| Text encoding | UTF-8 for all .md and .yaml files |
| Uniform Type Identifier (Apple) | com.markant.markbook |

The +zip structured syntax suffix (RFC 6839) indicates the container format, enabling generic ZIP-aware tooling to inspect the archive without format-specific knowledge.

3. Archive Structure

example.mkb (ZIP archive)
├── index.md                    # REQUIRED — entry point
├── markbook.yaml               # OPTIONAL — metadata (§6)
├── pages/                      # OPTIONAL — additional pages
│   ├── chapter-one.md
│   ├── chapter-two.md
│   └── appendix/
│       └── supplementary.md
├── assets/                     # OPTIONAL — rendered media (§3.3)
│   ├── figure-1.png
│   └── diagrams/
│       └── architecture.svg
├── data/                       # OPTIONAL — supporting data (§7)
│   ├── measurements.csv
│   ├── analysis.py
│   └── supplementary.pdf
├── signatures/                 # OPTIONAL — author signatures (§11)
│   ├── alice.sig
│   └── bob.sig
├── MANIFEST.sha256             # OPTIONAL — content integrity (§10)
├── MANIFEST.sha256.ots         # OPTIONAL — timestamp proof (§12)
└── .git/                       # OPTIONAL — version history (§9)

3.1 index.md (REQUIRED)

The entry point of the MarkBook. This is the only required file. A valid MarkBook MAY consist of index.md alone inside a ZIP archive.
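
As a non-normative illustration, the minimal validity condition can be checked in a few lines. The sketch below uses Python's standard zipfile module; the function names is_minimal_markbook and build_minimal_markbook are hypothetical and not defined by this specification.

```python
import io
import zipfile


def build_minimal_markbook(markdown: str) -> bytes:
    """Package a Markdown string as index.md inside an in-memory ZIP."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("index.md", markdown.encode("utf-8"))
    return buffer.getvalue()


def is_minimal_markbook(data: bytes) -> bool:
    """Return True if data is a ZIP archive with index.md at its root."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            return "index.md" in archive.namelist()
    except zipfile.BadZipFile:
        # Not a ZIP container at all, so not a MarkBook.
        return False
```

A reader could run such a check before attempting to render anything else in the archive.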

3.2 pages/

Additional Markdown pages. Subdirectories within pages/ are permitted for logical grouping. There is no required naming convention; authors MAY organise pages by chapter, section, or any other scheme.

3.3 assets/

Images, diagrams, and other media files referenced by the Markdown pages. Subdirectories within assets/ are permitted. All files in assets/ are intended to be rendered inline by the reader when referenced from Markdown content.

3.4 data/

Supporting data files that accompany the document but are not rendered inline. This directory is intended for datasets, source code, supplementary materials, and other non-rendered content. Implementations SHOULD make files in data/ accessible to the user (e.g., via export or “open externally”) but MUST NOT require their presence for basic rendering.

Markdown content MAY link to data files using standard link syntax:

[Download raw measurements](data/measurements.csv)

Implementations SHOULD handle such links as file export or download actions.

3.5 markbook.yaml

Optional metadata file at the archive root. See §6.

3.6 signatures/

Optional directory containing cryptographic signatures. See §11.

3.7 .git/

Optional embedded Git repository. See §9.

4. Markdown Dialect

MarkBooks use CommonMark as the baseline Markdown specification. All conforming implementations MUST parse and render CommonMark. The format additionally requires wikilink support as specified in §4.1.

4.1 Wikilinks

Wikilinks are supported following the Obsidian convention, with target path before display text:

[[target|display text]]

Here, target is a file path relative to the archive root or relative to the current file (see §5 for path rules).

[[pages/chapter-one.md|Chapter One]]
[[index.md|Back to front page]]

If the display text is omitted, the target path is used as the display text:

[[pages/chapter-one.md]]

Standard CommonMark relative links are equally valid:

[Chapter One](pages/chapter-one.md)

Implementations MUST support both link styles.
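
The wikilink syntax above is simple enough to recognise with a regular expression. The following is a minimal sketch, not a complete parser: it assumes that neither target nor display text contains `]` or `|`, and it does not skip wikilinks that appear inside code spans or code blocks. The names WIKILINK and parse_wikilinks are illustrative only.

```python
import re

# Matches [[target]] or [[target|display text]].
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")


def parse_wikilinks(markdown: str) -> list[tuple[str, str]]:
    """Return (target, display) pairs; display falls back to the target."""
    links = []
    for match in WIKILINK.finditer(markdown):
        target = match.group(1).strip()
        display = (match.group(2) or target).strip()
        links.append((target, display))
    return links
```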

4.2 Image References

Images use standard Markdown syntax with paths relative to the archive root or relative to the current file:

![Alt text](assets/diagram.png)

Every asset referenced by a relative path MUST be present in the archive. Image references MAY point to external URLs, but implementations SHOULD NOT require network access to render a MarkBook, so authors cannot rely on externally hosted images being displayed.

4.3 Extended Syntax and Graceful Degradation

MarkBook authors MAY use any Markdown extensions beyond CommonMark, including but not limited to:

  • Tables (GFM-style)
  • LaTeX mathematics ($inline$ and $$display$$)
  • Task lists (- [ ] / - [x])
  • Footnotes
  • Definition lists
  • Strikethrough
  • Syntax-highlighted fenced code blocks
  • Cross-references (§8)

The specification imposes no requirements on readers to render all extensions. However, implementations MUST degrade gracefully for any syntax they do not support. Graceful degradation means: the content MUST remain visible and legible. Unrecognised syntax SHOULD be displayed as a code block, inline code, or plain text — never silently dropped.

This design is intentional. MarkBooks are a container and content format, not a rendering specification. Readers compete on rendering quality; the specification guarantees portability and legibility.

5. Path Rules

All internal references within a MarkBook — links, image sources, data references — are subject to the following rules:

  • All paths MUST be relative. Paths MUST NOT begin with /.
  • All paths MUST use forward slashes (/) as directory separators, regardless of the host operating system.
  • Paths MAY contain .. components for navigation between directories (e.g., ../index.md from a file in pages/). However, any path that resolves above the archive root MUST be treated as a broken link. Implementations SHOULD surface broken links visually (e.g., with a warning indicator) rather than silently ignoring them.
  • File and directory names SHOULD be lowercase ASCII, using hyphens for word separation (e.g., my-page.md).
  • File and directory names MUST NOT contain characters illegal in ZIP entries or common filesystems: \, :, *, ?, ", <, >, |.

The permissive handling of .. components is intentional: it allows existing Markdown projects with relative cross-references to be packaged as MarkBooks without rewriting links.
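
The rules above can be sketched as a small resolver. This is an illustration under stated assumptions rather than a reference implementation: it resolves references relative to the directory of the current file only (root-relative lookup, which this specification also permits, is not attempted), it expects external URLs to have been filtered out beforehand, and the name resolve_reference is hypothetical.

```python
import posixpath
from typing import Optional

# Characters that file and directory names MUST NOT contain.
ILLEGAL_CHARS = set('\\:*?"<>|')


def resolve_reference(current_file: str, reference: str) -> Optional[str]:
    """Resolve reference against the directory of current_file.

    Returns a path relative to the archive root, or None for a broken
    link (absolute path, illegal character, or escape above the root).
    """
    if reference.startswith("/"):
        return None
    if any(ch in ILLEGAL_CHARS for ch in reference):
        return None
    base = posixpath.dirname(current_file)
    resolved = posixpath.normpath(posixpath.join(base, reference))
    # After normalisation, a leading ".." means the path left the archive.
    if resolved == ".." or resolved.startswith("../"):
        return None
    return resolved
```

Returning None instead of raising lets a reader surface the broken link visually and continue rendering the rest of the page.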

6. Metadata

A MarkBook MAY include a markbook.yaml file at the archive root. This file provides structured metadata for display, indexing, and integrity verification.

6.1 Basic Metadata

title: "Attention Is All You Need"
version: "1.0"
language: "en"
created: "2026-04-08"
modified: "2026-04-08"

All fields are OPTIONAL. Implementations SHOULD use these values for display purposes (e.g., window titles, library indexing) but MUST NOT require the file or any field within it.

If markbook.yaml is absent or lacks a title field, implementations SHOULD extract the title from the first ATX heading (#) in index.md.
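
A possible implementation of this fallback is sketched below. It assumes that "first ATX heading (#)" means the first level-1 heading, and it makes no attempt to skip `#` lines inside fenced code blocks; fallback_title is a hypothetical name.

```python
import re
from typing import Optional

# A level-1 ATX heading: up to three spaces, one '#', whitespace, text,
# and an optional closing run of '#' characters.
ATX_H1 = re.compile(r"^ {0,3}#[ \t]+(.+?)(?:[ \t]+#+)?[ \t]*$", re.MULTILINE)


def fallback_title(index_markdown: str) -> Optional[str]:
    """Return the text of the first level-1 ATX heading, if any."""
    match = ATX_H1.search(index_markdown)
    return match.group(1).strip() if match else None
```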

6.2 Authorship Metadata

For single-author documents:

author: "Author Name"

For multi-author documents, use the authors array (see §11 for signature-related fields):

authors:
  - name: "Alice Chen"
  - name: "Bob Eriksen"

6.3 Extended Metadata

The markbook.yaml file MAY contain additional fields not defined by this specification. Implementations MUST ignore unrecognised fields.

7. Supporting Data

A MarkBook MAY include a data/ directory at the archive root for non-rendered supporting materials. This is intended for use cases where the document is accompanied by datasets, source code, supplementary figures, or other files that readers may wish to access but that are not rendered inline.

Common use cases include:

  • Raw datasets underlying figures or tables in the document
  • Source code for reproducing computational results
  • Supplementary materials (additional figures, extended proofs)
  • Machine-readable structured data (JSON, CSV, HDF5)

The data/ directory imposes no structure requirements. Authors MAY organise contents freely, including the use of subdirectories.

Implementations MUST preserve the data/ directory when re-packaging a MarkBook. Implementations SHOULD provide a mechanism for users to access individual files within data/ (e.g., export, “Show in Finder,” or “Open Externally”). Implementations MUST NOT require the presence of data/ or any of its contents for basic document rendering.

8. Cross-References

MarkBook authors MAY attach identifiers to block-level elements and reference those identifiers elsewhere in the document. This enables auto-numbered, hyperlinked references to figures, tables, equations, sections, and other elements.

8.1 Labels

Labels are attached to elements using the attribute syntax {#identifier}, appended to the element:

Equations:

$$ E = mc^2 $$ {#eq:einstein}

Figures:

![Transformer architecture](assets/architecture.png){#fig:transformer}

Tables:

| Model | BLEU  |
|-------|-------|
| Base  | 27.3  |
| Big   | 28.4  |
: Translation results on WMT 2014. {#tbl:results}

Sections:

## Model Architecture {#sec:architecture}

Code listings:

```python {#lst:training}
model.fit(X_train, y_train, epochs=100)
```

8.2 References

References use @-prefixed identifiers inside square brackets:

As shown in [@fig:transformer], the encoder-decoder structure...

From [@eq:einstein] we derive the energy-momentum relation.

The results in [@tbl:results] demonstrate a clear improvement.

As described in [@sec:architecture], the model uses...

Multiple references MAY be combined in a single bracket, separated by commas:

See [@fig:transformer, @tbl:results] for details.

8.3 Identifier Conventions

Identifiers SHOULD use a namespaced prefix to indicate element type:

| Prefix | Element |
|--------|---------|
| eq: | Equation |
| fig: | Figure |
| tbl: | Table |
| sec: | Section or heading |
| lst: | Code listing |
| thm: | Theorem, definition, or proof |

Prefixes are a convention, not a requirement. An unprefixed label {#foo} is valid.

8.4 Rendering Behaviour

Implementations that support cross-references SHOULD:

  • Auto-number labelled elements by type in document order (e.g., Figure 1, Figure 2, Table 1)
  • Resolve [@identifier] to the appropriate numbered label (e.g., “Figure 3”)
  • Render resolved references as in-document hyperlinks

Implementations that do not support cross-references MUST display the raw [@identifier] text. This ensures that references remain visible and searchable even in minimal readers, and that the target can be found manually by searching for the corresponding {#identifier} label.
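
The numbering and resolution behaviour can be illustrated with a short sketch. It makes several assumptions that go beyond this specification: labels are found with a plain regular expression over the raw Markdown, only the prefixes from §8.3 are numbered, the display words "Section", "Listing", and "Theorem" are chosen for illustration, and references are replaced with plain text rather than hyperlinks. A reference to an unknown identifier is left untouched so that it remains visible.

```python
import re

LABEL = re.compile(r"\{#([A-Za-z][\w:.-]*)\}")
REFERENCE = re.compile(r"\[((?:@[\w:.-]+)(?:\s*,\s*@[\w:.-]+)*)\]")
KIND_NAMES = {"eq": "Equation", "fig": "Figure", "tbl": "Table",
              "sec": "Section", "lst": "Listing", "thm": "Theorem"}


def number_labels(markdown: str) -> dict[str, str]:
    """Map each prefixed label to text such as 'Figure 2', in document order."""
    counters: dict[str, int] = {}
    numbered: dict[str, str] = {}
    for match in LABEL.finditer(markdown):
        identifier = match.group(1)
        prefix = identifier.split(":", 1)[0] if ":" in identifier else ""
        if prefix not in KIND_NAMES or identifier in numbered:
            continue  # unprefixed or duplicate labels are not numbered here
        counters[prefix] = counters.get(prefix, 0) + 1
        numbered[identifier] = f"{KIND_NAMES[prefix]} {counters[prefix]}"
    return numbered


def resolve_references(markdown: str) -> str:
    """Replace [@id] and [@id, @id] with their numbered display text."""
    numbered = number_labels(markdown)

    def replace(match: re.Match) -> str:
        ids = [part.strip()[1:] for part in match.group(1).split(",")]
        if not all(identifier in numbered for identifier in ids):
            return match.group(0)  # keep the raw text visible
        return ", ".join(numbered[identifier] for identifier in ids)

    return REFERENCE.sub(replace, markdown)
```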

9. Versioned Documents and Annotations

A MarkBook MAY contain a .git/ directory at the archive root, making the MarkBook’s contents a Git repository.

9.1 Purpose

The embedded Git repository enables:

  • Version history. The complete authoring history of the document is preserved within the archive.
  • Reader annotations. Highlights, margin notes, and comments can be stored as commits on named branches without modifying the original content.
  • Collaborative review. Multiple readers’ annotations can be merged using standard Git merge operations.
  • Non-destructive editing. The original published content is always recoverable from the initial commit or the main branch.

9.2 Branch Conventions

The main branch represents the canonical published content of the MarkBook.

Reader annotations and notes SHOULD be stored on named branches following the convention:

annotations/{reader-name}

For example: annotations/morten, annotations/review-committee.

9.3 Implementation Requirements

Implementations that support versioned MarkBooks SHOULD use libgit2 or an equivalent library to interact with the embedded repository.

Implementations MUST preserve the .git/ directory when re-packaging a MarkBook. Implementations MUST NOT modify the main branch.

Implementations that do not support Git-based features MUST ignore the .git/ directory.

9.4 Packaging Considerations

ZIP archives do not reliably preserve POSIX file permissions or symbolic links: whether they survive depends on the tools that write and read the archive. Implementations that create MarkBooks containing .git/ directories SHOULD use standard ZIP tooling (e.g., zip -r) and SHOULD verify that the resulting archive produces a functional Git repository when extracted.

10. Content Integrity

A MarkBook MAY include a MANIFEST.sha256 file at the archive root to enable content integrity verification.

10.1 Manifest Format

The manifest file contains the SHA-256 hash of every content file in the archive, one entry per line, in the format produced by the sha256sum utility:

e3b0c44298fc1c149afb...  index.md
a7ffc6f8bf1ed766518c...  pages/chapter-1.md
2c26b46b68ffc68ff99b...  assets/figure-1.png
9f86d081884c7d659a2f...  data/measurements.csv

Each line consists of the lowercase hexadecimal SHA-256 digest, two spaces, and the file path relative to the archive root.

10.2 Excluded Files

The following files MUST NOT be listed in the manifest:

  • MANIFEST.sha256 (the manifest itself)
  • MANIFEST.sha256.sig or any files in signatures/
  • MANIFEST.sha256.ots (timestamp proof)

These files relate to the integrity verification process itself and cannot self-referentially appear in the manifest.

Files within .git/ SHOULD NOT be listed in the manifest, as Git maintains its own internal integrity mechanisms.
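
Manifest generation with these exclusions can be sketched as follows. The helper names are hypothetical, and sorting entries by path is an assumption made for reproducible output; this specification does not prescribe an ordering.

```python
import hashlib
import zipfile

EXCLUDED_FILES = {"MANIFEST.sha256", "MANIFEST.sha256.sig", "MANIFEST.sha256.ots"}


def is_excluded(name: str) -> bool:
    """True for entries that do not belong in the manifest."""
    return (
        name in EXCLUDED_FILES
        or name.startswith(("signatures/", ".git/"))
        or name.endswith("/")  # directory entries have no content to hash
    )


def build_manifest(archive: zipfile.ZipFile) -> str:
    """Return manifest text in sha256sum format for an open archive."""
    lines = []
    for name in sorted(archive.namelist()):
        if is_excluded(name):
            continue
        digest = hashlib.sha256(archive.read(name)).hexdigest()
        lines.append(f"{digest}  {name}")  # digest, two spaces, path
    return "\n".join(lines) + "\n"
```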

10.3 Verification

Implementations that support integrity verification SHOULD:

  • Compute the SHA-256 hash of each content file in the archive
  • Compare computed hashes against the manifest entries
  • Report any mismatches to the user

Implementations MUST warn the user visibly if verification fails. A failed integrity check indicates that the document has been modified since the manifest was generated.

Implementations that do not support integrity verification MUST ignore the MANIFEST.sha256 file.
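
The verification steps can be sketched directly against the archive, without extracting it. The function name verify_manifest is hypothetical. The sketch reports listed files that are missing or whose hash differs; it does not report files that are present in the archive but absent from the manifest, which an implementation may also wish to flag.

```python
import hashlib
import zipfile


def verify_manifest(archive: zipfile.ZipFile) -> list[str]:
    """Return the manifest paths that are missing or fail their hash check."""
    failures = []
    manifest = archive.read("MANIFEST.sha256").decode("utf-8")
    names = set(archive.namelist())
    for line in manifest.splitlines():
        if not line.strip():
            continue
        # Each entry is: hex digest, two spaces, path relative to the root.
        expected, _, path = line.partition("  ")
        if path not in names:
            failures.append(path)
            continue
        actual = hashlib.sha256(archive.read(path)).hexdigest()
        if actual != expected.lower():
            failures.append(path)
    return failures
```

An empty result means every listed file matched; any other result should trigger the visible warning required above.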

11. Authorship and Signatures

A MarkBook MAY include cryptographic signatures attesting to the authenticity and integrity of its content. Each signature is a detached cryptographic signature over the MANIFEST.sha256 file.

11.1 Single Author

For a single-author document, the signature MAY be placed directly at the archive root:

MANIFEST.sha256.sig

11.2 Multiple Authors

For multi-author documents, signatures are stored in the signatures/ directory, one file per signing author:

signatures/
├── alice.sig
├── bob.sig
└── carol.sig

Each .sig file is a detached signature over the same MANIFEST.sha256, produced independently by that author’s private key. Authors do not need to sign simultaneously or in any particular order. It is valid for a MarkBook to list more authors than there are signatures — this simply means that some authors have not (yet) cryptographically attested to the content.

11.3 Signature Algorithms

This specification does not mandate a specific signing algorithm. The following are RECOMMENDED, in order of preference:

  1. Ed25519 via minisign or signify — small keys, small signatures, fast verification, no legacy baggage.
  2. Ed25519 via SSH — leverages existing SSH key infrastructure.
  3. PGP/GPG — widely deployed in academic and open-source communities.

The signing algorithm and public key for each author SHOULD be declared in markbook.yaml.

11.4 Metadata for Signed Documents

authors:
  - name: "Alice Chen"
    role: "corresponding"
    key_url: "https://alice.example.com/.well-known/markbook/keys/primary.pub"
    signature: "signatures/alice.sig"
    contributions:
      - "Conceptualization"
      - "Methodology"
      - "Writing – original draft"
  - name: "Bob Eriksen"
    role: "equal_contribution"
    key_url: "https://bob.example.com/.well-known/markbook/keys/primary.pub"
    signature: "signatures/bob.sig"
    contributions:
      - "Software"
      - "Data curation"
  - name: "Carol Fischer"
    signature: "signatures/carol.sig"
    contributions:
      - "Visualization"
      - "Writing – review & editing"

11.5 Author Roles and Contributions

The role field is freeform and not enumerated by this specification. Common values include corresponding, equal_contribution, senior_author.

The contributions field, when present, SHOULD use terms from the CRediT (Contributor Roles Taxonomy) standard, which defines 14 roles: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing.

An author’s cryptographic signature implicitly attests to their listed contributions. If an author signs a MarkBook that declares their contributions, they are cryptographically confirming their role in the work.

11.6 Key Discovery

Authors SHOULD publish their public keys at a well-known URL on their personal or institutional domain:

https://{domain}/.well-known/markbook/keys/{key-id}.pub

Alternatively, public keys MAY be published via:

  • DNS TXT records at _markbook.{domain}
  • Institutional or community key registries

This specification does not mandate a key discovery mechanism. The key_url field in markbook.yaml provides the author’s preferred location for their public key.

11.7 Verification Behaviour

Implementations that support signature verification SHOULD:

  • Verify each signature in signatures/ against MANIFEST.sha256 using the corresponding public key
  • Display the verification status of each author individually
  • Clearly distinguish between signed (verified), signed (key not found), and unsigned authors

Implementations that do not support signature verification MUST ignore the signatures/ directory and all .sig files.
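
The per-author status display can be sketched as a small decision function. Everything here is illustrative: signature_status is a hypothetical name, and fetch_key and verify are caller-supplied stand-ins for key discovery (§11.6) and for whichever signing tool is in use. The extra "signature invalid" result is an assumption added for the case where a key is found but the check fails.

```python
from typing import Callable, Iterable, Mapping, Optional


def signature_status(
    author: Mapping[str, str],
    archive_names: Iterable[str],
    fetch_key: Callable[[str], Optional[bytes]],
    verify: Callable[[bytes, str], bool],
) -> str:
    """Classify one entry of the authors array in markbook.yaml."""
    signature_path = author.get("signature")
    if not signature_path or signature_path not in set(archive_names):
        return "unsigned"
    key_url = author.get("key_url")
    public_key = fetch_key(key_url) if key_url else None
    if public_key is None:
        return "signed (key not found)"
    if verify(public_key, signature_path):
        return "signed (verified)"
    return "signature invalid"
```

With the metadata from §11.4, an author entry that has a signature but no key_url would be reported as "signed (key not found)".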

11.8 Revisions and Re-signing

When the content of a MarkBook is modified, the manifest changes, which invalidates all existing signatures. This is correct and intentional — it forces authors to re-sign any revised version. Combined with the Git layer (§9), this provides a verifiable audit trail: each version of the document can be independently verified as to who signed it and when.

12. Publication Timestamping

A MarkBook MAY include a cryptographic timestamp proof establishing that the document existed in its current form at a specific point in time. This provides a publicly verifiable, trustless record of publication priority.

12.1 Motivation

Establishing the priority of a publication, that is, proving that a specific document with specific content existed at a specific time, is critical in academic research, intellectual property, and legal contexts. Existing mechanisms (journal submission dates, preprint server timestamps) depend on trusting a single third party. A blockchain-anchored timestamp is independently verifiable by anyone, depends on no single institution, and cannot be retroactively altered.

12.2 Mechanism

The RECOMMENDED timestamping method is OpenTimestamps, which anchors SHA-256 hashes in the Bitcoin blockchain via Merkle tree aggregation at effectively zero cost per document.

The timestamp proof is stored as:

MANIFEST.sha256.ots

This file contains the OpenTimestamps proof linking the SHA-256 hash of MANIFEST.sha256 to a specific Bitcoin transaction at a specific block height.

12.3 Metadata

integrity:
  manifest: "MANIFEST.sha256"
  timestamp:
    method: "opentimestamps"
    proof: "MANIFEST.sha256.ots"
    bitcoin_block: 892451
    committed: "2026-04-08T14:23:00Z"

The bitcoin_block and committed fields are informational and SHOULD reflect the confirmed anchor point. Verification MUST be performed against the .ots proof file and the blockchain, not against these metadata fields.

12.4 Pending Timestamps

An OpenTimestamps proof requires confirmation in a Bitcoin block, which may take several hours. A MarkBook MAY be published with a pending .ots proof. The proof file SHOULD be updated once the timestamp is confirmed. Implementations SHOULD distinguish between confirmed and pending timestamps in their display.

12.5 Verification

Verification requires the ots command-line tool or equivalent library, plus access to Bitcoin block headers (roughly 70 MB for the complete chain, at 80 bytes per header).

Implementations that support timestamp verification SHOULD:

  • Verify the .ots proof against the Bitcoin blockchain
  • Display the confirmed timestamp to the user
  • Clearly indicate if the timestamp is pending or cannot be verified

Implementations that do not support timestamp verification MUST ignore the .ots file.

12.6 Alternative Timestamping Methods

This specification RECOMMENDS OpenTimestamps but does not prohibit alternative mechanisms. Any system that produces a verifiable, independent proof that a specific hash existed at a specific time is acceptable. The method field in markbook.yaml identifies the mechanism used.

Possible alternatives include:

  • RFC 3161 trusted timestamps (legally recognised under EU eIDAS regulation)
  • Other blockchain-anchored timestamping services
  • Certificate Transparency-style append-only logs

13. Constraints

The following constraints apply to all MarkBooks:

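  • No executable content. MarkBooks MUST NOT contain scripts, macros, or embedded HTML that requires JavaScript execution in order to render. Source code stored in data/ (§7) is inert supporting material and is never executed by a reader. This is a security boundary: a MarkBook is a document, not an application.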
  • No absolute paths. All internal references MUST be relative (see §5).
  • No required network access. A MarkBook MUST be fully renderable offline when all referenced assets are present in the archive. Network access MAY be used for optional operations such as key discovery (§11.6) and timestamp verification (§12.5).
  • UTF-8 only. All text files (.md, .yaml) MUST be encoded as UTF-8 without BOM.

14. Content Types

The following asset types are expected to be widely supported by implementations:

| Category | Extensions |
|----------|------------|
| Images | .png, .jpg, .jpeg, .gif, .svg, .webp |
| Metadata | .yaml |
| Documents | .md |

Implementations MAY support additional asset types but MUST NOT require them for basic rendering.

The data/ directory (§7) may contain files of any type. Implementations are not expected to render these files but SHOULD make them accessible for export.

15. MIME Type Registration

The MIME type application/vnd.markant.markbook+zip is pending registration with IANA under the vendor tree, as specified in RFC 6838 §3.2.

Until registration is complete, implementations SHOULD use this MIME type for content negotiation and file type identification. Operating system file associations SHOULD map the .mkb extension to this MIME type.

16. Specification Versioning

This is version 1.0 of the MarkBooks specification.

Future versions within the 1.x series MUST maintain backward compatibility: a conforming v1.0 reader MUST be able to open any v1.x MarkBook, ignoring features it does not understand. This is the reason all features beyond index.md are optional — a v1.0 reader that supports only basic Markdown rendering is fully conforming.

Breaking changes that alter required behaviour or invalidate previously valid MarkBooks require a major version increment (v2.0).

17. Design Rationale

17.1 Why ZIP?

ZIP is universally supported, well-specified (ISO/IEC 21320-1), and understood by every operating system. Its central directory allows individual files to be located and extracted without decompressing the entire archive. It is the same container used by EPUB, DOCX, XLSX, and JAR. Choosing ZIP means MarkBooks can be created with standard command-line tools (zip, unzip) and inspected by any file manager.

17.2 Why not EPUB?

EPUB is the closest existing format to MarkBooks. However, EPUB requires XHTML content documents, an OPF package file, a navigation document (NCX in EPUB 2, XHTML in EPUB 3), and a specific META-INF directory structure. This complexity makes EPUB difficult to generate programmatically and error-prone for a large language model to produce in a single response. MarkBooks are “EPUB minus everything you don’t need”: the minimum viable self-contained document format.

17.3 Why CommonMark + extensions rather than a fixed dialect?

Markdown is a family of dialects, and the ecosystem continues to evolve. Mandating a specific set of extensions (e.g., GFM tables, KaTeX math) would freeze the format at the state of the art in 2026. Instead, the specification mandates CommonMark as the baseline and requires graceful degradation for anything beyond it. This means the format can absorb new Markdown extensions — including those that do not yet exist — without a specification revision.

17.4 Why Git for annotations?

Reader annotations are a form of version control: they are changes made to a document by someone other than the author. Git is the most widely deployed version control system in the world, with robust tooling for branching, merging, and diffing. Embedding a Git repository in the MarkBook means annotations are portable (they travel with the document), mergeable (multiple readers’ notes can be combined), and non-destructive (the original content is always recoverable). No custom annotation format is required.

17.5 Why blockchain timestamping?

Academic priority disputes, intellectual property claims, and legal proceedings all require proving that a specific document existed at a specific time. Existing timestamping mechanisms depend on trusting a single institution (a journal, a preprint server, a notary). Blockchain-anchored timestamps are independently verifiable by anyone with access to the blockchain, require no trusted third party, and are immutable once confirmed. OpenTimestamps provides this at zero marginal cost per document.

17.6 Why per-author signatures?

Academic authorship is a collective act, but accountability is individual. Listing five authors on a paper currently provides no mechanism to verify that all five approved the final version. Per-author signatures make this verifiable: each author independently attests to the content. This also provides a cryptographic audit trail for revisions — if a co-author declines to re-sign a revised version, that absence is a meaningful and visible signal.

18. Security Considerations

18.1 Executable Content

MarkBooks MUST NOT contain executable content (§13). Implementations MUST NOT execute scripts, macros, or active content found within a MarkBook, even if embedded in HTML blocks within Markdown files. This constraint exists to ensure that MarkBooks are safe to open from untrusted sources.

18.2 Path Traversal

Paths resolving above the archive root via .. components MUST be treated as broken links, not as references to the host filesystem (§5). Implementations MUST NOT resolve paths outside the archive boundary.

18.3 Signature Trust

A valid cryptographic signature proves that the holder of a specific private key signed the manifest. It does not, by itself, prove the identity of the signer. Key-to-identity binding depends on external trust mechanisms (§11.6) — domain-hosted keys, DNS records, or key registries. Implementations SHOULD clearly communicate the distinction between “signature valid” (the cryptography checks out) and “author verified” (the key is confirmed to belong to the stated author).

18.4 Timestamp Limitations

A blockchain-anchored timestamp proves that a document existed no later than a specific time. It does not prove that the document was published at that time: the author may have created the timestamp privately and disclosed it later. It also does not prove that the timestamped version was the first version of the content. Timestamps establish the latest date by which the content existed, not a definitive publication event.

18.5 ZIP-Specific Risks

ZIP archives may contain filenames with path separators or absolute paths that, if naively extracted, could overwrite files outside the intended directory (the “Zip Slip” vulnerability). Implementations MUST sanitise file paths during extraction and MUST NOT extract files to locations outside the intended output directory.
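
One way to meet this requirement is to validate every entry before extracting anything. The sketch below uses Python's standard library; safe_extract is a hypothetical name. Python's own ZipFile.extractall already rewrites unsafe member names, so the explicit check here serves to reject a suspicious archive outright rather than silently relocate its entries.

```python
import os
import zipfile


def safe_extract(archive: zipfile.ZipFile, destination: str) -> None:
    """Extract archive into destination, refusing entries that escape it."""
    root = os.path.realpath(destination)
    for info in archive.infolist():
        target = os.path.realpath(os.path.join(root, info.filename))
        # commonpath equals root only when target lies inside root.
        if os.path.commonpath([root, target]) != root:
            raise ValueError(f"unsafe path in archive: {info.filename!r}")
    # Only extract once every entry has passed the check.
    archive.extractall(root)
```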


Appendix A: Minimal Valid MarkBook

The simplest possible MarkBook:

minimal.mkb (ZIP archive)
└── index.md

Containing:

# Hello, World

This is a MarkBook.

Created with:

printf '# Hello, World\n\nThis is a MarkBook.\n' > index.md
zip minimal.mkb index.md

Appendix B: Complete Example with All Optional Features

research-paper.mkb (ZIP archive)
├── index.md
├── markbook.yaml
├── pages/
│   ├── introduction.md
│   ├── methodology.md
│   ├── results.md
│   └── conclusion.md
├── assets/
│   ├── figure-1.svg
│   ├── figure-2.png
│   └── diagrams/
│       └── architecture.svg
├── data/
│   ├── raw-measurements.csv
│   ├── analysis.py
│   └── supplementary-figures.pdf
├── signatures/
│   ├── alice.sig
│   └── bob.sig
├── MANIFEST.sha256
├── MANIFEST.sha256.ots
└── .git/

With markbook.yaml:

title: "A Novel Approach to Sensor Fusion"
version: "1.0"
language: "en"
created: "2026-04-08"

authors:
  - name: "Alice Chen"
    role: "corresponding"
    key_url: "https://alice.example.com/.well-known/markbook/keys/primary.pub"
    signature: "signatures/alice.sig"
    contributions:
      - "Conceptualization"
      - "Methodology"
      - "Writing – original draft"
  - name: "Bob Eriksen"
    role: "equal_contribution"
    key_url: "https://bob.example.com/.well-known/markbook/keys/primary.pub"
    signature: "signatures/bob.sig"
    contributions:
      - "Software"
      - "Data curation"
      - "Visualization"

integrity:
  manifest: "MANIFEST.sha256"
  timestamp:
    method: "opentimestamps"
    proof: "MANIFEST.sha256.ots"
    bitcoin_block: 892451
    committed: "2026-04-08T14:23:00Z"

Appendix C: Verification Quickstart

Verify a MarkBook’s integrity using standard command-line tools:

# Extract the MarkBook
unzip paper.mkb -d paper/
cd paper/

# 1. Verify content integrity (requires sha256sum)
sha256sum -c MANIFEST.sha256

# 2. Verify author signature (requires minisign)
minisign -Vm MANIFEST.sha256 -p alice.pub -x signatures/alice.sig

# 3. Verify publication timestamp (requires ots-cli)
ots verify MANIFEST.sha256.ots

No proprietary tools are required at any step.


This specification is released under CC0 1.0 Universal. No rights reserved. Anyone may implement, extend, or redistribute this specification without restriction.