Markdown to PDF Is Not What You Think (Here’s What Actually Happens)

How Markdown Is Converted to PDF: From HTML to Rendering Explained

If you’ve ever converted a Markdown document into a PDF, you might have run into questions like:

  • Why doesn’t the layout look the way I expect?
  • Why is it so hard to add headers or logos?
  • Why does CSS sometimes work—and sometimes not?

All of these issues become much clearer once you understand how Markdown is actually converted into PDF.

In this article, we’ll break down what tools like md-to-pdf are doing under the hood—from a practical, real-world perspective.


Does Markdown Convert Directly to PDF?

Short answer: No, it doesn’t.

Many people think of the process as:

Markdown → PDF

But in reality, there’s a critical step in between.

Overlooking that step is exactly why layout issues and customization problems seem so confusing.


The Core Idea: Markdown → HTML → PDF

The actual conversion flow looks like this:

Markdown → HTML → Browser Rendering → PDF

In other words:

Markdown is first converted into HTML, then rendered in a browser, and finally “printed” as a PDF.

Let’s walk through each step.


1. Converting Markdown to HTML

The first step is transforming Markdown into HTML.

For example:

# Title

becomes:

<h1>Title</h1>

This conversion is handled by Markdown parsers such as:

  • marked
  • markdown-it

These tools parse Markdown syntax and produce HTML structure.

Important: At this stage, only the structure is defined—not the visual appearance.
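To make the structure-only point concrete, here is a toy converter written purely as an illustration. It handles only # headings and plain paragraphs, and toyMarkdownToHtml is a hypothetical name; real parsers such as marked or markdown-it cover the full syntax and should be used in practice.

```javascript
// Toy illustration only: supports ATX headings (# to ######) and paragraphs.
// Real parsers such as marked or markdown-it handle the full Markdown syntax.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function toyMarkdownToHtml(markdown) {
  return markdown
    .split(/\n{2,}/) // blocks are separated by blank lines
    .map((block) => block.trim())
    .filter((block) => block.length > 0)
    .map((block) => {
      const heading = /^(#{1,6})\s+(.*)$/.exec(block);
      if (heading) {
        const level = heading[1].length;
        return `<h${level}>${escapeHtml(heading[2])}</h${level}>`;
      }
      return `<p>${escapeHtml(block)}</p>`;
    })
    .join("\n");
}

console.log(toyMarkdownToHtml("# Title\n\nSome text."));
```

Note that the output contains tags only. Nothing in it says which font, size, or margin to use.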


2. Styling with HTML + CSS

Next, CSS is applied to define the layout and appearance.

This is where everything visual is decided:

  • Fonts
  • Margins
  • Line spacing
  • Heading sizes
  • Page breaks
  • Headers and footers

In short:

Almost everything about the final look of your PDF is decided here.
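As a minimal sketch of this step, the parser output can be wrapped in a full HTML document with the stylesheet inlined. The helper name buildDocument and the CSS values are assumptions for illustration, not what any particular tool does internally.

```javascript
// Hypothetical helper: wraps parser output in a full HTML document and
// inlines the stylesheet that decides how the PDF will look.
function buildDocument(bodyHtml, css) {
  return [
    "<!DOCTYPE html>",
    "<html>",
    "<head>",
    '<meta charset="utf-8">',
    `<style>${css}</style>`,
    "</head>",
    `<body>${bodyHtml}</body>`,
    "</html>",
  ].join("\n");
}

// Example stylesheet (values are illustrative, not recommendations).
const printCss = `
  body { font-family: sans-serif; line-height: 1.6; }
  h1 { font-size: 24pt; }
`;

const styledHtml = buildDocument("<h1>Title</h1>", printCss);
console.log(styledHtml);
```

Changing the PDF’s appearance then means changing printCss, not the Markdown.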


Why This Step Causes Problems

Common requirements like:

  • Adding a logo to the header
  • Repeating headers on every page
  • Controlling page breaks

must all be implemented using HTML and CSS.

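However, this isn’t normal web rendering. It’s print layout rendering, which comes with constraints:

  • Some CSS behaves differently (for example, backgrounds are not printed unless you enable them)
  • Positioning may not work as expected, because content is split across pages
  • Page breaks can feel unpredictable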
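One concrete case of CSS behaving differently: Puppeteer’s page.pdf() renders with print CSS media, and background graphics are left out unless printBackground is set. The sketch below makes both choices explicit. It assumes you already have a Puppeteer page object; exportPdf and useScreenStyles are hypothetical names, while emulateMediaType and printBackground come from Puppeteer’s API.

```javascript
// Sketch: make the print/screen difference an explicit choice.
// `page` is assumed to be a Puppeteer Page; `exportPdf` is a hypothetical helper.
async function exportPdf(page, path, { useScreenStyles = false } = {}) {
  if (useScreenStyles) {
    // page.pdf() uses print CSS media by default; this switches to the
    // rules the browser would apply on screen.
    await page.emulateMediaType("screen");
  }
  return page.pdf({
    path,
    format: "A4",
    printBackground: true, // background colors and images are omitted by default
  });
}
```

If a layout looks right in the browser window but wrong in the PDF, toggling useScreenStyles is a quick way to check whether print-specific rules are the cause.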

3. Rendering in a Browser and Exporting to PDF

Finally, the styled HTML is rendered in a browser and exported as a PDF.

This is typically done using:

  • Puppeteer

Puppeteer controls a headless version of Chrome and performs these steps:

  1. Load the HTML
  2. Apply CSS and render the page
  3. Export using the browser’s “Print to PDF” feature

So essentially:

What the browser renders for print is what gets turned into a PDF.
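The three steps above can be sketched in a few lines. This assumes Puppeteer is installed (npm install puppeteer); htmlToPdf is a hypothetical helper name, and real tools add more options and error handling.

```javascript
// Sketch of the three steps using Puppeteer (assumes `npm install puppeteer`).
// `htmlToPdf` is a hypothetical helper name.
async function htmlToPdf(html, outputPath) {
  // Loaded lazily so the dependency is only needed when a PDF is produced.
  const { default: puppeteer } = await import("puppeteer");
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Steps 1 and 2: load the HTML; the browser applies CSS and lays it out.
    await page.setContent(html, { waitUntil: "load" });
    // Step 3: the scripted equivalent of "Print to PDF".
    await page.pdf({ path: outputPath, format: "A4" });
  } finally {
    await browser.close(); // always release the browser process
  }
}
```

Calling htmlToPdf("<h1>Title</h1>", "output.pdf") would write a one-page A4 PDF. The try/finally block is there so the headless browser is closed even when rendering fails.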


Why This Approach Is Used

You might wonder: why not generate PDFs directly?

Because the browser already solves the hard layout problems.

Using a browser allows:

  • Broad CSS support
  • Web fonts
  • Complex layouts

In contrast, direct PDF generation often leads to:

  • Difficult layout control
  • Font issues
  • Higher implementation complexity

That’s why browser-based rendering is such a common choice for tools like md-to-pdf.


Common Pitfalls in Real-World Usage

Once you understand the pipeline, most of the common issues start to make sense. The key idea to keep in mind is this:

You are not rendering a web page—you are generating a printed document.

Many problems come from treating PDF output like a normal browser layout. But the moment printing is involved, the rules change.


1. Headers and Footers Don’t Repeat Properly

A common requirement is to display headers and footers on every page—logos, titles, or page numbers. However, standard HTML alone cannot achieve this reliably because it lacks a page-based model.

To implement repeating headers and footers, you need to define them in the rendering layer. With Puppeteer, that means the headerTemplate and footerTemplate options of page.pdf().

How to approach it (Puppeteer)

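import { readFileSync } from "node:fs";
import puppeteer from "puppeteer";

// Header and footer templates are rendered separately from the page and
// generally do not load remote images, so embed the logo as a data URI.
// Assumes a logo.png file in the working directory.
const logo = readFileSync("logo.png").toString("base64");

const browser = await puppeteer.launch();
const page = await browser.newPage();

await page.setContent(`
  <html>
    <body>
      <h1>My Document</h1>
      <p>Content goes here...</p>
    </body>
  </html>
`);

await page.pdf({
  path: "output.pdf",
  format: "A4",
  displayHeaderFooter: true,
  // Leave room for the header and footer, which are drawn inside the margins
  margin: {
    top: "80px",
    bottom: "80px",
  },
  headerTemplate: `
    <div style="font-size:10px; width:100%; text-align:center;">
      <img src="data:image/png;base64,${logo}" style="height:20px;" />
      <span>My Report</span>
    </div>
  `,
  // Puppeteer fills in elements with the pageNumber and totalPages classes
  footerTemplate: `
    <div style="font-size:10px; width:100%; text-align:center;">
      Page <span class="pageNumber"></span> of <span class="totalPages"></span>
    </div>
  `,
});

await browser.close();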

2. Logos Don’t Stay in Position

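Logos often behave unpredictably when positioned using typical CSS like position: absolute. An absolutely positioned element is placed once, relative to its containing block, and does not repeat on each page. Once the content is split into pages, it may land somewhere you did not intend.

Instead of forcing positioning, structure your layout depending on whether the logo is global (on every page) or local (shown once, for example next to the title).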

How to approach it (HTML + safe layout)

<header class="header">
  <img src="logo.png" class="logo" />
  <h1>Report Title</h1>
</header>

<style>
.header {
  display: flex;
  align-items: center;
  gap: 12px;
}

.logo {
  height: 32px;
}

/* Avoid absolute positioning for critical elements */
</style>

If the logo must appear on every page

Use Puppeteer’s headerTemplate option (shown earlier) instead of placing the logo in the page’s HTML.
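One practical detail: header and footer templates generally do not fetch remote images, so a logo is usually embedded as a base64 data URI. Here is a minimal sketch of a helper that does this. The name buildHeaderTemplate and the default MIME type are assumptions for illustration.

```javascript
// Hypothetical helper: builds a headerTemplate string with the logo embedded
// as a base64 data URI, so no network request is needed at print time.
function buildHeaderTemplate(logoBytes, title, mimeType = "image/png") {
  const logoDataUri = `data:${mimeType};base64,${logoBytes.toString("base64")}`;
  // Escape the title so characters like < or & cannot break the template.
  const safeTitle = title
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
  return `
    <div style="font-size:10px; width:100%; text-align:center;">
      <img src="${logoDataUri}" style="height:20px;" />
      <span>${safeTitle}</span>
    </div>`;
}
```

With Node’s fs module, usage might look like headerTemplate: buildHeaderTemplate(fs.readFileSync("logo.png"), "My Report"), passed to page.pdf() together with displayHeaderFooter: true.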


3. Page Breaks Feel Random

Without explicit control, page breaks will appear in awkward places because the browser decides where to split content.

You need to guide the browser using print-specific CSS.

How to approach it (CSS)

/* Start each section on a new page.
   page-break-* is the legacy spelling; break-* is the current one.
   Declaring both keeps older engines working. */
.section {
  page-break-before: always;
  break-before: page;
}

/* Prevent tables and section bodies from splitting across pages */
.table,
.section-content {
  page-break-inside: avoid;
  break-inside: avoid;
}

Example HTML

<div class="section">
  <h2>Section Title</h2>
  <div class="section-content">
    <p>Content that should stay together...</p>
  </div>
</div>

4. Fonts (Especially Non-Latin) Don’t Render Correctly

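Fonts may fail to render correctly if they are not loaded or not available in the environment. On a server or CI container with no Japanese, Chinese, or Korean fonts installed, for example, missing glyphs typically show up as empty boxes.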

To ensure consistent rendering, use web fonts and wait for them to load before generating the PDF.

How to approach it (Web fonts + loading)

<head>
  <link href="https://fonts.googleapis.com/css2?family=Noto+Sans+JP&display=swap" rel="stylesheet">
  <style>
    body {
      font-family: 'Noto Sans JP', sans-serif;
    }
  </style>
</head>

Ensure fonts are loaded before PDF generation

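import puppeteer from "puppeteer";

const browser = await puppeteer.launch();
const page = await browser.newPage();

// networkidle0 waits until network requests, including font files, have settled
await page.goto("http://localhost:3000", { waitUntil: "networkidle0" });

// Wait for fonts to be ready
await page.evaluateHandle("document.fonts.ready");

await page.pdf({
  path: "output.pdf",
  format: "A4",
});

await browser.close();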

Mini Takeaway

Across all these issues, there’s a consistent pattern:

Most problems come from assuming web layout rules apply to print output.

Once you shift your mindset to print rendering, it becomes much easier to decide:

  • what belongs in HTML/CSS
  • what should be controlled by the rendering engine

As a rough division of responsibilities:

CSS

  • Layout
  • Page breaks
  • Typography

HTML Templates

  • Structure
  • Logo placement
  • Content grouping

Puppeteer Options

  • Header/footer templates
  • Margins
  • Print settings

Summary

Converting Markdown to PDF is not a direct transformation.

It involves:

  • Converting Markdown to HTML
  • Styling with CSS
  • Rendering in a browser and exporting as PDF

In other words:

It’s essentially browser printing.

Once you understand this, you can clearly see:

  • Why layouts break
  • Why customization is tricky
