SuperDoc handles multiple content formats with different levels of support. It is a Word editor that accepts other formats as input, not a universal document converter.

Philosophy

SuperDoc operates on Microsoft Word’s document model. HTML and Markdown are normalized into Word concepts on import. For perfect fidelity, use DOCX or JSON formats.

Supported Formats

| Format | Import Support | Export Support | Round-Trip Fidelity | Primary Use Case |
|---|---|---|---|---|
| DOCX | Full compatibility | Full compatibility | ✅ Perfect | Word documents, complete fidelity |
| JSON | Full support | Full support | ✅ Perfect | Automation, programmatic control |
| HTML | Structure only | Structure only | ⚠️ Visual only | AI content, migration from web |
| Markdown | CommonMark only | CommonMark only | ⚠️ Visual only | Documentation, AI-generated content |
| Text | Plain text only | Plain text only | ✅ Perfect | Simple content |
| PDF | Not supported | Via API only | N/A | Final output format |

When to Use Each Format

Use DOCX/JSON for:

  • Preserving all formatting
  • Round-trip editing
  • Complex documents
  • Production workflows

Use HTML/Markdown for:

  • Basic content import when JSON isn’t available
  • Migrating from legacy systems
  • Simple text with minimal structure
  • Avoid for complex documents or when formatting must be preserved
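
The guidance above can be captured as a small lookup, sketched here under the assumption that the values in the Supported Formats table are the only inputs. `FORMATS` and `formatsFor` are hypothetical names for illustration and are not part of SuperDoc.

```javascript
// Capabilities copied from the "Supported Formats" table.
// This object is illustrative; SuperDoc does not export it.
const FORMATS = {
  docx:     { canImport: true,  roundTrip: 'perfect' },
  json:     { canImport: true,  roundTrip: 'perfect' },
  html:     { canImport: true,  roundTrip: 'visual' },
  markdown: { canImport: true,  roundTrip: 'visual' },
  text:     { canImport: true,  roundTrip: 'perfect' },
  pdf:      { canImport: false, roundTrip: 'n/a' },
};

// Hypothetical helper: lists the input formats that meet a requirement.
function formatsFor({ needsPerfectRoundTrip = false } = {}) {
  return Object.keys(FORMATS).filter((name) => {
    const format = FORMATS[name];
    if (!format.canImport) return false; // PDF cannot be imported
    return needsPerfectRoundTrip ? format.roundTrip === 'perfect' : true;
  });
}
```

Plain text also round-trips perfectly according to the table, but it carries no formatting, so DOCX and JSON remain the choices for production documents.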

Content Import Methods

Method 1: Initialize with Content

Load content when creating the editor or SuperDoc instance.

SuperDoc Component

// DOCX file (perfect fidelity)
new SuperDoc({
  selector: '#editor',
  document: docxFile
});

// HTML content (structure preserved, styles stripped)
new SuperDoc({
  selector: '#editor',
  document: blankDocx,  // Must have styles defined
  html: '<h1>Title</h1><p>Content</p>'
});

// Markdown content (converted to Word structure)
new SuperDoc({
  selector: '#editor',
  document: blankDocx,
  markdown: '# Title\n\nContent with **formatting**'
});

// JSON schema (full control)
new SuperDoc({
  selector: '#editor',
  document: blankDocx,
  jsonOverride: documentSchema
});
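
The four initialization variants differ only in which option carries the content. A minimal sketch of a helper that picks the option name is shown below; `buildSuperDocConfig` and the `OPTION_FOR_TYPE` map are hypothetical, and only the option names `html`, `markdown`, and `jsonOverride` come from the examples above.

```javascript
// Maps a content type to the SuperDoc option used in the examples above.
// The 'schema' key is an assumed label for JSON content.
const OPTION_FOR_TYPE = {
  html: 'html',
  markdown: 'markdown',
  schema: 'jsonOverride',
};

// Hypothetical helper: builds the options object for `new SuperDoc(...)`.
function buildSuperDocConfig({ selector, document, content, contentType }) {
  const config = { selector, document };
  if (content === undefined) return config; // plain DOCX load, nothing to add

  const option = OPTION_FOR_TYPE[contentType];
  if (!option) {
    throw new Error(`Unsupported contentType for initialization: ${contentType}`);
  }
  config[option] = content;
  return config;
}
```

It would be used as `new SuperDoc(buildSuperDocConfig({ selector: '#editor', document: blankDocx, content: '<h1>Title</h1>', contentType: 'html' }))`.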

Method 2: Insert into Existing Document

Add content to an already-loaded document using commands.

// Insert at current cursor position
editor.commands.insertContent(content, { 
  contentType: 'html'  // or 'markdown', 'text', 'schema'
});

// AI content integration example (`ai.generate` is a placeholder for your AI client)
const aiResponse = await ai.generate("Create a contract");
editor.commands.insertContent(aiResponse, { 
  contentType: 'html'  // AI output gets converted to Word structure
});
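
Because `contentType` is a free-form string, a typo can go unnoticed. The wrapper below is a minimal sketch that rejects unknown values before calling `insertContent`. `insertTyped` is a hypothetical name, and the list of types is taken from the comment in the example above.

```javascript
// Content types listed in the insertContent example above.
const CONTENT_TYPES = ['html', 'markdown', 'text', 'schema'];

// Hypothetical wrapper: validates contentType, then delegates to the editor.
function insertTyped(editor, content, contentType) {
  if (!CONTENT_TYPES.includes(contentType)) {
    throw new TypeError(
      `Unknown contentType "${contentType}"; expected one of ${CONTENT_TYPES.join(', ')}`
    );
  }
  return editor.commands.insertContent(content, { contentType });
}
```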

HTML Import/Export Behavior

What Gets Preserved

| HTML Element | Import Result | Export Result | Notes |
|---|---|---|---|
| <h1> to <h6> | Word heading styles | Same heading level | Requires styles in document |
| <p>, <div> | Normal paragraph | <p> | All become paragraphs |
| <strong>, <b> | Bold mark | <strong> | Character formatting |
| <em>, <i> | Italic mark | <em> | Character formatting |
| <a href="..."> | Word hyperlink | <a href="..."> | Links preserved |
| <ul>, <ol> | Word lists | Same list type | Basic nesting supported |
| <blockquote> | Quote style | <blockquote> | If style exists |
| <table> | Word table | Basic <table> | Structure only |
| <img src="..."> | Word image | <img src="..."> | URL preserved |
| style="text-align: center" | Alignment | style="text-align: ..." | Alignment in the style attribute is preserved |

What Gets Stripped (Always)

<!-- Input HTML with styles and classes -->
<div style="color: blue; margin: 20px;" class="custom-class">
  <h1 style="font-size: 24px;" id="header">Styled Heading</h1>
  <p style="font-family: Arial;">Content with <span style="color: red;">inline styles</span></p>
</div>

<!-- After import (styling, classes, and ids removed) -->
<div>
  <h1>Styled Heading</h1>  <!-- Becomes Heading1 style internally -->
  <p>Content with inline styles</p>  <!-- Becomes Normal style -->
</div>

Why styles are stripped:
  • Word uses its own styling system (styles.xml)
  • CSS doesn’t translate to DOCX properties
  • Ensures consistent Word document behavior
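
To preview roughly what survives import, the stripping can be approximated with two regular expressions. This is only a sketch: it is not SuperDoc's importer, it does not model the text-align exception, and a regex is not a real HTML parser. `previewImportedHtml` is a hypothetical name.

```javascript
// Rough approximation of the stripping shown above.
// Assumption: removing <span> wrappers and style/class/id attributes
// is close enough for a preview; SuperDoc's real importer may differ.
function previewImportedHtml(html) {
  return html
    // Drop <span ...> and </span> but keep the text between them
    .replace(/<\/?span\b[^>]*>/gi, '')
    // Drop style, class, and id attributes (double- or single-quoted)
    .replace(/\s+(?:style|class|id)\s*=\s*(?:"[^"]*"|'[^']*')/gi, '');
}
```

Running it on the input above produces the same markup as the "after import" example, which makes it a quick way to set expectations before loading content into the editor.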

Round-Trip Example

// 1. Original HTML input
const html = '<h1>Title</h1><p>Content</p>';

// 2. Import into SuperDoc
editor.commands.insertContent(html, { contentType: 'html' });
// Internal: { type: 'paragraph', attrs: { styleId: 'Heading1' }}

// 3. User edits in Word environment
// (adds tables, changes formatting, etc.)

// 4. Export back to HTML
const exported = editor.getHTML();
// Output: '<h1>Title</h1><p>Content</p><table>...</table>'

// 5. Re-import maintains visual appearance
editor.commands.insertContent(exported, { contentType: 'html' });
// Visual appearance preserved, but no custom HTML attributes

Markdown Import/Export Behavior

Supported CommonMark Elements

  • Headings (# through ######) → Word heading styles
  • Bold/italic (**bold**, *italic*) → Character marks
  • Lists (ordered/unordered) → Word lists
  • Links [text](url) → Word hyperlinks
  • Images ![alt](url) → Word images
  • Code blocks → Code style (if defined)
  • Blockquotes > → Quote style (if defined)

Not Supported

  • Tables (may partially work)
  • Footnotes
  • Task lists
  • Custom syntax extensions
  • HTML within Markdown
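
Before importing Markdown, it can help to flag constructs from the unsupported list. The check below is a heuristic sketch using regular expressions, so it can misfire on text inside code blocks, and it cannot detect custom syntax extensions. `findUnsupportedMarkdown` and `UNSUPPORTED_PATTERNS` are hypothetical names.

```javascript
// Heuristic patterns for the unsupported features listed above.
const UNSUPPORTED_PATTERNS = [
  // A table delimiter row such as |---|---| or --- | :---:
  { feature: 'tables', pattern: /^\s*\|?\s*:?-+:?\s*(\|\s*:?-+:?\s*)+\|?\s*$/m },
  // Footnote references or definitions such as [^1]
  { feature: 'footnotes', pattern: /\[\^[^\]]+\]/ },
  // Task list items such as "- [ ] todo" or "* [x] done"
  { feature: 'task lists', pattern: /^\s*[-*+]\s+\[[ xX]\]\s/m },
  // Raw HTML tags; autolinks like <https://example.com> are not matched
  { feature: 'inline HTML', pattern: /<\/?[a-zA-Z][a-zA-Z0-9-]*(\s[^>]*)?\/?>/ },
];

// Returns the names of unsupported features that appear in the input.
function findUnsupportedMarkdown(markdown) {
  return UNSUPPORTED_PATTERNS
    .filter(({ pattern }) => pattern.test(markdown))
    .map(({ feature }) => feature);
}
```

An empty result does not guarantee a clean import; it only means none of these patterns were found.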

Export Options

DOCX Export (Full Fidelity)

// Standard export
const blob = await editor.exportDocx();

// Without comments
const cleanBlob = await editor.exportDocx({ 
  commentsType: 'clean' 
});

// With fields replaced
const finalBlob = await editor.exportDocx({ 
  isFinalDoc: true 
});
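
If export settings come from application state, the options object can be assembled in one place. This is a minimal sketch; `buildExportOptions` is a hypothetical helper, and it assumes that `commentsType` and `isFinalDoc` can be combined and that an empty options object behaves like the standard export, neither of which is confirmed above.

```javascript
// Hypothetical helper: turns two booleans into exportDocx options.
// Only `commentsType: 'clean'` and `isFinalDoc: true` appear in the
// examples above; combining them is an assumption.
function buildExportOptions({ includeComments = true, finalize = false } = {}) {
  const options = {};
  if (!includeComments) options.commentsType = 'clean'; // export without comments
  if (finalize) options.isFinalDoc = true;              // export with fields replaced
  return options;
}
```

It would be used as `await editor.exportDocx(buildExportOptions({ includeComments: false }))`.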

HTML Export (Structure Only)

// Get HTML representation
const html = editor.getHTML();
// Note: This is for visual representation only
// Do not expect perfect round-trip with custom styles

JSON Export (Full Fidelity)

// Get complete document structure
const json = editor.getJSON();
// Can be perfectly re-imported

AI Integration Pattern

Current Approach (Basic)

Most AI models today output HTML or Markdown. This works for simple content but offers limited control over structure:

// AI generates basic HTML/Markdown
// (illustrative call; substitute your AI client's API)
const content = await openai.complete({
  prompt: "Generate a service agreement",
  format: "html"  // Limited control over structure
});

// Import into SuperDoc (structure only, no fine control)
editor.commands.insertContent(content, { 
  contentType: 'html'
});

// Content uses document's Word styles
// User must manually refine formatting

// Export as DOCX
const docx = await editor.exportDocx();

Preferred Approach (JSON Schema)

For better control, the AI should generate JSON Schema directly:

// Ideal: AI generates precise document structure
const schema = await ai.generate({
  prompt: "Generate a service agreement",
  format: "prosemirror-schema",  // Full control
  schema_definition: superDocSchema  // Coming soon via tooling
});

// Import with perfect control
editor.commands.insertContent(schema, { 
  contentType: 'schema'
});

We’re developing tools (MCP servers, prompts) to help AI models generate proper JSON Schema. Until then, HTML/Markdown provides basic compatibility.
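
Until such tooling exists, AI-generated JSON is worth checking before it is inserted. The sketch below performs a structural check only. It assumes the ProseMirror-style node shape hinted at earlier (`{ type, attrs, content }`) and does not validate against SuperDoc's actual schema. `findInvalidNodes` is a hypothetical name.

```javascript
// Structural check for a ProseMirror-style JSON tree.
// Assumption: every node is an object with a string `type` and an
// optional `content` array of child nodes. Node names are not checked.
function findInvalidNodes(node, path = 'doc') {
  if (node === null || typeof node !== 'object' || Array.isArray(node)) {
    return [`${path}: node must be an object`];
  }
  const problems = [];
  if (typeof node.type !== 'string' || node.type.length === 0) {
    problems.push(`${path}: missing string "type"`);
  }
  if (node.content !== undefined) {
    if (!Array.isArray(node.content)) {
      problems.push(`${path}: "content" must be an array`);
    } else {
      node.content.forEach((child, i) => {
        problems.push(...findInvalidNodes(child, `${path}.content[${i}]`));
      });
    }
  }
  return problems;
}
```

An empty array means the tree is well-formed enough to try `insertContent` with `contentType: 'schema'`; a non-empty array lists the paths to fix or to send back to the model.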

Best Practices

  1. Start with a styled DOCX template - Define all heading styles, paragraph styles, and table styles you need.
  2. Set expectations - HTML/Markdown import is for content, not formatting.
  3. Use the right format - DOCX/JSON for production, HTML/Markdown for content input.
  4. Test your workflow - Import → Edit → Export → Re-import to understand behavior.
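
The workflow test in step 4 can be automated for HTML. The sketch below imports HTML, exports it, re-imports the export into a fresh editor, and reports whether the second export matches the first. `checkHtmlRoundTrip` is a hypothetical helper, and `createEditor` is an assumed factory that returns an editor exposing `commands.insertContent` and `getHTML()` as used above.

```javascript
// Hypothetical helper: checks whether an HTML export is stable
// across a second import/export cycle.
function checkHtmlRoundTrip(createEditor, html) {
  // First pass: import the original HTML and export it
  const first = createEditor();
  first.commands.insertContent(html, { contentType: 'html' });
  const exported = first.getHTML();

  // Second pass: import the export into a fresh editor and export again
  const second = createEditor();
  second.commands.insertContent(exported, { contentType: 'html' });
  const reExported = second.getHTML();

  return { stable: exported === reExported, exported, reExported };
}
```

Comparing `exported` with the original input shows what the first import dropped, while `stable` shows whether anything further changes on later cycles.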

API Conversions

Need server-side document conversions? Our REST API handles additional conversions with high fidelity. View API documentation →