SuperDoc handles multiple content formats with different levels of support. SuperDoc is a Word editor that accepts other formats as input, not a universal document converter.
Philosophy
SuperDoc operates on Microsoft Word’s document model. HTML and Markdown are normalized into Word concepts on import. For perfect fidelity, use DOCX or JSON formats.
| Format | Import Support | Export Support | Round-Trip Fidelity | Primary Use Case |
|---|---|---|---|---|
| DOCX | Full compatibility | Full compatibility | ✅ Perfect | Word documents, complete fidelity |
| JSON | Full support | Full support | ✅ Perfect | Automation, programmatic control |
| HTML | Structure only | Structure only | ⚠️ Visual only | AI content, migration from web |
| Markdown | CommonMark only | CommonMark only | ⚠️ Visual only | Documentation, AI-generated content |
| Text | Plain text only | Plain text only | ✅ Perfect | Simple content |
| PDF | Not supported | Via API only | N/A | Final output format |
Use DOCX/JSON for:
- Preserving all formatting
- Round-trip editing
- Complex documents
- Production workflows
Use HTML/Markdown for:
- Basic content import when JSON isn’t available
- Migrating from legacy systems
- Simple text with minimal structure
- Not suitable for complex documents or formatting preservation
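The guidance above amounts to a preference order. As a rough sketch, a caller could pick the richest format that a source system can provide. The function name and the exact ordering are assumptions made for illustration (in particular, ranking HTML above Markdown and plain text is a judgment call based on how much structure each carries); none of this is part of SuperDoc.

```javascript
// Hypothetical helper: not part of the SuperDoc API.
// Import formats ordered from most to least preserved structure,
// loosely following the support table above (ordering is an assumption).
const FORMAT_PRIORITY = ['docx', 'json', 'html', 'markdown', 'text'];

function pickImportFormat(availableFormats) {
  const available = new Set(availableFormats.map((f) => f.toLowerCase()));
  // PDF cannot be imported, so it is deliberately absent from the list.
  return FORMAT_PRIORITY.find((format) => available.has(format)) ?? null;
}

console.log(pickImportFormat(['markdown', 'HTML'])); // "html"
console.log(pickImportFormat(['pdf']));              // null
```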
Content Import Methods
Method 1: Initialize with Content
Load content when creating the editor or SuperDoc instance.
SuperDoc Component
// DOCX file (perfect fidelity)
new SuperDoc({
  selector: '#editor',
  document: docxFile
});
// HTML content (structure preserved, styles stripped)
new SuperDoc({
  selector: '#editor',
  document: blankDocx, // Must have styles defined
  html: '<h1>Title</h1><p>Content</p>'
});
// Markdown content (converted to Word structure)
new SuperDoc({
  selector: '#editor',
  document: blankDocx,
  markdown: '# Title\n\nContent with **formatting**'
});
// JSON schema (full control)
new SuperDoc({
  selector: '#editor',
  document: blankDocx,
  jsonOverride: documentSchema
});
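The four variants above differ only in which option carries the content. A minimal sketch of a config builder follows; `buildSuperDocConfig` and its parameter names (`template`, `content`, `contentType`) are hypothetical, and only the option keys `selector`, `document`, `html`, `markdown`, and `jsonOverride` come from the examples above.

```javascript
// Hypothetical helper: maps a content type to the option key used in the
// initialization examples above. Not part of SuperDoc itself.
const CONTENT_OPTION_KEYS = new Map([
  ['html', 'html'],
  ['markdown', 'markdown'],
  ['schema', 'jsonOverride'],
]);

function buildSuperDocConfig({ selector, template, content, contentType }) {
  const config = { selector, document: template };
  if (content === undefined) return config; // plain DOCX load, no extra content

  const key = CONTENT_OPTION_KEYS.get(contentType);
  if (!key) {
    throw new Error(`Unsupported content type: ${contentType}`);
  }
  config[key] = content;
  return config;
}

// Usage (SuperDoc and blankDocx come from your own setup):
// new SuperDoc(buildSuperDocConfig({
//   selector: '#editor',
//   template: blankDocx,
//   content: '# Title',
//   contentType: 'markdown',
// }));
```

Failing early on an unknown content type keeps a misspelled option from silently producing an empty document.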
Method 2: Insert into Existing Document
Add content to an already-loaded document using commands.
// Insert at current cursor position
editor.commands.insertContent(content, {
  contentType: 'html' // or 'markdown', 'text', 'schema'
});
// AI content integration example
const aiResponse = await ai.generate("Create a contract");
editor.commands.insertContent(aiResponse, {
  contentType: 'html' // AI output gets converted to Word structure
});
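Because `contentType` is a plain string, a typo is easy to miss. One possible guard is a thin wrapper that validates the type before delegating. The wrapper name `insertTyped` is hypothetical; it assumes only that `editor.commands.insertContent(content, { contentType })` behaves as shown above.

```javascript
// Content types listed in the insertContent example above.
const CONTENT_TYPES = ['html', 'markdown', 'text', 'schema'];

// Hypothetical wrapper; `editor` is any object exposing commands.insertContent.
function insertTyped(editor, content, contentType) {
  if (!CONTENT_TYPES.includes(contentType)) {
    throw new TypeError(`Unknown contentType "${contentType}"`);
  }
  // Assumption: every type except 'schema' is passed as a string.
  if (contentType !== 'schema' && typeof content !== 'string') {
    throw new TypeError(`${contentType} content must be a string`);
  }
  return editor.commands.insertContent(content, { contentType });
}

// Usage: insertTyped(editor, '<h1>Title</h1>', 'html');
```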
HTML Import/Export Behavior
What Gets Preserved
| HTML Element | Import Result | Export Result | Notes |
|---|---|---|---|
| <h1> to <h6> | Word heading styles | Same heading level | Requires styles in document |
| <p>, <div> | Normal paragraph | <p> | All become paragraphs |
| <strong>, <b> | Bold mark | <strong> | Character formatting |
| <em>, <i> | Italic mark | <em> | Character formatting |
| <a href="..."> | Word hyperlink | <a href="..."> | Links preserved |
| <ul>, <ol> | Word lists | Same list type | Basic nesting supported |
| <blockquote> | Quote style | <blockquote> | If style exists |
| <table> | Word table | Basic <table> | Structure only |
| <img src="..."> | Word image | <img src="..."> | URL preserved |
| style="text-align: center" | Alignment styles | style="text-align: ..." | Alignment set in the style attribute is preserved |
What Gets Stripped (Always)
<!-- Input HTML with styles and classes -->
<div style="color: blue; margin: 20px;" class="custom-class">
  <h1 style="font-size: 24px;" id="header">Styled Heading</h1>
  <p style="font-family: Arial;">Content with <span style="color: red;">inline styles</span></p>
</div>
<!-- After import (ALL styling removed) -->
<div>
  <h1>Styled Heading</h1> <!-- Becomes Heading1 style internally -->
  <p>Content with inline styles</p> <!-- Becomes Normal style -->
</div>
Why styles are stripped:
- Word uses its own styling system (styles.xml)
- CSS doesn’t translate to DOCX properties
- Ensures consistent Word document behavior
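To preview roughly what survives import, the attribute stripping can be approximated on the HTML string itself. This is a sketch under stated assumptions, not SuperDoc's actual importer: `previewImportedHtml` is a hypothetical name, it handles only double-quoted attributes, it keeps `text-align` because the table above lists alignment as preserved, and it does not unwrap elements such as `<span>`.

```javascript
// Hypothetical approximation of the attribute stripping described above.
// It is regex-based and intended for quick previews, not for parsing
// arbitrary HTML.
function previewImportedHtml(html) {
  return html
    // Drop class and id attributes entirely.
    .replace(/\s+(?:class|id)\s*=\s*"[^"]*"/gi, '')
    // Reduce each style attribute to its text-align declaration, if any.
    .replace(/\s+style\s*=\s*"([^"]*)"/gi, (_match, css) => {
      const align = css.match(/text-align\s*:\s*[^;"]+/i);
      return align ? ` style="${align[0].trim()}"` : '';
    });
}

console.log(previewImportedHtml('<h1 style="font-size: 24px;" id="header">Styled Heading</h1>'));
// <h1>Styled Heading</h1>
```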
Round-Trip Example
// 1. Original HTML input
const html = '<h1>Title</h1><p>Content</p>';
// 2. Import into SuperDoc
editor.commands.insertContent(html, { contentType: 'html' });
// Internal: { type: 'paragraph', attrs: { styleId: 'Heading1' }}
// 3. User edits in Word environment
// (adds tables, changes formatting, etc.)
// 4. Export back to HTML
const exported = editor.getHTML();
// Output: '<h1>Title</h1><p>Content</p><table>...</table>'
// 5. Re-import maintains visual appearance
editor.commands.insertContent(exported, { contentType: 'html' });
// Visual appearance preserved, but no custom HTML attributes
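One way to check that a round trip kept the structure, even though attributes are dropped, is to compare the sequence of element names before and after. The helpers below are hypothetical and regex-based; they assume attribute values never contain a `>` character.

```javascript
// Hypothetical helpers for comparing HTML structure while ignoring attributes.
function tagSequence(html) {
  const tags = [];
  // Matches opening tags only; closing tags start with "</" and are skipped.
  const openTag = /<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g;
  let match;
  while ((match = openTag.exec(html)) !== null) {
    tags.push(match[1].toLowerCase());
  }
  return tags;
}

function sameStructure(htmlA, htmlB) {
  const a = tagSequence(htmlA);
  const b = tagSequence(htmlB);
  return a.length === b.length && a.every((tag, i) => tag === b[i]);
}

// Usage: compare the original input with editor.getHTML() after import.
console.log(sameStructure(
  '<h1 id="a">Title</h1><p style="color:red">Content</p>',
  '<h1>Title</h1><p>Content</p>'
)); // true
```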
Markdown Import/Export Behavior
Supported CommonMark Elements
- Headers (# through ######) → Word heading styles
- Bold/italic (**bold**, *italic*) → Character marks
- Lists (ordered/unordered) → Word lists
- Links [text](url) → Word hyperlinks
- Images ![alt](url) → Word images
- Code blocks → Code style (if defined)
- Blockquotes > → Quote style (if defined)
Not Supported
- Tables (may partially work)
- Footnotes
- Task lists
- Custom syntax extensions
- HTML within Markdown
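Before importing Markdown from an external source, it can help to flag constructs from the list above so that users are warned about likely loss. The following is a heuristic sketch only: the function name and regular expressions are assumptions, they can produce false positives (for example, autolinks in angle brackets are reported as HTML), and they do not detect custom syntax extensions.

```javascript
// Hypothetical pre-import check for Markdown constructs listed as unsupported.
const UNSUPPORTED_PATTERNS = {
  // A table delimiter row such as |---|---| or ---|:---:
  table: /^[ \t]*\|?[ \t]*:?-+:?[ \t]*(\|[ \t]*:?-+:?[ \t]*)+\|?[ \t]*$/m,
  // Footnote references or definitions such as [^1]
  footnote: /\[\^[^\]]+\]/,
  // Task list items such as "- [ ] todo" or "* [x] done"
  taskList: /^[ \t]*[-*+][ \t]+\[[ xX]\][ \t]/m,
  // Anything that looks like an HTML tag
  html: /<\/?[a-zA-Z][^>]*>/,
};

function findUnsupportedMarkdown(markdown) {
  return Object.entries(UNSUPPORTED_PATTERNS)
    .filter(([, pattern]) => pattern.test(markdown))
    .map(([name]) => name);
}

console.log(findUnsupportedMarkdown('| a | b |\n|---|---|\n| 1 | 2 |\n\n- [x] done'));
// [ 'table', 'taskList' ]
```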
Export Options
DOCX Export (Full Fidelity)
// Standard export
const blob = await editor.exportDocx();
// Without comments
const cleanBlob = await editor.exportDocx({
  commentsType: 'clean'
});
// With fields replaced
const finalBlob = await editor.exportDocx({
  isFinalDoc: true
});
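When export settings come from user choices, the options object can be assembled in one place. This sketch uses only the two option names shown above (`commentsType` and `isFinalDoc`); the helper name, its flags, and the assumption that the two options can be combined or that an empty object behaves like no argument are all illustrative.

```javascript
// Hypothetical helper that builds the options object for editor.exportDocx.
function buildExportOptions({ stripComments = false, finalize = false } = {}) {
  const options = {};
  if (stripComments) options.commentsType = 'clean'; // export without comments
  if (finalize) options.isFinalDoc = true;           // export with fields replaced
  return options;
}

// Usage:
// const blob = await editor.exportDocx(buildExportOptions({ stripComments: true }));
console.log(buildExportOptions({ stripComments: true, finalize: true }));
// { commentsType: 'clean', isFinalDoc: true }
```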
HTML Export (Structure Only)
// Get HTML representation
const html = editor.getHTML();
// Note: This is for visual representation only
// Do not expect perfect round-trip with custom styles
JSON Export (Full Fidelity)
// Get complete document structure
const json = editor.getJSON();
// Can be perfectly re-imported
AI Integration Pattern
Current Approach (Basic)
Most AI models today output HTML or Markdown. While this works for simple content, it has limitations:
// AI generates basic HTML/Markdown
// (openai.complete is illustrative pseudocode; substitute your AI client's actual call)
const content = await openai.complete({
  prompt: "Generate a service agreement",
  format: "html" // Limited control over structure
});
// Import into SuperDoc (structure only, no fine control)
editor.commands.insertContent(content, {
  contentType: 'html'
});
// Content uses document's Word styles
// User must manually refine formatting
// Export as DOCX
const docx = await editor.exportDocx();
Preferred Approach (JSON Schema)
For better control, AI should generate JSON Schema directly:
// Ideal: AI generates precise document structure
const schema = await ai.generate({
  prompt: "Generate a service agreement",
  format: "prosemirror-schema", // Full control
  schema_definition: superDocSchema // Coming soon via tooling
});
// Import with perfect control
editor.commands.insertContent(schema, {
  contentType: 'schema'
});
We’re developing tools (MCP servers, prompts) to help AI models generate proper JSON Schema. Until then, HTML/Markdown provides basic compatibility.
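Until such tooling exists, a lightweight structural check can catch malformed AI output before it reaches the editor. The sketch below assumes ProseMirror-style nodes of the form `{ type, attrs?, content?, text? }`, which is an inference from the `prosemirror-schema` format named above. It does not validate against SuperDoc's real schema, and `findSchemaProblems` is a hypothetical name.

```javascript
// Hypothetical structural check for AI-generated document JSON.
// Returns a list of human-readable problems; an empty list means the
// basic node shape looks plausible (not that SuperDoc will accept it).
function findSchemaProblems(node, path = 'doc') {
  if (node === null || typeof node !== 'object' || Array.isArray(node)) {
    return [`${path}: node must be an object`];
  }
  const problems = [];
  if (typeof node.type !== 'string' || node.type === '') {
    problems.push(`${path}: missing string "type"`);
  }
  if (node.type === 'text' && typeof node.text !== 'string') {
    problems.push(`${path}: text node needs a string "text"`);
  }
  if (node.content !== undefined) {
    if (!Array.isArray(node.content)) {
      problems.push(`${path}: "content" must be an array`);
    } else {
      node.content.forEach((child, i) => {
        problems.push(...findSchemaProblems(child, `${path}.content[${i}]`));
      });
    }
  }
  return problems;
}

// Usage: only insert when the check passes.
// if (findSchemaProblems(schema).length === 0) {
//   editor.commands.insertContent(schema, { contentType: 'schema' });
// }
```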
Best Practices
- Start with a styled DOCX template - Define all heading styles, paragraph styles, and table styles you need.
- Set expectations - HTML/Markdown import is for content, not formatting.
- Use the right format - DOCX/JSON for production, HTML/Markdown for content input.
- Test your workflow - Import → Edit → Export → Re-import to understand behavior.
API Conversions
Need server-side document conversions? Our REST API handles additional conversions with high fidelity.
View API documentation →