Supported Formats

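Cuneiform Chat supports text documents, spreadsheets, presentations, web content, and structured data files. The tables below list the accepted extensions for each type.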

Document Types

Text Documents

Format | Extensions | Notes
PDF | .pdf | Text-based and scanned (with OCR)
Microsoft Word | .docx, .doc | Full formatting support
Plain Text | .txt | Simple text files
Markdown | .md | Markdown formatting preserved
Rich Text | .rtf | Basic formatting support

Spreadsheets

Format | Extensions | Notes
Excel | .xlsx, .xls | Tables and data
CSV | .csv | Comma-separated values

Presentations

Format | Extensions | Notes
PowerPoint | .pptx, .ppt | Slide content extracted

Web Content

Format | Extensions | Notes
HTML | .html, .htm | Web page content

Data Formats

Format | Extensions | Notes
JSON | .json | Structured data files
XML | .xml | XML documents
TSV | .tsv | Tab-separated values

Format-Specific Notes

PDFs

PDFs work best when they are:

  • Text-based — Created from digital documents
  • Searchable — Text can be selected/copied

Scanned PDFs are supported, but their text is recovered with OCR, so extraction quality may be lower than for text-based PDFs.

Word Documents

.docx (modern format) is recommended over .doc (legacy format) for best results.

Supported elements:

  • Text and paragraphs
  • Headings and styles
  • Tables
  • Lists

Spreadsheets

Spreadsheet content is extracted as structured data. For best results:

  • Use clear column headers
  • Avoid complex merged cells
  • Keep data organized in tables
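
The header advice above can be checked before uploading. The following is a minimal sketch, assuming the spreadsheet has been exported to CSV; the function name `header_problems` and the specific checks (blank or duplicate headers) are illustrative and not part of Cuneiform Chat.

```python
import csv
import io


def header_problems(csv_text: str) -> list[str]:
    """Return a list of issues found in the first (header) row of CSV text.

    Hypothetical helper: the checks are illustrative, not product rules.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    if not header:
        return ["file is empty"]

    problems = []
    seen = set()
    for index, name in enumerate(header, start=1):
        cleaned = name.strip()
        if not cleaned:
            problems.append(f"column {index} has no header")
        elif cleaned.lower() in seen:
            problems.append(f"duplicate header: {cleaned}")
        seen.add(cleaned.lower())
    return problems


if __name__ == "__main__":
    print(header_problems("name,,name\n1,2,3\n"))
```

An empty result means every column has a distinct, non-blank header.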

Markdown

Markdown formatting is preserved during processing:

  • Headings become section markers
  • Lists maintain structure
  • Code blocks are recognized
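
To preview how a Markdown file might divide into sections, the sketch below splits text at heading lines and ignores `#` lines inside fenced code blocks. It is only an approximation for checking your own files: how Cuneiform Chat actually sections Markdown is not documented here, and `split_sections` is a hypothetical name.

```python
FENCE = "`" * 3  # Markdown code-fence marker


def split_sections(markdown_text: str) -> list[tuple[str, str]]:
    """Split Markdown into (heading, body) pairs.

    Illustrative only; this is not the product's actual parser.
    """
    sections = []
    title, body = "", []
    in_fence = False
    for line in markdown_text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # entering or leaving a code block
        is_heading = (
            not in_fence
            and line.startswith("#")
            and line.lstrip("#").startswith(" ")
        )
        if is_heading:
            if title or body:
                sections.append((title, "\n".join(body).strip()))
            title, body = line.lstrip("#").strip(), []
        else:
            body.append(line)
    if title or body:
        sections.append((title, "\n".join(body).strip()))
    return sections


if __name__ == "__main__":
    sample = "# Intro\nhello\n## Setup\nsteps go here\n"
    for heading, text in split_sections(sample):
        print(heading, "->", text)
```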

Unsupported Formats

The following are not currently supported:

  • Images (.jpg, .png, .gif)
  • Audio files (.mp3, .wav)
  • Video files (.mp4, .mov)
  • Compressed archives (.zip, .rar)
  • Executable files (.exe, .app)

If you have content in an unsupported format, try converting it to PDF or a text-based format before uploading.
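
As a quick pre-upload check, the sketch below classifies a filename by its extension using the lists in this article. The function name `check_format` and the returned messages are hypothetical; only the extensions come from the tables above.

```python
from pathlib import Path

# Extensions copied from this article's tables, keyed to their section names.
SUPPORTED = {
    ".pdf": "Text Documents", ".docx": "Text Documents", ".doc": "Text Documents",
    ".txt": "Text Documents", ".md": "Text Documents", ".rtf": "Text Documents",
    ".xlsx": "Spreadsheets", ".xls": "Spreadsheets", ".csv": "Spreadsheets",
    ".pptx": "Presentations", ".ppt": "Presentations",
    ".html": "Web Content", ".htm": "Web Content",
    ".json": "Data Formats", ".xml": "Data Formats", ".tsv": "Data Formats",
}

# Extensions this article lists as not currently supported.
UNSUPPORTED = {
    ".jpg", ".png", ".gif", ".mp3", ".wav", ".mp4", ".mov",
    ".zip", ".rar", ".exe", ".app",
}


def check_format(filename: str) -> str:
    """Describe whether a file's extension appears in the lists above."""
    ext = Path(filename).suffix.lower()
    if ext in SUPPORTED:
        return f"supported ({SUPPORTED[ext]})"
    if ext in UNSUPPORTED:
        return "unsupported: convert to PDF or a text-based format first"
    return "not listed"


if __name__ == "__main__":
    for name in ("Report.PDF", "photo.png", "notes"):
        print(name, "->", check_format(name))
```

Extensions that appear in neither list are reported as "not listed" rather than guessed at.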

File Size Limits

Maximum file sizes depend on your plan. Very large documents may take longer to process.

For best performance:

  • Split very large documents into smaller files
  • Remove unnecessary images from PDFs if they're not needed
  • Use text-based formats when possible
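
One way to split a very large plain-text document is to break it at paragraph boundaries. The sketch below is an illustration under stated assumptions: `split_text` and `max_chars` are hypothetical, and the character limit is a stand-in because actual size limits depend on your plan.

```python
def split_text(text: str, max_chars: int) -> list[str]:
    """Split text into parts of at most max_chars characters.

    Breaks at blank lines where possible; max_chars is an assumed limit,
    not a documented product setting.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    parts, current = [], ""
    for paragraph in text.split("\n\n"):
        candidate = paragraph if not current else current + "\n\n" + paragraph
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            parts.append(current)
        # A single paragraph longer than the limit is cut into fixed-size pieces.
        while len(paragraph) > max_chars:
            parts.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        current = paragraph
    if current:
        parts.append(current)
    return parts


if __name__ == "__main__":
    for number, part in enumerate(split_text("aaa\n\nbbb\n\nccc", 8), start=1):
        print(f"part {number}: {part!r}")
```

Each part can then be saved as its own .txt or .md file and uploaded separately.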