Supported Formats
Cuneiform Chat supports text documents, spreadsheets, presentations, web content, and structured data files.
Document Types
Text Documents
| Format | Extensions | Notes |
|---|---|---|
| PDF | .pdf | Text-based and scanned (with OCR) |
| Microsoft Word | .docx, .doc | Full formatting support |
| Plain Text | .txt | Simple text files |
| Markdown | .md | Markdown formatting preserved |
| Rich Text | .rtf | Basic formatting support |
Spreadsheets
| Format | Extensions | Notes |
|---|---|---|
| Excel | .xlsx, .xls | Tables and data |
| CSV | .csv | Comma-separated values |
Presentations
| Format | Extensions | Notes |
|---|---|---|
| PowerPoint | .pptx, .ppt | Slide content extracted |
Web Content
| Format | Extensions | Notes |
|---|---|---|
| HTML | .html, .htm | Web page content |
Data Formats
| Format | Extensions | Notes |
|---|---|---|
| JSON | .json | Structured data files |
| XML | .xml | XML documents |
| TSV | .tsv | Tab-separated values |
Format-Specific Notes
PDFs
PDFs work best when they are:
- Text-based — Created from digital documents
- Searchable — Text can be selected/copied
Scanned PDFs are supported, but extraction quality may be lower because the text has to be recovered with OCR.
Word Documents
.docx (modern format) is recommended over .doc (legacy format) for best results.
Supported elements:
- Text and paragraphs
- Headings and styles
- Tables
- Lists
Spreadsheets
Spreadsheet content is extracted as structured data. For best results:
- Use clear column headers
- Avoid complex merged cells
- Keep data organized in tables
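The "clear column headers" advice can be checked before uploading. The following is a minimal sketch for CSV files using only the Python standard library; the function name `header_problems` and the specific checks (empty or duplicate headers) are assumptions made for illustration, not rules that Cuneiform Chat itself enforces.

```python
import csv
import io


def header_problems(csv_text: str) -> list[str]:
    """Return human-readable problems found in the first (header) row.

    Hypothetical helper: the checks here are illustrative only.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return ["file is empty"]

    names = [cell.strip() for cell in header]
    problems = []

    # Flag columns whose header cell is blank.
    for position, name in enumerate(names, start=1):
        if not name:
            problems.append(f"column {position} has no header")

    # Flag headers that repeat (compared case-insensitively).
    seen = set()
    for name in names:
        if name and name.lower() in seen:
            problems.append(f"duplicate header: {name}")
        seen.add(name.lower())

    return problems
```

Calling `header_problems(open("data.csv", newline="").read())` returns an empty list when every column has a distinct, non-empty header.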
Markdown
Markdown formatting is preserved during processing:
- Headings become section markers
- Lists maintain structure
- Code blocks are recognized
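To make "headings become section markers" concrete, here is a rough sketch of how a Markdown file could be split into sections at its headings while leaving heading-like lines inside code blocks alone. It is an illustration under simplifying assumptions (ATX `#` headings only, simple fence detection), and `split_sections` is a hypothetical name; it does not describe Cuneiform Chat's actual processing.

```python
import re

# ATX headings: one to six '#' characters, whitespace, then the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
# Fence markers built programmatically: three backticks or three tildes.
FENCE_MARKERS = ("`" * 3, "~" * 3)


def split_sections(markdown: str) -> list[tuple[int, str, str]]:
    """Return (level, title, body) tuples, one per heading.

    Text before the first heading is returned with level 0 and an empty title.
    """
    sections = []
    level, title, body = 0, "", []
    in_code = False

    def flush():
        if title or body:
            sections.append((level, title, "\n".join(body).strip()))

    for line in markdown.splitlines():
        # Toggle code-block state on every fence line.
        if line.lstrip().startswith(FENCE_MARKERS):
            in_code = not in_code
        match = None if in_code else HEADING.match(line)
        if match:
            flush()
            level, title, body = len(match.group(1)), match.group(2), []
        else:
            body.append(line)

    flush()
    return sections
```

A file with well-nested headings produces one tuple per section, which is one reason a consistent heading structure tends to help any downstream processing.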
Unsupported Formats
The following are not currently supported:
- Images (.jpg, .png, .gif)
- Audio files (.mp3, .wav)
- Video files (.mp4, .mov)
- Compressed archives (.zip, .rar)
- Executable files (.exe, .app)
If you have content in an unsupported format, try converting it to PDF or a text-based format before uploading.
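If you upload files from a script, a quick extension check can catch unsupported formats early. The sketch below copies the extensions from the lists on this page; the `classify` function and its three return values are hypothetical and are not part of any Cuneiform Chat API.

```python
from pathlib import Path

# Extensions listed as supported on this page.
SUPPORTED = {
    ".pdf", ".docx", ".doc", ".txt", ".md", ".rtf",  # text documents
    ".xlsx", ".xls", ".csv",                         # spreadsheets
    ".pptx", ".ppt",                                 # presentations
    ".html", ".htm",                                 # web content
    ".json", ".xml", ".tsv",                         # data formats
}

# Extensions listed as not currently supported on this page.
UNSUPPORTED = {
    ".jpg", ".png", ".gif",  # images
    ".mp3", ".wav",          # audio
    ".mp4", ".mov",          # video
    ".zip", ".rar",          # compressed archives
    ".exe", ".app",          # executables
}


def classify(filename: str) -> str:
    """Return 'supported', 'unsupported', or 'unknown' for a file name.

    Only the final extension is checked, case-insensitively; the file's
    actual contents are not inspected.
    """
    extension = Path(filename).suffix.lower()
    if extension in SUPPORTED:
        return "supported"
    if extension in UNSUPPORTED:
        return "unsupported"
    return "unknown"
```

For example, `classify("Report.PDF")` returns `"supported"` and `classify("photo.png")` returns `"unsupported"`. Extensions that appear in neither list are reported as `"unknown"` rather than guessed.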
File Size Limits
Maximum file sizes depend on your plan. Very large documents may take longer to process.
For best performance:
- Split very large documents into smaller files
- Remove unnecessary images from PDFs
- Use text-based formats when possible
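For plain-text content, splitting a very large document can be automated. The following sketch breaks text at paragraph boundaries so that each part stays under a chosen size. The `max_chars` limit is a hypothetical parameter you would pick yourself; this page does not state a specific size limit, and the limit here is measured in characters, not bytes.

```python
def split_text(text: str, max_chars: int) -> list[str]:
    """Split text into parts of at most max_chars characters.

    Breaks fall between paragraphs (blank lines) where possible. A single
    paragraph longer than max_chars is cut at the limit.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    parts = []
    current = ""
    for paragraph in text.split("\n\n"):
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
            continue

        # The paragraph does not fit: close the current part first.
        if current:
            parts.append(current)

        # Hard-cut a paragraph that is itself over the limit.
        while len(paragraph) > max_chars:
            parts.append(paragraph[:max_chars])
            paragraph = paragraph[max_chars:]
        current = paragraph

    if current:
        parts.append(current)
    return parts
```

Each returned part can then be written to its own `.txt` file and uploaded separately.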