
DOC/DOCX Memory

Hana DOC/DOCX Memory converts Microsoft Word files into searchable memories. Upload a .doc or .docx and Hana extracts the text, stores it securely, and makes it available for retrieval in chats and workflows. This page describes how the feature works, who can use it, and how to handle common edge cases.


Overview

  • Purpose: Turn Word documents into "memories" that Hana can cite and use during conversations, automations, and reports.
  • Output: Hana extracts and sanitizes the file’s text. Formatting, comments, tracked changes, and images are discarded.
  • Storage: Each upload becomes a dedicated memory batch, appearing alongside other batches in the organization’s memory management area.

Access & Requirements

Requirement | Details
Role | Only Admins and above can create DOC/DOCX memories in their organization. Super Admins can upload on behalf of any user.
Plan | Available on Basic plans and higher. Free‑tier organizations cannot ingest DOC/DOCX memories.
Quota | Each batch counts against the organization’s memory quota. Batches can have auto‑resync enabled within allowed limits.
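
The role and plan requirements above can be read as a simple gate that runs before any upload starts. The sketch below is illustrative only: the role names, plan names, and their ordering are assumptions, since this page states only "Admins and above" and "Basic plans and higher", and it is not Hana's actual permission check.

```python
from dataclasses import dataclass

# Hypothetical rankings (assumption). Higher tiers would be added with larger numbers.
ROLE_RANK = {"member": 0, "admin": 1, "super_admin": 2}
PLAN_RANK = {"free": 0, "basic": 1}


@dataclass
class Uploader:
    role: str
    plan: str


def can_create_doc_memory(user: Uploader) -> tuple[bool, str]:
    """Return (allowed, reason) for a DOC/DOCX memory upload attempt."""
    # Unknown plan or role names fall back to the lowest rank (assumption).
    if PLAN_RANK.get(user.plan, 0) < PLAN_RANK["basic"]:
        return False, "plan below Basic"
    if ROLE_RANK.get(user.role, 0) < ROLE_RANK["admin"]:
        return False, "role below Admin"
    return True, "ok"
```

Checking the plan first mirrors the troubleshooting table, where a plan below Basic blocks the upload outright; the order is otherwise a design choice.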

Supported Formats & Limits

  • Maximum size: 25 MB per file. Larger uploads are rejected.
  • Accepted types: .doc and .docx.
  • Unsupported content removed: embedded objects, macros, and images are dropped; only plain text is preserved.
  • Rejection triggers: Oversized files, unsupported extensions, password‑protected or corrupted documents, or antivirus failures.
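
Two of the rejection triggers above, size and extension, can be checked locally before uploading. This is a minimal sketch under stated assumptions: the function name is hypothetical, and "25 MB" is read as 25 × 1024 × 1024 bytes, which the page does not confirm. Password protection, corruption, and malware are checked server-side and are not covered here.

```python
from pathlib import Path

MAX_BYTES = 25 * 1024 * 1024  # 25 MB read as 25 MiB (assumption)
ALLOWED_EXTENSIONS = {".doc", ".docx"}


def precheck(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means both checks pass."""
    problems = []
    # Compare the extension case-insensitively so "REPORT.DOCX" is accepted.
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported extension")
    if size_bytes > MAX_BYTES:
        problems.append("file exceeds 25 MB")
    return problems
```

For example, `precheck("notes.pdf", 1024)` reports an unsupported extension, while `precheck("notes.docx", 1024)` returns an empty list.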

Upload Workflow

  1. Start a DOC/DOCX Memory Batch
    • Navigate to the memory management area and select the document upload option.
    • Super Admins may specify a user ID to upload on someone’s behalf.
  2. Select the File
    • Choose a .doc or .docx up to 25 MB.
    • Ensure the file is not password‑protected. Unsupported elements such as embedded objects, macros, and images are dropped during extraction.
  3. Security Scan
    • Hana scans the file for malware. Any detection immediately halts processing.
  4. Text Extraction
    • The platform extracts readable text, strips metadata, and ignores embedded objects or images.
    • Hidden text and tracked changes are removed to avoid ingesting unintended content.
  5. Batch Creation & Progress
    • A new memory batch is created and queued for ingestion.
    • Progress indicators show statuses such as In Progress, Ingested, Partially Ingested, Error, or Aborted.
  6. Completion
    • Once ingestion finishes, the sanitized text is stored as a searchable memory.
    • The batch appears in memory listings where it can be edited, resynced, or deleted like any other batch.
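
To make the text extraction step more concrete, the sketch below shows one way to pull visible text out of a .docx using only the Python standard library. It is not Hana's implementation. It relies on the fact that a .docx file is a ZIP archive whose main content lives in `word/document.xml`; the function name is hypothetical, and legacy binary .doc files are not handled.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def extract_visible_text(docx_bytes: bytes) -> str:
    """Return paragraph text from a .docx, skipping hidden runs and tracked deletions."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(f"{{{W}}}p"):
        parts = []
        for run in para.iter(f"{{{W}}}r"):
            # A run is hidden when its properties contain w:vanish (unless set to false).
            vanish = run.find("w:rPr/w:vanish", NS)
            if vanish is not None and vanish.get(f"{{{W}}}val", "true") not in ("false", "0"):
                continue
            # Deleted text is stored in w:delText, so collecting only w:t skips it.
            parts.extend(t.text or "" for t in run.findall("w:t", NS))
        text = "".join(parts).strip()
        if text:
            paragraphs.append(text)
    return "\n".join(paragraphs)
```

Images, embedded objects, and comments never appear in the output because only `w:t` elements in the main document part are read. The sketch has known limits: paragraphs nested inside text boxes may be counted twice, and untrusted files should be parsed with a hardened XML parser.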

Security & Storage

  • No file retention: The original document is discarded after extraction; only sanitized text remains.
  • Encryption: All stored text is encrypted in the database.
  • Role-based access: Only authorized roles within the organization can view or modify the resulting memory batch.
  • Auto-resync: When enabled and within organization limits, a batch automatically re-fetches the document from its source.

Using the Memory

  • Search & Retrieval: DOC/DOCX memories are indexed alongside other memories. Hana references them when answering questions or generating responses.
  • Contextual citations: When a document is referenced during chat, Hana can cite its source batch and section headings (if present).
  • Editing: Admins can edit or delete ingested content. Updates trigger resyncs to keep the memory current.

Error Handling & Troubleshooting

Scenario | Outcome / Action
File exceeds 25 MB | Upload rejected before processing. Reduce file size and retry.
Unsupported extension or type | Upload blocked. Convert the document to .doc or .docx and retry.
Password‑protected document | Upload rejected. Remove password or export as unprotected.
Antivirus failure or detection | Upload blocked for safety. Provide a clean file.
Corrupted or unreadable file | Batch ends in error state. Fix the file and re‑upload.
Plan below Basic | Upload blocked with “not allowed” message. Upgrade plan to proceed.
Extraction failure | Batch ends in error state. Retry or contact support.
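
Teams that script their uploads may find it handy to keep the scenarios above in a lookup table. The sketch below is one way to do that. The scenario keys are hypothetical, since this page describes the scenarios in prose and does not name error codes, and the fallback for unknown keys is an assumption.

```python
# Hypothetical scenario keys (assumption); outcomes and actions follow the table above.
TROUBLESHOOTING = {
    "file_too_large": ("Upload rejected before processing", "Reduce file size and retry"),
    "unsupported_type": ("Upload blocked", "Convert the document to .doc or .docx and retry"),
    "password_protected": ("Upload rejected", "Remove password or export as unprotected"),
    "antivirus_failure": ("Upload blocked for safety", "Provide a clean file"),
    "corrupted_file": ("Batch ends in error state", "Fix the file and re-upload"),
    "plan_below_basic": ("Upload blocked with 'not allowed' message", "Upgrade plan to proceed"),
    "extraction_failure": ("Batch ends in error state", "Retry or contact support"),
}


def advise(scenario: str) -> str:
    """Return a one-line 'outcome. action.' hint for a scenario key."""
    # Unknown keys fall back to a generic hint (assumption, not from the table).
    outcome, action = TROUBLESHOOTING.get(scenario, ("Unknown scenario", "Contact support"))
    return f"{outcome}. {action}."
```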

Best Practices

  • Remove unnecessary images or embedded objects to keep the file under 25 MB and speed up processing.
  • Avoid password protection and macros; export a clean version before upload.
  • Use clear headings and section titles within the document to enhance retrieval accuracy.
  • Organize batches with descriptive titles for simpler auditing and management.
  • Review extracted text after ingestion to ensure important sections were captured.

Summary

Hana DOC/DOCX Memory provides a secure, plan‑aware pipeline for converting Microsoft Word documents into searchable knowledge. By enforcing size limits, sanitizing content, and integrating with existing memory batch workflows, it helps organizations capture structured knowledge without retaining the original file. Administrators can ingest documents, monitor batch status, and leverage the resulting memories across Hana’s chat, automation, and reporting features, all while maintaining governance and access control.