Recursive URL Memory

Overview

warning

Please remember that recursive URL memories are meant only for informational web pages such as documentation, company landing pages, and the like. Do not attempt to ingest sites with a large number of sub-pages, such as e-commerce catalogs; this will most likely cause the batch to fail and can pollute the memory bank.

info

URL ingestion on Free and Basic plans reads only the text content of each page. PRO users get advanced ingestion that also considers images on the page.

What It Does

  • Start with a URL: Provide any publicly reachable webpage as the entry point.
  • Optional Sub-page Crawling (PRO only): When enabled, Hana follows links within that domain up to three levels deep.
  • Content Processing: Each page is fetched, converted to plain text and stored with helpful metadata.
  • Duplicate Detection: During a single crawl, pages already fetched are skipped so you don't ingest the same page twice. However, pages ingested in earlier batches aren't automatically skipped if you re-crawl later.
  • Safety Checks: All URLs are verified with Google Web Risk to prevent malicious content ingestion.

Recursive URL Depth of 3

info

When the checkbox for ingesting all sub-pages is checked while creating a memory batch, Hana ingests sub-pages up to a depth of 3 (capped at 200 pages in total). This can take some time, so check the dashboard periodically for status. Hana will also email you once the batch is ingested (coming soon!).

  • Level 1
    • https://example.com
    • Level 2
      • https://example.com/about
      • https://example.com/contact
      • Level 3
        • https://example.com/about/team
        • https://example.com/contact/form
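The depth-3 traversal illustrated above can be modeled as a breadth-first crawl with a visited set (for within-batch duplicate detection), a depth cap, and a page cap. This is a simplified sketch of the described behavior, not Hana's actual implementation; `fetch_links` stands in for whatever fetches a page and extracts its links:

```python
from collections import deque
from urllib.parse import urljoin, urlparse

MAX_DEPTH = 3      # levels, counting the start URL as level 1
MAX_PAGES = 200    # total pages per batch

def crawl(start_url, fetch_links):
    """Breadth-first crawl. fetch_links(url) -> list of href strings on that page."""
    domain = urlparse(start_url).netloc
    visited = {start_url}            # duplicate detection within the batch
    queue = deque([(start_url, 1)])  # (url, level)
    ingested = []
    while queue and len(ingested) < MAX_PAGES:
        url, level = queue.popleft()
        ingested.append(url)         # fetch + convert to text would happen here
        if level == MAX_DEPTH:
            continue                 # don't follow links past level 3
        for href in fetch_links(url):
            link = urljoin(url, href)
            if urlparse(link).netloc == domain and link not in visited:
                visited.add(link)
                queue.append((link, level + 1))
    return ingested
```

Run against the example.com tree above, this ingests the five listed pages: level-3 pages are stored but their links are not followed, and a URL seen once in the batch is never queued again.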
tip

Recursive URL memory skips duplicate pages within the same batch. If you ingest another batch later that includes previously processed pages, those pages will be ingested again unless you delete the earlier batch.

Caveats & Limitations

  • Sub-page crawling is only available on the PRO plan. Free users can ingest a single page at a time.
  • Crawl depth is capped at three levels to keep ingestion manageable.
  • A single recursive URL memory batch can include at most 200 sub-pages. Additional pages beyond this limit are skipped.
  • Only publicly reachable pages are ingested. Pages behind logins or paywalls are skipped.
  • Large or complex sites may take time to process, especially in Best Quality mode.
  • Links to unsupported file types (images, videos, binaries) are ignored.
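Several of these caveats amount to filtering a link before it is ever fetched: same domain only, http(s) only, and no media or binary targets. A hedged sketch of such a filter (the extension list and scheme checks are illustrative assumptions, not Hana's documented rules):

```python
from urllib.parse import urlparse

# Illustrative set of unsupported file types (images, videos, binaries)
SKIPPED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".mp4", ".mov", ".zip", ".exe"}

def is_crawlable(link, base_domain):
    """Heuristic mirror of the caveats: same domain, http(s), no media/binary links."""
    parts = urlparse(link)
    if parts.scheme not in ("http", "https"):
        return False                      # skips mailto:, javascript:, etc.
    if parts.netloc != base_domain:
        return False                      # crawling never leaves the domain
    path = parts.path.lower()
    return not any(path.endswith(ext) for ext in SKIPPED_EXTENSIONS)
```

Note that this filter cannot detect login walls or paywalls; those pages are only discovered to be unreachable at fetch time.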

Example Walkthrough

  1. An admin navigates to the Add Memory page.
  2. They enter https://example.com/docs as the source URL.
  3. They enable Crawl Sub-Pages and choose Best Quality mode.
  4. Hana processes the main page and follows allowed links up to three levels deep.
  5. Once finished, the admin can search those pages in Hana's memory or use them for richer chat responses.

Key Takeaway

Recursive URL Memory helps you build a centralized knowledge base from your own website without manual copying.

Accessing Recursive URL Memory

  1. Navigate to the Memory section from the left sidebar.

  2. Click on the Recursive URL Memory tab to view all URL memories.

    hana memory recursive url

Recursive URL Memory Sections

In the Recursive URL Memory tab, you can:

  • View URL Memory Batch Entries: See a list of ingested URL batches with details such as ingestion status, creator, and last synced date.

  • Add New URL Memory: Connect a new URL memory for Hana to process and remember.

    • Optionally choose whether to ingest sub-pages
  • Sync or Delete URL Memory Batch Entries: Manage existing entries by syncing or removing them as needed.

  • COPY: Copies the main ingestion URL so you can open it in a new browser tab.

Adding a New URL Memory

  • Click the + INGEST URL button.

    ingesting recursive url memory

  • Provide the starting URL.

  • Choose whether to ingest sub-pages.

    ingesting recursive url memory popup with sub pages option

  • Click the Submit button.

  • The ingestion status will initially show as IN_PROGRESS, which updates to INGESTED once the processing is complete.
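If you are scripting around the dashboard, the IN_PROGRESS to INGESTED transition can be modeled as a simple poll loop. Hana does not document a public status API, so `get_status` below is a hypothetical placeholder for however you read the batch status:

```python
import time

def wait_for_ingestion(get_status, poll_seconds=30, timeout_seconds=3600):
    """Poll until the batch leaves IN_PROGRESS.

    get_status() is a hypothetical callback returning the current batch
    status string; Hana's dashboard is the supported way to check this.
    """
    waited = 0
    while waited < timeout_seconds:
        status = get_status()
        if status != "IN_PROGRESS":
            return status              # e.g. "INGESTED"
        time.sleep(poll_seconds)
        waited += poll_seconds
    raise TimeoutError("batch still IN_PROGRESS after timeout")
```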

Syncing a Recursive URL Memory

  • Locate the memory batch in the memory list.

  • Click the SYNC button to manually update the memory with the latest content from the URL.

  • Hana deletes all existing memories associated with the URL batch and re-ingests it from scratch.

    syncing recursive url memory

  • The Last Synced At field updates to reflect the latest sync time.

Deleting Memory Entries

  • Use the corresponding Delete option for a specific entry.

  • Deleting removes the selected URL memory batch and, if sub-page ingestion was enabled, all of its sub-pages from Hana's memory.

deleting recursive url memory