Indexed search across a folder of documents is outside the Web SDK's scope — the SDK searches only documents loaded in memory. This sample composes the missing piece. A build-time pipeline (scripts/build-search-index.ts) extracts text per page (PDF), slide (PPTX), sheet (XLSX), or section (DOCX) using pdfjs-dist, mammoth, SheetJS, and JSZip, then writes a MiniSearch index to /search-index/index.json. The browser loads that dump on first paint, runs queries client-side, and hands the matched term to Nutrient's instance.search() to highlight in place.
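
As a rough illustration of the build-then-load flow described above, the sketch below indexes one record per locator, serializes the index to JSON, reloads it, and runs a query. It uses a small hand-rolled inverted index as a stand-in for MiniSearch so that it runs without dependencies; it is not MiniSearch's API. The names `IndexedChunk`, `SearchIndex`, `buildIndex`, `dumpIndex`, `loadIndex`, and `searchIndex` are hypothetical and are not taken from scripts/build-search-index.ts.

```typescript
// Hypothetical record shape: one entry per page, slide, sheet, or section.
interface IndexedChunk {
  id: number;
  filename: string;
  locator: string; // e.g. "page 3", "slide 2", "sheet Q1", "section 4"
  text: string;
}

// JSON-serializable index: token -> ids of the chunks that contain it.
interface SearchIndex {
  postings: Record<string, number[]>;
  chunks: IndexedChunk[];
}

// Lowercase and split on non-alphanumerics (ASCII only, to keep the sketch short).
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Build step: what a build-time script would do after text extraction.
function buildIndex(chunks: IndexedChunk[]): SearchIndex {
  const postings: Record<string, number[]> = {};
  for (const chunk of chunks) {
    for (const token of tokenize(chunk.text)) {
      // Own-property check so tokens like "constructor" do not hit Object.prototype.
      if (!Object.prototype.hasOwnProperty.call(postings, token)) postings[token] = [];
      const ids = postings[token];
      if (ids.indexOf(chunk.id) === -1) ids.push(chunk.id);
    }
  }
  return { postings, chunks };
}

// The dump written at build time and the load performed in the browser.
function dumpIndex(index: SearchIndex): string {
  return JSON.stringify(index);
}

function loadIndex(json: string): SearchIndex {
  return JSON.parse(json) as SearchIndex;
}

function postingsFor(index: SearchIndex, token: string): number[] {
  return Object.prototype.hasOwnProperty.call(index.postings, token)
    ? index.postings[token]
    : [];
}

// Query step: return the chunks that contain every query token (AND semantics).
function searchIndex(index: SearchIndex, query: string): IndexedChunk[] {
  const tokens = tokenize(query);
  if (tokens.length === 0) return [];
  let matched = postingsFor(index, tokens[0]);
  for (const token of tokens.slice(1)) {
    const ids = postingsFor(index, token);
    matched = matched.filter((id) => ids.indexOf(id) !== -1);
  }
  return index.chunks.filter((c) => matched.indexOf(c.id) !== -1);
}
```

Each hit carries the filename and locator needed to open the right document at the right place. Passing the matched term to instance.search() for highlighting is left out of the sketch.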
Shipping a JSON index to the browser is fine for a fixed demo corpus, but it doesn't scale: every visitor downloads the full index, and every rebuild requires a redeploy. For a real document repository, move the index server-side. Common stacks: Postgres full-text search or SQLite FTS5 when you want to keep infrastructure minimal; Meilisearch, Typesense, Elasticsearch / OpenSearch, or Algolia when you want a dedicated search service with relevance tuning, faceting, and analytics out of the box. Indexing then runs as a background job triggered on document upload or change (queue, cron, or event handler) instead of at build time, and the browser calls a GET /api/search?q=… endpoint that returns { filename, locator, snippet } hits. The viewer and highlight layer in this sample stay the same; only the index location moves.
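
To make the response shape concrete, here is a minimal, framework-agnostic sketch of the core of such an endpoint. It takes the already-parsed `q` value and returns `{ filename, locator, snippet }` hits. A linear substring scan stands in for whichever engine is chosen (Postgres, Meilisearch, and so on), and `StoredChunk`, `makeSnippet`, `searchHits`, the snippet radius, and the result limit are all assumptions made for illustration.

```typescript
// Hit shape returned to the browser, as described above.
interface SearchHit {
  filename: string;
  locator: string;
  snippet: string;
}

// Hypothetical stored row: one per page, slide, sheet, or section.
interface StoredChunk {
  filename: string;
  locator: string;
  text: string;
}

// Cut a window of `radius` characters around the first case-insensitive match.
// Returns null when the term does not occur in the text.
function makeSnippet(text: string, term: string, radius = 40): string | null {
  const at = text.toLowerCase().indexOf(term.toLowerCase());
  if (at === -1) return null;
  const start = Math.max(0, at - radius);
  const end = Math.min(text.length, at + term.length + radius);
  const prefix = start > 0 ? "…" : "";
  const suffix = end < text.length ? "…" : "";
  return prefix + text.slice(start, end) + suffix;
}

// Core of a search handler: a real backend would replace this scan with a
// query against its search engine and keep the same return shape.
function searchHits(q: string, chunks: StoredChunk[], limit = 20): SearchHit[] {
  const term = q.trim();
  if (term === "") return [];
  const hits: SearchHit[] = [];
  for (const chunk of chunks) {
    if (hits.length >= limit) break;
    const snippet = makeSnippet(chunk.text, term);
    if (snippet !== null) {
      hits.push({ filename: chunk.filename, locator: chunk.locator, snippet });
    }
  }
  return hits;
}
```

An HTTP handler would read `q` from the query string and respond with `JSON.stringify(searchHits(q, chunks))`. The client can then treat those hits the same way it treats results from the in-browser index.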