| Component | Description | Technology Options |
|-----------|-------------|--------------------|
| OCR Text Layer | Run an OCR pass (e.g., Tesseract, ABBYY) on the scanned pages to produce an invisible, searchable text layer that sits on top of the image. | PyPDF2/pypdf (Python), iText (Java), or Ghostscript. |
| Table-of-Contents (TOC) Builder | Parse headings (e.g., “Capítulo 1 – …”) and generate a hierarchical bookmark file. | pypdf `add_outline_item()` (`addBookmark()` in older PyPDF2 releases), or a separate JSON TOC that the viewer reads. |
| Glossary / Lookup Service | A dictionary of theological terms, biblical cross-references, and historical notes. When a user selects a word, a tooltip shows the definition and a link to the verse. | JSON dictionary + JavaScript tooltip; optional fallback to an online API (e.g., Bible API, Wikidata). |
| Annotation Store | A lightweight JSON file (`annotations.json`) that records the page number, rectangle coordinates, highlight colour, and user comment for each annotation. | Export/import via a “Save Annotations” button; optional sync to cloud storage (Google Drive, Dropbox). |
| Patched PDF Loader | A small script or browser extension that, on launch, reads the original PDF and the companion files (OCR text, TOC, glossary, annotations) and renders them together. | PDF.js (web), Electron + PDF-Viewer (desktop), or a simple Python/Qt viewer. |
| Accessibility Layer | Ensure that the OCR text is tagged for screen readers and that the tooltip content is ARIA-compatible. | Use PDF/UA tagging, or provide a separate HTML version generated on the fly. |
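
The Annotation Store and Glossary rows above could be prototyped roughly as follows. This is a minimal sketch, not a definitive implementation: the field names (`page`, `rect`, `colour`, `comment`), the `version` key, the lower-cased glossary keys, and the helper names `save_annotations`, `load_annotations`, and `lookup_term` are all assumptions made for illustration, and the sample glossary entry is placeholder data.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Optional


@dataclass
class Annotation:
    """One highlight, as described in the Annotation Store row (field names assumed)."""
    page: int                                  # 1-based page number
    rect: tuple[float, float, float, float]    # (x0, y0, x1, y1) in PDF points
    colour: str                                # highlight colour, e.g. "#ffff00"
    comment: str = ""                          # free-text user comment


def save_annotations(annotations: list[Annotation], path: Path) -> None:
    """Export all annotations to a JSON file such as annotations.json."""
    payload = {"version": 1, "annotations": [asdict(a) for a in annotations]}
    path.write_text(json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8")


def load_annotations(path: Path) -> list[Annotation]:
    """Import annotations; a missing file is treated as 'no annotations yet'."""
    if not path.exists():
        return []
    payload = json.loads(path.read_text(encoding="utf-8"))
    return [
        Annotation(
            page=item["page"],
            rect=tuple(item["rect"]),          # JSON stores the rectangle as a list
            colour=item["colour"],
            comment=item.get("comment", ""),
        )
        for item in payload.get("annotations", [])
    ]


def lookup_term(glossary: dict[str, dict], selection: str) -> Optional[dict]:
    """Return the glossary entry for the selected word, or None if it is unknown.

    Assumes glossary keys are stored lower-cased (casefolded).
    """
    return glossary.get(selection.strip().casefold())


if __name__ == "__main__":
    store = Path("annotations.json")
    notes = load_annotations(store)
    notes.append(Annotation(page=12, rect=(72.0, 640.0, 310.0, 655.0),
                            colour="#ffff00", comment="Compare with chapter 3"))
    save_annotations(notes, store)

    # Placeholder glossary entry; a real glossary would be loaded from its own JSON file.
    glossary = {"grace": {"definition": "(placeholder definition)", "verse": "Ephesians 2:8"}}
    print(lookup_term(glossary, " Grace "))
```

Keeping the annotations in a plain JSON file, separate from the PDF, means the original scan is never modified and the same file can be exported, imported, or synced as-is. A viewer built on PDF.js would do the equivalent lookup in JavaScript; the Python version here only illustrates the data shape.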