feat(PROJ-44): Snippet + match_field for search, GetAttachmentText

Extend the Hit struct with Snippet and MatchField. enrichHitsWithSnippets
fills both per hit: detectMatchField probes subject > body >
attachment_text > attachment_names > from_addr > to_addr; buildSnippet
issues CALL SNIPPETS with <b> markers. A snippet error does not drop the
hit.

Add the AttachmentTextReader interface and its Manticore implementation.
GetAttachmentText returns the indexed OCR text for the new /ocr-text
endpoint.
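
The enrichment pass described in the message can be sketched roughly as follows. This is a minimal illustration, not the committed code: the function names come from the commit message, but their signatures, the `matchProbe` and `snippetBuilder` types, and the in-memory fakes in `demo` are assumptions made here so the example runs without Manticore. It also assumes that a failed snippet leaves both fields empty, as the doc comment on `Hit` suggests.

```go
package main

import (
	"errors"
	"fmt"
)

// Hit carries only the fields this sketch needs; JSON tags are omitted.
type Hit struct {
	ID         string
	Snippet    string
	MatchField string
}

// fieldPriority is the probe order given in the commit message.
var fieldPriority = []string{
	"subject", "body", "attachment_text",
	"attachment_names", "from_addr", "to_addr",
}

// matchProbe and snippetBuilder are hypothetical stand-ins for the
// Manticore-backed helpers; the real signatures are not shown in the commit.
type matchProbe func(hitID, field, query string) (bool, error)
type snippetBuilder func(hitID, field, query string) (string, error)

// detectMatchField returns the first field in priority order that matches,
// or "" when none does.
func detectMatchField(hitID, query string, probe matchProbe) (string, error) {
	for _, field := range fieldPriority {
		matched, err := probe(hitID, field, query)
		if err != nil {
			return "", err
		}
		if matched {
			return field, nil
		}
	}
	return "", nil
}

// enrichHitsWithSnippets fills Snippet and MatchField in place. Any per-hit
// error leaves both fields empty and moves on; the hit itself is kept.
func enrichHitsWithSnippets(hits []Hit, query string, probe matchProbe, build snippetBuilder) {
	if query == "" {
		return // filter-only search: nothing to highlight
	}
	for i := range hits {
		field, err := detectMatchField(hits[i].ID, query, probe)
		if err != nil || field == "" {
			continue
		}
		snippet, err := build(hits[i].ID, field, query)
		if err != nil {
			continue
		}
		hits[i].MatchField, hits[i].Snippet = field, snippet
	}
}

// demo runs the enrichment against in-memory fakes (hypothetical data).
func demo() []Hit {
	matches := map[string]string{"a": "body", "b": "subject"} // hit ID -> matching field
	probe := func(hitID, field, _ string) (bool, error) {
		return matches[hitID] == field, nil
	}
	build := func(hitID, _, query string) (string, error) {
		if hitID == "b" {
			return "", errors.New("simulated CALL SNIPPETS failure")
		}
		return "<b>" + query + "</b> attached", nil
	}
	hits := []Hit{{ID: "a"}, {ID: "b"}}
	enrichHitsWithSnippets(hits, "invoice", probe, build)
	return hits
}

func main() {
	for _, h := range demo() {
		fmt.Printf("%s field=%q snippet=%q\n", h.ID, h.MatchField, h.Snippet)
	}
}
```

Hit "b" stays in the result with empty Snippet and MatchField, which is the "no hard error" behaviour the message calls out.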
sysops
2026-05-10 22:20:52 +02:00
parent 5078830469
commit 7b75433999
3 changed files with 174 additions and 2 deletions
+17 -2
@@ -35,9 +35,16 @@ type SearchRequest struct {
 }
 // Hit is a single search result.
+//
+// PROJ-44: Snippet and MatchField are populated by the Manticore Search path
+// when a full-text query was provided. They remain empty for filter-only
+// searches (e.g. date range without query) and when the per-hit highlight
+// pass fails — the hit is still returned in that case (no hard error).
 type Hit struct {
-	ID    string  `json:"id"`
-	Score float64 `json:"score"`
+	ID         string  `json:"id"`
+	Score      float64 `json:"score"`
+	Snippet    string  `json:"snippet,omitempty"`     // HTML-marked excerpt with <b>match</b> tags
+	MatchField string  `json:"match_field,omitempty"` // subject|body|attachment_text|attachment_names|from_addr|to_addr
 }
 // SearchResult holds paginated search results.
@@ -63,6 +70,14 @@ type AttachmentTextUpdater interface {
 	UpdateAttachmentText(mailID, text string) error
 }
+// AttachmentTextReader is implemented by indexers that can return the stored
+// OCR-extracted attachment text for a mail. Optional add-on to Indexer.
+//
+// PROJ-44: Manticore implements this for the /api/mails/{id}/ocr-text endpoint.
+type AttachmentTextReader interface {
+	GetAttachmentText(mailID string) (string, error)
+}
 // TenantIndexer manages per-tenant Indexer instances.
 // Implemented by ManticoreTenantManager (primary) and TenantIndexManager (legacy Xapian).
 type TenantIndexer interface {