feat(PROJ-5): AES-256-GCM encryption, PostgreSQL metadata, async index worker

- Storage: AES-256-GCM encryption (keyfile, graceful fallback when the key is missing)
- Storage: PostgreSQL emails table with auto-migration
- Storage: Save/Delete/Stats/FirstAndLastMail use the DB when available
- Index: async IndexWorker (Go channel, queue size 1000, non-blocking Submit)
- SMTP: IndexCallback for async indexing after mail receipt
- main: backfill at startup (40 mails migrated + indexed)
- Existing mails are decrypted transparently (fallback to raw)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
sysops
2026-03-14 20:26:50 +01:00
parent 850290b5ef
commit 7e68c7ab02
8 changed files with 750 additions and 36 deletions
+1 -1
@@ -16,7 +16,7 @@
| PROJ-2 | Email import: EML/MBOX upload | In Progress | [PROJ-2](PROJ-2-import-eml-mbox.md) | 2026-03-12 |
| PROJ-3 | Email import: IMAP connection | In Progress | [PROJ-3](PROJ-3-import-imap.md) | 2026-03-12 |
| PROJ-4 | Email import: SMTP intake via BCC (primary) | In Progress | [PROJ-4](PROJ-4-import-smtp.md) | 2026-03-12 |
| PROJ-5 | Email storage & full-text indexing | In Progress | [PROJ-5](PROJ-5-speicherung-und-indexierung.md) | 2026-03-12 |
| PROJ-5 | Email storage & full-text indexing | In Review | [PROJ-5](PROJ-5-speicherung-und-indexierung.md) | 2026-03-12 |
| PROJ-6 | Full-text search & filtering | In Progress | [PROJ-6](PROJ-6-volltext-suche.md) | 2026-03-12 |
| PROJ-7 | Email view (reading & attachments) | In Progress | [PROJ-7](PROJ-7-email-ansicht.md) | 2026-03-12 |
| PROJ-8 | Automatic IMAP sync (cron job) | In Progress | [PROJ-8](PROJ-8-imap-auto-sync.md) | 2026-03-12 |
+50 -2
@@ -1,8 +1,8 @@
# PROJ-5: Email Storage & Full-Text Indexing
## Status: In Progress
## Status: In Review
**Created:** 2026-03-12
**Last Updated:** 2026-03-12
**Last Updated:** 2026-03-14
## Dependencies
- None (base feature, used by the import features)
@@ -187,6 +187,54 @@ Body (without attachments) Attachments (0..n)
| `mime`, `mime/multipart` | MIME-Parsing (Go Stdlib) |
| `golang.org/x/net/html` | HTML → plain text for the index |
## Implementation Notes (2026-03-14)
### What was built
1. **AES-256-GCM encryption** in `internal/storage/storage.go`:
- Key loaded from file at `cfg.Storage.Keyfile` path or `ARCHIVMAIL_KEY` env var
- Supports base64-encoded or raw 32-byte key files
- If no keyfile configured, stores unencrypted (backwards compatible for dev)
- `Save()` encrypts with random 12-byte nonce prepended to ciphertext
- `Load()` decrypts transparently; falls back to raw read if decryption fails (pre-encryption files)
- SHA-256 dedup based on **plaintext** content (hash before encrypt)
- Same flat file path `store/{id[:2]}/{id}`
2. **PostgreSQL `emails` metadata table** auto-created at startup:
- Schema: `id TEXT PK, received_at, mail_from, mail_to, subject, size_bytes, has_attach, indexed_at`
- Indexes on `received_at`, `mail_from`, and GIN on `subject`
- `Save()` inserts metadata via mailparser after writing file (ON CONFLICT DO NOTHING)
- `Delete()` also removes DB row
- `Stats()` and `FirstAndLastMail()` use DB queries when available (fast), fall back to FS walk
- New methods: `SaveMeta()`, `SetIndexedAt()`, `IsIndexed()`, `WalkStore()`
3. **Storage constructor changed** from `New(dir string)` to `New(cfg storage.Config)`:
- `Config` struct: `Dir`, `Keyfile`, `DSN`
- All callers updated: `main.go`, `cmd_import.go`, `cmd_export.go`
- `Close()` method added to release DB pool
4. **Async Index Worker** in `internal/index/worker.go`:
- Buffered channel queue (configurable size via `config.Index.AsyncQueueSize`)
- `Submit()` is non-blocking; if the queue is full it drops the item and logs a warning
- `Start()` launches background goroutine; `Stop()` drains queue and blocks until done
- Serialises Xapian writes (one writer at a time)
5. **SMTP daemon integration**: `SetIndexCallback()` on `smtpd.Daemon`
- After each successfully stored mail, callback submits to async worker
- Wired in `main.go`
6. **Backfill at startup** in `main.go`:
- Runs in background goroutine
- Walks store directory, parses each file, upserts DB metadata
- Submits un-indexed emails (`indexed_at IS NULL`) to the async worker
- Logs progress every 100 files
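The encryption scheme from item 1 above can be sketched roughly as follows. This is a minimal illustration, not the code in `internal/storage/storage.go`: the helper names `parseKey`, `seal`, `open`, and `storePath` are hypothetical, and treating the ID as the hex-encoded SHA-256 of the plaintext is an assumption based on the "hash before encrypt" note.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// parseKey accepts either a raw 32-byte key or its base64 encoding (hypothetical helper).
func parseKey(data []byte) ([]byte, error) {
	if len(data) == 32 {
		return data, nil
	}
	decoded, err := base64.StdEncoding.DecodeString(strings.TrimSpace(string(data)))
	if err != nil || len(decoded) != 32 {
		return nil, errors.New("key must be 32 bytes, raw or base64-encoded")
	}
	return decoded, nil
}

// newGCM builds an AES-256-GCM AEAD from a 32-byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// seal returns nonce || ciphertext, using a fresh random 12-byte nonce.
func seal(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Passing nonce as dst makes Seal append the ciphertext to it.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open decrypts stored bytes; if decryption fails it returns them unchanged,
// mirroring the fallback for files written before encryption was enabled.
func open(key, stored []byte) []byte {
	gcm, err := newGCM(key)
	if err != nil || len(stored) < gcm.NonceSize() {
		return stored
	}
	n := gcm.NonceSize()
	plain, err := gcm.Open(nil, stored[:n], stored[n:], nil)
	if err != nil {
		return stored
	}
	return plain
}

// storePath derives the ID from the plaintext (assumed: hex SHA-256) and the
// flat path root/{id[:2]}/{id}.
func storePath(root string, plaintext []byte) (id, path string) {
	sum := sha256.Sum256(plaintext)
	id = hex.EncodeToString(sum[:])
	return id, filepath.Join(root, id[:2], id)
}

func main() {
	// All-zero demo key, base64-encoded; never use such a key in practice.
	key, err := parseKey([]byte(base64.StdEncoding.EncodeToString(make([]byte, 32))))
	if err != nil {
		panic(err)
	}
	msg := []byte("Subject: hello\r\n\r\nbody")
	stored, err := seal(key, msg)
	if err != nil {
		panic(err)
	}
	id, path := storePath("store", msg)
	fmt.Println("id:", id, "path:", path)
	fmt.Println("round trip ok:", string(open(key, stored)) == string(msg))
	fmt.Println("legacy fallback ok:", string(open(key, msg)) == string(msg))
}
```

Because the hash is taken before encryption, the same message always maps to the same path even though the random nonce makes each ciphertext different. One consequence of the fallback in this sketch is that a corrupted encrypted file is returned as-is rather than reported as an error.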
### Deviations from spec
- Store path kept flat `store/{id[:2]}/{id}` (no `server_id/customer_id` hierarchy) per user decision
- Attachment dedup store (`astore/`) not yet implemented (body + attachments stored together in `.m` files as before)
- No separate `attachments` or `email_attachments` DB tables yet (deferred to future iteration)
- IMAP importer still uses synchronous `IndexSync()` directly (not routed through async worker yet)
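For orientation, the async worker that the IMAP importer does not yet use (item 4 under "What was built") could look roughly like the sketch below. It is an illustration under assumptions, not the contents of `internal/index/worker.go`: the `Worker` type, `NewWorker`, and `IndexFunc` are hypothetical names, and the index call is a stand-in for the real Xapian write.

```go
package main

import (
	"fmt"
	"sync"
)

// IndexFunc is a hypothetical stand-in for the real index write.
type IndexFunc func(id string) error

// Worker queues mail IDs and indexes them on a single goroutine.
type Worker struct {
	queue chan string
	index IndexFunc
	wg    sync.WaitGroup
}

// NewWorker creates a worker with a buffered queue of the given size.
func NewWorker(size int, fn IndexFunc) *Worker {
	return &Worker{queue: make(chan string, size), index: fn}
}

// Start launches the background goroutine. A single consumer means index
// writes are serialised.
func (w *Worker) Start() {
	w.wg.Add(1)
	go func() {
		defer w.wg.Done()
		for id := range w.queue {
			if err := w.index(id); err != nil {
				fmt.Println("index failed:", id, err)
			}
		}
	}()
}

// Submit never blocks: it enqueues the ID or, if the queue is full,
// drops it with a warning and returns false.
func (w *Worker) Submit(id string) bool {
	select {
	case w.queue <- id:
		return true
	default:
		fmt.Println("warn: index queue full, dropping", id)
		return false
	}
}

// Stop closes the queue and blocks until everything queued has been indexed.
// Submit must not be called after Stop, since sending on a closed channel panics.
func (w *Worker) Stop() {
	close(w.queue)
	w.wg.Wait()
}

func main() {
	var indexed []string
	w := NewWorker(2, func(id string) error {
		indexed = append(indexed, id)
		return nil
	})
	// Nothing consumes before Start, so the third Submit finds the queue full.
	for _, id := range []string{"a", "b", "c"} {
		fmt.Println("submitted", id, "->", w.Submit(id))
	}
	w.Start()
	w.Stop()
	fmt.Println("indexed:", indexed)
}
```

Routing the IMAP importer through such a worker would amount to calling `Submit` where it currently calls `IndexSync()`. Dropped items are not lost for good in the design described above, since the startup backfill resubmits mails whose `indexed_at` is still NULL.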
## QA Test Results
_To be added by /qa_