Changelog

All notable changes to this project are documented here in a high‑level, date‑based format.

v1.0.0 — 2026-04-21 — Modular architecture

First major release. Septum is now seven independently installable packages across three security zones; the monolithic backend/ is gone.

7 modules: septum-core (PII engine, zero net deps), septum-mcp, septum-api, septum-web, septum-queue (file / Redis bridge), septum-gateway (cloud LLM forwarder — cannot import core by invariant), septum-audit.
4 compose topologies: standalone (SQLite) · full dev stack · air-gapped zone · internet-facing zone.
Auto-RAG routing + MCP over stdio / streamable-http / sse + Audit Trail v2 (every detection links back to its audit event).
17 regulation packs with canonical RegulationId registry, national-ID checksum validators, legal-sources doc.
Honest benchmark: 3,468 values × 16 languages + 5 external datasets + adversarial pack. Combined F1 96.6%, 0.00 FP/1k on clean text. Full methodology in docs/benchmark.md.
Security: Redis AUTH, parameterised compose passwords (${VAR:?} fail-fast, .env.example), pickle → json in BM25, 18 security-scan findings addressed.
Docker: six multi-arch images, CPU + GPU variants for torch-dependent ones, git-tag-driven release workflow.

Breaking

from backend.app.* → from septum_api.*.
Compose files require POSTGRES_PASSWORD / REDIS_PASSWORD in .env (see .env.example); no dev defaults.

Date-based ledger below has the full incremental history.

2026-04-28

Critical fix — Ollama validation was silently dropping deterministic detections: use_ollama_validation_layer gated every span outside the high-priority identifier set through Ollama and kept only the LLM's "validated" subset, so when the model omitted a real address, email, date of birth, or customer reference, the recognizer's correct hit was discarded. Reproduced on a real KVKK consent form: 19 detections persisted out of 24 the recognizers actually produced — every "Adres :" line, the second (KVKK) : email, the second Doğum Tarihi :, and both customer references were stripped. _SEMANTIC_VALIDATION_PASSTHROUGH_TYPES now covers every deterministic type (POSTAL_ADDRESS, STREET_ADDRESS, EMAIL_ADDRESS, DATE_OF_BIRTH, URL, IP_ADDRESS, MAC_ADDRESS, COORDINATES, COOKIE_ID, DEVICE_ID, CUSTOMER_REFERENCE_ID); only NER-driven types (PERSON_NAME / LOCATION / ORGANIZATION_NAME) reach the validator.
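The gating rule described in the entry above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: `Span` and `filter_after_validation` are hypothetical names, the separate high-priority identifier set is left out, and only the entity type names are taken from the entry.

```python
from dataclasses import dataclass

# Deterministic types named in the entry; these bypass the LLM validator.
PASSTHROUGH_TYPES = frozenset({
    "POSTAL_ADDRESS", "STREET_ADDRESS", "EMAIL_ADDRESS", "DATE_OF_BIRTH",
    "URL", "IP_ADDRESS", "MAC_ADDRESS", "COORDINATES", "COOKIE_ID",
    "DEVICE_ID", "CUSTOMER_REFERENCE_ID",
})


@dataclass(frozen=True)
class Span:  # hypothetical stand-in for one detection
    start: int
    end: int
    entity_type: str


def filter_after_validation(spans, llm_validated):
    """Keep passthrough spans unconditionally; keep the rest only if the
    validator returned them."""
    validated = set(llm_validated)
    return [
        span for span in spans
        if span.entity_type in PASSTHROUGH_TYPES or span in validated
    ]
```

With this shape, an email address the model forgets to echo back still survives, while an unconfirmed PERSON_NAME is dropped.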
KVKK customer-reference recognizer + new entity type: Müşteri No: ETP-2021-00489-style alphanumeric reference IDs were not recognised under any pack; KVKK Md. 3's broad personal-data definition covers them. New CUSTOMER_REFERENCE_ID entity type plus a Turkish-label pattern recognizer that handles Müşteri / Üye / CRM / Hesap cues, optional (varsa) qualifiers, and the double-colon spacing PDF extraction sometimes produces.
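A label-driven pattern of the kind this entry describes might look like the sketch below. The regex is an assumption written for illustration: the cue words, the "(varsa)" qualifier, and the example ID come from the entry, while the ID shape (`[A-Z]{2,5}` followed by hyphenated digit groups) and the name `find_customer_refs` are hypothetical.

```python
import re
from typing import List

# Label cue, optional "No", optional "(varsa)" qualifier, one or two colons
# (PDF extraction can emit ": :"), then an ID shaped like ETP-2021-00489.
CUSTOMER_REF = re.compile(
    r"(?:Müşteri|Üye|CRM|Hesap)\s*(?:No)?\.?\s*"
    r"(?:\(varsa\)\s*)?"
    r"(?::\s*){1,2}"
    r"(?P<ref>[A-Z]{2,5}(?:-\d{2,})+)"
)


def find_customer_refs(text: str) -> List[str]:
    """Return every reference ID that follows a recognised label cue."""
    return [match.group("ref") for match in CUSTOMER_REF.finditer(text)]
```

Anchoring on the label rather than on the ID shape alone keeps unrelated hyphenated numbers (phone numbers, dates) from matching.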
Per-language NER ensembles: NERModelRegistry.DEFAULT_MODEL_MAP now carries an ordered list of HuggingFace model IDs per language; pipelines are cached by model id so a model shared across languages (Davlan wikiann across 11 locales) loads once. The Detector NER loop iterates every pipeline, applies the ALL-CAPS title-cased re-run per pipeline, and unions detections via a (start, end, entity_type) seen-set. Default Turkish ensemble: akdeniz27/xlm-roberta-base-turkish-ner + savasy/bert-base-turkish-ner-cased — different architectures catch different rare surnames the XLM-RoBERTa encoder underweights. Other locales stay single-model to bound memory cost; users opt into ensembles per language by entering a comma-separated list in the NER override input.
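The caching and union steps in this entry can be sketched as below. This is an illustration only: pipelines are modelled as plain callables, `get_pipeline`, `detect_with_ensemble`, and the `loader` argument are hypothetical names, and the ALL-CAPS title-cased re-run is omitted.

```python
from typing import Callable, Dict, List, Tuple

Detection = Tuple[int, int, str]            # (start, end, entity_type)
Pipeline = Callable[[str], List[Detection]]

_PIPELINE_CACHE: Dict[str, Pipeline] = {}


def get_pipeline(model_id: str, loader: Callable[[str], Pipeline]) -> Pipeline:
    """Load each model once, even when several languages share it."""
    if model_id not in _PIPELINE_CACHE:
        _PIPELINE_CACHE[model_id] = loader(model_id)
    return _PIPELINE_CACHE[model_id]


def detect_with_ensemble(text: str, model_ids: List[str],
                         loader: Callable[[str], Pipeline]) -> List[Detection]:
    """Run every pipeline in order and union the hits via a seen-set."""
    seen = set()
    merged: List[Detection] = []
    for model_id in model_ids:
        for detection in get_pipeline(model_id, loader)(text):
            if detection not in seen:
                seen.add(detection)
                merged.append(detection)
    return merged
```

Keying the cache by model id rather than by language is what lets one shared model serve many locales without loading twice.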
Settings UX audit: dropped the unused ollama de-anonymization strategy (cloud LLM preserves placeholders verbatim — deterministic Unmasker is faster, predictable, and free of hallucination risk); removed the dead extract_embedded_images / recursive_email_attachments toggles that no ingester ever read; surfaced the missing use_ollama_semantic_layer toggle in PrivacyTab so the only layer that detects DIAGNOSIS / MEDICATION / RELIGION / ETHNICITY / POLITICAL_OPINION is no longer silently disabled. All five sanitization-layer descriptions rewritten with cost/benefit guidance in EN and TR.
Custom-rules reference + inline UI help: new docs/custom-rules.md (EN + TR) walks through all three detection methods with worked examples (internal project codes, codenames, medication mentions) plus the test loop, common fields, and audit-trail behaviour. Regulations page Custom Rules and Advanced (Non-PII) tabs gain inline help banners with copy-paste pattern examples and a doc link, both locales.
Toggle visual fix: ToggleSwitch knob was rendering separated from the track in production due to inline-flex centering quirks. Rebuilt on absolute positioning with top-1/2 + -translate-y-1/2, added role="switch" + aria-checked for screen readers, replaced the always-on border with a focus-visible ring, dropped the text-[0px] hack. Single component change restandardises every toggle in the app (Privacy, Ingestion, RAG, Custom Rule Builder, User Form modal).
Person-name expansion no longer crosses newlines: expand_person_name_spans skipped past \n while looking for an adjacent surname/given-name token, so on form-style layouts (Ad Soyad\nFatma Nur Öztürk\nT.C. Kimlik No) the name span absorbed the previous-row label. The polluted text became the entity_index HMAC value, so chat-time entity routing for that document never matched ("Fatma Nur Öztürk" → "Doğrudan yanıt (doküman kullanılmadı)", i.e. "Direct answer (no document used)", on five separate questions in our hasta-kayit reproduction). Both expansion directions now bail on the first \n / \r; inline-Latin "John" → "John Smith" expansion still works.
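The newline rule can be sketched as below. This is not the real expand_person_name_spans; `expand_left` and `expand_right` are hypothetical helpers, and "adjacent name token" is approximated as one alphabetic, capitalised word on the same line.

```python
def expand_right(text: str, end: int) -> int:
    """Extend a name span over one adjacent capitalised token, same line only."""
    i = end
    while i < len(text) and text[i] in " \t":   # skip inline whitespace only
        i += 1
    if i == end or i >= len(text) or text[i] in "\r\n":
        return end                              # line break or nothing to add
    j = i
    while j < len(text) and not text[j].isspace():
        j += 1
    token = text[i:j]
    return j if token.isalpha() and token[:1].isupper() else end


def expand_left(text: str, start: int) -> int:
    """Mirror image of expand_right for a preceding token."""
    i = start
    while i > 0 and text[i - 1] in " \t":
        i -= 1
    if i == start or i == 0 or text[i - 1] in "\r\n":
        return start
    j = i
    while j > 0 and not text[j - 1].isspace():
        j -= 1
    token = text[j:i]
    return j if token.isalpha() and token[:1].isupper() else start
```

On the form layout from the entry, neither direction moves, so the label on the previous row stays out of the span; on inline text, "John" still grows to "John Smith".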
Chat input lag eliminated: pressing Enter used to await a /api/chat/analyze round-trip (~500–1000 ms of query sanitization + entity-index lookup) before the user's message rendered in the chat — every send had a visible "where did my text go" gap. The analyze now runs in the background; the user message + assistant placeholder appear synchronously. If the analyze later detects a multi-document ambiguity the in-flight stream is cancelled and the picker modal surfaces, so the rare disambiguation path still works.

2026-04-26

Entity-aware RAG routing across documents: Auto-RAG now narrows retrieval to the documents that actually contain the queried PII (new cross-document entity index, HMAC-SHA256 keyed for privacy; relationship cache + cytoscape-based graph page; disambiguation picker when an ambiguous person/term matches several documents; intent classifier is bypassed on any entity hit so a weak match still grounds the answer rather than falling through to general chat). Multi-doc chat now unifies per-document anon maps into a single placeholder space so the LLM never sees two unrelated [PERSON_NAME_1] entries that mean different people.
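A keyed index of the kind this entry describes can be sketched as below. It is an assumption-laden illustration: `EntityIndex` and `entity_token` are hypothetical names, the key handling is simplified, and whitespace collapsing stands in for the project's normalization.

```python
import hashlib
import hmac
from collections import defaultdict


def entity_token(value: str, key: bytes) -> str:
    """HMAC-SHA256 of the whitespace-normalised value; raw PII is never stored."""
    normalized = " ".join(value.split())
    return hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


class EntityIndex:
    """Maps keyed tokens to the documents that contain the entity."""

    def __init__(self, key: bytes):
        self._key = key
        self._docs_by_token = defaultdict(set)

    def add(self, document_id: int, value: str) -> None:
        self._docs_by_token[entity_token(value, self._key)].add(document_id)

    def documents_for(self, value: str) -> set:
        token = entity_token(value, self._key)
        return set(self._docs_by_token.get(token, set()))
```

At query time, retrieval can be limited to `documents_for(...)`. Without the key, the stored tokens cannot be matched against guessed names.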
Critical fix: chunks were storing raw text under the sanitized_text column. A 1.5-year-old latent bug in document_pipeline saved unmasked text as "sanitized", so the chat sanitizer's "context already masked" fast-path was passing raw PII through. Pipeline now persists sanitize_result.sanitized_text, with a new chunks.raw_text column behind it for the document-preview UI (preview keeps showing the original text without re-leaking it to chat).
PII detection tightening: KVKK Turkish-label recognizers for T.C. / Vergi No (closed a leak path the keyword-alternation regex couldn't reach); IBAN spans trimmed to the longest checksum-valid prefix; entity normalization key now collapses whitespace variants so OCR/PDF spacing differences land in the same placeholder; per-entity NER confidence thresholds externalized into one table with PERSON_NAME lowered 0.85 → 0.80 (privacy-first recall on rare Turkish surnames); LOCATION spans are absorbed by overlapping ORG / postal-address spans so "Antalya Sağlık Merkezi" stays one placeholder instead of two.
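The IBAN trimming step can be sketched with the standard mod-97 check. This is an illustration, not the project's recognizer: `longest_valid_iban_prefix` is a hypothetical name, and the sketch applies only the generic 15 to 34 character bounds rather than country-specific lengths.

```python
def iban_checksum_ok(candidate: str) -> bool:
    """Mod-97 check: move the first four characters to the end, map letters
    to 10..35, and require the resulting number to be 1 modulo 97."""
    if len(candidate) < 15 or not (candidate.isascii() and candidate.isalnum()):
        return False
    rearranged = (candidate[4:] + candidate[:4]).upper()
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


def longest_valid_iban_prefix(span: str):
    """Drop whitespace, then shrink from the right until the checksum passes."""
    compact = "".join(span.split())
    for length in range(min(len(compact), 34), 14, -1):
        if iban_checksum_ok(compact[:length]):
            return compact[:length]
    return None
```

When trailing digits from a neighbouring field get glued onto the match, the loop peels them off until the checksum holds again.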
Per-document source citation in chat: SSE meta event now carries matched_documents with per-document chunk counts; the chat bubble shows "{N} belgeden, {M} parça kullanıldı" ("{M} chunks used from {N} documents") with an expandable per-document list. Works for both auto-RAG and manual-document modes.
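A meta frame carrying per-document chunk counts could be assembled as below. Only the `matched_documents` field name comes from the entry; the inner keys `document_id` and `chunk_count`, the `event: meta` line, and `build_meta_event` itself are assumptions for illustration.

```python
import json
from collections import Counter


def build_meta_event(chunk_document_ids) -> str:
    """Count retrieved chunks per document and wrap them in one SSE frame."""
    counts = Counter(chunk_document_ids)
    payload = {
        "matched_documents": [
            {"document_id": doc_id, "chunk_count": count}
            for doc_id, count in sorted(counts.items())
        ]
    }
    # An SSE frame is a set of "field: value" lines ended by a blank line.
    return f"event: meta\ndata: {json.dumps(payload)}\n\n"
```

The client can derive both numbers in the chat bubble from this payload: the list length gives the document count and the summed counts give the chunk total.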
Per-language NER model presets in settings: NER Models tab gains one-click preset chips next to the override input so users can swap between curated alternatives (3 Turkish + 2 English) without typing HuggingFace IDs by hand. Backend SUGGESTED_MODELS table is the single source of truth, surfaced via /api/settings/ner-defaults.
Pluggable Ollama benchmark: OLLAMA_MODEL in benchmark_detection.py reads SEPTUM_BENCHMARK_OLLAMA_MODEL from env; new scripts/benchmark_ollama_models.sh runs the harness against a list of models in turn (default trio: llama3.2:3b, aya-expanse:8b, qwen2.5:14b) and persists per-model logs. Benchmark docs (EN + TR) gain an "Ollama model comparison" section with TBD numbers waiting on the host that owns the GPU + model downloads.
Database hygiene + chat resilience: PRAGMA foreign_keys=ON was missing on SQLite, so detection rows could outlive the documents they referenced and surface in chat as bogus matches; startup now enforces FKs and runs a one-time purge of any pre-existing orphans from entity_detections / entity_index. entity_type is now part of the entity_index uniqueness key so legitimate token-level hash collisions across types coexist. A chat approval timeout is now treated as a user rejection, so the retry button works again.
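The effect of the missing pragma is easy to reproduce with the standard library. The sketch below uses a simplified two-table schema and assumes an ON DELETE CASCADE constraint, which the entry does not confirm; `orphans_after_delete` is a hypothetical helper.

```python
import sqlite3


def orphans_after_delete(enforce_foreign_keys: bool) -> int:
    """Delete a document and count the detection rows left behind."""
    conn = sqlite3.connect(":memory:")
    if enforce_foreign_keys:
        # By default SQLite ignores FK constraints until the connection opts in.
        conn.execute("PRAGMA foreign_keys=ON")
    conn.executescript(
        """
        CREATE TABLE documents (id INTEGER PRIMARY KEY);
        CREATE TABLE entity_detections (
            id INTEGER PRIMARY KEY,
            document_id INTEGER NOT NULL
                REFERENCES documents(id) ON DELETE CASCADE
        );
        INSERT INTO documents (id) VALUES (1);
        INSERT INTO entity_detections (document_id) VALUES (1);
        """
    )
    conn.execute("DELETE FROM documents WHERE id = 1")
    (count,) = conn.execute("SELECT COUNT(*) FROM entity_detections").fetchone()
    conn.close()
    return count
```

Without the pragma the constraint is parsed but never enforced, which is how detection rows could outlive their documents.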

2026-04-24

Quickstart $EDITOR fix: The cp .env.example .env && $EDITOR .env line in README and the installation guide broke on shells without $EDITOR set (zsh tried to execute .env as a command). Split into a cp + explicit "open in your editor" comment; mirrored in the Turkish copies.
Compose files usable out of the box: docker-compose*.yml tagged services as septum/api, septum/web, septum/gateway, septum/audit, septum/mcp, septum/standalone — none of which exist on Docker Hub — so docker compose up failed with "pull access denied" for any image not already built locally. Retagged all six services to the real published names (byerlikaya/septum-*). Also fixed the Redis healthcheck in the same compose files: REDISCLI_AUTH=$${REDIS_PASSWORD} escaped the dollar sign so the in-container shell tried to read REDIS_PASSWORD (never set inside the redis container) and errored out, marking Redis unhealthy and cascading to every dependent service; switched to single-dollar so compose interpolates the password into the healthcheck string (same visibility as --requirepass, no new leak).
Published septum-web image baked against compose topology: .github/workflows/docker-publish.yml did not pass BACKEND_INTERNAL_URL to the septum-web build, so the published image defaulted to http://127.0.0.1:8000 (single-container topology) and broke the docker-compose.yml multi-container topology where the api is reachable at http://api:8000. Added per-image extra_build_args to the publish matrix and wired it through the build step so future releases bake the correct proxy target into the web image.
Drop YAML frontmatter from tracked markdowns: GitHub renders YAML frontmatter as a visible table at the top of every markdown it shows, which cluttered the README and every doc page browsed through the repo UI. Stripped frontmatter from 21 files (5 root MDs + 16 docs/ pages, EN + TR). Titles and descriptions moved into a central PAGE_META map in docs/.vitepress/config.mjs; the existing transformPageData hook now injects pageData.title / pageData.description plus Open Graph / Twitter card meta tags per route, so VitePress page titles, Google search snippets, and social share cards stay unique per page. Promoted the ## Changelog H2 in CHANGELOG.md to an H1 so VitePress picks it up as the page title.
Symmetric language link on package READMEs: Every packages/*/README.tr.md already carried a 🇬🇧 English version pointer back to the English file, but the English READMEs had no link in the other direction. Added a matching 🇹🇷 Türkçe sürüm line right below the H1 on all seven English package READMEs (api, audit, core, gateway, mcp, queue, web) so Turkish readers landing on the English page have an obvious switch.
Native arm64 builder for septum-web in CI: The septum-web image was the only one in the publish matrix that ran a heavy Next.js production build during docker build. Under QEMU arm64 emulation that step had been stretching past two hours and OOM'ing silently on GitHub's hosted runners, blocking entire releases. Split septum-web out of the matrix into a fan-out publish-web job that builds linux/amd64 on ubuntu-latest and linux/arm64 on the free ubuntu-24.04-arm runner, plus a follow-up merge-web job that assembles the multi-arch manifest with docker buildx imagetools create. Pulled images are bit-for-bit identical to before — only the build path changes. The five other images (api, audit, gateway, mcp, septum + GPU variants) keep the QEMU path because their builds finish in seconds under emulation.

2026-04-22

Dedicated installation guide + compose-first quickstart: Added docs/installation.md / .tr.md — nine-section guide covering quickstart, system requirements, five supported topologies (full local stack, standalone demo, air-gapped zone, internet-facing zone, native dev), first-launch wizard, LLM providers, volumes, upgrade, troubleshooting, and uninstall. README quickstart sections shortened to one compose command + pointer at the new page; docs/readme.md / .tr.md index tables gain an Installation row. Compose becomes the blessed path for non-trivial installs because it ships Ollama bundled — the standalone single-container image is now positioned as the "hızlı deneme" (quick trial) demo rather than the recommended install.
Nav reshuffle: Top/bottom nav bars now show 🏠 Home · 🚀 Installation · 📈 Benchmark · ✨ Features · 🏗️ Architecture · 📊 Document Ingestion · 📸 Screenshots. Installation moves to second position (most-requested resource for new users); Benchmark precedes Features (the "proof before the pitch"). The 📝 Changelog entry leaves every nav — GitHub's repo sidebar and release pages already surface it.

2026-04-21

Split benchmark into its own page: Moved the benchmark section out of docs/features.md / .tr.md into dedicated docs/benchmark.md / .tr.md pages. Added a 📈 Benchmark entry to the top + bottom nav of every markdown file and a source-link block on the benchmark pages (HF model cards, dataset papers, regulation primary sources).
Nav cleanup: Dropped the 🤝 Contributing entry from every top/bottom nav bar; the GitHub sidebar already surfaces it.
Require explicit POSTGRES_PASSWORD / REDIS_PASSWORD in compose files: Replaced the septum_secret / septum_redis dev defaults with ${VAR:?...} — compose now fails fast if either is missing. Added .env.example as the canonical template; README, README.tr and CLAUDE.md updated to point at it. GitGuardian no longer flags the compose files.
Turkish literary polish on docs/tr/features.md: Rewrote calques and awkward inversions across the Detection Pipeline, Regulation Packs, Auto-RAG, Why Septum, MCP, and REST API sections.
Per-image Docker Hub overviews: Replaced the single DOCKERHUB.md (standalone-only, pushed to all six repos) with six role-specific READMEs under docker/readmes/ — each image now gets an Overview page that matches what's actually in it (air-gapped vs internet-facing zone badge, role-specific quick-start, transport options for septum-mcp, etc.). Workflow readme-filepath is now matrix-driven.

2026-04-20

Audit Trail entity provenance: every EntityDetection row now carries an audit_event_id FK back to the event that produced it. New GET /api/audit/{event_id}/entity-detections endpoint plus an entity_type query filter (via EXISTS correlated subquery). Frontend adds a "Focus on these entities" button on each audit card that opens the document preview highlighting only that event's detections, with an event-scoped navigator and an entity-type filter dropdown on the log itself.
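The entity_type filter can be illustrated with an EXISTS correlated subquery against an in-memory SQLite database. The schema is a simplified assumption: `audit_events` is a hypothetical table name, while `audit_event_id` and `entity_type` follow the entry.

```python
import sqlite3

SCHEMA = """
CREATE TABLE audit_events (id INTEGER PRIMARY KEY);
CREATE TABLE entity_detections (
    id INTEGER PRIMARY KEY,
    audit_event_id INTEGER REFERENCES audit_events(id),
    entity_type TEXT NOT NULL
);
"""

# EXISTS stops at the first matching detection, so an event with many
# detections of the same type is still returned exactly once.
EVENTS_WITH_TYPE = """
SELECT e.id
FROM audit_events AS e
WHERE EXISTS (
    SELECT 1
    FROM entity_detections AS d
    WHERE d.audit_event_id = e.id AND d.entity_type = ?
)
ORDER BY e.id
"""


def events_with_entity_type(conn: sqlite3.Connection, entity_type: str) -> list:
    """Return ids of audit events that produced at least one such detection."""
    return [row[0] for row in conn.execute(EVENTS_WITH_TYPE, (entity_type,))]
```

The same query shape replaces an IN (SELECT DISTINCT ...) filter without needing the DISTINCT pass.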
MCP over HTTP: septum-mcp now speaks all three standard MCP transports — stdio (default, unchanged), streamable-http, and sse. Bearer-token ASGI middleware gates non-loopback HTTP with constant-time comparison; /health always bypasses auth. CLI flags (--transport/--host/--port/--token/--mount-path) override the matching SEPTUM_MCP_* env vars. Docker image defaults to streamable-http on port 8765 with a healthcheck; docker-compose.yml and .airgap.yml gained an opt-in mcp profile.
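A bearer-token gate with constant-time comparison can be written as plain ASGI middleware. The sketch below is not septum-mcp's implementation: the class name, the `/mcp` path in the usage note, and the 401 body are assumptions, and the loopback exemption mentioned in the entry is deliberately left out.

```python
import asyncio
import hmac


class BearerTokenMiddleware:
    """Pure-ASGI gate: /health is always open, everything else needs the token."""

    def __init__(self, app, token: str):
        self.app = app
        self._expected = f"Bearer {token}".encode("utf-8")

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http" or scope["path"] == "/health":
            await self.app(scope, receive, send)
            return
        supplied = dict(scope.get("headers", [])).get(b"authorization", b"")
        # compare_digest does not short-circuit on the first differing byte.
        if hmac.compare_digest(supplied, self._expected):
            await self.app(scope, receive, send)
            return
        await send({"type": "http.response.start", "status": 401,
                    "headers": [(b"content-type", b"text/plain")]})
        await send({"type": "http.response.body", "body": b"unauthorized"})


async def _ok_app(scope, receive, send):
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


def status_for(path: str, authorization: str = "") -> int:
    """Drive the middleware once with a fake request and return the status."""
    sent = []

    async def send(message):
        sent.append(message)

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    headers = [(b"authorization", authorization.encode())] if authorization else []
    scope = {"type": "http", "path": path, "headers": headers}
    app = BearerTokenMiddleware(_ok_app, token="s3cret")
    asyncio.run(app(scope, receive, send))
    return sent[0]["status"]
```

`status_for("/health")` passes without a header, while `status_for("/mcp")` is rejected unless the matching `Bearer` value is supplied.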
Repo layout + architecture diagram refresh: ARCHITECTURE.md / .tr.md moved into docs/ next to FEATURES, screenshots/ renamed to assets/ (logo joined it), cross-file link references updated in lockstep. Replaced the Mermaid architecture diagram with hand-crafted SVGs (assets/architecture.svg + .tr.svg) — dashed zone borders, orthogonal L-routed queue → gateway arrow, paint-order halo labels, explicit response-path arrows — because Mermaid's auto-layout fought the 7-module topology.
Contributor onboarding: added CONTRIBUTING.md / .tr.md (dev setup, code style, PR process, security reporting) so GitHub surfaces the Contribute button. Extracted every screenshot into new docs/screenshots.md / .tr.md so README and FEATURES stay text-focused. README picked up a Roadmap section.
Simplify-pass cleanup after feature reviews: IN (SELECT DISTINCT) → EXISTS correlated subquery in the audit filter; N individual EntityDetection UPDATEs consolidated into one bulk statement; the new endpoint returns EntityDetectionListResponse for shape parity with its sibling; septum_mcp/config.py reuses parse_active_regulations_env instead of re-implementing it; DocumentPreview useEffect split so modal reopens don't refetch chunks / anon-summary / schema; orphan audit.card.viewEntities i18n keys deleted.
Readability pass on docs/tr/features.md: two rough spots introduced while drafting the Turkish deep-dive today. "Seçim retrieval'ı sürdürür" — I meant "drives" but picked the wrong verb; fixed to "retrieval seçilen dokümanlarda çalışır". "Doküman İngest" — mixed Turkish-dotted İ with an English stem that reads as a neologism; dropped to "Doküman Ingest" so the English tech term stays clean. No other calques detected after a full sweep.
Clean up pre-existing anglicism calques in ARCHITECTURE.tr.md: five calques that pre-date today's auto-RAG / TR-term commit but read as forced translations to the target audience. "ek bileşen / ek bileşeni" (calque of "extra") → "extra" / "extra'sı" — Python devs write pip install pkg[extra] in every language, so Turkish tech writing does the same. "Sözleşme gereği sıfır ağ bağımlılığı" (calque of "zero network deps by contract") → "Kod seviyesinde ağ bağımlılığı yok". "Boşta duran maliyet sıfıra yakın" (calque of "idle cost near zero") → "Boşta dururken neredeyse hiç maliyet üretmez". "Sıfır runtime bağımlılığı" → "runtime bağımlılığı yok". "Servis tanımları tekrarı kaldırılmıştır" (calque of "dedupes service definitions") → "servis tanımları tek yerde tutulur". Also fixed an "extra'sınin" double-genitive typo that the global ek bileşeni → extra'sı replacement introduced.
Update ARCHITECTURE docs for features that landed after the Phase 8 rewrite + stop translating technical zone terms in TR: Auto-RAG routing (103c6a0, 2026-04-18), the 17-pack default behavior (768d10b), and the core-side canonical regulation registry (6eb2335 + 7bcfca0) were never documented in ARCHITECTURE.md / ARCHITECTURE.tr.md — the last substantive update (1ffa995) pre-dated them. Three surgical edits in both locales: (a) the Policy Composition section now notes the RegulationId StrEnum + BUILTIN_REGULATION_IDS tuple + parse_active_regulations_env helper and the all-17-packs default in the standalone SeptumEngine; (b) the AI Privacy Gateway section gains an "Auto-RAG routing" subsection describing the three chat paths (manual / auto / none), the Ollama intent classifier, the rag_relevance_threshold setting, and the rag_mode + matched_document_ids SSE meta fields; (c) the septum-core package internals entry spells out per-pack ENTITY_TYPES constants and locates the canonical registry in recognizers/__init__.py. Across all TR docs (README.tr.md, ARCHITECTURE.tr.md), "hava boşluklu" and "internet-yönlü / İnternete açık" are reverted to "air-gapped" and "internet-facing" — the same rule the codebase already applies to "framework" / "gateway" / "worker" / "broker": technical zone terms stay English in Turkish tech writing; forced calques like "hava boşluklu" read as jokes to the target audience. Heading count still matches 1:1 across EN/TR.
Split READMEs into a slim overview + docs/features.md deep-dive: Both READMEs were 768 / 772 lines of dense prose — too long to scan and mostly duplicated what belonged in feature reference docs. Trimmed to ~275 lines each: hook, five-step flow, the single 7-module architecture mermaid, a compact feature list, two money-shot screenshots (setup wizard + approval gate), and a Docker quick start. The 17-regulation table, detection benchmark, Auto-RAG walkthrough, Why-Septum comparison, MCP integration, REST API + auth reference, and the full UI gallery (document preview GIF + 5 settings PNGs + audit trail) moved to new docs/features.md + docs/tr/features.md. The Turkish versions were rewritten as native Turkish (not a word-for-word translation): sentence order, idiom, and mermaid diagram labels — "Kullanıcı sorusu", "Maskeli istek", "Maskeleme + Map", "Köprü" — all now read naturally instead of betraying an English source. Net: 1540 lines of README became 1380 lines across two files per locale, with mermaid diagram count going from 5 to 8 and with the navigation structure matching exactly across EN/TR.

2026-04-19

Fix MCP/standalone dropping NER entity types + canonical regulation registry in core: MCP and standalone SeptumEngine were silently dropping PERSON_NAME/LOCATION/ORGANIZATION_NAME because the 17 packs' entity lists only existed in the API seed. Moved to core as per-pack ENTITY_TYPES constants plus RegulationId StrEnum and BUILTIN_REGULATION_IDS; MCP/API defaults now load all 17 packs, cross-pack duplicate recognizers are deduped on load (46 → 29 per mask call), and every downstream import goes through core — including a shared parse_active_regulations_env helper that replaces three duplicated env-parsing blocks. PATCH /api/settings now rejects typo'd regulation ids (previously stored silently and later filtered to an empty policy).
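The registry and the env-parsing helper can be sketched as below. Several details are assumptions made for illustration: the GDPR and HIPAA members, the comma-separated format, the helper taking the raw string instead of reading the environment, and raising ValueError on unknown ids.

```python
from enum import Enum
from typing import Optional, Tuple


class RegulationId(str, Enum):
    # The entry describes a StrEnum; (str, Enum) behaves alike on older Pythons.
    KVKK = "kvkk"
    GDPR = "gdpr"      # illustrative member, not confirmed by this changelog
    HIPAA = "hipaa"    # illustrative member, not confirmed by this changelog


BUILTIN_REGULATION_IDS: Tuple[RegulationId, ...] = tuple(RegulationId)


def parse_active_regulations(raw: Optional[str]) -> Tuple[RegulationId, ...]:
    """Unset or blank means every built-in pack; typos fail loudly."""
    if raw is None or not raw.strip():
        return BUILTIN_REGULATION_IDS
    selected = []
    for item in raw.split(","):
        name = item.strip().lower()
        if not name:
            continue
        try:
            regulation = RegulationId(name)
        except ValueError:
            raise ValueError(f"unknown regulation id: {name!r}") from None
        if regulation not in selected:
            selected.append(regulation)
    return tuple(selected)
```

Validating against the enum at parse time avoids the failure mode in the entry, where a misspelled id was stored and later filtered down to an empty policy.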
Dead-code sweep (~500 LOC removed): 23 unused Python imports across tests and one unused local in benchmark_detection.py (ruff); 5 unused API client functions, 1 unused type, and 3 orphan components (theme.tsx, AuthGuard.tsx, TextNormalizationTab.tsx) from the Next.js package; empty packages/api/documents/ directory; and 9 stale backend/* lines in .gitignore left behind after the Phase 8 shim removal.
Fix dev.sh --reset leaving live processes, SQLite sidecars, and webpack cache behind: reset now pkills the running uvicorn/next dev servers before wiping so aiosqlite worker threads release the DB handle (otherwise the next uvicorn hit disk I/O error on the first PRAGMA table_info), strips septum.db-wal/septum.db-shm and the legacy packages/api/septum.db* copy from before the Phase 8 working-directory change, and nukes packages/web/.next + .turbo + node_modules/.cache so a dev server that saw a source file deleted mid-session doesn't keep serving the old chunk with ChunkLoadError.
Fix 403 on whisper endpoints during setup wizard: the 04-17 hardening pass guarded /api/setup/whisper-status and /api/setup/install-whisper with _require_setup_phase(), but the wizard calls them after /api/setup/initialize has already flipped database_configured=true — so needs_setup() returns False and the guard rejects the very flow that needs them. Removed the guard from both whisper endpoints (downloading a public OpenAI model is not a security boundary); other guarded endpoints (test-database, test-redis, check-update) keep their guards.
Fix ChunkLoadError on first dev-server load after --reset: the 04-17 npm audit fix --force bumped Next.js 16.1.6 → 16.2.4, which regressed the deprecated --webpack dev path with a cold-compile race where the HTML was served before its chunks finished emitting. Switched packages/web/package.json dev script to Next 16's default turbopack (NODE_OPTIONS=--max_old_space_size=4096 next dev); webpack was originally chosen in 2026-03 for lower RAM, but turbopack on 16.2 measurably fits in the same 4 GB heap.
Fix QueuePool limit of size 5 overflow 10 reached during parallel document upload: the SQLite async engine was using SQLAlchemy's default pool (5 + 10 = 15 connections), which saturated as soon as two ingestion background tasks pinned a bg_db session each for the full OCR/Whisper pipeline while the frontend simultaneously uploaded 4 files in parallel and polled /documents/progress + /auth/me + /settings + /regulations from a freshly opened dashboard. Raised SQLite-dialect pool_size to 20 and max_overflow to 30 (50 total) in packages/api/septum_api/database.py::_engine_kwargs; SQLite connections are just cheap file handles and WAL already serialises the real contention point (single-writer lock), so the oversized pool costs nothing. Postgres kwargs untouched.
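The dialect-specific pool sizing amounts to a small dispatch on the database URL. The sketch below is hypothetical: it only mirrors the numbers in the entry and returns an empty dict for other dialects as a placeholder, since their settings are not given here.

```python
def engine_kwargs(database_url: str) -> dict:
    """Return extra create-engine keyword arguments for the given URL."""
    if database_url.startswith("sqlite"):
        # 20 pooled + 30 overflow = 50 connections, per the entry above.
        return {"pool_size": 20, "max_overflow": 30}
    # Placeholder: other dialects keep whatever the project already configures.
    return {}
```

For example, `engine_kwargs("sqlite+aiosqlite:///./septum.db")` yields the enlarged pool, while a PostgreSQL URL is left alone.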

2026-04-17

Fix web proxy target under docker-compose: Next.js bakes rewrites() destinations at build time, so the runtime BACKEND_INTERNAL_URL env var had no effect — the dashboard proxied every /api call to its own 127.0.0.1:8000 instead of the api container. Converted to a build-arg in docker/web.Dockerfile; both compose files now pass http://api:8000 at build.
Add Alembic migration 013 for use_gateway column: The Phase 5 model addition lacked a PostgreSQL migration, causing UndefinedColumnError on the first SELECT after the setup wizard. Adds the column as BOOLEAN NOT NULL DEFAULT false.
Eliminate AI rule duplication across tools: Removed .claude/rules/ (5 files), .cursor/rules/ (12 .mdc files), .cursor/skills/, and the new-ingester/new-recognizer/new-regulation skill templates. CLAUDE.md is now the single source of truth for all AI tools; Cursor gets a 3-line .cursorrules pointer. New tool = one pointer file, zero content duplication.
Add Redis authentication + eliminate curl|bash supply chain risk: All four compose files now start Redis with --requirepass and every redis:// URL includes the password via ${REDIS_PASSWORD:-septum_redis} substitution. The standalone Dockerfile's curl | bash NodeSource install is replaced with a multi-stage COPY --from=node:20-alpine — no remote scripts executed at build time.
Security hardening (18 findings from /security-scan): Bumped 8 vulnerable Python deps (cryptography, PyJWT, python-multipart, pillow, langchain-*, pdfminer-six, pytest) and fixed 7 npm picomatch vulns via npm audit fix --force. Added SecurityHeadersMiddleware (X-Content-Type-Options, X-Frame-Options, Referrer-Policy, Permissions-Policy). Replaced pickle.loads with json.loads in BM25 retriever (eliminates code execution risk). Guarded 5 setup router endpoints with _require_setup_phase() so test-database, test-redis, whisper-status, install-whisper, and check-update return 403 after setup completes (closes SSRF vector). Changed default log level from DEBUG to INFO. Parameterized Postgres password in compose files via ${POSTGRES_PASSWORD:-septum_secret}, bound Postgres port to localhost only, added *.pem/*.key to .gitignore, and applied os.chmod(0o600) to config.json writes.
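The pickle → json change in the BM25 retriever can be illustrated with a round-trip of a tokenized corpus. The file name and the `{"corpus": ...}` layout are assumptions; the real persisted state may hold more fields.

```python
import json
import tempfile
from pathlib import Path


def save_corpus(path: Path, tokenized_corpus) -> None:
    """Persist the tokenized corpus as plain JSON text."""
    path.write_text(json.dumps({"corpus": tokenized_corpus}), encoding="utf-8")


def load_corpus(path: Path):
    # json.loads only builds dicts, lists, strings, numbers, booleans and None;
    # unlike pickle.loads it cannot be steered into running code.
    return json.loads(path.read_text(encoding="utf-8"))["corpus"]


def roundtrip(tokenized_corpus):
    """Write to a temporary file and read the corpus back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "bm25.json"
        save_corpus(path, tokenized_corpus)
        return load_corpus(path)
```

The trade-off is that only JSON-representable state survives, so anything derived (scores, index structures) has to be rebuilt from the corpus on load.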
Promote Ollama to a default service in docker-compose.yml: Ollama now starts alongside Postgres and Redis on docker compose up instead of requiring --profile ollama. Added OLLAMA_BASE_URL + healthcheck-gated depends_on to the api service. New docker-compose.no-ollama.yml override disables it via !reset null for cloud-only deployments.

2026-04-16

CPU / GPU image variants for torch-dependent images: Split byerlikaya/septum and byerlikaya/septum-api into CPU (default, multi-arch, ~250 MB torch wheel) and GPU (linux/amd64 only, full CUDA runtime) variants via a TORCH_VARIANT build-arg. CPU keeps :latest + rolling tags; GPU floats on :gpu with -gpu-suffixed rolling aliases. Other images (web, gateway, audit, mcp) stay CPU-only — docker/mcp.Dockerfile now explicitly calls install-torch.sh cpu before the editable install of packages/core[transformers], so the mcp image no longer accidentally drags in ~5 GB of CUDA shared libs through the default PyPI torch wheel (GPU offers no benefit for the stdio-attached short-call pattern; users needing GPU-accelerated NER run septum-api instead). Local ./dev.sh is unaffected — it picks up whatever torch variant PyPI serves for the host (CUDA on NVIDIA Linux, MPS on Apple Silicon, CPU elsewhere).
Git-tag based release process + modular Docker publish: Replaced the VERSION file + [release] commit gating + bot auto-bump with a git-tag trigger (git tag v0.2.0 && git push --tags). Workflow now builds all six images via an 8-entry matrix, stamps OCI labels, and creates a GitHub Release with autogenerated notes. VERSION is a build-arg; _app_version() reads the stamped /app/VERSION or falls back to 0.0.0-dev. CLAUDE.md grows a "Release process" section.
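The version lookup described for _app_version() can be sketched as below. The function here takes the path as a parameter for testability and treats an empty file as unstamped; both choices are assumptions rather than the project's confirmed behaviour.

```python
import tempfile
from pathlib import Path

FALLBACK_VERSION = "0.0.0-dev"


def app_version(version_file: Path = Path("/app/VERSION")) -> str:
    """Read the version stamped at build time, or fall back for dev runs."""
    try:
        stamped = version_file.read_text(encoding="utf-8").strip()
    except OSError:
        return FALLBACK_VERSION
    return stamped or FALLBACK_VERSION


def demo() -> tuple:
    """Show both paths: a stamped file and a missing one."""
    with tempfile.TemporaryDirectory() as tmp:
        stamped = Path(tmp) / "VERSION"
        stamped.write_text("1.0.0\n", encoding="utf-8")
        return app_version(stamped), app_version(Path(tmp) / "missing")
```

A local checkout without the stamped file reports the dev fallback, while an image built from a tag reports the tag's version.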
Remove the backend/ shim layer entirely (Phase 8): Seven months after the Phase 3a scaffold introduced the backend/app/* re-export shims, every caller has migrated to direct septum_api.* imports and the shim layer is now dead weight. Phase 8 lands in six ordered slices: (6.1) backend/tests/ (33 files + conftest + factories) moves to packages/api/tests/ via git mv with history preserved, every from app.X / from backend.app.X import rewritten to from septum_api.X (including string patch paths in @patch decorators), backend/pytest.ini merged into packages/api/pyproject.toml [tool.pytest.ini_options] filterwarnings block, BACKEND_ROOT + sys.path.insert removed from conftest (editable install resolves septum_api cleanly), two __init__.pys removed to fix the tests namespace collision between core and api, test_policy_composer.py → test_policy_composer_api.py rename. (6.2) backend/alembic/, backend/alembic.ini, backend/scripts/ move to packages/api/ with env.py + scripts/*.py + docker-entrypoint.sh import paths rewritten to septum_api.*. (6.3) backend/requirements.txt → packages/api/requirements.txt as a pure rename (extras split deferred to a follow-up refresh). (6.4) 10 shim files under backend/app/**/__init__.py + backend/app/{bootstrap,config,database,main}.py deleted (forwarders that had no callers left). (6.5) 3 Dockerfiles (api, standalone, root Dockerfile) + .github/workflows/tests.yml + dev.sh + .claude/pre-commit-check.sh repointed at the new paths. WORKDIR /app/backend → WORKDIR /app/packages/api; PATH=/app/backend/.venv → PATH=/app/.venv; standalone start.sh cd /app/backend → cd /app/packages/api; CI dropped working-directory: backend and switched Ruff/Bandit to scan all 6 package source roots. Docker builds verified: api and standalone both produce working images with from septum_api.main import app loading cleanly. (6.6) backend/ directory removed; remaining tracked content (docs/REGULATION_ENTITY_SOURCES.md, migrations/002_add_chunk_field_metadata.sql, .coveragerc, .dockerignore) relocated or deleted; docstring sweep updated 16 recognizer files + 9 rule/skill files + ARCHITECTURE.md + ARCHITECTURE.tr.md + CLAUDE.md + both READMEs to point at modular paths. 446 / 446 backend+modular tests + 18 / 18 frontend Jest tests pass at every step.
Tighten Phase 8 hygiene after simplify-pass review: F1 — fix .coveragerc source = app (stale shim reference) → source = septum_api; coverage runs were silently measuring nothing since the Phase-8 move. F2 — add the missing CHANGELOG entry for Phase 8 itself (project rule: changelog update in the same commit as the code change). F3 — sweep the remaining backend/-era narrative out of ten file-docstrings, comment blocks, and skill files: septum_api/__init__.py ("the real FastAPI app still lives in backend/app/"), packages/api/pyproject.toml ("heavy libraries stay in backend/"), packages/api/README.md (Phase-3a shim narrative in "Status" block), utils/text_utils.py + services/anonymization_map.py + services/national_ids/__init__.py (docstrings still claimed "re-exports for from app.* imports"), .claude/pre-commit-check.sh ("Both the legacy backend/app path and the new…" comment where only one path now exists), scripts/reprocess_entity_detections.py (BACKEND_DIR var + "launched from backend/" comment), scripts/test_ollama_layer.py ("Run from backend:" usage hint), tests/test_recognizer_packs.py (docstring pointing at backend/app/seeds/), and 4 .claude/skills/*/SKILL.md + 1 .cursor/skills/*/SKILL.md still instructing users to edit backend/tests/test_*.py paths that no longer exist. F4 — root Dockerfile was a 158-line byte-identical copy of docker/standalone.Dockerfile (the two files contradicted each other: the standalone header said standalone was canonical, yet the root was never deleted). Removed the root copy and repointed .github/workflows/docker-publish.yml to ./docker/standalone.Dockerfile directly — a symlink was considered and rejected because Windows checkouts with core.symlinks=false turn it into a text file that breaks the Docker build. F5 — packages/api/docs/REGULATION_ENTITY_SOURCES.md moved to packages/core/docs/ (its natural home — the 16 regulation recognizer docstrings in septum-core all point at it, and septum-core is a zero-dependency package that cannot reference api-side paths without inverting the zone wall); 16 recognizer docstrings, both READMEs, ARCHITECTURE docs, CLAUDE.md, .claude/rules/git-and-changelog.md, and .claude/pre-commit-check.sh all updated in lockstep. F6 — CI backend-tests job installed all 6 packages (core + queue + api + mcp + gateway + audit); dropped mcp / gateway / audit since packages/api/tests only exercises api's dependency closure and those packages are covered by the separate modular-tests job. Skipped: A2-F10 (alembic legacy/ subdir naming — cosmetic), A2-F11 (tests/factories/__init__.py blank file — confirmed legitimate by pytest import resolution), A2-F12 (extract install-all-packages.sh from dev.sh + CI — bigger refactor, low urgency), A2-F13 (narrative Dockerfile comment — marginal), A3-F2/F3/F5/F6/F7/F9/F10 (premature/microscopic for Septum's scale).
Tighten Phase 7 hygiene after simplify-pass review: F1 — extract the duplicated _build_queue(topic) helper from septum_gateway/worker.py + septum_audit/worker.py into a single backend_from_env(topic) in septum_queue/__init__.py. Redis URL vs file-dir dispatch now lives in one place; both workers shrink to a single-line backend construction. septum-queue is the natural home because it already owns QueueBackend and both workers already depend on it — no zone-wall violation. F2 — drop dead if TYPE_CHECKING: from septum_queue import QueueBackend blocks from both workers (forward-ref only needed when the annotation is stringified, but from __future__ import annotations already makes all annotations strings). F3 — drop unused argv: list[str] | None = None parameter from both main() entry points (speculative API surface; __main__.py always calls main() with no args). F4 — shrink both worker module docstrings from 3-4 paragraphs to a single-line summary (CLAUDE.md strict rule; the SystemExit error message already documents the env var contract at the point where operators actually hit it). F5 — drop the curl apt-get install from docker/gateway.Dockerfile and docker/audit.Dockerfile and standardize every HTTP HEALTHCHECK (both Dockerfile HEALTHCHECK directives and matching compose test: entries) on python -c "import urllib.request; urllib.request.urlopen('http://.../health')" — python is already in the runtime image, curl install was ~3 MB of wasted layer weight. F6 — add YAML anchors (x-gateway-base, x-audit-base + inline env &gateway-env / &audit-env) to docker-compose.gateway.yml so the gateway-worker / gateway-health pair and audit-worker / audit-api pair no longer duplicate their 10-line build / image / env blocks; ~40 lines removed from the file. F7 — drop the 3-line decorative preamble from .dockerignore (unverifiable "shave several hundred MB" claim) and the narrative # Default: run the worker. Override CMD with uvicorn... 
comment from docker/gateway.Dockerfile (restates what compose files already do). Tests: the 5 test_gateway_worker.py + test_audit_worker.py cases are deleted and replaced by a single packages/queue/tests/test_backend_from_env.py covering file-dir dispatch, missing-env SystemExit, and Redis URL precedence — one test file for one shared function. Skipped: A1-F2 / A2-F3 run_consumer_forever extraction (would couple septum-queue to consumer semantics, scope creep), A2-F5 except Exception narrowing around queue.close() (code already logs via exc_info=True — not truly silent; A3 confirmed no leak), A2-F10 fail-fast on missing API keys (envelope-first deployment is a legitimate pattern), A2-F15 standalone heredoc extraction (drift risk with top-level Dockerfile), A2-F17 shared Dockerfile base (per-image specialization diverges immediately after FROM), A3-F4 drop build-essential from api.Dockerfile (untested build, risk > reward). 10 files changed, +83 / −253 net. 446 backend+modular + 18 frontend tests pass (two worker test files removed; functionality covered by the 3 backend_from_env tests).
Mark Phase 7 complete in PROJECT_SPEC (Phase 7 — closeout): PROJECT_SPEC.md Phase 7 flips to "✓ DONE" with per-item check marks — 6 Dockerfiles (api + web + gateway + audit + mcp + standalone) + .dockerignore, 4 compose variants (full dev + airgap + gateway + standalone) verified via docker-compose config, Docker HEALTHCHECK on every HTTP image, worker CLI entrypoints (python -m septum_gateway / -m septum_audit) + SIGINT/SIGTERM graceful shutdown. All 7 phases complete: monolithic Septum is now split into 7 independent modules (core, mcp, api, web, queue, gateway, audit), each with its own pyproject.toml / Dockerfile and deployable on its own. Air-gapped zone (core + mcp + api + web) has no internet access; bridge (queue) carries masked data only; internet-facing zone (gateway + audit) cannot import septum-core by code-review invariant. 448/448 backend+modular + 18/18 frontend Jest tests pass.
Document deployment topologies in both READMEs + CLAUDE.md (Phase 7 — docs slice): New "Deployment Topologies" subsection under "Docker Compose" in README.md / README.tr.md gives a 4-row table mapping each compose variant (standalone / full dev / airgap / gateway) to its host count, zone-split property, and when-to-use guidance; follow-up paragraph explains that a true air-gapped deployment runs airgap.yml + gateway.yml on two hosts pointing at the same Redis over a VPN and that only masked text crosses the queue. Per-module Dockerfile list added so operators running custom orchestrators (K8s / Nomad / ECS) know which images they can cherry-pick. CLAUDE.md Docker block expanded to show the three modular compose commands alongside the standalone docker run and docker compose up; the "refactor/modular-architecture branch" caveat is dropped (the files are now on main-track). README structure parity preserved across both locales: 79 table rows, 18 H3 headings in each.
Add 4 compose variants covering every deployment topology (Phase 7 — compose slice): docker-compose.airgap.yml runs only the air-gapped zone (api + web + postgres + redis); USE_GATEWAY_DEFAULT=true + SEPTUM_QUEUE_URL=redis://redis:6379/0 wire the api into gateway mode so cloud LLM calls leave the host only via Redis Streams. docker-compose.gateway.yml runs only the internet-facing zone (gateway-worker + gateway-health + audit-worker + audit-api + redis); two containers share each image — one runs python -m septum_gateway / -m septum_audit (stdio worker), the other runs uvicorn septum_gateway.main:create_app --factory / septum_audit.main:create_app for /health + /api/audit/export. docker-compose.standalone.yml is the all-in-one (one container from docker/standalone.Dockerfile, SQLite, no external services) — the simplest install. The existing docker-compose.yml is rewritten as the full dev stack on one host: every module + postgres + redis + optional Ollama profile; it's the fastest path from git clone to a working local Septum. Every compose file has depends_on: condition: service_healthy so a fresh up waits for Redis / Postgres before the first consumer starts — no sleep 10 shell races. All four files validate clean via docker-compose config.
Split the monolithic image into per-module Dockerfiles (Phase 7 — Dockerfile slice): New docker/api.Dockerfile ships only the Python backend (FastAPI + SQLAlchemy + Presidio + torch); the Next.js dashboard moves to docker/web.Dockerfile, which takes a NEXT_PUBLIC_API_BASE_URL build-arg so a split deployment can point the built frontend bundle at a separate api origin. New docker/gateway.Dockerfile and docker/audit.Dockerfile are deliberately lightweight — no torch, no Presidio, no spaCy; just septum-queue[redis] + septum-gateway[server] / septum-audit[queue,server] — so each internet-facing image lands at roughly 100 MB instead of the roughly 4 GB the api image needs. Critical invariant: neither the gateway nor the audit Dockerfile copies packages/core/, enforcing the "no septum-core in the internet-facing zone" rule at the image layer. New docker/mcp.Dockerfile bundles septum-mcp + septum-core[transformers] for the rare case where an orchestrator runs the stdio MCP server as a subprocess container (most users still use uvx septum-mcp locally, so no MCP HEALTHCHECK — the parent orchestrator observes subprocess exit codes). The legacy top-level Dockerfile keeps building the combined standalone image unchanged (still published as byerlikaya/septum:latest); a matching docker/standalone.Dockerfile is the canonical modular-naming copy. New .dockerignore at the repo root shaves a few hundred MB off every build context by excluding .git/, node_modules/, .next/, backend runtime state dirs, local *.db, docs screenshots, and the docker/ + compose files themselves (never needed inside an image).
Add worker CLI entrypoints for septum-gateway and septum-audit (Phase 7 — worker slice): New septum_gateway/worker.py + __main__.py and septum_audit/worker.py + __main__.py so python -m septum_gateway / python -m septum_audit boot a long-running process — the prerequisite for Dockerizing either module. The queue backend is picked from the environment: SEPTUM_QUEUE_URL=redis://host:6379/0 selects RedisStreamsQueueBackend.from_url(...), SEPTUM_QUEUE_DIR=/srv/septum/queue selects FileQueueBackend(...); the Redis URL wins when both are set. If neither is set, the worker raises SystemExit with a clear error: an air-gapped deployment without a declared queue is almost always a misconfiguration, and silent defaulting would mask it. Graceful shutdown runs through SIGINT / SIGTERM handlers that set an asyncio.Event, cancel the run_forever task, and close every queue/sink handle in a finally block. 5 new tests cover file-backend selection, missing-env → SystemExit(SEPTUM_QUEUE_URL), and Redis URL taking precedence (using fakeredis.aioredis monkeypatched into redis.asyncio.Redis.from_url). test_worker.py renamed to test_gateway_worker.py / test_audit_worker.py to avoid the Phase 5/6 basename-collision pattern.
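The environment dispatch described in this entry can be sketched as follows. This is a minimal illustration, not the shipped worker code: FileBackend, RedisBackend, and pick_backend are hypothetical stand-ins, and treating whitespace-only values as unset is an assumption.

```python
from __future__ import annotations

import os


class FileBackend:
    """Hypothetical stand-in for a directory-backed queue."""

    def __init__(self, root: str, topic: str) -> None:
        self.root, self.topic = root, topic


class RedisBackend:
    """Hypothetical stand-in for a Redis-backed queue."""

    def __init__(self, url: str, topic: str) -> None:
        self.url, self.topic = url, topic


def pick_backend(topic: str, env: dict[str, str] | None = None):
    """Choose a backend from the environment; the Redis URL wins when both are set."""
    env = dict(os.environ) if env is None else env
    # Assumption: blank values count as unset.
    url = env.get("SEPTUM_QUEUE_URL", "").strip()
    directory = env.get("SEPTUM_QUEUE_DIR", "").strip()
    if url:
        return RedisBackend(url, topic)
    if directory:
        return FileBackend(directory, topic)
    # Fail fast instead of silently defaulting to some queue location.
    raise SystemExit(
        "Set SEPTUM_QUEUE_URL or SEPTUM_QUEUE_DIR before starting the worker"
    )
```

Passing the environment as a plain dict keeps the selection logic testable without monkeypatching os.environ.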
Tighten Phase 6 hygiene after simplify-pass review: F1 — drop a dead if TYPE_CHECKING: pass block and an Optional[QueueBackend] import in gateway/response_handler.py; X | None is the codebase standard. F2 — drop _import_queue_backend() from audit/consumer.py; the fail-fast wrapper was redundant because any deployment without septum-queue installed already fails earlier at the from septum_queue import ... caller site, and the test suite's pytest.importorskip("septum_queue") handles the dev path. F3 — collapse the three exporters' copy-pasted to_string() / write() bodies into a shared BaseExporter(ABC) with a single iter_chunks(records) -> Iterator[str] streaming primitive; concrete exporters now only implement iter_chunks (~15 LOC saved, removes three from io import StringIO imports, and makes the next exporter addition one subclass + two class attrs). F4/F5 — strip narrating module / class docstrings across every new Phase 6 module (events / sink / retention / config / consumer / main / exporters / gateway response_handler); per CLAUDE.md the rule is "docstrings required, but redundant/obvious/decorative comments must be avoided", and every module had 2–6 sentences of backstory that belonged in the README or commit message. Inline comments that restated the code (e.g. "# Run the blocking write under to_thread so the async consumer loop is never stalled", "# Nothing to release; kept to satisfy the protocol") are deleted; only genuine non-obvious WHY lines are kept (PIPE_BUF / logrotate note on JsonlFileSink, "truncated tail of an actively-written log" note on _iter_records). F6 — /api/audit/export format query param is now typed as Literal["jsonl", "csv", "siem"] so FastAPI surfaces it as an OpenAPI enum and returns a 422 with the allowed set for invalid values; the try/except KeyError branch and its HTTPException(400) are gone. F7 — MemorySink(initial_records=...) 
constructor replaces the sink._records.append(r) private-attribute poke that test_app.py::_build was doing; the helper now seeds the sink through the public constructor. F8 — rename test_consumer_drops_malformed_payload_without_crashing to test_consumer_nacks_when_sink_write_fails (the test was actually exercising the sink-failure → re-queue path, not malformed payloads) and delete the 6-line narrative comment block explaining the mismatch. F9 — delete JsonlFileSink.aread_all and its test; no production caller used it, and a sync read_all() already works for the FastAPI export path via asyncio.to_thread. F10 — /api/audit/export switches from a Response(content=full_body) to StreamingResponse driven by a _stream_export async generator that pulls each chunk through asyncio.to_thread so large dumps never block the event loop and never materialize the whole body in memory; the X-Audit-Record-Count header is dropped (cannot be known before the stream starts). F11 — replace if not self._path.exists(): return iter(()) in JsonlFileSink.read_all and apply_retention_to_jsonl with try: open(...); except FileNotFoundError: return iter(()) / return 0; removes the TOCTOU race and one stat syscall per call. Skipped: A1-F3 (_now() / _new_id() duplication with septum_queue.models — justified by the zero-dep contract of each package), A1-F5 (lazy __getattr__ triplication across audit/queue/gateway — justified by per-package independence), A2-F8 (moving AuditRecord shape into septum-queue to unify the gateway's hand-rolled audit dict — cross-package refactor, defer), A2-F9 (status vs event_type redundancy — both fields are real filter axes for downstream consumers), A3-F2/F3/F4/F6/F7 (premature optimization at chat-scale traffic). 15 files changed, +148 / −411 net. 443 backend+modular tests pass.
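The F3 refactor (one streaming primitive, shared to_string() / write() bodies) might look roughly like this. It is a sketch under assumptions: the two class attributes are guessed to be content_type and extension, and NdjsonExporter is a hypothetical subclass, not the package's JsonExporter.

```python
from __future__ import annotations

import io
import json
from abc import ABC, abstractmethod
from typing import Any, Iterable, Iterator, TextIO


class BaseExporter(ABC):
    """Shared export plumbing; subclasses only provide iter_chunks."""

    # Hypothetical class attributes; the changelog does not name the real ones.
    content_type: str = "application/octet-stream"
    extension: str = "bin"

    @abstractmethod
    def iter_chunks(self, records: Iterable[dict[str, Any]]) -> Iterator[str]:
        """Yield the export one text chunk at a time."""

    def to_string(self, records: Iterable[dict[str, Any]]) -> str:
        return "".join(self.iter_chunks(records))

    def write(self, records: Iterable[dict[str, Any]], stream: TextIO) -> None:
        for chunk in self.iter_chunks(records):
            stream.write(chunk)


class NdjsonExporter(BaseExporter):
    content_type = "application/x-ndjson"
    extension = "jsonl"

    def iter_chunks(self, records):
        # One JSON document per line.
        for record in records:
            yield json.dumps(record, sort_keys=True) + "\n"


buffer = io.StringIO()
NdjsonExporter().write([{"id": "a"}, {"id": "b"}], buffer)
```

Because iter_chunks is a generator, the same primitive can feed a streaming HTTP response without building the whole body first.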
Document the Phase 6 split in README.md / README.tr.md / CLAUDE.md / PROJECT_SPEC.md: Both READMEs flip septum-audit from "Planned" / "Planlanıyor" to "Released" / "Yayında" in the Package Layout table, with the audit row noting the JSON / CSV / Splunk HEC exporters, age + count retention, the optional queue consumer ([queue] extra), the FastAPI /health + /api/audit/export ([server] extra), and the no-septum-core-import invariant. CLAUDE.md Modular Packages command block grows install lines for [queue] / [server] extras and a pytest packages/audit/tests/ invocation. PROJECT_SPEC.md flips Phase 6 ("Faz 6" in the spec) to "✓ TAMAMLANDI" (Turkish for "completed") with per-item check marks plus a final tally line. README structure parity preserved (identical heading and table-row counts across both locales).
Add FastAPI /health and /api/audit/export endpoints behind the [server] extra (Phase 6 — audit FastAPI slice): New septum_audit/main.py exposes create_app(config, *, sink=None) mirroring the gateway factory pattern — deployment code passes a sink (or accepts the default JsonlFileSink(cfg.sink_path)) and the FastAPI dependency itself is lazy-imported so a queue-only audit deployment never pulls fastapi/uvicorn. /health reports the configured audit_topic / sink_path / supported formats so an operator can probe wiring without touching the audit log. /api/audit/export?format=jsonl|csv|siem streams the current sink contents through the matching exporter, sets the right Content-Type (application/x-ndjson / text/csv / application/json) and Content-Disposition (attachment; filename="septum-audit.{ext}"), and adds an X-Audit-Record-Count header so a downstream pipeline can verify the dump size without parsing the body. The format string is matched via a single _EXPORTERS dict so adding Loki line protocol or OTLP later is a one-line change. Unknown formats return 400 with the choice list rather than a generic 422; format matching is case-insensitive. 7 new tests cover health, jsonl/csv/siem export shapes (including content-type / disposition / record-count assertions), unknown-format 400, case insensitivity, and empty-sink zero-count behavior. test_app.py kept under that name since no other audit/gateway/queue test shares the basename.
Wire optional audit hook into GatewayConsumer so each handled request emits a PII-free telemetry envelope (Phase 6 — gateway audit hook slice): New audit_queue: QueueBackend | None parameter on GatewayConsumer; when set, every handled message produces a small dict on the audit topic after the response publish completes — source: "septum-gateway", event_type: "llm.request.completed" or "llm.request.failed", correlation_id, plus an attributes block carrying provider, model, status, latency_ms (monotonic-clock measured around the forwarder call), message_count, optional max_tokens, and the error string on failure. Discipline check: the envelope contains no prompt content, no response text, no api keys, and no base URLs — only metadata an internet-facing observer is allowed to see — so the gateway can stream telemetry into septum-audit without breaking the no-PII-leaves-the-zone invariant. Audit-side failures are swallowed (warning-logged via logging.exception-style, never raised) so a transient audit-queue outage cannot block the primary request/response path. GatewayConfig grows an audit_topic: str | None field plus SEPTUM_GATEWAY_AUDIT_TOPIC env loading (empty string treated as unset); /health reports the topic so operators can tell whether telemetry is wired in. 6 new tests cover the no-audit-queue zero-overhead default, the success-path envelope shape (including the explicit "no messages / no text / no api_key" PII-discipline assertion), the error-envelope variant with the error field populated, and a broken-audit-queue resilience test that asserts the response still goes out and a warning is logged.
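A sketch of the telemetry envelope described above, built from metadata only. The function name build_audit_envelope and the status values "ok" / "error" are assumptions; the field names follow the entry.

```python
from __future__ import annotations

import time
from typing import Any


def build_audit_envelope(
    *,
    correlation_id: str,
    provider: str,
    model: str,
    message_count: int,
    started: float,
    max_tokens: int | None = None,
    error: str | None = None,
) -> dict[str, Any]:
    """Build a PII-free audit dict; `started` is a time.monotonic() reading."""
    attributes: dict[str, Any] = {
        "provider": provider,
        "model": model,
        # Assumption: the real status vocabulary may differ.
        "status": "error" if error is not None else "ok",
        "latency_ms": int((time.monotonic() - started) * 1000),
        "message_count": message_count,
    }
    if max_tokens is not None:
        attributes["max_tokens"] = max_tokens
    if error is not None:
        attributes["error"] = error
    return {
        "source": "septum-gateway",
        "event_type": (
            "llm.request.failed" if error is not None else "llm.request.completed"
        ),
        "correlation_id": correlation_id,
        "attributes": attributes,
    }
```

The function never receives prompt or response text, so the no-PII property holds by construction rather than by filtering.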
Add AuditConsumer so septum-audit can ingest events from a queue topic (Phase 6 — queue consumer slice): New septum_audit/consumer.py mirrors the GatewayConsumer shape (run_once(block_ms=...) / run_forever) and persists each delivered message into the configured AuditSink after rebuilding it through AuditRecord.from_dict. The septum-queue import is gated behind the [queue] extra and forced at construction time so a misconfigured deployment fails fast at startup rather than deep inside the loop. Failure handling matches the gateway's: malformed payloads ack-and-drop with a logged error (so one poison pill cannot stall the audit pipeline), but sink write failures nack-with-requeue so a transient disk-full / permission glitch does not lose events. 4 new tests using the FileQueueBackend cover memory-sink and jsonl-sink round-trips, empty-queue run_once returning False, and the sink-failure → re-queue path. test_consumer.py renamed to test_audit_consumer.py to avoid pytest basename collision with packages/gateway/tests/test_consumer.py.
Scaffold septum-audit with records, sinks, exporters, and retention (Phase 6 — audit core slice): New packages/audit/ package persists already-masked compliance event records and ships them to downstream SIEM pipelines. Like the gateway, it lives in the internet-facing zone and — by explicit dependency-wall invariant — never imports septum-core, so raw PII cannot land in the audit store even via a typo. AuditRecord envelope (id, timestamp, source, event_type, correlation_id, attributes) is the single shape every component round-trips. AuditSink Protocol with two bundled implementations: JsonlFileSink (append-only newline-delimited JSON, POSIX-safe concurrent writes via O_APPEND + asyncio.Lock, logrotate-friendly because each line opens-and-closes the file) and MemorySink (snapshot-iterating list for tests + ephemeral counts). Three exporters cover the SIEM matrix: JsonExporter (jsonl, content-type application/x-ndjson), CsvExporter (RFC 4180 with attributes JSON-flattened into one cell), SplunkHecExporter (HEC envelope with time / host / source / sourcetype / event plus optional index). RetentionPolicy(max_age_days, max_records) with apply_retention_to_jsonl(path, policy, *, now=None) does an atomic in-place rewrite via .tmp sibling + os.replace, so a crash mid-pass leaves the original file untouched; corrupt lines count as removals. AuditConfig.from_env() reads SEPTUM_AUDIT_SINK_PATH / SEPTUM_AUDIT_TOPIC / SEPTUM_AUDIT_RETENTION_DAYS / SEPTUM_AUDIT_RETENTION_MAX_RECORDS, treating empty strings as unset. Exporters are lazy-imported via PEP 562 __getattr__ so a stdlib-only audit pipeline never pays the csv / io cost. 28 tests cover envelope round-trip, snapshot-safe iteration, concurrent writes, missing files, blank/corrupt line handling, all three exporter shapes, retention age + count + combined caps, and env-var overrides.
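The retention pass (atomic rewrite through a .tmp sibling plus os.replace, corrupt lines counted as removals) can be illustrated like this. prune_jsonl is a hypothetical name and signature; ISO 8601 timestamps, naive timestamps read as UTC, and blank lines dropped without being counted are all assumptions.

```python
from __future__ import annotations

import json
import os
from datetime import datetime, timedelta, timezone
from pathlib import Path


def prune_jsonl(
    path: Path,
    *,
    max_age_days: int | None = None,
    max_records: int | None = None,
    now: datetime | None = None,
) -> int:
    """Rewrite `path` in place under the policy; return the number of removed lines."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days) if max_age_days is not None else None
    kept: list[str] = []
    removed = 0
    try:
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                if not line.strip():
                    continue  # assumption: blank lines are dropped, not counted
                try:
                    stamp = datetime.fromisoformat(json.loads(line)["timestamp"])
                except (ValueError, KeyError, TypeError):
                    removed += 1  # corrupt lines count as removals
                    continue
                if stamp.tzinfo is None:
                    stamp = stamp.replace(tzinfo=timezone.utc)
                if cutoff is not None and stamp < cutoff:
                    removed += 1
                    continue
                kept.append(line if line.endswith("\n") else line + "\n")
    except FileNotFoundError:
        return 0
    if max_records is not None and len(kept) > max_records:
        removed += len(kept) - max_records
        kept = kept[len(kept) - max_records:]  # keep the newest entries
    tmp = path.with_name(path.name + ".tmp")
    with open(tmp, "w", encoding="utf-8") as handle:
        handle.writelines(kept)
    os.replace(tmp, path)  # atomic swap; a crash before this leaves the original intact
    return removed
```

Opening the file directly and catching FileNotFoundError avoids a separate existence check, so there is no window between the check and the read.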
Tighten Phase 5 hygiene after simplify-pass review: F1 — drop the dead try/except TypeError around an await llm_router.set_gateway_client_factory(None) in test_gateway_client.py (the setter is sync, the await always raised, so the except branch ran every run as pure noise). F2 — LLMRouter._resolve_gateway_client no longer swallows factory-construction exceptions into a silent None-fallback; it lets the caller see the failure, per the project rule "never swallow errors". F3 — define a GatewayClientProtocol(Protocol) so the module-level _gateway_client_factory and _resolve_gateway_client carry a real type instead of object | None; the LLMRouter narrative comment that opened with "Phase 5 — optional gateway delegation" is gone (the Protocol + setter docstring already convey the intent). F4 — RequestEnvelope / ResponseEnvelope grow a to_dict() helper that gateway_client.publish and GatewayConsumer._handle now call instead of importing dataclasses.asdict directly, keeping serialization knowledge in one place. F5 — gateway_client._build_envelope collapses the three-branch if provider == "anthropic"/"openai"/"openrouter" into a _PROVIDER_API_KEY_FIELD dict + getattr lookup. F6 — _await_response drops the doubled delivered_for_us sentinel, returns directly on the matching correlation id, and adds a remaining_ms <= 0 guard before each consume call so a sub-millisecond deadline does not buy one wasted poll. F7 — forwarder._post_with_retries regains the body_len=… log field and the (status=…) suffix that the api-side http_client.post_with_retries already had, closing the observability gap that simplify-agent#1 flagged as drift. F8 — ForwarderRegistry.from_config picks up its missing config: "GatewayConfig" annotation via a TYPE_CHECKING import. F9 — LLMRouter._dispatch_cloud_call reads self._settings.use_gateway directly instead of the defensive getattr(..., False) fallback (the column is always present after _sqlite_ensure_columns). 
F10 — _now docstring corrected from the misleading "Monotonic-ish wall-clock" to plain "Wall-clock (needed for cross-host transport)". F11 — file_backend.py module docstring now explicitly tells operators that done/ is never trimmed and recommends a find done/ -mtime +7 -delete cron. F12 — strip the "Phase 5 — opt-in delegation… Phase 7 deploy flips this…" narrative on AppSettings.use_gateway. Skipped: introducing a CompletionRequest dataclass (pre-existing parameter-sprawl, scope creep), a ProviderName Literal (touches 6+ files), constructor-injecting the gateway client (Phase 7 deploy will rewire this), and pooling the httpx.AsyncClient across retry attempts (would create drift with the api-side copy that the gateway forwarder is faithful to today). 399 backend+modular + 18 frontend tests still pass.
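The F5 change (a three-branch if collapsed into a dict plus getattr lookup) follows a common pattern, sketched here. The Settings dataclass and its field names are hypothetical; only the _PROVIDER_API_KEY_FIELD name comes from the entry.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Settings:
    # Hypothetical field names, chosen for illustration only.
    anthropic_api_key: str | None = None
    openai_api_key: str | None = None
    openrouter_api_key: str | None = None


_PROVIDER_API_KEY_FIELD = {
    "anthropic": "anthropic_api_key",
    "openai": "openai_api_key",
    "openrouter": "openrouter_api_key",
}


def api_key_for(settings: Settings, provider: str) -> str | None:
    """Return the key matching the provider, or None for unknown providers."""
    field_name = _PROVIDER_API_KEY_FIELD.get(provider)
    if field_name is None:
        return None
    return getattr(settings, field_name, None)
```

Adding a provider then means adding one dict entry rather than another branch.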
Rename packages/gateway/tests/test_config.py to test_gateway_config.py: Pytest uses the file basename as the module name when the tests directory has no __init__.py, which collides with packages/mcp/tests/test_config.py on combined runs (pytest packages/mcp/tests/ packages/gateway/tests/ ...). Disambiguating the gateway filename lets the full 399-test regression (backend 289 + core 24 + mcp 39 + queue 22 + gateway 25) collect cleanly.
Document the Phase 5 split in README.md / README.tr.md / CLAUDE.md / PROJECT_SPEC.md: Both READMEs flip septum-queue and septum-gateway from "Planned" / "Planlanıyor" to "Released" / "Yayında" in the Package Layout table, with the queue row noting the file backend (air-gap default) and Redis Streams [redis] extra and the gateway row noting the three cloud providers, the no-septum-core-import invariant, and the optional FastAPI /health behind the [server] extra. CLAUDE.md Modular Packages command block grows install lines for [redis] / [server] extras and pytest packages/queue/tests/ / packages/gateway/tests/ invocations. PROJECT_SPEC.md flips Phase 5 ("Faz 5" in the spec) to "✓ TAMAMLANDI" (Turkish for "completed") with per-item check marks; item 1 explicitly marks RabbitMQ as deferred (rare in Septum's deployment profile, future [rabbitmq] extra). README structure parity preserved (identical heading and table-row counts across both locales).
Wire optional queue producer into septum-api for gateway-mode cloud LLM calls (Phase 5 — api producer slice): New AppSettings.use_gateway: bool = False column with a matching _sqlite_ensure_columns migration so existing databases pick up the flag on next start; build_default_app_settings() seeds it from USE_GATEWAY_DEFAULT. New septum_api/services/gateway_client.py owns the producer-side half: GatewayClient.complete(...) publishes a RequestEnvelope built from the current settings (provider-matched api_key threaded through so the gateway never needs its own secrets in a split deployment), then consumes the response topic until the matching correlation_id arrives, re-queuing any other waiter's reply along the way. Gateway-side error envelopes map to LLMRouterError and missing replies after timeout_seconds raise QueueTimeoutError — both fall through to the existing LLMRouter._fallback_via_ollama path so the user-visible failure mode is identical regardless of whether the cloud call went direct or via gateway. LLMRouter gains a _dispatch_cloud_call seam that consults use_gateway + a process-wide _gateway_client_factory (installed by deployment code, left unset in tests); when the factory is absent or raises, the router logs a warning and falls back to the direct-call path. This keeps 289/289 backend tests green without needing a queue backend in any existing fixture. 6 new tests cover the round-trip, error mapping, timeout, provider-matched api_key threading, and factory registration; the direct-call default path is already covered by the existing suite.
Scaffold septum-gateway with cloud LLM forwarders and consumer loop (Phase 5 — gateway slice): New packages/gateway/ package consumes masked requests from septum-queue, dispatches them to Anthropic / OpenAI / OpenRouter via httpx, and publishes the masked answers back on a reply topic. Dependency wall: package declares septum-queue + httpx + pydantic in its required deps and — by explicit code-review invariant — never imports septum-core, so raw PII cannot slip into the internet-facing zone even via a typo. ForwarderRegistry.from_config(GatewayConfig) wires the three cloud forwarders with env-driven default keys; envelope-carried api_key / base_url always wins over the config so a split deployment where the air-gapped side owns the secrets works unchanged. GatewayConsumer.run_once() / run_forever() pair each request with a ResponseEnvelope by correlation id; forwarder errors, unknown providers, malformed payloads, and arbitrary exceptions all funnel into error envelopes rather than taking down the loop. FastAPI /health endpoint lives behind the [server] extra so a bare worker process does not pull fastapi + uvicorn. README documents the provider table, env vars, and the no-core-import invariant. 25 tests (respx-mocked httpx for happy path / missing key / base_url override / 5xx retry / OpenRouter branding headers / unknown provider / registry substitution + file-queue + consumer round-trip for success / error envelope / unknown provider / malformed payload / unexpected exception + /health smoke) all pass. Drop the stray tests/__init__.py in both packages/queue/ and packages/gateway/ — having both makes pytest collapse the two tests namespaces during combined runs.
Add RedisStreamsQueueBackend for shared-infrastructure Septum deployments (Phase 5 — Redis backend slice): Consumer-group backed queue using XADD / XREADGROUP / XACK so multiple gateway instances can consume from a single stream with at-least-once semantics. Each topic maps to one stream (septum:{topic}) and each consumer joins a named group (gateway by default); if a consumer dies mid-processing the entry stays on the pending entries list until another consumer claims it. Payloads go into a single data field holding JSON text rather than fanned-out hash fields — that way one codec works for every envelope shape and nested payloads survive the round trip without manual flattening. First publish implicitly runs XGROUP CREATE … MKSTREAM so cold streams work without operator provisioning; BUSYGROUP from a racing consumer is silently swallowed. Nack-with-requeue simulates the missing native primitive by XRANGE-reading the entry, XADD-ing a fresh copy, then XACK-ing the original. from_url("redis://…") constructor mirrors the api side. redis.asyncio stays a lazy import gated behind the [redis] extra so a stdlib-only install never touches it. 7 tests using fakeredis.aioredis cover round-trip, cold-stream XGROUP creation, two-consumer group isolation (10 messages, no duplicates), ack / nack requeue / nack drop, and non-blocking empty consume. pytest.importorskip skips the suite gracefully when neither redis nor fakeredis is installed.
Add FileQueueBackend for air-gapped Septum deployments (Phase 5 — file backend slice): Stdlib-only concrete backend that persists one JSON payload per file across incoming/, processing/, done/ sibling directories inside a per-topic root. Atomic os.replace is the entire synchronization primitive — POSIX rename is atomic within a single filesystem, so claiming a message is just "move from incoming/ to processing/ and whichever racing consumer wins the rename wins the message." Publisher writes to a .json.tmp sibling first and then atomic-renames into place so a half-written JSON blob is never visible to a racing consumer. Supports async-native publish / consume / ack / nack(requeue=bool) / close via asyncio.to_thread so the blocking directory I/O does not stall the event loop. Zero infrastructure dependency — an air-gapped deployment that ships septum-api + septum-gateway on the same volume needs nothing else, and an operator debugging a stuck request can literally ls processing/ to see which correlation ids are in flight. At-least-once delivery; callers dedupe on correlation_id. 10 tests cover round-trip, FIFO order, nack requeue / drop, restart persistence, two-consumer race (30 messages, no duplicates), idempotent double-ack, closed-backend errors, and .tmp partial-write skip.
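The rename-based claim described in this entry can be sketched with the standard library alone. publish and claim are hypothetical helpers, not the FileQueueBackend API, and the timestamp-plus-UUID file name is an assumption that only approximates FIFO order.

```python
from __future__ import annotations

import json
import os
import time
import uuid
from pathlib import Path


def publish(root: Path, payload: dict) -> Path:
    """Write to a .json.tmp sibling, then rename so readers never see partial JSON."""
    incoming = root / "incoming"
    incoming.mkdir(parents=True, exist_ok=True)
    final = incoming / f"{time.time_ns():020d}-{uuid.uuid4().hex}.json"
    tmp = final.with_name(final.name + ".tmp")
    tmp.write_text(json.dumps(payload), encoding="utf-8")
    os.replace(tmp, final)
    return final


def claim(root: Path) -> tuple[Path, dict] | None:
    """Move the oldest message into processing/; whoever wins the rename owns it."""
    incoming = root / "incoming"
    processing = root / "processing"
    processing.mkdir(parents=True, exist_ok=True)
    if not incoming.is_dir():
        return None
    # "*.json" does not match "*.json.tmp", so half-written files are skipped.
    for candidate in sorted(incoming.glob("*.json")):
        target = processing / candidate.name
        try:
            os.replace(candidate, target)
        except FileNotFoundError:
            continue  # another consumer claimed this file first
        return target, json.loads(target.read_text(encoding="utf-8"))
    return None
```

An ack in this scheme would be one more os.replace from processing/ to done/, and a nack with requeue a rename back into incoming/.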
Scaffold septum-queue with abstract transport interface and envelope models (Phase 5 — queue interface slice): New packages/queue/ package defines the cross-zone bridge between the air-gapped septum-api and the internet-facing septum-gateway. QueueBackend Protocol (publish / consume / ack / nack / close) gives both sides a backend-agnostic surface; QueueSession async context manager ensures deterministic cleanup. RequestEnvelope and ResponseEnvelope dataclasses shape every payload that crosses the boundary — JSON-serializable, correlation-id paired, mutually-exclusive text / error fields on the response. Zero runtime deps on the core package (stdlib only); concrete backends (file, Redis streams) gate behind optional extras and are lazy-imported via PEP 562 __getattr__ so stdlib-only installs never touch redis.asyncio. 5 envelope round-trip tests pass.
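One way the backend-agnostic surface and a response envelope could be shaped is shown below. The method names come from the entry, but every signature here is an assumption, and the envelope carries only the fields the entry mentions.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass
from typing import Any, Protocol


@dataclass(frozen=True)
class ResponseEnvelope:
    correlation_id: str
    text: str | None = None
    error: str | None = None

    def __post_init__(self) -> None:
        # A reply carries either an answer or a failure, never both.
        if self.text is not None and self.error is not None:
            raise ValueError("text and error are mutually exclusive")

    def to_dict(self) -> dict[str, Any]:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "ResponseEnvelope":
        return cls(**data)


class QueueBackend(Protocol):
    """Hypothetical signatures; only the method names are taken from the changelog."""

    async def publish(self, topic: str, payload: dict[str, Any]) -> None: ...
    async def consume(self, topic: str) -> dict[str, Any] | None: ...
    async def ack(self, message_id: str) -> None: ...
    async def nack(self, message_id: str, *, requeue: bool = True) -> None: ...
    async def close(self) -> None: ...


# Round trip through JSON text, as a payload crossing the zone boundary would.
wire = json.dumps(ResponseEnvelope(correlation_id="c1", text="masked answer").to_dict())
restored = ResponseEnvelope.from_dict(json.loads(wire))
```

Because QueueBackend is a typing.Protocol, a file or Redis backend satisfies it structurally without inheriting from it.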
Add a README for packages/web/ and mark Phases 1-3 complete in PROJECT_SPEC: packages/web/ was the only modular package without a README, violating Section 12's "README güncel" ("README up to date") Definition of Done. New doc covers the Next.js 16 / React 19 stack, install and script commands, both deployment topologies (single-container via BACKEND_INTERNAL_URL + Next.js rewrites vs. split deployment via build-time NEXT_PUBLIC_API_BASE_URL + backend-side FRONTEND_ORIGIN), the src/ directory layout, and the "no direct fetch in components" rule. PROJECT_SPEC.md flips Phases 1, 2, and 3 ("Faz" in the spec) to "✓ TAMAMLANDI" (Turkish for "completed") with per-item check marks; Phase 3 item 7 (modular docker-compose variants) is explicitly marked as deferred to Phase 7, where the four variants are already planned. Phase 2 item 4 is clarified as "stdio ✓, SSE ileriye bırakıldı" ("stdio ✓, SSE deferred"): Claude Code / Desktop / Cursor all connect over stdio, so SSE is unused in practice.
Tighten Phase 4 hygiene after simplify-pass review: F1 — _resolve_cors_origins (sandwiched between two app.add_middleware() calls in septum_api/main.py) moves to a dedicated septum_api/utils/cors.py module and the new resolve_cors_origins is imported at the CORS registration site, so middleware setup stays declarative and the helper is testable on its own. F2 — packages/web/src/lib/api.ts factors the trailing-slash strip into a pure resolveBaseURL(value: string | undefined) helper; the module-level baseURL calls it once. Tests now exercise the helper directly with concrete inputs instead of round-tripping through jest.isolateModules + require() to swap process.env.NEXT_PUBLIC_API_BASE_URL, so the axios instance is no longer re-instantiated three times per run. F3 — overlong block comments and narration in test_cors.py / api.test.ts trimmed; the 12-line preamble in api.ts collapses to a 6-line WHY note covering only the non-obvious bits (rewrite proxy default, build-time override, slash stripping). 18 frontend + 283 backend tests pass.
Anchor the pre-commit secrets check so tsconfig.json no longer trips it: The hook regex was (^\.env|config\.json$), which matches any path ending in config.json — including tsconfig.json and jsconfig.json. The Phase 4 move surfaced this when packages/web/tsconfig.json got staged for the rename and the hook flagged it as a leaked secret. Anchored the right side to a path boundary ((^|/)config\.json$) so only top-level or directory-rooted config.json files match.
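The difference between the two patterns is easy to reproduce. The sketch below uses Python's re module as a stand-in for whatever regex engine the hook actually runs, and the combined NEW pattern is assembled from the two fragments quoted in this entry, so treat both as assumptions.

```python
import re

# Pattern as quoted in the entry: the right side matches any path that merely
# ends in "config.json".
OLD = re.compile(r"(^\.env|config\.json$)")
# Assumed combined form after the fix: "config.json" must start the path or
# follow a "/".
NEW = re.compile(r"(^\.env|(^|/)config\.json$)")


def flagged(pattern: "re.Pattern[str]", path: str) -> bool:
    """Return True when the secrets check would flag this staged path."""
    return pattern.search(path) is not None


for path in ("packages/web/tsconfig.json", "jsconfig.json", "config.json",
             "deploy/config.json", ".env.local"):
    print(f"{path:32} old={flagged(OLD, path)} new={flagged(NEW, path)}")
```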
Document the Phase 4 split in README.md / README.tr.md / CLAUDE.md / PROJECT_SPEC.md: Both READMEs flip septum-web from "Planned" to "Released" in the Package Layout table, drop the "currently lives in frontend/" disclaimer, and pick up a build-time NEXT_PUBLIC_API_BASE_URL note plus a FRONTEND_ORIGIN-driven CORS note. CLAUDE.md updates the "Frontend (from frontend/)" command block to packages/web/, mentions the env var on src/lib/api.ts, and fixes the "verify version numbers against frontend/package.json" rule. PROJECT_SPEC.md checks Phase 4 off with the implemented item list. README structure parity preserved (identical heading and table-row counts across both locales).
Wire FRONTEND_ORIGIN into the FastAPI CORS allow-list (Phase 4 — CORS slice): The BootstrapConfig.frontend_origin field has lived in bootstrap.py since the wizard landed but was never read by anything — septum_api/main.py had allow_origins=["*"] hardcoded. The default flips from "http://localhost:3000" to "*" (preserving current backward-compat behavior) and a new _resolve_cors_origins helper parses the value as a comma-separated origin list, so split deployments can run FRONTEND_ORIGIN=https://app.example.com,https://admin.example.com and lock CORS down to just those two origins. Empty value or literal "*" still maps to the wildcard so a misconfigured deploy does not silently block every request. New tests/test_cors.py covers wildcard, single origin, comma-separated, and blank-segment cases (4 tests, parametrized to 6); test_bootstrap.py updated for the new default. 283/283 backend tests pass.
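The parsing rules listed in this entry (comma-separated list, blank segments ignored, empty value or "*" mapping to the wildcard) fit in a few lines. This is a sketch of that behaviour under those stated rules, not the helper's actual source; the signature is an assumption.

```python
from typing import List, Optional


def resolve_cors_origins(raw: Optional[str]) -> List[str]:
    """Turn a FRONTEND_ORIGIN-style value into an allow_origins list."""
    # Split on commas, trim whitespace, and drop blank segments such as the
    # one produced by a trailing comma.
    origins = [part.strip() for part in (raw or "").split(",")]
    origins = [origin for origin in origins if origin]
    # An empty or unset value falls back to the wildcard; a literal "*"
    # passes through unchanged as ["*"].
    return origins or ["*"]
```

With FRONTEND_ORIGIN=https://app.example.com,https://admin.example.com this yields exactly those two origins.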
Make the dashboard API base URL configurable via NEXT_PUBLIC_API_BASE_URL (Phase 4 — env-driven URL): The baseURL constant in packages/web/src/lib/api.ts was hardcoded to "", locking the frontend to the same-origin proxy layout (Next.js rewrites in next.config.mjs forwarding /api/* to the backend). The new resolution reads process.env.NEXT_PUBLIC_API_BASE_URL at build time and strips trailing slashes so callers can keep concatenating ${baseURL}/api/... cleanly; unset still produces "" for the existing single-container Docker layout. Unblocks split deployments where packages/web and packages/api are hosted on different origins. Tests cover default, override, and trailing-slash normalization (Jest 17/17 pass); production build with the override set succeeds.
Move frontend/ into packages/web/ (Phase 4 — relocate slice): The Next.js dashboard relocates from the top-level frontend/ directory into packages/web/ to match the modular layout of septum-core, septum-mcp, and septum-api. git mv preserves history. Tooling references updated in lockstep: dev.sh cd's into packages/web for --setup and the dev server, the multi-stage Dockerfile copies from packages/web/ instead of frontend/, the three GitHub Actions jobs (frontend-tests, frontend-typecheck, frontend-security) now run with working-directory: packages/web, and .gitignore swaps frontend/coverage/ for packages/web/coverage/. No source-file edits — Jest still passes 15/15 from the new location, tsc --noEmit is clean, and the in-container path (/app/frontend/) is unchanged so existing volume mounts and start.sh keep working.
Document the REST API auth flows and modular package layout in both READMEs (Phase 3d): New "REST API & Authentication" section on both README.md and README.tr.md covers JWT login, API key creation/use/list/revoke via POST /api/api-keys + X-API-Key header, and the per-route rate-limit table (login 5/min, register 3/min, key-create 10/min, default 60/min). New "Package Layout" subsection under "For Developers" lists the seven modular packages (septum-core, septum-mcp, septum-api released; septum-queue, septum-gateway, septum-audit, septum-web planned) with their zone classification (air-gapped / bridge / internet-facing). Section structure mirrored across both locales; counts diff cleanly.
Document the X-API-Key security scheme in OpenAPI / Swagger UI (Phase 3d): FastAPI auto-generates the OAuth2 scheme from OAuth2PasswordBearer declared in utils/auth_dependency.py, but the API key path lives entirely in AuthMiddleware and was therefore invisible to /openapi.json and the Swagger / ReDoc "Authorize" dialog. A custom app.openapi() override in septum_api/main.py injects an ApiKeyAuth entry under components.securitySchemes so both flows show up alongside one another. Schema-only change — no runtime behavior modified.
Add API key authentication, auth middleware, and per-route rate limiting (Phase 3c): New ApiKey ORM model with SHA-256 hashed keys, 8-char prefix lookup, per-user scoping, and optional expiry. An AuthMiddleware resolves both JWT Bearer tokens and X-API-Key headers into a User on request.state.user, letting existing Depends(get_current_user) callsites work without edits. API key CRUD router (POST/GET/DELETE /api/api-keys) lets admins create keys shown once, list by prefix, and revoke. Rate limiter refactored from inline main.py setup into middleware/rate_limit.py; sensitive endpoints get per-route limits (login 5/min, register 3/min, key-create 10/min) and API-key requests are rate-limited by key prefix instead of IP.
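The "SHA-256 hashed keys, 8-char prefix lookup" scheme can be sketched as follows. The helper names, the StoredKey shape, and the key length are hypothetical; only the hash algorithm and the prefix length come from this entry.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from typing import Tuple

PREFIX_LEN = 8  # from the entry: keys are looked up by an 8-char prefix


@dataclass(frozen=True)
class StoredKey:
    prefix: str    # stored in clear so the row can be found quickly
    key_hash: str  # SHA-256 hex digest; the raw key is never persisted


def _digest(raw: str) -> str:
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def create_api_key() -> Tuple[str, StoredKey]:
    """Return the raw key (to show once) and the record to persist."""
    raw = secrets.token_urlsafe(32)
    return raw, StoredKey(prefix=raw[:PREFIX_LEN], key_hash=_digest(raw))


def verify_api_key(presented: str, stored: StoredKey) -> bool:
    """Check the prefix first, then compare digests in constant time."""
    if presented[:PREFIX_LEN] != stored.prefix:
        return False
    return hmac.compare_digest(_digest(presented), stored.key_hash)
```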
Move the services layer into septum-api with a lazy aliasing shim (Phase 3b — services slice): Every service module (35 top-level files plus the ingestion, llm_providers, national_ids, and recognizers subpackages) relocates from backend/app/services/ into packages/api/septum_api/services/. The legacy backend.app.services.* namespace keeps working via a sys.meta_path aliasing finder installed by the new backend/app/services/__init__.py, which resolves each shim-path import on demand so heavy ML imports (torch, faiss, paddle, whisper) still fire only when callers actually touch those modules and every service file is represented by exactly one module object across both namespaces. The shallow pkgutil.iter_modules pattern used for Phase 3a shims is not enough here because services has nested subpackages that Python would otherwise re-import under the shim namespace, producing duplicate classes and split singletons. services/auth.py is now in septum_api, unblocking the Phase 3a utils/auth_dependency.py TODO. Follow-up: auth_dependency moves in lockstep, resolving that TODO — it lives under packages/api/septum_api/utils/ alongside the other infrastructure helpers and the legacy Phase 3a backend.app.utils iter_modules shim picks it up automatically. 266 backend tests pass.
Move the FastAPI routers into septum-api (Phase 3b — routers slice): All 13 router modules (approval, audit, auth, chat, chat_sessions, chunks, documents, error_logs, regulations, settings, setup, text_normalization, users) relocate from backend/app/routers/ into packages/api/septum_api/routers/. Each of their transitive dependencies (services, models, database, utils) already lives in septum_api after Phase 3a and the earlier Phase 3b slices, so the moves are a clean copy with no import edits. backend/app/routers/__init__.py becomes an iter_modules-based aliasing shim in the Phase 3a style: routers is a flat package with no nested subpackages, so the shallow pattern is sufficient and avoids the meta_path machinery that services needed. 266 backend tests pass.
Move the FastAPI app factory into septum-api (Phase 3b — main slice): backend/app/main.py (321 lines — lifespan, middleware wiring, top-level exception handlers, the health endpoint) relocates to packages/api/septum_api/main.py. backend/app/main.py becomes a thin wildcard re-export shim so from app.main import app (used by the test suite fixtures) and uvicorn app.main:app (used by dev.sh) keep pointing at the exact same FastAPI instance that now lives in septum_api — only one app exists in the process regardless of import path. _app_version is switched from the brittle parents[2] / "VERSION" offset to a walk-up lookup so the repo-root VERSION file is still found after the file relocation, and the /health endpoint stops duplicating the version logic and calls _app_version() instead. 266 backend tests pass.
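A walk-up lookup of the kind this entry describes can be sketched as below. The function names and the "unknown" fallback are assumptions; the point is that the search no longer depends on how many directories sit between the module and the repo root.

```python
from pathlib import Path
from typing import Optional


def find_version_file(module_file: Path, filename: str = "VERSION") -> Optional[Path]:
    """Search every ancestor directory of module_file, nearest first."""
    for directory in module_file.resolve().parents:
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


def app_version(module_file: Path, default: str = "unknown") -> str:
    """Read the nearest VERSION file, or fall back to a default string."""
    found = find_version_file(module_file)
    return found.read_text(encoding="utf-8").strip() if found else default
```

Called as app_version(Path(__file__)), this keeps working if the file moves one level deeper or shallower, which a fixed parents[2] offset would not.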

2026-04-15

Extract septum-core package with text_utils, national_ids, and anonymization_map: First slice of the modular monolith split. The three cleanest modules move to the new air-gap-safe packages/core/ package (presidio + spaCy + pydantic + regex only, zero network deps); backend keeps thin shims that re-export from septum_core so existing imports keep working without touching call sites.
Move recognizers, policy composer, config and ports into septum-core: Second slice of the modular split. The 17 built-in regulation packs plus base_recognizer and RecognizerRegistry are relocated under packages/core/septum_core/recognizers/; PolicyComposer.compose_from_data moves into septum_core.regulations.composer while the async compose(db) stays on the backend as a SQLAlchemy-aware shim. New SeptumCoreConfig, SemanticDetectionPort, NullSemanticDetectionPort, DetectedSpan / ResolvedSpan / SanitizeResult, and RegulationRulesetLike / CustomRecognizerLike / NonPiiRuleLike Protocols give the core package a database-free, network-free contract to plug Ollama and other adapters against. The backend RecognizerRegistry subclasses the core one and injects an Ollama-backed LLMContextRecognizer factory so detection_method='llm_prompt' custom rules keep working without dragging httpx into core.
Move the detector, unmasker and the SeptumEngine facade into septum-core: Final slice of the Phase-1 monolith split. sanitizer.py (2020 LoC) becomes septum_core.detector.Detector with all four Ollama paths (_ollama_validate_pii_candidates, _ollama_pii_detection, _ollama_semantic_detection, _resolve_pronoun_coreference) replaced by a single SemanticDetectionPort dispatch; deanonymizer.py becomes septum_core.unmasker.Unmasker with the Ollama strategy kept as a backend shim; and the new septum_core.engine.SeptumEngine facade wires the detector, unmasker and recognizer registry together behind a SeptumEngine(regulations=[...]).mask(text) / .unmask(text, session_id) API with an in-memory session registry. Heavy optional imports (NERModelRegistry, Detector, SeptumEngine) are lazily resolved via a PEP 562 __getattr__ so hosts that skip the [transformers] extra can still use the lightweight composer and recognizer primitives. non_pii_filter, span_processing and ner_model_registry (plus its device helper) also move into core; backend keeps thin re-export shims. New packages/core/tests/ gains 19 native unit tests covering the unmasker, the composer's duck-typed Protocol surface and the engine round-trip without touching the backend database.
Ship the septum-mcp package as a standalone MCP server for Claude Code / Desktop / Cursor: Phase 2 of the modular split. New packages/mcp/ wraps septum-core behind a stdio MCP server (official mcp SDK) with six local tools — mask_text, unmask_response, detect_pii, scan_file, list_regulations, get_session_map — that never touch the network; scan_file covers .txt / .md / .csv / .json / .pdf / .docx via pypdf + python-docx. SeptumEngine gains a public get_session_map() accessor and a TTL-based eviction loop so long-running MCP subprocesses don't accumulate anonymization maps, with engine construction deferred to the first tool call so idle cost stays near zero. Both root READMEs gain a matching "MCP Integration" section and a new Key Features bullet pointing editors at the Claude Code configuration snippet in packages/mcp/README.md.
Extract septum-api infrastructure primitives into packages/api/ (Phase 3a): Third slice of the modular split. The bootstrap, config, database, models/, seeds/, and utils/ (except auth_dependency) modules move from backend/app/ into the new air-gap-safe septum-api package; backend/app/ now ships wildcard re-export shims for the single-file modules and auto-aliasing __init__ shims for the subpackages that register each septum_api.models.*, septum_api.seeds.*, and septum_api.utils.* submodule in sys.modules under its legacy backend.app.* path, so every existing from app.models.settings import AppSettings style import resolves through the new package without any call-site edits. Routers, services, middleware, and the FastAPI app instance stay in backend/app/ for now and migrate in Phase 3b; the full 266-case backend test suite keeps passing untouched.
Wizard regulations step defaults to all built-in packs with a select-all toggle: The wizard used to pre-check only whatever the backend marked is_active=true (in practice just GDPR), so first-run users had to manually tick 16 boxes to get full coverage. The step now defaults activeRegulations to every built-in pack on load and adds a {selected} / {total} counter plus a "Select all / Deselect all" toggle above the list so deselecting is equally one-click. i18n strings added in lockstep for both locales.
Remove "Skip Setup" bypass from the first-run wizard: The welcome screen exposed a small "Skip Setup" link that silently initialised infrastructure with database_type: "sqlite", flipped setup_completed=true, and let the user bypass the whole flow — no LLM provider chosen, no regulations activated, no admin account created. Dropped the handleSkip callback and the welcome-screen link; the setup.nav.skip i18n keys in both locales are removed. Start is now the only way forward.
Lazy-seed AppSettings on first read so partial bootstraps stop 500-spamming: When the lifespan skipped init_db() but the admin was later created via /api/auth/register, the users table existed without the AppSettings.id=1 row. GET /api/settings raised 500 "Application settings have not been initialized" and the Error Logs UI filled with cascading identical 500s. Extracted the default-row construction into database.build_default_app_settings() as the single source of truth, consumed by both _seed_defaults at startup and load_settings on demand. Fresh DBs now self-heal on the first read; the wizard's GET/PATCH/test-llm/test-local-models flow keeps working unchanged.
Capture stack traces for 5xx HTTPExceptions in the Error Logs UI: http_exception_handler was routing every HTTPException through log_backend_message, which writes stack_trace=None and exception_type=None, so clicking "Detay" ("Details") on a 500 landed on a view with no stack frames and no pointer to where the exception was raised. 5xx HTTPExceptions now go through log_backend_error, which walks exc.__traceback__ via traceback.format_exception and persists the full stack. 4xx responses still use log_backend_message at WARNING level because the stack is noise there rather than signal.
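The 4xx / 5xx split described here reduces to one call to traceback.format_exception. In the sketch below, format_stack and build_log_fields are hypothetical stand-ins for the project's logging helpers; only the use of exc.__traceback__ and the WARNING level for 4xx come from this entry.

```python
import traceback
from typing import Dict, Optional


def format_stack(exc: BaseException) -> str:
    """Render the full stack of a caught exception as one string."""
    return "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))


def build_log_fields(status_code: int, exc: BaseException) -> Dict[str, Optional[str]]:
    """5xx responses keep the stack; 4xx responses are logged without it."""
    if status_code >= 500:
        return {
            "level": "ERROR",  # assumed level name for the 5xx path
            "exception_type": type(exc).__name__,
            "stack_trace": format_stack(exc),
        }
    return {"level": "WARNING", "exception_type": None, "stack_trace": None}
```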
Serve the NER default-model map from the backend instead of hardcoding it twice: The NER Models settings tab held its own hardcoded copy of the per-language HuggingFace model map, and it drifted out of sync the moment the backend upgraded its own DEFAULT_MODEL_MAP (the 2026-03-12 XLM-RoBERTa refresh) — the tab advertised six wrong model IDs, including one that was literally an English model listed under French. New GET /api/settings/ner-defaults exposes NERModelRegistry.DEFAULT_MODEL_MAP as the single source of truth; the frontend NerModelsTab fetches it on mount with a loading and error state, the duplicated NER_MODEL_DEFAULTS constant is deleted.

2026-04-14

Restore the NER Models settings tab: The per-language HuggingFace model-ID override tab was hidden in Phase 1 productization while the backend ner_model_overrides field and the NerModelsTab.tsx component stayed live but orphaned. Re-linked the component from settings/page.tsx — import, tab entry, and switch case — so operators can swap a noisy NER model for a specific language without a code release. All strings and the backend wiring were already in place.
Drop LOCATION entirely from both NER and Presidio outputs: Multilingual NER models mis-tag common nouns and form-field headers as LOC in every language Septum supports ("Doğum", "İş", "TARAFLAR" in Turkish; equivalents elsewhere), and Presidio's built-in SpacyRecognizer running en_core_web_sm over non-English text produces the same class of stochastic GPE → LOCATION mis-fires on any Title Case OOV token. Chasing these per-language with stopword lists or gazetteers cannot scale across 50+ locales. _map_ner_label now returns None for LOC, and _filter_presidio_results unconditionally drops LOCATION — address PII is captured exclusively by the deterministic StructuralAddressRecognizer and per-regulation POSTAL_ADDRESS / STREET_ADDRESS recognizers. Under GDPR Art. 4(1) a place name alone does not identify a person; the identifying anchor is PERSON_NAME, which the NER layer still detects.
Fix blank /redoc page by pinning Redoc to the stable v2 CDN: FastAPI's default /redoc embeds https://cdn.jsdelivr.net/npm/redoc@next/... — the @next dist-tag points at the unstable Redoc v3 alpha and currently renders blank. Disabled the default via redoc_url=None and added a custom /redoc route that calls get_redoc_html with redoc@2/bundles/redoc.standalone.js pinned to the stable line.
Per-entity-type colour palette across every PII rendering surface: entityColors.ts grew from 6 to 16 distinct Tailwind colour keys with an ordered regex classifier that maps every common PII family to its own colour; unknown types fall back to a deterministic hash % 16. The Document Preview filter chips, detected-entities list, and inline highlights all share the same per-type colour.
Align Document Preview entity chip counts with the detected-entities list: The filter chip bar counted distinct values in anon_map.entity_map while the right-hand panel counted entity_detections rows, producing mismatched totals. Chip counts now derive from detections whenever detection rows exist; the anon-map summary is only a legacy fallback.
Show the assembled-prompt preview in the read-only approval review modal: ApprovalModal's state-sync effect was gated on sessionId, which the review modal always passes as null, so the "Bulut LLM'e gönderilecek tam prompt" ("Full prompt to be sent to the cloud LLM") panel always showed the empty-state hint. Guard relaxed to !open so the sync runs on every modal open; live refresh is still independently gated so read-only mode never hits the backend.
Keep chat SSE alive while waiting for user approval: With the approval gate enabled, the chat stream awaited gate.wait_for_approval with no bytes flowing, so a slow decision let Next.js' proxy drop the idle socket and the post-approval events never reached the browser. The wait now runs in a heartbeat loop that yields a : keepalive SSE comment every 15s to keep the socket hot.
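A heartbeat loop of this shape can be written with asyncio.wait, which returns on timeout without cancelling the awaited future. This is a sketch only: the function name, the final event format, and the use of a bare Future for the decision are assumptions.

```python
import asyncio
from typing import AsyncIterator

KEEPALIVE = ": keepalive\n\n"  # SSE comment line; clients ignore it


async def wait_with_heartbeat(
    decision: "asyncio.Future[bool]", interval: float = 15.0
) -> AsyncIterator[str]:
    """Yield keepalive comments until the approval decision resolves."""
    while True:
        # asyncio.wait leaves the future pending on timeout, unlike wait_for,
        # which would cancel it.
        done, _pending = await asyncio.wait({decision}, timeout=interval)
        if done:
            break
        yield KEEPALIVE
    # Hypothetical final event carrying the outcome.
    outcome = "approved" if decision.result() else "rejected"
    yield f"data: {outcome}\n\n"
```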
Reorder imports to satisfy ruff I001 and retrigger v0.1.10 Docker build: The 2026-04-13 [release] push never built the image because ruff check failed on three I001 import-order violations in approval.py, chat.py, and the LGPD recognizer pack. ruff --fix reordered the imports (no runtime change) and the [release] marker retriggers the publish workflow.

2026-04-13

Stop PHONE and validator drops from silently eating NATIONAL_ID: Two independent bugs made national IDs disappear — ExtendedPhoneRecognizer greedily matched any 11–13 digit identifier as a phone and outranked NATIONAL_ID in dedup, and ValidatedPatternRecognizer dropped every checksum-failing span so synthetic test IDs never reached the pipeline. Tightened the phone regex, added an entity-type priority tiebreaker in deduplicate_spans, and gave RegexPatternConfig a new fallback_score so checksum-failing IDs survive at a reduced score. Cumulative detection count on the 26-doc multilingual fixture jumped from 1098 to 1311 (+19%).
Unblock setup wizard with bootstrap-mode auth relaxation: The earlier RBAC commit locked every wizard endpoint behind require_role("admin"), but the wizard runs before any admin exists. New require_admin_or_bootstrap / require_user_or_bootstrap dependencies relax enforcement while users is empty and setup_completed is false, then snap back to strict mode once the first admin is created.
GDPR and LGPD packs gain context-gated country-specific national-ID recognizers: Added format-descriptive detectors for German Personalausweis / Steuer-ID / Rentenversicherung, Spanish DNI / NIE / Seguridad Social, French NIR / SIREN in the GDPR pack, and a civil-identity-document detector for the Brazilian RG in the LGPD pack. All use narrow_to_group=1 so only the identifier value lands in the reported span.
Broaden IBAN / DOB / TAX_ID detection across locales: IBAN recognition now accepts both space-grouped and compact forms with a format-only fallback for synthetic values; DOB learned month-name dates in en/de/fr/es/pt/it/tr; TAX_ID is now a six-family alternation covering EU VAT, CIF, Steuernummer, Inscrição Estadual, Steuer-ID, and the compact form. Added parent-type alias expansion so declaring BANK_ACCOUNT_NUMBER no longer hides the IBAN recognizer.
Detection refresh script for existing documents: New scripts/reprocess_entity_detections.py re-runs the current sanitizer over already-ingested chunks, rewrites the entity_detections rows and the encrypted anon_maps/*.enc payloads, and updates documents.entity_count — all without re-extracting PDFs or rebuilding indexes.
Every built-in regulation now ships a first-class recognizer pack: Septum advertised 17 regulations but only GDPR, HIPAA, and KVKK had recognizer packs — the rest silently fell through to the baseline. Added packs for the remaining 14 regulations, each with at least one region-specific structural or checksum-validated detector; DPDP finally wires the long-orphaned AadhaarValidator.
Dead code cleanup and frontend dependency realignment: Removed the empty backend/app/schemas/ package and two orphaned frontend utilities; dropped clsx and tailwind-merge. Downgraded eslint from ^10.0.3 to ^9.39.4 to resolve the ERESOLVE peer conflict with eslint-plugin-react-hooks@7.0.1.
README lists all 17 built-in regulation packs in a table: Both READMEs only named 12 regulations inline; added a collapsible table listing every pack with region flag, pack code, and full name.
Document preview authenticates via blob URL instead of naked <iframe src>: Raw document pane showed "Not authenticated" because <iframe>, <img>, and <audio> cannot carry the axios Authorization header. Preview now fetches the raw file through the typed axios client as a Blob, wraps it with URL.createObjectURL, and revokes it on unmount.
Baseline sanitizer is now regulation-agnostic; KVKK-specific NATIONAL_ID detection lives in the KVKK pack: ValidatedNationalIDRecognizer defaulted its validator to TCKNValidator, meaning every sanitizer ran Turkish-specific checksum logic even under GDPR or HIPAA. Deleted the baseline class and moved TCKN-aware detection into the KVKK pack.
National ID detection now narrows the span to the digits and runs the TCKN checksum: The KVKK context recognizer's regex had no capture group so the reported span included the keyword prefix, and ValidatedPatternRecognizer never actually called the algorithmic validators. Added narrow_to_group and algorithmic_validator hooks; Presidio benchmark recall climbed from 94.4% to 95.7%.
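The two hooks combine roughly as in the sketch below: a capture group narrows the span to the digits, and the checksum runs on that group only. The context regex and function names are illustrative assumptions; the checksum follows the publicly documented TCKN rules (11 digits, non-zero first digit, two check digits).

```python
import re
from typing import Optional, Tuple

# Hypothetical context pattern; group 1 holds only the identifier digits.
CONTEXT = re.compile(r"(?:T\.?C\.?\s*Kimlik\s*No|TCKN)\s*:?\s*([0-9]{11})")


def is_valid_tckn(value: str) -> bool:
    """Validate the two TCKN check digits."""
    if not re.fullmatch(r"[1-9][0-9]{10}", value):
        return False
    d = [int(ch) for ch in value]
    # 10th digit: (sum of digits 1,3,5,7,9) * 7 minus (sum of digits 2,4,6,8), mod 10.
    tenth = (sum(d[0:9:2]) * 7 - sum(d[1:8:2])) % 10
    # 11th digit: sum of the first ten digits, mod 10.
    eleventh = sum(d[:10]) % 10
    return d[9] == tenth and d[10] == eleventh


def find_national_id(text: str) -> Optional[Tuple[int, int, bool]]:
    """Return (start, end, checksum_ok) for the digits only, not the keyword."""
    match = CONTEXT.search(text)
    if match is None:
        return None
    start, end = match.span(1)  # the narrow_to_group=1 idea: span of group 1
    return start, end, is_valid_tckn(match.group(1))
```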
Role-based access control enforced across every router: The role column on users was cosmetic — only /api/users/* honoured it. Every router now declares a concrete auth dependency: reads require get_current_user, document/chunk writes require admin or editor, and settings/regulation/audit mutations require admin.
User management with admin-only CRUD, self change-password, and role-gated navigation: New /api/users router (list/create/update/reset-password/delete) gated on admin, plus /api/auth/change-password for self-service rotation. Self-signup is now a pure bootstrap path — first user becomes admin, subsequent calls 403.

2026-04-10

README trust badges, star CTA, and Star History chart: Added CI status, Docker image version, GitHub stars, and MIT license badges, a star CTA, a Support the Project section, and a live Star History chart. Mirrored into README.tr.md.
"See It in Action" screenshot gallery with animated demos: Added three optimised slideshow GIFs (setup wizard, approval flow, document preview) and a collapsible grid of six configuration screens. 27 new PNGs replace the 11 legacy screenshots.

2026-04-09

Claude Code project skills migrated to directory format: Current Claude Code versions only auto-load project skills at .claude/skills/<name>/SKILL.md, so /security-scan, /new-regulation, /new-recognizer, and /new-ingester were returning Unknown skill. Moved all four to the directory layout with git mv.
README clarification — chat messages are also sanitized: Both READMEs framed PII protection only around uploaded documents, but _sanitize_query runs user messages through the same pipeline. Intro, before/after example, and Local PII Protection bullet now make the dual coverage unambiguous.
Document Preview entity highlight states: Reworked the sanitized-content highlights so every detected entity is always visibly marked — default state is a thin coloured outline per entity type, and the navigation-focused entity gets a saturated filled background. Added src/lib to tailwind.config.js content so the JIT actually compiles entityColors.ts class strings.
Next.js 10MB body cap silently 500'd large uploads: Next.js 16's default middlewareClientMaxBodySize (10 MB) truncated large audio/PDF/image uploads, killing the backend connection and surfacing as a generic 500. Bumped to 500 MB in next.config.mjs.
HTTPException failures now land in the Error Logs UI: http_exception_handler translated HTTPExceptions to JSON but never wrote to errorlog, so any raise HTTPException(400, …) was invisible. Now logs every 4xx/5xx (404 excluded as healthcheck noise); frontend upload handlers also stop swallowing errors and forward failures via sendFrontendError.
NER pipeline thread-safety + ingestion errors land in Error Logs: ner_model_registry.get_pipeline was not thread-safe under parallel ingestion — two workers could trigger the lazy transformers.pipeline import and the second thread hit ImportError. Hoisted the import to module level and added a double-checked threading.Lock; background ingestion failures now also write to errorlog.
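Double-checked locking around a lazily built resource looks roughly like this. The LazyRegistry class and its factory argument are hypothetical; in the entry the protected call is ner_model_registry.get_pipeline.

```python
import threading
from typing import Callable, Dict


class LazyRegistry:
    """Build each expensive object once, even under concurrent first calls."""

    def __init__(self, factory: Callable[[str], object]) -> None:
        self._factory = factory
        self._cache: Dict[str, object] = {}
        self._lock = threading.Lock()

    def get(self, key: str) -> object:
        # First check without the lock: the common, already-loaded path.
        cached = self._cache.get(key)
        if cached is not None:
            return cached
        with self._lock:
            # Second check under the lock: another thread may have finished
            # loading while this one was waiting.
            cached = self._cache.get(key)
            if cached is None:
                cached = self._factory(key)
                self._cache[key] = cached
            return cached
```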
PostgreSQL deploy fixes — missing migration + tz-naive timestamp columns: First real PostgreSQL deployment surfaced two SQLite-masked bugs: use_ollama_semantic_layer had no Alembic migration (backfilled as 011), and five model files declared created_at/updated_at as bare DateTime which asyncpg refused to populate. Migration 012 ALTERs 9 columns to TIMESTAMP WITH TIME ZONE.
Approval modal 3-column layout: Restructured ApprovalModal.tsx into a 3-column grid — masked prompt + regulations + entity summary on the left, editable chunks in the middle, full assembled prompt on the right. Collapses to a single column on small screens.
Relevance-based chunk filter + full assembled-prompt preview in approval modal: _retrieve_chunks now max-normalises RRF scores and drops tail candidates below a 0.4 threshold. A shared _assemble_user_prompt helper builds the exact masked prompt that will be sent to the LLM, and a new /preview-prompt endpoint lets the approval modal refresh the preview as the user edits chunks.
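Max-normalisation followed by a cut-off can be sketched as below. The function name, the dict input, and the inclusive comparison at the threshold are assumptions; only the normalisation step and the 0.4 value come from this entry.

```python
from typing import Dict, List, Tuple


def filter_by_relevance(
    scores: Dict[str, float], threshold: float = 0.4
) -> List[Tuple[str, float]]:
    """Scale scores so the best chunk is 1.0, then drop the low tail."""
    if not scores:
        return []
    top = max(scores.values())
    if top <= 0:
        return []
    normalised = {chunk_id: score / top for chunk_id, score in scores.items()}
    kept = [(cid, s) for cid, s in normalised.items() if s >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)
```

Because the scale is relative to the best hit, the same threshold behaves consistently whether the raw RRF scores are large or small.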
Approval modal no longer rubber-stamps the entire document: Removed the FULL_DOCUMENT_CHUNK_THRESHOLD = 100 shortcut that shovelled every chunk into the prompt for small documents, defeating the whole point of per-chunk curation. Chat now always runs hybrid retrieval.
Approval gate timeout (no more infinite chat hangs): ApprovalGate.wait_for_decision now honours an optional timeout and auto-rejects with timed_out=True when it fires. Driven by a new app_settings.approval_timeout_seconds (default 300).
Chat per-phase timing logs: Added a _phase_timer(session_id, phase) async context manager that emits one chat phase session_id=… phase=… elapsed_ms=… log line per phase. When a chat hangs, grep the session id and see which phase was last.
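A timer of this kind is a small async context manager. The sketch below mirrors the log-line shape quoted in this entry; the logger name and the millisecond formatting are assumptions.

```python
import logging
import time
from contextlib import asynccontextmanager
from typing import AsyncIterator

logger = logging.getLogger("chat.timing")  # hypothetical logger name


@asynccontextmanager
async def phase_timer(session_id: str, phase: str) -> AsyncIterator[None]:
    """Log one line with the elapsed time when the wrapped phase exits."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs on success and on error, so a failing phase is still timed.
        elapsed_ms = (time.perf_counter() - start) * 1000
        logger.info(
            "chat phase session_id=%s phase=%s elapsed_ms=%.1f",
            session_id, phase, elapsed_ms,
        )
```

Usage is `async with phase_timer(session_id, "retrieval"): ...` around each stage of the chat handler.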
Background ingestion PendingRollbackError masking transient SQLite locks: _run_full_background never committed its detected_language / ocr_confidence assignment, so a later autoflush tried to UPDATE documents while another worker held the write lock. Added an explicit commit and a _record_background_failure helper that rolls back cleanly.
Error Logs stack trace copy button: Added a CopyButton next to the "Stack trace" label in the Error Logs detail row.
Document Preview entity list panel: Added a third "Detected entities" column that lists every entity from the active filter — clickable rows scroll to the matching highlight and jump to the source PDF page.

2026-04-08

Docker proxy timeout fixes: Chat SSE endpoint returns StreamingResponse immediately (all pre-processing moved inside the generator), and document upload/reprocess run ingestion in background tasks so responses return instantly. Fixes socket hang up errors when proxied through Next.js.
Document processing progress: Real-time progress bar in the document list during sanitization and indexing, with an animated pulse indicator during OCR/Whisper. New GET /api/documents/progress endpoint.
Whisper download progress fix: Byte-level progress tracking replaces unreliable file-size polling; document upload reports byte-level progress via onUploadProgress.
Docker image size reduction (~17 GB to ~6 GB): CPU-only PyTorch eliminates unused NVIDIA CUDA/Triton libraries, cutting the image by roughly 11 GB. README includes a Docker vs Local comparison table.
Model cache persistence: New septum-models Docker volume persists Whisper, HuggingFace, and PaddleOCR models across container recreations.
SQLite WAL mode: Lets readers proceed while a write is in progress, which reduces database is locked errors during parallel document processing.
Trailing slash redirect fix: Routes using "/" changed to "" in error-logs and audit routers, preventing FastAPI 307 redirects that leaked the internal backend URL.
Orphaned document cleanup: Documents stuck in processing status after a server restart are now automatically marked failed on startup.
API baseURL fix: Axios instance uses an empty baseURL consistently, preventing SSR URL leaks to the client.
Multi-arch Docker image (amd64 + arm64): Docker image now builds for both architectures. Apple Silicon Macs run natively without x86 emulation, eliminating the 5–10x ML performance penalty.
Chat performance overhaul: Chunk masking at chat time uses pure string replacement against the document's existing anon map (no model calls per query). Query sanitization defaults to enable_ollama=False since alias/pronoun layers added seconds of latency. Typical chat round-trip is now seconds instead of minutes.
Sanitizer Ollama validation: Ollama responses are now filtered against word-boundary, ambiguity, fragment-length, ALL-CAPS heading, and ID-shape heuristics so small models stop polluting the anon map. The semantic detection layer is opt-in via use_ollama_semantic_layer (default off).
Whisper model cache: AudioIngester caches the loaded Whisper model at the class level so subsequent uploads reuse the in-memory weights.
PaddleOCR detection upgrade: Switched detection from PP-OCRv5_server_det to PP-OCRv5_mobile_det. Significantly faster on dense layouts and empirically catches more text regions with no accuracy loss.
Parallel document upload: Frontend uploadDocuments runs up to four uploads concurrently via a worker pool, reporting overall byte progress across all in-flight files.
Next.js compression disabled for SSE: compress: false in next.config.mjs so the proxy layer no longer gzip-buffers chat streaming events.
Audio transcription preview removed: The dedicated transcription button and modal mode were redundant — the standard preview already shows the transcript.
First-class Ollama LLM provider: Added a dedicated OllamaProvider and registered it in the LLM provider factory so llm_provider="ollama" goes through the normal provider path instead of erroring and falling back.
Background ingestion concurrency cap: New _INGESTION_SEMAPHORE (max 2) limits concurrent ingestion pipelines; beyond two jobs, SQLite write contention and NER model GIL contention hurt throughput.
SQLite tuning for parallel writes: Bumped busy timeout from 5s to 30s and set synchronous=NORMAL so bursts wait out short write locks instead of failing immediately.
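
The two SQLite entries above (WAL mode, and the 30s busy timeout with synchronous=NORMAL) correspond to three PRAGMA statements. The following is a minimal sketch using Python's standard sqlite3 module, not Septum's actual database.py; the helper name open_tuned_connection is hypothetical.

```python
import sqlite3


def open_tuned_connection(path: str) -> sqlite3.Connection:
    """Open a SQLite connection tuned for bursts of parallel writes.

    Hypothetical helper: only the three settings come from the changelog.
    """
    conn = sqlite3.connect(path)
    # WAL lets readers proceed while a single writer appends to the log.
    conn.execute("PRAGMA journal_mode=WAL")
    # Wait up to 30 s (value is in milliseconds) for a competing write lock
    # instead of failing immediately with "database is locked".
    conn.execute("PRAGMA busy_timeout=30000")
    # NORMAL syncs less often than FULL, trading a little durability
    # for fewer fsync calls during write bursts.
    conn.execute("PRAGMA synchronous=NORMAL")
    return conn
```

WAL requires a file-backed database; an in-memory database reports its journal mode as "memory" instead.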

2026-04-07

Single-port turnkey Docker deployment: All traffic served through port 3000 — port 8000 no longer exposed. Next.js rewrites proxy /api/*, /docs, /health, /metrics to the backend. --add-host=host.docker.internal:host-gateway added for cross-platform Ollama host access.
Ollama auto-detection: New GET /api/settings/ollama-probe tries host.docker.internal, ollama, and localhost (port 11434) and returns the first reachable URL; setup wizard auto-probes when Ollama is selected.
Ollama model discovery: Setup wizard and settings page list locally installed Ollama models in a searchable combobox. Users can also type a custom name. New GET /api/settings/ollama-models endpoint.
SSE streaming fix: PrometheusMiddleware converted from BaseHTTPMiddleware to pure ASGI middleware — BaseHTTPMiddleware was cancelling long-lived SSE connections and causing Connection closed errors under Next.js rewrites.
Media format support: video/mp4, audio/mp4, audio/m4a, audio/x-m4a added to accepted MIME types so phone audio recordings saved as .mp4 are ingested as audio.
Security hardening: SSRF protection on Ollama URL endpoints (hostname allow-list), ollama-pull timeout bounded at 10m, Content-Disposition filename sanitized to prevent header injection.
Docker & defaults improvements: CORS middleware reordered outermost, allow_credentials removed, Next.js standalone binds 0.0.0.0, sidebar version fetched from the backend, default privacy toggles now true, default OCR languages en,tr,de,ru,fr, ./dev.sh --reset flag, upload timeout raised to 5 minutes.
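
The hostname allow-list from the security hardening entry above could look roughly like this. It is a sketch under assumptions: the changelog does not list the allowed hosts, so the set below simply reuses the three hosts the ollama-probe entry tries, and the names ALLOWED_OLLAMA_HOSTS and is_allowed_ollama_url are hypothetical.

```python
from urllib.parse import urlparse

# Assumed allow-list: the three hosts named in the ollama-probe entry.
ALLOWED_OLLAMA_HOSTS = frozenset({"host.docker.internal", "ollama", "localhost"})


def is_allowed_ollama_url(url: str) -> bool:
    """Return True only for http(s) URLs whose hostname is on the allow-list."""
    try:
        parsed = urlparse(url)
    except ValueError:
        # urlparse rejects some malformed inputs (e.g. broken IPv6 brackets).
        return False
    if parsed.scheme not in ("http", "https"):
        return False
    # .hostname is lower-cased and excludes userinfo and port, so
    # "http://localhost@evil.example" yields "evil.example" here.
    return parsed.hostname in ALLOWED_OLLAMA_HOSTS
```

Matching on the parsed hostname rather than on a substring of the raw URL is what stops the userinfo trick shown in the comment.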

2026-04-06

Full entity type coverage — 37/37 regulation entity types now detectable: Added 12 new Presidio pattern recognizers (DATE_OF_BIRTH, MAC_ADDRESS, URL, COORDINATES, COOKIE_ID, DEVICE_ID, SSN, CPF, PASSPORT_NUMBER, DRIVERS_LICENSE, TAX_ID, LICENSE_PLATE) with multilingual context keywords. 9 semantic types now detected via the Ollama layer; Japanese NER model upgraded from base BERT to Davlan XLM-RoBERTa. README gains an honest "Detection Coverage & Limitations" section.
ALL CAPS PII detection and organisation name support: NER layer auto-normalises ALL CAPS text to title case before running XLM-RoBERTa; Turkish İ lowercasing fixed. ORGANIZATION_NAME is a new NER entity type with a false-positive filter.
Cross-device API access: Frontend API base URL now auto-resolves from the browser hostname when NEXT_PUBLIC_API_URL is not set, so phones and other devices on the same network can reach Septum without manual env configuration.
Ollama model discovery (chat + de-anon + LLM fields): Setup wizard and settings page list locally installed Ollama models in a searchable combobox across all three model fields.
Setup wizard v2 — complete onboarding flow: 8-step wizard (Welcome → Database → Cache → Provider → Regulations → Whisper → Create Admin → Done) with inline test + auto-advance, popularity-sorted regulations, Whisper SSE progress, and first-user registration built into the final step.
Settings page restructure: Cloud LLM and Local Models merged into a single "LLM Provider" tab; new Infrastructure tab for database + cache; Swagger/API Docs links and logout moved to the top-right header.
Docker improvements: VOLUME directive for data persistence, VERSION file baked into the image, Docker Hub description auto-updated from README on release, paddlepaddle pinned to 3.2.2 for Linux ARM64.
Update notification: GET /api/setup/check-update checks Docker Hub for newer versions; sidebar shows a banner with the docker pull command.
Misc fixes: API key fields use WebkitTextSecurity to prevent browser password prompts; Whisper downloads via whisper._download() to avoid OOM; regulation activation uses the correct /activate endpoint.
README adoption improvements: Added "Who Is This For?" section, a before/after anonymisation example, and a quick curl API example.
CI-gated Docker releases and backend import hygiene: Docker Hub publish now runs only after the main CI passes and only when the head commit contains [release]. Cleaned up unused imports and import order.
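
The ALL CAPS normalisation entry above mentions a Turkish İ lowercasing fix. The sketch below shows why plain str.lower() is a problem and one way to work around it; it is an illustration only, and the function names and the turkish flag are hypothetical rather than taken from the NER layer.

```python
def turkish_lower(text: str) -> str:
    """Lower-case text using Turkish dotted/dotless i rules."""
    # Map the two capitals explicitly before str.lower(), which would turn
    # "İ" into "i" + U+0307 (combining dot) and "I" into a dotted "i".
    return text.replace("İ", "i").replace("I", "ı").lower()


def caps_to_title(text: str, turkish: bool = False) -> str:
    """Title-case words that are entirely upper case; leave others untouched."""
    lower = turkish_lower if turkish else str.lower
    words = []
    for word in text.split(" "):
        if len(word) > 1 and word.isupper():
            # Keep the initial capital and lower-case the remainder.
            word = word[0] + lower(word[1:])
        words.append(word)
    return " ".join(words)
```

For example, caps_to_title("AYŞE IŞIK İSTANBUL", turkish=True) gives "Ayşe Işık İstanbul", whereas the default lowercasing would leave a stray combining dot or a dotted i in the output.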

2026-04-05

Docker Hub distribution: Single combined image byerlikaya/septum replaces the previous separate backend/frontend images; one pull, one run. docker-compose.yml uses the combined image with PostgreSQL + Redis.
LLM API keys in AppSettings: Alembic migration 009 adds Anthropic/OpenAI/OpenRouter key columns. Settings API accepts updates and returns has_*_key booleans (never raw secrets). POST /api/settings/ollama-pull streams model-pull progress via SSE.
Zero-config setup wizard — complete .env elimination: New bootstrap.py manages config.json (auto-generated encryption key + JWT secret), the database engine is lazily initialised by the wizard, and a 6-step flow replaces the old .env requirement. docker run + wizard is now the only setup needed.
AuthGuard React 19 fix: Moved router.replace() from render into useEffect to prevent "Cannot update component while rendering" under React 19.
Chat retry path & ingestion hygiene: Chat handler skips chunk retrieval and top-k tuning when pre_approved_chunks is present. Document pipeline persists entity detections with add_all. Refactored entityColors with shared classification and badge/highlight maps.
README restructure for product positioning: Split README into a user-focused README and a technical ARCHITECTURE doc (EN + TR). README now leads with value proposition, how-it-works, a Why-Septum table, and a simplified Quick Start.
Comprehensive 3-layer PII detection benchmark: Rewrote tests/benchmark_detection.py with 1618 entities across 10 types and all 17 regulations. Grand total: 100% precision, 99.7% recall, 99.9% F1. Per-layer tables and Mermaid charts added to both READMEs.

2026-04-04

Approval data persistence & review: Approval context (masked prompt, chunks, decision) is now stored on user messages via a new approval_data JSON column (migration 008). After approve/reject, the question gets a green/red badge that opens a read-only review modal; rejected messages include a "Tekrar gönder" ("Resend") button that resends with pre-approved chunks.
Regulations page redesign: Replaced the stacked 3-section layout with tab navigation (Regulations / Custom Rules / Advanced). Built-in regulations render as a compact sorted list instead of a card grid; each row expands to show entity type badges.
Remove chunks page: Deleted /chunks route, ChunkCard / EntityBadge components, and 52 i18n keys. Moved entity colour utilities to lib/entityColors.ts.
Chat UX cleanup: Removed edit/delete and regenerate buttons from messages, added a copy button to user messages, added a chat history "delete all" button, and added a "reprocess all" button in the documents page.
Entity highlight UX improvements: Sticky filter bar, entity navigation with "1/N" counter and prev/next, side-by-side layout for PDF/image/audio documents, and PDF page navigation that jumps the iframe to the source page.
Audit cleanup on document deletion: Deleting a document now also removes its audit events.
Chat bug fixes: Messages no longer disappear after the first question (decoupled ChatWindow's React key from activeSessionId); debug button on restored history messages is fixed.
Chat approval — no auto-timeout: ApprovalGate waits indefinitely until approve or reject; countdown UI removed.
Debug popup redesign: Masked prompt/answer rendered with colour-coded entity placeholder badges via a shared PlaceholderText utility.
CORS cleanup: Removed localhost:3001 from FRONTEND_ORIGIN defaults.
Audit Trail v2 — entity location tracking & visual highlighting: Sanitizer now returns per-entity position data (offsets + confidence). New EntityDetection model (migration 007) stores per-chunk locations; frontend HighlightedText renders colour-coded inline highlights with an entity-type filter bar; audit cards get a "View detected entities" button.
UX extras: Anonymization map viewer in Document Preview via /anon-summary, regulation tooltips on chat pills, mobile-responsive settings sidebar, chat session rename via inline edit, message delete and edit-to-resend on user messages.
Multi-tenancy + RBAC: Added role (admin/editor/viewer) to the User model with require_role() / require_admin dependencies. Document and chat session lists now filter by user_id. Migration 006.
CI/CD expansion: Pipeline grew from 2 to 6 jobs — added backend-lint (ruff + bandit), backend-security (pip-audit), frontend-typecheck (tsc --noEmit), frontend-security (npm audit).
Structured logging + Prometheus metrics: JSON structured logging via python-json-logger; Prometheus metrics via prometheus-client — request counter/histogram plus domain counters for document uploads, chat requests, and PII entities by type. Exposed at GET /metrics.
UX polish: Escape key stops streaming in chat. Chat PDF export via jsPDF alongside JSON export. Document Preview for all formats via a new /api/documents/{id}/raw endpoint.
Phase 3 enterprise readiness: JWT auth with bcrypt + PyJWT, /api/auth/register|login|me, AuthGuard + axios interceptors, per-session user_id FK (migration 005). Rate limiting via SlowAPI (Redis → memory fallback). Async document ingestion via BackgroundTasks. Data export via /api/chat-sessions/{id}/export.
Phase 2 UX improvements: Chat regenerate button, document list filtering/sorting, bulk delete/reprocess with row checkboxes, and reusable Skeleton / ErrorWithRetry components across pages.

2026-04-03

Phase 1 productization: Setup wizard for first-time users, persistent chat history with session management (new ChatSession/ChatMessage models + sidebar), upload progress bar, save toasts, and hidden incomplete settings tabs.
GitHub Actions CI: Prune unused runner images before the backend job to avoid out-of-disk failures; cache ~/.cache/pip keyed on requirements.txt.
Docker Compose production deployment: Added PostgreSQL 16 and Redis 7 to docker-compose.yml. New docker-entrypoint.sh validates env vars, auto-generates encryption keys, and runs Alembic migrations. Backend falls back to SQLite when DATABASE_URL is not set.
Pronoun coreference resolution via Ollama: New _resolve_pronoun_coreference() layer uses a language-agnostic prompt to identify pronouns referring to already-detected persons and adds them to the anon map. Degrades gracefully when Ollama is unavailable.
README professionalization: Added Audit Trail, LLM Resilience, and API Reference sections to both READMEs; updated Technology Stack and Security sections for PostgreSQL/Redis/Alembic.
LLM provider circuit breaker: Module-level breaker (3 failures in 120s → 60s cooldown → half-open probe). LLMRouter.stream_chat skips to the Ollama fallback when the breaker is open.
Coreference: possessive form handling: New strip_possessive_suffix() supports English ('s, s') and Turkish genitive markers ('in, 'ın, 'un, 'ün). Possessive forms now resolve to the base name's placeholder.
PII detection quality metrics endpoint: New GET /api/audit/metrics returns per-entity-type distribution, regulation usage frequency, average entities per document, and coverage ratios.
Redis-based anonymization map caching: Optional Redis tier (24h TTL, plaintext JSON) between in-memory cache and encrypted disk, making multi-worker deployments feasible. document_anon_store.py refactored to an async 3-tier hierarchy with graceful degradation.
Test suite hardening: Added 22 new tests across test_pii_escape_scenarios.py, test_concurrent_anon_maps.py, and test_audit.py. Total count: 77 → 99.
GDPR/KVKK audit trail and compliance report: Append-only AuditEvent model tracks PII detection, de-anonymization, document lifecycle, and regulation changes — never stores raw PII. New /api/audit endpoints and a frontend audit log viewer with entity breakdown.
SQLite → PostgreSQL migration with Alembic: Replaced SQLite-only dialect imports with sqlalchemy.JSON. Refactored database.py to support both backends via DATABASE_URL. Removed manual _ensure_*_columns functions in favour of Alembic.
CORS configuration: FRONTEND_ORIGIN in .env now drives FastAPI CORSMiddleware.
Document reprocess endpoint + blocklist persistence + delete fix: New POST /api/documents/{id}/reprocess re-runs the full pipeline. token_to_placeholder / token_counter now persisted so coreference survives restarts. Fixed document deletion silently skipping FAISS and BM25 index cleanup.
Fix false positive LOCATION detections from Presidio and NER: Common Turkish words ("kabul", "gibi", "sözlü", "başka") were tagged as LOCATION. Added a proper-noun capitalisation check; LOCATION detections on the test fixture dropped from 28 to 4.
Fix credit card number leaking to LLM + entity coverage system: All 17 regulation seeds defined CREDIT_CARD_NUMBER but no recognizer existed (Presidio's built-in uses CREDIT_CARD). Added a new recognizer, a Presidio entity alias map, and startup coverage validation that warns about entity types with no recognizer.
Rewrite PDF OCR strategy — image-based instead of character-count: Replaced the flawed <200 char heuristic with an image-aware approach: extract text normally, then OCR only embedded images (≥150×150 px). Text PDFs now process in milliseconds.
Memory optimization — subprocess OCR + model pre-loading: Run PaddleOCR in a persistent subprocess pool so its ~1.5 GB footprint never enters the main process. Background model pre-loading at startup makes the first upload instant instead of waiting ~15s.
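
The circuit breaker entry above gives three numbers: 3 failures in 120s, a 60s cooldown, then a half-open probe. A minimal sketch of that state machine follows. The class and method names are hypothetical, the clock is injectable only to make the sketch testable, and the real breaker is module-level rather than a class instance.

```python
import time
from collections import deque
from typing import Callable, Deque, Optional


class CircuitBreaker:
    """Closed -> open after N failures in a window -> half-open after cooldown."""

    def __init__(
        self,
        max_failures: int = 3,
        window_s: float = 120.0,
        cooldown_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max_failures = max_failures
        self._window_s = window_s
        self._cooldown_s = cooldown_s
        self._clock = clock
        self._failures: Deque[float] = deque()
        self._opened_at: Optional[float] = None

    def allow_request(self) -> bool:
        """True when closed, or when the cooldown has elapsed (half-open probe)."""
        if self._opened_at is None:
            return True
        return self._clock() - self._opened_at >= self._cooldown_s

    def record_success(self) -> None:
        # A successful call (including a half-open probe) closes the breaker.
        self._failures.clear()
        self._opened_at = None

    def record_failure(self) -> None:
        now = self._clock()
        if self._opened_at is not None:
            # A failed half-open probe restarts the cooldown.
            self._opened_at = now
            return
        self._failures.append(now)
        # Drop failures that fell out of the sliding window.
        while self._failures and now - self._failures[0] > self._window_s:
            self._failures.popleft()
        if len(self._failures) >= self._max_failures:
            self._opened_at = now
```

A caller would check allow_request() before contacting the cloud provider and go straight to the fallback when it returns False. This sketch lets every caller through once the cooldown has elapsed; a stricter version would admit only a single probe.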

2026-04-02

Memory optimization — lazy loading and singletons: Reduced backend steady-state memory from ~3.1 GB to ~1.5 GB (-52%). Switched Presidio from en_core_web_lg to en_core_web_sm, lazy-imported heavy ML frameworks, and cached spaCy / NER / SentenceTransformer as process-wide singletons.
Dependency conflict fix & pre-commit validation: Fixed a langchain-text-splitters version conflict and bumped presidio-analyzer/anonymizer and the langchain ecosystem. Added pip/npm dry-run checks to the pre-commit hook when dependency files are staged.
DRY/SOLID refactoring: Backend extracted get_or_404, load_settings, detect_language, validate_regex, sanitizer_factory.py, and a BaseCustomRecognizer. Frontend extracted 7 settings tab components, useChatStream / useChatApproval / useChunkManager hooks, and shared ToggleSwitch / ErrorAlert / CopyButton / DataTable components.
OCR engine: replace EasyOCR with PaddleOCR: Switched default OCR provider for significantly better character recognition (₺ symbol, number/letter confusion), built-in layout analysis, and spatial text ordering. Removed EasyOCR entirely.
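
The lazy-loading and singleton entry above can be illustrated with functools.lru_cache. This is a sketch of the general pattern, not Septum's code: FakeEncoder stands in for a heavy model object, and get_encoder and LOAD_CALLS are hypothetical names.

```python
from functools import lru_cache
from typing import List

LOAD_CALLS: List[str] = []  # records each real load so caching is observable


class FakeEncoder:
    """Stand-in for a heavy model object such as a sentence encoder."""

    def __init__(self, model_id: str) -> None:
        self.model_id = model_id


@lru_cache(maxsize=None)
def get_encoder(model_id: str) -> FakeEncoder:
    """Build the encoder on first use and reuse it for the process lifetime."""
    # In a real service the heavy framework import would sit here, inside the
    # function, so it is only paid for when the first request needs the model.
    LOAD_CALLS.append(model_id)
    return FakeEncoder(model_id)
```

Repeated calls with the same model_id return the same object. Note that lru_cache does not guarantee the loader runs only once if two threads hit a cold cache at the same moment, so a lock is needed where that matters.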

2026-04-01

Remove Desktop Assistant mode: Removed the ChatGPT/Claude desktop app OS-automation feature entirely — it did not align with the project's privacy-first middleware purpose.
Address anonymization: Added StructuralAddressRecognizer to detect postal addresses by structural cues; re-enabled NER LOC → LOCATION mapping so declared addresses are detected. Employer/company addresses are correctly excluded.
Settings override fix: Removed an environment variable override in the chat router's _load_settings() that was ignoring database-persisted LLM model/provider changes.
204 response body fix: Added response_model=None to the chunks DELETE endpoint to comply with FastAPI's HTTP 204 no-body assertion.
Regulation entity type alignment: Added missing entity types to 7 regulation seeds (pdpl_sa, australia_pa, pdpa_th, appi, pipl, nzpa, popia) based on legal research. Detailed legal basis in REGULATION_ENTITY_SOURCES.md.
Codebase compliance sweep: Removed ~17 redundant inline comments, replaced ~28 hardcoded frontend strings with i18n, fixed README file references, and mirrored .cursor/rules into .claude/rules / .claude/skills.

2026-03-19

Chat no-document flow and approval gate fix: Fixed chat behaviour when no document is selected via a dedicated no-context prompt path, and fixed the approval gate so require_approval is enforced even in no-document chats.
Sanitization pipeline overhaul — broken masking and false positives: Removed LOC → LOCATION mapping from NER, raised the NER confidence threshold to 0.85, restricted blocklist propagation to person-identifying entity types only, skipped NER for short texts, and made the Ollama alias prompt strictly person-name-focused.
Architecture rule compliance sweep: NER layer now filters against the active policy's entity types. Moved 3 inline prompt fragments into PromptCatalog, removed hardcoded language-specific terms, and replaced hardcoded language="en" with a configurable constant.
Critical: Ollama PII validation no longer strips structured identifiers: The validation layer was sending high-priority structured IDs (NATIONAL_ID, IBAN, PHONE_NUMBER) to Ollama; empty responses cleared the list entirely, dropping every detection. High-priority types now bypass LLM validation, empty responses fall back to keeping non-passthrough candidates, and adjacent PERSON_NAME spans are merged.
Repository hygiene: Added bm25_indexes/ to .gitignore.
Frontend TypeScript environment typing sync: Updated next-env.d.ts to match the current Next.js bootstrap output.
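
The critical validation fix above has two rules: high-priority structured types bypass the LLM validator, and an empty validator response no longer discards candidates. The sketch below illustrates those rules under assumptions: the validator's real interface is not described in this changelog, so Span, PASSTHROUGH_TYPES, and apply_validation are hypothetical, and the validator's answer is modelled as a plain list of strings.

```python
from dataclasses import dataclass
from typing import Iterable, List

# The three high-priority types named in the entry above.
PASSTHROUGH_TYPES = frozenset({"NATIONAL_ID", "IBAN", "PHONE_NUMBER"})


@dataclass(frozen=True)
class Span:
    entity_type: str
    text: str


def apply_validation(spans: List[Span], validated_texts: Iterable[str]) -> List[Span]:
    """Keep passthrough spans unconditionally; filter the rest by the validator."""
    validated = {t.strip().lower() for t in validated_texts}
    if not validated:
        # An empty answer is treated as "no opinion", so nothing is dropped.
        return list(spans)
    return [
        s
        for s in spans
        if s.entity_type in PASSTHROUGH_TYPES or s.text.strip().lower() in validated
    ]
```

The merging of adjacent PERSON_NAME spans mentioned in the same entry is left out of this sketch.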

2026-03-18

Desktop mode approval gate: Desktop assistant mode now shows the same approval modal as Cloud LLM when require-approval is enabled.
Query-time PII sanitization and validation layer: Chunks remain raw in the DB; sanitization runs at query time for both Cloud and Desktop flows with the full pipeline including the Ollama validation layer.
Ollama PII validation and JSON robustness: Added a language-agnostic pii_validation_prompt in PromptCatalog. Made extract_json_array tolerant of markdown fences, multiple arrays, trailing commas, and extra text.
use_ollama_validation_layer setting: New backend setting and Privacy UI toggle to enable/disable the Ollama PII validation layer (default true).
Next.js and lockfile: Set outputFileTracingRoot in next.config.mjs to resolve the multi-lockfile warning; removed the redundant root package-lock.json.
CI test fixes: Desktop assistant factory uses lazy imports for macOS/Windows modules so Linux CI does not require pyautogui or pygetwindow. ChatWindow agent-log fetch calls guarded with a typeof fetch check so Jest tests pass.
README (EN/TR): Documented desktop mode approval gate and the optional Ollama PII validation layer; Turkish README brought into sync.
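
The JSON robustness entry above says extract_json_array tolerates markdown fences, multiple arrays, trailing commas, and extra text. One way to get that behaviour is sketched below. It is not the project's implementation: the helper _balanced_arrays is hypothetical, and returning the first parseable array when several are present is an assumption.

```python
import json
import re
from typing import Any, Iterator, List

# A comma followed only by whitespace and a closing bracket or brace.
_TRAILING_COMMA = re.compile(r",\s*([\]}])")


def _balanced_arrays(text: str) -> Iterator[str]:
    """Yield each top-level [...] substring, ignoring brackets inside strings."""
    depth, start, in_string, escaped = 0, -1, False, False
    for i, ch in enumerate(text):
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"' and depth > 0:
            in_string = True
        elif ch == "[":
            if depth == 0:
                start = i
            depth += 1
        elif ch == "]" and depth > 0:
            depth -= 1
            if depth == 0:
                yield text[start : i + 1]


def extract_json_array(text: str) -> List[Any]:
    """Return the first JSON array found in noisy LLM output, or [] if none."""
    for candidate in _balanced_arrays(text):
        cleaned = _TRAILING_COMMA.sub(r"\1", candidate)
        try:
            parsed = json.loads(cleaned)
        except json.JSONDecodeError:
            continue  # e.g. a prose "[note]"; try the next candidate
        if isinstance(parsed, list):
            return parsed
    return []
```

Because the scanner looks for brackets rather than fence markers, fenced and unfenced replies are handled the same way. The trailing-comma regex is applied to the whole candidate, so a string value that itself contains ", ]" would be altered; a stricter version would skip string contents.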

2026-03-16

Desktop Assistant Mode with RAG support: Added an optional Desktop Assistant Mode that sends the user's question (or a RAG-enabled prompt) directly to a local ChatGPT or Claude desktop client via OS automation (AppleScript on macOS, window activation + keystrokes on Windows). Extracted ChatContextPayload and build_chat_prompt so Cloud and Desktop flows share one prompt builder; feature is opt-in behind desktop_assistant_enabled.

2026-03-15

Audio transcription accuracy and fixes: Whisper now uses the selected model from settings (previously hardcoded "base") and an optional default_audio_language. Fixed "decoder produced no samples or text" by using the correct file extension from the upload MIME type; forced CPU on Apple Silicon to avoid MPS NaN logits.
Changelog maintenance: Rules updated so the changelog documents changes since last push and groups by logical development unit.
Multi-document chat: When multiple documents are selected, all are sent to the API and included in context. Backend prioritises document_ids over document_id; chunk retrieval enforces a minimum of 10 chunks per document in multi-doc mode.
Generic retrieval improvements for holistic queries: Adaptive top_k, document-theme retrieval with RRF merge, last-chunk inclusion, and a holistic-interpretation prompt so broad questions get sufficient context.

2026-03-12

Vector search async fix: Wrapped blocking VectorStore.search / hybrid_search (FAISS + ML model ops) in asyncio.to_thread() so they stop freezing FastAPI's event loop. Chat retrieval now completes in ~3s instead of timing out at ~95s.
Hybrid retrieval (BM25 + FAISS): Implemented hybrid search combining BM25 with semantic FAISS via Reciprocal Rank Fusion. Dramatically improved retrieval quality for contract queries by combining exact term matching with semantic similarity.
Table and field extraction: Added pdfplumber-based structured extraction for legal/contract documents. Chunk model now carries chunk_type, field_label, field_value, field_type; key-value pairs are indexed as separate FieldChunks.
Enhanced semantic chunking: Upgraded StructuredDocumentChunker with LangChain's SemanticChunker. Large sections now split by semantic coherence instead of arbitrary paragraph boundaries, with a fallback chain semantic → paragraph → raw.
Prompt hardening: Strengthened the chat prompt with anti-hallucination rules: only answer if the exact info is in context, respond "I cannot find that information" otherwise, never invent or merge placeholders.
Dependencies: Added rank-bm25, pdfplumber, langchain-text-splitters, and langchain-experimental.
PII detection improvements: Upgraded all language NER models to state-of-the-art XLM-RoBERTa variants. Improved the Ollama Layer 3 prompt, made blocklist include all entity tokens regardless of casing, and made Ollama span matching case-insensitive.
KVKK ruleset (6698): Expanded entity types to align with Article 3(d) and Article 6: added SSN, TAX_ID, COORDINATES, IP_ADDRESS, COOKIE_ID, DEVICE_ID, DNA_PROFILE, MEDICATION, CLINICAL_NOTE, SEXUAL_ORIENTATION, and financial fields.
Regulation rulesets (GDPR, UK GDPR, CCPA/CPRA, PIPEDA): Aligned entity types with official legal texts. Added FIRST/LAST_NAME, ETHNICITY, STREET_ADDRESS, CITY, COORDINATES, MAC_ADDRESS, and related fields.
Regulation entity sources doc and rule: Added backend/docs/REGULATION_ENTITY_SOURCES.md and a rule requiring updates when built-in regulation entity types change.
Sanitizer and PII pipeline: Ollama PII layer wrapped in try/except so failures don't break sanitization. NER uses a language-aware confidence threshold. Blocklist adds single-token entities so residual mentions are redacted.
Chat query sanitization: User messages are sanitised before retrieval and the LLM call — same pipeline as document text.
E2E test and ingestion errors: Turkish PII E2E test no longer skips — uses Presidio-detectable email, language + anon-map mocks, and an LLM mock. Document upload 500 response now includes ingestion_error in the detail for easier debugging.
NER model overrides: NER Models settings tab supports per-language overrides — edit the HuggingFace model ID, restore defaults per row, and save to app_settings.ner_model_overrides.
Error logging and Error Logs UI: Centralized error logging via a new ErrorLog model, error_logger service, global exception handler, and POST /api/error-logs/frontend. New Error Logs page lists, filters, and clears logs; sidebar shows an error count badge.
Document preview: Copy button shows a "Copied" state for user feedback.
Chat: Added a badge under assistant messages when the answer was produced by the Ollama fallback.
Docs: READMEs (EN/TR) now link to the Changelog and License in the header.
Changelog and rules: Split same-day entries by commit date and updated the rule to verify the date via date +%Y-%m-%d and git log --date=short.
Backend dependencies: Bumped langchain-experimental from 0.3.6 to 0.4.1 for Python 3.13 / GitHub Actions compatibility.
Recognizer regex and E2E test: Fixed Presidio regex patterns in GDPR, HIPAA, and KVKK packs (correct \b word boundary and escapes). Set TLDEXTRACT_CACHE in conftest before imports to avoid permission errors.
Backend tests and CI: Fixed the E2E Turkish PII test in CI by patching PdfIngester._run_ocr_on_page so the text layer is preserved. Replaced deprecated datetime.utcnow with datetime.now(timezone.utc). Added pytest.ini with filterwarnings to suppress noisy deprecation warnings.
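
The hybrid retrieval entry above merges BM25 and FAISS results with Reciprocal Rank Fusion. A minimal sketch of the fusion step follows. The function name is hypothetical, and k=60 is the constant commonly used with RRF, not a value confirmed by this changelog.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge ranked ID lists; each list contributes 1 / (k + rank) per item."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))
```

With a BM25 ranking of c3, c1, c2 and a FAISS ranking of c1, c4, c3, the fused order is c1, c3, c4, c2: chunks that appear in both lists outrank chunks that appear in only one. Because only ranks are used, the BM25 and cosine-similarity scores never need to be normalised against each other.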

2026-03-11

OCR and PII improvements: Enhanced image/PDF OCR quality, improved OCR ingestion flow, and refined person name masking and PII handling.
Spreadsheet enhancements: Added spreadsheet schema metadata, numeric-aware chat for tabular content, and limited schema display to truly tabular documents.
Infrastructure and tooling cleanup: Unified environment loading defaults (including Ollama), and removed legacy coverage/Codecov tooling.
ODS support: Added ODS (OpenDocument Spreadsheet) ingestion support and documented it in both English and Turkish READMEs.
LLM routing and prompt catalog: Refactored the LLM router into a provider-strategy layer, introduced a document processing pipeline orchestrator, centralized all backend LLM/Ollama prompts under PromptCatalog, and added a shared AppSettings factory plus updated tests.

2026-03-10

Documentation and licensing: Expanded README content with PII pipeline and AI gateway sections, screenshot gallery, and clarified extension workflow; added MIT license and kept EN/TR READMEs in sync.
Testing and quality: Improved backend and frontend coverage, added Jest setup, fixed async engine and aiosqlite warnings, and ensured backend tests import the app package correctly.
Sanitization and PII pipeline: Hardened sanitizer structure and robustness, generalized the PII pipeline, added configurable text normalization rules and non-PII filters, and localized deanonymization banner copy.
Chat experience: Added global i18n for chat UI, approval flow localization, chat debug tools, document-optional chats, generic prompts, and post-processing for malformed LLM output.
Platform and tooling: Introduced Dockerfiles and docker-compose for backend/frontend, tracked env templates, pinned backend dependencies, and updated docs and dependencies.
UI and layout: Refined documents, chunks, and settings UIs; improved sidebar layout; and added the Septum logo across the app.

2026-03-09

Core platform foundation: Bootstrapped the Septum project skeleton with core utils, crypto, database models, and health checks.
Ingestion pipeline: Implemented ingestion base and office ingesters (documents, spreadsheets, presentations), plus image and audio ingesters with health checks.
Privacy and recognition engine: Added national ID validators and tests, a multilayer sanitizer, anonymization map with coreference handling, and a regulation-aware recognizer registry and policy composer.
Vector store and retrieval: Introduced an encrypted FAISS vector store per document and ignored local index artifacts from version control.
Backend services and frontend shell: Added LLM router, deanonymizer, approval gate, chat pipeline wiring, settings sync, settings UI, regulations UI, documents UI, and the initial Next.js frontend shell with layout and API client.
