Changelog
All notable changes to this project are documented here in a high‑level, date‑based format.
v1.0.0 — 2026-04-21 — Modular architecture
First major release. Septum is now seven independently installable packages across three security zones; the monolithic backend/ is gone.
- 7 modules: `septum-core` (PII engine, zero net deps), `septum-mcp`, `septum-api`, `septum-web`, `septum-queue` (file / Redis bridge), `septum-gateway` (cloud LLM forwarder; cannot import core by invariant), `septum-audit`.
- 4 compose topologies: standalone (SQLite) · full dev stack · air-gapped zone · internet-facing zone.
- Auto-RAG routing + MCP over stdio / streamable-http / sse + Audit Trail v2 (every detection links back to its audit event).
- 17 regulation packs with canonical `RegulationId` registry, national-ID checksum validators, legal-sources doc.
- Honest benchmark: 3,468 values × 16 languages + 5 external datasets + adversarial pack. Combined F1 96.6%, 0.00 FP/1k on clean text. Full methodology in `docs/benchmark.md`.
- Security: Redis AUTH, parameterised compose passwords (`${VAR:?}` fail-fast, `.env.example`), `pickle` → `json` in BM25, 18 security-scan findings addressed.
- Docker: six multi-arch images, CPU + GPU variants for torch-dependent ones, git-tag-driven release workflow.
Breaking
- `from backend.app.*` → `from septum_api.*`.
- Compose files require `POSTGRES_PASSWORD` / `REDIS_PASSWORD` in `.env` (see `.env.example`); no dev defaults.
Date-based ledger below has the full incremental history.
2026-04-28
- Critical fix (Ollama validation was silently dropping deterministic detections): `use_ollama_validation_layer` gated every span outside the high-priority identifier set through Ollama and kept only the LLM's "validated" subset, so when the model omitted a real address, email, date of birth, or customer reference, the recognizer's correct hit was discarded. Reproduced on a real KVKK consent form: 19 detections persisted out of 24 the recognizers actually produced; every "Adres :" line, the second `(KVKK) :` email, the second `Doğum Tarihi :`, and both customer references were stripped. `_SEMANTIC_VALIDATION_PASSTHROUGH_TYPES` now covers every deterministic type (POSTAL_ADDRESS, STREET_ADDRESS, EMAIL_ADDRESS, DATE_OF_BIRTH, URL, IP_ADDRESS, MAC_ADDRESS, COORDINATES, COOKIE_ID, DEVICE_ID, CUSTOMER_REFERENCE_ID); only NER-driven types (PERSON_NAME / LOCATION / ORGANIZATION_NAME) reach the validator.
- KVKK customer-reference recognizer + new entity type: `Müşteri No: ETP-2021-00489`-style alphanumeric reference IDs were not recognised under any pack; KVKK Md. 3's broad personal-data definition covers them. New `CUSTOMER_REFERENCE_ID` entity type plus a Turkish-label pattern recognizer that handles `Müşteri / Üye / CRM / Hesap` cues, optional `(varsa)` qualifiers, and the double-colon spacing PDF extraction sometimes produces.
- Per-language NER ensembles: `NERModelRegistry.DEFAULT_MODEL_MAP` now carries an ordered list of HuggingFace model IDs per language; pipelines are cached by model id so a model shared across languages (Davlan wikiann across 11 locales) loads once. The Detector NER loop iterates every pipeline, applies the ALL-CAPS title-cased re-run per pipeline, and unions detections via a `(start, end, entity_type)` seen-set. Default Turkish ensemble: `akdeniz27/xlm-roberta-base-turkish-ner` + `savasy/bert-base-turkish-ner-cased`; different architectures catch different rare surnames the XLM-RoBERTa encoder underweights. Other locales stay single-model to bound memory cost; users opt into ensembles per language by entering a comma-separated list in the NER override input.
- Settings UX audit: dropped the unused `ollama` de-anonymization strategy (cloud LLM preserves placeholders verbatim, so the deterministic Unmasker is faster, predictable, and free of hallucination risk); removed the dead `extract_embedded_images` / `recursive_email_attachments` toggles that no ingester ever read; surfaced the missing `use_ollama_semantic_layer` toggle in PrivacyTab so the only layer that detects DIAGNOSIS / MEDICATION / RELIGION / ETHNICITY / POLITICAL_OPINION is no longer silently disabled. All five sanitization-layer descriptions rewritten with cost/benefit guidance in EN and TR.
- Custom-rules reference + inline UI help: new `docs/custom-rules.md` (EN + TR) walks through all three detection methods with worked examples (internal project codes, codenames, medication mentions) plus the test loop, common fields, and audit-trail behaviour. Regulations page Custom Rules and Advanced (Non-PII) tabs gain inline help banners with copy-paste pattern examples and a doc link, both locales.
- Toggle visual fix: `ToggleSwitch` knob was rendering separated from the track in production due to inline-flex centering quirks. Rebuilt on absolute positioning with `top-1/2 + -translate-y-1/2`, added `role="switch"` + `aria-checked` for screen readers, replaced the always-on border with a focus-visible ring, dropped the `text-[0px]` hack. Single component change restandardises every toggle in the app (Privacy, Ingestion, RAG, Custom Rule Builder, User Form modal).
- Person-name expansion no longer crosses newlines: `expand_person_name_spans` skipped past `\n` while looking for an adjacent surname/given-name token, so on form-style layouts (`Ad Soyad\nFatma Nur Öztürk\nT.C. Kimlik No`) the name span absorbed the previous-row label. The polluted text became the entity_index HMAC value, so chat-time entity routing for that document never matched ("Fatma Nur Öztürk" → "Doğrudan yanıt (doküman kullanılmadı)", i.e. "Direct answer (no document used)", on five separate questions in our hasta-kayit reproduction). Both expansion directions now bail on the first `\n` / `\r`; inline-Latin "John" → "John Smith" expansion still works.
- Chat input lag eliminated: pressing Enter used to await a `/api/chat/analyze` round-trip (~500–1000 ms of query sanitization + entity-index lookup) before the user's message rendered in the chat, so every send had a visible "where did my text go" gap. The analyze now runs in the background; the user message + assistant placeholder appear synchronously. If the analyze later detects a multi-document ambiguity the in-flight stream is cancelled and the picker modal surfaces, so the rare disambiguation path still works.
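The passthrough fix in the first entry above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Span` shape, the `apply_validation` function, and the `validate` callback are hypothetical names, and the separate high-priority identifier bypass mentioned in the entry is left out.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

# Deterministic types listed in the changelog entry; the real constant may differ.
_SEMANTIC_VALIDATION_PASSTHROUGH_TYPES = frozenset({
    "POSTAL_ADDRESS", "STREET_ADDRESS", "EMAIL_ADDRESS", "DATE_OF_BIRTH",
    "URL", "IP_ADDRESS", "MAC_ADDRESS", "COORDINATES", "COOKIE_ID",
    "DEVICE_ID", "CUSTOMER_REFERENCE_ID",
})


@dataclass(frozen=True)
class Span:
    """Hypothetical detection record: character offsets plus entity type."""
    start: int
    end: int
    entity_type: str


def apply_validation(
    spans: Iterable[Span],
    validate: Callable[[List[Span]], List[Span]],
) -> List[Span]:
    """Keep deterministic hits unconditionally; only NER-style hits are validated."""
    spans = list(spans)
    passthrough = [s for s in spans
                   if s.entity_type in _SEMANTIC_VALIDATION_PASSTHROUGH_TYPES]
    candidates = [s for s in spans
                  if s.entity_type not in _SEMANTIC_VALIDATION_PASSTHROUGH_TYPES]
    # `validate` stands in for the LLM call; it returns the subset it confirms.
    validated = validate(candidates) if candidates else []
    return sorted(passthrough + validated, key=lambda s: (s.start, s.end))
```

With this split, a validator that omits every span can only discard NER-driven candidates; a regex-detected email or address survives regardless of what the model returns.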
2026-04-26
- Entity-aware RAG routing across documents: Auto-RAG now narrows retrieval to the documents that actually contain the queried PII (new cross-document entity index, HMAC-SHA256 keyed for privacy; relationship cache + cytoscape-based graph page; disambiguation picker when an ambiguous person/term matches several documents; intent classifier is bypassed on any entity hit so a weak match still grounds the answer rather than falling through to general chat). Multi-doc chat now unifies per-document anon maps into a single placeholder space so the LLM never sees two unrelated `[PERSON_NAME_1]` entries that mean different people.
- Critical fix: chunks were storing raw text under the `sanitized_text` column. A 1.5-year-old latent bug in `document_pipeline` saved unmasked text as "sanitized", so the chat sanitizer's "context already masked" fast-path was passing raw PII through. Pipeline now persists `sanitize_result.sanitized_text`, with a new `chunks.raw_text` column behind it for the document-preview UI (preview keeps showing the original text without re-leaking it to chat).
- PII detection tightening: KVKK Turkish-label recognizers for `T.C.` / `Vergi No` (closed a leak path the keyword-alternation regex couldn't reach); IBAN spans trimmed to longest checksum-valid prefix; entity normalization key now collapses whitespace variants so OCR/PDF spacing differences land in the same placeholder; per-entity NER confidence thresholds externalized into one table with `PERSON_NAME` lowered 0.85 → 0.80 (privacy-first recall on rare Turkish surnames); LOCATION spans absorbed by overlapping ORG / postal-address spans so "Antalya Sağlık Merkezi" stays one placeholder instead of two.
- Per-document source citation in chat: SSE `meta` event now carries `matched_documents` with per-document chunk counts; the chat bubble shows "{N} belgeden, {M} parça kullanıldı" ("{M} chunks used from {N} documents") with an expandable per-document list. Works for both auto-RAG and manual-document modes.
- Per-language NER model presets in settings: NER Models tab gains one-click preset chips next to the override input so users can swap between curated alternatives (3 Turkish + 2 English) without typing HuggingFace IDs by hand. Backend `SUGGESTED_MODELS` table is the single source of truth, surfaced via `/api/settings/ner-defaults`.
- Pluggable Ollama benchmark: `OLLAMA_MODEL` in `benchmark_detection.py` reads `SEPTUM_BENCHMARK_OLLAMA_MODEL` from env; new `scripts/benchmark_ollama_models.sh` runs the harness against a list of models in turn (default trio: llama3.2:3b, aya-expanse:8b, qwen2.5:14b) and persists per-model logs. Benchmark docs (EN + TR) gain an "Ollama model comparison" section with TBD numbers waiting on the host that owns the GPU + model downloads.
- Database hygiene + chat resilience: `PRAGMA foreign_keys=ON` was missing on SQLite, so detection rows could outlive the documents they referenced and surface in chat as bogus matches; startup now enforces FKs and one-time purges any pre-existing orphans from `entity_detections` / `entity_index`. Includes `entity_type` in the `entity_index` uniqueness key so legitimate token-level hash collisions across types coexist. Approval timeout in chat now treats the timeout as a user rejection so the retry button works again.
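The foreign-key issue in the last entry above is easy to reproduce with the standard-library `sqlite3` module, because SQLite leaves foreign-key enforcement off unless each connection turns it on. The two-table schema and both function names below are simplified assumptions for illustration; the real tables carry more columns.

```python
import sqlite3

# Hypothetical, cut-down schema: one detection row per document reference.
SCHEMA = """
CREATE TABLE documents (id INTEGER PRIMARY KEY);
CREATE TABLE entity_detections (
    id INTEGER PRIMARY KEY,
    document_id INTEGER NOT NULL REFERENCES documents(id) ON DELETE CASCADE
);
"""


def orphans_after_delete(enforce_fk: bool) -> int:
    """Delete a document and count the detection rows left behind."""
    conn = sqlite3.connect(":memory:")
    if enforce_fk:
        # Must be issued per connection; SQLite does not persist this setting.
        conn.execute("PRAGMA foreign_keys=ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO documents (id) VALUES (1)")
    conn.execute("INSERT INTO entity_detections (document_id) VALUES (1)")
    conn.execute("DELETE FROM documents WHERE id = 1")
    (count,) = conn.execute("SELECT COUNT(*) FROM entity_detections").fetchone()
    conn.close()
    return count


def purge_orphans(conn: sqlite3.Connection) -> int:
    """One-time cleanup of rows whose parent document no longer exists."""
    cur = conn.execute(
        "DELETE FROM entity_detections "
        "WHERE document_id NOT IN (SELECT id FROM documents)"
    )
    return cur.rowcount
```

Without the pragma the `ON DELETE CASCADE` clause is parsed but ignored, which is why rows written before the fix need the separate purge step.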
2026-04-24
- Quickstart `$EDITOR` fix: The `cp .env.example .env && $EDITOR .env` line in README and the installation guide broke on shells without `$EDITOR` set (zsh tried to execute `.env` as a command). Split into a `cp` + explicit "open in your editor" comment; mirrored in the Turkish copies.
- Compose files usable out of the box: `docker-compose*.yml` tagged services as `septum/api`, `septum/web`, `septum/gateway`, `septum/audit`, `septum/mcp`, `septum/standalone`, none of which exist on Docker Hub, so `docker compose up` failed with "pull access denied" for any image not already built locally. Retagged all six services to the real published names (`byerlikaya/septum-*`). Also fixed the Redis healthcheck in the same compose files: `REDISCLI_AUTH=$${REDIS_PASSWORD}` escaped the dollar sign so the in-container shell tried to read `REDIS_PASSWORD` (never set inside the redis container) and errored out, marking Redis unhealthy and cascading to every dependent service; switched to single-dollar so compose interpolates the password into the healthcheck string (same visibility as `--requirepass`, no new leak).
- Published septum-web image baked against compose topology: `.github/workflows/docker-publish.yml` did not pass `BACKEND_INTERNAL_URL` to the `septum-web` build, so the published image defaulted to `http://127.0.0.1:8000` (single-container topology) and broke the `docker-compose.yml` multi-container topology where the api is reachable at `http://api:8000`. Added per-image `extra_build_args` to the publish matrix and wired it through the build step so future releases bake the correct proxy target into the web image.
- Drop YAML frontmatter from tracked markdowns: GitHub renders YAML frontmatter as a visible table at the top of every markdown it shows, which cluttered the README and every doc page browsed through the repo UI. Stripped frontmatter from 21 files (5 root MDs + 16 `docs/` pages, EN + TR). Titles and descriptions moved into a central `PAGE_META` map in `docs/.vitepress/config.mjs`; the existing `transformPageData` hook now injects `pageData.title` / `pageData.description` plus Open Graph / Twitter card meta tags per route, so VitePress page titles, Google search snippets, and social share cards stay unique per page. Promoted the `## Changelog` H2 in `CHANGELOG.md` to an H1 so VitePress picks it up as the page title.
- Symmetric language link on package READMEs: Every `packages/*/README.tr.md` already carried a `🇬🇧 English version` pointer back to the English file, but the English READMEs had no link in the other direction. Added a matching `🇹🇷 Türkçe sürüm` line right below the H1 on all seven English package READMEs (api, audit, core, gateway, mcp, queue, web) so Turkish readers landing on the English page have an obvious switch.
- Native arm64 builder for septum-web in CI: The septum-web image was the only one in the publish matrix that ran a heavy Next.js production build during `docker build`. Under QEMU arm64 emulation that step had been stretching past two hours and OOM'ing silently on GitHub's hosted runners, blocking entire releases. Split septum-web out of the matrix into a fan-out `publish-web` job that builds linux/amd64 on `ubuntu-latest` and linux/arm64 on the free `ubuntu-24.04-arm` runner, plus a follow-up `merge-web` job that assembles the multi-arch manifest with `docker buildx imagetools create`. Pulled images are bit-for-bit identical to before; only the build path changes. The five other images (api, audit, gateway, mcp, septum + GPU variants) keep the QEMU path because their builds finish in seconds under emulation.
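The `$$` versus `$` distinction behind the Redis healthcheck fix above can be shown by analogy. Python's `string.Template` happens to follow the same two rules as Compose interpolation (`${NAME}` is substituted, `$$` yields a literal `$`), so it serves here as a stand-in; it is not what Compose runs internally, and the `redis-cli ping` command string is an assumed example.

```python
from string import Template

# Values that Compose would read from .env on the host.
host_env = {"REDIS_PASSWORD": "s3cret"}

# Double dollar: the escape survives as a literal ${...}, so the shell inside
# the container has to resolve REDIS_PASSWORD itself (and it is not set there).
escaped = Template("REDISCLI_AUTH=$${REDIS_PASSWORD} redis-cli ping")
in_container_before = escaped.substitute(host_env)

# Single dollar: the value is interpolated on the host before the container
# ever sees the command string.
interpolated = Template("REDISCLI_AUTH=${REDIS_PASSWORD} redis-cli ping")
in_container_after = interpolated.substitute(host_env)

print(in_container_before)  # REDISCLI_AUTH=${REDIS_PASSWORD} redis-cli ping
print(in_container_after)   # REDISCLI_AUTH=s3cret redis-cli ping
```

The second form places the password in the healthcheck string, which the entry notes has the same visibility as the `--requirepass` argument already present.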
2026-04-22
- Dedicated installation guide + compose-first quickstart: Added `docs/installation.md` / `.tr.md`, a nine-section guide covering quickstart, system requirements, five supported topologies (full local stack, standalone demo, air-gapped zone, internet-facing zone, native dev), first-launch wizard, LLM providers, volumes, upgrade, troubleshooting, and uninstall. README quickstart sections shortened to one compose command + pointer at the new page; `docs/readme.md` / `.tr.md` index tables gain an Installation row. Compose becomes the blessed path for non-trivial installs because it ships Ollama bundled; the standalone single-container image is now positioned as the "hızlı deneme" (quick trial) demo rather than the recommended install.
- Nav reshuffle: Top/bottom nav bars now show `🏠 Home · 🚀 Installation · 📈 Benchmark · ✨ Features · 🏗️ Architecture · 📊 Document Ingestion · 📸 Screenshots`. Installation moves to second position (most-requested resource for new users); Benchmark precedes Features (the "proof before the pitch"). The `📝 Changelog` entry leaves every nav; GitHub's repo sidebar and release pages already surface it.
2026-04-21
- Split benchmark into its own page: Moved the benchmark section out of `docs/features.md` / `.tr.md` into dedicated `docs/benchmark.md` / `.tr.md` pages. Added a `📈 Benchmark` entry to the top + bottom nav of every markdown file and a source-link block on the benchmark pages (HF model cards, dataset papers, regulation primary sources).
- Nav cleanup: Dropped the `🤝 Contributing` entry from every top/bottom nav bar; the GitHub sidebar already surfaces it.
- Require explicit `POSTGRES_PASSWORD` / `REDIS_PASSWORD` in compose files: Replaced the `septum_secret` / `septum_redis` dev defaults with `${VAR:?...}`; compose now fails fast if either is missing. Added `.env.example` as the canonical template; README, README.tr and CLAUDE.md updated to point at it. GitGuardian no longer flags the compose files.
- Turkish literary polish on `docs/tr/features.md`: Rewrote calques and awkward inversions across the Detection Pipeline, Regulation Packs, Auto-RAG, Why Septum, MCP, and REST API sections.
- Per-image Docker Hub overviews: Replaced the single `DOCKERHUB.md` (standalone-only, pushed to all six repos) with six role-specific READMEs under `docker/readmes/`; each image now gets an Overview page that matches what's actually in it (air-gapped vs internet-facing zone badge, role-specific quick-start, transport options for `septum-mcp`, etc.). Workflow `readme-filepath` is now matrix-driven.
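For readers unfamiliar with the `${VAR:?...}` form used in the compose-password entry above: it aborts when the variable is unset or empty instead of silently substituting a default. A rough Python equivalent is sketched below; `require_env` is a hypothetical helper written for this illustration, not something shipped with Septum.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the variable's value, or abort like `${NAME:?message}` would."""
    source = os.environ if env is None else env
    value = source.get(name, "")
    if not value:
        # The `:?` form treats an empty value the same as a missing one.
        raise SystemExit(f"{name} must be set (see .env.example)")
    return value


if __name__ == "__main__":
    # With both variables present this prints nothing and exits cleanly.
    demo_env = {"POSTGRES_PASSWORD": "pg-pass", "REDIS_PASSWORD": "redis-pass"}
    for variable in ("POSTGRES_PASSWORD", "REDIS_PASSWORD"):
        require_env(variable, demo_env)
```

The practical difference from the old `${VAR:-default}` form is that a forgotten `.env` now stops `docker compose up` immediately rather than starting services with a well-known password.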
2026-04-20
- Audit Trail entity provenance: every `EntityDetection` row now carries an `audit_event_id` FK back to the event that produced it. New `GET /api/audit/{event_id}/entity-detections` endpoint plus an `entity_type` query filter (via `EXISTS` correlated subquery). Frontend adds a "Focus on these entities" button on each audit card that opens the document preview highlighting only that event's detections, with an event-scoped navigator and an entity-type filter dropdown on the log itself.
- MCP over HTTP: `septum-mcp` now speaks all three standard MCP transports: stdio (default, unchanged), streamable-http, and sse. Bearer-token ASGI middleware gates non-loopback HTTP with constant-time comparison; `/health` always bypasses auth. CLI flags (`--transport` / `--host` / `--port` / `--token` / `--mount-path`) override the matching `SEPTUM_MCP_*` env vars. Docker image defaults to streamable-http on port 8765 with a healthcheck; `docker-compose.yml` and `.airgap.yml` gained an opt-in `mcp` profile.
- Repo layout + architecture diagram refresh: `ARCHITECTURE.md` / `.tr.md` moved into `docs/` next to `FEATURES`, `screenshots/` renamed to `assets/` (logo joined it), cross-file link references updated in lockstep. Replaced the Mermaid architecture diagram with hand-crafted SVGs (`assets/architecture.svg` + `.tr.svg`) featuring dashed zone borders, an orthogonal L-routed `queue → gateway` arrow, paint-order halo labels, and explicit response-path arrows, because Mermaid's auto-layout fought the 7-module topology.
- Contributor onboarding: added `CONTRIBUTING.md` / `.tr.md` (dev setup, code style, PR process, security reporting) so GitHub surfaces the Contribute button. Extracted every screenshot into new `docs/screenshots.md` / `.tr.md` so README and FEATURES stay text-focused. README picked up a Roadmap section.
- Simplify-pass cleanup after feature reviews: `IN (SELECT DISTINCT)` → `EXISTS` correlated subquery in the audit filter; N individual `EntityDetection` UPDATEs consolidated into one bulk statement; the new endpoint returns `EntityDetectionListResponse` for shape parity with its sibling; `septum_mcp/config.py` reuses `parse_active_regulations_env` instead of re-implementing it; `DocumentPreview` useEffect split so modal reopens don't refetch chunks / anon-summary / schema; orphan `audit.card.viewEntities` i18n keys deleted.
- Readability pass on `docs/tr/features.md`: two rough spots introduced while drafting the Turkish deep-dive today. "Seçim retrieval'ı sürdürür": I meant "drives" but picked the wrong verb; fixed to "retrieval seçilen dokümanlarda çalışır". "Doküman İngest": mixed Turkish-dotted İ with an English stem that reads as a neologism; dropped to "Doküman Ingest" so the English tech term stays clean. No other calques detected after a full sweep.
- Clean up pre-existing anglicism calques in `ARCHITECTURE.tr.md`: five calques that pre-date today's auto-RAG / TR-term commit but read as forced translations to the target audience. "ek bileşen / ek bileşeni" (calque of "extra") → "extra" / "extra'sı" (Python devs write `pip install pkg[extra]` in every language, so Turkish tech writing does the same). "Sözleşme gereği sıfır ağ bağımlılığı" (calque of "zero network deps by contract") → "Kod seviyesinde ağ bağımlılığı yok". "Boşta duran maliyet sıfıra yakın" (calque of "idle cost near zero") → "Boşta dururken neredeyse hiç maliyet üretmez". "Sıfır runtime bağımlılığı" → "runtime bağımlılığı yok". "Servis tanımları tekrarı kaldırılmıştır" (calque of "dedupes service definitions") → "servis tanımları tek yerde tutulur". Also fixed an "extra'sınin" double-genitive typo that the global `ek bileşeni` → `extra'sı` replacement introduced.
- Update ARCHITECTURE docs for features that landed after the Phase 8 rewrite + stop translating technical zone terms in TR: Auto-RAG routing (`103c6a0`, 2026-04-18), the 17-pack default behavior (`768d10b`), and the core-side canonical regulation registry (`6eb2335` + `7bcfca0`) were never documented in `ARCHITECTURE.md` / `ARCHITECTURE.tr.md`; the last substantive update (`1ffa995`) pre-dated them. Three surgical edits in both locales: (a) the Policy Composition section now notes the `RegulationId` StrEnum + `BUILTIN_REGULATION_IDS` tuple + `parse_active_regulations_env` helper and the all-17-packs default in the standalone `SeptumEngine`; (b) the AI Privacy Gateway section gains an "Auto-RAG routing" subsection describing the three chat paths (manual / auto / none), the Ollama intent classifier, the `rag_relevance_threshold` setting, and the `rag_mode` + `matched_document_ids` SSE meta fields; (c) the `septum-core` package internals entry spells out per-pack `ENTITY_TYPES` constants and locates the canonical registry in `recognizers/__init__.py`. Across all TR docs (README.tr.md, ARCHITECTURE.tr.md), "hava boşluklu" and "internet-yönlü / İnternete açık" are reverted to "air-gapped" and "internet-facing", the same rule the codebase already applies to "framework" / "gateway" / "worker" / "broker": technical zone terms stay English in Turkish tech writing; forced calques like "hava boşluklu" read as jokes to the target audience. Heading count still matches 1:1 across EN/TR.
- Split READMEs into a slim overview + `docs/features.md` deep-dive: Both READMEs were 768 / 772 lines of dense prose, too long to scan and mostly duplicating what belonged in feature reference docs. Trimmed to ~275 lines each: hook, five-step flow, the single 7-module architecture mermaid, a compact feature list, two money-shot screenshots (setup wizard + approval gate), and a Docker quick start. The 17-regulation table, detection benchmark, Auto-RAG walkthrough, Why-Septum comparison, MCP integration, REST API + auth reference, and the full UI gallery (document preview GIF + 5 settings PNGs + audit trail) moved to new `docs/features.md` + `docs/tr/features.md`. The Turkish versions were rewritten as native Turkish (not a word-for-word translation): sentence order, idiom, and mermaid diagram labels ("Kullanıcı sorusu", "Maskeli istek", "Maskeleme + Map", "Köprü") all now read naturally instead of betraying an English source. Net: 1540 lines of README became 1380 lines across two files per locale, with mermaid diagram count going from 5 to 8 and with the navigation structure matching exactly across EN/TR.
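The bearer-token gate described in the "MCP over HTTP" entry above might look roughly like the sketch below. It is an assumption-laden illustration rather than the shipped middleware: the class and function names are invented here, and "non-loopback" is interpreted as the client address, which the entry does not spell out.

```python
import hmac

LOOPBACK_HOSTS = frozenset({"127.0.0.1", "::1"})


def is_authorized(path: str, client_host: str, authorization: bytes, token: str) -> bool:
    """Decide whether a request may pass (hypothetical helper)."""
    if path == "/health" or client_host in LOOPBACK_HOSTS:
        return True
    expected = f"Bearer {token}".encode()
    # compare_digest runs in time independent of where the inputs differ.
    return hmac.compare_digest(authorization, expected)


class BearerTokenMiddleware:
    """Minimal pure-ASGI wrapper around `is_authorized`."""

    def __init__(self, app, token: str):
        self.app = app
        self.token = token

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        headers = dict(scope.get("headers", []))
        client = scope.get("client") or ("", 0)
        allowed = is_authorized(
            scope.get("path", ""),
            client[0],
            headers.get(b"authorization", b""),
            self.token,
        )
        if allowed:
            await self.app(scope, receive, send)
            return
        await send({
            "type": "http.response.start",
            "status": 401,
            "headers": [(b"content-type", b"text/plain")],
        })
        await send({"type": "http.response.body", "body": b"unauthorized"})
```

Keeping the decision in a plain function makes the policy testable without spinning up an ASGI server; the class only translates the ASGI scope into that function's arguments.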
2026-04-19
- Fix MCP/standalone dropping NER entity types + canonical regulation registry in core: MCP and standalone `SeptumEngine` were silently dropping `PERSON_NAME` / `LOCATION` / `ORGANIZATION_NAME` because the 17 packs' entity lists only existed in the API seed. Moved to core as per-pack `ENTITY_TYPES` constants plus `RegulationId` StrEnum and `BUILTIN_REGULATION_IDS`; MCP/API defaults now load all 17 packs, cross-pack duplicate recognizers are deduped on load (46 → 29 per mask call), and every downstream import goes through core, including a shared `parse_active_regulations_env` helper that replaces three duplicated env-parsing blocks. `PATCH /api/settings` now rejects typo'd regulation ids (previously stored silently and later filtered to an empty policy).
- Dead-code sweep (~500 LOC removed): 23 unused Python imports across tests and one unused local in `benchmark_detection.py` (ruff); 5 unused API client functions, 1 unused type, and 3 orphan components (`theme.tsx`, `AuthGuard.tsx`, `TextNormalizationTab.tsx`) from the Next.js package; empty `packages/api/documents/` directory; and 9 stale `backend/*` lines in `.gitignore` left behind after the Phase 8 shim removal.
- Fix `dev.sh --reset` leaving live processes, SQLite sidecars, and webpack cache behind: reset now `pkill`s the running uvicorn/next dev servers before wiping so aiosqlite worker threads release the DB handle (otherwise the next uvicorn hit `disk I/O error` on the first `PRAGMA table_info`), strips `septum.db-wal` / `septum.db-shm` and the legacy `packages/api/septum.db*` copy from before the Phase 8 working-directory change, and nukes `packages/web/.next` + `.turbo` + `node_modules/.cache` so a dev server that saw a source file deleted mid-session doesn't keep serving the old chunk with `ChunkLoadError`.
- Fix 403 on whisper endpoints during setup wizard: the 04-17 hardening pass guarded `/api/setup/whisper-status` and `/api/setup/install-whisper` with `_require_setup_phase()`, but the wizard calls them after `/api/setup/initialize` has already flipped `database_configured=true`, so `needs_setup()` returns False and the guard rejects the very flow that needs them. Removed the guard from both whisper endpoints (downloading a public OpenAI model is not a security boundary); other guarded endpoints (test-database, test-redis, check-update) keep their guards.
- Fix ChunkLoadError on first dev-server load after `--reset`: the 04-17 `npm audit fix --force` bumped Next.js 16.1.6 → 16.2.4, which regressed the deprecated `--webpack` dev path with a cold-compile race where the HTML was served before its chunks finished emitting. Switched `packages/web/package.json` `dev` script to Next 16's default turbopack (`NODE_OPTIONS=--max_old_space_size=4096 next dev`); webpack was originally chosen in 2026-03 for lower RAM, turbopack-on-16.2 measurably fits in the same 4 GB heap.
- Fix `QueuePool limit of size 5 overflow 10 reached` during parallel document upload: the SQLite async engine was using SQLAlchemy's default pool (5 + 10 = 15 connections), which saturated as soon as two ingestion background tasks pinned a `bg_db` session each for the full OCR/Whisper pipeline while the frontend simultaneously uploaded 4 files in parallel and polled `/documents/progress` + `/auth/me` + `/settings` + `/regulations` from a freshly opened dashboard. Raised SQLite-dialect `pool_size` to 20 and `max_overflow` to 30 (50 total) in `packages/api/septum_api/database.py::_engine_kwargs`; SQLite connections are just cheap file handles and WAL already serialises the real contention point (single-writer lock), so the oversized pool costs nothing. Postgres kwargs untouched.
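The pool change in the last entry above amounts to choosing engine keyword arguments by dialect. The sketch below shows the idea under stated assumptions: the function name and signature are guesses (the entry only names `_engine_kwargs`), and the Postgres branch simply returns nothing, because the entry says those kwargs were left untouched.

```python
from typing import Dict

# SQLAlchemy's QueuePool defaults, which produced the "size 5 overflow 10" error.
DEFAULT_POOL_SIZE = 5
DEFAULT_MAX_OVERFLOW = 10


def engine_kwargs(database_url: str) -> Dict[str, int]:
    """Return pool overrides for the given URL (hypothetical signature)."""
    if database_url.startswith("sqlite"):
        # Values quoted in the changelog entry.
        return {"pool_size": 20, "max_overflow": 30}
    return {}


def connection_ceiling(kwargs: Dict[str, int]) -> int:
    """Maximum simultaneous connections the pool will hand out."""
    return (kwargs.get("pool_size", DEFAULT_POOL_SIZE)
            + kwargs.get("max_overflow", DEFAULT_MAX_OVERFLOW))


if __name__ == "__main__":
    for url in ("sqlite+aiosqlite:///septum.db", "postgresql+asyncpg://db/septum"):
        print(url, connection_ceiling(engine_kwargs(url)))
```

The returned dictionary would be splatted into the engine constructor, for example `create_async_engine(url, **engine_kwargs(url))`.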
2026-04-17
- Fix web proxy target under docker-compose: Next.js bakes `rewrites()` destinations at build time, so the runtime `BACKEND_INTERNAL_URL` env var had no effect; the dashboard proxied every `/api` call to its own `127.0.0.1:8000` instead of the api container. Converted to a build-arg in `docker/web.Dockerfile`; both compose files now pass `http://api:8000` at build.
- Add Alembic migration 013 for `use_gateway` column: The Phase 5 model addition lacked a PostgreSQL migration, causing `UndefinedColumnError` on the first `SELECT` after the setup wizard. Adds the column as `BOOLEAN NOT NULL DEFAULT false`.
- Eliminate AI rule duplication across tools: Removed `.claude/rules/` (5 files), `.cursor/rules/` (12 `.mdc` files), `.cursor/skills/`, and the `new-ingester` / `new-recognizer` / `new-regulation` skill templates. `CLAUDE.md` is now the single source of truth for all AI tools; Cursor gets a 3-line `.cursorrules` pointer. New tool = one pointer file, zero content duplication.
- Add Redis authentication + eliminate curl|bash supply chain risk: All four compose files now start Redis with `--requirepass` and every `redis://` URL includes the password via `${REDIS_PASSWORD:-septum_redis}` substitution. The standalone Dockerfile's `curl | bash` NodeSource install is replaced with a multi-stage `COPY --from=node:20-alpine`; no remote scripts executed at build time.
- Security hardening (18 findings from `/security-scan`): Bumped 8 vulnerable Python deps (cryptography, PyJWT, python-multipart, pillow, langchain-*, pdfminer-six, pytest) and fixed 7 npm picomatch vulns via `npm audit fix --force`. Added `SecurityHeadersMiddleware` (X-Content-Type-Options, X-Frame-Options, Referrer-Policy, Permissions-Policy). Replaced `pickle.loads` with `json.loads` in BM25 retriever (eliminates code execution risk). Guarded 5 setup router endpoints with `_require_setup_phase()` so test-database, test-redis, whisper-status, install-whisper, and check-update return 403 after setup completes (closes SSRF vector). Changed default log level from DEBUG to INFO. Parameterized Postgres password in compose files via `${POSTGRES_PASSWORD:-septum_secret}`, bound Postgres port to localhost only, added `*.pem` / `*.key` to .gitignore, and applied `os.chmod(0o600)` to config.json writes.
- Promote Ollama to a default service in `docker-compose.yml`: Ollama now starts alongside Postgres and Redis on `docker compose up` instead of requiring `--profile ollama`. Added `OLLAMA_BASE_URL` + healthcheck-gated `depends_on` to the api service. New `docker-compose.no-ollama.yml` override disables it via `!reset null` for cloud-only deployments.
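The `pickle.loads` → `json.loads` swap in the security-hardening entry above matters because unpickling can execute code embedded in the payload, while JSON parsing only ever yields plain data. The sketch below assumes the persisted BM25 state is a tokenized corpus; what the retriever really stores is not stated in the entry, and `dump_index` / `load_index` are illustrative names.

```python
import json
from typing import List


def dump_index(tokenized_corpus: List[List[str]]) -> bytes:
    """Serialize a (hypothetical) BM25 corpus as UTF-8 JSON."""
    payload = {"version": 1, "corpus": tokenized_corpus}
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


def load_index(blob: bytes) -> List[List[str]]:
    """Parse and shape-check the payload; nothing in it can run code."""
    payload = json.loads(blob)  # raises ValueError on non-JSON input
    corpus = payload.get("corpus") if isinstance(payload, dict) else None
    well_formed = isinstance(corpus, list) and all(
        isinstance(doc, list) and all(isinstance(tok, str) for tok in doc)
        for doc in corpus
    )
    if not well_formed:
        raise ValueError("unexpected BM25 payload shape")
    return corpus
```

A side effect worth noting: caches written by the old pickle path will fail to parse and need to be rebuilt once, since the two formats are not compatible.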
2026-04-16
- CPU / GPU image variants for torch-dependent images: Split
byerlikaya/septumandbyerlikaya/septum-apiinto CPU (default, multi-arch, ~250 MB torch wheel) and GPU (linux/amd64 only, full CUDA runtime) variants via aTORCH_VARIANTbuild-arg. CPU keeps:latest+ rolling tags; GPU floats on:gpuwith-gpu-suffixed rolling aliases. Other images (web,gateway,audit,mcp) stay CPU-only —docker/mcp.Dockerfilenow explicitly callsinstall-torch.sh cpubefore the editable install ofpackages/core[transformers], so the mcp image no longer accidentally drags in ~5 GB of CUDA shared libs through the default PyPI torch wheel (GPU offers no benefit for the stdio-attached short-call pattern; users needing GPU-accelerated NER runseptum-apiinstead). Local./dev.shis unaffected — it picks up whatever torch variant PyPI serves for the host (CUDA on NVIDIA Linux, MPS on Apple Silicon, CPU elsewhere). - Git-tag based release process + modular Docker publish: Replaced the
VERSIONfile +[release]commit gating + bot auto-bump with a git-tag trigger (git tag v0.2.0 && git push --tags). Workflow now builds all six images via an 8-entry matrix, stamps OCI labels, and creates a GitHub Release with autogenerated notes.VERSIONis a build-arg;_app_version()reads the stamped/app/VERSIONor falls back to0.0.0-dev. CLAUDE.md grows a "Release process" section. - Remove the
backend/shim layer entirely (Phase 8): Seven months after the Phase 3a scaffold introduced thebackend/app/*re-export shims, every caller has migrated to directseptum_api.*imports and the shim layer is now dead weight. Phase 8 lands in six ordered slices: (6.1)backend/tests/(33 files + conftest + factories) moves topackages/api/tests/viagit mvwith history preserved, everyfrom app.X/from backend.app.Ximport rewritten tofrom septum_api.X(including string patch paths in@patchdecorators),backend/pytest.inimerged intopackages/api/pyproject.toml[tool.pytest.ini_options]filterwarnings block,BACKEND_ROOT + sys.path.insertremoved from conftest (editable install resolvesseptum_apicleanly), two__init__.pys removed to fix thetestsnamespace collision between core and api,test_policy_composer.py→test_policy_composer_api.pyrename. (6.2)backend/alembic/,backend/alembic.ini,backend/scripts/move topackages/api/withenv.py+scripts/*.py+docker-entrypoint.shimport paths rewritten toseptum_api.*. (6.3)backend/requirements.txt→packages/api/requirements.txtas a pure rename (extras split deferred to a follow-up refresh). (6.4) 10 shim files underbackend/app/**/__init__.py+backend/app/{bootstrap,config,database,main}.pydeleted (forwarders that had no callers left). (6.5) 3 Dockerfiles (api,standalone, rootDockerfile) +.github/workflows/tests.yml+dev.sh+.claude/pre-commit-check.shrepointed at the new paths.WORKDIR /app/backend→WORKDIR /app/packages/api;PATH=/app/backend/.venv→PATH=/app/.venv; standalone start.shcd /app/backend→cd /app/packages/api; CI droppedworking-directory: backendand switched Ruff/Bandit to scan all 6 package source roots. Docker builds verified: api and standalone both produce working images withfrom septum_api.main import apploading cleanly. 
(6.6)backend/directory removed; remaining tracked content (docs/REGULATION_ENTITY_SOURCES.md,migrations/002_add_chunk_field_metadata.sql,.coveragerc,.dockerignore) relocated or deleted; docstring sweep updated 16 recognizer files + 9 rule/skill files + ARCHITECTURE.md + ARCHITECTURE.tr.md + CLAUDE.md + both READMEs to point at modular paths. 446 / 446 backend+modular tests + 18 / 18 frontend Jest tests pass at every step. - Tighten Phase 8 hygiene after simplify-pass review: F1 — fix
.coveragercsource = app(stale shim reference) →source = septum_api; coverage runs were silently measuring nothing since the Phase-8 move. F2 — add the missing CHANGELOG entry for Phase 8 itself (project rule: changelog update in the same commit as the code change). F3 — sweep the remainingbackend/-era narrative out of ten file-docstrings, comment blocks, and skill files:septum_api/__init__.py("the real FastAPI app still lives inbackend/app/"),packages/api/pyproject.toml("heavy libraries stay inbackend/"),packages/api/README.md(Phase-3a shim narrative in "Status" block),utils/text_utils.py+services/anonymization_map.py+services/national_ids/__init__.py(docstrings still claimed "re-exports forfrom app.*imports"),.claude/pre-commit-check.sh("Both the legacybackend/apppath and the new…" comment where only one path now exists),scripts/reprocess_entity_detections.py(BACKEND_DIR var + "launched frombackend/" comment),scripts/test_ollama_layer.py("Run from backend:" usage hint),tests/test_recognizer_packs.py(docstring pointing atbackend/app/seeds/), and 4.claude/skills/*/SKILL.md+ 1.cursor/skills/*/SKILL.mdstill instructing users to editbackend/tests/test_*.pypaths that no longer exist. F4 — rootDockerfilewas a 158-line byte-identical copy ofdocker/standalone.Dockerfile(the two files contradicted each other: the standalone header said standalone was canonical, yet the root was never deleted). Removed the root copy and repointed.github/workflows/docker-publish.ymlto./docker/standalone.Dockerfiledirectly — a symlink was considered and rejected because Windows checkouts withcore.symlinks=falseturn it into a text file that breaks the Docker build. 
F5 — packages/api/docs/REGULATION_ENTITY_SOURCES.md moved to packages/core/docs/ (its natural home — the 16 regulation recognizer docstrings in septum-core all point at it, and septum-core is a zero-dependency package that cannot reference api-side paths without inverting the zone wall); 16 recognizer docstrings, both READMEs, ARCHITECTURE docs, CLAUDE.md, .claude/rules/git-and-changelog.md, and .claude/pre-commit-check.sh all updated in lockstep. F6 — CI backend-tests job installed all 6 packages (core + queue + api + mcp + gateway + audit); dropped mcp / gateway / audit since packages/api/tests only exercises api's dependency closure and those packages are covered by the separate modular-tests job. Skipped: A2-F10 (alembic legacy/ subdir naming — cosmetic), A2-F11 (tests/factories/__init__.py blank file — confirmed legitimate by pytest import resolution), A2-F12 (extract install-all-packages.sh from dev.sh + CI — bigger refactor, low urgency), A2-F13 (narrative Dockerfile comment — marginal), A3-F2/F3/F5/F6/F7/F9/F10 (premature/microscopic for Septum's scale).
- Tighten Phase 7 hygiene after simplify-pass review: F1 — extract the duplicated
_build_queue(topic) helper from septum_gateway/worker.py + septum_audit/worker.py into a single backend_from_env(topic) in septum_queue/__init__.py. Redis URL vs file-dir dispatch now lives in one place; both workers shrink to a single-line backend construction. septum-queue is the natural home because it already owns QueueBackend and both workers already depend on it — no zone-wall violation. F2 — drop dead if TYPE_CHECKING: from septum_queue import QueueBackend blocks from both workers (forward-ref only needed when the annotation is stringified, but from __future__ import annotations already makes all annotations strings). F3 — drop unused argv: list[str] | None = None parameter from both main() entry points (speculative API surface; __main__.py always calls main() with no args). F4 — shrink both worker module docstrings from 3-4 paragraphs to a single-line summary (CLAUDE.md strict rule; the SystemExit error message already documents the env var contract at the point where operators actually hit it). F5 — drop the curl apt-get install from docker/gateway.Dockerfile and docker/audit.Dockerfile and standardize every HTTP HEALTHCHECK (both Dockerfile HEALTHCHECK directives and matching compose test: entries) on python -c "import urllib.request; urllib.request.urlopen('http://.../health')" — python is already in the runtime image, curl install was ~3 MB of wasted layer weight. F6 — add YAML anchors (x-gateway-base, x-audit-base + inline env &gateway-env / &audit-env) to docker-compose.gateway.yml so the gateway-worker / gateway-health pair and audit-worker / audit-api pair no longer duplicate their 10-line build / image / env blocks; ~40 lines removed from the file. F7 — drop the 3-line decorative preamble from .dockerignore (unverifiable "shave several hundred MB" claim) and the narrative # Default: run the worker. Override CMD with uvicorn... comment from docker/gateway.Dockerfile (restates what compose files already do).
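
The Redis URL vs file-dir dispatch from F1 can be sketched roughly as follows. This is a minimal illustration, not the actual backend_from_env(topic): the function name select_queue_backend is hypothetical, it returns a (kind, target) tuple instead of constructing a QueueBackend, and treating blank values as unset is an assumption. The environment variable names and the precedence rule are the ones this changelog states for the worker entrypoints.

```python
import os
from typing import Mapping, Optional, Tuple


def select_queue_backend(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Report which queue backend the environment selects.

    Hypothetical stand-in for backend_from_env(topic): the real helper
    builds a backend object, this sketch only names the choice.
    """
    env = os.environ if env is None else env
    # Assumption: blank values are treated the same as unset variables.
    url = env.get("SEPTUM_QUEUE_URL", "").strip()
    directory = env.get("SEPTUM_QUEUE_DIR", "").strip()
    if url:
        # The Redis URL wins when both variables are set.
        return ("redis", url)
    if directory:
        return ("file", directory)
    # Neither variable set: fail fast rather than defaulting silently.
    raise SystemExit(
        "No queue configured: set SEPTUM_QUEUE_URL or SEPTUM_QUEUE_DIR"
    )
```

Passing a plain dict instead of os.environ keeps the dispatch testable without monkeypatching the process environment.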
Tests: the 5 test_gateway_worker.py + test_audit_worker.py cases are deleted and replaced by a single packages/queue/tests/test_backend_from_env.py covering file-dir dispatch, missing-env SystemExit, and Redis URL precedence — one test file for one shared function. Skipped: A1-F2 / A2-F3 run_consumer_forever extraction (would couple septum-queue to consumer semantics, scope creep), A2-F5 except Exception narrowing around queue.close() (code already logs via exc_info=True — not truly silent; A3 confirmed no leak), A2-F10 fail-fast on missing API keys (envelope-first deployment is a legitimate pattern), A2-F15 standalone heredoc extraction (drift risk with top-level Dockerfile), A2-F17 shared Dockerfile base (per-image specialization diverges immediately after FROM), A3-F4 drop build-essential from api.Dockerfile (untested build, risk > reward). 10 files changed, +83 / −253 net. 446 backend+modular + 18 frontend tests pass (two worker test files removed; functionality covered by the 3 backend_from_env tests).
- Mark Phase 7 complete in PROJECT_SPEC (Phase 7 — closeout):
PROJECT_SPEC.md Phase 7 flips to "✓ DONE" with per-item check marks — 6 Dockerfiles (api + web + gateway + audit + mcp + standalone) + .dockerignore, 4 compose variants (full dev + airgap + gateway + standalone) verified via docker-compose config, Docker HEALTHCHECK on every HTTP image, worker CLI entrypoints (python -m septum_gateway / -m septum_audit) + SIGINT/SIGTERM graceful shutdown. All 7 phases complete: monolithic Septum is now split into 7 independent modules (core, mcp, api, web, queue, gateway, audit), each with its own pyproject.toml / Dockerfile and deployable on its own. Air-gapped zone (core + mcp + api + web) has no internet access; bridge (queue) carries masked data only; internet-facing zone (gateway + audit) cannot import septum-core by code-review invariant. 448/448 backend+modular + 18/18 frontend Jest tests pass.
- Document deployment topologies in both READMEs + CLAUDE.md (Phase 7 — docs slice): New "Deployment Topologies" subsection under "Docker Compose" in
README.md / README.tr.md gives a 4-row table mapping each compose variant (standalone / full dev / airgap / gateway) to its host count, zone-split property, and when-to-use guidance; follow-up paragraph explains that a true air-gapped deployment runs airgap.yml + gateway.yml on two hosts pointing at the same Redis over a VPN and that only masked text crosses the queue. Per-module Dockerfile list added so operators running custom orchestrators (K8s / Nomad / ECS) know which images they can cherry-pick. CLAUDE.md Docker block expanded to show the three modular compose commands alongside the standalone docker run and docker compose up; the "refactor/modular-architecture branch" caveat is dropped (the files are now on main-track). README structure parity preserved across both locales: 79 table rows, 18 H3 headings in each.
- Add 4 compose variants covering every deployment topology (Phase 7 — compose slice):
docker-compose.airgap.yml runs only the air-gapped zone (api + web + postgres + redis); USE_GATEWAY_DEFAULT=true + SEPTUM_QUEUE_URL=redis://redis:6379/0 wire the api into gateway mode so cloud LLM calls leave the host only via Redis Streams. docker-compose.gateway.yml runs only the internet-facing zone (gateway-worker + gateway-health + audit-worker + audit-api + redis); two containers share each image — one runs python -m septum_gateway / -m septum_audit (stdio worker), the other runs uvicorn septum_gateway.main:create_app --factory / septum_audit.main:create_app for /health + /api/audit/export. docker-compose.standalone.yml is the all-in-one (one container from docker/standalone.Dockerfile, SQLite, no external services) — the simplest install. The existing docker-compose.yml is rewritten as the full dev stack on one host: every module + postgres + redis + optional Ollama profile; it's the fastest path from git clone to a working local Septum. Every compose file has depends_on: condition: service_healthy so a fresh up waits for Redis / Postgres before the first consumer starts — no sleep 10 shell races. All four files validate clean via docker-compose config.
- Split the monolithic image into per-module Dockerfiles (Phase 7 — Dockerfile slice): New
docker/api.Dockerfile ships only the Python backend (FastAPI + SQLAlchemy + Presidio + torch); the Next.js dashboard moves to docker/web.Dockerfile which takes a build-arg NEXT_PUBLIC_API_BASE_URL so a split deployment can point the bundled dashboard at a separate api origin. New docker/gateway.Dockerfile and docker/audit.Dockerfile are deliberately lightweight — no torch, no Presidio, no spaCy; just septum-queue[redis] + septum-gateway[server] / septum-audit[queue,server] — so each internet-facing image lands around 100 MB instead of the ~4 GB the api image needs. Critical invariant: neither gateway nor audit Dockerfile copies packages/core/, enforcing the "no septum-core in the internet-facing zone" rule at the image layer. New docker/mcp.Dockerfile bundles septum-mcp + septum-core[transformers] for the rare case where an orchestrator runs the stdio MCP server as a subprocess container (most users still use uvx septum-mcp locally, so no MCP HEALTHCHECK — the parent orchestrator observes subprocess exit codes). The legacy top-level Dockerfile keeps building the combined standalone image unchanged (still published as byerlikaya/septum:latest); a matching docker/standalone.Dockerfile is the canonical modular-naming copy. New .dockerignore at the repo root shaves a few hundred MB off every build context by excluding .git/, node_modules/, .next/, backend runtime state dirs, local *.db, docs screenshots, and the docker/ + compose files themselves (never needed inside an image).
- Add worker CLI entrypoints for
septum-gateway and septum-audit (Phase 7 — worker slice): New septum_gateway/worker.py + __main__.py and septum_audit/worker.py + __main__.py so python -m septum_gateway / python -m septum_audit boot a long-running process — the prerequisite for Dockerizing either module. Queue backend picked from the environment: SEPTUM_QUEUE_URL=redis://host:6379/0 selects RedisStreamsQueueBackend.from_url(...), SEPTUM_QUEUE_DIR=/srv/septum/queue selects FileQueueBackend(...); Redis URL wins when both are set. Neither set is a SystemExit with a clear error — an air-gapped deployment without a declared queue is almost always a misconfiguration, silent defaulting would mask it. Graceful shutdown via SIGINT / SIGTERM handlers that set an asyncio.Event, cancel the run_forever task, and close every queue/sink handle in a finally. 5 new tests cover file-backend selection, missing-env → SystemExit (SEPTUM_QUEUE_URL), and Redis URL taking precedence (using fakeredis.aioredis monkeypatched into redis.asyncio.Redis.from_url). test_worker.py renamed to test_gateway_worker.py / test_audit_worker.py to avoid the Phase 5/6 basename-collision pattern.
- Tighten Phase 6 hygiene after simplify-pass review: F1 — drop a dead
if TYPE_CHECKING: pass block and an Optional[QueueBackend] import in gateway/response_handler.py; X | None is the codebase standard. F2 — drop _import_queue_backend() from audit/consumer.py; the fail-fast wrapper was redundant because any deployment without septum-queue installed already fails earlier at the from septum_queue import ... caller site, and the test suite's pytest.importorskip("septum_queue") handles the dev path. F3 — collapse the three exporters' copy-pasted to_string() / write() bodies into a shared BaseExporter(ABC) with a single iter_chunks(records) -> Iterator[str] streaming primitive; concrete exporters now only implement iter_chunks (~15 LOC saved, removes three from io import StringIO imports, and makes the next exporter addition one subclass + two class attrs). F4/F5 — strip narrating module / class docstrings across every new Phase 6 module (events / sink / retention / config / consumer / main / exporters / gateway response_handler); per CLAUDE.md the rule is "docstrings required, but redundant/obvious/decorative comments must be avoided", and every module had 2–6 sentences of backstory that belonged in the README or commit message. Inline comments that restated the code (e.g. "# Run the blocking write under to_thread so the async consumer loop is never stalled", "# Nothing to release; kept to satisfy the protocol") are deleted; only genuine non-obvious WHY lines are kept (PIPE_BUF / logrotate note on JsonlFileSink, "truncated tail of an actively-written log" note on _iter_records). F6 — /api/audit/export format query param is now typed as Literal["jsonl", "csv", "siem"] so FastAPI surfaces it as an OpenAPI enum and returns a 422 with the allowed set for invalid values; the try/except KeyError branch and its HTTPException(400) are gone. F7 — MemorySink(initial_records=...) constructor replaces the sink._records.append(r) private-attribute poke that test_app.py::_build was doing; the helper now seeds the sink through the public constructor.
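
The shared-base pattern from F3 can be illustrated with a small sketch. It assumes the two class attributes are a format name and a content type, and the JsonLinesExporter subclass is a hypothetical example, not the package's actual exporter; only BaseExporter, iter_chunks, to_string, and write are names taken from the entry above.

```python
import io
import json
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterable, Iterator, TextIO

Record = Dict[str, Any]


class BaseExporter(ABC):
    # Assumed shape of the "two class attrs" a subclass sets.
    format_name: str = ""
    content_type: str = ""

    @abstractmethod
    def iter_chunks(self, records: Iterable[Record]) -> Iterator[str]:
        """Yield the export one text chunk at a time."""

    def to_string(self, records: Iterable[Record]) -> str:
        # Shared body: join the streamed chunks into one string.
        return "".join(self.iter_chunks(records))

    def write(self, records: Iterable[Record], stream: TextIO) -> int:
        # Shared body: stream chunks to a file-like object, return the count.
        count = 0
        for chunk in self.iter_chunks(records):
            stream.write(chunk)
            count += 1
        return count


class JsonLinesExporter(BaseExporter):
    """Hypothetical subclass: one JSON object per line."""

    format_name = "jsonl"
    content_type = "application/x-ndjson"

    def iter_chunks(self, records: Iterable[Record]) -> Iterator[str]:
        for record in records:
            yield json.dumps(record, sort_keys=True) + "\n"
```

Because to_string() and write() live on the base class, a new format only has to say how one record becomes one chunk.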
F8 — rename test_consumer_drops_malformed_payload_without_crashing to test_consumer_nacks_when_sink_write_fails (the test was actually exercising the sink-failure → re-queue path, not malformed payloads) and delete the 6-line narrative comment block explaining the mismatch. F9 — delete JsonlFileSink.aread_all and its test; no production caller used it, and a sync read_all() already works for the FastAPI export path via asyncio.to_thread. F10 — /api/audit/export switches from a Response(content=full_body) to StreamingResponse driven by a _stream_export async generator that pulls each chunk through asyncio.to_thread so large dumps never block the event loop and never materialize the whole body in memory; the X-Audit-Record-Count header is dropped (cannot be known before the stream starts). F11 — replace if not self._path.exists(): return iter(()) in JsonlFileSink.read_all and apply_retention_to_jsonl with try: open(...); except FileNotFoundError: return iter(()) / return 0; removes the TOCTOU race and one stat syscall per call. Skipped: A1-F3 (_now() / _new_id() duplication with septum_queue.models — justified by the zero-dep contract of each package), A1-F5 (lazy __getattr__ triplication across audit/queue/gateway — justified by per-package independence), A2-F8 (moving AuditRecord shape into septum-queue to unify the gateway's hand-rolled audit dict — cross-package refactor, defer), A2-F9 (status vs event_type redundancy — both fields are real filter axes for downstream consumers), A3-F2/F3/F4/F6/F7 (premature optimization at chat-scale traffic). 15 files changed, +148 / −411 net. 443 backend+modular tests pass.
- Document the Phase 6 split in
README.md / README.tr.md / CLAUDE.md / PROJECT_SPEC.md: Both READMEs flip septum-audit from "Planned" / "Planlanıyor" to "Released" / "Yayında" in the Package Layout table, with the audit row noting the JSON / CSV / Splunk HEC exporters, age + count retention, the optional queue consumer ([queue] extra), the FastAPI /health + /api/audit/export ([server] extra), and the no-septum-core-import invariant. CLAUDE.md Modular Packages command block grows install lines for [queue] / [server] extras and a pytest packages/audit/tests/ invocation. PROJECT_SPEC.md flips Faz 6 to "✓ TAMAMLANDI" (done) with per-item check marks plus a final tally line. README structure parity preserved (identical heading and table-row counts across both locales).
- Add FastAPI
/health and /api/audit/export endpoints behind the [server] extra (Phase 6 — audit FastAPI slice): New septum_audit/main.py exposes create_app(config, *, sink=None) mirroring the gateway factory pattern — deployment code passes a sink (or accepts the default JsonlFileSink(cfg.sink_path)) and the FastAPI dependency itself is lazy-imported so a queue-only audit deployment never pulls fastapi/uvicorn. /health reports the configured audit_topic / sink_path / supported formats so an operator can probe wiring without touching the audit log. /api/audit/export?format=jsonl|csv|siem streams the current sink contents through the matching exporter, sets the right Content-Type (application/x-ndjson / text/csv / application/json) and Content-Disposition (attachment; filename="septum-audit.{ext}"), and adds an X-Audit-Record-Count header so a downstream pipeline can verify the dump size without parsing the body. The format string is matched via a single _EXPORTERS dict so adding Loki line protocol or OTLP later is a one-line change. Unknown formats return 400 with the choice list rather than a generic 422; format matching is case-insensitive. 7 new tests cover health, jsonl/csv/siem export shapes (including content-type / disposition / record-count assertions), unknown-format 400, case insensitivity, and empty-sink zero-count behavior. test_app.py kept under that name since no other audit/gateway/queue test shares the basename.
- Wire optional audit hook into
GatewayConsumer so each handled request emits a PII-free telemetry envelope (Phase 6 — gateway audit hook slice): New audit_queue: QueueBackend | None parameter on GatewayConsumer; when set, every handled message produces a small dict on the audit topic after the response publish completes — source: "septum-gateway", event_type: "llm.request.completed" or "llm.request.failed", correlation_id, plus an attributes block carrying provider, model, status, latency_ms (monotonic-clock measured around the forwarder call), message_count, optional max_tokens, and the error string on failure. Discipline check: the envelope contains no prompt content, no response text, no api keys, and no base URLs — only metadata an internet-facing observer is allowed to see — so the gateway can stream telemetry into septum-audit without breaking the no-PII-leaves-the-zone invariant. Audit-side failures are swallowed (warning-logged via logging.exception-style, never raised) so a transient audit-queue outage cannot block the primary request/response path. GatewayConfig grows an audit_topic: str | None field plus SEPTUM_GATEWAY_AUDIT_TOPIC env loading (empty string treated as unset); /health reports the topic so operators can tell whether telemetry is wired in. 6 new tests cover the no-audit-queue zero-overhead default, the success-path envelope shape (including the explicit "no messages / no text / no api_key" PII-discipline assertion), the error-envelope variant with the error field populated, and a broken-audit-queue resilience test that asserts the response still goes out and a warning is logged.
- Add
AuditConsumer so septum-audit can ingest events from a queue topic (Phase 6 — queue consumer slice): New septum_audit/consumer.py mirrors the GatewayConsumer shape (run_once(block_ms=...) / run_forever) and persists each delivered message into the configured AuditSink after rebuilding it through AuditRecord.from_dict. The septum-queue import is gated behind the [queue] extra and forced at construction time so a misconfigured deployment fails fast at startup rather than deep inside the loop. Failure handling matches the gateway's: malformed payloads ack-and-drop with a logged error (so one poison pill cannot stall the audit pipeline), but sink write failures nack-with-requeue so a transient disk-full / permission glitch does not lose events. 4 new tests using the FileQueueBackend cover memory-sink and jsonl-sink round-trips, empty-queue run_once returning False, and the sink-failure → re-queue path. test_consumer.py renamed to test_audit_consumer.py to avoid pytest basename collision with packages/gateway/tests/test_consumer.py.
- Scaffold
septum-audit with records, sinks, exporters, and retention (Phase 6 — audit core slice): New packages/audit/ package persists already-masked compliance event records and ships them to downstream SIEM pipelines. Like the gateway, it lives in the internet-facing zone and — by explicit dependency-wall invariant — never imports septum-core, so raw PII cannot land in the audit store even via a typo. AuditRecord envelope (id, timestamp, source, event_type, correlation_id, attributes) is the single shape every component round-trips. AuditSink Protocol with two bundled implementations: JsonlFileSink (append-only newline-delimited JSON, POSIX-safe concurrent writes via O_APPEND + asyncio.Lock, logrotate-friendly because each line opens-and-closes the file) and MemorySink (snapshot-iterating list for tests + ephemeral counts). Three exporters cover the SIEM matrix: JsonExporter (jsonl, content-type application/x-ndjson), CsvExporter (RFC 4180 with attributes JSON-flattened into one cell), SplunkHecExporter (HEC envelope with time / host / source / sourcetype / event plus optional index). RetentionPolicy(max_age_days, max_records) with apply_retention_to_jsonl(path, policy, *, now=None) does an atomic in-place rewrite via .tmp sibling + os.replace, so a crash mid-pass leaves the original file untouched; corrupt lines count as removals. AuditConfig.from_env() reads SEPTUM_AUDIT_SINK_PATH / SEPTUM_AUDIT_TOPIC / SEPTUM_AUDIT_RETENTION_DAYS / SEPTUM_AUDIT_RETENTION_MAX_RECORDS, treating empty strings as unset. Exporters are lazy-imported via PEP 562 __getattr__ so a stdlib-only audit pipeline never pays the csv / io cost. 28 tests cover envelope round-trip, snapshot-safe iteration, concurrent writes, missing files, blank/corrupt line handling, all three exporter shapes, retention age + count + combined caps, and env-var overrides.
- Tighten Phase 5 hygiene after simplify-pass review: F1 — drop the dead
try/except TypeError around an await llm_router.set_gateway_client_factory(None) in test_gateway_client.py (the setter is sync, the await always raised, so the except branch ran every run as pure noise). F2 — LLMRouter._resolve_gateway_client no longer swallows factory-construction exceptions into a silent None-fallback; it lets the caller see the failure, per the project rule "never swallow errors". F3 — define a GatewayClientProtocol(Protocol) so the module-level _gateway_client_factory and _resolve_gateway_client carry a real type instead of object | None; the LLMRouter narrative comment that opened with "Phase 5 — optional gateway delegation" is gone (the Protocol + setter docstring already convey the intent). F4 — RequestEnvelope / ResponseEnvelope grow a to_dict() helper that gateway_client.publish and GatewayConsumer._handle now call instead of importing dataclasses.asdict directly, keeping serialization knowledge in one place. F5 — gateway_client._build_envelope collapses the three-branch if provider == "anthropic" / "openai" / "openrouter" into a _PROVIDER_API_KEY_FIELD dict + getattr lookup. F6 — _await_response drops the doubled delivered_for_us sentinel, returns directly on the matching correlation id, and adds a remaining_ms <= 0 guard before each consume call so a sub-millisecond deadline does not buy one wasted poll. F7 — forwarder._post_with_retries regains the body_len=… log field and the (status=…) suffix that the api-side http_client.post_with_retries already had, closing the observability gap that simplify-agent#1 flagged as drift. F8 — ForwarderRegistry.from_config picks up its missing config: "GatewayConfig" annotation via a TYPE_CHECKING import. F9 — LLMRouter._dispatch_cloud_call reads self._settings.use_gateway directly instead of the defensive getattr(..., False) fallback (the column is always present after _sqlite_ensure_columns). F10 — _now docstring corrected from the misleading "Monotonic-ish wall-clock" to plain "Wall-clock (needed for cross-host transport)".
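
The dict + getattr lookup from F5 might look roughly like the sketch below. Only the _PROVIDER_API_KEY_FIELD name and the three provider strings come from the entry above; the settings field names, the SketchSettings class, and the api_key_for helper are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Maps each provider to the settings attribute holding its key.
# The attribute names on the right are assumptions for this sketch.
_PROVIDER_API_KEY_FIELD = {
    "anthropic": "anthropic_api_key",
    "openai": "openai_api_key",
    "openrouter": "openrouter_api_key",
}


@dataclass
class SketchSettings:
    """Hypothetical stand-in for the application settings object."""

    anthropic_api_key: Optional[str] = None
    openai_api_key: Optional[str] = None
    openrouter_api_key: Optional[str] = None


def api_key_for(settings: SketchSettings, provider: str) -> Optional[str]:
    """Return the key matching the provider, or None when it is not set."""
    field_name = _PROVIDER_API_KEY_FIELD.get(provider)
    if field_name is None:
        raise ValueError(f"unknown provider: {provider!r}")
    # One lookup replaces the three-branch if/elif chain.
    return getattr(settings, field_name, None)
```

Adding a fourth provider then means adding one dict entry and one settings field, with no new branch.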
F11 — file_backend.py module docstring now explicitly tells operators that done/ is never trimmed and recommends a find done/ -mtime +7 -delete cron. F12 — strip the "Phase 5 — opt-in delegation… Phase 7 deploy flips this…" narrative on AppSettings.use_gateway. Skipped: introducing a CompletionRequest dataclass (pre-existing parameter-sprawl, scope creep), a ProviderName Literal (touches 6+ files), constructor-injecting the gateway client (Phase 7 deploy will rewire this), and pooling the httpx.AsyncClient across retry attempts (would create drift with the api-side copy that the gateway forwarder is faithful to today). 399 backend+modular + 18 frontend tests still pass.
- Rename
packages/gateway/tests/test_config.py to test_gateway_config.py: Pytest uses the file basename as the module name when the tests directory has no __init__.py, which collides with packages/mcp/tests/test_config.py on combined runs (pytest packages/mcp/tests/ packages/gateway/tests/ ...). Disambiguating the gateway filename lets the full 399-test regression (backend 289 + core 24 + mcp 39 + queue 22 + gateway 25) collect cleanly.
- Document the Phase 5 split in
README.md / README.tr.md / CLAUDE.md / PROJECT_SPEC.md: Both READMEs flip septum-queue and septum-gateway from "Planned" / "Planlanıyor" to "Released" / "Yayında" in the Package Layout table, with the queue row noting the file backend (air-gap default) and Redis Streams [redis] extra and the gateway row noting the three cloud providers, the no-septum-core-import invariant, and the optional FastAPI /health behind the [server] extra. CLAUDE.md Modular Packages command block grows install lines for [redis] / [server] extras and pytest packages/queue/tests/ / packages/gateway/tests/ invocations. PROJECT_SPEC.md flips Faz 5 to "✓ TAMAMLANDI" (done) with per-item check marks; item 1 explicitly marks RabbitMQ as deferred (rare in Septum deployment profile, future [rabbitmq] extra). README structure parity preserved (identical heading and table-row counts across both locales).
- Wire optional queue producer into
septum-api for gateway-mode cloud LLM calls (Phase 5 — api producer slice): New AppSettings.use_gateway: bool = False column with a matching _sqlite_ensure_columns migration so existing databases pick up the flag on next start; build_default_app_settings() seeds it from USE_GATEWAY_DEFAULT. New septum_api/services/gateway_client.py owns the producer-side half: GatewayClient.complete(...) publishes a RequestEnvelope built from the current settings (provider-matched api_key threaded through so the gateway never needs its own secrets in a split deployment), then consumes the response topic until the matching correlation_id arrives, re-queuing any other waiter's reply along the way. Gateway-side error envelopes map to LLMRouterError and missing replies after timeout_seconds raise QueueTimeoutError — both fall through to the existing LLMRouter._fallback_via_ollama path so the user-visible failure mode is identical regardless of whether the cloud call went direct or via gateway. LLMRouter gains a _dispatch_cloud_call seam that consults use_gateway + a process-wide _gateway_client_factory (installed by deployment code, left unset in tests); when the factory is absent or raises, the router logs a warning and falls back to the direct-call path. This keeps 289/289 backend tests green without needing a queue backend in any existing fixture. 6 new tests cover the round-trip, error mapping, timeout, provider-matched api_key threading, and factory registration; the direct-call default path is already covered by the existing suite.
- Scaffold
septum-gateway with cloud LLM forwarders and consumer loop (Phase 5 — gateway slice): New packages/gateway/ package consumes masked requests from septum-queue, dispatches them to Anthropic / OpenAI / OpenRouter via httpx, and publishes the masked answers back on a reply topic. Dependency wall: package declares septum-queue + httpx + pydantic in its required deps and — by explicit code-review invariant — never imports septum-core, so raw PII cannot slip into the internet-facing zone even via a typo. ForwarderRegistry.from_config(GatewayConfig) wires the three cloud forwarders with env-driven default keys; envelope-carried api_key / base_url always wins over the config so a split deployment where the air-gapped side owns the secrets works unchanged. GatewayConsumer.run_once() / run_forever() pair each request with a ResponseEnvelope by correlation id; forwarder errors, unknown providers, malformed payloads, and arbitrary exceptions all funnel into error envelopes rather than taking down the loop. FastAPI /health endpoint lives behind the [server] extra so a bare worker process does not pull fastapi + uvicorn. README documents the provider table, env vars, and the no-core-import invariant. 25 tests (respx-mocked httpx for happy path / missing key / base_url override / 5xx retry / OpenRouter branding headers / unknown provider / registry substitution + file-queue + consumer round-trip for success / error envelope / unknown provider / malformed payload / unexpected exception + /health smoke) all pass. Drop the stray tests/__init__.py in both packages/queue/ and packages/gateway/ — having both makes pytest collapse the two tests namespaces during combined runs.
- Add
RedisStreamsQueueBackend for shared-infrastructure Septum deployments (Phase 5 — Redis backend slice): Consumer-group backed queue using XADD / XREADGROUP / XACK so multiple gateway instances can consume from a single stream with at-least-once semantics. Each topic maps to one stream (septum:{topic}) and each consumer joins a named group (gateway by default); if a consumer dies mid-processing the entry stays on the pending entries list until another consumer claims it. Payloads go into a single data field holding JSON text rather than fanned-out hash fields — that way one codec works for every envelope shape and nested payloads survive the round trip without manual flattening. First publish implicitly runs XGROUP CREATE … MKSTREAM so cold streams work without operator provisioning; BUSYGROUP from a racing consumer is silently swallowed. Nack-with-requeue simulates the missing native primitive by XRANGE-reading the entry, XADD-ing a fresh copy, then XACK-ing the original. from_url("redis://…") constructor mirrors the api side. redis.asyncio stays a lazy import gated behind the [redis] extra so a stdlib-only install never touches it. 7 tests using fakeredis.aioredis cover round-trip, cold-stream XGROUP creation, two-consumer group isolation (10 messages, no duplicates), ack / nack requeue / nack drop, and non-blocking empty consume. pytest.importorskip skips the suite gracefully when neither redis nor fakeredis is installed.
- Add
FileQueueBackend for air-gapped Septum deployments (Phase 5 — file backend slice): Stdlib-only concrete backend that persists one JSON payload per file across incoming/, processing/, done/ sibling directories inside a per-topic root. Atomic os.replace is the entire synchronization primitive — POSIX rename is atomic within a single filesystem, so claiming a message is just "move from incoming/ to processing/ and whichever racing consumer wins the rename wins the message." Publisher writes to a .json.tmp sibling first and then atomic-renames into place so a half-written JSON blob is never visible to a racing consumer. Supports async-native publish / consume / ack / nack(requeue=bool) / close via asyncio.to_thread so the blocking directory I/O does not stall the event loop. Zero infrastructure dependency — an air-gapped deployment that ships septum-api + septum-gateway on the same volume needs nothing else, and an operator debugging a stuck request can literally ls processing/ to see which correlation ids are in flight. At-least-once delivery; callers dedupe on correlation_id. 10 tests cover round-trip, FIFO order, nack requeue / drop, restart persistence, two-consumer race (30 messages, no duplicates), idempotent double-ack, closed-backend errors, and .tmp partial-write skip.
- Scaffold
septum-queue with abstract transport interface and envelope models (Phase 5 — queue interface slice): New packages/queue/ package defines the cross-zone bridge between the air-gapped septum-api and the internet-facing septum-gateway. QueueBackend Protocol (publish / consume / ack / nack / close) gives both sides a backend-agnostic surface; QueueSession async context manager ensures deterministic cleanup. RequestEnvelope and ResponseEnvelope dataclasses shape every payload that crosses the boundary — JSON-serializable, correlation-id paired, mutually-exclusive text / error fields on the response. Zero runtime deps on the core package (stdlib only); concrete backends (file, Redis streams) gate behind optional extras and are lazy-imported via PEP 562 __getattr__ so stdlib-only installs never touch redis.asyncio. 5 envelope round-trip tests pass.
- Add a README for
packages/web/ and mark Phase 1-3 complete in PROJECT_SPEC: packages/web/ was the only modular package without a README, violating Section 12's "README güncel" ("README up to date") Definition of Done. New doc covers the Next.js 16 / React 19 stack, install and script commands, both deployment topologies (single-container via BACKEND_INTERNAL_URL + Next.js rewrites vs. split deployment via build-time NEXT_PUBLIC_API_BASE_URL + backend-side FRONTEND_ORIGIN), the src/ directory layout, and the "no direct fetch in components" rule. PROJECT_SPEC.md flips Faz 1, 2, 3 to "✓ TAMAMLANDI" (done) with per-item check marks; Faz 3 item 7 (modular docker-compose variants) is explicitly marked as deferred to Faz 7 where the four variants are already planned. Faz 2 item 4 is clarified as "stdio ✓, SSE ileriye bırakıldı" ("stdio ✓, SSE deferred") — Claude Code / Desktop / Cursor all connect over stdio, so SSE is unused in practice.
- Tighten Phase 4 hygiene after simplify-pass review: F1 —
_resolve_cors_origins (sandwiched between two app.add_middleware() calls in septum_api/main.py) moves to a dedicated septum_api/utils/cors.py module and the new resolve_cors_origins is imported at the CORS registration site, so middleware setup stays declarative and the helper is testable on its own. F2 — packages/web/src/lib/api.ts factors the trailing-slash strip into a pure resolveBaseURL(value: string | undefined) helper; the module-level baseURL calls it once. Tests now exercise the helper directly with concrete inputs instead of round-tripping through jest.isolateModules + require() to swap process.env.NEXT_PUBLIC_API_BASE_URL, so the axios instance is no longer re-instantiated three times per run. F3 — overlong block comments and narration in test_cors.py / api.test.ts trimmed; the 12-line preamble in api.ts collapses to a 6-line WHY note covering only the non-obvious bits (rewrite proxy default, build-time override, slash stripping). 18 frontend + 283 backend tests pass.
- Anchor the pre-commit secrets check so
tsconfig.json no longer trips it: The hook regex was (^\.env|config\.json$), which matches any path ending in config.json, including tsconfig.json and jsconfig.json. The Phase 4 move surfaced this when packages/web/tsconfig.json got staged for the rename and the hook flagged it as a leaked secret. Anchored the right side to a path boundary ((^|/)config\.json$) so only top-level or directory-rooted config.json files match. - Document the Phase 4 split in
README.md / README.tr.md / CLAUDE.md / PROJECT_SPEC.md: Both READMEs flip septum-web from "Planned" to "Released" in the Package Layout table, drop the "currently lives in frontend/" disclaimer, and pick up a build-time NEXT_PUBLIC_API_BASE_URL note plus a FRONTEND_ORIGIN-driven CORS note. CLAUDE.md updates the "Frontend (from frontend/)" command block to packages/web/, mentions the env var on src/lib/api.ts, and fixes the "verify version numbers against frontend/package.json" rule. PROJECT_SPEC.md checks Phase 4 (Faz 4) off with the implemented item list. README structure parity preserved (identical heading and table-row counts across both locales). - Wire
FRONTEND_ORIGIN into the FastAPI CORS allow-list (Phase 4, CORS slice): The BootstrapConfig.frontend_origin field has lived in bootstrap.py since the wizard landed but was never read by anything; septum_api/main.py had allow_origins=["*"] hardcoded. The default flips from "http://localhost:3000" to "*" (preserving the current behavior for backward compatibility) and a new _resolve_cors_origins helper parses the value as a comma-separated origin list, so split deployments can run FRONTEND_ORIGIN=https://app.example.com,https://admin.example.com and lock CORS down to just those two origins. Empty value or literal "*" still maps to the wildcard so a misconfigured deploy does not silently block every request. New tests/test_cors.py covers wildcard, single origin, comma-separated, and blank-segment cases (4 tests, parametrized to 6); test_bootstrap.py updated for the new default. 283/283 backend tests pass. - Make the dashboard API base URL configurable via
NEXT_PUBLIC_API_BASE_URL (Phase 4, env-driven URL): The baseURL constant in packages/web/src/lib/api.ts was hardcoded to "", locking the frontend to the same-origin proxy layout (Next.js rewrites in next.config.mjs forwarding /api/* to the backend). The new resolution reads process.env.NEXT_PUBLIC_API_BASE_URL at build time and strips trailing slashes so callers can keep concatenating ${baseURL}/api/... cleanly; unset still produces "" for the existing single-container Docker layout. Unblocks split deployments where packages/web and packages/api are hosted on different origins. Tests cover default, override, and trailing-slash normalization (Jest 17/17 pass); production build with the override set succeeds. - Move
frontend/intopackages/web/(Phase 4 — relocate slice): The Next.js dashboard relocates from the top-levelfrontend/directory intopackages/web/to match the modular layout ofseptum-core,septum-mcp, andseptum-api.git mvpreserves history. Tooling references updated in lockstep:dev.shcd's intopackages/webfor--setupand the dev server, the multi-stageDockerfilecopies frompackages/web/instead offrontend/, the three GitHub Actions jobs (frontend-tests,frontend-typecheck,frontend-security) now run withworking-directory: packages/web, and.gitignoreswapsfrontend/coverage/forpackages/web/coverage/. No source-file edits — Jest still passes 15/15 from the new location,tsc --noEmitis clean, and the in-container path (/app/frontend/) is unchanged so existing volume mounts andstart.shkeep working. - Document the REST API auth flows and modular package layout in both READMEs (Phase 3d): New "REST API & Authentication" section on both
README.mdandREADME.tr.mdcovers JWT login, API key creation/use/list/revoke viaPOST /api/api-keys+X-API-Keyheader, and the per-route rate-limit table (login 5/min, register 3/min, key-create 10/min, default 60/min). New "Package Layout" subsection under "For Developers" lists the seven modular packages (septum-core,septum-mcp,septum-apireleased;septum-queue,septum-gateway,septum-audit,septum-webplanned) with their zone classification (air-gapped / bridge / internet-facing). Section structure mirrored across both locales; counts diff cleanly. - Document the
X-API-Keysecurity scheme in OpenAPI / Swagger UI (Phase 3d): FastAPI auto-generates the OAuth2 scheme fromOAuth2PasswordBearerdeclared inutils/auth_dependency.py, but the API key path lives entirely inAuthMiddlewareand was therefore invisible to/openapi.jsonand the Swagger / ReDoc "Authorize" dialog. A customapp.openapi()override inseptum_api/main.pyinjects anApiKeyAuthentry undercomponents.securitySchemesso both flows show up alongside one another. Schema-only change — no runtime behavior modified. - Add API key authentication, auth middleware, and per-route rate limiting (Phase 3c): New
ApiKey ORM model with SHA-256 hashed keys, 8-char prefix lookup, per-user scoping, and optional expiry. An AuthMiddleware resolves both JWT Bearer tokens and X-API-Key headers into a User on request.state.user, letting existing Depends(get_current_user) call sites work without edits. API key CRUD router (POST/GET/DELETE /api/api-keys) lets admins create keys shown once, list by prefix, and revoke. Rate limiter refactored from inline main.py setup into middleware/rate_limit.py; sensitive endpoints get per-route limits (login 5/min, register 3/min, key-create 10/min) and API-key requests are rate-limited by key prefix instead of IP. - Move the services layer into
septum-apiwith a lazy aliasing shim (Phase 3b — services slice): Every service module (35 top-level files plus theingestion,llm_providers,national_ids, andrecognizerssubpackages) relocates frombackend/app/services/intopackages/api/septum_api/services/. The legacybackend.app.services.*namespace keeps working via asys.meta_pathaliasing finder installed by the newbackend/app/services/__init__.py, which resolves each shim-path import on demand so heavy ML imports (torch, faiss, paddle, whisper) still fire only when callers actually touch those modules and every service file is represented by exactly one module object across both namespaces. The shallowpkgutil.iter_modulespattern used for Phase 3a shims is not enough here because services has nested subpackages that Python would otherwise re-import under the shim namespace, producing duplicate classes and split singletons.services/auth.pyis now in septum_api, unblocking the Phase 3autils/auth_dependency.pyTODO. Follow-up:auth_dependencymoves in lockstep, resolving that TODO — it lives underpackages/api/septum_api/utils/alongside the other infrastructure helpers and the legacy Phase 3abackend.app.utilsiter_modulesshim picks it up automatically. 266 backend tests pass. - Move the FastAPI routers into
septum-api(Phase 3b — routers slice): All 14 router modules (approval,audit,auth,chat,chat_sessions,chunks,documents,error_logs,regulations,settings,setup,text_normalization,users) relocate frombackend/app/routers/intopackages/api/septum_api/routers/. Each of their transitive dependencies — services, models, database, utils — already lives inseptum_apiafter Phase 3a and the earlier Phase 3b slices, so the moves are a clean copy with no import edits.backend/app/routers/__init__.pybecomes aniter_modules-based aliasing shim in the Phase 3a style: routers is a flat package with no nested subpackages, so the shallow pattern is sufficient and avoids the meta_path machinery that services needed. 266 backend tests pass. - Move the FastAPI app factory into
septum-api(Phase 3b — main slice):backend/app/main.py(321 lines — lifespan, middleware wiring, top-level exception handlers, the health endpoint) relocates topackages/api/septum_api/main.py.backend/app/main.pybecomes a thin wildcard re-export shim sofrom app.main import app(used by the test suite fixtures) anduvicorn app.main:app(used bydev.sh) keep pointing at the exact same FastAPI instance that now lives in septum_api — only one app exists in the process regardless of import path._app_versionis switched from the brittleparents[2] / "VERSION"offset to a walk-up lookup so the repo-rootVERSIONfile is still found after the file relocation, and the/healthendpoint stops duplicating the version logic and calls_app_version()instead. 266 backend tests pass.
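The anchoring fix in the pre-commit secrets-check entry above can be checked with a short sketch. Both right-hand patterns are quoted from that entry; the combined new pattern is a reconstruction, and using Python's `re.search` as a stand-in for the hook's matcher is an assumption (the helper name `flagged` is illustrative only).

```python
import re

# Old pattern as quoted in the entry; the right side has no path boundary.
OLD_PATTERN = re.compile(r"(^\.env|config\.json$)")
# Reconstructed new pattern: same left side, right side anchored to start or "/".
NEW_PATTERN = re.compile(r"(^\.env|(^|/)config\.json$)")


def flagged(pattern: re.Pattern, path: str) -> bool:
    """Return True when the (hypothetical) hook would flag `path`."""
    return pattern.search(path) is not None


if __name__ == "__main__":
    paths = [
        "packages/web/tsconfig.json",
        "jsconfig.json",
        "config.json",
        "app/config.json",
        ".env.local",
    ]
    for path in paths:
        old = flagged(OLD_PATTERN, path)
        new = flagged(NEW_PATTERN, path)
        print(f"{path:30} old={old} new={new}")
```

Only the two `*config.json` lookalikes change verdict between the patterns; genuine `config.json` files and `.env*` paths are still flagged.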
2026-04-15
- Extract septum-core package with text_utils, national_ids, and anonymization_map: First slice of the modular monolith split. The three cleanest modules move to the new air-gap-safe
packages/core/ package (presidio + spaCy + pydantic + regex only, zero network deps); backend keeps thin shims that re-export from septum_core so existing imports keep working without touching call sites. - Move recognizers, policy composer, config and ports into septum-core: Second slice of the modular split. The 17 built-in regulation packs plus
base_recognizerandRecognizerRegistryare relocated underpackages/core/septum_core/recognizers/;PolicyComposer.compose_from_datamoves intoseptum_core.regulations.composerwhile the asynccompose(db)stays on the backend as a SQLAlchemy-aware shim. NewSeptumCoreConfig,SemanticDetectionPort,NullSemanticDetectionPort,DetectedSpan/ResolvedSpan/SanitizeResult, andRegulationRulesetLike/CustomRecognizerLike/NonPiiRuleLikeProtocols give the core package a database-free, network-free contract to plug Ollama and other adapters against. The backendRecognizerRegistrysubclasses the core one and injects an Ollama-backedLLMContextRecognizerfactory sodetection_method='llm_prompt'custom rules keep working without dragginghttpxinto core. - Move the detector, unmasker and the SeptumEngine facade into septum-core: Final slice of the Phase-1 monolith split.
sanitizer.py(2020 LoC) becomesseptum_core.detector.Detectorwith all four Ollama paths (_ollama_validate_pii_candidates,_ollama_pii_detection,_ollama_semantic_detection,_resolve_pronoun_coreference) replaced by a singleSemanticDetectionPortdispatch;deanonymizer.pybecomesseptum_core.unmasker.Unmaskerwith the Ollama strategy kept as a backend shim; and the newseptum_core.engine.SeptumEnginefacade wires the detector, unmasker and recognizer registry together behind aSeptumEngine(regulations=[...]).mask(text)/.unmask(text, session_id)API with an in-memory session registry. Heavy optional imports (NERModelRegistry,Detector,SeptumEngine) are lazily resolved via a PEP 562__getattr__so hosts that skip the[transformers]extra can still use the lightweight composer and recognizer primitives.non_pii_filter,span_processingandner_model_registry(plus itsdevicehelper) also move into core; backend keeps thin re-export shims. Newpackages/core/tests/gains 19 native unit tests covering the unmasker, the composer's duck-typed Protocol surface and the engine round-trip without touching the backend database. - Ship the septum-mcp package as a standalone MCP server for Claude Code / Desktop / Cursor: Phase 2 of the modular split. New
packages/mcp/wrapsseptum-corebehind a stdio MCP server (officialmcpSDK) with six local tools —mask_text,unmask_response,detect_pii,scan_file,list_regulations,get_session_map— that never touch the network;scan_filecovers.txt / .md / .csv / .json / .pdf / .docxviapypdf+python-docx.SeptumEnginegains a publicget_session_map()accessor and a TTL-based eviction loop so long-running MCP subprocesses don't accumulate anonymization maps, with engine construction deferred to the first tool call so idle cost stays near zero. Both root READMEs gain a matching "MCP Integration" section and a new Key Features bullet pointing editors at the Claude Code configuration snippet inpackages/mcp/README.md. - Extract
septum-apiinfrastructure primitives intopackages/api/(Phase 3a): Third slice of the modular split. Thebootstrap,config,database,models/,seeds/, andutils/(exceptauth_dependency) modules move frombackend/app/into the new air-gap-safeseptum-apipackage;backend/app/now ships wildcard re-export shims for the single-file modules and auto-aliasing__init__shims for the subpackages that register eachseptum_api.models.*,septum_api.seeds.*, andseptum_api.utils.*submodule insys.modulesunder its legacybackend.app.*path, so every existingfrom app.models.settings import AppSettingsstyle import resolves through the new package without any call-site edits. Routers, services, middleware, and the FastAPIappinstance stay inbackend/app/for now and migrate in Phase 3b; the full 266-case backend test suite keeps passing untouched. - Wizard regulations step defaults to all built-in packs with a select-all toggle: The wizard used to pre-check only whatever backend marked
is_active=true(in practice just GDPR), so first-run users had to manually tick 16 boxes to get full coverage. The step now defaultsactiveRegulationsto every built-in pack on load and adds a{selected} / {total}counter plus a "Select all / Deselect all" toggle above the list so deselecting is equally one-click. i18n strings added in lockstep for both locales. - Remove "Skip Setup" bypass from the first-run wizard: The welcome screen exposed a small "Skip Setup" link that silently initialised infrastructure with
database_type: "sqlite", flippedsetup_completed=true, and let the user bypass the whole flow — no LLM provider chosen, no regulations activated, no admin account created. Dropped thehandleSkipcallback and the welcome-screen link; thesetup.nav.skipi18n keys in both locales are removed. Start is now the only way forward. - Lazy-seed AppSettings on first read so partial bootstraps stop 500-spamming: When the lifespan skipped
init_db()but the admin was later created via/api/auth/register, the users table existed without theAppSettings.id=1row.GET /api/settingsraised 500 "Application settings have not been initialized" and the Error Logs UI filled with cascading identical 500s. Extracted the default-row construction intodatabase.build_default_app_settings()as the single source of truth, consumed by both_seed_defaultsat startup andload_settingson demand. Fresh DBs now self-heal on the first read; the wizard's GET/PATCH/test-llm/test-local-models flow keeps working unchanged. - Capture stack traces for 5xx HTTPExceptions in the Error Logs UI:
http_exception_handler was routing every HTTPException through log_backend_message, which writes stack_trace=None and exception_type=None, so clicking "Detay" ("Detail") on a 500 landed on a view with no stack frames and no pointer to where the exception was raised. 5xx HTTPExceptions now go through log_backend_error, which walks exc.__traceback__ via traceback.format_exception and persists the full stack. 4xx responses still use log_backend_message at WARNING level because the stack is noise there rather than signal. - Serve the NER default-model map from the backend instead of hardcoding it twice: The NER Models settings tab held its own hardcoded copy of the per-language HuggingFace model map, and it drifted out of sync the moment the backend upgraded its own
DEFAULT_MODEL_MAP(the 2026-03-12 XLM-RoBERTa refresh) — the tab advertised six wrong model IDs, including one that was literally an English model listed under French. NewGET /api/settings/ner-defaultsexposesNERModelRegistry.DEFAULT_MODEL_MAPas the single source of truth; the frontendNerModelsTabfetches it on mount with a loading and error state, the duplicatedNER_MODEL_DEFAULTSconstant is deleted.
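The stack capture described in the 5xx HTTPException entry above comes down to one standard-library call. The following is a minimal sketch, not the project's `log_backend_error`: `format_stack` and `failing_handler` are hypothetical names, and a plain `RuntimeError` stands in for FastAPI's `HTTPException` so the example has no third-party imports.

```python
import traceback
from typing import Optional


def format_stack(exc: BaseException) -> Optional[str]:
    """Render the full stack for an exception, or None if it was never raised."""
    if exc.__traceback__ is None:
        return None
    lines = traceback.format_exception(type(exc), exc, exc.__traceback__)
    return "".join(lines)


def failing_handler() -> None:
    # Stand-in for a route that fails; the real code raises HTTPException.
    raise RuntimeError("settings row missing")


try:
    failing_handler()
except RuntimeError as caught:
    # The traceback is attached to the exception object while it is handled.
    stack = format_stack(caught)
```

An exception that was constructed but never raised has no `__traceback__`, which is why the helper returns `None` in that case instead of an empty stack.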
2026-04-14
- Restore the NER Models settings tab: The per-language HuggingFace model-ID override tab was hidden in Phase 1 productization while the backend ner_model_overrides field and the NerModelsTab.tsx component stayed live but orphaned. Re-linked the component from settings/page.tsx (import, tab entry, and switch case) so operators can swap a noisy NER model for a specific language without a code release. All strings and the backend wiring were already in place.
- Drop LOCATION entirely from both NER and Presidio outputs: Multilingual NER models mis-tag common nouns and form-field headers as LOC in every language Septum supports ("Doğum" (birth), "İş" (work), "TARAFLAR" (parties) in Turkish; equivalents elsewhere), and Presidio's built-in SpacyRecognizer running en_core_web_sm over non-English text produces the same class of stochastic GPE → LOCATION mis-fires on any Title Case OOV token. Chasing these per-language with stopword lists or gazetteers cannot scale across 50+ locales. _map_ner_label now returns None for LOC, and _filter_presidio_results unconditionally drops LOCATION; address PII is captured exclusively by the deterministic StructuralAddressRecognizer and per-regulation POSTAL_ADDRESS / STREET_ADDRESS recognizers. Under GDPR Art. 4(1) a place name alone does not identify a person; the identifying anchor is PERSON_NAME, which the NER layer still detects.
- Fix blank /redoc page by pinning Redoc to the stable v2 CDN: FastAPI's default /redoc embeds https://cdn.jsdelivr.net/npm/redoc@next/... (the @next dist-tag points at the unstable Redoc v3 alpha and currently renders blank). Disabled the default via redoc_url=None and added a custom /redoc route that calls get_redoc_html with redoc@2/bundles/redoc.standalone.js pinned to the stable line.
- Per-entity-type colour palette across every PII rendering surface: entityColors.ts grew from 6 to 16 distinct Tailwind colour keys with an ordered regex classifier that maps every common PII family to its own colour; unknown types fall back to a deterministic hash % 16. The Document Preview filter chips, detected-entities list, and inline highlights all share the same per-type colour.
- Align Document Preview entity chip counts with the detected-entities list: The filter chip bar counted distinct values in anon_map.entity_map while the right-hand panel counted entity_detections rows, producing mismatched totals. Chip counts now derive from detections whenever detection rows exist; the anon-map summary is only a legacy fallback.
- Show the assembled-prompt preview in the read-only approval review modal: ApprovalModal's state-sync effect was gated on sessionId, which the review modal always passes as null, so the "Bulut LLM'e gönderilecek tam prompt" ("Full prompt to be sent to the cloud LLM") panel always showed the empty-state hint. Guard relaxed to !open so the sync runs on every modal open; live refresh is still independently gated so read-only mode never hits the backend.
- Keep chat SSE alive while waiting for user approval: With the approval gate enabled, the chat stream awaited gate.wait_for_approval with no bytes flowing, so a slow decision let Next.js' proxy drop the idle socket and the post-approval events never reached the browser. The wait now runs in a heartbeat loop that yields a : keepalive SSE comment every 15s to keep the socket hot.
- Reorder imports to satisfy ruff I001 and retrigger v0.1.10 Docker build: Yesterday's [release] push never built the image because ruff check failed on three I001 import-order violations in approval.py, chat.py, and the LGPD recognizer pack. ruff --fix reordered the imports (no runtime change) and the [release] marker retriggers the publish workflow.
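The heartbeat loop from the chat SSE entry above can be sketched with plain asyncio. This is an illustration under assumptions, not the project's gate: `wait_with_keepalive`, the `event: approval` frame, and the use of a bare `Future` in place of the approval gate are all hypothetical; only the `: keepalive` comment and the 15s default come from the entry.

```python
import asyncio
from typing import AsyncIterator, List


async def wait_with_keepalive(
    decision: "asyncio.Future[bool]", interval: float = 15.0
) -> AsyncIterator[str]:
    """Yield SSE comment frames until `decision` resolves, then one final event."""
    while True:
        # asyncio.wait returns after `interval` without cancelling the future.
        done, _pending = await asyncio.wait({decision}, timeout=interval)
        if done:
            break
        yield ": keepalive\n\n"  # SSE comment line; clients ignore it
    outcome = "approved" if decision.result() else "rejected"
    yield f"event: approval\ndata: {outcome}\n\n"  # hypothetical event name


async def demo() -> List[str]:
    loop = asyncio.get_running_loop()
    decision = loop.create_future()
    # Simulate a user who approves after 0.2s while heartbeats fire every 0.05s.
    loop.call_later(0.2, decision.set_result, True)
    return [frame async for frame in wait_with_keepalive(decision, interval=0.05)]


frames = asyncio.run(demo())
```

Using `asyncio.wait` with a timeout, instead of `asyncio.wait_for`, keeps the pending decision alive across heartbeats because nothing is cancelled when the interval elapses.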
2026-04-13
- Stop PHONE and validator drops from silently eating NATIONAL_ID: Two independent bugs made national IDs disappear. ExtendedPhoneRecognizer greedily matched any 11–13 digit identifier as a phone and outranked NATIONAL_ID in dedup, and ValidatedPatternRecognizer dropped every checksum-failing span so synthetic test IDs never reached the pipeline. Tightened the phone regex, added an entity-type priority tiebreaker in deduplicate_spans, and gave RegexPatternConfig a new fallback_score so checksum-failing IDs survive at a reduced score. Cumulative detection count on the 26-doc multilingual fixture jumped from 1098 to 1311 (+19%).
- Unblock setup wizard with bootstrap-mode auth relaxation: The earlier RBAC commit locked every wizard endpoint behind require_role("admin"), but the wizard runs before any admin exists. New require_admin_or_bootstrap / require_user_or_bootstrap dependencies relax enforcement while users is empty and setup_completed is false, then snap back to strict mode once the first admin is created.
- GDPR and LGPD packs gain context-gated country-specific national-ID recognizers: Added format-descriptive detectors for German Personalausweis / Steuer-ID / Rentenversicherung, Spanish DNI / NIE / Seguridad Social, French NIR / SIREN in the GDPR pack, and a civil-identity-document detector for the Brazilian RG in the LGPD pack. All use narrow_to_group=1 so only the identifier value lands in the reported span.
- Broaden IBAN / DOB / TAX_ID detection across locales: IBAN recognition now accepts both space-grouped and compact forms with a format-only fallback for synthetic values; DOB learned month-name dates in en/de/fr/es/pt/it/tr; TAX_ID is now a six-family alternation covering EU VAT, CIF, Steuernummer, Inscrição Estadual, Steuer-ID, and the compact form. Added parent-type alias expansion so declaring BANK_ACCOUNT_NUMBER no longer hides the IBAN recognizer.
- Detection refresh script for existing documents: New scripts/reprocess_entity_detections.py re-runs the current sanitizer over already-ingested chunks, rewrites the entity_detections rows and the encrypted anon_maps/*.enc payloads, and updates documents.entity_count, all without re-extracting PDFs or rebuilding indexes.
- Every built-in regulation now ships a first-class recognizer pack: Septum advertised 17 regulations but only GDPR, HIPAA, and KVKK had recognizer packs; the rest silently fell through to the baseline. Added packs for the remaining 14 regulations, each with at least one region-specific structural or checksum-validated detector; DPDP finally wires the long-orphaned AadhaarValidator.
- Dead code cleanup and frontend dependency realignment: Removed the empty backend/app/schemas/ package and two orphaned frontend utilities; dropped clsx and tailwind-merge. Downgraded eslint from ^10.0.3 to ^9.39.4 to resolve the ERESOLVE peer conflict with eslint-plugin-react-hooks@7.0.1.
- README lists all 17 built-in regulation packs in a table: Both READMEs only named 12 regulations inline; added a collapsible table listing every pack with region flag, pack code, and full name.
- Document preview authenticates via blob URL instead of naked <iframe src>: Raw document pane showed "Not authenticated" because <iframe>, <img>, and <audio> cannot carry the axios Authorization header. Preview now fetches the raw file through the typed axios client as a Blob, wraps it with URL.createObjectURL, and revokes it on unmount.
- Baseline sanitizer is now regulation-agnostic; KVKK-specific NATIONAL_ID detection lives in the KVKK pack: ValidatedNationalIDRecognizer defaulted its validator to TCKNValidator, meaning every sanitizer ran Turkish-specific checksum logic even under GDPR or HIPAA. Deleted the baseline class and moved TCKN-aware detection into the KVKK pack.
- National ID detection now narrows the span to the digits and runs the TCKN checksum: The KVKK context recognizer's regex had no capture group so the reported span included the keyword prefix, and ValidatedPatternRecognizer never actually called the algorithmic validators. Added narrow_to_group and algorithmic_validator hooks; Presidio benchmark recall climbed from 94.4% to 95.7%.
- Role-based access control enforced across every router: The role column on users was cosmetic; only /api/users/* honoured it. Every router now declares a concrete auth dependency: reads require get_current_user, document/chunk writes require admin or editor, and settings/regulation/audit mutations require admin.
- User management with admin-only CRUD, self change-password, and role-gated navigation: New /api/users router (list/create/update/reset-password/delete) gated on admin, plus /api/auth/change-password for self-service rotation. Self-signup is now a pure bootstrap path: the first user becomes admin, subsequent calls return 403.
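The span narrowing and checksum step from the national-ID entries above can be sketched as follows. The check-digit rule is the publicly documented TCKN algorithm; everything else is an assumption for illustration: the context regex, the `detect` helper, and both score values are hypothetical and are not taken from the KVKK pack.

```python
import re
from typing import Optional, Tuple

# Hypothetical context pattern: keyword prefix, then the 11-digit value in group 1.
TCKN_CONTEXT = re.compile(
    r"T\.?C\.?\s*Kimlik\s*No\s*[:\-]?\s*([0-9]{11})", re.IGNORECASE
)


def tckn_checksum_ok(value: str) -> bool:
    """Validate the two check digits of an 11-digit Turkish national ID."""
    if len(value) != 11 or not value.isdigit() or value[0] == "0":
        return False
    d = [int(ch) for ch in value]
    # 10th digit: (sum of digits 1,3,5,7,9) * 7 minus (sum of digits 2,4,6,8), mod 10
    tenth = (sum(d[0:9:2]) * 7 - sum(d[1:8:2])) % 10
    # 11th digit: sum of the first ten digits, mod 10
    eleventh = sum(d[:10]) % 10
    return d[9] == tenth and d[10] == eleventh


def detect(
    text: str, score: float = 0.85, fallback_score: float = 0.4
) -> Optional[Tuple[int, int, float]]:
    """Return (start, end, score) narrowed to group 1, or None if nothing matches."""
    match = TCKN_CONTEXT.search(text)
    if match is None:
        return None
    start, end = match.span(1)  # drop the keyword prefix from the reported span
    passed = tckn_checksum_ok(match.group(1))
    # Checksum failures survive at a reduced (assumed) score instead of vanishing.
    return (start, end, score if passed else fallback_score)


valid_text = "TC Kimlik No: 10000000146"
invalid_text = "TC Kimlik No: 12345678901"
valid_hit = detect(valid_text)
invalid_hit = detect(invalid_text)
```

Reporting `match.span(1)` rather than `match.span()` is what keeps the keyword out of the masked region, so only the digits get replaced.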
2026-04-10
- README trust badges, star CTA, and Star History chart: Added CI status, Docker image version, GitHub stars, and MIT license badges, a star CTA, a Support the Project section, and a live Star History chart. Mirrored into README.tr.md.
- "See It in Action" screenshot gallery with animated demos: Added three optimised slideshow GIFs (setup wizard, approval flow, document preview) and a collapsible grid of six configuration screens. 27 new PNGs replace the 11 legacy screenshots.
2026-04-09
- Claude Code project skills migrated to directory format: Current Claude Code versions only auto-load project skills at .claude/skills/<name>/SKILL.md, so /security-scan, /new-regulation, /new-recognizer, and /new-ingester were returning Unknown skill. Moved all four to the directory layout with git mv.
- README clarification (chat messages are also sanitized): Both READMEs framed PII protection only around uploaded documents, but _sanitize_query runs user messages through the same pipeline. Intro, before/after example, and Local PII Protection bullet now make the dual coverage unambiguous.
- Document Preview entity highlight states: Reworked the sanitized-content highlights so every detected entity is always visibly marked: the default state is a thin coloured outline per entity type, and the navigation-focused entity gets a saturated filled background. Added src/lib to tailwind.config.js content so the JIT actually compiles entityColors.ts class strings.
- Next.js 10MB body cap silently 500'd large uploads: Next.js 16's default middlewareClientMaxBodySize (10 MB) truncated large audio/PDF/image uploads, killing the backend connection and surfacing as a generic 500. Bumped to 500 MB in next.config.mjs.
- HTTPException failures now land in the Error Logs UI: http_exception_handler translated HTTPExceptions to JSON but never wrote to errorlog, so any raise HTTPException(400, …) was invisible. Now logs every 4xx/5xx (404 excluded as healthcheck noise); frontend upload handlers also stop swallowing errors and forward failures via sendFrontendError.
- NER pipeline thread-safety + ingestion errors land in Error Logs: ner_model_registry.get_pipeline was not thread-safe under parallel ingestion: two workers could trigger the lazy transformers.pipeline import and the second thread hit ImportError. Hoisted the import to module level and added a double-checked threading.Lock; background ingestion failures now also write to errorlog.
- PostgreSQL deploy fixes (missing migration + tz-naive timestamp columns): First real PostgreSQL deployment surfaced two SQLite-masked bugs: use_ollama_semantic_layer had no Alembic migration (backfilled as 011), and five model files declared created_at / updated_at as bare DateTime which asyncpg refused to populate. Migration 012 ALTERs 9 columns to TIMESTAMP WITH TIME ZONE.
- Approval modal 3-column layout: Restructured ApprovalModal.tsx into a 3-column grid: masked prompt + regulations + entity summary on the left, editable chunks in the middle, full assembled prompt on the right. Collapses to a single column on small screens.
- Relevance-based chunk filter + full assembled-prompt preview in approval modal: _retrieve_chunks now max-normalises RRF scores and drops tail candidates below a 0.4 threshold. A shared _assemble_user_prompt helper builds the exact masked prompt that will be sent to the LLM, and a new /preview-prompt endpoint lets the approval modal refresh the preview as the user edits chunks.
- Approval modal no longer rubber-stamps the entire document: Removed the FULL_DOCUMENT_CHUNK_THRESHOLD = 100 shortcut that shovelled every chunk into the prompt for small documents, defeating the whole point of per-chunk curation. Chat now always runs hybrid retrieval.
- Approval gate timeout (no more infinite chat hangs): ApprovalGate.wait_for_decision now honours an optional timeout and auto-rejects with timed_out=True when it fires. Driven by a new app_settings.approval_timeout_seconds (default 300).
- Chat per-phase timing logs: Added a _phase_timer(session_id, phase) async context manager that emits one chat phase session_id=… phase=… elapsed_ms=… log line per phase. When a chat hangs, grep the session id and see which phase was last.
- Background ingestion PendingRollbackError masking transient SQLite locks: _run_full_background never committed its detected_language / ocr_confidence assignment, so a later autoflush tried to UPDATE documents while another worker held the write lock. Added an explicit commit and a _record_background_failure helper that rolls back cleanly.
- Error Logs stack trace copy button: Added a CopyButton next to the "Stack trace" label in the Error Logs detail row.
- Document Preview entity list panel: Added a third "Detected entities" column that lists every entity from the active filter; clickable rows scroll to the matching highlight and jump to the source PDF page.
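The relevance filter from the chunk-filter entry above (max-normalise, then drop the tail below 0.4) can be sketched in a few lines. `filter_by_relevance`, the dict input shape, and the sample scores are assumptions for illustration; the real `_retrieve_chunks` is not shown in this changelog.

```python
from typing import Dict, List, Tuple


def filter_by_relevance(
    scores: Dict[str, float], threshold: float = 0.4
) -> List[Tuple[str, float]]:
    """Max-normalise fused scores and keep chunks at or above `threshold`."""
    if not scores:
        return []
    top = max(scores.values())
    if top <= 0:
        return []  # nothing meaningful to normalise against
    normalised = {chunk_id: score / top for chunk_id, score in scores.items()}
    kept = [(cid, value) for cid, value in normalised.items() if value >= threshold]
    # Best candidate first; the top chunk always has a normalised score of 1.0.
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical raw RRF scores for three chunks.
raw_scores = {"chunk-a": 0.032, "chunk-b": 0.016, "chunk-c": 0.008}
kept_chunks = filter_by_relevance(raw_scores)
```

Normalising by the maximum makes the threshold relative to the best hit, so the cut-off behaves the same whether the raw fused scores are large or small.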
2026-04-08
- Docker proxy timeout fixes: Chat SSE endpoint returns StreamingResponse immediately (all pre-processing moved inside the generator), and document upload/reprocess run ingestion in background tasks so responses return instantly. Fixes socket hang up errors when proxied through Next.js.
- Document processing progress: Real-time progress bar in the document list during sanitization and indexing, with an animated pulse indicator during OCR/Whisper. New GET /api/documents/progress endpoint.
- Whisper download progress fix: Byte-level progress tracking replaces unreliable file-size polling; document upload reports byte-level progress via onUploadProgress.
- Docker image size reduction (to ~6 GB): CPU-only PyTorch eliminates unused NVIDIA CUDA/Triton libraries, dropping image size from ~17 GB to ~6 GB. README includes a Docker vs Local comparison table.
- Model cache persistence: New septum-models Docker volume persists Whisper, HuggingFace, and PaddleOCR models across container recreations.
- SQLite WAL mode: Enables concurrent reads/writes and prevents database is locked errors during parallel document processing.
- Trailing slash redirect fix: Routes using "/" changed to "" in error-logs and audit routers, preventing FastAPI 307 redirects that leaked the internal backend URL.
- Orphaned document cleanup: Documents stuck in processing status after a server restart are now automatically marked failed on startup.
- API baseURL fix: Axios instance uses an empty baseURL consistently, preventing SSR URL leaks to the client.
- Multi-arch Docker image (amd64 + arm64): Docker image now builds for both architectures. Apple Silicon Macs run natively without x86 emulation, eliminating the 5–10x ML performance penalty.
- Chat performance overhaul: Chunk masking at chat time uses pure string replacement against the document's existing anon map (no model calls per query). Query sanitization defaults to enable_ollama=False since alias/pronoun layers added seconds of latency. Typical chat round-trip is now seconds instead of minutes.
- Sanitizer Ollama validation: Ollama responses are now filtered against word-boundary, ambiguity, fragment-length, ALL-CAPS heading, and ID-shape heuristics so small models stop polluting the anon map. The semantic detection layer is opt-in via use_ollama_semantic_layer (default off).
- Whisper model cache: AudioIngester caches the loaded Whisper model at the class level so subsequent uploads reuse the in-memory weights.
- PaddleOCR detection upgrade: Switched detection from PP-OCRv5_server_det to PP-OCRv5_mobile_det. Significantly faster on dense layouts and empirically catches more text regions with no accuracy loss.
- Parallel document upload: Frontend uploadDocuments runs up to four uploads concurrently via a worker pool, reporting overall byte progress across all in-flight files.
- Next.js compression disabled for SSE: compress: false in next.config.mjs so the proxy layer no longer gzip-buffers chat streaming events.
- Audio transcription preview removed: The dedicated transcription button and modal mode were redundant because the standard preview already shows the transcript.
- First-class Ollama LLM provider: Added a dedicated
OllamaProviderand registered it in the LLM provider factory sollm_provider="ollama"goes through the normal provider path instead of erroring and falling back. - Background ingestion concurrency cap: New
_INGESTION_SEMAPHORE(max 2) limits concurrent ingestion pipelines; beyond two jobs, SQLite write contention and NER model GIL contention hurt throughput. - SQLite tuning for parallel writes: Bumped busy timeout from 5s to 30s and set
synchronous=NORMALso bursts wait out short write locks instead of failing immediately.
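The ingestion concurrency cap described above can be illustrated with a minimal sketch. It assumes an `asyncio.Semaphore` with a limit of two; the names `MAX_CONCURRENT_INGESTIONS`, `run_ingestions`, and `ingest` are hypothetical and the sleep stands in for real pipeline work.

```python
import asyncio

# Hypothetical stand-in for the project's _INGESTION_SEMAPHORE (max 2).
MAX_CONCURRENT_INGESTIONS = 2


async def run_ingestions(job_count: int) -> int:
    """Run job_count fake ingestion jobs and return the peak concurrency seen."""
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_INGESTIONS)
    active = 0
    peak = 0

    async def ingest(job_id: int) -> None:
        nonlocal active, peak
        async with semaphore:  # at most two jobs pass this point at once
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # placeholder for parse / OCR / NER / index work
            active -= 1

    await asyncio.gather(*(ingest(i) for i in range(job_count)))
    return peak


peak_concurrency = asyncio.run(run_ingestions(6))
```

Six jobs are submitted together, but the remaining four wait on the semaphore until a slot frees up, so the observed peak never exceeds the cap.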
2026-04-07
- Single-port turnkey Docker deployment: All traffic served through port 3000; port 8000 no longer exposed. Next.js rewrites proxy `/api/*`, `/docs`, `/health`, `/metrics` to the backend. `--add-host=host.docker.internal:host-gateway` added for cross-platform Ollama host access.
- Ollama auto-detection: New `GET /api/settings/ollama-probe` tries `host.docker.internal`, `ollama`, and `localhost` (port 11434) and returns the first reachable URL; setup wizard auto-probes when Ollama is selected.
- Ollama model discovery: Setup wizard and settings page list locally installed Ollama models in a searchable combobox. Users can also type a custom name. New `GET /api/settings/ollama-models` endpoint.
- SSE streaming fix: `PrometheusMiddleware` converted from `BaseHTTPMiddleware` to pure ASGI middleware; `BaseHTTPMiddleware` was cancelling long-lived SSE connections and causing `Connection closed` errors under Next.js rewrites.
- Media format support: `video/mp4`, `audio/mp4`, `audio/m4a`, `audio/x-m4a` added to accepted MIME types so phone audio recordings saved as `.mp4` are ingested as audio.
- Security hardening: SSRF protection on Ollama URL endpoints (hostname allow-list), `ollama-pull` timeout bounded at 10m, Content-Disposition filename sanitized to prevent header injection.
- Docker & defaults improvements: CORS middleware reordered outermost, `allow_credentials` removed, Next.js standalone binds `0.0.0.0`, sidebar version fetched from the backend, default privacy toggles now true, default OCR languages `en,tr,de,ru,fr`, `./dev.sh --reset` flag, upload timeout raised to 5 minutes.
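As an illustration of the hostname allow-list mentioned under security hardening, here is a minimal sketch. The set of allowed hosts is an assumption borrowed from the hosts the probe endpoint tries, and `is_allowed_ollama_url` is a hypothetical helper name, not the project's function.

```python
from urllib.parse import urlparse

# Assumed allow-list, taken from the hosts the probe endpoint is said to try.
ALLOWED_OLLAMA_HOSTS = {"host.docker.internal", "ollama", "localhost"}


def is_allowed_ollama_url(url: str) -> bool:
    """Accept only http(s) URLs whose hostname is on the allow-list."""
    parsed = urlparse(url)
    if parsed.scheme not in {"http", "https"}:
        return False
    # urlparse lowercases the hostname and strips userinfo and port,
    # so "localhost@evil.example" resolves to the host "evil.example".
    return parsed.hostname in ALLOWED_OLLAMA_HOSTS
```

Checking the parsed hostname rather than searching the raw string avoids being fooled by userinfo tricks or allowed names appearing in the path.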
2026-04-06
- Full entity type coverage — 37/37 regulation entity types now detectable: Added 12 new Presidio pattern recognizers (DATE_OF_BIRTH, MAC_ADDRESS, URL, COORDINATES, COOKIE_ID, DEVICE_ID, SSN, CPF, PASSPORT_NUMBER, DRIVERS_LICENSE, TAX_ID, LICENSE_PLATE) with multilingual context keywords. 9 semantic types now detected via the Ollama layer; Japanese NER model upgraded from base BERT to Davlan XLM-RoBERTa. README gains an honest "Detection Coverage & Limitations" section.
- ALL CAPS PII detection and organisation name support: NER layer auto-normalises ALL CAPS text to title case before running XLM-RoBERTa; Turkish İ lowercasing fixed. ORGANIZATION_NAME is a new NER entity type with a false-positive filter.
- Cross-device API access: Frontend API base URL now auto-resolves from the browser hostname when `NEXT_PUBLIC_API_URL` is not set, so phones and other devices on the same network can reach Septum without manual env configuration.
- Ollama model discovery (chat + de-anon + LLM fields): Setup wizard and settings page list locally installed Ollama models in a searchable combobox across all three model fields.
- Setup wizard v2 — complete onboarding flow: 8-step wizard (Welcome → Database → Cache → Provider → Regulations → Whisper → Create Admin → Done) with inline test + auto-advance, popularity-sorted regulations, Whisper SSE progress, and first-user registration built into the final step.
- Settings page restructure: Cloud LLM and Local Models merged into a single "LLM Provider" tab; new Infrastructure tab for database + cache; Swagger/API Docs links and logout moved to the top-right header.
- Docker improvements: `VOLUME` directive for data persistence, VERSION file baked into the image, Docker Hub description auto-updated from README on release, `paddlepaddle` pinned to 3.2.2 for Linux ARM64.
- Update notification: `GET /api/setup/check-update` checks Docker Hub for newer versions; sidebar shows a banner with the `docker pull` command.
- Misc fixes: API key fields use `WebkitTextSecurity` to prevent browser password prompts; Whisper downloads via `whisper._download()` to avoid OOM; regulation activation uses the correct `/activate` endpoint.
- README adoption improvements: Added "Who Is This For?" section, a before/after anonymisation example, and a quick curl API example.
- CI-gated Docker releases and backend import hygiene: Docker Hub publish now runs only after the main CI passes and only when the head commit contains `[release]`. Cleaned up unused imports and import order.
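The ALL CAPS normalisation and Turkish İ fix noted above can be sketched as follows. This is an assumption-laden illustration, not the project's code: `turkish_lower` and `normalise_all_caps` are hypothetical names, and the sketch title-cases every fully upper-case word, acronyms included.

```python
def turkish_lower(text: str) -> str:
    """Lowercase with Turkish dotted/dotless i rules applied first."""
    # Plain str.lower() turns "İ" into "i" plus a combining dot and "I" into "i",
    # both wrong for Turkish, so map them explicitly before lowering.
    return text.replace("I", "ı").replace("İ", "i").lower()


def normalise_all_caps(text: str, language: str = "tr") -> str:
    """Title-case ALL CAPS words; leave mixed-case words untouched."""
    lower = turkish_lower if language == "tr" else str.lower
    words = []
    for word in text.split(" "):
        if len(word) > 1 and word.isupper():
            # keep the leading capital, lowercase the rest
            word = word[0] + lower(word[1:])
        words.append(word)
    return " ".join(words)
```

For example, `normalise_all_caps("DİLEK IŞIK")` yields a form an NER model trained on normally cased text is more likely to recognise as a person name.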
2026-04-05
- Docker Hub distribution: Single combined image `byerlikaya/septum` replaces the previous separate backend/frontend images; one pull, one run. `docker-compose.yml` uses the combined image with PostgreSQL + Redis.
- LLM API keys in AppSettings: Alembic migration 009 adds Anthropic/OpenAI/OpenRouter key columns. Settings API accepts updates and returns `has_*_key` booleans (never raw secrets). `POST /api/settings/ollama-pull` streams model-pull progress via SSE.
- Zero-config setup wizard (complete .env elimination): New `bootstrap.py` manages `config.json` (auto-generated encryption key + JWT secret), the database engine is lazily initialised by the wizard, and a 6-step flow replaces the old `.env` requirement. `docker run` + wizard is now the only setup needed.
- AuthGuard React 19 fix: Moved `router.replace()` from render into `useEffect` to prevent "Cannot update component while rendering" under React 19.
- Chat retry path & ingestion hygiene: Chat handler skips chunk retrieval and top-k tuning when `pre_approved_chunks` is present. Document pipeline persists entity detections with `add_all`. Refactored `entityColors` with shared classification and badge/highlight maps.
- README restructure for product positioning: Split README into a user-focused README and a technical ARCHITECTURE doc (EN + TR). README now leads with value proposition, how-it-works, a Why-Septum table, and a simplified Quick Start.
- Comprehensive 3-layer PII detection benchmark: Rewrote `tests/benchmark_detection.py` with 1618 entities across 10 types and all 17 regulations. Grand total: 100% precision, 99.7% recall, 99.9% F1. Per-layer tables and Mermaid charts added to both READMEs.
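The `has_*_key` pattern from the AppSettings entry above (report whether a key is stored, never the key itself) might look like the sketch below. The column names and the `public_key_flags` helper are hypothetical; only the `has_*_key` naming comes from the changelog.

```python
from typing import Dict, Optional

# Providers named in the changelog; the "<provider>_api_key" column names are assumed.
PROVIDERS = ("anthropic", "openai", "openrouter")


def public_key_flags(stored: Dict[str, Optional[str]]) -> Dict[str, bool]:
    """Map stored secrets to booleans so raw keys never reach the response."""
    return {
        f"has_{provider}_key": bool(stored.get(f"{provider}_api_key"))
        for provider in PROVIDERS
    }
```

An empty string and a missing column both report `False`, so the frontend can show "not configured" without ever receiving the secret.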
2026-04-04
- Approval data persistence & review: Approval context (masked prompt, chunks, decision) is now stored on user messages via a new `approval_data` JSON column (migration 008). After approve/reject, the question gets a green/red badge that opens a read-only review modal; rejected messages include a "Tekrar gönder" ("Resend") button that resends with pre-approved chunks.
- Regulations page redesign: Replaced the stacked 3-section layout with tab navigation (Regulations / Custom Rules / Advanced). Built-in regulations render as a compact sorted list instead of a card grid; each row expands to show entity type badges.
- Remove chunks page: Deleted `/chunks` route, `ChunkCard` / `EntityBadge` components, and 52 i18n keys. Moved entity colour utilities to `lib/entityColors.ts`.
- Chat UX cleanup: Removed edit/delete and regenerate buttons from messages, added a copy button to user messages, added a chat history "delete all" button, and added a "reprocess all" button in the documents page.
- Entity highlight UX improvements: Sticky filter bar, entity navigation with "1/N" counter and prev/next, side-by-side layout for PDF/image/audio documents, and PDF page navigation that jumps the iframe to the source page.
- Audit cleanup on document deletion: Deleting a document now also removes its audit events.
- Chat bug fixes: Messages no longer disappear after the first question (decoupled `ChatWindow`'s React key from `activeSessionId`); debug button on restored history messages is fixed.
- Chat approval (no auto-timeout): `ApprovalGate` waits indefinitely until approve or reject; countdown UI removed.
- Debug popup redesign: Masked prompt/answer rendered with colour-coded entity placeholder badges via a shared `PlaceholderText` utility.
- CORS cleanup: Removed `localhost:3001` from `FRONTEND_ORIGIN` defaults.
- Audit Trail v2 (entity location tracking & visual highlighting): Sanitizer now returns per-entity position data (offsets + confidence). New `EntityDetection` model (migration 007) stores per-chunk locations; frontend `HighlightedText` renders colour-coded inline highlights with an entity-type filter bar; audit cards get a "View detected entities" button.
- UX extras: Anonymization map viewer in Document Preview via `/anon-summary`, regulation tooltips on chat pills, mobile-responsive settings sidebar, chat session rename via inline edit, message delete and edit-to-resend on user messages.
- Multi-tenancy + RBAC: Added `role` (admin/editor/viewer) to the User model with `require_role()` / `require_admin` dependencies. Document and chat session lists now filter by `user_id`. Migration 006.
- CI/CD expansion: Pipeline grew from 2 to 6 jobs; added `backend-lint` (ruff + bandit), `backend-security` (pip-audit), `frontend-typecheck` (tsc --noEmit), `frontend-security` (npm audit).
- Structured logging + Prometheus metrics: JSON structured logging via `python-json-logger`; Prometheus metrics via `prometheus-client`: request counter/histogram plus domain counters for document uploads, chat requests, and PII entities by type. Exposed at `GET /metrics`.
- UX polish: Escape key stops streaming in chat. Chat PDF export via jsPDF alongside JSON export. Document Preview for all formats via a new `/api/documents/{id}/raw` endpoint.
- Phase 3 enterprise readiness: JWT auth with bcrypt + PyJWT, `/api/auth/register|login|me`, `AuthGuard` + axios interceptors, per-session `user_id` FK (migration 005). Rate limiting via SlowAPI (Redis → memory fallback). Async document ingestion via `BackgroundTasks`. Data export via `/api/chat-sessions/{id}/export`.
- Phase 2 UX improvements: Chat regenerate button, document list filtering/sorting, bulk delete/reprocess with row checkboxes, and reusable `Skeleton` / `ErrorWithRetry` components across pages.
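The role checks from the RBAC entry above can be sketched without any framework. This is a hedged illustration only: the real `require_role()` / `require_admin` are FastAPI dependencies whose signatures are not shown in this changelog, so the factory shape and the use of `PermissionError` here are assumptions.

```python
from typing import Callable

ROLES = ("admin", "editor", "viewer")  # roles named in the changelog


def require_role(*allowed: str) -> Callable[[str], str]:
    """Build a checker that passes a role through or raises PermissionError."""
    unknown = set(allowed) - set(ROLES)
    if unknown:
        raise ValueError(f"unknown roles: {sorted(unknown)}")

    def check(user_role: str) -> str:
        if user_role not in allowed:
            raise PermissionError(f"role {user_role!r} is not permitted")
        return user_role

    return check


# Convenience checker in the spirit of require_admin (assumed behaviour).
require_admin = require_role("admin")
```

In a web framework the checker would typically read the role from the authenticated user and translate the error into a 403 response.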
2026-04-03
- Phase 1 productization: Setup wizard for first-time users, persistent chat history with session management (new `ChatSession` / `ChatMessage` models + sidebar), upload progress bar, save toasts, and hidden incomplete settings tabs.
- GitHub Actions CI: Prune unused runner images before the backend job to avoid out-of-disk failures; cache `~/.cache/pip` keyed on `requirements.txt`.
- Docker Compose production deployment: Added PostgreSQL 16 and Redis 7 to `docker-compose.yml`. New `docker-entrypoint.sh` validates env vars, auto-generates encryption keys, and runs Alembic migrations. Backend falls back to SQLite when `DATABASE_URL` is not set.
- Pronoun coreference resolution via Ollama: New `_resolve_pronoun_coreference()` layer uses a language-agnostic prompt to identify pronouns referring to already-detected persons and adds them to the anon map. Degrades gracefully when Ollama is unavailable.
- README professionalization: Added Audit Trail, LLM Resilience, and API Reference sections to both READMEs; updated Technology Stack and Security sections for PostgreSQL/Redis/Alembic.
- LLM provider circuit breaker: Module-level breaker (3 failures in 120s → 60s cooldown → half-open probe). `LLMRouter.stream_chat` skips to the Ollama fallback when the breaker is open.
- Coreference: possessive form handling: New `strip_possessive_suffix()` supports English (`'s`, `s'`) and Turkish genitive markers (`'in`, `'ın`, `'un`, `'ün`). Possessive forms now resolve to the base name's placeholder.
- PII detection quality metrics endpoint: New `GET /api/audit/metrics` returns per-entity-type distribution, regulation usage frequency, average entities per document, and coverage ratios.
- Redis-based anonymization map caching: Optional Redis tier (24h TTL, plaintext JSON) between in-memory cache and encrypted disk, making multi-worker deployments feasible. `document_anon_store.py` refactored to an async 3-tier hierarchy with graceful degradation.
- Test suite hardening: Added 22 new tests across `test_pii_escape_scenarios.py`, `test_concurrent_anon_maps.py`, and `test_audit.py`. Total count: 77 → 99.
- GDPR/KVKK audit trail and compliance report: Append-only `AuditEvent` model tracks PII detection, de-anonymization, document lifecycle, and regulation changes; it never stores raw PII. New `/api/audit` endpoints and a frontend audit log viewer with entity breakdown.
- SQLite → PostgreSQL migration with Alembic: Replaced SQLite-only dialect imports with `sqlalchemy.JSON`. Refactored `database.py` to support both backends via `DATABASE_URL`. Removed manual `_ensure_*_columns` functions in favour of Alembic.
- CORS configuration: `FRONTEND_ORIGIN` in `.env` now drives FastAPI `CORSMiddleware`.
- Document reprocess endpoint + blocklist persistence + delete fix: New `POST /api/documents/{id}/reprocess` re-runs the full pipeline. `token_to_placeholder` / `token_counter` now persisted so coreference survives restarts. Fixed document deletion silently skipping FAISS and BM25 index cleanup.
- Fix false positive LOCATION detections from Presidio and NER: Common Turkish words ("kabul", "gibi", "sözlü", "başka") were tagged as LOCATION. Added a proper-noun capitalisation check; LOCATION detections on the test fixture dropped from 28 to 4.
- Fix credit card number leaking to LLM + entity coverage system: All 17 regulation seeds defined `CREDIT_CARD_NUMBER` but no recognizer existed (Presidio's built-in uses `CREDIT_CARD`). Added a new recognizer, a Presidio entity alias map, and startup coverage validation that warns about entity types with no recognizer.
- Rewrite PDF OCR strategy (image-based instead of character-count): Replaced the flawed `<200 char` heuristic with an image-aware approach: extract text normally, then OCR only embedded images (≥150×150 px). Text PDFs now process in milliseconds.
- Memory optimization (subprocess OCR + model pre-loading): Run PaddleOCR in a persistent subprocess pool so its ~1.5 GB footprint never enters the main process. Background model pre-loading at startup makes the first upload instant instead of waiting ~15s.
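The circuit-breaker thresholds listed above (3 failures in 120s, 60s cooldown, half-open probe) can be sketched as a small class. This is an illustration under assumptions: the class and method names are hypothetical, the clock is injectable for testing, and the half-open state is simplified to "requests are allowed again until the next success or failure is recorded".

```python
import time
from collections import deque
from typing import Callable, Deque, Optional


class CircuitBreaker:
    """Opens after max_failures within window seconds, then cools down."""

    def __init__(self, max_failures: int = 3, window: float = 120.0,
                 cooldown: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.window = window
        self.cooldown = cooldown
        self.clock = clock
        self.failures: Deque[float] = deque()
        self.opened_at: Optional[float] = None

    def allow_request(self) -> bool:
        """Closed: allow. Open: block until the cooldown ends, then allow a probe."""
        if self.opened_at is None:
            return True
        return self.clock() - self.opened_at >= self.cooldown

    def record_success(self) -> None:
        """Any success, including a half-open probe, closes the breaker."""
        self.failures.clear()
        self.opened_at = None

    def record_failure(self) -> None:
        now = self.clock()
        if self.opened_at is not None:
            # A failed half-open probe restarts the cooldown.
            self.opened_at = now
            return
        self.failures.append(now)
        # Drop failures that fell out of the sliding window.
        while self.failures and now - self.failures[0] > self.window:
            self.failures.popleft()
        if len(self.failures) >= self.max_failures:
            self.opened_at = now
```

A router would call `allow_request()` before contacting the cloud provider and go straight to its fallback when it returns `False`.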
2026-04-02
- Memory optimization (lazy loading and singletons): Reduced backend steady-state memory from ~3.1 GB to ~1.5 GB (-52%). Switched Presidio from `en_core_web_lg` to `en_core_web_sm`, lazy-imported heavy ML frameworks, and cached spaCy / NER / SentenceTransformer as process-wide singletons.
- Dependency conflict fix & pre-commit validation: Fixed a langchain-text-splitters version conflict and bumped presidio-analyzer/anonymizer and the langchain ecosystem. Added pip/npm dry-run checks to the pre-commit hook when dependency files are staged.
- DRY/SOLID refactoring: Backend extracted `get_or_404`, `load_settings`, `detect_language`, `validate_regex`, `sanitizer_factory.py`, and a `BaseCustomRecognizer`. Frontend extracted 7 settings tab components, `useChatStream` / `useChatApproval` / `useChunkManager` hooks, and shared `ToggleSwitch` / `ErrorAlert` / `CopyButton` / `DataTable` components.
- OCR engine: replace EasyOCR with PaddleOCR: Switched default OCR provider for significantly better character recognition (₺ symbol, number/letter confusion), built-in layout analysis, and spatial text ordering. Removed EasyOCR entirely.
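The lazy-loading and singleton idea from the memory optimization entry above can be illustrated with `functools.lru_cache`. This is one common way to get a process-wide singleton, shown as an assumption; the changelog does not say which mechanism the project uses, and `get_model` is a hypothetical name with a dummy payload in place of real weights.

```python
from functools import lru_cache
from typing import Any, Dict


@lru_cache(maxsize=None)
def get_model(name: str) -> Dict[str, Any]:
    """Load a heavy model once per process; later calls reuse the same object."""
    # A real loader would import the heavy framework here, inside the function,
    # so that importing this module stays cheap until a model is first needed.
    return {"name": name, "weights": [0.0] * 4}


first = get_model("ner-demo")
second = get_model("ner-demo")
```

Both calls return the same object, so the expensive load happens only on the first request for a given model name.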
2026-04-01
- Remove Desktop Assistant mode: Removed the ChatGPT/Claude desktop app OS-automation feature entirely — it did not align with the project's privacy-first middleware purpose.
- Address anonymization: Added `StructuralAddressRecognizer` to detect postal addresses by structural cues; re-enabled NER LOC → LOCATION mapping so declared addresses are detected. Employer/company addresses are correctly excluded.
- Settings override fix: Removed an environment variable override in the chat router's `_load_settings()` that was ignoring database-persisted LLM model/provider changes.
- 204 response body fix: Added `response_model=None` to the chunks DELETE endpoint to comply with FastAPI's HTTP 204 no-body assertion.
- Regulation entity type alignment: Added missing entity types to 7 regulation seeds (pdpl_sa, australia_pa, pdpa_th, appi, pipl, nzpa, popia) based on legal research. Detailed legal basis in `REGULATION_ENTITY_SOURCES.md`.
- Codebase compliance sweep: Removed ~17 redundant inline comments, replaced ~28 hardcoded frontend strings with i18n, fixed README file references, and mirrored `.cursor/rules` into `.claude/rules` / `.claude/skills`.
2026-03-19
- Chat no-document flow and approval gate fix: Fixed chat behaviour when no document is selected via a dedicated no-context prompt path, and fixed the approval gate so `require_approval` is enforced even in no-document chats.
- Sanitization pipeline overhaul (broken masking and false positives): Removed LOC → LOCATION mapping from NER, raised the NER confidence threshold to 0.85, restricted blocklist propagation to person-identifying entity types only, skipped NER for short texts, and made the Ollama alias prompt strictly person-name-focused.
- Architecture rule compliance sweep: NER layer now filters against the active policy's entity types. Moved 3 inline prompt fragments into `PromptCatalog`, removed hardcoded language-specific terms, and replaced hardcoded `language="en"` with a configurable constant.
- Critical: Ollama PII validation no longer strips structured identifiers: The validation layer was sending high-priority structured IDs (NATIONAL_ID, IBAN, PHONE_NUMBER) to Ollama; empty responses cleared the list entirely, dropping every detection. High-priority types now bypass LLM validation, empty responses fall back to keeping non-passthrough candidates, and adjacent PERSON_NAME spans are merged.
- Repository hygiene: Added `bm25_indexes/` to `.gitignore`.
- Frontend TypeScript environment typing sync: Updated `next-env.d.ts` to match the current Next.js bootstrap output.
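The passthrough and empty-response rules from the critical validation fix above can be sketched as a small filter. This is a simplified illustration: `Span`, `PASSTHROUGH_TYPES`, and `apply_llm_validation` are hypothetical names, matching is by exact text, and the PERSON_NAME merge step is left out.

```python
from typing import List, NamedTuple, Set


class Span(NamedTuple):
    entity_type: str
    text: str


# High-priority types named in the changelog entry; they never go to the LLM.
PASSTHROUGH_TYPES = frozenset({"NATIONAL_ID", "IBAN", "PHONE_NUMBER"})


def apply_llm_validation(candidates: List[Span], validated_texts: Set[str]) -> List[Span]:
    """Keep passthrough spans always; keep the rest only if the LLM confirmed them."""
    # An empty LLM answer is treated as "no opinion", not as "reject everything".
    keep_all = not validated_texts
    return [
        span for span in candidates
        if span.entity_type in PASSTHROUGH_TYPES
        or keep_all
        or span.text in validated_texts
    ]
```

With this shape, a model that returns nothing can no longer wipe out detections, and structured identifiers survive regardless of what the model says.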
2026-03-18
- Desktop mode approval gate: Desktop assistant mode now shows the same approval modal as Cloud LLM when require-approval is enabled.
- Query-time PII sanitization and validation layer: Chunks remain raw in the DB; sanitization runs at query time for both Cloud and Desktop flows with the full pipeline including the Ollama validation layer.
- Ollama PII validation and JSON robustness: Added a language-agnostic `pii_validation_prompt` in `PromptCatalog`. Made `extract_json_array` tolerant of markdown fences, multiple arrays, trailing commas, and extra text.
- `use_ollama_validation_layer` setting: New backend setting and Privacy UI toggle to enable/disable the Ollama PII validation layer (default true).
- Next.js and lockfile: Set `outputFileTracingRoot` in `next.config.mjs` to resolve the multi-lockfile warning; removed the redundant root `package-lock.json`.
- CI test fixes: Desktop assistant factory uses lazy imports for macOS/Windows modules so Linux CI does not require `pyautogui` or `pygetwindow`. `ChatWindow` agent-log fetch calls guarded with a `typeof fetch` check so Jest tests pass.
- README (EN/TR): Documented desktop mode approval gate and the optional Ollama PII validation layer; Turkish README brought into sync.
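A tolerant array extractor of the kind described in the JSON robustness entry above could be sketched as follows. This is not the project's `extract_json_array`; it is a minimal version under stated assumptions: it returns the first candidate that parses as a list, and its trailing-comma cleanup is a plain regex that would also touch a `,]` sequence inside a string value.

```python
import json
import re
from typing import Any, List, Optional

_TRAILING_COMMA = re.compile(r",\s*([\]}])")


def _balanced_array(text: str, start: int) -> Optional[str]:
    """Return the substring from the '[' at start to its matching ']'."""
    depth = 0
    in_string = False
    escaped = False
    for index in range(start, len(text)):
        char = text[index]
        if in_string:
            if escaped:
                escaped = False
            elif char == "\\":
                escaped = True
            elif char == '"':
                in_string = False
        elif char == '"':
            in_string = True
        elif char == "[":
            depth += 1
        elif char == "]":
            depth -= 1
            if depth == 0:
                return text[start:index + 1]
    return None


def extract_json_array(raw: str) -> List[Any]:
    """Return the first JSON array found in raw, or [] if none parses."""
    for match in re.finditer(r"\[", raw):
        candidate = _balanced_array(raw, match.start())
        if candidate is None:
            continue
        try:
            parsed = json.loads(_TRAILING_COMMA.sub(r"\1", candidate))
        except json.JSONDecodeError:
            continue  # e.g. "[note]" in prose; try the next bracket
        if isinstance(parsed, list):
            return parsed
    return []
```

Because the scan starts at every opening bracket, markdown fences and surrounding chatter are skipped without special handling.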
2026-03-16
- Desktop Assistant Mode with RAG support: Added an optional Desktop Assistant Mode that sends the user's question (or a RAG-enabled prompt) directly to a local ChatGPT or Claude desktop client via OS automation (AppleScript on macOS, window activation + keystrokes on Windows). Extracted `ChatContextPayload` and `build_chat_prompt` so Cloud and Desktop flows share one prompt builder; feature is opt-in behind `desktop_assistant_enabled`.
2026-03-15
- Audio transcription accuracy and fixes: Whisper now uses the selected model from settings (previously hardcoded "base") and an optional `default_audio_language`. Fixed "decoder produced no samples or text" by using the correct file extension from the upload MIME type; forced CPU on Apple Silicon to avoid MPS NaN logits.
- Changelog maintenance: Rules updated so the changelog documents changes since last push and groups by logical development unit.
- Multi-document chat: When multiple documents are selected, all are sent to the API and included in context. Backend prioritises `document_ids` over `document_id`; chunk retrieval enforces a minimum of 10 chunks per document in multi-doc mode.
- Generic retrieval improvements for holistic queries: Adaptive top_k, document-theme retrieval with RRF merge, last-chunk inclusion, and a holistic-interpretation prompt so broad questions get sufficient context.
2026-03-12
- Vector search async fix: Wrapped blocking `VectorStore.search` / `hybrid_search` (FAISS + ML model ops) in `asyncio.to_thread()` so they stop freezing FastAPI's event loop. Chat retrieval now completes in ~3s instead of timing out at ~95s.
- Hybrid retrieval (BM25 + FAISS): Implemented hybrid search combining BM25 with semantic FAISS via Reciprocal Rank Fusion. Dramatically improved retrieval quality for contract queries by combining exact term matching with semantic similarity.
- Table and field extraction: Added pdfplumber-based structured extraction for legal/contract documents. `Chunk` model now carries `chunk_type`, `field_label`, `field_value`, `field_type`; key-value pairs are indexed as separate FieldChunks.
- Enhanced semantic chunking: Upgraded `StructuredDocumentChunker` with LangChain's `SemanticChunker`. Large sections now split by semantic coherence instead of arbitrary paragraph boundaries, with a fallback chain semantic → paragraph → raw.
- Prompt hardening: Strengthened the chat prompt with anti-hallucination rules: only answer if the exact info is in context, respond "I cannot find that information" otherwise, never invent or merge placeholders.
- Dependencies: Added rank-bm25, pdfplumber, langchain-text-splitters, and langchain-experimental.
- PII detection improvements: Upgraded all language NER models to state-of-the-art XLM-RoBERTa variants. Improved the Ollama Layer 3 prompt, made blocklist include all entity tokens regardless of casing, and made Ollama span matching case-insensitive.
- KVKK ruleset (6698): Expanded entity types to align with Madde 3(d) and Madde 6: added SSN, TAX_ID, COORDINATES, IP_ADDRESS, COOKIE_ID, DEVICE_ID, DNA_PROFILE, MEDICATION, CLINICAL_NOTE, SEXUAL_ORIENTATION, and financial fields.
- Regulation rulesets (GDPR, UK GDPR, CCPA/CPRA, PIPEDA): Aligned entity types with official legal texts. Added FIRST/LAST_NAME, ETHNICITY, STREET_ADDRESS, CITY, COORDINATES, MAC_ADDRESS, and related fields.
- Regulation entity sources doc and rule: Added `backend/docs/REGULATION_ENTITY_SOURCES.md` and a rule requiring updates when built-in regulation entity types change.
- Sanitizer and PII pipeline: Ollama PII layer wrapped in try/except so failures don't break sanitization. NER uses a language-aware confidence threshold. Blocklist adds single-token entities so residual mentions are redacted.
- Chat query sanitization: User messages are sanitised before retrieval and the LLM call — same pipeline as document text.
- E2E test and ingestion errors: Turkish PII E2E test no longer skips; it uses Presidio-detectable email, language + anon-map mocks, and an LLM mock. Document upload 500 response now includes `ingestion_error` in the detail for easier debugging.
- NER model overrides: NER Models settings tab supports per-language overrides: edit the HuggingFace model ID, restore defaults per row, and save to `app_settings.ner_model_overrides`.
- Error logging and Error Logs UI: Centralized error logging via a new `ErrorLog` model, `error_logger` service, global exception handler, and `POST /api/error-logs/frontend`. New Error Logs page lists, filters, and clears logs; sidebar shows an error count badge.
- Document preview: Copy button shows a "Copied" state for user feedback.
- Chat: Badge under assistant messages when the answer was produced by the Ollama fallback.
- Docs: Both READMEs (EN/TR) now have Changelog and License links in the header.
- Changelog and rules: Split same-day entries by commit date and updated the rule to verify the date via `date +%Y-%m-%d` and `git log --date=short`.
- Backend dependencies: Bumped `langchain-experimental` from 0.3.6 to 0.4.1 for Python 3.13 / GitHub Actions compatibility.
- Recognizer regex and E2E test: Fixed Presidio regex patterns in GDPR, HIPAA, and KVKK packs (correct `\b` word boundary and escapes). Set `TLDEXTRACT_CACHE` in `conftest` before imports to avoid permission errors.
- Backend tests and CI: Fixed the E2E Turkish PII test in CI by patching `PdfIngester._run_ocr_on_page` so the text layer is preserved. Replaced deprecated `datetime.utcnow` with `datetime.now(timezone.utc)`. Added `pytest.ini` with `filterwarnings` to suppress noisy deprecation warnings.
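The Reciprocal Rank Fusion step behind the hybrid BM25 + FAISS retrieval noted above can be sketched in a few lines. The constant `k = 60` is the value commonly used for RRF and is an assumption here, as are the function name and the alphabetical tie-break; the changelog does not state the project's parameters.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence


def reciprocal_rank_fusion(rankings: Iterable[Sequence[str]], k: int = 60) -> List[str]:
    """Merge ranked ID lists; each list contributes 1 / (k + rank) per item."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


bm25_hits = ["c1", "c2", "c3"]    # exact-term ranking (illustrative IDs)
vector_hits = ["c3", "c1", "c4"]  # semantic ranking (illustrative IDs)
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

Chunks that appear in both rankings rise to the top, because RRF only looks at rank positions and never needs the two score scales to be comparable.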
2026-03-11
- OCR and PII improvements: Enhanced image/PDF OCR quality, improved OCR ingestion flow, and refined person name masking and PII handling.
- Spreadsheet enhancements: Added spreadsheet schema metadata, numeric-aware chat for tabular content, and limited schema display to truly tabular documents.
- Infrastructure and tooling cleanup: Unified environment loading defaults (including Ollama), and removed legacy coverage/Codecov tooling.
- ODS support: Added ODS (OpenDocument Spreadsheet) ingestion support and documented it in both English and Turkish READMEs.
- LLM routing and prompt catalog: Refactored the LLM router into a provider-strategy layer, introduced a document processing pipeline orchestrator, centralized all backend LLM/Ollama prompts under `PromptCatalog`, and added a shared AppSettings factory plus updated tests.
2026-03-10
- Documentation and licensing: Expanded README content with PII pipeline and AI gateway sections, screenshot gallery, and clarified extension workflow; added MIT license and kept EN/TR READMEs in sync.
- Testing and quality: Improved backend and frontend coverage, added Jest setup, fixed async engine and aiosqlite warnings, and ensured backend tests import the app package correctly.
- Sanitization and PII pipeline: Hardened sanitizer structure and robustness, generalized the PII pipeline, added configurable text normalization rules and non‑PII filters, and localized deanonymization banner copy.
- Chat experience: Added global i18n for chat UI, approval flow localization, chat debug tools, document‑optional chats, generic prompts, and post‑processing for malformed LLM output.
- Platform and tooling: Introduced Dockerfiles and docker‑compose for backend/frontend, tracked env templates, pinned backend dependencies, and updated docs and dependencies.
- UI and layout: Refined documents, chunks, and settings UIs; improved sidebar layout; and added the Septum logo across the app.
2026-03-09
- Core platform foundation: Bootstrapped the Septum project skeleton with core utils, crypto, database models, and health checks.
- Ingestion pipeline: Implemented ingestion base and office ingesters (documents, spreadsheets, presentations), plus image and audio ingesters with health checks.
- Privacy and recognition engine: Added national ID validators and tests, a multilayer sanitizer, anonymization map with coreference handling, and a regulation‑aware recognizer registry and policy composer.
- Vector store and retrieval: Introduced an encrypted FAISS vector store per document and ignored local index artifacts from version control.
- Backend services and frontend shell: Added LLM router, deanonymizer, approval gate, chat pipeline wiring, settings sync, settings UI, regulations UI, documents UI, and the initial Next.js frontend shell with layout and API client.