Security architecture — technical details

A Tenancy & runtime isolation

One container per customer. Isolation that’s architectural, not a policy.

Every customer’s agent runs in its own dedicated container — a separate, walled-off runtime with its own process space, filesystem, and memory. There is no shared multi-tenant process where one customer’s request and another’s are handled by the same long-lived worker.

Own network segment. Each tenant container sits on its own network segment with firewall isolation between tenants. East-west traffic between customer containers is not permitted; a container reaches the model provider and its connected APIs outbound, nothing laterally.
No shared runtime. Because there is no common in-process execution path, there is no class of bug where a prompt or payload in one tenant can read another tenant’s memory, tools, or tokens. The boundary is the OS/container boundary, not application logic we ask to behave.
Per-tenant persistence. Memory and skills live on per-tenant volumes that persist across image upgrades; the runtime image can be rebuilt and rolled forward without touching a tenant’s data (see §F).
Independent lifecycle. Containers start, restart, suspend, and stop independently. A subscription cancellation suspends exactly that tenant’s container; nothing else is affected.

Why it matters

The most expensive multi-tenant breaches come from a shared runtime where a logic flaw crosses the tenant line. OSP removes that category by giving every customer a real isolation boundary — the same kind you’d get running the agent on your own dedicated host, without you having to run it.

B Identity & access

An OAuth broker with PKCE. The platform never receives a provider password.

Third-party app connections (Google, Microsoft, Slack, a CRM, and any other OAuth provider) go through an OAuth authorization-code flow with PKCE. The user authenticates against the provider directly; OSP brokers the exchange and stores only the resulting tokens.

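Authorization code + PKCE. The flow uses Proof Key for Code Exchange (PKCE): the client sends a hashed code challenge with the authorization request and must present the matching code verifier when it redeems the code, so an intercepted authorization code is useless on its own. Credentials are entered on the provider’s own domain; the platform never sees, transmits, or stores a provider password.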
Scoped, least-privilege tokens. Each integration requests only the per-provider scopes it needs (e.g. read a calendar vs. full mailbox control). The agent can do exactly what the granted scopes allow and nothing more.
Revocable at any time. Because access is a token rather than a password, the user can revoke it from the provider’s own settings instantly, with no dependency on OSP. Revocation takes effect at the provider.
Automatic refresh. Short-lived access tokens are refreshed transparently using the refresh token; the long-lived secret never has to ride along on every request.
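
The PKCE step above can be sketched as follows. This is a minimal illustration assuming the S256 challenge method defined in RFC 7636; the function names are hypothetical and are not OSP’s broker code.

```python
import base64
import hashlib
import hmac
import secrets


def make_code_verifier(n_bytes: int = 32) -> str:
    """Random URL-safe verifier; 32 random bytes encode to 43 characters."""
    raw = secrets.token_bytes(n_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_challenge(verifier: str) -> str:
    """S256 challenge: BASE64URL(SHA256(verifier)) with padding stripped."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def provider_accepts(stored_challenge: str, presented_verifier: str) -> bool:
    """The check a provider performs at the token endpoint before issuing tokens."""
    return hmac.compare_digest(stored_challenge, make_code_challenge(presented_verifier))
```

The challenge travels with the authorization request and the verifier only with the token request, so someone who captures the authorization code alone cannot pass `provider_accepts`.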

What this rules out

Password-based integrations create a credential that can be phished, reused across services, and only revoked by changing the password everywhere. Brokered OAuth gives a narrowly scoped, independently revocable grant instead — the platform is never a place a provider password could leak from, because it never had one.

C Secrets management

Encrypted vault. Injected only at boot, in memory, never persisted.

Customer model API keys and channel tokens are held encrypted in Supabase Vault. They are decrypted and injected into the tenant container’s process memory only when the container boots, and they are deliberately kept out of every durable surface.

Encrypted at rest in the vault. Secrets live in the managed vault, not in ordinary database columns. Application code reads them through the vault, not by selecting a plaintext field.
Runtime-only injection. A secret is materialized into the container’s environment/memory when the container starts. When the container stops, the in-memory copy is gone — there is no on-disk persisted copy inside the tenant to recover.
Never written to: database columns (as plaintext), image layers, source control, environment files committed to git, or logs. Logging is structured to avoid emitting secret material.
Rotation. Replacing a key in the vault and triggering a configure-restart re-injects the new value at the next boot; the old in-memory value dies with the previous container. Rotation is a vault write plus a restart, not a redeploy of the image.
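
The boot-time injection and rotation behavior described above can be modeled in a few lines. This is a sketch under assumptions: `Vault`, `Container`, and `rotate` are hypothetical stand-ins, and the real vault encrypts at rest and decrypts on read.

```python
from dataclasses import dataclass, field


class Vault:
    """Hypothetical stand-in for the encrypted vault."""

    def __init__(self) -> None:
        self._secrets: dict[tuple[str, str], str] = {}

    def write(self, tenant_id: str, name: str, value: str) -> None:
        self._secrets[(tenant_id, name)] = value

    def read_all(self, tenant_id: str) -> dict[str, str]:
        # Only the requesting tenant's secrets are returned.
        return {n: v for (t, n), v in self._secrets.items() if t == tenant_id}


@dataclass
class Container:
    tenant_id: str
    running: bool = False
    # Secrets exist only in this in-memory mapping; repr=False keeps them out of logs.
    _secrets: dict[str, str] = field(default_factory=dict, repr=False)

    def boot(self, vault: Vault) -> None:
        self._secrets = vault.read_all(self.tenant_id)
        self.running = True

    def stop(self) -> None:
        self._secrets.clear()  # the in-memory copy dies with the container
        self.running = False

    def secret(self, name: str) -> str:
        return self._secrets[name]


def rotate(vault: Vault, container: Container, name: str, new_value: str) -> None:
    """Rotation is a vault write plus a restart; no image rebuild is involved."""
    vault.write(container.tenant_id, name, new_value)
    container.stop()
    container.boot(vault)
```

A vault write on its own does not change a running container; the new value only appears after the restart, which matches the "next boot" wording above.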

The invariant

Keys are never baked into an image and never stored in plaintext in the app database. The only place a live key exists is in the memory of the running container that needs it, for as long as it’s running. That collapses the blast radius of a database or image compromise to near zero for secret material.

D Data isolation

Postgres row-level security: no cross-tenant read, even by mistake.

Beyond the container boundary, tenant data is isolated at the database layer with Postgres Row-Level Security (RLS). Per-owner policies are enforced by the database engine on every query — not by application code that has to remember to add a WHERE owner = … clause.

Per-owner policies. RLS policies scope every row to its owner. A query issued in one tenant’s context cannot return another tenant’s rows; the engine filters them out before the result is built.
Default-deny posture. With RLS enabled, a table with no applicable policy exposes no rows, and a missing or wrong filter in application code returns at most the caller’s own rows. The failure mode is closed (too little data) rather than open (other tenants’ data). The safe default is no data, not all data.
Encryption in transit and at rest. Connections are TLS; data is encrypted at rest by the managed database. Combined with the vault for secrets, sensitive material is never sitting in plaintext where it doesn’t need to be.
Two layers, both required. Network/container isolation protects the runtime; RLS protects the data. Neither substitutes for the other — they are defense in depth.
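
To make the engine-side filtering concrete, here is a toy in-memory model of the behavior described above. It is not Postgres and not OSP’s schema; the table, column, and class names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Row = dict
Policy = Callable[[Row, str], bool]  # (row, current_owner) -> is the row visible?

# Roughly the shape of the SQL this models (assumed table and column names):
#   ALTER TABLE notes ENABLE ROW LEVEL SECURITY;
#   CREATE POLICY owner_only ON notes USING (owner_id = auth.uid());


@dataclass
class RlsTable:
    rows: list
    rls_enabled: bool = True
    policy: Optional[Policy] = None

    def select(self, current_owner: str,
               where: Optional[Callable[[Row], bool]] = None) -> list:
        visible = self.rows
        if self.rls_enabled:
            if self.policy is None:
                return []  # RLS on with no policy: default deny
            # The policy is applied before any application-supplied filter.
            visible = [r for r in visible if self.policy(r, current_owner)]
        if where is not None:
            visible = [r for r in visible if where(r)]
        return visible
```

Because the policy runs first, an application query that forgets its filter, or deliberately asks for another owner’s rows, still sees only rows the policy allows.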

E Governance layer — the deep dive

A separate enforcement layer around the model — in, out, and around every action.

This is the part most platforms skip. OSP wraps every agent in a governance layer: a set of programmable guardrails that run as separate, deterministic checks around the model, not as instructions buried inside a prompt. A prompt can be argued with; an enforcement layer cannot. It is default-on for every customer, included free, model-agnostic, and enforced per message.

Requests flow through ordered guardrail stages. Each stage can reject (stop processing and return a safe refusal) or alter (mask, redact, or rephrase) the content before it moves on:

Input (in): reject or alter the user message
Dialog: allowed conversation flows
Retrieval: guard retrieved context
Model (core): your agent reasons
Execution: gate tools and actions
Output (out): reject or redact the reply
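
The reject-or-alter contract of the ordered stages can be sketched as a small runner. This is an illustrative model only: `Verdict`, `run_stages`, and the two toy checks are hypothetical names, and real detectors would be far more thorough than a substring match.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    allowed: bool
    content: str
    reason: Optional[str] = None


Stage = Callable[[str], Verdict]
SAFE_REFUSAL = "Sorry, I can't help with that request."


def run_stages(content: str, stages: list[tuple[str, Stage]]) -> Verdict:
    """Run stages in order; stop at the first rejection, carry alterations forward."""
    for name, stage in stages:
        verdict = stage(content)
        if not verdict.allowed:
            return Verdict(False, SAFE_REFUSAL, f"{name}: {verdict.reason}")
        content = verdict.content  # an altered message feeds the next stage
    return Verdict(True, content)


def reject_override_attempts(text: str) -> Verdict:
    """Toy rejecting stage: refuse an obvious override phrase."""
    if "ignore your rules" in text.lower():
        return Verdict(False, text, "override attempt")
    return Verdict(True, text)


def mask_emails(text: str) -> Verdict:
    """Toy altering stage: mask email addresses before the model sees them."""
    return Verdict(True, re.sub(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", "[EMAIL]", text))


INPUT_STAGES = [("input/injection", reject_override_attempts),
                ("input/pii", mask_emails)]
```

The same runner shape would apply to the retrieval, execution, and output stages; only the list of checks changes.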

The guardrail catalog

Input guardrails

runs before the model

Applied to the user message before the agent ever sees it. Can reject the input outright or alter it (e.g. mask sensitive data) before processing continues.

Jailbreak detection. Classifies and scores input for jailbreak patterns — attempts to get the agent to drop its constraints — and blocks them before they reach the model that holds the tool tokens. Benefit: the model never has to win an argument it might lose.
Prompt-injection detection. Identifies instructions smuggled into user content (or into data the user pastes) that try to hijack the agent. Benefit: injected “ignore your rules” payloads are stopped at the door.
Topic & scope control. Enforces an allowed/blocked topic list per tenant; off-scope requests are deterministically refused, not just discouraged. Benefit: the agent stays inside the job it was deployed for.
Content safety / moderation. Filters abusive, harmful, or policy-violating input. Benefit: abuse is filtered before it can shape a response.
Sensitive-data-in detection. Detects PII in the user’s message and can mask it before it reaches the model. Benefit: sensitive data isn’t needlessly exposed to the model or downstream logs.

Dialog guardrails

shapes the conversation

Operate on the canonical form of messages to influence how the dialog evolves — whether to invoke the model, run an action, or use a predefined safe response.

Allowed-path conversation flows. Conversation logic is defined explicitly, so the agent follows approved flows for sensitive or regulated interactions rather than free-wheeling. Benefit: high-stakes conversations follow a vetted script, not improvisation.

Retrieval guardrails

guards RAG context

Applied to chunks retrieved in a retrieval-augmented (RAG) flow. A retrieval guardrail can reject a chunk so it never reaches the prompt, or alter the relevant chunks.

Retrieved-context guarding. Screens documents and knowledge pulled in to answer a question. Benefit: a poisoned or malicious document can’t silently steer the agent through its own knowledge base — a key indirect-injection defense.

Execution / tool guardrails

wraps tool calls & actions

Control custom action and tool invocations. These wrap the moment the agent tries to do something in the outside world.

Tool allow-lists. The agent can only call the tools you’ve permitted; anything off the list is denied. Benefit: capability is opt-in, so a hijacked agent has a tiny action surface.
Approval gating for sensitive actions. High-risk operations — sending money, deleting data, sending outbound email on your behalf, infrastructure changes — pause for explicit approval before they run. Benefit: irreversible or costly actions require a human yes.
Argument validation. The arguments to a tool call are validated before execution. Benefit: a malformed or malicious call is caught before it touches an external system.
Result validation. What a tool returns is checked before it’s fed back into the agent. Benefit: tool output can’t become a fresh injection vector.
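
The four execution checks above can be combined in one wrapper around a tool call. This is a sketch under assumptions: `ToolPolicy`, `guarded_call`, and the exception names are hypothetical, and a real approval step would pause and wait for a human rather than raise.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


class ToolDenied(Exception):
    """The call is off the allow-list or failed validation."""


class ApprovalRequired(Exception):
    """The call is permitted but needs an explicit human yes first."""


@dataclass
class ToolPolicy:
    allowed: set[str]
    needs_approval: set[str] = field(default_factory=set)
    arg_checks: dict[str, Callable[[dict], bool]] = field(default_factory=dict)
    result_checks: dict[str, Callable[[Any], bool]] = field(default_factory=dict)


def guarded_call(policy: ToolPolicy, tools: dict[str, Callable[..., Any]],
                 name: str, args: dict, approved: bool = False) -> Any:
    # 1. Allow-list: anything not explicitly permitted is denied.
    if name not in policy.allowed:
        raise ToolDenied(f"{name} is not on the allow-list")
    # 2. Argument validation happens before the external system is touched.
    arg_check = policy.arg_checks.get(name)
    if arg_check is not None and not arg_check(args):
        raise ToolDenied(f"invalid arguments for {name}")
    # 3. Approval gating for sensitive actions.
    if name in policy.needs_approval and not approved:
        raise ApprovalRequired(name)
    result = tools[name](**args)
    # 4. Result validation before the output is fed back to the agent.
    result_check = policy.result_checks.get(name)
    if result_check is not None and not result_check(result):
        raise ToolDenied(f"{name} returned a result that failed validation")
    return result
```

Ordering the allow-list check first means a denied tool never has its arguments evaluated or its approval requested.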

Output guardrails

runs before the reply leaves

After the model generates a reply, output guardrails decide whether it is allowed, must be altered, or must be rejected — before the user ever sees it.

PII detection & redaction. Detects and masks sensitive entities — SSNs, credit-card numbers, secrets, emails, phone numbers — in the outgoing reply. Benefit: the agent can’t leak sensitive data, even if it somehow assembled it.
Content safety. Blocks unsafe, harmful, or policy-violating generations on the way out. Benefit: harmful output is stopped at the boundary, not after delivery.
Hallucination / fact-check (self-check). Optionally checks a claim for self-consistency or against retrieved context — sampling multiple answers and flagging disagreement, or verifying against source material. Benefit: reduces confidently-wrong answers on factual queries (off by default where it adds latency).
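
As a rough illustration of output-side redaction, the sketch below masks three of the entity types named above using regular expressions plus a Luhn checksum for card numbers. The patterns and names are assumptions chosen for brevity; production PII detection covers more entity types and formats.

```python
import re

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_ok(number: str) -> bool:
    """Luhn checksum, used to avoid masking arbitrary long digit runs."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact(text: str) -> tuple[str, list[str]]:
    """Return the redacted text and the labels of the entities that were masked."""
    found: list[str] = []

    def mask(label: str, only_if=lambda s: True):
        def _sub(match: re.Match) -> str:
            if only_if(match.group()):
                found.append(label)
                return f"[{label}]"
            return match.group()
        return _sub

    text = SSN.sub(mask("SSN"), text)
    text = CARD.sub(mask("CARD", luhn_ok), text)
    text = EMAIL.sub(mask("EMAIL"), text)
    return text, found
```

Returning the labels alongside the redacted text gives the audit trail something to record without storing the sensitive values themselves.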

Cross-cutting properties

Per-message enforcement. Guardrails run on every message, in every direction, and can’t be skipped per request.
Model-agnostic. The layer sits around whichever model your agent uses, so protection doesn’t change when the model does.
Audit logging. Every guardrail decision (what fired, on which rail, what was blocked or altered, and why) is written to a per-tenant audit trail you can review.
Fail-safe behavior. If a guardrail can’t run, the system fails safe (default fail-closed for governed enforcement) rather than waving a message through ungoverned.
Default-on, free. Enabled for every customer out of the box; security is a baseline, not a premium tier.
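
Fail-closed enforcement and audit logging fit naturally into one wrapper. The sketch below assumes a guardrail is a function returning `(allowed, reason)`; the `enforce` name and the audit record fields are hypothetical.

```python
import time
from typing import Callable

Check = Callable[[str], tuple[bool, str]]


def enforce(tenant_id: str, rail: str, check: Check,
            message: str, audit_log: list[dict]) -> bool:
    """Run one guardrail; treat any failure to run as a block, and log the decision."""
    try:
        allowed, reason = check(message)
    except Exception as exc:
        # Fail closed: a guardrail that cannot run never lets the message through.
        allowed, reason = False, f"guardrail error: {type(exc).__name__}"
    audit_log.append({
        "ts": time.time(),
        "tenant": tenant_id,
        "rail": rail,
        "allowed": allowed,
        "reason": reason,
    })
    return allowed
```

Every call appends exactly one record, so the audit trail reflects blocked, altered, and allowed decisions alike.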

The distinction that matters

Prompt-level instructions are the model persuading itself. Guardrails are an independent layer enforcing behavior in and out, and logging every decision. A sufficiently clever injection that survives a model’s trained refusal still has to get past deterministic checks it can’t talk to.

F Supply-chain & release security

Leak scanning as a hard CI gate. Immutable tags. Snapshot-and-rollback.

Release integrity is enforced by the pipeline, not by anyone remembering to check. Every agent image release passes through automated gates before it can ship.

Branding + secret leak scan as a hard CI gate. Every image release is scanned for secrets and for anything that shouldn’t be heading into a shipped image. If a credential or leak is detected, the build fails and the image does not ship — the gate is blocking, not advisory.
Immutable image tags. Releases are pinned to immutable tags rather than mutable ones like latest, so a given tenant is always running a known, fixed image — and a tag can’t be silently re-pointed underneath a running fleet.
Snapshot + rollback on fleet upgrades. Per-tenant volumes are snapshotted before a fleet upgrade. The brand-overlay → golden-image pipeline rebuilds the image, but per-tenant memory and skills volumes carry forward — and a bad rollout can be rolled back.
Governance versions in lockstep. The governance layer is baked into the same image as the agent, so the enforcement layer and the agent it wraps always ship and version together — there is no window where one is upgraded without the other.
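
A blocking leak gate and a pinned-tag check might look like the sketch below. The secret patterns are a small illustrative set, not OSP’s scanner, and `is_pinned` is only a syntactic check on the image reference; true tag immutability is enforced by the registry.

```python
import re
from pathlib import Path

# Illustrative patterns only; a real scanner uses a much larger rule set.
SECRET_PATTERNS = {
    "private key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "AWS access key id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "Slack token": re.compile(r"\bxox[baprs]-[0-9A-Za-z-]{10,}"),
}


def scan_text(text: str) -> list[str]:
    return [label for label, pattern in SECRET_PATTERNS.items() if pattern.search(text)]


def scan_tree(root: str) -> list[tuple[str, str]]:
    findings = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="ignore")
            findings.extend((str(path), label) for label in scan_text(text))
    return findings


def gate(root: str) -> int:
    """Return a process exit code: non-zero fails the build, so the image cannot ship."""
    findings = scan_tree(root)
    for path, label in findings:
        print(f"LEAK: {label} in {path}")
    return 1 if findings else 0


def is_pinned(image_ref: str) -> bool:
    """Reject references with no tag (implicit 'latest') or an explicit 'latest'."""
    if "@sha256:" in image_ref:
        return True
    last = image_ref.rsplit("/", 1)[-1]  # ignore any registry host:port prefix
    if ":" not in last:
        return False
    return last.split(":", 1)[1] != "latest"
```

In a pipeline, the job would end with something like `sys.exit(gate("."))`, which is what makes the gate blocking rather than advisory.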

G Shared responsibility & honest threat model

What OSP secures, and what stays yours.

No security page should pretend the operator owns every risk. Here is the honest split.

OSP secures

the platform & the architecture

Per-tenant container & network isolation
The OAuth broker, PKCE, and token storage
Encrypted vault & runtime-only secret injection
Postgres RLS and encryption in transit/at rest
The default-on governance layer on every message
Leak-scanned, immutable, rollback-able releases

You secure

your account & your grants

Your login — a strong, unique password
Two-factor authentication on your account
Which apps you connect and which scopes you grant
Reviewing and revoking access you no longer need
Your tenant’s topic, tool, and approval policies

The honest core: just like your bank, your email, and your phone, if someone obtains your login, they can get in. That risk is not unique to OSP — it is the same surface you already trust daily. Because connections are scoped tokens, you can revoke access at any time from the provider’s settings, with no dependency on us.

For data-processing specifics — what is processed, retention, sub-processors, and transfer terms — see the Data Processing Addendum and the Privacy Policy.

H References

Where these mechanisms come from.

The governance guardrail model described in §E follows the established “programmable rails around an LLM” pattern — ordered input → dialog → retrieval → execution → output stages, each able to reject or alter. The infrastructure controls map to the managed, audited services OSP runs on; the open standards below are public references for the mechanisms used.

OAuth 2.0 authorization-code flow with PKCE (proof key for code exchange): IETF RFC 7636 (PKCE)
OAuth 2.0 Authorization Framework (scoped, revocable token grants): IETF RFC 6749
OAuth 2.0 Token Revocation: IETF RFC 7009
Postgres Row-Level Security and the encrypted secrets vault OSP’s data layer is built on: Supabase · Row-Level Security
Sensitive-data (PII) detection and anonymization concepts (SSNs, credit cards, emails, phones, names): PII detection & anonymization (open source)

Secure by architecture. Live in minutes.

Your own container, brokered OAuth, a vault for your keys, RLS on your data, and governance on every message — from the first message.
