Multi-Tenant SaaS Architecture: Isolation, Auth & Billing Patterns • RiseGravity

What multi-tenant SaaS architecture actually means

Multi-tenant SaaS architecture is the practice of serving many customers (tenants) from a single, shared application and infrastructure while keeping each tenant's data, configuration, and billing strictly separate. Get it right and one deploy serves a thousand companies. Get it wrong and you ship data leaks, noisy-neighbor outages, and a billing system nobody trusts.

We've built multi-tenant platforms across very different shapes of load—from DomainFlow.ai, an AI domain-sales automation SaaS with per-operation cost metering, to ProTerminal.io, an institutional-grade market terminal running 16 app instances and 100+ background jobs. The fundamentals below are the ones that survived contact with production.

Key takeaways

Isolation is a spectrum, not a switch. Choose a model per workload, not per company.
Tenancy belongs in the data layer first. If isolation lives only in your WHERE clauses, one missing filter is a breach.
Auth and authorization are different problems. Authenticate the user once; authorize against the tenant on every request.
Meter before you bill. You cannot invoice usage you never recorded.
Make the right thing the easy thing. Tenant scoping should be the default path, not a thing engineers remember to add.

The three isolation models

Every multi-tenant decision starts here. There are three canonical models, and mature platforms usually mix them.

1. Shared database, shared schema (pooled)

All tenants live in the same tables, separated by a tenant_id column on every row. This is the cheapest to operate and the easiest to scale horizontally—one schema to migrate, one connection pool to tune.

-- Every tenant-owned table carries the discriminator
create table invoices (
  id          uuid primary key default gen_random_uuid(),
  tenant_id   uuid not null references tenants(id),
  amount_cents integer not null,
  status      text not null,
  created_at  timestamptz not null default now()
);

-- The composite index makes tenant-scoped reads fast
create index idx_invoices_tenant on invoices (tenant_id, created_at desc);

The risk is obvious: forget WHERE tenant_id = $1 once and a customer sees another customer's invoices. The fix is to make scoping structural, which we cover below with row-level security.

2. Shared database, schema per tenant (bridge)

Each tenant gets its own schema (tenant_acme.invoices, tenant_globex.invoices) inside one database. You gain cleaner isolation and per-tenant backup/restore, at the cost of migrations that must fan out across every schema. This works well up to a few hundred tenants; past that, schema sprawl makes migrations slow and brittle.
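The fan-out can be sketched as a loop that applies one migration to each tenant schema and records failures instead of stopping at the first one. Everything here is illustrative: `migrateAllSchemas`, `quoteIdent`, and the `runInTransaction` callback (assumed to execute a list of SQL statements in a single transaction) are hypothetical names, not part of any library.

```javascript
// Quote a schema name as a SQL identifier; identifiers cannot be bound as parameters.
const quoteIdent = (name) => `"${String(name).replace(/"/g, '""')}"`;

// Apply one migration to every tenant schema, collecting failures for retry.
// ASSUMPTION: runInTransaction(statements) runs the statements in one transaction
// and rejects (after rolling back) if any statement fails.
async function migrateAllSchemas(schemas, migrationSql, runInTransaction) {
  const failed = [];
  for (const schema of schemas) {
    try {
      await runInTransaction([
        // Scope unqualified table names to this tenant's schema.
        `set local search_path to ${quoteIdent(schema)}`,
        migrationSql,
      ]);
    } catch (err) {
      failed.push({ schema, error: err.message });
    }
  }
  return { applied: schemas.length - failed.length, failed };
}
```

Returning the failed schemas rather than throwing lets a retry target only the tenants that were left behind, which matters more as the schema count grows.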

3. Database per tenant (siloed)

Every tenant gets a dedicated database (or cluster). This is the strongest isolation—ideal for enterprise customers with compliance requirements, data-residency rules, or a contractual "your data never shares a process with anyone else." The trade-off is operational weight: connection management, provisioning automation, and per-tenant migration orchestration.

Choosing a model

Dimension	Pooled	Schema-per-tenant	DB-per-tenant
Cost per tenant	Lowest	Medium	Highest
Isolation strength	Lowest	Medium	Highest
Migration effort	One run	Per schema	Per database
Noisy-neighbor risk	High	Medium	Low
Best for	Self-serve, SMB	Mid-market	Enterprise/compliance

The pragmatic answer for most SaaS: start pooled, with the door open to promote your largest or most regulated tenants to siloed databases later. Design your data-access layer so the tenant resolves to a connection string—then "where does this tenant live" becomes a lookup, not a rewrite.
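A minimal sketch of that lookup, assuming tenant placements are stored somewhere queryable (a `Map` stands in for that store here). The URLs, `resolveConnectionString`, `poolFor`, and the `createPool` factory are all hypothetical.

```javascript
// ASSUMPTION: placements would live in a control-plane table; a Map stands in here.
const SHARED_URL = "postgres://shared-db.example.internal/app";
const placements = new Map([
  ["globex", "postgres://globex-db.example.internal/app"], // promoted to a siloed database
]);

// "Where does this tenant live" is a lookup with a pooled default.
function resolveConnectionString(tenantId) {
  if (!tenantId) throw new Error("tenant id is required");
  return placements.get(tenantId) ?? SHARED_URL;
}

// Reuse one pool per connection string so pooled tenants share connections.
const pools = new Map();
function poolFor(tenantId, createPool) {
  const url = resolveConnectionString(tenantId);
  if (!pools.has(url)) pools.set(url, createPool(url));
  return pools.get(url);
}
```

Promoting a tenant then means copying its data and adding one placement entry; callers of `poolFor` do not change.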

Enforcing isolation at the data layer

If your only defense against cross-tenant reads is application code, you are one forgotten filter away from an incident. Push enforcement down a layer.

Row-level security (RLS)

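PostgreSQL's row-level security turns the discriminator into a database-enforced rule. Even a buggy query can't escape the tenant boundary, as long as the application connects as a role the policy applies to: superusers and roles with BYPASSRLS skip RLS entirely, and the table owner skips it unless RLS is forced.

alter table invoices enable row level security;
-- Table owners bypass RLS unless it is forced
alter table invoices force row level security;

create policy tenant_isolation on invoices
  using (tenant_id = current_setting('app.tenant_id')::uuid);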

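Your application sets app.tenant_id at the start of each request (from the authenticated session), and every query against invoices is automatically constrained. If connections are pooled, set it transaction-locally (set local, or set_config with is_local = true) so the value cannot carry over to the next request that reuses the connection. The filter is no longer something a developer can forget: it's policy.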
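One way to set app.tenant_id per request is to wrap each unit of work in a transaction. This is a sketch under assumptions: `withTenant` is a hypothetical helper, and `pool` is assumed to expose `connect()` returning a client with `query()` and `release()` (the shape node-postgres uses). `set_config(name, value, is_local)` is a standard PostgreSQL function; passing `true` limits the setting to the current transaction.

```javascript
// Runs `work` inside a transaction whose app.tenant_id is set transaction-locally.
// ASSUMPTION: pool.connect() resolves to a client with query(text, params) and release().
async function withTenant(pool, tenantId, work) {
  if (!tenantId) throw new Error("withTenant requires a tenant id");
  const client = await pool.connect();
  try {
    await client.query("begin");
    // is_local = true: the setting disappears at commit or rollback,
    // so it cannot leak to whoever borrows this connection next.
    await client.query("select set_config('app.tenant_id', $1, true)", [tenantId]);
    const result = await work(client);
    await client.query("commit");
    return result;
  } catch (err) {
    await client.query("rollback");
    throw err;
  } finally {
    client.release();
  }
}
```

A handler would then call something like `withTenant(pool, ctx.tenantId, (db) => db.query("select * from invoices"))` and rely on the policy to do the filtering.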

A tenant-aware data layer

Whatever your database, route all access through a layer that requires a tenant context. A repository that won't compile—or won't run—without a tenant id is worth more than a wiki page reminding people to add filters.

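// Every query starts from a tenant-scoped client; there is no "global" find.
function tenantClient(tenantId) {
  // Fail fast: a client without a tenant must never exist.
  if (!tenantId) throw new Error("tenantClient requires a tenant id");
  return {
    invoices: {
      // tenant_id is spread last so a caller-supplied value cannot override it.
      find: (q) => db.invoices.find({ ...q, tenant_id: tenantId }),
      create: (doc) => db.invoices.insert({ ...doc, tenant_id: tenantId }),
    },
  };
}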

The principle: make tenant scoping the default path and an un-scoped query the conspicuous exception that shows up in code review.

Authentication vs. authorization in multi-tenant systems

These two get conflated constantly, and the confusion causes real bugs.

Authentication answers "who is this user?" You do it once per session, typically with a short-lived access token (JWT) plus a refresh token.
Authorization answers "is this user allowed to do this in this tenant?" You do it on every request.

A single user often belongs to multiple tenants (think an agency managing several client workspaces). So identity is global, but membership and role are per tenant.

// A JWT carries identity; the active tenant + role are resolved per request.
// Never trust a tenant_id sent by the client without checking membership.
async function resolveContext(req) {
  const userId = verifyAccessToken(req.headers.authorization);
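  const tenantId = req.headers["x-tenant-id"];
  // A missing header must not turn into an unfiltered membership lookup.
  if (!tenantId) throw new ForbiddenError("No active tenant specified");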

  const membership = await memberships.findOne({ userId, tenantId });
  if (!membership) throw new ForbiddenError("Not a member of this tenant");

  return { userId, tenantId, role: membership.role };
}

The critical rule: a tenant id supplied by the client is a request, not a fact. Always confirm the authenticated user is a member of that tenant before honoring it.

Role-based access control (RBAC)

Most B2B SaaS needs at least owner / admin / member / viewer. Model roles per membership, and check permissions—not roles—at the call site, so you can refactor the role-to-permission mapping without touching every handler.

const PERMISSIONS = {
  owner:  ["billing:manage", "members:manage", "data:write", "data:read"],
  admin:  ["members:manage", "data:write", "data:read"],
  member: ["data:write", "data:read"],
  viewer: ["data:read"],
};

// Unknown roles get no permissions; always returns a boolean.
const can = (role, permission) => PERMISSIONS[role]?.includes(permission) ?? false;
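Checking permissions at the call site might look like the sketch below. The mapping is repeated so the example runs on its own; `requirePermission`, `updateBilling`, and the shape of `ctx` (a `{ tenantId, role }` object resolved per request) are hypothetical.

```javascript
const PERMISSIONS = {
  owner:  ["billing:manage", "members:manage", "data:write", "data:read"],
  admin:  ["members:manage", "data:write", "data:read"],
  member: ["data:write", "data:read"],
  viewer: ["data:read"],
};

const can = (role, permission) => PERMISSIONS[role]?.includes(permission) ?? false;

// Wrap a handler so it names the permission it needs, never a role.
function requirePermission(permission, handler) {
  return (ctx, ...args) => {
    if (!can(ctx.role, permission)) {
      const err = new Error(`Missing permission: ${permission}`);
      err.status = 403;
      throw err;
    }
    return handler(ctx, ...args);
  };
}

// Hypothetical handler: only roles holding billing:manage may change the plan.
const updateBilling = requirePermission("billing:manage", (ctx, plan) => ({
  tenantId: ctx.tenantId,
  plan,
}));
```

If billing rights later move to a new role, only the `PERMISSIONS` table changes; `updateBilling` stays as it is.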

Metered billing without surprises

Billing is where multi-tenant SaaS quietly leaks money—or quietly overcharges customers and burns trust. Both are avoidable.

Meter first, bill second

You can only invoice what you recorded. Emit a usage event at the moment a billable action happens, write it to an append-only log keyed by tenant, then aggregate on a schedule. On DomainFlow we metered the cost of every AI operation per tenant—research calls, pitch generations, enrichment lookups—so the invoice was a sum of facts, not an estimate.

// Append-only usage events; aggregation happens later, idempotently.
await usageEvents.insert({
  tenantId,
  meter: "ai.pitch.generated",
  quantity: 1,
  costCents: 4,        // your cost, for margin tracking
  occurredAt: new Date(),
  idempotencyKey,      // dedupe retries so you never double-count
});
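The aggregation step can be sketched as a pure function over recorded events. It assumes `costCents` is the total cost of the event (not a per-unit price) and that `idempotencyKey` is unique within a tenant; `aggregateUsage` is a hypothetical name.

```javascript
// Sum usage per (tenant, meter), skipping events whose idempotency key was already seen.
function aggregateUsage(events) {
  const seen = new Set();
  const totals = new Map();
  for (const e of events) {
    const dedupeKey = `${e.tenantId}:${e.idempotencyKey}`;
    if (seen.has(dedupeKey)) continue; // a retried write of the same event
    seen.add(dedupeKey);

    const bucket = `${e.tenantId}:${e.meter}`;
    const total = totals.get(bucket) ??
      { tenantId: e.tenantId, meter: e.meter, quantity: 0, costCents: 0 };
    total.quantity += e.quantity;
    total.costCents += e.costCents;
    totals.set(bucket, total);
  }
  return [...totals.values()];
}
```

Because the function only reads the log, running it twice over the same events produces the same totals, which is what makes the scheduled job safe to retry.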

Plans, limits, and enforcement

Three layers keep billing honest and infrastructure safe:

Plan definition — what each tier includes (seats, API calls, storage, feature flags).
Soft limits — warn at 80%, nudge to upgrade; never surprise a customer with a hard wall mid-task.
Hard limits — per-tenant rate limits and spend caps that protect you from a runaway tenant (or a leaked API key) running up an unbounded bill.
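The soft and hard layers can be sketched as a single check. This is one possible reading of the tiers above, not a prescribed design: `evaluateUsage`, the field names, and the idea that the hard cap may sit above the plan's included amount are all assumptions.

```javascript
// Classify a tenant's usage for one meter.
//   "ok"      below the soft threshold
//   "warn"    at or above softRatio of the included amount (nudge to upgrade)
//   "blocked" at or above the hard cap (protects against runaway usage)
function evaluateUsage({ used, included, hardCap, softRatio = 0.8 }) {
  if (used >= hardCap) return "blocked";
  if (used >= included * softRatio) return "warn";
  return "ok";
}
```

For example, with `included: 1000` and `hardCap: 5000`, a tenant starts seeing warnings at 800 units and is only stopped at 5000, so nobody hits a wall the moment the plan allowance runs out.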

Reconcile with your payment provider via webhooks

Stripe is the source of truth for payment state; your database is the source of truth for usage. Keep them in sync with idempotent webhook handlers—checkout.session.completed, invoice.paid, customer.subscription.updated—and treat every webhook as potentially duplicated. (Idempotency is a recurring theme; we go deep on it in Background Jobs at Scale with BullMQ and Redis.)
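A sketch of duplicate-safe webhook handling, keyed on the event's `id` and dispatched on its `type` (both are fields of a Stripe event). `createWebhookProcessor` and the in-memory `Set` are hypothetical; signature verification is assumed to have happened before this function is called, and a real system would record processed ids in the database under a unique constraint.

```javascript
// Build a processor that runs each event's handler at most once per event id.
function createWebhookProcessor(handlers, processed = new Set()) {
  return async function handleEvent(event) {
    if (processed.has(event.id)) return "duplicate"; // redelivery: acknowledge, do nothing
    const handler = handlers[event.type];
    if (!handler) return "ignored"; // event types we do not act on
    await handler(event);
    // Mark as done only after success so a failed delivery can be retried.
    processed.add(event.id);
    return "processed";
  };
}
```

The in-memory set does not protect against two copies of the same event arriving at once on different processes; that is the gap the database constraint would close.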

Avoiding the noisy-neighbor problem

In a pooled model, one tenant's heavy usage can degrade everyone else. Defenses:

Per-tenant rate limiting at the edge, not just global limits.
Fair-share queues for background work, so one tenant's 10,000 queued jobs don't starve everyone else's. Weight or partition queues by tenant.
Connection-pool budgets so a single tenant can't exhaust the database pool.
Promote heavy tenants to siloed resources once they outgrow the shared tier—this is exactly why you kept the door open in your data layer.
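Partitioning queued work by tenant can be sketched as a round-robin over per-tenant lists. `FairQueue` is a hypothetical in-memory class used only to show the scheduling order; it is not how any particular queue library implements fairness.

```javascript
// Round-robin across tenants: each call to next() serves the tenant at the
// front of the rotation, then moves that tenant to the back.
class FairQueue {
  constructor() {
    this.queues = new Map(); // tenantId -> pending jobs; Map keeps insertion order
  }

  enqueue(tenantId, job) {
    if (!this.queues.has(tenantId)) this.queues.set(tenantId, []);
    this.queues.get(tenantId).push(job);
  }

  next() {
    const first = this.queues.keys().next();
    if (first.done) return undefined; // nothing queued
    const tenantId = first.value;
    const jobs = this.queues.get(tenantId);
    const job = jobs.shift();
    this.queues.delete(tenantId);
    if (jobs.length > 0) this.queues.set(tenantId, jobs); // re-insert at the back
    return { tenantId, job };
  }
}
```

With this ordering, a tenant that queues thousands of jobs delays another tenant's single job by at most one slot per active tenant, not by its whole backlog.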

A pragmatic rollout sequence

You don't need every pattern on day one. Ship in this order:

1. Pooled database with a non-null tenant_id on every tenant-owned table.
2. A tenant-aware data layer that refuses un-scoped queries.
3. Database-enforced isolation (RLS or equivalent) before your first external customer.
4. Identity (global) + membership/role (per tenant), with server-side membership checks.
5. Usage metering as append-only events from the first billable action.
6. Plan limits and Stripe webhook reconciliation.
7. Per-tenant rate limiting and fair-share queues as you scale.
8. Siloed databases for enterprise/compliance tenants when the need is real.

Tenant lifecycle: provisioning, observability, and offboarding

Architecture diagrams stop at "tenant exists." Real platforms have to create tenants, watch them, and eventually let them leave—cleanly.

Provisioning should be automated from the first signup. Creating a tenant is more than a row: it's a default role assignment for the first user (owner), an initial plan, any per-tenant resources (a schema or database if you're siloed, a storage prefix, default settings), and often seed data so the workspace isn't empty. Script this end to end—manual tenant setup is a bottleneck and a source of inconsistency the day you get traction.
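The provisioning steps can be sketched as one function that is safe to call twice. The `store` shape, the field names, and the defaults (a `free` plan, a `tenants/<id>/` storage prefix, a UTC timezone setting) are hypothetical placeholders for whatever your platform actually needs.

```javascript
// Create everything a new tenant needs in one place.
// ASSUMPTION: `store` is { tenants: Map, memberships: Array }, standing in for a database.
function provisionTenant(store, { tenantId, name, ownerUserId, plan = "free" }) {
  // Idempotent: a retried signup returns the existing tenant, with no duplicates.
  if (store.tenants.has(tenantId)) return store.tenants.get(tenantId);

  const tenant = {
    id: tenantId,
    name,
    plan,
    storagePrefix: `tenants/${tenantId}/`, // per-tenant resource
    settings: { timezone: "UTC" },         // default settings
  };
  store.tenants.set(tenantId, tenant);
  // The first user becomes the owner.
  store.memberships.push({ tenantId, userId: ownerUserId, role: "owner" });
  return tenant;
}
```

In a real database these writes would share one transaction, so a failure halfway through never leaves a tenant without an owner.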

Per-tenant observability is what lets you operate at scale. Aggregate dashboards hide the tenant who's about to churn or the one quietly melting a database. Tag your logs, metrics, and traces with tenant_id so you can answer "is this slow for everyone or just Acme?" in seconds. On ProTerminal, per-tenant and per-job-class metrics were the difference between diagnosing a problem in minutes versus guessing. Track, per tenant: request latency, error rate, usage against plan limits, and cost-to-serve—the last one tells you which customers are actually profitable.
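Tagging can be as simple as a logger that stamps every record with the tenant. `tenantLogger` and its `sink` parameter are hypothetical; a real setup would hand the record to whatever logging library you already use.

```javascript
// Build a logger whose every record carries tenant_id.
function tenantLogger(tenantId, sink = console.log) {
  const emit = (level, message, fields = {}) =>
    // tenant_id is written last so caller-supplied fields cannot overwrite it.
    sink(JSON.stringify({ level, message, ...fields, tenant_id: tenantId }));
  return {
    info: (message, fields) => emit("info", message, fields),
    error: (message, fields) => emit("error", message, fields),
  };
}
```

Creating the logger once per request, next to the resolved tenant context, means handlers never have to remember to add the tag themselves.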

Offboarding is a feature, not an afterthought, and increasingly a legal requirement. When a tenant cancels, you need a defined path: export their data on request, then delete it on a schedule (with the audit trail to prove you did). In a siloed model this is "drop the database"; in a pooled model it's a careful, tenant-scoped cascade delete—another reason your data layer must make tenant scoping structural. Build the export and deletion flows before you need them; scrambling to honor a deletion request manually is how mistakes happen.

Frequently asked questions

Should I start with database-per-tenant for safety? Usually no. Start pooled with database-enforced isolation; it's far cheaper to operate and scales to thousands of self-serve tenants. Reserve siloed databases for tenants who genuinely require it (compliance, data residency, enterprise contracts) and design your data layer so promoting them later is a configuration change.

Where should the tenant id live in a request? Identity comes from the access token; the active tenant comes from an explicit header or path segment—then you verify the authenticated user is a member of that tenant. Never infer the tenant from client-supplied data without a membership check.

How do I handle a user who belongs to many tenants? Keep identity global and store one membership row per (user, tenant) with its own role. The user authenticates once and switches the active tenant per request; authorization is always evaluated against the membership for the active tenant.

What's the most common multi-tenant bug? A query missing its tenant filter. The fix is structural: enforce isolation in the database (RLS) and route all access through a tenant-scoped data layer so an un-scoped query is the visible exception, not an easy accident.

Build it with a team that's shipped it

Multi-tenant architecture rewards getting the foundations right early—isolation, auth, and metering are painful to retrofit. If you're designing a new SaaS platform or untangling tenancy in an existing one, we can help. See recent work on our Projects page, read our companion guides on background jobs and RAG in production, or reach out at contact@risegravity.com.