In short
Databases enforce access control in three layers, and a request that runs a single SELECT passes through all three before a byte of data leaves the server.
Layer 1 — authentication. The server verifies who the client claims to be. The credential can be a password hashed with SCRAM-SHA-256 (the modern Postgres default), a client certificate (mTLS), a Kerberos/GSSAPI ticket, an LDAP bind, or a short-lived IAM token signed by a cloud provider. Postgres's pg_hba.conf is a first-match-wins table mapping (host, database, user, client IP) to an authentication method. A connection that does not match any rule is rejected before it sees a prompt.
Layer 2 — authorisation via roles. A role is a named bundle of privileges. Privileges are per-object — SELECT on a table, USAGE on a schema, EXECUTE on a function. Roles can be granted to other roles, producing an inheritance graph; SET ROLE switches the active identity within a session; SECURITY DEFINER functions allow controlled privilege elevation for specific operations. Default privileges (ALTER DEFAULT PRIVILEGES) extend grants to future objects, not just existing ones.
Layer 3 — row-level security (RLS). After authorisation passes, a per-row predicate filters what the user sees. CREATE POLICY tenant_iso ON tickets FOR SELECT USING (tenant_id = current_setting('app.tenant_id')::int) causes every SELECT on tickets to transparently AND the predicate into the query. This is the cleanest way to build multi-tenant SaaS: one shared table, one policy, zero application-level WHERE clauses. The cost is that the policy runs per row touched — you index the predicate or it hurts.
Postgres has full RLS. Oracle calls it Virtual Private Database (VPD). SQL Server ships RLS. MySQL has no native RLS; you emulate it with per-user views. This chapter walks all three layers, builds a working multi-tenant policy in Python, and lists the traps — including the fact that Postgres silently lets the table owner bypass every policy unless you say FORCE ROW LEVEL SECURITY.
A SaaS table with three tenants and no leaks
Picture a B2B support product. One Postgres table holds tickets for every customer:
tickets(id bigint, tenant_id int, subject text, body text, created_at timestamptz)
Tenant 1 (Acme Corp) has 4000 tickets. Tenant 2 (Contoso) has 400. Tenant 3 (Globex) has 40000. Your HTTP backend receives a request from a signed-in Acme user and runs SELECT subject, created_at FROM tickets ORDER BY created_at DESC LIMIT 20. This query — as written — draws the 20 most recent tickets from all 44400 across all three tenants. A single missing WHERE tenant_id = 1 and Acme's support dashboard lists Globex's tickets.
Three options.
A — application-level filtering. Every query carries a WHERE tenant_id = $1. Works until someone writes a new endpoint, forgets the clause, and leaks production data. One missing line is a security incident.
B — one schema or one database per tenant. Fresh copies of every table per tenant. Isolation is total, but DDL must walk N tenants and at 10 000 tenants this is operationally dead.
C — row-level security. Enable RLS on tickets once, write one policy that filters tenant_id = current_setting('app.tenant_id')::int, and the backend sets app.tenant_id at the start of each request. Every query — aggregates, joins, even mistaken ones missing the WHERE — transparently sees only the current tenant's rows. Three lines of DDL:
ALTER TABLE tickets ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_iso ON tickets FOR ALL
USING (tenant_id = current_setting('app.tenant_id')::int)
WITH CHECK (tenant_id = current_setting('app.tenant_id')::int);
That is the punchline. The rest of this chapter is what sits underneath — the authentication that decides who current_user is in the first place, the role machinery that decides what tables they can touch at all, and the RLS machinery that decides which rows inside those tables.
Layer 1 — Authentication
Authentication runs after the TCP and TLS handshakes complete. The wire protocol opens with a StartupMessage advertising user=alice dbname=app. The server responds with an AuthenticationRequest telling the client which method to use. Not a negotiation — the server picks based on pg_hba.conf and the client either has the credential the server demanded or the connection fails.
Credentials the server will accept
- Password (SCRAM-SHA-256). Introduced in Postgres 10 and the default since Postgres 14. The client proves possession of the password in a challenge-response exchange with a per-connection nonce, so the password is never sent in the clear and the exchange is never reusable. Why SCRAM and not MD5: the MD5 verifier in pg_authid was a replay hazard — possession of the stored hash md5(password || username) was enough to compute the wire response, so the hash was equivalent to the password. SCRAM's challenge-response changes the wire value every handshake, and the stored verifier is not sufficient to log in.
- Client certificate (cert, mTLS). The client presents an X.509 certificate during the TLS handshake; its Common Name (or a mapped field via pg_ident.conf) must match the Postgres user. No password — the cryptographic handshake is the authentication. The production default for service-to-service traffic once you have a certificate authority.
- Kerberos / GSSAPI. Enterprise single sign-on. The client presents a Kerberos service ticket for postgres/host.example.com@REALM. Common inside large organisations with Active Directory.
- LDAP. Postgres delegates the credential check to an LDAP server. Operationally easy; you lose per-database password policy control.
- IAM tokens (cloud). On AWS RDS, Google Cloud SQL, Azure, the "password" is a short-lived signed token the cloud IAM system issues (valid ~15 minutes), verified against the IAM public key. No password leaves the developer's laptop; revocation is immediate.
- trust. No authentication — any client claiming a username is granted it. Only safe for Unix-domain sockets or bootstrapping. Accidentally leaving this on a TCP listener is a common cloud misconfiguration.
pg_hba.conf — the rules table
pg_hba.conf (host-based authentication) is the authoritative config file that maps a connecting client to an authentication method. It is read top to bottom; first matching rule wins; no rule matches → reject.
Each line is type database user address auth-method [options]:
# TYPE DATABASE USER ADDRESS METHOD
local all postgres peer
host all all 127.0.0.1/32 scram-sha-256
host app app_rw 10.0.0.0/8 scram-sha-256
hostssl app app_ro 10.0.0.0/8 cert clientcert=verify-full
host replication replicator 10.0.1.0/24 scram-sha-256
host all all 0.0.0.0/0 reject
The first rule permits local (Unix socket) connections from OS user postgres via peer auth — Postgres reads the Unix UID and matches it to a role of the same name. The second allows localhost with SCRAM. The third permits app_rw from the 10/8 internal network. The fourth requires TLS plus a verified client certificate for app_ro. The fifth permits replication streams. The last rejects anything else before it can guess.
Three traps. Order matters absolutely — a permissive rule above a restrictive one wins; host all all 0.0.0.0/0 trust at the top lets anyone in. host vs hostssl — host matches both TLS and plain TCP; production should use hostssl for every TCP rule and pair them with a hostnossl reject rule to catch TLS-less clients. Reload, not restart — pg_hba.conf is re-read on SELECT pg_reload_conf(); existing connections keep their decisions, new ones use the fresh file.
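The first-match-wins resolution can be sketched in a few lines of Python. This is a toy model with a hypothetical rule set, covering only the database/user/address columns:

```python
import ipaddress

# Hypothetical subset of a pg_hba.conf, as (type, database, user, cidr, method).
RULES = [
    ("host", "all", "all", "127.0.0.1/32", "scram-sha-256"),
    ("host", "app", "app_rw", "10.0.0.0/8", "scram-sha-256"),
    ("host", "all", "all", "0.0.0.0/0", "reject"),
]

def resolve(db, user, ip):
    """Return the auth method of the first matching rule — later rules are never consulted."""
    addr = ipaddress.ip_address(ip)
    for _type, r_db, r_user, cidr, method in RULES:
        if r_db not in ("all", db):
            continue
        if r_user not in ("all", user):
            continue
        if addr not in ipaddress.ip_network(cidr):
            continue
        return method  # first match wins
    return "reject"    # no rule matches -> implicit reject

print(resolve("app", "app_rw", "10.1.2.3"))    # scram-sha-256 (rule 2)
print(resolve("app", "app_rw", "203.0.113.9")) # reject (falls to the catch-all)
```

Swapping the catch-all to the top of RULES would make every connection hit it first — the order-matters trap in one line.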
Password storage and the connection string
Inside pg_authid, Postgres does not store your password. It stores a derived verifier of the form SCRAM-SHA-256$4096:<salt>$<stored_key>:<server_key>. The SCRAM handshake runs on both sides so that neither the stored verifier nor the wire traffic is sufficient to impersonate you — the verifier is enough to verify a correct password but not to produce a valid new handshake transcript.
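The verifier's components follow the SCRAM construction of RFC 5802 and can be derived with Python's standard library. A minimal sketch — the format string mirrors the pg_authid layout described above, but treat the details as illustrative:

```python
import base64
import hashlib
import hmac
import os

def scram_sha256_verifier(password: str, salt: bytes, iterations: int = 4096) -> str:
    """Derive the stored verifier parts (RFC 5802 terms)."""
    # SaltedPassword = PBKDF2-HMAC-SHA256(password, salt, iterations)
    salted = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    # StoredKey can *verify* a client proof but cannot *produce* one:
    # producing a proof requires ClientKey, which is not stored.
    stored_key = hashlib.sha256(client_key).digest()
    server_key = hmac.new(salted, b"Server Key", hashlib.sha256).digest()
    b64 = lambda b: base64.b64encode(b).decode()
    return f"SCRAM-SHA-256${iterations}:{b64(salt)}${b64(stored_key)}:{b64(server_key)}"

print(scram_sha256_verifier("s3cret", os.urandom(16)))  # a fresh verifier each run
```

The asymmetry in the comment is the whole point: the server keeps only StoredKey and ServerKey, so a stolen pg_authid row does not let an attacker complete a handshake.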
The connection string carries everything a client needs to reach the server:
postgres://app_rw:s3cret@db.example.com:5432/app?sslmode=require&application_name=checkout
The sslmode=require parameter is load-bearing. Without it, libpq defaults to sslmode=prefer and silently falls back to plain TCP if TLS fails — a fallback a network attacker can force. Always use sslmode=verify-full in production: it verifies the server certificate against a trusted CA and checks that the hostname matches the CN/SAN, which blocks the man-in-the-middle attacks that require alone does not.
Layer 2 — Roles and privileges
Authentication tells the server who you are. Authorisation tells it what you can do. Postgres represents both users and groups with a single abstraction — the role. A role with the LOGIN attribute is what other databases call a user; a role without it is what they call a group. CREATE USER alice is a shorthand for CREATE ROLE alice LOGIN.
Granting privileges
Privileges are always (role, privilege, object) triples. The catalogue is small and worth memorising:
| Object | Privileges |
|---|---|
| Table | SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER |
| Column | SELECT, INSERT, UPDATE, REFERENCES (column-level grants) |
| Schema | USAGE, CREATE |
| Function / Procedure | EXECUTE |
| Sequence | USAGE, SELECT, UPDATE |
| Database | CONNECT, CREATE, TEMPORARY |
The grammar:
GRANT SELECT, INSERT ON tickets TO app_rw;
GRANT USAGE ON SCHEMA public TO app_rw;
GRANT EXECUTE ON FUNCTION reset_password(uuid) TO app_admin;
REVOKE DELETE ON tickets FROM app_rw;
A read-only role in three lines:
CREATE ROLE app_ro LOGIN PASSWORD 'x';
GRANT CONNECT ON DATABASE app TO app_ro;
GRANT USAGE ON SCHEMA public TO app_ro;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO app_ro;
Why CONNECT and USAGE separately: CONNECT lets the role open a connection to the database, USAGE on a schema lets the role see objects inside it, and SELECT lets the role read rows from a specific table. Each is a separate gate because the attack surface is different — you might permit CONNECT (for a monitoring tool that runs SELECT 1) without permitting USAGE on any business schema.
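The three gates can be modelled as independent lookups that must all pass — a purely illustrative sketch with hypothetical role and object names:

```python
# Toy privilege store: each entry is one (role, privilege, object) triple.
grants = {
    ("app_ro", "CONNECT", "db:app"),
    ("app_ro", "USAGE", "schema:public"),
    ("app_ro", "SELECT", "table:public.tickets"),
}

def can_read(role, db, schema, table):
    """All three gates must pass; failing any one denies the read."""
    return all(g in grants for g in [
        (role, "CONNECT", f"db:{db}"),                # gate 1: open a connection
        (role, "USAGE", f"schema:{schema}"),          # gate 2: see into the schema
        (role, "SELECT", f"table:{schema}.{table}"),  # gate 3: read the rows
    ])

print(can_read("app_ro", "app", "public", "tickets"))  # True — all three grants exist
print(can_read("app_ro", "app", "public", "users"))    # False — no SELECT grant on users
```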
Default privileges — grants that apply to future objects
A subtle trap: GRANT SELECT ON ALL TABLES IN SCHEMA public TO app_ro grants on existing tables only. A migration that creates tickets_archive next week leaves app_ro with no read access to it. The fix is ALTER DEFAULT PRIVILEGES:
ALTER DEFAULT PRIVILEGES FOR ROLE app_owner IN SCHEMA public
GRANT SELECT ON TABLES TO app_ro;
This tells Postgres: whenever app_owner creates a table in public, grant SELECT on it to app_ro automatically. You set it up once and new tables inherit the read grant. Why the FOR ROLE app_owner clause matters: default privileges are keyed by the creating role, not globally. If your CI creates tables as ci_deployer but you set defaults for app_owner, new tables slip through with no read grant. Always scope the default to the role that actually runs your migrations.
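The keying behaviour — and the trap — can be sketched as a lookup table indexed by (creating role, schema). Names here are hypothetical:

```python
# Default privileges are keyed by (creating role, schema), not set globally.
defaults = {("app_owner", "public"): [("SELECT", "app_ro")]}
acl = {}  # table name -> set of (grantee, privilege)

def create_table(creator, schema, table):
    """New table starts with an empty ACL, then defaults keyed to its creator apply."""
    acl[table] = set()
    for priv, grantee in defaults.get((creator, schema), []):
        acl[table].add((grantee, priv))

create_table("app_owner", "public", "tickets_archive")  # gets the default grant
create_table("ci_deployer", "public", "tickets_v2")     # slips through: defaults keyed to app_owner
print(acl["tickets_archive"])  # {('app_ro', 'SELECT')}
print(acl["tickets_v2"])       # set()
```

The second table is exactly the migration-run-by-the-wrong-role scenario: nothing errors, the grant is just silently absent.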
Role inheritance — groups of groups
Roles can be granted to other roles. The resulting graph gives you an RBAC tree:
CREATE ROLE reader;
CREATE ROLE editor;
CREATE ROLE admin NOINHERIT;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO reader;
GRANT INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO editor;
GRANT TRUNCATE, REFERENCES, TRIGGER ON ALL TABLES IN SCHEMA public TO admin;
GRANT reader TO editor; -- editor gets everything reader has
GRANT editor TO admin; -- admin gets everything editor and reader have
CREATE ROLE alice LOGIN IN ROLE editor; -- alice is an editor
The INHERIT attribute controls whether alice automatically gets the privileges of editor the moment she logs in, or whether she must opt in with SET ROLE editor first. Most production setups use INHERIT for the day-to-day roles and NOINHERIT for privileged ones — you want a human DBA to have to consciously say SET ROLE admin before they can truncate a table.
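Resolving which privileges a login holds without SET ROLE is a graph walk that stops at NOINHERIT roles. A sketch, assuming reader and editor carry the default INHERIT and admin is NOINHERIT:

```python
# role -> roles it is directly a member of (via GRANT role TO role)
member_of = {"alice": {"editor"}, "editor": {"reader"}, "admin": {"editor"}}
# role -> whether it has the INHERIT attribute
inherits = {"alice": True, "editor": True, "reader": True, "admin": False}

def effective_roles(role):
    """Roles whose privileges `role` enjoys implicitly, without SET ROLE."""
    out, frontier = {role}, [role]
    while frontier:
        r = frontier.pop()
        if not inherits.get(r, True):
            continue  # NOINHERIT: membership only permits SET ROLE, grants nothing implicitly
        for parent in member_of.get(r, ()):
            if parent not in out:
                out.add(parent)
                frontier.append(parent)
    return out

print(sorted(effective_roles("alice")))  # ['alice', 'editor', 'reader']
print(sorted(effective_roles("admin")))  # ['admin'] — must SET ROLE editor explicitly
```

The admin case is the deliberate friction: the DBA holds the membership but gets none of its power until the conscious SET ROLE.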
SET ROLE and SECURITY DEFINER
SET ROLE switches the active role for the rest of the session (or until RESET ROLE). It is how the same physical login acts as different logical roles:
SET ROLE app_rw; -- now queries run as app_rw
SELECT count(*) FROM tickets;
RESET ROLE; -- back to the original login
SECURITY DEFINER functions are the other mechanism. By default a function runs with the caller's privileges (SECURITY INVOKER); marking it SECURITY DEFINER makes it run with the function owner's privileges instead:
CREATE FUNCTION issue_refund(ticket_id bigint) RETURNS void
SECURITY DEFINER
LANGUAGE sql AS $$
UPDATE tickets SET refunded_at = now() WHERE id = ticket_id;
$$;
REVOKE ALL ON FUNCTION issue_refund(bigint) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION issue_refund(bigint) TO support_agent;
Now support_agent — who has no UPDATE on tickets — can nonetheless issue refunds through this specific function, which runs as the function's owner. The refund is fully audited (one call per refund, logged with caller) and the caller can never write UPDATE tickets SET refunded_at = now() on a ticket they shouldn't refund. This is how you grant scoped power without handing over the full privilege.
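The invoker/definer distinction is simply a choice of which role's privileges the function body is checked against — a toy Python model, with hypothetical names:

```python
# Toy privilege store: only the owner may UPDATE tickets.
table_privs = {("app_owner", "UPDATE", "tickets")}

def run_update(as_role):
    """Stand-in for the UPDATE inside the function body, checked against as_role."""
    if (as_role, "UPDATE", "tickets") not in table_privs:
        raise PermissionError(f"permission denied for table tickets (as {as_role})")
    return "refunded"

def issue_refund(caller, security="INVOKER", owner="app_owner"):
    # DEFINER runs the body with the owner's privileges, not the caller's.
    return run_update(owner if security == "DEFINER" else caller)

print(issue_refund("support_agent", security="DEFINER"))  # refunded
try:
    issue_refund("support_agent")  # INVOKER: caller lacks UPDATE
except PermissionError as e:
    print(e)
```

Note what the caller can and cannot do: they can invoke the fixed body (and only the fixed body), which is why EXECUTE on the function is a much narrower grant than UPDATE on the table.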
Layer 3 — Row-Level Security (RLS)
Roles answer which tables. RLS answers which rows of a table. It is the most interesting layer because it is fully declarative — you describe the invariant, and Postgres rewrites every query to preserve it.
Enabling RLS and creating a policy
ALTER TABLE tickets ENABLE ROW LEVEL SECURITY;
CREATE POLICY self_tickets ON tickets
FOR SELECT
TO app_user
USING (user_id = current_setting('app.user_id')::int);
After ENABLE ROW LEVEL SECURITY, the default posture for the table is deny all: every role except the table owner sees zero rows until a policy explicitly lets them see some. The CREATE POLICY above says when role app_user issues a SELECT on tickets, AND the predicate user_id = current_setting('app.user_id')::int into its WHERE clause.
Why AND, not OR: multiple policies on the same table are combined with OR among policies for the same command but AND against the user's own WHERE clause. If two policies each grant access to a slice, the union is visible; but the user's query restrictions still apply on top. This mirrors the intuition of "a policy grants visibility; the query filters further".
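The combination rule can be sketched directly: permissive policies OR-combine among themselves, and the result ANDs with the query's own filter. Toy rows and hypothetical policies:

```python
rows = [{"id": 1, "team": "a"}, {"id": 2, "team": "b"}, {"id": 3, "team": "c"}]

# Two permissive SELECT policies: each grants visibility to one slice.
policies = [lambda r: r["team"] == "a", lambda r: r["team"] == "b"]

def select(user_where):
    # Step 1: union of all applicable policies (OR) decides visibility.
    visible = [r for r in rows if any(p(r) for p in policies)]
    # Step 2: the user's own WHERE clause filters further (AND).
    return [r for r in visible if user_where(r)]

print(select(lambda r: r["id"] > 1))
# [{'id': 2, 'team': 'b'}] — team c is never visible, regardless of the WHERE
```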
USING versus WITH CHECK
A policy can carry two predicates:
- USING — applied to rows the query reads (for SELECT, UPDATE, DELETE). A row not satisfying USING is simply not visible.
- WITH CHECK — applied to rows the query writes (for INSERT, UPDATE). A row not satisfying WITH CHECK causes the statement to fail with a constraint-like error.
Example splitting them:
CREATE POLICY tenant_iso ON tickets
FOR ALL TO app_user
USING (tenant_id = current_setting('app.tenant_id')::int)
WITH CHECK (tenant_id = current_setting('app.tenant_id')::int);
Under this policy, tenant 42 can SELECT only tenant-42 rows (USING), and any INSERT that tries to set tenant_id = 99 fails WITH CHECK. The combination turns "a single table of tickets" into a safe multi-tenant store where even a buggy application cannot cross tenant boundaries.
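A toy row store makes the read/write split concrete — USING gates reads, WITH CHECK gates writes. This is an illustration of the semantics, not Postgres's implementation:

```python
class RlsTable:
    """Toy table where every read passes USING and every write passes WITH CHECK."""

    def __init__(self, using, with_check):
        self.rows, self.using, self.with_check = [], using, with_check

    def select(self):
        return [r for r in self.rows if self.using(r)]  # invisible, not an error

    def insert(self, row):
        if not self.with_check(row):
            raise PermissionError("new row violates row-level security policy")
        self.rows.append(row)

ctx = {"tenant_id": 1}  # stands in for current_setting('app.tenant_id')
t = RlsTable(lambda r: r["tenant_id"] == ctx["tenant_id"],
             lambda r: r["tenant_id"] == ctx["tenant_id"])

t.insert({"tenant_id": 1, "subject": "ok"})           # WITH CHECK passes
t.rows.append({"tenant_id": 2, "subject": "other"})   # smuggled in behind the API
print(len(t.select()))  # 1 — the tenant-2 row exists but is invisible (USING)
```

Trying `t.insert({"tenant_id": 2, ...})` raises instead — the write-side failure is loud, while the read-side failure is silent filtering.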
Who bypasses RLS
By default, the table owner and superusers bypass all policies — the theory being they should be able to inspect data for administration. This is almost never what you want in production:
ALTER TABLE tickets FORCE ROW LEVEL SECURITY;
FORCE makes RLS apply even to the table owner. Combined with the BYPASSRLS attribute on a designated break-glass role, you get a clean model: normal traffic is policy-constrained; emergency queries require an explicit role switch that is audited.
RLS — a worked multi-tenant example
Put it all together. A fresh database, three tenants, one policy, one application role.
CREATE ROLE app_user LOGIN PASSWORD 'x';
CREATE TABLE tickets (
id bigserial PRIMARY KEY,
tenant_id int NOT NULL,
subject text NOT NULL,
body text NOT NULL,
created_at timestamptz DEFAULT now()
);
CREATE INDEX tickets_tenant_idx ON tickets (tenant_id);
GRANT SELECT, INSERT, UPDATE, DELETE ON tickets TO app_user;
GRANT USAGE, SELECT ON SEQUENCE tickets_id_seq TO app_user;
ALTER TABLE tickets ENABLE ROW LEVEL SECURITY;
ALTER TABLE tickets FORCE ROW LEVEL SECURITY;
CREATE POLICY tenant_iso ON tickets FOR ALL TO app_user
USING (tenant_id = current_setting('app.tenant_id')::int)
WITH CHECK (tenant_id = current_setting('app.tenant_id')::int);
INSERT INTO tickets (tenant_id, subject, body) VALUES
(1, 'Acme cannot login', 'Password reset loop'),
(2, 'Contoso billing query', 'Invoice #42 wrong'),
(3, 'Globex account locked', 'Too many failed attempts'),
(1, 'Acme feature request', 'Dark mode please'),
(3, 'Globex SSO not working', 'SAML assertion fails');
Now simulate a request from tenant 1:
BEGIN;
SET LOCAL app.tenant_id = '1';
SELECT id, subject FROM tickets;
-- returns only the 2 rows with tenant_id = 1
COMMIT;
And tenant 3:
BEGIN;
SET LOCAL app.tenant_id = '3';
SELECT count(*) FROM tickets;
-- returns 2, not 5
COMMIT;
An INSERT that tries to cross tenants:
BEGIN;
SET LOCAL app.tenant_id = '1';
INSERT INTO tickets (tenant_id, subject, body) VALUES (2, 'x', 'y');
-- ERROR: new row violates row-level security policy "tenant_iso"
ROLLBACK;
The policy doubles as a guardrail against the application writing the wrong tenant_id. Even if the HTTP handler has a bug, Postgres rejects the write.
The key idiom is SET LOCAL — local to the current transaction, automatically reset at COMMIT or ROLLBACK. Outside a transaction block, SET LOCAL is a no-op with a warning, so always issue it after BEGIN. This is the one-and-only correct way to propagate per-request tenant context through a connection pool. Plain SET would persist across transactions and leak tenant state into the next request that reuses the same pooled connection — see connection-lifecycle-pooling-pgbouncer for the pooling background.
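The reset-at-transaction-end behaviour that makes pooling safe can be sketched as a tiny state machine (a simplification — real GUCs also carry session-level values):

```python
class PooledConn:
    """Toy connection with session-level and transaction-local settings."""

    def __init__(self):
        self.session, self.txn = {}, None

    def begin(self):
        self.txn = {}  # fresh transaction-local scope

    def set_local(self, key, value):
        self.txn[key] = value  # SET LOCAL: only the txn scope is touched

    def current_setting(self, key):
        merged = {**self.session, **(self.txn or {})}
        return merged.get(key)  # None -> policy predicate sees NULL -> zero rows

    def commit(self):
        self.txn = None  # transaction-local settings vanish at COMMIT/ROLLBACK

conn = PooledConn()
conn.begin()
conn.set_local("app.tenant_id", "1")
print(conn.current_setting("app.tenant_id"))  # 1 — visible inside the transaction
conn.commit()
print(conn.current_setting("app.tenant_id"))  # None — fail closed for the next request
```

A plain SET would write to `self.session` instead, and the second print would still show "1" — tenant state leaking into whichever request grabs this connection next.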
The RLS performance gotcha
RLS is not free. The policy predicate is AND-ed into every query that touches the table, which means it runs on every row the query examines. If the policy is simple and indexable, the optimiser pushes it through joins and index scans and you pay nothing measurable. If it is not, every query slows down.
Concretely, run EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM tickets WHERE created_at > now() - interval '1 day'; on the RLS-enabled table and look at the filter line:
Index Scan using tickets_tenant_idx on tickets
Index Cond: (tenant_id = (current_setting('app.tenant_id'))::integer)
Filter: (created_at > (now() - '1 day'::interval))
The tenant predicate was pushed into the index condition — the query uses tickets_tenant_idx and pays nothing extra. If instead you had written the policy as USING (tenant_id = (SELECT id FROM tenants WHERE name = current_setting('app.tenant_name'))), the optimiser cannot inline the subquery into the index scan, and every row read re-evaluates the subquery. Rule: the policy predicate must be a sargable expression on an indexed column, or it costs you.
Three mitigations:
- Index the policy predicate. CREATE INDEX ON tickets (tenant_id) — almost always worth the disk cost.
- Mark helper functions STABLE or IMMUTABLE. A STABLE function is evaluated once per query rather than once per row. current_setting is already STABLE; your custom helpers should be too.
- BYPASSRLS on the analyst role. A batch query that scans the whole table for analytics is slow under RLS (one policy eval per row). Run it as a BYPASSRLS role instead, ideally from a replica.
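The once-per-query versus once-per-row difference is easy to see by counting calls — a sketch in which a hypothetical lookup_tenant stands in for a volatile helper or subquery in the policy predicate:

```python
calls = {"n": 0}

def lookup_tenant():
    """Stand-in for a subquery or VOLATILE helper inside the policy."""
    calls["n"] += 1
    return 1

rows = [{"tenant_id": i % 3} for i in range(9)]

# Per-row evaluation: the helper runs once for every row examined.
calls["n"] = 0
naive = [r for r in rows if r["tenant_id"] == lookup_tenant()]
per_row = calls["n"]  # 9 evaluations for 9 rows

# Hoisted evaluation: what the planner can do when the helper is STABLE.
calls["n"] = 0
t = lookup_tenant()  # evaluated once per query
cached = [r for r in rows if r["tenant_id"] == t]

print(per_row, calls["n"])  # 9 1 — same result, 9x fewer evaluations
```

On a nine-row toy table the difference is noise; on a nine-million-row scan it is the whole query plan.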
Why you cannot always push the predicate through a join: if the RLS-protected table is the inner side of a hash join, the optimiser must evaluate the policy before it can decide which rows are eligible for the hash table. This is usually fine — it filters early. But if the policy predicate is a function call, the optimiser may decide to apply it post-join, multiplying the evaluation count by the join fanout. Always EXPLAIN ANALYZE the queries your application actually issues, not synthetic ones.
Putting it together — the request lifecycle
Every database request that touches an RLS table goes through the full stack. The pipeline from TCP open to result return: TCP connect → TLS handshake → pg_hba.conf rule match → authentication → role privilege check → RLS policy filter → rows returned.
Each stage has a distinct error: SSL error, no pg_hba.conf entry, password authentication failed, permission denied for table tickets, new row violates row-level security policy. The error taxonomy is how you tell at which layer your access was denied.
Python — demonstrating the layers
A short script exercising all three layers via psycopg:
import psycopg
from psycopg.errors import InsufficientPrivilege

ADMIN_DSN = "postgres://admin:s@db:5432/app?sslmode=require"
APP_DSN = "postgres://app_user:x@db:5432/app?sslmode=require"

# --- Layer 1: SCRAM auth happens during connect().
with psycopg.connect(APP_DSN) as conn:
    with conn.cursor() as cur:
        # --- Layer 2: role has SELECT but not TRUNCATE.
        cur.execute("SELECT current_user")
        print("connected as:", cur.fetchone()[0])  # app_user
        try:
            cur.execute("TRUNCATE tickets")
        except InsufficientPrivilege as e:
            print("denied at layer 2:", e.diag.message_primary)
        conn.rollback()  # clear the aborted transaction before continuing

        # --- Layer 3: set tenant context and observe RLS filtering.
        for tenant in (1, 2, 3):
            # SET LOCAL cannot take bind parameters; set_config(..., true)
            # is its transaction-local equivalent.
            cur.execute(
                "SELECT set_config('app.tenant_id', %s, true)", (str(tenant),)
            )
            cur.execute("SELECT count(*) FROM tickets")
            print(f"tenant {tenant} sees", cur.fetchone()[0], "rows")
            conn.rollback()  # reset the transaction-local setting

        # --- Layer 3: a cross-tenant INSERT is blocked by WITH CHECK.
        try:
            cur.execute("SET LOCAL app.tenant_id = '1'")
            cur.execute(
                "INSERT INTO tickets(tenant_id, subject, body) "
                "VALUES (2, 'leak', 'attempt')"
            )
        except InsufficientPrivilege as e:  # RLS violations report SQLSTATE 42501
            print("denied at layer 3:", e.diag.message_primary)
Running this script against the database from the worked example prints:
connected as: app_user
denied at layer 2: permission denied for table tickets
tenant 1 sees 2 rows
tenant 2 sees 1 rows
tenant 3 sees 2 rows
denied at layer 3: new row violates row-level security policy "tenant_iso" for table "tickets"
Each denial happens at a distinct layer, but note the SQLSTATEs: Postgres reports both the layer-2 privilege denial and the layer-3 WITH CHECK violation as 42501 (InsufficientPrivilege) — the RLS violation is not a CheckViolation. Application code that wants to tell the layers apart must inspect the error message, not just the exception class; conflating them hides configuration bugs.
Three tenants, one table, zero leaks.
Database contains the rows from the worked example: 2 tickets for tenant 1, 1 for tenant 2, 2 for tenant 3.
A web server runs three concurrent requests: tenant 1 lists tickets, tenant 2 files a new one, tenant 3 exports all their tickets. Each request lands on a pooled connection, but each handler begins its transaction with SET LOCAL app.tenant_id = $1.
Tenant 1 runs SELECT * FROM tickets ORDER BY created_at DESC — gets exactly the two tenant-1 rows. The query plan shows Index Cond: (tenant_id = 1) — the policy pushed into the index scan, no table sequential scan.
Tenant 2 runs INSERT INTO tickets (tenant_id, subject, body) VALUES (2, 'new', 'body') — the WITH CHECK passes (tenant_id matches app.tenant_id), the row is inserted. Had tenant 2 been compromised and tried INSERT ... VALUES (1, 'evil', ...), Postgres would have rejected with new row violates row-level security policy.
Tenant 3 runs SELECT count(*) FROM tickets — gets 2, not 6. Even though tenant 3 wrote an aggregating query with no WHERE clause, RLS turned it into SELECT count(*) FROM tickets WHERE tenant_id = 3 behind the scenes.
The connection pool — shared across all three requests — never carried tenant context between them. SET LOCAL auto-resets at transaction end, so the next request on the same pooled connection starts with no app.tenant_id set. At that point current_setting('app.tenant_id') either raises an error (unrecognized or empty parameter) or — with the missing_ok form current_setting('app.tenant_id', true) — returns NULL, and a NULL comparison is never true, so zero rows are visible until the application explicitly sets the tenant. This is the fail-closed default that makes the design safe.
Common confusions
"Application-level filtering is enough." Fine until the first missing WHERE clause. A single endpoint written at 2am, a single reporting job, a single admin SQL console — one mistake and a tenant's data leaks. RLS is defence in depth; the application still filters, but if it forgets, the database filters anyway.
"RLS is a Postgres-only thing." No. Oracle has Virtual Private Database (VPD) since version 8i, which is essentially the same mechanism under a different name. SQL Server 2016+ ships Row-Level Security as a first-class feature with a nearly identical API (CREATE SECURITY POLICY, predicate functions). MySQL has no native RLS — the standard workaround is per-user views (CREATE VIEW my_tickets AS SELECT * FROM tickets WHERE tenant_id = @tenant_id), which works but scales poorly to many tenants.
"Superusers and table owners respect RLS." They do not, by default — they bypass all policies silently. This surprises people who assume RLS is a hard wall. It is not; it is a wall with doors. Add FORCE ROW LEVEL SECURITY to the table and grant BYPASSRLS only to an audited break-glass role. Run your CI and analyst jobs as the break-glass role explicitly, not as the table owner.
"RLS prevents SQL injection." RLS is an access-control layer, not an input-validation layer. If the application concatenates user input into SQL, the attacker can still drop tables, dump pg_authid, or run arbitrary queries — RLS just limits which rows they see in the tables they do touch. Parameterised queries prevent injection; RLS is orthogonal.
"RLS policies are free." The predicate is AND-ed into every query on the table. If the predicate is a simple column comparison on an indexed column, the optimiser pushes it through and you pay nothing. If it is a subquery, a function call, or a LIKE on a non-indexed column, every row read evaluates it. Always EXPLAIN ANALYZE your hot queries under RLS — the filter line tells you whether the predicate became an index condition or a per-row check.
"One policy is enough." In practice you write several — one per command, one per access pattern. Postgres OR-combines all applicable policies for the same command, so FOR SELECT policies add up: a user sees rows matching any applicable SELECT policy. Design the set as a union of access rules, not a single omnibus predicate.
Going deeper
Hardening the public schema
Postgres 14 and earlier granted every role CREATE on the public schema by default, and PUBLIC still holds CONNECT on every database. Postgres 15 revoked the schema CREATE; for older versions run REVOKE ALL ON SCHEMA public FROM PUBLIC; REVOKE ALL ON DATABASE app FROM PUBLIC; and then grant explicitly. This replaces the catch-all default with an explicit-grant-only posture.
Container-per-tenant vs RLS — the architectural tradeoff
A deliberate alternative to RLS is physical tenant isolation: one database per tenant, or one schema per tenant. The tradeoffs:
| Dimension | RLS (one DB, one schema, one table) | Schema-per-tenant | Database-per-tenant |
|---|---|---|---|
| Tenant count ceiling | millions | ~10 000 (pg_class bloat) | ~100 (connection overhead) |
| Migration cost | one ALTER | N ALTERs | N ALTERs per host |
| Noisy-neighbour risk | high (one slow tenant affects all) | medium | low |
| Per-tenant backup/restore | complex (row-level filter) | easy | trivial |
| Cross-tenant analytics | trivial (SELECT ... GROUP BY tenant_id) | painful (UNION ALL) | very painful |
| Regulatory data residency | hard (one storage footprint) | hard | easy (one host per region) |
| Typical SaaS fit | B2B freemium, long tail of small tenants | mid-market | a few large enterprise customers |
Most SaaS converges on RLS for 99% of tenants, with database-per-tenant for the handful of enterprise customers who contractually require it — a shared database for free and mid-tier under RLS, plus separate databases for named enterprise accounts with their own credentials and compliance footprint. Sharding-aware poolers (pgcat) route traffic based on the authenticated role.
Application-layer vs database-layer access control
A parallel ecosystem exists for application-layer access control: libraries like Casbin and services like OpenFGA / AuthZed (SpiceDB) (Google Zanzibar-style) express policies as graphs of relationships — "Alice can view document D because Alice is a member of team T which is an editor of folder F which contains D".
Tradeoffs against DB-layer RLS: graph systems are more expressive for friend-of-friend and ownership-chain rules, but they make out-of-process calls per check (painful when listing 10 000 rows), they separate rule from data (drift risk), and they protect only code paths that go through the library — an ad-hoc SQL console bypasses them. Production systems often use both: RLS as the hard floor that catches every query, plus an application-layer engine for expressive decisions that exceed SQL.
Audit logging
None of the three layers tells you who did what. For that, Postgres needs audit logging — either log_statement = 'mod' / 'all' (blunt, built in) or the pgaudit extension (per-object, structured). Both log into the Postgres server log, which a compliant deployment ships to a separate immutable store so an attacker who compromises the database cannot also edit the trail.
Where this leads next
Chapter 66 walks client-scale concerns — how the application and the pooler behave under many tenants, per-tenant rate limiting, and the propagation of request context through async code so SET LOCAL app.tenant_id arrives at the right connection even under heavy concurrency. After that, replication, backup, and point-in-time restore inherit the access-control story: the WAL carries the rows but not the session-level current_setting, so RLS decisions are re-made on the replica using its own current_setting — which is how read replicas stay safely multi-tenant even when the primary connection pool does not extend to them.