Wire Protocols — Text (Postgres) vs Binary (MySQL)

Note: Company names, engineers, incidents, numbers, and scaling scenarios in this article are hypothetical — even when they resemble real ones. See the full disclaimer.

In short

The database wire protocol is the language a client driver and a database server speak over a TCP socket. Everything you type into psql or pass through libpq, JDBC, or mysql-connector is eventually serialised into bytes that travel this protocol. Postgres published its current protocol (version 3.0) in 2003; MySQL's predates it and has been extended through several "capability flag" generations.

Both protocols share a surprising amount of architecture. Both frame every conversation as a sequence of length-prefixed messages. Both run a strict state machine — connect, authenticate, ready-for-query, query, resultset, ready-for-query. Both ultimately carry SQL text one way and row bytes the other. The interesting differences are in payload encoding.

Postgres encodes every result column as TEXT by default: the server converts its internal binary representation (a 4-byte integer, an 8-byte timestamp, a numeric) to its ASCII decimal form, sends it as bytes, and the client parses it back into a native type. The client can request binary format per column on a per-query basis via the extended-query protocol. MySQL encodes result rows for prepared statements in a binary form by default, with type-aware per-field encodings: varints for integers, length-prefixed strings, fixed-width temporal types. For plain text queries it returns a text resultset.

Text is simpler to debug (you can tcpdump and read the conversation), robust to schema changes (a new type on the server is still a string the client can handle), and slower for numeric-heavy workloads because of the decimal round-trip. Binary is faster, more compact on the wire, and precise for floating point, but it requires the client driver to know every type's on-the-wire representation. This chapter builds a minimal Postgres-wire parser and a minimal MySQL-wire parser in Python, shows the bytes flowing in both directions, and explains why wire protocol is not invisible — it affects parsing CPU, driver portability, and what a pooler like PgBouncer can or cannot do transparently.

Open a terminal and run psql against a Postgres server. Type SELECT 1;. You see 1 come back. Trivial. But the bytes travelling over the TCP socket are not what a naive reader expects: Postgres sent the ASCII byte 0x31 (the character '1'), not the four-byte integer 0x00000001. The client parsed the string back into an integer. A follow-up SELECT '€1' AS price; ships three UTF-8 bytes for the Euro sign plus framing; the driver knows the column encoding, reads the payload length, and hands you a Python string.

The wire protocol is this layer — below SQL, above TCP — and it is where every production driver spends most of its implementation budget. This chapter opens Build 8 (the physical transport and operational layer of a DBMS) by taking the protocol apart for the two most widely deployed open-source databases.

What a wire protocol is, at the byte level

A wire protocol is a framing-plus-encoding contract between a client and a server exchanging bytes over TCP. It has three jobs.

Framing. TCP delivers a byte stream, not messages. If the server writes "Hello" then "World", the client might recv() them as "Hell", "oWor", "ld", or "HelloWorld" in one go. Every database wire protocol solves this with a length prefix: every message starts with its length, the reader reads exactly that many bytes, and knows it has one message.

Typing. The conversation has many kinds of messages — a query, a row, a parameter bind, an error, an authentication challenge. Postgres uses a one-byte type tag before the length; MySQL determines type from conversational state (the packet sequence number and what's expected next).

Encoding. Inside a payload, how are integers, strings, timestamps, nulls encoded? Text decimal? Little-endian binary? Length-prefixed UTF-8? This is where the two protocols diverge most.

Above those three jobs, both protocols run a state machine that constrains which messages are valid when. You cannot send a query before authenticating. You cannot receive a row description without first asking a query. The server advertises its state via explicit ready-for-query messages.

Both Postgres and MySQL ride the same five-state conversation. After Connect and Authenticate, the server sits in ReadyForQuery waiting for the client. The client sends a Query; the server transitions to Resultset and streams rows back; then it returns to ReadyForQuery. Errors return control to ReadyForQuery without running the query. A terminate message from the client ends the session.

Postgres wire protocol (version 3.0, stable since 2003)

The Postgres protocol is version 3.0 and has been stable since Postgres 7.4. The canonical specification is the Frontend/Backend Protocol chapter of the Postgres manual.

Message framing

Every message — client or server — after the initial handshake has the same layout:

+--------+---------------------+-----------------------+
| 1 byte | 4 bytes big-endian  | length-4 bytes        |
| type   | length (incl self)  | payload               |
+--------+---------------------+-----------------------+

One-byte type tag (an ASCII letter identifying what kind of message this is), four-byte length (big-endian, self-inclusive — so a message with a four-byte empty payload has length 4), and the payload bytes. Reading a message from the socket is three recv calls (or one, if you buffer): read one byte, read four bytes, read length-minus-four bytes.

Why length includes itself: self-inclusive length lets a recipient read the fixed 5-byte prefix (tag + length) and then know exactly how much more to read, without adding 4. The alternative is a common source of off-by-4 bugs. Postgres avoided that footgun at the cost of a minimum 5-byte message even when logically empty.

The one exception is the startup message, sent first after TCP connect. It has no type tag — just length and payload (protocol version + key-value pairs user, database, application_name). The lack of a tag is a historical quirk: the type-tag convention was introduced in protocol v3 and startup had to remain identifiable by earlier v2 servers.

Key message types

A small vocabulary of type tags does almost everything:

Q — Query (client → server). Simple-query-mode SQL as a null-terminated string.
R — Authentication (server → client). Payload carries an integer sub-type (0 = ok, 3 = clear-text password, 5 = MD5, 10 = SASL/SCRAM).
T — RowDescription (server → client). Describes the columns of a resultset: name, type OID, type modifier, format code (0 = text, 1 = binary).
D — DataRow (server → client). One row of a resultset. Payload: 2-byte column count, then per column: 4-byte length (or -1 for NULL) + that many payload bytes.
C — CommandComplete (server → client). End of this statement's output. Carries a tag string like SELECT 1 or INSERT 0 3.
Z — ReadyForQuery (server → client). Server is idle and ready for the next command. Payload: 1-byte transaction status (I = idle, T = in transaction, E = in aborted transaction).
E — ErrorResponse (server → client). A list of code:text pairs describing an error.
X — Terminate (client → server). Graceful disconnect.

For prepared statements, an extended-query protocol adds P (Parse), B (Bind), E (Execute), S (Sync), D (Describe), and C (Close). Those are covered in a later chapter on prepared-statement caching.

Text-first result encoding

The defining characteristic of the Postgres protocol is that DataRow fields are TEXT by default. The server converts the internal binary representation (packed 32-bit integer, 8-byte timestamp, numeric) into the column's text output format, then sends those ASCII bytes.

A SELECT 1; returns a DataRow containing the single byte 0x31 — not the integer 0x00000001. A SELECT 3.14::float8; returns the six ASCII bytes 3.14 — not the IEEE-754 double. The client receives text and parses it using the type OID from the preceding RowDescription.

Why text by default: when the Postgres protocol was designed, the type set was already large (geometric types, ranges, arrays, enums, user-defined types via CREATE TYPE) and growing. A binary-by-default protocol would force every driver to hard-code a codec for every type, breaking on every new server-side type. Text-by-default lets the server emit its canonical text representation and lets the driver hand anything it does not recognise back to the application as a string. This is why psycopg2 can connect to a Postgres server using a custom type it has never heard of and still read the values.

Clients opt in to binary format per-column. The Bind message carries an array of format codes (0 text, 1 binary), one per result column. Drivers that know the binary representation of common types (integers, floats, timestamps, bytea) request binary and skip the text round-trip. psycopg3's binary mode and asyncpg both do this for built-in numerics; the win on numeric-heavy workloads is 2-3×.

A minimal Postgres protocol parser in Python

Sixty lines get you a functional reader and writer for the core protocol. Ignore TLS, auth beyond trust, extended-query — this is a teaching parser that shows the framing.

# wire/pg_parser.py
import socket, struct

def send_startup(sock, user: str, database: str):
    """Startup message — no type tag; length + protocol version + kv pairs."""
    payload = struct.pack('!I', 196608)             # protocol 3.0 = 3<<16|0
    payload += b'user\x00' + user.encode() + b'\x00'
    payload += b'database\x00' + database.encode() + b'\x00'
    payload += b'\x00'                              # params terminator
    length = struct.pack('!I', len(payload) + 4)
    sock.sendall(length + payload)

def send_query(sock, sql: str):
    """Simple-query mode: type 'Q' + length + null-terminated SQL."""
    payload = sql.encode() + b'\x00'
    sock.sendall(b'Q' + struct.pack('!I', len(payload) + 4) + payload)

def read_message(sock) -> tuple[bytes, bytes]:
    """Return (type_tag, payload). Returns (b'', b'') on clean EOF."""
    tag = sock.recv(1)
    if not tag:
        return b'', b''
    length_bytes = _recv_exact(sock, 4)
    length = struct.unpack('!I', length_bytes)[0] - 4
    payload = _recv_exact(sock, length)
    return tag, payload

def _recv_exact(sock, n: int) -> bytes:
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError('server closed mid-message')
        buf.extend(chunk)
    return bytes(buf)

Now parse a DataRow payload into Python values:

# wire/pg_datarow.py
import struct

def parse_datarow(payload: bytes) -> list[str | None]:
    """DataRow payload: int16 column count, then per column:
       int32 length (-1 = NULL), then length bytes of text payload."""
    col_count = struct.unpack('!H', payload[:2])[0]
    offset, values = 2, []
    for _ in range(col_count):
        length = struct.unpack('!i', payload[offset:offset+4])[0]
        offset += 4
        if length == -1:
            values.append(None)
        else:
            values.append(payload[offset:offset+length].decode('utf-8'))
            offset += length
    return values

Run a round-trip against a real server:

# wire/pg_demo.py
from wire.pg_parser import send_startup, send_query, read_message
from wire.pg_datarow import parse_datarow
import socket

sock = socket.create_connection(('localhost', 5432))
send_startup(sock, user='dipti', database='demo')
while True:
    tag, payload = read_message(sock)
    if tag == b'Z': break              # ReadyForQuery — auth done (trust auth)

send_query(sock, 'SELECT 1')
while True:
    tag, payload = read_message(sock)
    if tag == b'T': pass                # RowDescription — ignore for now
    elif tag == b'D': print(parse_datarow(payload))   # [Ôÿ1Ôÿ]
    elif tag == b'C': pass              # CommandComplete
    elif tag == b'Z': break             # ReadyForQuery — cycle complete

The output is ['1'] — a Python list of one string, because the protocol returned the ASCII byte '1'. A proper driver like psycopg coerces based on the type OID from the RowDescription.

For SELECT 1, the DataRow bytes on the wire are:

44 00 00 00 0B 00 01 00 00 00 01 31
|  |---------| |---| |---------| |
D  length=11  cols=1 len=1       '1'

Twelve bytes for the value 1. Eleven are framing; one is payload. That is the text-protocol overhead tax — most wire traffic for small numerics is framing. Wide string columns dilute the ratio; numeric-heavy columns make it dominate.

MySQL wire protocol (Client/Server Protocol 41)

MySQL's protocol (sometimes called "Protocol 41" after the capability-flag generation introduced with MySQL 4.1) predates Postgres 3.0 and descends from an even earlier 3.x format. Its canonical spec is the MySQL Client/Server Protocol documentation.

Packet framing

Every packet in the MySQL protocol has this header:

+---------------------+----------+----------------------+
| 3 bytes little-end  | 1 byte   | payload              |
| payload length      | sequence | (up to 16MB)         |
+---------------------+----------+----------------------+

Three-byte little-endian length (payload only, excluding the header), one-byte sequence id that starts at 0 for each client-driven command cycle and increments per packet, then the payload. Notably, there is no type tag in the header — packet type is determined by conversation state. After a query, the next server packet is either OK, error, or a resultset header, distinguished by the first payload byte.

Why a sequence id instead of a type tag: the sequence id is a guard against protocol desync. Each command cycle resets sequence to 0 and increments per reply packet. Out-of-sequence packets signal buffer mishandling or middlebox interference. The absence of a type tag keeps the header tight (4 bytes vs Postgres's 5) at the cost of requiring the client to track state precisely.

The 3-byte length caps packet size at 16 MiB. Larger payloads split across packets with the same sequence id incrementing; a 16MB packet is always followed by at least one more (possibly empty) so the receiver knows the logical message is complete.

Key packet payload types

The first byte of the payload distinguishes types within the current conversational state:

0x00 — OK packet (follows a successful query with no resultset). Carries affected_rows, last_insert_id.
0xFF — ERR packet. Carries error code and message.
0xFE — EOF packet (in pre-5.7 protocol); or in modern versions, also the OK packet marker in some contexts.
Anything else at the start of a resultset — a length-encoded integer giving the number of columns. Subsequent packets are column definitions, then a deprecated EOF, then data rows, then another EOF or OK.

Within a payload, integers are length-encoded: a one-byte prefix encodes 0-250 directly; 0xFC signals a 2-byte length following; 0xFD means 3-byte; 0xFE means 8-byte. Strings are length-prefixed with this varint. Compact for small values, extends gracefully.

Text vs binary resultsets

MySQL supports two resultset formats. COM_QUERY (send a SQL string) returns a text resultset with columns as length-prefixed strings, like Postgres. COM_STMT_EXECUTE (prepared-statement execution) returns a binary resultset: integers in native little-endian (1, 2, 4, or 8 bytes by column type), floats as IEEE-754 bytes, timestamps as packed year/month/day/hour/minute/second fields. Nulls live in a compact null bitmap at the start of each row, one bit per column, sparing the length prefix for null fields.

The asymmetry matters: MySQL only speaks binary for prepared statements. Plain SELECT over COM_QUERY always returns text. For binary encoding you must use prepare/execute — which production drivers do anyway, because it also gives parameter binding and plan caching.

A minimal MySQL protocol parser in Python

# wire/mysql_parser.py
import socket, struct

def read_packet(sock) -> tuple[int, bytes]:
    """Return (sequence_id, payload). Returns (-1, b'') on EOF."""
    header = _recv_exact(sock, 4)
    if not header: return -1, b''
    length = header[0] | (header[1] << 8) | (header[2] << 16)
    seq = header[3]
    payload = _recv_exact(sock, length)
    return seq, payload

def write_packet(sock, seq: int, payload: bytes):
    """Write a packet. Caller tracks the sequence id per command cycle."""
    length = len(payload)
    header = bytes([length & 0xFF, (length >> 8) & 0xFF,
                    (length >> 16) & 0xFF, seq & 0xFF])
    sock.sendall(header + payload)

def read_length_encoded_int(buf: bytes, offset: int) -> tuple[int, int]:
    """MySQL's varint: returns (value, new_offset)."""
    first = buf[offset]
    if first < 0xFB:   return first, offset + 1
    if first == 0xFC:  return struct.unpack('<H', buf[offset+1:offset+3])[0], offset + 3
    if first == 0xFD:  v = buf[offset+1] | (buf[offset+2] << 8) | (buf[offset+3] << 16)
                       return v, offset + 4
    if first == 0xFE:  return struct.unpack('<Q', buf[offset+1:offset+9])[0], offset + 9
    raise ValueError(f'reserved length-encoded prefix {first:#x}')

def _recv_exact(sock, n: int) -> bytes:
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk: raise ConnectionError('server closed mid-packet')
        buf.extend(chunk)
    return bytes(buf)

Parse a text-resultset row: each field is a length-encoded string, or the literal byte 0xFB meaning NULL.

# wire/mysql_textrow.py
from wire.mysql_parser import read_length_encoded_int

def parse_text_row(payload: bytes, col_count: int) -> list[str | None]:
    offset, values = 0, []
    for _ in range(col_count):
        if payload[offset] == 0xFB:
            values.append(None); offset += 1
        else:
            length, offset = read_length_encoded_int(payload, offset)
            values.append(payload[offset:offset+length].decode('utf-8'))
            offset += length
    return values

The binary-resultset parser is longer — it needs a type code per column and a per-type decoder (INT reads 4 little-endian bytes, BIGINT 8, VARCHAR a length-encoded string, DATETIME a packed field). The null bitmap is read first; any bit set skips the length prefix entirely. That density is where binary resultsets win.

Text vs binary — the pragmatic differences

Five dimensions, the same trade-off shape.

Debuggability. Text wins. tcpdump -i lo -X port 5432 shows a readable Postgres conversation; Wireshark's built-in dissector decodes frames with ASCII DataRow payloads. Against MySQL binary resultsets you need the type-aware MySQL dissector or you are reading hex. Production debugging: Postgres 10 minutes, MySQL binary an hour.

Throughput. Binary wins. A 1M-row SELECT id, total FROM orders where both columns are 4-byte integers transfers ~20 bytes per row in text (framing + decimal) versus ~12 bytes in binary (framing + null bitmap + two 4-byte values). Server-side sprintf to convert 1234567 to "1234567" is slow compared to memcpy. End-to-end the binary fetch is about 2× faster. For string columns the gap closes to nothing — both ship bytes as-is.

Driver simplicity. Text wins. A text driver parses decimal integers, decimal floats, ISO-8601 timestamps. A binary driver needs per-type codecs for every server type. Postgres's type system is open (user-defined types, extension types like PostGIS), so every binary driver falls back to text for unknown types. This is why psycopg2 defaulted to text for decades and psycopg3's binary mode covers only built-ins.

Precision. Binary wins for floating point. Decimal-text round-trips of double values are lossy below 17 significant digits; Postgres's text output uses 17 since v12, MySQL's has historically been less careful. Binary ships the 8 IEEE-754 bytes directly.

Schema evolution. Text wins. A new server type (jsonb in Postgres 9.4, VECTOR in MySQL 9) does not break old text clients — they read the type's text representation as a string. Binary-only drivers fail at decode. This is why Postgres's protocol has survived intact since 2003: text-default is an evolution firewall.

The pattern: text is operational, binary is performance-critical. Most production drivers default to text and opt in to binary on the hot path. Postgres's opt-in is per-column-per-query; MySQL's is all-or-nothing per prepared statement.

Prepared statements in the wire protocol

Prepared statements are the other reason binary format matters. A prepared statement is parsed once and executed many times with different parameters — a hot loop in any OLTP workload.

Postgres extended query splits prepare/execute into four messages: P (Parse) sends the SQL and assigns a statement name; B (Bind) sends parameter values with per-parameter format codes (0 text, 1 binary); E (Execute) runs it; S (Sync) ends the transaction boundary. Binding as binary skips text-to-binary conversion server-side — a BIGINT parameter is 8 bytes, not up to 19 ASCII bytes plus parsing.

MySQL COM_STMT_EXECUTE sends parameters in a dense binary layout: a null bitmap, then per non-null parameter a type code (1 byte) and the value. The server decodes directly into its execution path.

For high-frequency point queries, parameter-binding is where most of the wire-protocol gap shows up. Benchmark: 100k SELECT * FROM users WHERE id = ? queries with integer parameter. Text-bound ~12 ms/1k queries; binary-bound ~7 ms/1k queries. The 40% reduction comes entirely from skipping decimal parse server-side.

Authentication and TLS, briefly

Both protocols layer authentication above framing. Postgres Authentication messages carry a sub-type: 0 ok, 3 clear-text (discouraged), 5 MD5 (deprecated), 10 SASL/SCRAM-SHA-256 (modern default since v10). MySQL has mysql_native_password (SHA1, deprecated), caching_sha2_password (SHA256, default since 8.0), and pluggable auth.

TLS upgrade is handled identically: the client sends SSLRequest (Postgres) or sets SSL_FLAG in capability bits (MySQL); the server replies S or N; on S the socket is upgraded via the OS TLS stack before any further protocol bytes flow. Everything above is unchanged.

A real `SELECT name FROM users WHERE id = 42` conversation

Tcpdump trace, Postgres side, 8 messages:

CLIENT → SERVER   Startup      len=27 proto=3.0 user=dipti database=demo
SERVER → CLIENT   R (Auth)     len=8  subtype=0 (ok)
SERVER → CLIENT   Z (Ready)    len=5  status='I' (idle)
CLIENT → SERVER   Q (Query)    len=49 sql="SELECT name FROM users WHERE id=42"
SERVER → CLIENT   T (RowDesc)  len=33 col="name" type=varchar format=text
SERVER → CLIENT   D (DataRow)  len=14 cols=1 len=5 payload="Dipti"
SERVER → CLIENT   C (CmdDone)  len=13 tag="SELECT 1"
SERVER → CLIENT   Z (Ready)    len=5  status='I'

Ninety-four bytes of payload, nine messages, one value returned. Eight of those nine messages are framing and protocol control; one carries the answer.

MySQL side of the same query, 7 packets:

CLIENT → SERVER  seq=0 COM_QUERY: "SELECT name FROM users WHERE id=42"
SERVER → CLIENT  seq=1 col_count=1 (varint 0x01)
SERVER → CLIENT  seq=2 column_def: name=name, type=VAR_STRING
SERVER → CLIENT  seq=3 EOF (legacy) — columns done
SERVER → CLIENT  seq=4 text_row: lenenc(5) "Dipti"
SERVER → CLIENT  seq=5 EOF (legacy) — rows done
                                 (or OK packet in modern protocol)

Sequence ids climb 0, 1, 2, 3, 4, 5 across the single command cycle. No explicit type tags — the client's state machine knows that after COM_QUERY the first server packet is column count, then N column definitions, then rows, then EOF/OK. A desync at any point is caught by sequence id mismatch.

Postgres on the left: every message carries a type tag (Q, T, D, C, Z) and is individually self-describing. MySQL on the right: no type tags, just sequence ids climbing 0..5 within a command cycle, with payload type determined by state. Both end in the same place — one row of data returned, with the server idle and ready for the next query.

Wire protocol impact on pooling

The wire protocol is what a pooler must understand and, in part, reimplement. That is the topic of the next chapter, but the shape is already visible.

A connection pooler sits between clients and the database, multiplexing client sessions across a pool of server backends. PgBouncer, the common Postgres pooler, operates in three modes:

Session pooling. Backend assigned for the full session. No multiplexing; any feature works.
Transaction pooling. Backend assigned only for one transaction, then returned. Common production setting.
Statement pooling. Backend returned between every statement. Max density, extreme feature restrictions.

The feature restrictions come from wire-protocol state. Server-side state that spans transactions — named prepared statements, session variables set with SET, LISTEN/NOTIFY subscriptions, temporary tables, advisory locks — lives on one specific backend. A pooler that switches backends cannot preserve this; the client's next EXECUTE my_prepared_stmt lands on a backend that has never heard of it and errors.

PgBouncer in transaction mode therefore disables these features by default. Applications either switch to session mode (losing density) or adopt workarounds (unnamed Parse/Bind on every call). The choice is directly driven by the wire protocol's stateful design. MySQL's equivalents (ProxySQL, MaxScale) face the same problem. The wire protocol is the contract a pooler inherits.

Common confusions

"Text protocol is always slower." Slower for numbers and timestamps where decimal-binary conversion is expensive. For strings it is free — text and binary ship the same bytes. A SELECT over a VARCHAR shows no meaningful gap between formats.
"The wire protocol is stable across versions." Mostly. Both version-gate changes via the initial handshake. Postgres has been on 3.0 since 2003; MySQL's capability flags evolve continuously, each generation adding flags without breaking previous ones. Drivers target a capability level and work across many server versions.
"Wire protocol is the same as SQL dialect." No. SQL is parsed server-side; the wire protocol carries the raw SQL string or a prepared-statement reference. A Postgres client speaking perfect wire protocol will still get a syntax error if it sends MySQL-dialect SQL. Wire and dialect are orthogonal — which is exactly why Postgres-wire-compatible databases like CockroachDB and YugabyteDB exist.
"Drivers don't need to know the wire protocol." They do, intimately. Every production driver — libpq, pgjdbc, psycopg3, pgx, node-postgres, tokio-postgres — implements the wire protocol from scratch. This is why every language community has its own Postgres driver; the protocol is simple enough that a from-scratch implementation is feasible.
"TLS makes the protocol different." TLS is a transparent wrapper. The protocol bytes are identical with or without TLS; only the socket underneath changes.

Going deeper

Three threads worth pulling on at the protocol layer.

Metrics observability

Both protocols expose server-side metrics not via a scrape endpoint but via in-band SQL. MySQL's SHOW STATUS returns a resultset of counters (Queries, Innodb_buffer_pool_reads, Threads_connected); Postgres's pg_stat_activity is a view queried like any table. Prometheus exporters are thin wire-protocol clients that run these queries and re-emit the values. The wire protocol is the metric bus as well as the query bus.

The compressed protocol

MySQL supports zlib compression over the socket as an opt-in capability flag. Effective over slow networks, wasteful on localhost. Postgres handles compression via TLS (deflate in the cipher suite) rather than at the protocol layer. Modern deployments commonly disable MySQL's wire compression in favour of TLS-level negotiation.

Logical replication protocol

Postgres's logical replication protocol rides the same wire layer to stream row changes from publisher to subscriber. A subscriber issues START_REPLICATION; the server streams CopyData messages containing WAL records decoded into logical insert/update/delete events. Tools like Debezium hook this to feed Kafka. No separate port or auth path — a subscriber is a specialised query client.

Where this leads next

This chapter opened Build 8 on the physical transport and operational layer. You now know what bytes a driver and server exchange, how the two dominant open-source protocols differ, and why protocol design ripples up into operational decisions like pooler mode.

Next in Build 8:

Connection lifecycle and pooling (PgBouncer) — what happens between connect() and close(), why Postgres's process-per-connection model makes pooling mandatory, and which wire-protocol features each pooler mode breaks.
Drivers deep-dive — libpq internals, async drivers (asyncpg, pgx), pipelining via the extended-query protocol.
Prepared-statement caching — how server-side prepare/execute interacts with pooling.
The write-ahead log and durability — how the server protects data once the wire protocol has delivered it.

The wire protocol is where your application ends and the database begins. Treating that boundary as opaque is fine — until a query is slow, a pool stalls, or a driver misbehaves. Then the protocol is the first place to look, and tcpdump plus the reference docs answer the question faster than any higher-level tool.

References

PostgreSQL documentation, Frontend/Backend Protocol — the canonical specification of Postgres wire protocol 3.0. Chapter 55 of the manual. Every message format, every type tag, every state transition.
MySQL documentation, Client/Server Protocol — MySQL's protocol reference, including packet framing, capability flags, authentication methods, text and binary resultset layouts, and prepared-statement commands.
PostgreSQL source tree, libpq-fe.h and fe-protocol3.c — the reference C client library. The PQexec path plus fe-protocol3.c is the most battle-tested Postgres wire implementation in existence; a good companion to the protocol docs when a message format is ambiguous.
Jackc, pgx — Postgres driver for Go — a modern from-scratch Postgres driver in Go, with pipelining and binary-format support. The pgproto3 subpackage is a readable, well-commented wire-protocol implementation — one of the clearest learning resources outside the official docs.
Wireshark documentation, PostgreSQL protocol dissector and MySQL protocol dissector — what the tcpdump-and-read-it workflow looks like in practice. Indispensable when debugging protocol-level issues in production.
PgBouncer documentation, Features and usage — the reference for which wire-protocol features survive which pooler modes. The explicit list of incompatible features (prepared statements with names, session variables, advisory locks, LISTEN/NOTIFY) is the clearest articulation of how the wire protocol's stateful design constrains deployment architecture.