Overview
Every backend engineer eventually hits the same wall. A load balancer routes /checkout to the wrong fleet, or TLS terminates in a place you did not expect, or a health check passes at one level and the app is dead at another. Underneath all of it is a single idea most people learned as a diagram they immediately forgot: the network stack is a pile of layers, and each layer does exactly one job.
This post is the diagram, drawn properly. We walk up the stack from L3 (Network) to L7 (Application), say what each layer does and why, and then trace one real https://example.com/page request all the way down one machine and up another. By the end, the L4-versus-L7 load-balancing question that trips up so many designs answers itself.
What "Layers" Even Are
Forget the acronyms for a second. A network layer is a job with a strict contract: it takes a chunk of data from the layer above, does its one job, wraps that data in its own header, and hands it to the layer below. That is the whole trick.
The header is the key. Each layer writes a small block of its own bookkeeping in front of whatever it received, and it does not look inside what it received. To L3, the entire HTTP request plus its TLS encryption plus its TCP bookkeeping is just an opaque blob of bytes it has to move to an IP address. To L7, the IP routing that got the bytes there is invisible plumbing it never thinks about.
One sentence to hold onto: a message travels DOWN the stack on the sender, getting wrapped in one more header at every layer, then travels UP the stack on the receiver, getting one header stripped at every layer, so that each header is read by the same layer that wrote it, on the other machine. Sender wraps, receiver unwraps, symmetrically.
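The wrap-and-unwrap symmetry can be sketched in a few lines of Python. This is a toy under stated assumptions, not a real protocol: the header format (`tag:length|`) and the function names are invented for illustration, and every layer, including L7, gets the same made-up header.

```python
# Toy encapsulation: each layer prepends its own header and never
# inspects the payload it was handed. The header format is invented.
LAYERS = ["L7", "L6", "L4", "L3"]  # order in which the sender wraps

def wrap(payload: bytes, layer: str) -> bytes:
    """Prepend a made-up header: layer tag, ':', payload length, '|'."""
    header = f"{layer}:{len(payload)}|".encode("ascii")
    return header + payload

def unwrap(data: bytes, layer: str) -> bytes:
    """Strip this layer's header and return the untouched payload."""
    header, _, payload = data.partition(b"|")
    tag, _, length = header.partition(b":")
    if tag != layer.encode("ascii"):
        raise ValueError(f"{layer} was handed a {tag!r} header")
    if int(length) != len(payload):
        raise ValueError("truncated or padded payload")
    return payload

def send(message: bytes) -> bytes:
    data = message
    for layer in LAYERS:            # down the stack: one header per layer
        data = wrap(data, layer)
    return data

def receive(wire: bytes) -> bytes:
    data = wire
    for layer in reversed(LAYERS):  # up the stack: last on, first off
        data = unwrap(data, layer)
    return data
```

Calling `receive(send(b"GET /page"))` returns the original bytes, and handing a layer a header it did not write raises an error, which is the "reads only its own header" rule in miniature.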
A quick honesty note before the tour. The classic OSI model has seven layers; the model the internet actually runs on (TCP/IP) collapses the top three into one. We use the OSI numbering because it is the shared vocabulary, but I will flag where L5, L6, and L7 blur together in practice instead of pretending they are cleanly separate boxes.
The Layers, L3 to L7
Here is the whole tour in one table. Each layer, its one job, the name for its chunk of data (the PDU, Protocol Data Unit), and the protocols you actually meet.
| Layer | Name | One job | PDU | Example protocols |
|---|---|---|---|---|
| L3 | Network | Address + route across networks, hop by hop | Packet | IPv4, IPv6, ICMP |
| L4 | Transport | Deliver end-to-end between processes, via ports | Segment (TCP) / Datagram (UDP) | TCP, UDP, QUIC |
| L5 | Session | Open, maintain, resume, and tear down a conversation | (data) | TLS session/resumption, SOCKS, RPC session |
| L6 | Presentation | Encode, serialize, compress, encrypt the payload | (data) | TLS record encryption, gzip, UTF-8, JSON/protobuf |
| L7 | Application | Speak the protocol the app actually cares about | Message / Data | HTTP/1.1, HTTP/2, HTTP/3, gRPC, DNS, WebSocket, SMTP |
Now each layer in words.
L3 — Network: addressing and routing
L3 answers one question: which machine, and how do I get the bytes there? It gives every host an IP address and moves a packet from source to destination across a chain of routers, one hop at a time. Each router reads only the destination IP, consults its routing table, and forwards to the next router closer to the goal. No single router knows the whole path; each one just knows the next hop.
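As a rough illustration of "consults its routing table," here is a minimal next-hop lookup. It assumes longest-prefix matching, which is how IP forwarding usually picks between overlapping routes but is a detail the text above does not spell out. The table entries and next-hop addresses are hypothetical.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, next hop).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "192.0.2.1"),   # default route
    (ipaddress.ip_network("9.8.0.0/16"), "192.0.2.2"),
    (ipaddress.ip_network("9.8.7.0/24"), "192.0.2.3"),
]

def next_hop(destination: str) -> str:
    """Return the next hop for the most specific matching prefix."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    # Longest prefix wins: /24 beats /16, which beats the /0 default.
    _net, hop = max(matches, key=lambda m: m[0].prefixlen)
    return hop
```

The function only ever answers "where next?", never "what is the whole path?", which mirrors what a single router knows.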
The protocols are IPv4 and IPv6 for addressing, and ICMP for control messages (this is what ping and traceroute ride on). The defining property of L3 is what it does not promise: it is best-effort. A packet can be dropped, duplicated, delayed, or arrive out of order, and L3 shrugs. It never retransmits, never reorders, never confirms delivery. If you want guarantees, that is somebody else's job — specifically L4's.
L4 — Transport: ports and end-to-end delivery
L3 got the bytes to the right machine. L4 gets them to the right program on that machine, and decides how reliable the delivery is. It introduces the port: a 16-bit number so the OS knows that these bytes belong to your web server on :443 and those belong to SSH on :22. An IP address plus a port is a specific conversation endpoint.
Two protocols dominate, and they are opposites.
TCP (its PDU is a segment) is the reliable one. Before any data flows it does a three-way handshake (SYN, SYN-ACK, ACK) to establish a connection. Then it numbers every byte, acknowledges what it receives, retransmits what got lost, reorders what arrived scrambled, and throttles itself under congestion. TCP hands the application an ordered, gap-free, reliable byte stream out of L3's unreliable packet soup. That reliability costs round-trips and state.
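The "numbers every byte ... reorders what arrived scrambled" part can be sketched as a toy reassembly buffer. This is a simplification under loud assumptions: sequence numbers start at 0 and count bytes, segments never partially overlap, and there are no acknowledgements or timers. Real TCP does considerably more.

```python
def reassemble(segments: list[tuple[int, bytes]]) -> bytes:
    """Rebuild a byte stream from (sequence_number, data) pairs.

    Toy model: sequence numbers start at 0 and count bytes. Duplicate
    segments collapse, and a gap stops delivery at the missing byte.
    """
    pending = dict(segments)  # same seq seen twice -> kept once
    stream = bytearray()
    # Deliver the segment that starts exactly where the stream ends.
    while len(stream) in pending:
        stream += pending.pop(len(stream))
    return bytes(stream)
```

Feeding it segments out of order, with one duplicate, still yields the original bytes; feeding it a gap yields only the bytes before the gap, which is the point at which real TCP would wait for a retransmit.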
UDP (its PDU is a datagram) is the fire-and-forget one. Two ports, a length, a checksum, and go. No handshake, no ordering, no retransmit. It is a thin shim over L3 that adds little more than ports. That sounds worse until you want low latency and can tolerate loss (live video, DNS lookups, game state), where waiting for a retransmit is worse than dropping the frame.
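To show how thin that shim is, here is a sketch that builds and parses the 8-byte UDP header with `struct`. The function names are made up for this post, and the checksum is left at zero (which UDP over IPv4 permits as "not computed") rather than calculated.

```python
import struct

# Four 16-bit big-endian fields: src port, dst port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")

def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    length = UDP_HEADER.size + len(payload)  # header plus payload, in bytes
    checksum = 0  # assumption: checksum not computed in this sketch
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload

def parse_udp_datagram(datagram: bytes) -> tuple[int, int, bytes]:
    src_port, dst_port, length, _checksum = UDP_HEADER.unpack_from(datagram)
    return src_port, dst_port, datagram[UDP_HEADER.size:length]
```

That is the entire header. There is no sequence number and no acknowledgement field, so nothing in it could support ordering or retransmission even if the receiver wanted them.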
QUIC is the modern twist: it rebuilds TCP's reliability, ordering, and congestion control on top of UDP, folds TLS in, and fixes head-of-line blocking. It is what HTTP/3 runs on. Structurally it lives at L4, but it deliberately smears into L5/L6 by owning the session and the encryption too — which is exactly the kind of blur the clean OSI diagram hides.
L5 — Session: the conversation's lifecycle
L5 manages the conversation as a thing with a beginning, a middle, and an end: establishing it, keeping it alive, and tearing it down. In pure TCP/IP terms this layer barely exists as its own box — TCP already owns connection setup and teardown — which is why L5 is the fuzziest of the five.
Where it earns its keep conceptually is session state and resumption. When TLS lets a returning client skip the full handshake via a session ticket or resumption, that is session-layer thinking: we have talked before, let us not start from zero. When an RPC framework keeps a logical session across several transport connections, that is L5. Be honest about this one: in the model the internet runs on, L5 is less a physical layer and more a role that TCP, TLS, and the application quietly share.
L6 — Presentation: encoding, serialization, encryption
L6 is about the shape of the bytes, not their delivery. Its job is to turn the application's in-memory objects into a wire format both sides agree on, and back again: character encoding (UTF-8), serialization (JSON, protobuf, MessagePack), compression (gzip, Brotli), and encryption (the TLS record layer that turns plaintext into ciphertext before it ever reaches L4).
This is the layer that means the server can read what the client wrote. If the client marshals a struct to protobuf and the server expects JSON, no amount of perfect L3 routing and L4 delivery saves you — the bytes arrived flawlessly and are still gibberish. TLS record encryption sits here too: the payload is encrypted at L6 on the way out and decrypted at L6 on the way in, which is why a plain L4 load balancer downstream sees only opaque ciphertext.
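A small sketch of the "arrived flawlessly and still gibberish" failure, using only the standard library. It covers encoding, serialization, and compression; encryption is left out to keep it short. The function names are illustrative.

```python
import gzip
import json

def encode(obj: dict) -> bytes:
    """Sender: object -> JSON text -> UTF-8 bytes -> gzip."""
    return gzip.compress(json.dumps(obj).encode("utf-8"))

def decode(wire: bytes) -> dict:
    """Receiver: the same steps, undone in the reverse order."""
    return json.loads(gzip.decompress(wire).decode("utf-8"))

def decode_without_gunzip(wire: bytes) -> dict:
    """A receiver that wrongly assumes plain UTF-8 JSON on the wire."""
    return json.loads(wire.decode("utf-8"))
```

`decode(encode(order))` round-trips cleanly, while `decode_without_gunzip(encode(order))` raises a `ValueError` subclass even though every byte was delivered intact. The transport did its job; the two ends simply disagreed about the shape of the bytes.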
L7 — Application: the protocol the app speaks
L7 is the protocol your application actually talks. It is the layer humans and services think in: HTTP verbs and headers and status codes, gRPC method calls, DNS queries, WebSocket frames, SMTP commands. When you write GET /page HTTP/1.1\r\nHost: example.com, that string is the L7 payload. Everything below exists to carry it.
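For a feel of what "reading the L7 payload" means, here is a deliberately minimal parser for an HTTP/1.1 request head. It is a sketch, not a compliant implementation: it assumes well-formed input and ignores repeated headers, line folding, and the body.

```python
def parse_request(raw: bytes) -> tuple[str, str, str, dict[str, str]]:
    """Split an HTTP/1.1 request head into method, target, version, headers."""
    head, _, _body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    headers: dict[str, str] = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return method, target, version, headers
```

Given `b"GET /page HTTP/1.1\r\nHost: example.com\r\n\r\n"`, it returns the method, the path, the version, and a one-entry header map. Nothing in it mentions ports, addresses, or encryption.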
The evolution matters: HTTP/1.1 is text over one TCP connection with head-of-line blocking; HTTP/2 multiplexes many streams over one TCP connection (but TCP's own head-of-line blocking remains); HTTP/3 moves onto QUIC to kill that last blocking problem. Same L7 semantics — methods, headers, status codes — carried over three different lower stacks. That last sentence is the whole point of layering, and we will come back to it.
Encapsulation: Headers Nested Like Russian Dolls
Here is the picture the table cannot show: what the bytes literally look like as they go down the stack. Each layer wraps the layer above in its own header. By the time an HTTP message hits the wire, it is a payload buried inside header after header.
Read it top to bottom on the sender: the HTTP message gets encrypted, then a TCP header (ports) is bolted on the front to make a segment, then an IP header (addresses) to make a packet, then an Ethernet frame (MACs) for the physical hop. Each outer layer treats everything inside as an opaque payload it must not open.
On the receiver, the exact reverse — decapsulation. The frame arrives, L2 strips the Ethernet header and checks the CRC, hands the packet up. L3 strips the IP header, hands the segment up. L4 strips the TCP header and reassembles the byte stream, hands the payload up. L6 decrypts. L7 finally reads GET /page. Every header is read by exactly the layer that wrote it, and stripped in reverse order to how it was added — last on, first off, like a stack.
DOWN the sender's stack (each layer WRAPS the one above)
L7 Application: the actual message
+---------------------------------------------------+
| GET /page HTTP/1.1 Host: example.com ... |
+---------------------------------------------------+
L6 Presentation: serialize + TLS-encrypt the payload
+---------------------------------------------------+
| [ TLS record: <<encrypted HTTP bytes>> ] |
+---------------------------------------------------+
L4 Transport: prepend TCP header (src+dst PORT, seq)
+----------+----------------------------------------+
| TCP hdr | [ encrypted application payload ] |
| :51522 | |
| ->:443 | |
+----------+----------------------------------------+
\_________ TCP calls this a SEGMENT _____/
L3 Network: prepend IP header (src+dst IP ADDRESS)
+---------+----------+-------------------------------+
| IP hdr | TCP hdr | [ encrypted payload ] |
| 1.2.3.4 | :51522 | |
| ->9.8.7 | ->:443 | |
+---------+----------+-------------------------------+
\______ IP calls this whole thing a PACKET/
L2 Link: wrap in a frame (src+dst MAC) for the next hop
+--------+---------+---------+-----------------+------+
| ETH hdr| IP hdr | TCP hdr | [ payload ] | FCS |
| MACs | | | | crc |
+--------+---------+---------+-----------------+------+
\_______ L2 calls this whole thing a FRAME _/
The Journey of One Request
Now the payoff: one https://example.com/page request, end to end. The browser (client, front-end) walks the data down its stack; the bytes cross the wire; the server (back-end) walks them up its stack, handles the request, and sends the response back down its own stack and up the client's. The ASCII diagram below traces the same journey, because the shape is worth seeing as well as reading.
Trace the down-path once in prose. L7 produces the request line and headers. L6 encrypts them into a TLS record so nothing on the path can read them. L4 slices that into TCP segments, stamps the source and destination ports, and takes responsibility for getting every byte there in order. L3 wraps each segment in an IP packet with source and destination addresses and hands it to the network, which routes it hop by hop with zero delivery promises. The server does the same steps in reverse: its L7 handler finally sees GET /page and produces 200 OK with the HTML. The response then runs down the server's stack and up the client's until the browser paints the page.
Notice what each side did not do. The server's L3 never decrypted anything — that is L6's job. The client's L7 never picked a route — that is L3's job. Every layer minded its own business. That discipline is not an accident; it is the entire design, and it has a name we will get to.
ASCII — same request, DOWN the client and UP the server
CLIENT (front-end) SERVER (back-end)
================== =================
L7 HTTP GET /page ------. ,--> L7 App handler
| | (reads GET /page)
L6 TLS encrypt --------| |--- L6 TLS decrypt
| |
L4 TCP seg :51522->:443-| |--- L4 TCP reassemble
| |
L3 IP pkt 1.2.3.4----- | |--- L3 IP strip header
->9.8.7.6 | |
v |
+===================================+
| L3 NETWORK: routers, hop by |
| hop, best-effort, may drop |
+===================================+
\____ across the wire ____/
RESPONSE travels the mirror image:
server L7 200 OK -> L6 encrypt -> L4 segment -> L3 packet
-> wire -> client L3 -> L4 -> L6 decrypt -> L7 -> browser renders
Before / After: L4 vs L7 Load Balancing
Here is where the layer you operate at becomes a real engineering decision with real money attached. A load balancer sits between clients and a fleet of servers and spreads traffic across them. Which layer it reads at changes everything it can do.
L4 load balancing operates at the transport layer. It sees IP addresses and ports and nothing else — the payload is opaque bytes (and if TLS is end-to-end, it is encrypted opaque bytes it could not read even if it wanted to). It picks a backend by a cheap rule (hash the source IP, round-robin the connection) and shovels segments through. It never terminates TLS, never parses HTTP, barely touches the CPU. Fast, cheap, blind.
L7 load balancing operates at the application layer. To read HTTP it must first terminate TLS (decrypt the traffic, which costs CPU and puts the LB inside your security boundary), then parse the HTTP request and route on its content: send /api/* to the API fleet, /img/* to the static fleet, retry idempotent requests on a failed backend, rewrite paths, add headers, do sticky sessions by cookie. Powerful, content-aware, more expensive.
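The difference in what each balancer can see shows up directly in the signature of its "pick a backend" function. The sketch below is illustrative only: the backend names are hypothetical, and `zlib.crc32` stands in for whatever hash a real balancer would use.

```python
import zlib

# Hypothetical fleets; a real deployment would discover these.
API = ["api-1", "api-2"]
STATIC = ["static-1"]
DEFAULT = ["web-1", "web-2"]
POOL = API + STATIC + DEFAULT  # an L4 LB sees one undifferentiated pool

def pick_l4(src_ip: str) -> str:
    """L4: only addresses and ports are visible, so hash the source IP."""
    return POOL[zlib.crc32(src_ip.encode("ascii")) % len(POOL)]

def pick_l7(path: str, request_id: int = 0) -> str:
    """L7: the decrypted request is visible, so route on its path."""
    if path.startswith("/api/"):
        fleet = API
    elif path.startswith("/img/"):
        fleet = STATIC
    else:
        fleet = DEFAULT
    return fleet[request_id % len(fleet)]  # round-robin within the fleet
```

`pick_l4` has no `path` parameter because an L4 balancer never has one to pass. That missing argument is the whole limitation.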
One scenario: route /checkout to the payments fleet
You want checkout traffic to land on a hardened payments fleet, separate from everything else. Watch how the layer decides whether that is even possible.
The L4 way. The load balancer sees 9.8.7.6:443 and a stream of encrypted bytes. It has no idea whether a given connection is /checkout or /cat.jpg — that information lives in the HTTP request, which is L7 data encrypted at L6, three layers above where the L4 LB is looking. So it physically cannot route by path. Your only options are crude: give /checkout its own hostname resolving to a different IP, and let L4 route by destination IP; or run the payments service on a different port. The upside: near-zero latency, no TLS CPU cost on the LB, and the LB never sees a plaintext card number, so its blast radius on a compromise is small. The downside: you have pushed the routing decision out to DNS/IP topology and lost all per-request control.
The L7 way. The load balancer terminates TLS, reads the request (POST /checkout over HTTP/2), matches the path, and forwards to the payments fleet, all over one shared hostname and IP, alongside your other routes. It can additionally retry a failed checkout POST when that request is known to be safe to retry, rewrite the path, strip a header, and pin the user to a backend. The cost is real: TLS termination burns CPU and adds a little latency, and because the LB now decrypts payment traffic it sits inside your compliance and blast-radius boundary, so a compromise there is far worse. You bought routing power and observability by paying CPU and widening the trust boundary.
That trade — cheap/fast/blind (L4) versus powerful/content-aware/costly (L7) — is the whole decision, and it falls directly out of which layer can see the information you want to route on. If the routing key lives in the HTTP request, no L4 config will ever reach it. That is not a limitation to work around; it is layering working exactly as designed.
BEFORE — L4 load balancer (transport, blind to content)
client --TLS--> [ L4 LB ] reads only IP:port, payload opaque
| hash(src_ip) % N
+--> backend chosen by connection, NOT by URL
every path (/checkout, /img, /api) lands on the same pool
AFTER — L7 load balancer (application, reads the request)
client --TLS--> [ L7 LB ] terminates TLS, parses HTTP
| route on Host + path + headers
+--> /checkout -> payments fleet
+--> /img/* -> static fleet
+--> /api/* -> api fleet (+ retries, rewrites)
The Design Principles Underneath
The layered stack is not just a networking artifact. It is a clinic in the design principles we reach for in application code — because the same forces (change isolation, substitutability, single responsibility) produced the same answers. Mapped honestly, forcing nothing.
SRP — Single Responsibility. Each layer has exactly one job and never reaches into another's. L3 routes and only routes; it never retransmits. L4 delivers and only delivers; it never picks a route. L7 speaks the app protocol and never thinks about IP addresses. This is textbook single responsibility, and it is why the L4 load balancer physically cannot route by URL — reading the URL is L7's responsibility, and L4 does not do L7's job.
DRY / encapsulation. Every layer honors the identical contract: take a payload, add your header, pass it down. The header/payload split is defined once and reused at every level. L3 does not re-implement L4's delivery; L7 does not re-implement L3's routing. Reliability logic lives in exactly one place (TCP) instead of being copy-pasted into HTTP, DNS, and every other L7 protocol — which is precisely why they can all share it.
IoC / DI — Inversion of Control and Dependency Injection. A layer depends on the interface of the layer below, not its implementation. L7's contract with L4 is "give me a reliable ordered byte stream," and it does not care how. Swap TCP for QUIC, or IPv4 for IPv6, and HTTP is unchanged — the lower layer is injected underneath, and the upper layer never notices. That is exactly why HTTP/1.1, HTTP/2, and HTTP/3 can carry identical semantics over three different transports.
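In application-code terms, the idea looks roughly like the sketch below. It is heavily simplified: `ByteStream` and `CannedStream` are invented names, the "transports" are in-memory fakes, and real HTTP/2 and HTTP/3 also change the wire framing, so only the dependency shape is being illustrated here.

```python
from typing import Protocol

class ByteStream(Protocol):
    """The contract L7 depends on: a reliable, ordered byte stream."""
    def send(self, data: bytes) -> None: ...
    def recv(self) -> bytes: ...

class CannedStream:
    """In-memory stand-in for a transport; records what was sent."""
    def __init__(self, name: str, response: bytes) -> None:
        self.name = name
        self.sent = b""
        self._response = response

    def send(self, data: bytes) -> None:
        self.sent += data

    def recv(self) -> bytes:
        return self._response

def http_get(stream: ByteStream, host: str, path: str) -> int:
    """L7 code: knows HTTP, knows nothing about what carries it."""
    stream.send(f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii"))
    status_line = stream.recv().split(b"\r\n", 1)[0]
    return int(status_line.split(b" ")[1])  # e.g. 200
```

Passing a `CannedStream` named "tcp" or one named "quic" into `http_get` produces the same request bytes and the same result. The function cannot tell the difference because the contract is all it was given.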
PubSub / event-driven. The base model is request-response, but some L7 protocols invert it. WebSocket and MQTT turn the connection into a channel where either side pushes messages when it has them — publish/subscribe, not ask/answer. This is where an L7 gateway earns its keep beyond routing: because it parses the application protocol, it can fan out one published message to many subscribed connections, or route by topic — something an L4 balancer, blind to the payload, could never do.
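A toy version of that fan-out is sketched below. `TopicGateway` is a hypothetical class for this post, not an API from any real gateway, and a plain list stands in for each subscribed connection.

```python
from collections import defaultdict

class TopicGateway:
    """Toy L7 gateway: routes a published message by its topic."""

    def __init__(self) -> None:
        # topic -> inboxes; each inbox (a list) stands in for a connection
        self._subscribers: dict[str, list[list[str]]] = defaultdict(list)

    def subscribe(self, topic: str, inbox: list[str]) -> None:
        self._subscribers[topic].append(inbox)

    def publish(self, topic: str, message: str) -> int:
        """Deliver to every subscriber of the topic; return how many."""
        inboxes = self._subscribers.get(topic, [])
        for inbox in inboxes:
            inbox.append(message)
        return len(inboxes)
```

The routing key here is `topic`, which lives inside the application message. A balancer that only sees addresses and ports has nothing equivalent to look up.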
MVC — mapped lightly, and honestly. MVC is an application-architecture pattern, not a network-layer one, and pretending otherwise invents a fake correspondence. The honest map: L7 is the surface where the app's controller/view logic lives — it is the only layer that understands the request as the application means it. L4 and L3 are pure plumbing the app deliberately does not model; the whole point of layering is that your controller never thinks about packets. So MVC does not map onto the stack cleanly, and that is fine — it maps onto the top of it, and everything below is the transport the model never sees.
The One Idea to Keep
Strip everything else away and this is what remains: each layer does one job, wraps the layer above in its own header on the way down, and reads only its own header on the way up. Routing lives at L3, delivery at L4, the app's protocol at L7 — and the reason your L4 load balancer cannot see a URL, your L7 load balancer can but pays for it in CPU and trust boundary, and your HTTP code survives a swap from TCP to QUIC untouched is all the same reason: each layer minds exactly its own business, and depends only on the contract of the one below. Learn the stack once and half the "why is my traffic going there?" questions answer themselves.