Tales of the Kubernetes Ingress Networking: Deployment Patterns for External Load Balancers
Browse the slides: use the arrows
Change chapter: Left/Right arrows
Next or previous slide: Up/Down arrows
Overview of the slides: keyboard shortcut "o"
Speaker mode (and notes): keyboard shortcut "s"
Damien DUPORTAL:
Træfik's Developer 🥑 Advocate @ Containous
We Believe in Open Source
We Deliver Traefik, Traefik Enterprise Edition and Maesh
Commercial Support
30 people distributed, 90% tech
Once upon a time, there was a Kubernetes cluster.
How to route traffic to these pods? And between pods on different nodes?
Their goal: Expose Pods to allow incoming traffic
Services
have 1-N Endpoints
Endpoints
are determined by the Kubernetes API
for different communications use cases:
From inside: type "ClusterIP"
(default).
From outside: types "NodePort"
and "LoadBalancer"
.
Virtual IP, private to the cluster, cluster-wide (e.g. works from any node to any other node)
Uses public IPs and ports of the nodes, kind of "Routing grid"
Same as NodePort
, except it requires (and uses) an external Load Balancer.
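The three Service types above can be sketched in a single manifest; the names and ports here are hypothetical:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app            # hypothetical application name
spec:
  type: NodePort          # ClusterIP (default), NodePort, or LoadBalancer
  selector:
    app: my-app           # matches the Pods' labels, which determine the Endpoints
  ports:
    - port: 80            # ClusterIP port, reachable from inside the cluster
      targetPort: 8080    # container port on the Pods
      nodePort: 30080     # public port opened on every node (NodePort/LoadBalancer only)
```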
Context: exposing a bunch of applications externally
Challenge: overhead of allocation for LBs. For each application:
One LB resource (either a machine or a dedicated appliance)
At least one public IP
DNS nightmare (think about the CNAMEs to create…)
No centralization of certificates, logs, etc.
Example with Traefik as Ingress Controller:
Deployed as Pods (Deployment
or as DaemonSet
)
Exposed with a Service:
You still need access from the outside
But only one service to deal with (ideally)
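A minimal sketch of such a setup, assuming a Traefik Deployment exposed by a single LoadBalancer Service (the image tag and replica count are assumptions):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: traefik
spec:
  replicas: 2                 # assumed replica count
  selector:
    matchLabels:
      app: traefik
  template:
    metadata:
      labels:
        app: traefik
    spec:
      containers:
        - name: traefik
          image: traefik:v2.0   # version is an assumption
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: traefik
spec:
  type: LoadBalancer          # the one Service exposed to the outside
  selector:
    app: traefik
  ports:
    - port: 80
      targetPort: 80
```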
Simplified Setup:
Single entrypoint, less configuration, centralized metrics
Fewer resources used
Separation of concerns: different algorithms for load balancing, etc.
Designed for (simple) HTTP/HTTPS cases
TCP/UDP can be used, but are not first-class citizens
"Virtual Host First" centric
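The "Virtual Host first" model can be sketched with an Ingress resource routing two hypothetical hostnames to two hypothetical backend Services (the API version matches 2019-era clusters; use `extensions/v1beta1` on older ones):

```yaml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: apps
spec:
  rules:
    - host: app1.example.com        # one virtual host per application
      http:
        paths:
          - backend:
              serviceName: app1     # hypothetical backend Service
              servicePort: 80
    - host: app2.example.com
      http:
        paths:
          - backend:
              serviceName: app2     # hypothetical backend Service
              servicePort: 80
```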
Feels like you must carefully select your (only) Ingress Controller
Kubernetes gives you freedom:
You can use multiple Ingress Controllers!
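With several Ingress Controllers deployed, each Ingress resource selects its controller; a common way in 2019 was the `kubernetes.io/ingress.class` annotation (the class value and hostname below are assumptions):

```yaml
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: internal-app
  annotations:
    kubernetes.io/ingress.class: traefik   # picked up only by the Traefik controller
spec:
  rules:
    - host: internal.example.com           # hypothetical hostname
      http:
        paths:
          - backend:
              serviceName: internal-app    # hypothetical backend Service
              servicePort: 80
```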
Kubernetes gives you choices:
So many deployment patterns that you can do almost anything
Outside the "Borders" of Kubernetes:
Depends on your "platform" (as in infrastructure/cloud)
Still Managed by Kubernetes (Automation)
Requires "plugins" (operators/modules) per Load Balancer provider
No API or no Kubernetes support: requires switching to NodePort
…and I’ll tell you which LB to use…
Cloud providers provide their own external LBs
Fully Automated Management with APIs
Great UX due to the integration: works out of the box
Benefits from the cloud provider's HA and performance
But:
You have to pay for this :)
Configuration is cloud-specific (using annotations)
Relies on LB implementation limits
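Cloud-specific configuration typically lives in Service annotations; for example, on AWS this real annotation requests an NLB instead of the default Classic ELB (the Service name and port are assumptions):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: traefik
  annotations:
    # AWS-specific: provision a Network Load Balancer
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
spec:
  type: LoadBalancer
  selector:
    app: traefik
  ports:
    - port: 80
```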
Aka. "Run it on your boxes"
Best approach: MetalLB, a Load Balancer implementation for Kubernetes, hosted inside Kubernetes
Uses all Kubernetes primitives (HA, deployment, etc.)
Allows Layer 2 routing as well as BGP
But… still not considered production ready
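A sketch of MetalLB's 2019-era Layer 2 configuration, delivered as a ConfigMap (the address range is a hypothetical example; newer MetalLB versions use CRDs instead):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
      - name: default
        protocol: layer2                    # or "bgp" for BGP mode
        addresses:
          - 192.168.1.240-192.168.1.250    # hypothetical IP range to hand out
```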
Otherwise: external static (or legacy) LB
Requires switching to NodePort
Service
Depends on the compute provider: cloud or bare-metal
You need a tool for managing clusters: kubeadm, kops, etc.
Most of these tools already manage LB if the provider does.
As a business manager, I need my system to know the IP of the emitters of the requests, to track usage, write access logs for legal reasons, and limit traffic in some cases.
NAT stands for "Network Address Translation"
IPv4 world: routers "masquerade" IPs to allow routing between different networks
DNAT stands for "Destination NAT"
Masquerade of the destination IP with the internal pod IP
SNAT stands for "Source NAT"
Masquerade of the source IP with the router’s IP
Rule: We do NOT want SNAT to happen
Challenge: many intermediate components can interfere and SNAT the packets behind our back!
kube-proxy
is a Kubernetes component, running on each worker node
Role: manage the virtual IPs used for Services
Challenge with Source IP: kube-proxy
might SNAT requests
SNAT by kube-proxy
depends on the Service:
Let’s do a tour of the Service types!
ClusterIP
When kube-proxy
is in "iptables" mode: no SNAT ✅
This is the default mode
No intermediate component
NodePort
(Default) SNAT is done ❌ (routing to the node where the pod is):
First, node-to-node routing through the nodes' network
Then, node-to-pod routing through the pod network
NodePort
(Local Endpoint) No SNAT ✅ with externalTrafficPolicy
set to Local
Downside: requests are dropped if the receiving node has no local pod
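Preserving the source IP this way is a one-line change on the Service; the name and ports below are hypothetical:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: NodePort                  # also applies to type LoadBalancer
  externalTrafficPolicy: Local    # keep the client source IP: no SNAT
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```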
LoadBalancer
(Default) SNAT is done ❌, same as NodePort:
The external Load Balancer can route to any node
If there is no local endpoint: node-to-node routing with SNAT
LoadBalancer
(Local Endpoint) However, no SNAT ✅ for load balancers implementing the Local externalTrafficPolicy:
GKE/GCE LB, Amazon NLB, etc.
🛠Nodes without local endpoints are removed from the LB by failing healthchecks
👍🏽Pros: no dropped requests from the client's point of view; only nodes with local endpoints stay in the LB
👎🏼Cons: relies on healthcheck timeouts
Sometimes, SNAT is mandatory
External LB
Network Constraint
Ingress Controller in the middle
"Network is based on layers" - let’s use another layer:
If using HTTP, retrieve the Source IP from headers
If using TCP/UDP, use the "Proxy Protocol"
Or use distributed logging and tracing
X-Forwarded-For
holds a comma-separated list of all the source IPs SNATed during the network hops.
✅ if you have an External LoadBalancer or an Ingress Controller supporting this header.
⚠️ Not standard (header starting with X-
) so not all HTTP appliances may support it.
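With Traefik v2 as the Ingress Controller, trusting this header from a known upstream LB can be sketched in the static configuration (the trusted IP range is a hypothetical example):

```yaml
# traefik.yml (Traefik v2 static configuration)
entryPoints:
  web:
    address: ":80"
    forwardedHeaders:
      trustedIPs:          # only accept X-Forwarded-For from these proxies
        - "10.0.0.0/8"     # hypothetical range of the external LB
```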
Upcoming official HTTP header: Forwarded (RFC 7239)
Introduced by HAProxy
Happens at Layer 4 (Transport) for TCP/UDP
Goal: "chain proxies / reverse-proxies without losing the client information"
Supported by a lot of appliances in 2019: AWS ELB, Traefik, Apache, Nginx, Varnish, etc.
Use Case: when SNAT happens AND there is no way to use HTTP.
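Enabling Proxy Protocol on a Traefik v2 entrypoint looks like this sketch (the trusted IP range is a hypothetical example; the upstream LB must also be configured to send Proxy Protocol headers):

```yaml
# traefik.yml (Traefik v2 static configuration)
entryPoints:
  web:
    address: ":80"
    proxyProtocol:
      trustedIPs:          # accept Proxy Protocol headers only from the LB
        - "10.0.0.0/8"     # hypothetical LB address range
```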
🛠 Idea:
Collect the source IP as soon as possible in distributed logging
Use distributed tracing to track the request in the system
👍🏽Pros: no more complex network setups; distributed logging and tracing stacks are already on your Kubernetes cluster (or soon will be)
👎🏼Cons: relies on the distributed logging/tracing stacks
Amazon EKS: Capturing Source IP with the local external Load Balancer traffic policy
Bare-Metal Kubernetes: Use Traefik for capturing Source IP on HTTP headers