1. Why Does OpenShift Need Router and Route?
As the name suggests, the Router is a routing device, and Route refers to the paths configured within the router. These two concepts in OpenShift are designed to address the need to access services from outside the cluster (that is, from places other than the cluster nodes). I don’t know why OpenShift changed Kubernetes’s Ingress to Router; I think the name Ingress is more fitting.
A simple diagram illustrating the process of accessing applications in pods from the outside via the router and from the inside via the service is shown below:
In the above diagram, three pods of an application are located on node1, node2, and node3 respectively. OpenShift has three layers of IP address concepts:
- The pod’s own IP address, which can be likened to the fixed IP of a virtual machine in OpenStack. It is only meaningful within the cluster.
- The service’s IP address. A service usually has a ClusterIP, which is also an internal cluster address.
- The application’s external IP address, which can be likened to a floating IP in OpenStack, or an IDC IP (there is a NAT mapping between the floating IP and it). The first two layers can be inspected with the commands shown after this list.
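For reference, here is a minimal way to look at the first two IP layers with the oc client (project and service names are hypothetical):
# Pod IPs (first layer), only reachable inside the cluster
oc get pods -n sit -o wide
# Service ClusterIPs (second layer)
oc get svc -n sit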
Therefore, to access applications in pods from outside the cluster, there are essentially two methods:
- One is to use a proxy that converts the external IP address into backend pod IP addresses. This is the idea behind OpenShift’s router/route. The router service in OpenShift is a cluster service running on specific nodes (usually infrastructure nodes), created and managed by the cluster administrator, and it can have multiple replicas (pods). A router can hold multiple routes; each route maps the hostname of incoming external HTTP requests to its list of backend pods and forwards the packets to them. In other words, it exposes applications in pods under external domain names, allowing users to access the applications by domain name from outside. This is essentially a layer 7 load balancer. OpenShift uses HAProxy for this by default, but it also supports other implementations such as F5.
- The other is to expose the service itself outside the cluster. This method will be explained in detail in the article on ‘Service’. A quick sketch of the first method follows.
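As a quick illustration of the router/route method, a route can be created for an existing service with oc expose (service name, project, and hostname here are hypothetical):
# Expose an existing service through the router under an external hostname
oc expose service/jenkins --hostname=jenkins.example.com -n sit
# Inspect the route that was generated
oc get route -n sit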
2. How Does OpenShift Use HAProxy to Implement Router and Route?
2.1 Router Deployment
When deploying an OpenShift cluster using Ansible with default configurations, an HAProxy pod will run on the Infra node in Host networking mode, listening on ports 80 and 443 on all network interfaces.
[root@infra-node3 cloud-user]# netstat -lntp | grep haproxy
tcp 0 0 127.0.0.1:10443 0.0.0.0:* LISTEN 583/haproxy
tcp 0 0 127.0.0.1:10444 0.0.0.0:* LISTEN 583/haproxy
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 583/haproxy
tcp 0 0 0.0.0.0:443 0.0.0.0:* LISTEN 583/haproxy
Among them, ports 10443 and 10444 on 127.0.0.1 are for HAProxy’s internal use; they are explained further below.
Therefore, only one HAProxy pod can exist on each infra node, as these ports can only be occupied once. If the scheduler cannot find a suitable node, the scheduling of the router service will fail:
0/7 nodes are available: 2 node(s) didn't have free ports for the requested pod ports, 5 node(s) didn't match node selector
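When this happens, the failure is visible as scheduling events on the pending router pod; for example (router pod name is hypothetical):
# Find the router pod stuck in Pending
oc get pods -n default | grep router
# The events at the end of the output show the FailedScheduling reason
oc describe pod router-2-xxxxx -n default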
The OpenShift HAProxy Router supports two deployment methods:
- One is the common single-router deployment, with one or more instances (pods) distributed across multiple nodes, responsible for external access to services deployed across the entire cluster.
- The other is a sharded deployment. In this case there are multiple router services, each responsible for a given set of projects, with the mapping between routers and projects/namespaces defined by labels. This is a solution to the performance limits of a single router; a sketch follows this list.
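A minimal sharding sketch, assuming the 3.x router honors the NAMESPACE_LABELS environment variable (router name and label are hypothetical):
# Create a second router that only serves namespaces carrying a given label
oc adm router router-west --replicas=1 --service-account=router
oc set env dc/router-west NAMESPACE_LABELS="region=west"
# Label a project so that its routes are served by this shard
oc label namespace sit region=west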
OpenShift provides the oc adm router command to create router services.
Creating a router:
[root@master1 cloud-user]# oc adm router router2 --replicas=1 --service-account=router
info: password for stats user admin has been set to J3YyPjlbqf
--> Creating router router2 ...
warning: serviceaccounts "router" already exists
clusterrolebinding.authorization.openshift.io "router-router2-role" created
deploymentconfig.apps.openshift.io "router2" created
service "router2" created
--> Success
For detailed deployment methods, please refer to the official documentation here.
2.2 HAProxy Process in Router Pod
Within each pod of the Router service, the openshift-router process starts a haproxy process:
UID        PID    PPID  C STIME TTY  TIME     CMD
1000000+   1      0     0 Nov21 ?    00:14:27 /usr/bin/openshift-router
1000000+   16011  1     0 12:42 ?    00:00:00 /usr/sbin/haproxy -f /var/lib/haproxy/conf/haproxy.config -p /var/lib/haproxy/run/haproxy.pid -x /var/lib/haproxy/run/haproxy.sock -sf 16004
Viewing the configuration file used by haproxy (only a portion):
global
maxconn 20000
daemon
ca-base /etc/ssl
crt-base /etc/ssl
...
defaults
maxconn 20000
# Add x-forwarded-for header.
# server openshift_backend 127.0.0.1:8080
errorfile 503 /var/lib/haproxy/conf/error-page-503.http
...
timeout http-request 10s
timeout http-keep-alive 300s
# Long timeout for WebSocket connections.
timeout tunnel 1h
frontend public
bind :80
mode http
tcp-request inspect-delay 5s
tcp-request content accept if HTTP
monitor-uri /_______internal_router_healthz
# Strip off Proxy headers to prevent HTTpoxy (https://httpoxy.org/)
http-request del-header Proxy
# DNS labels are case insensitive (RFC 4343), we need to convert the hostname into lowercase
# before matching, or any requests containing uppercase characters will never match.
http-request set-header Host %[req.hdr(Host),lower]
# check if we need to redirect/force using https.
acl secure_redirect base,map_reg(/var/lib/haproxy/conf/os_route_http_redirect.map) -m found
redirect scheme https if secure_redirect
use_backend %[base,map_reg(/var/lib/haproxy/conf/os_http_be.map)]
default_backend openshift_default
# public ssl accepts all connections and isn't checking certificates yet; certificates to use will be
# determined by the next backend in the chain which may be an app backend (passthrough termination) or a backend
# that terminates encryption in this router (edge)
frontend public_ssl
bind :443
tcp-request inspect-delay 5s
tcp-request content accept if { req_ssl_hello_type 1 }
# if the connection is SNI and the route is a passthrough don't use the termination backend, just use the tcp backend
# for the SNI case, we also need to compare it in case-insensitive mode (by converting it to lowercase) as RFC 4343 says
acl sni req.ssl_sni -m found
acl sni_passthrough req.ssl_sni,lower,map_reg(/var/lib/haproxy/conf/os_sni_passthrough.map) -m found
use_backend %[req.ssl_sni,lower,map_reg(/var/lib/haproxy/conf/os_tcp_be.map)] if sni sni_passthrough
# if the route is SNI and NOT passthrough enter the termination flow
use_backend be_sni if sni
# non SNI requests should enter a default termination backend rather than the custom cert SNI backend since it
# will not be able to match a cert to an SNI host
default_backend be_no_sni
...
For simplicity, the above shows only part of the configuration file. It consists mainly of three parts:
- Global configuration (the global and defaults sections), such as the maximum number of connections (maxconn), timeouts, and so on.
- Frontend configuration (the frontend sections). By default, HAProxy listens for external https and http requests on ports 443 and 80 respectively.
- Backend configuration (one backend section per exposed service), which contains the key settings: the backend protocol (mode), the load balancing method (balance), the backend server list (server entries, each with a pod IP address and port), certificates, and so on.
Therefore, OpenShift’s router functionality needs to manage and control these three parts.
For a detailed introduction to load balancers and HAProxy, you can refer to the article Understanding Neutron (7): How Neutron Implements Load Balancer Virtualization.
2.3 Global Configuration Management
To specify or modify HAProxy’s global configuration, OpenShift provides two methods:
(1) The first method is to use the oc adm router command to specify various parameters when creating the router, such as --max-connections to set the maximum number of connections. For example:
oc adm router --max-connections=200000 --ports='81:80,444:443' router3
The created HAProxy’s maxconn will be 200000, and the ports exposed by the router3 service are 81 and 444, but the HAProxy pod itself still listens on ports 80 and 443.
(2) The second method is to set environment variables on the router’s deployment config (dc).
A complete list of environment variables can be found in the official documentation here. For example, after running the following command,
oc set env dc/router3 ROUTER_SERVICE_HTTPS_PORT=444 ROUTER_SERVICE_HTTP_PORT=81 STATS_PORT=1937
router3 will be redeployed, and the newly deployed HAProxy will listen for https on port 444 and for http on port 81, with the statistics port set to 1937.
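To confirm which variables are in effect, the deployment config’s current environment can be listed:
# List the environment variables currently set on the router dc
oc set env dc/router3 --list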
2.4 OpenShift Passthrough Type Route and HAProxy Backend
(1) Create a route through OpenShift Console or oc command to expose the jenkins service of the sit project to the domain sitjenkins.com.cn:
Create the route in the interface:
The result:
Name: sitjenkins.com.cn
Namespace: sit
Labels: app=jenkins-ephemeral
template=jenkins-ephemeral-template
Annotations: <none>
Requested Host: sitjenkins.com.cn
Path: <none>
TLS Termination: passthrough
Endpoint Port: web
Service: jenkins
Weight: 100 (100%)
Endpoints: 10.128.2.15:8080, 10.131.0.10:8080
Here, the service name acts as an intermediary, connecting the route and the service’s endpoints (which are the pods).
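The same route can also be created from the command line instead of the console; a sketch using oc create route with the names from this example:
# Create a passthrough route for the jenkins service in the sit project
oc create route passthrough sitjenkins.com.cn --service=jenkins --hostname=sitjenkins.com.cn -n sit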
(2) The configuration file of the HAProxy process in the two pods of the router service has an additional backend:
# Secure backend, pass through
backend be_tcp:sit:sitjenkins.com.cn
balance source
hash-type consistent
timeout check 5000ms
server pod:jenkins-1-bqhfj:jenkins:10.128.2.15:8080 10.128.2.15:8080 weight 256 check inter 5000ms
server pod:jenkins-1-h2fff:jenkins:10.131.0.10:8080 10.131.0.10:8080 weight 256 check inter 5000ms
These backend servers are the pods themselves, which OpenShift located through the service name in step (1). balance specifies the load balancing strategy, which is explained later.
(3) The file /var/lib/haproxy/conf/os_sni_passthrough.map has an additional record
sh-4.2$ cat /var/lib/haproxy/conf/os_sni_passthrough.map
^sitjenkins\.com\.cn(:[0-9]+)?(/.*)?$ 1
(4) The file /var/lib/haproxy/conf/os_tcp_be.map has an additional record
sh-4.2$ cat /var/lib/haproxy/conf/os_tcp_be.map
^sitjenkins\.com\.cn(:[0-9]+)?(/.*)?$ be_tcp:sit:sitjenkins.com.cn
(5) Based on the above map files, HAProxy selects the backend for this route as follows:
frontend public_ssl # Explanation: Frontend protocol https,
bind :443 ## Frontend port 443
tcp-request inspect-delay 5s
tcp-request content accept if { req_ssl_hello_type 1 }
# if the connection is SNI and the route is a passthrough don't use the termination backend, just use the tcp backend
# for the SNI case, we also need to compare it in case-insensitive mode (by converting it to lowercase) as RFC 4343 says
acl sni req.ssl_sni -m found ## Check whether the https request carries SNI
acl sni_passthrough req.ssl_sni,lower,map_reg(/var/lib/haproxy/conf/os_sni_passthrough.map) -m found ## Check whether the hostname passed via SNI is in the os_sni_passthrough.map file
use_backend %[req.ssl_sni,lower,map_reg(/var/lib/haproxy/conf/os_tcp_be.map)] if sni sni_passthrough ## Look up the backend name in os_tcp_be.map by the SNI hostname
# if the route is SNI and NOT passthrough enter the termination flow
use_backend be_sni if sni
# non SNI requests should enter a default termination backend rather than the custom cert SNI backend since it
# will not be able to match a cert to an SNI host
default_backend be_no_sni
(6) The HAProxy process will restart to apply the modified configuration file.
Some background knowledge is needed to understand the script in (5):
- SNI: TLS Server Name Indication (SNI) is an extension of the TLS protocol that allows the client to tell the server, before the TLS handshake, the hostname it is connecting to, so that the server can return the appropriate certificate for that hostname. This lets a server support multiple certificates for multiple hostnames. For more details, refer to here.
- OpenShift passthrough route: this type of route does not terminate the SSL connection on the router; instead, the router passes the encrypted connection directly to the pod, so no certificates or keys need to be configured on the router.
- HAProxy support for SNI: HAProxy selects the specific backend based on the hostname carried in the SNI information. For more details, refer to here.
- HAProxy ACLs: for more details, refer to here.
From the inline comments above, we can see that the HAProxy process uses the domain sitjenkins.com.cn, passed via SNI in the https request, to look up the backend name be_tcp:sit:sitjenkins.com.cn in the os_tcp_be.map file, which corresponds to the backend shown in step (2).
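This SNI-based selection can be verified from outside the cluster; a sketch, assuming the router is reachable at the hypothetical address 192.168.1.10:
# Send a TLS ClientHello with SNI set to the route hostname and check which
# certificate comes back (for passthrough, it is the pod's own certificate)
openssl s_client -connect 192.168.1.10:443 -servername sitjenkins.com.cn </dev/null
# Or point the hostname at the router explicitly with curl
curl -kv --resolve sitjenkins.com.cn:443:192.168.1.10 https://sitjenkins.com.cn/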
The router used by OpenShift employs HAProxy to implement domain-based load balancing routing, as illustrated below. For specific instructions, please refer to the official documentation.
2.5 OpenShift Edge and Re-encrypt Type Routes with HAProxy
HAProxy Frontend: The frontend still listens for external HTTPS requests on port 443
frontend public_ssl
bind :443
.....
# if the route is SNI and NOT passthrough enter the termination flow
use_backend be_sni if sni
However, when the TLS termination type is not passthrough (edge or re-encrypt), it will use the backend be_sni.
backend be_sni
server fe_sni 127.0.0.1:10444 weight 1 send-proxy
This backend points at 127.0.0.1:10444 on the local host, which is where the frontend fe_sni listens, so traffic is handed over to fe_sni:
frontend fe_sni
# terminate ssl on edge
bind 127.0.0.1:10444 ssl no-sslv3 crt /var/lib/haproxy/router/certs/default.pem crt-list /var/lib/haproxy/conf/cert_config.map accept-proxy
mode http
......
# map to backend
# Search from most specific to general path (host case).
# Note: If no match, haproxy uses the default_backend, no other
# use_backend directives below this will be processed.
use_backend %[base,map_reg(/var/lib/haproxy/conf/os_edge_reencrypt_be.map)]
default_backend openshift_default
Mapping file:
sh-4.2$ cat /var/lib/haproxy/conf/os_edge_reencrypt_be.map
^edgejenkins\.com\.cn(:[0-9]+)?(/.*)?$ be_edge_http:sit:jenkins-edge
HAProxy backend for Edge type route:
backend be_edge_http:sit:jenkins-edge
mode http
option redispatch
option forwardfor
balance leastconn
timeout check 5000ms
.....
server pod:jenkins-1-bqhfj:jenkins:10.128.2.15:8080 10.128.2.15:8080 cookie 71c6bd03732fa7da2f1b497b1e4c7993 weight 256 check inter 5000ms
server pod:jenkins-1-h2fff:jenkins:10.131.0.10:8080 10.131.0.10:8080 cookie fa8d7fb72a46958a7add1406e6d26cc8 weight 256 check inter 5000ms
HAProxy backend for Re-encrypt type route:
# Plain http backend or backend with TLS terminated at the edge or a
# secure backend with re-encryption.
backend be_secure:sit:reencryptjenkins.com.cn
mode http
...
http-request set-header X-Forwarded-Host %[req.hdr(host)]
http-request set-header X-Forwarded-Port %[dst_port]
http-request set-header X-Forwarded-Proto http if !{ ssl_fc }
http-request set-header X-Forwarded-Proto https if { ssl_fc }
http-request set-header X-Forwarded-Proto-Version h2 if { ssl_fc_alpn -i h2 }
server pod:jenkins-1-bqhfj:jenkins:10.128.2.15:8080 10.128.2.15:8080 cookie ... weight 256 ssl verifyhost jenkins.sit.svc verify required ca-file /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt check inter 5000ms # the link to the backend is encrypted with ssl, and the hostname is verified
server pod:jenkins-1-h2fff:jenkins:10.131.0.10:8080 10.131.0.10:8080 cookie ... weight 256 ssl verifyhost jenkins.sit.svc verify required ca-file /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt check inter 5000ms
Here it can be seen that the connection to the backend is re-encrypted using keys. Note that mode http does not mean plaintext: in HAProxy, mode only selects layer 4 (tcp) versus layer 7 (http) processing, while TLS toward the backend is enabled per server via the ssl keyword, which is why there is no separate ‘https’ mode.
2.6 Setting and Modifying Route Configurations
Route configurations mainly include the following important aspects:
(1) SSL termination methods. There are three types:
- Edge: TLS is terminated on the router, and unencrypted packets are forwarded to the backend pod, so a TLS certificate needs to be installed on the router. If none is installed, the router’s default certificate is used.
- Passthrough: encrypted packets are sent directly to the pod; the router does not terminate TLS, so no certificates or keys need to be configured on the router.
- Re-encryption: a variant of edge. The router first terminates TLS with one certificate, then re-encrypts the traffic with another certificate before sending it to the backend pod, so the entire network path is encrypted.
Settings:
- Can be set when creating the route, or modified later through the route’s termination configuration item; a CLI sketch follows this list.
- For specific details, refer to the official documentation here.
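For example, an edge route with its own certificate can be created from the CLI; a sketch, with the hostname taken from the earlier map file and the certificate file names hypothetical:
# Create an edge route, supplying the certificate and key the router will use
# to terminate TLS for this hostname
oc create route edge jenkins-edge --service=jenkins --cert=tls.crt --key=tls.key --hostname=edgejenkins.com.cn -n sit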
(2) Load balancing policies.
There are three strategies:
- Round Robin (roundrobin): uses all backends in turn, according to their weights.
- Least Connections (leastconn): sends each request to the backend with the fewest connections.
- Source (source): hashes the source IP so that requests from the same source IP always go to the same backend.
Settings:
- To modify the load balancing strategy for the entire router, use the ROUTER_TCP_BALANCE_SCHEME environment variable to set the strategy for all passthrough routes, and ROUTER_LOAD_BALANCE_ALGORITHM for other types of routes.
- Use the haproxy.router.openshift.io/balance annotation to set the strategy for a specific route.
For example:
- Set the environment variable for the entire router: oc set env dc/router ROUTER_TCP_BALANCE_SCHEME=roundrobin
- After this modification, the router instance is redeployed, and all passthrough routes use roundrobin. The default is source.
- Modify the strategy for a specific route: oc edit route aaaa.svc.cluster.local, then set the haproxy.router.openshift.io/balance annotation, for example to leastconn.
After the modification, the balance value in that route’s backend in the HAProxy configuration changes to leastconn. A non-interactive alternative is shown below.
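The per-route change can also be applied without an interactive edit; a sketch using oc annotate (route name as in the example above):
# Set the per-route load balancing strategy via the annotation
oc annotate route/aaaa.svc.cluster.local haproxy.router.openshift.io/balance=leastconn --overwrite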
3. How to Troubleshoot Common Issues?
From the above analysis, it can be seen that to ensure that both the router and route work properly, at least the following aspects must be functioning correctly:
1. The client uses the domain name and port configured in the route to access the service.
2. DNS resolves the domain name to the server where the target router is located (this is more complex with sharded configurations and deserves special attention).
3. If another layer 4 load balancer sits in front, it is configured correctly and operational.
4. HAProxy can match the correct backend based on the domain name.
5. The router and route configurations are correctly reflected in the HAProxy configuration file.
6. The HAProxy process has been restarted and has therefore loaded the newly modified configuration file.
7. The backend pod list is correct, and at least one pod is working properly.
If you see the error page below, it means at least one of points 3 through 7 is not working, and you can troubleshoot those areas one by one; a few starting points are sketched below.
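A few hedged starting points for that troubleshooting (the router pod name is hypothetical; the file paths are the ones shown earlier):
# Points 5 and 6: check that the route appears in the generated HAProxy config and map files
oc rsh router-1-xxxxx grep sitjenkins /var/lib/haproxy/conf/haproxy.config
oc rsh router-1-xxxxx cat /var/lib/haproxy/conf/os_tcp_be.map
# Point 7: check that the service behind the route has healthy endpoints
oc get endpoints jenkins -n sit
# The router logs often reveal reload errors and backend health-check failures
oc logs router-1-xxxxx -n default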
Thank you for reading.