1. Why Does OpenShift Need Router and Route?
As the name suggests, the Router is a routing device, and Route refers to the paths configured within the router. These two concepts in OpenShift are designed to address the need to access services from outside the cluster (that is, from places other than the cluster nodes). I don’t know why OpenShift changed Kubernetes’s Ingress to Router; I think the name Ingress is more fitting.
A simple diagram illustrating the process of accessing applications in pods from the outside via the router and from the inside via the service is shown below:
In the above diagram, three pods of an application are located on node1, node2, and node3 respectively. OpenShift has three layers of IP address concepts:
- The pod’s own IP address, which can be likened to the fixed IP of a virtual machine in OpenStack. It is only meaningful within the cluster.
- The service’s IP address. A service usually has a ClusterIP, which is also an internal cluster address.
- The application’s external IP address, which can be likened to a floating IP in OpenStack, or an IDC IP (there is a NAT mapping between the floating IP and it). The first two layers can be inspected with the commands shown after this list.
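For reference, here is a minimal way to look at the first two IP layers with the oc client (project and service names are hypothetical):
# Pod IPs (first layer), only reachable inside the cluster
oc get pods -n sit -o wide
# Service ClusterIPs (second layer)
oc get svc -n sit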
Therefore, to access applications in pods from outside the cluster, there are essentially two methods:
- One is to use a proxy that converts the external IP address into backend pod IP addresses. This is the idea behind OpenShift’s router/route. The router service in OpenShift is a cluster service running on specific nodes (usually infrastructure nodes), created and managed by the cluster administrator, and it can have multiple replicas (pods). A router can hold multiple routes; each route maps the hostname of incoming external HTTP requests to its list of backend pods and forwards the packets to them. In other words, it exposes applications in pods under external domain names, allowing users to access the applications by domain name from outside. This is essentially a layer 7 load balancer. OpenShift uses HAProxy for this by default, but it also supports other implementations such as F5.
- The other is to expose the service itself outside the cluster. This method will be explained in detail in the article on ‘Service’. A quick sketch of the first method follows.
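As a quick illustration of the router/route method, a route can be created for an existing service with oc expose (service name, project, and hostname here are hypothetical):
# Expose an existing service through the router under an external hostname
oc expose service/jenkins --hostname=jenkins.example.com -n sit
# Inspect the route that was generated
oc get route -n sit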
2. How Does OpenShift Use HAProxy to Implement Router and Route?
2.1 Router Deployment
When deploying an OpenShift cluster using Ansible with default configurations, an HAProxy pod will run on the Infra node in Host networking mode, listening on ports 80 and 443 on all network interfaces.
[root@infra-node3 cloud-user]# netstat -lntp | grep haproxy
tcp 0 0 127.0.0.1:10443 0.0.0.0:* LISTEN 583/haproxy
tcp 0 0 127.0.0.1:10444 0.0.0.0:* LISTEN 583/haproxy
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 583/haproxy
tcp 0 0 0.0.0.0:443 0.0.0.0:* LISTEN 583/haproxy
Among them, ports 10443 and 10444 on 127.0.0.1 are for HAProxy’s internal use; they are explained further below.
Therefore, only one HAProxy pod can exist on each infra node, as these ports can only be occupied once. If the scheduler cannot find a suitable node, the scheduling of the router service will fail:
0/7 nodes are available: 2 node(s) didn't have free ports for the requested pod ports, 5 node(s) didn't match node selector
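When this happens, the failure is visible as scheduling events on the pending router pod; for example (router pod name is hypothetical):
# Find the router pod stuck in Pending
oc get pods -n default | grep router
# The events at the end of the output show the FailedScheduling reason
oc describe pod router-2-xxxxx -n default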
The OpenShift HAProxy Router supports two deployment methods:
- One is the common single-router deployment, with one or more instances (pods) distributed across multiple nodes, responsible for external access to services deployed across the entire cluster.
- The other is a sharded deployment. In this case there are multiple router services, each responsible for a given set of projects, with the mapping between routers and projects/namespaces defined by labels. This is a solution to the performance limits of a single router; a sketch follows this list.
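A minimal sharding sketch, assuming the 3.x router honors the NAMESPACE_LABELS environment variable (router name and label are hypothetical):
# Create a second router that only serves namespaces carrying a given label
oc adm router router-west --replicas=1 --service-account=router
oc set env dc/router-west NAMESPACE_LABELS="region=west"
# Label a project so that its routes are served by this shard
oc label namespace sit region=west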
OpenShift provides the oc adm router command to create router services.
Creating a router:
[root@master1 cloud-user]# oc adm router router2 --replicas=1 --service-account=router
info: password for stats user admin has been set to J3YyPjlbqf
--> Creating router router2 ...
warning: serviceaccounts "router" already exists
clusterrolebinding.authorization.openshift.io "router-router2-role" created
deploymentconfig.apps.openshift.io "router2" created
service "router2" created
--> Success
For detailed deployment methods, please refer to the official documentation here.
2.2 HAProxy Process in Router Pod
Within each pod of the Router service, the openshift-router process starts a haproxy process:
UID        PID    PPID  C STIME TTY  TIME     CMD
1000000+   1      0     0 Nov21 ?    00:14:27 /usr/bin/openshift-router
1000000+   16011  1     0 12:42 ?    00:00:00 /usr/sbin/haproxy -f /var/lib/haproxy/conf/haproxy.config -p /var/lib/haproxy/run/haproxy.pid -x /var/lib/haproxy/run/haproxy.sock -sf 16004
Viewing the configuration file used by haproxy (only a portion):
global
maxconn 20000
daemon
ca-base /etc/ssl
crt-base /etc/ssl
...
defaults
maxconn 20000
# Add x-forwarded-for header.
# server openshift_backend 127.0.0.1:8080
errorfile 503 /var/lib/haproxy/conf/error-page-503.http
...
timeout http-request 10s
timeout http-keep-alive 300s
# Long timeout for WebSocket connections.
timeout tunnel 1h
frontend public
bind :80
mode http
tcp-request inspect-delay 5s
tcp-request content accept if HTTP
monitor-uri /_______internal_router_healthz
# Strip off Proxy headers to prevent HTTpoxy (https://httpoxy.org/)
http-request del-header Proxy
# DNS labels are case insensitive (RFC 4343), we need to convert the hostname into lowercase
# before matching, or any requests containing uppercase characters will never match.
http-request set-header Host %[req.hdr(Host),lower]
# check if we need to redirect/force using https.
acl secure_redirect base,map_reg(/var/lib/haproxy/conf/os_route_http_redirect.map) -m found
redirect scheme https if secure_redirect
use_backend %[base,map_reg(/var/lib/haproxy/conf/os_http_be.map)]
default_backend openshift_default
# public ssl accepts all connections and isn't checking certificates yet; certificates to use will be
# determined by the next backend in the chain which may be an app backend (passthrough termination) or a backend
# that terminates encryption in this router (edge)
frontend public_ssl
bind :443
tcp-request inspect-delay 5s
tcp-request content accept if { req_ssl_hello_type 1 }
# if the connection is SNI and the route is a passthrough don't use the termination backend, just use the tcp backend
# for the SNI case, we also need to compare it in case-insensitive mode (by converting it to lowercase) as RFC 4343 says
acl sni req.ssl_sni -m found
acl sni_passthrough req.ssl_sni,lower,map_reg(/var/lib/haproxy/conf/os_sni_passthrough.map) -m found
use_backend %[req.ssl_sni,lower,map_reg(/var/lib/haproxy/conf/os_tcp_be.map)] if sni sni_passthrough
# if the route is SNI and NOT passthrough enter the termination flow
use_backend be_sni if sni
# non SNI requests should enter a default termination backend rather than the custom cert SNI backend since it
# will not be able to match a cert to an SNI host
default_backend be_no_sni
...
For simplicity, the above shows only part of the configuration file. It consists mainly of three parts:
- Global configuration (the global and defaults sections), such as the maximum number of connections (maxconn), timeouts, and so on.
- Frontend configuration (the frontend sections). By default, HAProxy listens for external https and http requests on ports 443 and 80 respectively.
- Backend configuration (one backend section per exposed service), which contains the key settings: the backend protocol (mode), the load balancing method (balance), the backend server list (server entries, each with a pod IP address and port), certificates, and so on.
Therefore, OpenShift’s router functionality needs to manage and control these three parts.
For a detailed introduction to load balancers and HAProxy, you can refer to the article Understanding Neutron (7): How Neutron Implements Load Balancer Virtualization.
2.3 Global Configuration Management
To specify or modify HAProxy’s global configuration, OpenShift provides two methods:
(1) The first method is to use the oc adm router command to specify various parameters when creating the router, such as --max-connections to set the maximum number of connections. For example:
oc adm router --max-connections=200000 --ports='81:80,444:443' router3
The created HAProxy’s maxconn will be 200000, and the ports exposed by the router3 service are 81 and 444, but the HAProxy pod itself still listens on ports 80 and 443.
(2) The second method is to set environment variables on the router’s deployment config (dc).
A complete list of environment variables can be found in the official documentation here. For example, after running the following command,
oc set env dc/router3 ROUTER_SERVICE_HTTPS_PORT=444 ROUTER_SERVICE_HTTP_PORT=81 STATS_PORT=1937
router3 will be redeployed, and the newly deployed HAProxy will listen for https on port 444 and for http on port 81, with the statistics port set to 1937.
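To confirm which variables are in effect, the deployment config’s current environment can be listed:
# List the environment variables currently set on the router dc
oc set env dc/router3 --list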
2.4 OpenShift Passthrough Type Route and HAProxy Backend
(1) Create a route through OpenShift Console or oc command to expose the jenkins service of the sit project to the domain sitjenkins.com.cn:
Create the route in the interface:
The result:
Name: sitjenkins.com.cn
Namespace: sit
Labels: app=jenkins-ephemeral
template=jenkins-ephemeral-template
Annotations: <none>
Requested Host: sitjenkins.com.cn
Path: <none>
TLS Termination: passthrough
Endpoint Port: web
Service: jenkins
Weight: 100 (100%)
Endpoints: 10.128.2.15:8080, 10.131.0.10:8080
Here, the service name acts as an intermediary, connecting the route and the service’s endpoints (which are the pods).
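The same route can also be created from the command line instead of the console; a sketch using oc create route with the names from this example:
# Create a passthrough route for the jenkins service in the sit project
oc create route passthrough sitjenkins.com.cn --service=jenkins --hostname=sitjenkins.com.cn -n sit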
(2) The configuration file of the HAProxy process in the two pods of the router service has an additional backend:
# Secure backend, pass through
backend be_tcp:sit:sitjenkins.com.cn
balance source
hash-type consistent
timeout check 5000ms
server pod:jenkins-1-bqhfj:jenkins:10.128.2.15:8080 10.128.2.15:8080 weight 256 check inter 5000ms
server pod:jenkins-1-h2fff:jenkins:10.131.0.10:8080 10.131.0.10:8080 weight 256 check inter 5000ms
These backend servers are the pods themselves, which OpenShift located through the service name in step (1). balance specifies the load balancing strategy, which is explained later.
(3) The file /var/lib/haproxy/conf/os_sni_passthrough.map has an additional record
sh-4.2$ cat /var/lib/haproxy/conf/os_sni_passthrough.map
^sitjenkins\.com\.cn(:[0-9]+)?(/.*)?$ 1
(4) The file /var/lib/haproxy/conf/os_tcp_be.map has an additional record
sh-4.2$ cat /var/lib/haproxy/conf/os_tcp_be.map
^sitjenkins\.com\.cn(:[0-9]+)?(/.*)?$ be_tcp:sit:sitjenkins.com.cn
(5) Based on the above map files, HAProxy selects the backend for this route as follows:
frontend public_ssl # Explanation: Frontend protocol https,
bind :443 ## Frontend port 443
tcp-request inspect-delay 5s
tcp-request content accept if { req_ssl_hello_type 1 }
# if the connection is SNI and the route is a passthrough don't use the termination backend, just use the tcp backend
# for the SNI case, we also need to compare it in case-insensitive mode (by converting it to lowercase) as RFC 4343 says
acl sni req.ssl_sni -m found ## Check whether the https request carries SNI
acl sni_passthrough req.ssl_sni,lower,map_reg(/var/lib/haproxy/conf/os_sni_passthrough.map) -m found ## Check whether the hostname passed via SNI is in the os_sni_passthrough.map file
use_backend %[req.ssl_sni,lower,map_reg(/var/lib/haproxy/conf/os_tcp_be.map)] if sni sni_passthrough ## Look up the backend name in os_tcp_be.map by the SNI hostname
# if the route is SNI and NOT passthrough enter the termination flow
use_backend be_sni if sni
# non SNI requests should enter a default termination backend rather than the custom cert SNI backend since it
# will not be able to match a cert to an SNI host
default_backend be_no_sni
(6) The HAProxy process will restart to apply the modified configuration file.
Some background knowledge is needed to understand the script in (5):
- SNI: TLS Server Name Indication (SNI) is an extension of the TLS protocol that allows the client to tell the server, before the TLS handshake, the hostname it is connecting to, so that the server can return the appropriate certificate for that hostname. This lets a server support multiple certificates for multiple hostnames. For more details, refer to here.
- OpenShift passthrough route: this type of route does not terminate the SSL connection on the router; instead, the router passes the encrypted connection directly to the pod, so no certificates or keys need to be configured on the router.
- HAProxy support for SNI: HAProxy selects the specific backend based on the hostname carried in the SNI information. For more details, refer to here.
- HAProxy ACLs: for more details, refer to here.
From the inline comments above, we can see that the HAProxy process uses the domain sitjenkins.com.cn, passed via SNI in the https request, to look up the backend name be_tcp:sit:sitjenkins.com.cn in the os_tcp_be.map file, which corresponds to the backend shown in step (2).
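This SNI-based selection can be verified from outside the cluster; a sketch, assuming the router is reachable at the hypothetical address 192.168.1.10:
# Send a TLS ClientHello with SNI set to the route hostname and check which
# certificate comes back (for passthrough, it is the pod's own certificate)
openssl s_client -connect 192.168.1.10:443 -servername sitjenkins.com.cn </dev/null
# Or point the hostname at the router explicitly with curl
curl -kv --resolve sitjenkins.com.cn:443:192.168.1.10 https://sitjenkins.com.cn/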
The router used by OpenShift employs HAProxy to implement domain-based load balancing routing, as illustrated below. For specific instructions, please refer to the official documentation.
2.5 OpenShift Edge and Re-encrypt Type Routes with HAProxy
HAProxy Frontend: The frontend still listens for external HTTPS requests on port 443
frontend public_ssl
bind :443
.....
# if the route is SNI and NOT passthrough enter the termination flow
use_backend be_sni if sni
However, when the TLS termination type is not passthrough (edge or re-encrypt), it will use the backend be_sni.
backend be_sni
server fe_sni 127.0.0.1:10444 weight 1 send-proxy
This backend points at 127.0.0.1:10444 on the local host, which is where the frontend fe_sni listens, so traffic is handed over to fe_sni:
frontend fe_sni
# terminate ssl on edge
bind 127.0.0.1:10444 ssl no-sslv3 crt /var/lib/haproxy/router/certs/default.pem crt-list /var/lib/haproxy/conf/cert_config.map accept-proxy
mode http
......
# map to backend
# Search from most specific to general path (host case).
# Note: If no match, haproxy uses the default_backend, no other
# use_backend directives below this will be processed.
use_backend %[base,map_reg(/var/lib/haproxy/conf/os_edge_reencrypt_be.map)]
default_backend openshift_default
Mapping file:
sh-4.2$ cat /var/lib/haproxy/conf/os_edge_reencrypt_be.map
^edgejenkins\.com\.cn(:[0-9]+)?(/.*)?$ be_edge_http:sit:jenkins-edge
HAProxy backend for Edge type route:
backend be_edge_http:sit:jenkins-edge
mode http
option redispatch
option forwardfor
balance leastconn
timeout check 5000ms
.....
server pod:jenkins-1-bqhfj:jenkins:10.128.2.15:8080 10.128.2.15:8080 cookie 71c6bd03732fa7da2f1b497b1e4c7993 weight 256 check inter 5000ms
server pod:jenkins-1-h2fff:jenkins:10.131.0.10:8080 10.131.0.10:8080 cookie fa8d7fb72a46958a7add1406e6d26cc8 weight 256 check inter 5000ms
HAProxy backend for Re-encrypt type route:
# Plain http backend or backend with TLS terminated at the edge or a
# secure backend with re-encryption.
backend be_secure:sit:reencryptjenkins.com.cn
mode http
...
http-request set-header X-Forwarded-Host %[req.hdr(host)]
http-request set-header X-Forwarded-Port %[dst_port]
http-request set-header X-Forwarded-Proto http if !{ ssl_fc }
http-request set-header X-Forwarded-Proto https if { ssl_fc }
http-request set-header X-Forwarded-Proto-Version h2 if { ssl_fc_alpn -i h2 }
server pod:jenkins-1-bqhfj:jenkins:10.128.2.15:8080 10.128.2.15:8080 cookie ... weight 256 ssl verifyhost jenkins.sit.svc verify required ca-file /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt check inter 5000ms # the link to the backend is encrypted with ssl, and the hostname is verified
server pod:jenkins-1-h2fff:jenkins:10.131.0.10:8080 10.131.0.10:8080 cookie ... weight 256 ssl verifyhost jenkins.sit.svc verify required ca-file /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt check inter 5000ms
Here it can be seen that the connection to the backend is re-encrypted using keys. Note that mode http does not mean plaintext: in HAProxy, mode only selects layer 4 (tcp) versus layer 7 (http) processing, while TLS toward the backend is enabled per server via the ssl keyword, which is why there is no separate ‘https’ mode.
2.6 Setting and Modifying Route Configurations
Route configurations mainly include the following important aspects:
(1) SSL termination methods. There are three types:
- Edge: TLS is terminated on the router, and unencrypted packets are forwarded to the backend pod, so a TLS certificate needs to be installed on the router. If none is installed, the router’s default certificate is used.
- Passthrough: encrypted packets are sent directly to the pod; the router does not terminate TLS, so no certificates or keys need to be configured on the router.
- Re-encryption: a variant of edge. The router first terminates TLS with one certificate, then re-encrypts the traffic with another certificate before sending it to the backend pod, so the entire network path is encrypted.
Settings:
- Can be set when creating the route, or modified later through the route’s termination configuration item; a CLI sketch follows this list.
- For specific details, refer to the official documentation here.
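For example, an edge route with its own certificate can be created from the CLI; a sketch, with the hostname taken from the earlier map file and the certificate file names hypothetical:
# Create an edge route, supplying the certificate and key the router will use
# to terminate TLS for this hostname
oc create route edge jenkins-edge --service=jenkins --cert=tls.crt --key=tls.key --hostname=edgejenkins.com.cn -n sit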
(2) Load balancing policies.
There are three strategies:
- Round Robin (roundrobin): uses all backends in turn, according to their weights.
- Least Connections (leastconn): sends each request to the backend with the fewest connections.
- Source (source): hashes the source IP so that requests from the same source IP always go to the same backend.
Settings:
- To modify the load balancing strategy for the entire router, use the ROUTER_TCP_BALANCE_SCHEME environment variable to set the strategy for all passthrough routes, and ROUTER_LOAD_BALANCE_ALGORITHM for other types of routes.
- Use the haproxy.router.openshift.io/balance annotation to set the strategy for a specific route.
For example:
- Set the environment variable for the entire router: oc set env dc/router ROUTER_TCP_BALANCE_SCHEME=roundrobin
- After this modification, the router instance is redeployed, and all passthrough routes use roundrobin. The default is source.
- Modify the strategy for a specific route: oc edit route aaaa.svc.cluster.local, then set the haproxy.router.openshift.io/balance annotation, for example to leastconn.
After the modification, the balance value in that route’s backend in the HAProxy configuration changes to leastconn. A non-interactive alternative is shown below.
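The per-route change can also be applied without an interactive edit; a sketch using oc annotate (route name as in the example above):
# Set the per-route load balancing strategy via the annotation
oc annotate route/aaaa.svc.cluster.local haproxy.router.openshift.io/balance=leastconn --overwrite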
3. How to Troubleshoot Common Issues?
From the above analysis, it can be seen that to ensure that both the router and route work properly, at least the following aspects must be functioning correctly:
1. The client uses the domain name and port configured in the route to access the service.
2. DNS resolves the domain name to the server where the target router is located (this is more complex with sharded configurations and deserves special attention).
3. If another layer 4 load balancer sits in front, it is configured correctly and operational.
4. HAProxy can match the correct backend based on the domain name.
5. The router and route configurations are correctly reflected in the HAProxy configuration file.
6. The HAProxy process has been restarted and has therefore loaded the newly modified configuration file.
7. The backend pod list is correct, and at least one pod is working properly.
If you see the error page below, it means at least one of points 3 through 7 is not working, and you can troubleshoot those areas one by one; a few starting points are sketched below.
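A few hedged starting points for that troubleshooting (the router pod name is hypothetical; the file paths are the ones shown earlier):
# Points 5 and 6: check that the route appears in the generated HAProxy config and map files
oc rsh router-1-xxxxx grep sitjenkins /var/lib/haproxy/conf/haproxy.config
oc rsh router-1-xxxxx cat /var/lib/haproxy/conf/os_tcp_be.map
# Point 7: check that the service behind the route has healthy endpoints
oc get endpoints jenkins -n sit
# The router logs often reveal reload errors and backend health-check failures
oc logs router-1-xxxxx -n default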
Thank you for reading.