nekop's blog

An insider on OpenShift / JBoss / WildFly / Infinispan http://twitter.com/nekop

OpenShift Routes and the Router

OpenShift 全部俺 (All by Myself) Advent Calendar 2017

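OpenShift provides Route and Router out of the box. They are its counterparts to what Kubernetes calls Ingress and the Ingress Controller. Ingress did not exist back in the Kubernetes 1.0 days, so OpenShift implemented its own.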

https://docs.openshift.org/latest/architecture/networking/haproxy-router.html

https://docs.openshift.org/latest/architecture/networking/routes.html

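The OpenShift Router uses HAProxy as its front-end load balancer. The container's main process is openshift-router, which starts HAProxy, updates its configuration, and reloads it. The Router uses the host network (hostNetwork) to listen on ports 80 and 443, and forwards traffic to the Pod endpoints that correspond to the HTTP request's Host header or the TLS SNI hostname. A common misconception is that the Router forwards requests to a Service. It does not: a Service is basically an L4 load balancer with incomplete health checking, so the Router registers the Pods' Endpoints as its backend servers and forwards to them directly. I think one cause of this misunderstanding is that oc expose creates a Route when you point it at a Service, but a look at the configuration file haproxy.config shows that the Service is not in the request path.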
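To check this yourself, you can compare a Service's Endpoints with the backend server lines in the generated haproxy.config. The following is a minimal sketch, not an official procedure: the helper function names are hypothetical, and it assumes you are logged in with access to the default project and that a Route already exists for the Service. The server lines should list the same Pod IP:port pairs as the Endpoints rather than the Service's cluster IP.

```shell
# Sketch only: helper names are made up for illustration.

# List the Pod IP:port pairs behind a Service.
# $1 = Service name, $2 = project (defaults to "default").
show_service_endpoints() {
  oc -n "${2:-default}" get endpoints "$1"
}

# Print the backend "server" lines from the generated HAProxy configuration.
show_router_backends() {
  oc -n default rsh dc/router grep -E '^[[:space:]]*server[[:space:]]' /var/lib/haproxy/conf/haproxy.config
}

# Usage (once a Route exists for the Service):
#   show_service_endpoints docker-registry
#   show_router_backends
```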

The OpenShift Router is deployed in the default project.

$ oc project default
$ oc get all -o wide
NAME                                REVISION   DESIRED   CURRENT   TRIGGERED BY
deploymentconfigs/docker-registry   1          1         1         config
deploymentconfigs/router            1          1         1         config

NAME                               READY     STATUS      RESTARTS   AGE       IP                NODE
po/docker-registry-1-694ml         1/1       Running     2          14d       172.17.0.2        localhost
po/persistent-volume-setup-hpd4l   0/1       Completed   0          14d       172.17.0.2        localhost
po/router-1-sv5j8                  1/1       Running     2          14d       192.168.124.246   localhost

NAME                   DESIRED   CURRENT   READY     AGE       CONTAINER(S)   IMAGE(S)                                  SELECTOR
rc/docker-registry-1   1         1         1         14d       registry       openshift/origin-docker-registry:v3.7.0   deployment=docker-registry-1,deploymentconfig=docker-registry,docker-registry=default
rc/router-1            1         1         1         14d       router         openshift/origin-haproxy-router:v3.7.0    deployment=router-1,deploymentconfig=router,router=router

NAME                  CLUSTER-IP     EXTERNAL-IP   PORT(S)                   AGE       SELECTOR
svc/docker-registry   172.30.1.1     <none>        5000/TCP                  14d       docker-registry=default
svc/kubernetes        172.30.0.1     <none>        443/TCP,53/UDP,53/TCP     14d       <none>
svc/router            172.30.40.34   <none>        80/TCP,443/TCP,1936/TCP   14d       router=router

NAME                           DESIRED   SUCCESSFUL   AGE       CONTAINER(S)        IMAGE(S)                  SELECTOR
jobs/persistent-volume-setup   1         1            14d       storage-setup-job   openshift/origin:v3.7.0   controller-uid=26ab5554-d62e-11e7-ace1-525400f3baa1

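Let's create a Route for docker-registry and access it. The flow is: your machine -> Router host -> the Router's HAProxy -> target Pod. Set up DNS so that the hostname used in the application URL resolves to the Router host. Note that minishift sets up docker-registry over http without a Route, whereas a regular installation comes with an https passthrough Route already configured.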

$ oc expose service docker-registry 
route "docker-registry" exposed
$ oc get route
NAME              HOST/PORT                                       PATH      SERVICES          PORT       TERMINATION   WILDCARD
docker-registry   docker-registry-default.192.168.42.225.nip.io             docker-registry   5000-tcp                 None
$ curl http://docker-registry-default.192.168.42.225.nip.io/v2/
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":null}]}
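The generated hostname above follows the name-namespace.suffix pattern, the same shape as the '${name}-${namespace}...' example in the router's --hostname-template help text. Because the Router picks the backend from the Host header, you can also skip DNS and send the request straight to the Router host's IP. The sketch below illustrates both points; the function names are hypothetical, and it assumes the Router host is reachable at the IP embedded in the nip.io name.

```shell
# Sketch only: helper names are made up for illustration.

# Build a route hostname of the form <route name>-<namespace>.<routing suffix>.
route_host() {
  printf '%s-%s.%s\n' "$1" "$2" "$3"
}

# Send an HTTP request directly to the Router host and select the Route
# through the Host header. $1 = Router host IP, $2 = route hostname, $3 = path.
curl_via_router() {
  curl -s -H "Host: $2" "http://$1$3"
}

# Usage:
#   host=$(route_host docker-registry default 192.168.42.225.nip.io)
#   curl_via_router 192.168.42.225 "$host" /v2/
```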

Let's look inside the Router. There are two processes: openshift-router and haproxy.

$ oc rsh dc/router sh -c "ps -eaf | cat"
UID        PID  PPID  C STIME TTY          TIME CMD
1001         1     0  0 Dec11 ?        00:16:49 /usr/bin/openshift-router
1001       248     1  0 Dec11 ?        00:00:16 /usr/sbin/haproxy -f /var/lib/haproxy/conf/haproxy.config -p /var/lib/haproxy/run/haproxy.pid -sf 224

The contents of /var/lib/haproxy are laid out as follows.

$ oc rsh dc/router sh -c "pwd && ls -Rla /var/lib/haproxy"
/var/lib/haproxy/conf
/var/lib/haproxy:
total 20
drwxrwxr-x.  7 haproxy root  122 Nov 29 16:57 .
drwxr-xr-x. 16 root    root  211 Nov 29 16:57 ..
-rw-rwSr--.  1    1001 1003   30 Nov 13 22:57 .cccp.yml
-rw-rwSr--.  1    1001 1003 1327 Nov 13 22:57 Dockerfile
drwxrwxr-x.  2 root    root    6 Nov 29 16:57 bin
drwxrwsr-x.  2    1001 1003 4096 Nov 29 16:22 conf
drwxrwxr-x.  2 root    root    6 Nov 29 16:57 log
-rwxrwsr-x.  1    1001 1003 4979 Nov 13 22:57 reload-haproxy
drwxrwxr-x.  4 root    root   53 Dec 11 06:07 router
drwxrwxr-x.  2 root    root   45 Dec 11 06:13 run

/var/lib/haproxy/bin:
total 0
drwxrwxr-x. 2 root    root   6 Nov 29 16:57 .
drwxrwxr-x. 7 haproxy root 122 Nov 29 16:57 ..

/var/lib/haproxy/conf:
total 72
drwxrwsr-x. 2    1001 1003  4096 Nov 29 16:22 .
drwxrwxr-x. 7 haproxy root   122 Nov 29 16:57 ..
-rw-rw-r--. 1 root    root     0 Dec 11 06:13 cert_config.map
-rw-rwSr--. 1    1001 1003  2035 Nov 13 22:57 default_pub_keys.pem
-rw-rwSr--. 1    1001 1003  3278 Nov 13 22:57 error-page-503.http
-rw-rwSr--. 1    1001 1003 30599 Nov 13 22:57 haproxy-config.template
-rw-rw-r--. 1 root    root 10051 Dec 11 06:13 haproxy.config
-rw-rw-r--. 1 root    root    88 Dec 11 06:13 os_edge_http_be.map
-rw-rw-r--. 1 root    root    94 Dec 11 06:13 os_http_be.map
-rw-rw-r--. 1 root    root     0 Dec 11 06:13 os_reencrypt.map
-rw-rw-r--. 1 root    root     0 Dec 11 06:13 os_route_http_expose.map
-rw-rw-r--. 1 root    root    88 Dec 11 06:13 os_route_http_redirect.map
-rw-rw-r--. 1 root    root     0 Dec 11 06:13 os_sni_passthrough.map
-rw-rw-r--. 1 root    root     0 Dec 11 06:13 os_tcp_be.map
-rw-rw-r--. 1 root    root     1 Dec 11 06:13 os_wildcard_domain.map

/var/lib/haproxy/log:
total 0
drwxrwxr-x. 2 root    root   6 Nov 29 16:57 .
drwxrwxr-x. 7 haproxy root 122 Nov 29 16:57 ..

/var/lib/haproxy/router:
total 4
drwxrwxr-x. 4 root    root   53 Dec 11 06:07 .
drwxrwxr-x. 7 haproxy root  122 Nov 29 16:57 ..
drwxrwxr-x. 2 root    root    6 Nov 29 16:57 cacerts
drwxrwxr-x. 2 root    root    6 Nov 29 16:57 certs
-rw-r--r--. 1    1001 root 1378 Dec 11 06:13 routes.json

/var/lib/haproxy/router/cacerts:
total 0
drwxrwxr-x. 2 root root  6 Nov 29 16:57 .
drwxrwxr-x. 4 root root 53 Dec 11 06:07 ..

/var/lib/haproxy/router/certs:
total 0
drwxrwxr-x. 2 root root  6 Nov 29 16:57 .
drwxrwxr-x. 4 root root 53 Dec 11 06:07 ..

/var/lib/haproxy/run:
total 4
drwxrwxr-x. 2 root    root  45 Dec 11 06:13 .
drwxrwxr-x. 7 haproxy root 122 Nov 29 16:57 ..
-rw-r--r--. 1    1001 root   4 Dec 11 06:13 haproxy.pid
srw-------. 1    1001 root   0 Dec 11 06:13 haproxy.sock

openshift-router generates haproxy.config from haproxy-config.template, but the template file is read only once, at startup, so note that editing it inside the container with oc rsh or similar has no effect. To customize it, replace the template with a ConfigMap.
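A ConfigMap-based replacement could look roughly like the sketch below, which follows the approach described in the OpenShift 3.x router customization documentation. The function names, the ConfigMap name customrouter, the volume name config-volume, and the mount path /var/lib/haproxy/conf/custom are choices made for this example. It assumes the router runs as dc/router in the default project.

```shell
# Sketch only: helper names, ConfigMap name, and mount path are illustrative.

# Copy the stock template out of the running router into a local file ($1).
export_router_template() {
  oc -n default rsh -T dc/router cat /var/lib/haproxy/conf/haproxy-config.template > "$1"
}

# Store the edited template ($1) in a ConfigMap, mount it into the router,
# and point the router at it through the TEMPLATE_FILE environment variable.
use_custom_template() {
  oc -n default create configmap customrouter \
    --from-file=haproxy-config.template="$1" &&
  oc -n default set volume dc/router --add --overwrite \
    --name=config-volume \
    --mount-path=/var/lib/haproxy/conf/custom \
    --source='{"configMap": {"name": "customrouter"}}' &&
  oc -n default set env dc/router \
    TEMPLATE_FILE=/var/lib/haproxy/conf/custom/haproxy-config.template
}

# Usage:
#   export_router_template haproxy-config.template
#   (edit haproxy-config.template locally)
#   use_custom_template haproxy-config.template
```

Changing the DeploymentConfig this way rolls out a new router Pod, so the new template is read at that Pod's startup.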

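HAProxy supports only syslog as a log destination, so in production you should at least set ROUTER_SYSLOG_ADDRESS. This is awkward to work with, and there has been a long-running discussion about improving it in OpenShift, but it has not been resolved yet.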
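Setting it could look like the following sketch. The function name is hypothetical, and the syslog server address is an assumption: you need a syslog server that the router can actually reach. ROUTER_LOG_LEVEL is the documented companion variable that controls the level sent to syslog; the default of info used here is just this example's choice.

```shell
# Sketch only: helper name is made up; the syslog address is environment-specific.

# Point the router's HAProxy logging at a syslog server.
# $1 = syslog server address, $2 = log level (defaults to "info" in this sketch).
enable_router_syslog() {
  oc -n default set env dc/router \
    "ROUTER_SYSLOG_ADDRESS=$1" \
    "ROUTER_LOG_LEVEL=${2:-info}"
}

# Usage (hypothetical syslog server at 10.0.0.5):
#   enable_router_syslog 10.0.0.5
#   enable_router_syslog 10.0.0.5 debug
```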

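Besides the settings via annotations and environment variables covered in the documentation, there are also some settings available as command-line parameters, so I list them here.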

$ oc rsh dc/router openshift-router -h
Start a router 

This command launches a router connected to your cluster master. The router listens for routes and endpoints created by users and keeps a local router configuration up to date with those changes. 

You may customize the router by providing your own --template and --reload scripts. 

The router must have a default certificate in pem format. You may provide it via --default-cert otherwise one is automatically created. 

You may restrict the set of routes exposed to a single project (with --namespace), projects your client has access to with a set of labels (--project-labels), namespaces matching a label (--namespace-labels), or all namespaces (no argument). You can limit the routes to those matching a --labels or --fields selector. Note that you must have a cluster-wide administrative role to view all namespaces.

Usage:
  openshift-router --master=<addr> [flags]
  openshift-router [command]

Available Commands:
  version     Display client and server versions

Flags:
      --allow-wildcard-routes                    Allow wildcard host names for routes
      --allowed-domains stringSlice              List of comma separated domains to allow in routes. If specified, only the domains in this list will be allowed routes. Note that domains in the denied list take precedence over the ones in the allowed list
      --as string                                Username to impersonate for the operation
      --as-group stringArray                     Group to impersonate for the operation, this flag can be repeated to specify multiple groups.
      --azure-container-registry-config string   Path to the file container Azure container registry configuration information.
      --bind-ports-after-sync                    Bind ports only after route state has been synchronized
      --certificate-authority string             Path to a cert file for the certificate authority
      --ciphers string                           Specifies the cipher suites to use. You can choose a predefined cipher set ('modern', 'intermediate', or 'old') or specify exact cipher suites by passing a : separated list.
      --client-certificate string                Path to a client certificate file for TLS
      --client-key string                        Path to a client key file for TLS
      --cluster string                           The name of the kubeconfig cluster to use
      --config string                            Path to the config file to use for CLI requests.
      --context string                           The name of the kubeconfig context to use
      --default-certificate string               The contents of a default certificate to use for routes that don't expose a TLS server cert; in PEM format
      --default-certificate-dir string           A path to a directory that contains a file named tls.crt. If tls.crt is not a PEM file which also contains a private key, it is first combined with a file named tls.key in the same directory. The PEM-format contents are then used as the default certificate. Only used if default-certificate and default-certificate-path are not specified. (default "/etc/pki/tls/private")
      --default-certificate-path string          A path to default certificate to use for routes that don't expose a TLS server cert; in PEM format (default "/etc/pki/tls/private/tls.crt")
      --default-destination-ca-path string       A path to a PEM file containing the default CA bundle to use with re-encrypt routes. This CA should sign for certificates in the Kubernetes DNS space (service.namespace.svc). (default "/var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt")
      --denied-domains stringSlice               List of comma separated domains to deny in routes
      --disable-namespace-ownership-check        Disables the namespace ownership checks for a route host with different paths or for overlapping host names in the case of wildcard routes. Please be aware that if namespace ownership checks are disabled, routes in a different namespace can use this mechanism to 'steal' sub-paths for existing domains. This is only safe if route creation privileges are restricted, or if all the users can be trusted.
      --enable-ingress                           Enable configuration via ingress resources
      --extended-validation                      If set, then an additional extended validation step is performed on all routes admitted in by this router. Defaults to true and enables the extended validation checks. (default true)
      --fields string                            A field selector to apply to routes to watch
      --google-json-key string                   The Google Cloud Platform Service Account JSON Key to use for authentication.
      --hostname-template string                 If specified, a template that should be used to generate the hostname for a route without spec.host (e.g. '${name}-${namespace}.myapps.mycompany.com')
      --include-udp-endpoints                    If true, UDP endpoints will be considered as candidates for routing
      --insecure-skip-tls-verify                 If true, the server's certificate will not be checked for validity. This will make your HTTPS connections insecure
      --interval duration                        Controls how often router reloads are invoked. Mutiple router reload requests are coalesced for the duration of this interval since the last reload time. (default 5s)
      --kubernetes string                        The address of the Kubernetes server (host, host:port, or URL). If omitted defaults to the master. (default "http://localhost:8080")
      --labels string                            A label selector to apply to the routes to watch
      --listen-addr string                       The name of an interface to listen on to expose metrics and health checking. If not specified, will not listen. Overrides stats port. (default "0.0.0.0:1936")
      --log-flush-frequency duration             Maximum number of seconds between log flushes (default 5s)
      --loglevel int32                           Set the level of log output (0-10)
      --logspec string                           Set per module logging with file|pattern=LEVEL,...
      --master string                            The address the master can be reached on (host, host:port, or URL). (default "http://localhost:8080")
      --max-connections string                   Specifies the maximum number of concurrent connections.
      --metrics-type string                      Specifies the type of metrics to gather. Supports 'haproxy'. (default "haproxy")
      --name string                              The name the router will identify itself with in the route status (default "router")
  -n, --namespace string                         If present, the namespace scope for this CLI request
      --namespace-labels string                  A label selector to apply to namespaces to watch
      --override-hostname                        Override the spec.host value for a route with --hostname-template
      --project-labels string                    A label selector to apply to projects to watch; if '*' watches all projects the client can access
      --reload string                            The path to the reload script to use (default "/var/lib/haproxy/reload-haproxy")
      --request-timeout string                   The length of time to wait before giving up on a single server request. Non-zero values should contain a corresponding time unit (e.g. 1s, 2m, 3h). A value of zero means don't timeout requests. (default "0")
      --resync-interval duration                 The interval at which the route list should be fully refreshed (default 10m0s)
      --router-canonical-hostname string         CanonicalHostname is the external host name for the router that can be used as a CNAME for the host requested for this route. This value is optional and may not be set in all cases.
      --server string                            The address and port of the Kubernetes API server
      --stats-password string                    If the underlying router implementation can provide statistics this is the requested password for auth. (default "x6RVrWmXwV")
      --stats-port string                        If the underlying router implementation can provide statistics this is a hint to expose it on this port. Ignored if listen-addr is specified. (default "1936")
      --stats-user string                        If the underlying router implementation can provide statistics this is the requested username for auth. (default "admin")
      --strict-sni                               Use strict-sni bind processing (do not use default cert).
      --template string                          The path to the template file to use (default "/var/lib/haproxy/conf/haproxy-config.template")
      --token string                             Bearer token for authentication to the API server
      --user string                              The name of the kubeconfig user to use
      --version version[=true]                   Print version information and quit
      --working-dir string                       The working directory for the router plugin (default "/var/lib/haproxy/router")

Use "openshift-router [command] --help" for more information about a command.