veza/ansible/roles/haproxy/readme.md

This role will install haproxy from the official repository `http://haproxy.debian.net`.
<!-- TOC -->
* [Important](#important)
* [Mandatory variables](#mandatory-variables)
* [TLS profiles](#tls-profiles)
* [Optional variables](#optional-variables)
* [Frontends](#frontends)
* [letsencrypt automatic certificate generation](#letsencrypt-automatic-certificate-generation)
* [Coraza WAF installation](#coraza-waf-installation)
* [IIS specific headers for https](#iis-specific-headers-for-https)
* [Issues with compression](#issues-with-compression)
* [haproxy and journald](#haproxy-and-journald)
* [haproxy documentation](#haproxy-documentation)
<!-- TOC -->
# Important
This role assumes that HAProxy always serves HTTPS.
This role does not currently manage the HTTPS certificates and private keys. HAProxy looks for files in `/usr/local/etc/tls/haproxy`: each file there must contain the private key, the certificate and the full chain (yes, everything in one file!).
HAProxy automatically answers HTTPS requests that use SNI with the matching certificate.
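
As an illustration of the expected file layout, the following Ansible task is a minimal sketch of how such a combined file could be assembled. It is not part of this role: the source paths under `files/tls/`, the `.pem` file name, and the ownership and mode values are all assumptions to adapt to your setup.

```yaml
# Hypothetical task, NOT provided by this role.
# It writes one file containing the private key followed by the full chain.
- name: Assemble the combined PEM file for one site
  ansible.builtin.copy:
    dest: "/usr/local/etc/tls/haproxy/identity.talas.com.pem"  # assumed file name
    content: |
      {{ lookup('ansible.builtin.file', 'files/tls/identity.talas.com.key') }}
      {{ lookup('ansible.builtin.file', 'files/tls/identity.talas.com.fullchain.pem') }}
    owner: root   # assumption
    group: root   # assumption
    mode: "0600"  # assumption: the file holds a private key
```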
# Mandatory variables
This role uses objects (YAML dictionaries and lists) to define configuration parameters.
The haproxy version is mandatory, but it should already be defined in `group_vars/all/software_versions`, so except in very specific cases (such as testing a new version), you don't need to override it:
```
haproxy_version: "2.8"
```
For the backends, you can define several of them this way:
```
haproxy_backend:
  - name: "identity-test"
    balance: "roundrobin" # this is the default and can be omitted
    server:
      - name: "id-test-1" # if undefined, takes the value of the "fqdn"
        fqdn: "identity-test-node-1.talas.com" # if undefined, takes the value of the "name"
        port: "8080"
      - name: "id-test-2"
        fqdn: "identity-test-node-2.talas.com"
        port: "8080"
        proto: "h2"
        check: "check inter 2s fastinter 2s downinter 2s" # default is "check"
        options: "string containing the options for this server, this is optional"
```
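
Given the defaults mentioned in the comments above, a backend can be declared more briefly. The following is a minimal sketch that relies on those defaults: `balance` falls back to `roundrobin`, the server `name` takes the value of `fqdn`, and `check` falls back to `"check"`. It assumes that `port` must still be given explicitly.

```yaml
haproxy_backend:
  - name: "identity-test"
    server:
      # "name" is omitted, so it takes the value of "fqdn"
      - fqdn: "identity-test-node-1.talas.com"
        port: "8080"  # assumption: port has no default and must be set
```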
Unfortunately, this role currently cannot find out which certificates are active, and therefore which ones Zabbix should monitor, so you must list the HTTPS websites in this variable:
```
haproxy_https_monitoring:
- identity.talas.com
```
# TLS profiles
_Changelog of the TLS parameter:_
- 2025-01: The "old" profile can no longer be used because of audits from our customers. It is kept for historical reasons only.
- 2025-03: migration of the last service using the "old" profile to the "intermediate" profile
- TO DO: actually delete the "old" profile so that it cannot be used anymore
The TLS configuration is generated with https://ssl-config.mozilla.org/#server=haproxy&version=2.8.
The default profile is "intermediate" (which supports TLS 1.2+) but you can switch it to modern (which supports TLS 1.3+) via this variable:
```
haproxy_tls_profile: "modern"
```
# Optional variables
You can change the default backend of the frontend:
```
haproxy_frontend:
  default_backend: "error404"
```
This role sets the default maximum number of connections to 20000 (the default in vanilla haproxy is 500). You can adjust this with this variable:
```
haproxy_maxconn: 20000
```
You can also adjust the timeout values of haproxy, which are explained here:
- https://serverfault.com/questions/504308/by-what-criteria-do-you-tune-timeouts-in-ha-proxy-config
- https://thehftguy.com/2016/05/22/configuring-timeouts-in-haproxy/

The defaults are:
```yml
haproxy_timeout_connect: "5s"
haproxy_timeout_client: "50s"
haproxy_timeout_server: "50s"
```
From [haproxy documentation](https://www.haproxy.org/download/2.8/doc/configuration.txt):
> In TCP mode (and to a lesser extent, in HTTP mode), it is highly recommended that the client timeout remains equal to the server timeout in order to avoid complex situations to debug.
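
Following that recommendation, an override would normally raise the client and server timeouts together. A minimal sketch, where the `120s` value is an arbitrary illustration and not a recommendation from this role:

```yaml
haproxy_timeout_connect: "5s"    # unchanged default
haproxy_timeout_client: "120s"   # illustrative value
haproxy_timeout_server: "120s"   # kept equal to the client timeout
```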

You can serve the same robots.txt on all frontends via this variable:
```
haproxy_robotstxt: True
```
When set to true, the URL `/robots.txt` returns:
```
User-agent: *
Disallow: /
```
This is useful when backends should not be indexed.
You can also use the robots.txt backend in only some cases. To do so, reference the robots_txt acl in your own rule. Example:
```
acl something hdr(host) something.example.org
use_backend robotstxt if is_robots_txt something
```
The default acl `robotstxt` is in the standard frontend.
You can define several user lists to protect pages with HTTP basic authentication (basic_auth):
```yaml
haproxy_userlist:
  mailcatcher:
    - bolle_mailcatcher
    - user2
```
In this example:
- `mailcatcher` is the userlist name which you can specify in your haproxy configuration
- `bolle_mailcatcher` and `user2` are the users
Passwords are automatically generated by the role and added to HashiCorp Vault. If you wish, you can define them in advance, using this variable name:
```yaml
haproxy_basicauth_%USERNAME%_password # replace %USERNAME% with the username you've defined
```
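
For instance, with the `user2` user from the user list above, the variable name would expand as follows. This is only a sketch of the naming pattern: the value is a placeholder, and where you define the variable (host vars, group vars, or elsewhere) is an assumption left to your inventory layout.

```yaml
# %USERNAME% replaced by "user2"
haproxy_basicauth_user2_password: "placeholder-password"  # placeholder, not a real secret
```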
Note: the password is written to the haproxy configuration in clear text, to avoid the CPU cost that the haproxy documentation (http://docs.haproxy.org/2.9/configuration.html#3.4-user) warns about:
```text
Attention: Be aware that using encrypted passwords might cause significantly increased CPU usage, depending on the number of requests, and the algorithm used.
For any of the hashed variants, the password for each request must be processed through the chosen algorithm, before it can be compared to the value specified in the config file.
Most current algorithms are deliberately designed to be expensive to compute to achieve resistance against brute force attacks. They do not simply salt/hash the clear text password once, but thousands of times.
This can quickly become a major factor in HAProxy's overall CPU consumption!
```
Example of haproxy configuration:
```yaml
haproxy_frontend_raw_config: |
  acl mailcatcher.bollebrands.com hdr(host) -i mailcatcher.bollebrands.com
  http-request auth if mailcatcher.bollebrands.com !{ http_auth(mailcatcher) } !acme-challenge
  use_backend mailcatcher.bollebrands.com if mailcatcher.bollebrands.com { http_auth_group(mailcatcher) }
```
# Frontends
By default, this role creates a frontend named "https" which has the following default configuration:
```
frontend https
    filter compression
    compression algo gzip
    compression type text/html text/plain text/xml text/css text/csv text/rtf text/richtext application/x-javascript application/javascript application/ecmascript application/rss+xml application/xml application/json application/wasm
    mode http
    bind :443,:::443 v6only ssl crt /usr/local/etc/tls/haproxy alpn h2,http/1.1
    bind :80,:::80 v6only
    http-request set-header X-Forwarded-Proto https if { ssl_fc }
    redirect scheme https code 301 if !{ ssl_fc }
    option forwardfor
    # block access to any git paths
    acl git path,url_dec -m sub /.git
    use_backend error404 if git
    # block access to paths beginning with "/manager" except from 10.0.0.0/8
    acl internal_network src 10.0.0.0/8
    acl manager path,url_dec -m beg /manager
    use_backend error404 if manager !internal_network
    # collapse multiple consecutive slashes into a single slash
    acl has_multiple_slash path_reg /{2,}
    http-request set-path %[path,regsub(/+,/,g)] if has_multiple_slash
```
You can override the "bind" lines with this list:
```
haproxy_frontend:
  bind_list:
    - "127.0.0.1:443 ssl crt /usr/local/etc/tls/haproxy alpn h2,http/1.1"
    - "127.0.0.1:80"
```
You can add a raw configuration to the default frontend with this variable:
```
haproxy_frontend_raw_config: |
  acl admin path,url_dec -m beg /auth/admin
  use_backend error404 if admin !internal_network
```
You can deactivate the default frontend with this variable:
```
haproxy_default_frontend: false
```
You can also define any number of custom frontends with this object:
```
haproxy_frontend_list:
  - name: "something"
    mode: "http" # or "tcp"
    bind_list:
      - "*:389"
      - "1.1.1.1:80"
    config: |
      free field to define the config of the frontend
```
This allows full control over custom frontends for haproxy.
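
As a more concrete illustration of the object above, here is a sketch of an additional HTTP frontend bound to a local address. The frontend name, the bind address, and the use of the `identity-test` backend from the earlier example are assumptions chosen for illustration.

```yaml
haproxy_frontend_list:
  - name: "internal"            # hypothetical frontend name
    mode: "http"
    bind_list:
      - "127.0.0.1:8080"        # hypothetical bind address
    config: |
      # send everything to the backend declared in haproxy_backend
      default_backend identity-test
```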
# letsencrypt automatic certificate generation
/!\ Let's Encrypt automatic certificate generation can only be used on a single-node cluster (no keepalived).
For this to work correctly, you need to have all domains in the `haproxy_https_monitoring` variable. Each domain has its own certificate; alternative names are not supported.
To activate it, set this variable:
```
haproxy_letsencrypt: true
```
During certificate generation and renewal, an HTTP server is created on port 8888 to handle the challenge. The server is started with a simple Python command line and is only active during Let's Encrypt operations.
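
Putting the two requirements together, a single-node setup could look like the sketch below. The second domain is hypothetical and only shows that each listed domain gets its own certificate.

```yaml
haproxy_letsencrypt: true
haproxy_https_monitoring:
  - identity.talas.com
  - identity-admin.talas.com  # hypothetical second domain, gets its own certificate
```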
# Coraza WAF installation
Enable coraza WAF like this:
```
haproxy_coraza: true
```
If the `haproxy_waf_sample_percent` variable is defined, Coraza will be enabled in the default frontend.
However, if `waf_sample_percent` is defined within the `haproxy_frontend_list`, Coraza will be enabled in each frontend where `waf_sample_percent` is explicitly set.
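
The two cases can be sketched as follows. This is an illustration under assumptions: the values are assumed to be percentages between 0 and 100, and the custom frontend (name, bind address, backend) is hypothetical.

```yaml
haproxy_coraza: true

# Case 1: Coraza enabled in the default frontend
haproxy_waf_sample_percent: 100   # assumed to be a percentage

# Case 2: Coraza enabled only in the custom frontends that set waf_sample_percent
haproxy_frontend_list:
  - name: "internal"              # hypothetical frontend
    mode: "http"
    bind_list:
      - "127.0.0.1:8080"
    waf_sample_percent: 10        # assumed to be a percentage
    config: |
      default_backend identity-test
```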
# IIS specific headers for https
For IIS, the header `Front-End-Https On` is the equivalent of `X-Forwarded-Proto https`. To activate it, set this variable to `true`:
```
haproxy_iis: true
```
# Issues with compression
Some MIME types cause problems when compressed, so compression was disabled for them. They are:
```
application/hal+json
application/prs.hal-forms+json
```
See the following tickets for more information:
- https://tracker.talas.com/browse/OP-4916
- https://tracker.talas.com/browse/OP-6532
- https://tracker.talas.com/browse/IT-9018
# haproxy and journald
In the systemd file for haproxy, the following line was added:
```
BindReadOnlyPaths=/dev/log:/var/lib/haproxy/dev/log
```
This line gives haproxy the ability to send its logs to journald. While this looks like a good idea, it is not.
With this line, the logs are duplicated between `/var/log/haproxy.log` and journald. On production, this increases the amount of data written to disk by a factor of 40 (!!!).
With this line: 400KB/s, without: 10KB/s.
This is crazy... and remember that these are duplicate logs that we don't use, since filebeat reads `/var/log/haproxy.log` and ignores journald. This also shows the poor optimisation of journald compared to simple log files, but that is another story.
Anyway, this role removes this line from the service file for all those reasons.
# haproxy documentation
Official documentation can be found at https://www.haproxy.org/download/2.8/doc/configuration.txt (change the version number for the latest if needed).
A part that we often look for is the one that details the "Session state at disconnection", which is essential for debugging connectivity issues. Search for "8.5. Session state at disconnection" in the doc to find it immediately.