
These are some notes on configuring and using LKL (Linux Kernel Library)[1].

Basics

LKL can be built as a single .so file that is preloaded into a process and hijacks its system calls, so they are handled by LKL rather than by the native system kernel. It can therefore be treated as a sort of ultra-lightweight "virtualization" method. With LKL's .so file loaded, the process effectively runs on an "independent kernel", which opens up a lot of possibilities.

Installation

Build from source

To build the .so from source, simply follow the official documentation.

    # Build dependencies, needed on Debian/Ubuntu
    sudo apt-get install libfuse-dev libarchive-dev xfsprogs

    # Actual build; nproc gives the number of available CPU threads
    git clone --depth=1 https://github.com/lkl/linux.git
    cd linux
    make -C tools/lkl -j"$(nproc)"

When the build finishes, there will be a tools/lkl/liblkl-hijack.so file under the current directory. It can be copied to another machine and used there. Here we install it into a system library directory.

Installation

    # Optional, strip the file for smaller size
    strip tools/lkl/liblkl-hijack.so

    # Copy to the system library directory
    sudo cp tools/lkl/liblkl-hijack.so /usr/local/lib/

Configuration

The .so can be used with almost any native Linux executable. Here we take haproxy as an example: it can provide BBR TCP congestion control for other TCP-based processes running on the same server when the native kernel lacks that feature[2], e.g. in an OpenVZ container.
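The preload mechanism itself can be sketched as a small wrapper. This is only an illustration: lkl_run and the LKL_LIB variable are names made up here, and the default path assumes the library was copied to /usr/local/lib as in the install step.

```shell
#!/bin/sh
# Assumed install location; override LKL_LIB if the file lives elsewhere.
LKL_LIB="${LKL_LIB:-/usr/local/lib/liblkl-hijack.so}"

# lkl_run (hypothetical helper): run "$@" with the LKL hijack library preloaded.
lkl_run() {
    if [ ! -r "$LKL_LIB" ]; then
        echo "lkl_run: cannot read $LKL_LIB" >&2
        return 1
    fi
    # env sets LD_PRELOAD for this one command only, not for the calling shell.
    env LD_PRELOAD="$LKL_LIB" "$@"
}
```

A call would look like `lkl_run /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg`. The LKL_HIJACK_* variables described later still have to be set for the LKL network to come up.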

HAProxy

The following examples are based on a 64-bit Debian Jessie system with the jessie-backports repository enabled and the whole system upgraded to the latest packages.

A TUN/TAP device needs to be enabled in order to redirect packets between the native host kernel and the LKL kernel in OpenVZ containers.

Here we use the private IP block 172.20.197.184/29 for internal communication, where 172.20.197.185 is the host's IP and 172.20.197.186 is the guest process's. Change them to your own block as needed.
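When picking a different block, it is easy to choose addresses that fall outside it. The sketch below checks membership with plain shell arithmetic. The helper names ip_to_int and in_block are made up for this example, and it assumes a shell with 64-bit arithmetic.

```shell
#!/bin/sh
# ip_to_int: convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# in_block: succeed if address $1 lies inside CIDR block $2.
in_block() {
    addr=$(ip_to_int "$1")
    base=$(ip_to_int "${2%/*}")   # part before the slash
    len=${2#*/}                   # prefix length after the slash
    mask=$(( (0xFFFFFFFF << (32 - $len)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( base & mask )) ]
}

# Check the host and guest addresses used in these notes.
for ip in 172.20.197.185 172.20.197.186; do
    if in_block "$ip" 172.20.197.184/29; then
        echo "$ip is inside 172.20.197.184/29"
    else
        echo "$ip is NOT inside 172.20.197.184/29"
    fi
done
```

Both addresses pass the check, since a /29 starting at .184 covers .184 through .191.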

Redirect Packets

We need to add a new tuntap device and several iptables rules to redirect packets from the host to the guest, and vice versa.

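    #!/bin/sh

    # Recreate the tap device (the del fails harmlessly if it does not exist yet)
    ip tuntap del lkl-tap0 mode tap
    ip tuntap add lkl-tap0 mode tap
    # Host side of the /29 block described above
    ip addr add 172.20.197.185/29 dev lkl-tap0
    ip link set lkl-tap0 up

    # Let the host kernel forward packets to and from the guest
    sysctl -w net.ipv4.ip_forward=1

    iptables -P FORWARD ACCEPT
    # Note: this flushes ALL existing rules in the nat table
    iptables -t nat -F
    #iptables -t nat -A POSTROUTING -o venet0 -j MASQUERADE
    # Send incoming TCP connections on these ports to the guest
    iptables -t nat -A PREROUTING -i venet0 -p tcp --dport 12240:12249 -j DNAT --to-destination 172.20.197.186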

The script above shows all the necessary commands. MASQUERADE is not needed if the guest does not need to connect to the Internet.
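For completeness, here is a sketch of undoing the setup. It is not part of the original notes: the run and teardown helpers and the DRY_RUN switch are made up for this example. By default it only prints the commands, and running them for real needs root.

```shell
#!/bin/sh
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# run (hypothetical helper): print or execute a command depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

teardown() {
    # Delete the DNAT rule; -D takes the same rule specification as -A.
    run iptables -t nat -D PREROUTING -i venet0 -p tcp --dport 12240:12249 \
        -j DNAT --to-destination 172.20.197.186
    # Bring the tap device down and remove it.
    run ip link set lkl-tap0 down
    run ip tuntap del lkl-tap0 mode tap
}

teardown
```

The sketch deliberately leaves net.ipv4.ip_forward and the FORWARD policy alone, since other services on the host may rely on them.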

haproxy.service

The example is adapted from [2].

    [Unit]
    Description=HAProxy Load Balancer
    Documentation=man:haproxy(1)
    Documentation=file:/usr/share/doc/haproxy/configuration.txt.gz
    After=network.target syslog.service rc-local.service
    Wants=syslog.service

    [Service]
    Environment="CONFIG=/etc/haproxy/haproxy.cfg" "PIDFILE=/run/haproxy.pid"
    EnvironmentFile=-/etc/default/haproxy

    # Use BBR with liblkl-hijack.so
    Environment='LD_PRELOAD=/usr/local/lib/liblkl-hijack.so'
    Environment='LKL_HIJACK_NET_QDISC="root|fq"'
    Environment='LKL_HIJACK_SYSCTL=\'net.ipv4.tcp_congestion_control=bbr;net.ipv4.tcp_wmem=4096 65536 67108864\''
    Environment='LKL_HIJACK_NET_IFTYPE=tap'
    Environment='LKL_HIJACK_NET_IFPARAMS=lkl-tap0'
    Environment='LKL_HIJACK_NET_IP=172.20.197.186'
    Environment='LKL_HIJACK_NET_NETMASK_LEN=29'
    Environment='LKL_HIJACK_NET_GATEWAY=172.20.197.185'
    Environment='LKL_HIJACK_OFFLOAD="0x8883"'

    ExecStartPre=/usr/sbin/haproxy -f $CONFIG -c -q $EXTRAOPTS
    ExecStart=/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid $OPTIONS
    #ExecStart=/usr/sbin/haproxy-systemd-wrapper -f $CONFIG -p $PIDFILE $EXTRAOPTS
    ExecReload=/usr/sbin/haproxy -f $CONFIG -c -q $EXTRAOPTS
    ExecReload=/bin/kill -USR2 $MAINPID
    KillMode=mixed
    #Restart=always
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target

Note that ExecStart and Restart differ from the original unit file shipped in the Debian repository. LKL_HIJACK_NET_GATEWAY is also unnecessary if the guest only talks to the host.
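For debugging outside systemd, the same settings can be expressed as shell exports. This is a sketch under two assumptions: that LKL should receive the plain values without the unit file's nested quotes, and that haproxy is started in the foreground with its -db option. LKL_START is a switch made up for this example so that nothing is launched by accident.

```shell
#!/bin/sh
# Shell equivalents of the unit's Environment= lines (values assumed unquoted).
export LKL_HIJACK_NET_QDISC='root|fq'
export LKL_HIJACK_SYSCTL='net.ipv4.tcp_congestion_control=bbr;net.ipv4.tcp_wmem=4096 65536 67108864'
export LKL_HIJACK_NET_IFTYPE=tap
export LKL_HIJACK_NET_IFPARAMS=lkl-tap0
export LKL_HIJACK_NET_IP=172.20.197.186
export LKL_HIJACK_NET_NETMASK_LEN=29
export LKL_HIJACK_NET_GATEWAY=172.20.197.185
export LKL_HIJACK_OFFLOAD=0x8883

# Hypothetical switch: only launch haproxy when LKL_START=1.
if [ "${LKL_START:-0}" = "1" ]; then
    # LD_PRELOAD is passed to haproxy only, so this shell keeps using the host kernel.
    # -db keeps haproxy in the foreground.
    exec env LD_PRELOAD=/usr/local/lib/liblkl-hijack.so \
        /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -db
fi
```

Running it as root with LKL_START=1 after the tuntap script should give the same setup as the service, with haproxy's output visible on the terminal.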

haproxy.cfg

The example config file below forwards connections to sshd. The same approach works for a proxy or any other process that uses TCP.

    global
            log /dev/log    local0
            log /dev/log    local1 notice
            #chroot /var/lib/haproxy
            stats socket /run/haproxy/admin.sock mode 660 level admin
            stats timeout 30s
            user haproxy
            group haproxy
            #daemon

            # Default SSL material locations
            #ca-base /etc/ssl/certs
            #crt-base /etc/ssl/private

            # Default ciphers to use on SSL-enabled listening sockets.
            # For more information, see ciphers(1SSL). This list is from:
            #  https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
            # An alternative list with additional directives can be obtained from
            #  https://mozilla.github.io/server-side-tls/ssl-config-generator/?server=haproxy
            #ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
            #ssl-default-bind-options no-sslv3

    defaults
            log     global
            mode    tcp
            #option httplog
            option  dontlognull
            option  clitcpka
            timeout connect 5000
            timeout client  50000
            timeout server  50000

    frontend  sshd-bbr
            bind    172.20.197.186:115
            default_backend sshd-native

    backend sshd-native
            server  cloud 172.20.197.185:22 maxconn 20480

Note that daemon mode does not work with LKL as of Apr. 2017[3]. I assume this might be related to the new process that is forked when a process is daemonized. A possible fix might be to set the LKL environment variables globally, but I haven't tested it.
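Since an active daemon keyword in the config would trigger this problem, a small guard can catch it before the service is started. The sketch below is an illustration only: cfg_enables_daemon and the HAPROXY_CFG variable are names made up here, and the check looks only at the config file, not at command-line flags such as -D.

```shell
#!/bin/sh
# cfg_enables_daemon (hypothetical helper): succeed if the file contains an
# active "daemon" keyword, i.e. a line whose first word is exactly "daemon".
# Commented-out lines such as "#daemon" do not match.
cfg_enables_daemon() {
    grep -Eq '^[[:space:]]*daemon([[:space:]]|$)' "$1"
}

# Default config path from the unit file; override with HAPROXY_CFG.
cfg="${HAPROXY_CFG:-/etc/haproxy/haproxy.cfg}"
if [ -r "$cfg" ] && cfg_enables_daemon "$cfg"; then
    echo "warning: $cfg enables daemon mode, which did not work with LKL" >&2
fi
```

With the example config above, where the keyword is commented out, the guard stays silent.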

Run

Just run the tuntap/iptables script, and then start the systemd service.

Ref.

[1]. Linux Kernel Library

[2]. [Share] The simplest way to enable BBR on OpenVZ - Linux Kernel Library (original title: [分享] OpenVZ 开启 BBR 之最简方法 - Linux Kernel Library)

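[3]. Black magic: BBR without changing the kernel and without UML -- Linux Kernel Library (BBR+lotServer may also be possible) (original title: 黑科技: 不用換 kernel 不用 UML 也可以 BBR -- Linux Kernel Library (還有可能 BBR+lotServer))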

LKL (last edited 2017-04-13 08:24:48 by AstroProfundis)

Copyright © 2011-2017 Allen Zhong, under a CC BY-NC-ND 4.0 License.