[博客翻译]NsJail:一个轻量级的Linux进程隔离工具


原文地址:https://nsjail.dev/


一个轻量级的进程隔离工具,利用Linux命名空间和seccomp-bpf系统调用过滤器(借助kafel bpf语言)

概述

NsJail是一个用于Linux的进程隔离工具。它利用Linux命名空间子系统、资源限制和Linux内核的seccomp-bpf系统调用过滤器。
它可以帮助你(除其他外):

  • 隔离网络服务(例如Web、时间、DNS),使它们与操作系统的其他部分隔离
  • 托管计算机安全挑战(所谓的CTF
  • 包含侵入性的系统调用级操作系统模糊测试工具

功能:

它提供了哪些隔离形式

  1. Linux命名空间:UTS(主机名)、MOUNT(chroot)、PID(独立的PID树)、IPC、NET(独立的网络上下文)、USER、CGROUPS
  2. 文件系统约束:chroot()、pivot_root()、只读重新挂载、自定义的/proctmpfs挂载点
  3. 资源限制(挂起时间/CPU时间限制、VM/内存地址空间限制等)
  4. 可编程的seccomp-bpf系统调用过滤器(通过kafel语言
  5. 克隆和隔离的以太网接口
  6. Cgroups用于内存和PID利用率控制

支持的用例

网络服务的隔离(inetd风格)

注意:你需要在/chroot中有一个有效的文件系统树。如果没有,请将/chroot改为/

  • 服务器:
$ ./nsjail -Ml --port 9000 --chroot /chroot/ --user 99999 --group 99999 -- /bin/sh -i
  • 客户端:
$ nc 127.0.0.1 9000
 / $ ifconfig
 / $ ifconfig -a
 lo  Link encap:Local Loopback
    LOOPBACK MTU:65536 Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0
    RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
 / $ ps wuax
 PID  USER   COMMAND
 1 99999  /bin/sh -i
 3 99999  {busybox} ps wuax
 / $

访问私有克隆接口的隔离(需要root/setuid权限)

注意:你需要在/chroot中有一个有效的文件系统树。如果没有,请将/chroot改为/

$ sudo ./nsjail --user 9999 --group 9999 --macvlan_iface eth0 --chroot /chroot/ -Mo --macvlan_vs_ip 192.168.0.44 --macvlan_vs_nm 255.255.255.0 --macvlan_vs_gw 192.168.0.1 -- /bin/sh -i
/ $ id
uid=9999 gid=9999
/ $ ip addr sh
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue 
  link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
  inet 127.0.0.1/8 scope host lo
    valid_lft forever preferred_lft forever
  inet6 ::1/128 scope host 
    valid_lft forever preferred_lft forever
2: vs: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue 
  link/ether ca:a2:69:21:33:66 brd ff:ff:ff:ff:ff:ff
  inet 192.168.0.44/24 brd 192.168.0.255 scope global vs
    valid_lft forever preferred_lft forever
  inet6 fe80::c8a2:69ff:fe21:cd66/64 scope link 
    valid_lft forever preferred_lft forever
/ $ nc 217.146.165.209 80
GET / HTTP/1.0
HTTP/1.0 302 Found
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Location: https://www.google.ch/?gfe_rd=cr&ei=cEzWVrG2CeTI8ge88ofwDA
Content-Length: 258
Date: Wed, 02 Mar 2016 02:14:08 GMT
...
...
/ $

本地进程的隔离

注意:你需要在/chroot中有一个有效的文件系统树。如果没有,请将/chroot改为/

$ ./nsjail -Mo --chroot /chroot/ --user 99999 --group 99999 -- /bin/sh -i
 / $ ifconfig -a
 lo  Link encap:Local Loopback
    LOOPBACK MTU:65536 Metric:1
    RX packets:0 errors:0 dropped:0 overruns:0 frame:0
    TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0
    RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
 / $ id
 uid=99999 gid=99999
 / $ ps wuax
 PID  USER   COMMAND
 1 99999  /bin/sh -i
 4 99999  {busybox} ps wuax
 / $exit
 $

本地进程的隔离(并在必要时重新运行)

注意:你需要在/chroot中有一个有效的文件系统树。如果没有,请将/chroot改为/

$ ./nsjail -Mr --chroot /chroot/ --user 99999 --group 99999 -- /bin/sh -i
 BusyBox v1.21.1 (Ubuntu 1:1.21.0-1ubuntu1) built-in shell (ash)
 Enter 'help' for a list of built-in commands.
 / $ ps wuax
 PID  USER   COMMAND
 1 99999  /bin/sh -i
 2 99999  {busybox} ps wuax
 / $ exit
 BusyBox v1.21.1 (Ubuntu 1:1.21.0-1ubuntu1) built-in shell (ash)
 Enter 'help' for a list of built-in commands.
 / $ ps wuax
 PID  USER   COMMAND
 1 99999  /bin/sh -i
 2 99999  {busybox} ps wuax
 / $

在最小文件系统中运行Bash,uid==0且仅访问/dev/urandom

$ ./nsjail -Mo --user 0 --group 99999 -R /bin/ -R /lib -R /lib64/ -R /usr/ -R /sbin/ -T /dev -R /dev/urandom --keep_caps -- /bin/bash -i
[2017-05-24T17:08:02+0200] Mode: STANDALONE_ONCE
[2017-05-24T17:08:02+0200] Jail parameters: hostname:'NSJAIL', chroot:'(null)', process:'/bin/bash', bind:[::]:0, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:false, keep_caps:true, tmpfs_size:4194304, disable_no_new_privs:false, pivot_root_only:false
[2017-05-24T17:08:02+0200] Mount point: src:'none' dst:'/' type:'tmpfs' flags:MS_RDONLY|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'none' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'/bin/' dst:'/bin/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'/lib' dst:'/lib' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'/lib64/' dst:'/lib64/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'/usr/' dst:'/usr/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'/sbin/' dst:'/sbin/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'none' dst:'/dev' type:'tmpfs' flags:0 options:'size=4194304' isDir:True
[2017-05-24T17:08:02+0200] Mount point: src:'/dev/urandom' dst:'/dev/urandom' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:False
[2017-05-24T17:08:02+0200] Uid map: inside_uid:0 outside_uid:69664
[2017-05-24T17:08:02+0200] Gid map: inside_gid:99999 outside_gid:5000
[2017-05-24T17:08:02+0200] Executing '/bin/bash' for '[STANDALONE_MODE]'
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
bash-4.3# ls -l
total 28
drwxr-xr-x  2 65534 65534 4096 May 15 14:04 bin
drwxrwxrwt  2   0 99999  60 May 24 15:08 dev
drwxr-xr-x 28 65534 65534 4096 May 15 14:10 lib
drwxr-xr-x  2 65534 65534 4096 May 15 13:56 lib64
dr-xr-xr-x 391 65534 65534   0 May 24 15:08 proc
drwxr-xr-x  2 65534 65534 12288 May 15 14:16 sbin
drwxr-xr-x 17 65534 65534 4096 May 15 13:58 usr
bash-4.3# id
uid=0 gid=99999 groups=65534,99999
bash-4.3# exit
exit
[2017-05-24T17:08:05+0200] PID: 129839 exited with status: 0, (PIDs left: 0)

在最小文件系统中运行/usr/bin/find(仅从/usr/bin访问/usr/bin/find)

$ ./nsjail -Mo --user 99999 --group 99999 -R /lib/x86_64-linux-gnu/ -R /lib/x86_64-linux-gnu -R /lib64 -R /usr/bin/find -R /dev/urandom --keep_caps -- /usr/bin/find / | wc -l
[2017-05-24T17:04:37+0200] Mode: STANDALONE_ONCE
[2017-05-24T17:04:37+0200] Jail parameters: hostname:'NSJAIL', chroot:'(null)', process:'/usr/bin/find', bind:[::]:0, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:false, keep_caps:true, tmpfs_size:4194304, disable_no_new_privs:false, pivot_root_only:false
[2017-05-24T17:04:37+0200] Mount point: src:'none' dst:'/' type:'tmpfs' flags:MS_RDONLY|0 options:'' isDir:True
[2017-05-24T17:04:37+0200] Mount point: src:'none' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:True
[2017-05-24T17:04:37+0200] Mount point: src:'/lib/x86_64-linux-gnu/' dst:'/lib/x86_64-linux-gnu/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:04:37+0200] Mount point: src:'/lib/x86_64-linux-gnu' dst:'/lib/x86_64-linux-gnu' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:04:37+0200] Mount point: src:'/lib64' dst:'/lib64' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:04:37+0200] Mount point: src:'/usr/bin/find' dst:'/usr/bin/find' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:False
[2017-05-24T17:04:37+0200] Mount point: src:'/dev/urandom' dst:'/dev/urandom' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:False
[2017-05-24T17:04:37+0200] Uid map: inside_uid:99999 outside_uid:69664
[2017-05-24T17:04:37+0200] Gid map: inside_gid:99999 outside_gid:5000
[2017-05-24T17:04:37+0200] Executing '/usr/bin/find' for '[STANDALONE_MODE]'
/usr/bin/find: `/proc/tty/driver': Permission denied
2289
[2017-05-24T17:04:37+0200] PID: 129525 exited with status: 1, (PIDs left: 0)

使用/etc/subuid

$ tail -n1 /etc/subuid
user:10000000:1
$ ./nsjail -R /lib -R /lib64/ -R /usr/lib -R /usr/bin/ -R /usr/sbin/ -R /bin/ -R /sbin/ -R /dev/null -U 0:10000000:1 -u 0 -R /tmp/ -T /tmp/ -- /bin/ls -l /usr/
[2017-05-24T17:12:31+0200] Mode: STANDALONE_ONCE
[2017-05-24T17:12:31+0200] Jail parameters: hostname:'NSJAIL', chroot:'(null)', process:'/bin/ls', bind:[::]:0, max_conns_per_ip:0, time_limit:0, personality:0, daemonize:false, clone_newnet:true, clone_newuser:true, clone_newns:true, clone_newpid:true, clone_newipc:true, clonew_newuts:true, clone_newcgroup:false, keep_caps:false, tmpfs_size:4194304, disable_no_new_privs:false, pivot_root_only:false
[2017-05-24T17:12:31+0200] Mount point: src:'none' dst:'/' type:'tmpfs' flags:MS_RDONLY|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'none' dst:'/proc' type:'proc' flags:MS_RDONLY|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/lib' dst:'/lib' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/lib64/' dst:'/lib64/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/usr/lib' dst:'/usr/lib' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/usr/bin/' dst:'/usr/bin/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/usr/sbin/' dst:'/usr/sbin/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|0 options:'' isDir:True
[2017-05-24T17:12:31+0200] Mount point: src:'/bin/' dst:'/bin/' type:'' flags:MS_RDONLY|MS_BIND|MS_REC|