Containers (5): How Do Containers Access the Outside World? [31]
(6) How Do Containers Access the Outside World?
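We have already solved communication between containers. Next, we discuss how containers communicate with the outside world. Two directions are involved:
- Containers accessing the outside world
- The outside world accessing containers
(1) Containers accessing the outside world
In our current lab environment, the docker host can access the external network. Can a container access it as well?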
root@cuiyongchao:~# ping www.baidu.com
PING www.a.shifen.com (180.101.49.11) 56(84) bytes of data.
64 bytes from 180.101.49.11 (180.101.49.11): icmp_seq=1 ttl=128 time=4.39 ms
64 bytes from 180.101.49.11 (180.101.49.11): icmp_seq=2 ttl=128 time=4.47 ms
64 bytes from 180.101.49.11 (180.101.49.11): icmp_seq=3 ttl=128 time=4.39 ms
^C
--- www.a.shifen.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2004ms
rtt min/avg/max/mdev = 4.395/4.424/4.479/0.038 ms
root@cuiyongchao:~#
Access from inside the container:
/ # ping www.baidu.com
PING www.baidu.com (180.101.49.11): 56 data bytes
64 bytes from 180.101.49.11: seq=0 ttl=127 time=4.256 ms
64 bytes from 180.101.49.11: seq=1 ttl=127 time=5.123 ms
64 bytes from 180.101.49.11: seq=2 ttl=127 time=13.880 ms
^C
--- www.baidu.com ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 4.256/7.753/13.880 ms
/ #
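As we can see, containers can access the external network by default. Note that "external network" here means any network outside the container network, not specifically the internet.
The behavior is simple, but more importantly, we should understand what lies behind it.
In the example above, busybox sits on docker0, a private bridge network (172.17.0.0/16). When busybox pings outward from the container, how do the packets reach www.baidu.com?
The key here is NAT. Let us look at the iptables rules on the docker host: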
root@cuiyongchao:~# iptables -t nat -S
-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-N DOCKER
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -s 172.22.0.0/16 ! -o br-ba21840c1713 -j MASQUERADE
-A POSTROUTING -s 172.18.0.0/16 ! -o br-283474cba87c -j MASQUERADE
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
-A DOCKER -i br-ba21840c1713 -j RETURN
-A DOCKER -i br-283474cba87c -j RETURN
-A DOCKER -i docker0 -j RETURN
root@cuiyongchao:~#
The NAT table contains this rule:
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
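Its meaning: if a packet with a source address in 172.17.0.0/16 (the docker0 bridge network) is about to leave the host through any interface other than docker0, hand it to MASQUERADE. MASQUERADE replaces the packet's source address with the address of the host interface it leaves through and sends it out; in other words, it performs network address translation (NAT).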
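To make the match conditions of this rule concrete, here is a minimal sketch that mimics its two tests (source network and outgoing interface) in plain shell. The function name would_masquerade is hypothetical and is not part of iptables; the string pattern stands in for the real CIDR match and only works because a /16 prefix ends on an octet boundary.

```shell
# Return success (0) if a packet with source address $1 leaving through
# interface $2 would match:
#   -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
# Assumption: the "172.17.*" pattern replaces a real CIDR check, which is
# valid here only because /16 ends on an octet boundary.
would_masquerade() {
  src=$1
  out_iface=$2
  case $src in
    172.17.*) [ "$out_iface" != "docker0" ] ;;  # "! -o docker0"
    *) return 1 ;;                              # source is outside 172.17.0.0/16
  esac
}

would_masquerade 172.17.0.3 ens33   && echo "172.17.0.3 via ens33: MASQUERADE"
would_masquerade 172.17.0.3 docker0 || echo "172.17.0.3 via docker0: not matched"
would_masquerade 10.0.0.20 ens33    || echo "10.0.0.20 via ens33: not matched"
```

The second case is why traffic between containers on the same bridge keeps its original source address: it leaves through docker0, so the rule does not apply.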
To watch the translation happen, first check the routing table of the docker host:
root@cuiyongchao:~# ip route
default via 10.0.0.254 dev ens33 proto static
The default route goes out through ens33, so we need to monitor ICMP (ping) packets on both ens33 and docker0.
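If you would rather not read the interface name off the routing table by eye, a small helper can extract it. This is a sketch only: default_iface is a hypothetical name, and it assumes the default-route line has the "default via ... dev IFACE ..." shape shown above.

```shell
# Print the interface that follows "dev" on the default-route line.
# Reads `ip route` text from stdin so it can also be tried on saved output.
default_iface() {
  awk '$1 == "default" {
    for (i = 2; i < NF; i++)
      if ($i == "dev") { print $(i + 1); exit }
  }'
}

# Sample line copied from the routing table above; prints "ens33".
echo "default via 10.0.0.254 dev ens33 proto static" | default_iface
```

On a live host the same function could be fed directly, for example `ip route | default_iface`, and the result passed to `tcpdump -i`.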
When busybox pings www.baidu.com, tcpdump outputs the following:
root@cuiyongchao:~# tcpdump -i docker0 -n icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on docker0, link-type EN10MB (Ethernet), capture size 262144 bytes
10:40:39.549712 IP 172.17.0.3 > 180.101.49.12: ICMP echo request, id 1792, seq 0, length 64
10:40:39.558211 IP 180.101.49.12 > 172.17.0.3: ICMP echo reply, id 1792, seq 0, length 64
10:40:40.550385 IP 172.17.0.3 > 180.101.49.12: ICMP echo request, id 1792, seq 1, length 64
10:40:40.558821 IP 180.101.49.12 > 172.17.0.3: ICMP echo reply, id 1792, seq 1, length 64
10:40:41.551612 IP 172.17.0.3 > 180.101.49.12: ICMP echo request, id 1792, seq 2, length 64
10:40:41.561578 IP 180.101.49.12 > 172.17.0.3: ICMP echo reply, id 1792, seq 2, length 64
10:40:42.552413 IP 172.17.0.3 > 180.101.49.12: ICMP echo request, id 1792, seq 3, length 64
10:40:42.560352 IP 180.101.49.12 > 172.17.0.3: ICMP echo reply, id 1792, seq 3, length 64
10:40:43.553517 IP 172.17.0.3 > 180.101.49.12: ICMP echo request, id 1792, seq 4, length 64
10:40:43.561490 IP 180.101.49.12 > 172.17.0.3: ICMP echo reply, id 1792, seq 4, length 64
10:40:44.554024 IP 172.17.0.3 > 180.101.49.12: ICMP echo request, id 1792, seq 5, length 64
10:40:44.564883 IP 180.101.49.12 > 172.17.0.3: ICMP echo reply, id 1792, seq 5, length 64
10:40:45.554431 IP 172.17.0.3 > 180.101.49.12: ICMP echo request, id 1792, seq 6, length 64
10:40:45.562137 IP 180.101.49.12 > 172.17.0.3: ICMP echo reply, id 1792, seq 6, length 64
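docker0 receives the ping packets from busybox with the container IP 172.17.0.3 as the source address. Nothing unusual so far; the packets are handed to MASQUERADE. On ens33, however, we see a change: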
root@cuiyongchao:~# tcpdump -i ens33 -n icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens33, link-type EN10MB (Ethernet), capture size 262144 bytes
10:40:39.549765 IP 10.0.0.20 > 180.101.49.12: ICMP echo request, id 1792, seq 0, length 64
10:40:39.558188 IP 180.101.49.12 > 10.0.0.20: ICMP echo reply, id 1792, seq 0, length 64
10:40:40.550432 IP 10.0.0.20 > 180.101.49.12: ICMP echo request, id 1792, seq 1, length 64
10:40:40.558775 IP 180.101.49.12 > 10.0.0.20: ICMP echo reply, id 1792, seq 1, length 64
10:40:41.551658 IP 10.0.0.20 > 180.101.49.12: ICMP echo request, id 1792, seq 2, length 64
10:40:41.561544 IP 180.101.49.12 > 10.0.0.20: ICMP echo reply, id 1792, seq 2, length 64
10:40:42.552461 IP 10.0.0.20 > 180.101.49.12: ICMP echo request, id 1792, seq 3, length 64
10:40:42.560315 IP 180.101.49.12 > 10.0.0.20: ICMP echo reply, id 1792, seq 3, length 64
10:40:43.553560 IP 10.0.0.20 > 180.101.49.12: ICMP echo request, id 1792, seq 4, length 64
10:40:43.561455 IP 180.101.49.12 > 10.0.0.20: ICMP echo reply, id 1792, seq 4, length 64
10:40:44.554077 IP 10.0.0.20 > 180.101.49.12: ICMP echo request, id 1792, seq 5, length 64
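The source address of the ping packets has become 10.0.0.20, the IP of ens33. This is the result of the iptables NAT rule, and it ensures that the packets can reach the external network. The steps below describe the process:
- busybox sends the ping packet: 172.17.0.3 > www.baidu.com.
- docker0 receives the packet, finds that it is destined for the external network, and hands it to NAT.
- NAT replaces the source address with the IP of ens33: 10.0.0.20 > www.baidu.com.
- The ping packet leaves through ens33 and reaches www.baidu.com.
Through NAT, docker enables containers to access the external network.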
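The source-address rewrite seen in the two captures can also be checked mechanically. The sketch below pulls the source address of each ICMP echo request out of `tcpdump -n icmp` text. The function name request_sources is hypothetical, and the two sample variables hold lines copied from the captures above rather than live traffic.

```shell
# Print the unique source addresses of ICMP echo requests found in
# `tcpdump -n icmp` text read from stdin (field 3 is the source address).
request_sources() {
  awk '/ICMP echo request/ { print $3 }' | sort -u
}

# Sample lines copied from the docker0 and ens33 captures above.
docker0_capture='10:40:39.549712 IP 172.17.0.3 > 180.101.49.12: ICMP echo request, id 1792, seq 0, length 64
10:40:39.558211 IP 180.101.49.12 > 172.17.0.3: ICMP echo reply, id 1792, seq 0, length 64'
ens33_capture='10:40:39.549765 IP 10.0.0.20 > 180.101.49.12: ICMP echo request, id 1792, seq 0, length 64
10:40:39.558188 IP 180.101.49.12 > 10.0.0.20: ICMP echo reply, id 1792, seq 0, length 64'

echo "docker0: $(printf '%s\n' "$docker0_capture" | request_sources)"
echo "ens33:   $(printf '%s\n' "$ens33_capture" | request_sources)"
```

The same request (id 1792, seq 0) carries the container address on docker0 and the host address on ens33, which is exactly what the MASQUERADE rule is expected to do.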