Notes on Troubleshooting an OpenStack Network Failure

Problem Description

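Zun is integrated into this environment, but neither Docker containers nor virtual machines have network connectivity after a floating IP is bound to them.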

The network topology is as follows:

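The host is 192.168.1.16 and has the floating IP 172.18.0.162 bound to it, yet it cannot reach the external network. Checking each network segment in turn shows that it can ping 192.168.1.1 and 172.18.0.192, the router's two interface addresses.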

This suggests that the router itself cannot reach the external network, so the next step is to check the router's connectivity.

root@controller:/home/ubuntu# source /home/ubuntu/admin-openrc 
root@controller:/home/ubuntu# openstack router list
+--------------------------------------+--------+--------+-------+----------------------------------+-------------+-------+
| ID                                   | Name   | Status | State | Project                          | Distributed | HA    |
+--------------------------------------+--------+--------+-------+----------------------------------+-------------+-------+
| bffc6841-bbd2-4b39-b5b0-a877f20f5fa4 | router | ACTIVE | UP    | 3c0e7cfc3c604ca795cc6d370f785ce7 | False       | False |
+--------------------------------------+--------+--------+-------+----------------------------------+-------------+-------+
root@controller:/home/ubuntu# ip netns
qdhcp-4e358a61-315a-46dd-9447-3d088de1aced (id: 1)
qdhcp-1fae90a3-41e5-4cb6-87cd-201dc179dc51 (id: 2)
qrouter-bffc6841-bbd2-4b39-b5b0-a877f20f5fa4 (id: 0)
root@controller:/home/ubuntu# ip netns exec qrouter-bffc6841-bbd2-4b39-b5b0-a877f20f5fa4 ping 114.114.114.114
PING 114.114.114.114 (114.114.114.114) 56(84) bytes of data.
From 172.18.0.192 icmp_seq=1 Destination Host Unreachable
From 172.18.0.192 icmp_seq=2 Destination Host Unreachable
From 172.18.0.192 icmp_seq=3 Destination Host Unreachable
^C
--- 114.114.114.114 ping statistics ---
4 packets transmitted, 0 received, +3 errors, 100% packet loss, time 3076ms
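
The namespace lookup above can be wrapped in a small helper. This is only a sketch: it assumes the router namespace is always named qrouter-<router ID>, as the `ip netns` output above shows, and the function names `router_netns` and `ping_from_router` are made up for illustration (they are not OpenStack commands).

```shell
# Build the namespace name used for a router, given its ID.
# Assumption: the namespace is named "qrouter-<router ID>".
router_netns() {
    printf 'qrouter-%s\n' "$1"
}

# Ping a target from inside the router namespace (requires root).
# Arguments: router ID, target IP address.
ping_from_router() {
    local ns
    ns="$(router_netns "$1")"
    # -c 3: send three probes; -W 2: wait at most two seconds per reply
    ip netns exec "$ns" ping -c 3 -W 2 "$2"
}

# Usage, as root on the node that hosts the router:
#   ping_from_router bffc6841-bbd2-4b39-b5b0-a877f20f5fa4 114.114.114.114
```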

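The ping output shows that the external network is unreachable from the router's network namespace as well. Next, check the addresses bound to the router's virtual interfaces and its routing table.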

root@controller:/home/ubuntu# ip netns exec qrouter-bffc6841-bbd2-4b39-b5b0-a877f20f5fa4 ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: qr-eb23ae92-d0@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default qlen 1000
    link/ether fa:16:3e:9d:22:c5 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.1.1/24 brd 192.168.1.255 scope global qr-eb23ae92-d0
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe9d:22c5/64 scope link 
       valid_lft forever preferred_lft forever
3: qg-57ae9e7b-53@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default qlen 1000
    link/ether fa:16:3e:4d:8b:7b brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.18.0.192/24 brd 172.18.0.255 scope global qg-57ae9e7b-53
       valid_lft forever preferred_lft forever
    inet 172.18.0.161/32 brd 172.18.0.161 scope global qg-57ae9e7b-53
       valid_lft forever preferred_lft forever
    inet 172.18.0.162/32 brd 172.18.0.162 scope global qg-57ae9e7b-53
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe4d:8b7b/64 scope link 
       valid_lft forever preferred_lft forever


root@controller:/home/ubuntu# ip netns exec qrouter-bffc6841-bbd2-4b39-b5b0-a877f20f5fa4 ip route 
default via 172.18.0.254 dev qg-57ae9e7b-53 proto static 
172.18.0.0/24 dev qg-57ae9e7b-53 proto kernel scope link src 172.18.0.192 
192.168.1.0/24 dev qr-eb23ae92-d0 proto kernel scope link src 192.168.1.1
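
The "Destination Host Unreachable" replies earlier came from the router's own address 172.18.0.192, which typically means the next hop could not be resolved at layer 2. One way to confirm this is to test the default gateway directly. The sketch below extracts the gateway from `ip route` output; `default_gateway` is a hypothetical helper name, and the usage lines assume root access on the node hosting the router.

```shell
# Print the next hop of the first default route found in `ip route`
# output read from standard input.
default_gateway() {
    awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# Usage, as root:
#   NS=qrouter-bffc6841-bbd2-4b39-b5b0-a877f20f5fa4
#   GW="$(ip netns exec "$NS" ip route | default_gateway)"
#   ip netns exec "$NS" ping -c 3 "$GW"
#   # A FAILED or INCOMPLETE entry here means the gateway never answered ARP
#   ip netns exec "$NS" ip neigh show "$GW"
```

If the gateway does not answer ARP, the fault is likely between the qg- interface and the physical network rather than in the routing table.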

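The virtual interface qg-57ae9e7b-53 is where the floating IPs are configured, and the default route also leaves through qg-57ae9e7b-53 to reach the external network. The problem may therefore lie with this interface or its host-side tap device. This OpenStack environment was deployed with the Linux bridge variant, so the next step is to check the bridge attachments.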

root@controller:/home/ubuntu# brctl show
bridge name     bridge id               STP enabled     interfaces
brq1fae90a3-41          8000.facf268f5372       no              tap2e36c85e-e9
                                                        tapeb23ae92-d0
                                                        vxlan-4464
brq4e358a61-31          8000.4a023181b109       no              tap57ae9e7b-53
                                                        tapaeb2adac-74
                                                        vxlan-1001
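
When there are many bridges, output like this can be scanned automatically. The following sketch lists bridges that have no port other than tap* and vxlan-* devices. That naming rule is an assumption taken from the output above, and `bridges_without_uplink` is a hypothetical helper name. A bridge for a purely internal tenant network legitimately has no physical port, so only a bridge serving the external network is a problem when it shows up here.

```shell
# Read `brctl show` output on standard input and print every bridge
# whose ports are all tap* or vxlan-* devices (no physical uplink).
bridges_without_uplink() {
    awk '
        NR == 1 { next }                 # skip the header row
        NF == 0 { next }                 # skip blank lines
        NF >= 3 {                        # a row that starts a new bridge
            bridge = $1
            names[++n] = bridge
            uplink[bridge] = 0
        }
        {
            port = ""
            if (NF == 1) port = $1       # continuation row: port only
            else if (NF >= 4) port = $4  # bridge row with its first port
            if (port != "" && bridge != "" && port !~ /^(tap|vxlan-)/)
                uplink[bridge] = 1
        }
        END {
            for (i = 1; i <= n; i++)
                if (!uplink[names[i]]) print names[i]
        }
    '
}

# Usage:
#   brctl show | bridges_without_uplink
```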

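The tap device tap57ae9e7b-53, which corresponds to qg-57ae9e7b-53, is attached to the bridge brq4e358a61-31, but that bridge has no interface that can reach the external network. The provider interface is therefore added to the bridge manually.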

root@controller:/home/ubuntu# brctl addif brq4e358a61-31 ens160
root@controller:/home/ubuntu# brctl show
bridge name     bridge id               STP enabled     interfaces
brq1fae90a3-41          8000.facf268f5372       no              tap2e36c85e-e9
                                                        tapeb23ae92-d0
                                                        vxlan-4464
brq4e358a61-31          8000.4a023181b109       no              ens160
                                                        tap57ae9e7b-53
                                                        tapaeb2adac-74
                                                        vxlan-1001

The virtual machine can now reach the external network.
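
A port added with `brctl addif` is runtime state only and is not saved anywhere, so it is worth being able to recheck the attachment later, for example after a reboot. The sketch below tests bridge membership through sysfs. `bridge_has_port` is a hypothetical helper name, and its optional third argument exists only so the function can be exercised against a fake directory tree.

```shell
# Succeed if IFACE is currently a port of BRIDGE.
# Arguments: bridge name, interface name, optional sysfs root
# (defaults to /sys/class/net; overriding it is meant for testing only).
bridge_has_port() {
    [ -e "${3:-/sys/class/net}/$1/brif/$2" ]
}

# Usage:
#   bridge_has_port brq4e358a61-31 ens160 && echo "ens160 is attached"
#
# With iproute2, the equivalent of `brctl addif` is:
#   ip link set dev ens160 master brq4e358a61-31
```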

References

  • https://bbs.huaweicloud.com/blogs/152596