weixin_39934675
2021-01-07 08:13

Containers not receiving DHCP replies from host dnsmasq

  • Distribution: Ubuntu
  • Distribution version: 16.04.4 LTS
  • LXD version: 2.21
  • Full lxc info output: lxc-info.txt

Instances I create do not get addresses assigned via DHCP, despite evidence that they are requesting them.

I've installed lxd with the default networking settings. I have confirmed that dnsmasq is running:


4586 dnsmasq --strict-order --bind-interfaces --pid-file=/var/lib/lxd/networks/lxdbr0/dnsmasq.pid --except-interface=lo --interface=lxdbr0 --quiet-dhcp --quiet-dhcp6 --quiet-ra --listen-address=10.243.191.1 --dhcp-no-override --dhcp-authoritative --dhcp-leasefile=/var/lib/lxd/networks/lxdbr0/dnsmasq.leases --dhcp-hostsfile=/var/lib/lxd/networks/lxdbr0/dnsmasq.hosts --dhcp-range 10.243.191.2,10.243.191.254,1h --listen-address=fd42:cfbf:5e80:3066::1 --enable-ra --dhcp-range ::,constructor:lxdbr0,ra-stateless,ra-names -s lxd -S /lxd/ --conf-file=/var/lib/lxd/networks/lxdbr0/dnsmasq.raw -u lxd

netstat -l shows that it is listening for DHCP requests.

netstat -l -n -p|grep dnsmasq
tcp        0      0 10.243.191.1:53         0.0.0.0:*               LISTEN      4586/dnsmasq    
tcp6       0      0 fd42:cfbf:5e80:3066::53 :::*                    LISTEN      4586/dnsmasq    
tcp6       0      0 fe80::d0c5:d1ff:fe53:53 :::*                    LISTEN      4586/dnsmasq    
udp        0      0 10.243.191.1:53         0.0.0.0:*                           4586/dnsmasq    
udp        0      0 0.0.0.0:67              0.0.0.0:*                           4586/dnsmasq    
udp6       0      0 :::547                  :::*                                4586/dnsmasq    
udp6       0      0 fd42:cfbf:5e80:3066::53 :::*                                4586/dnsmasq    
udp6       0      0 fe80::d0c5:d1ff:fe53:53 :::*                                4586/dnsmasq    
raw6       0      0 :::58                   :::*                    7           4586/dnsmasq    

tshark sees the request packets on the bridge interface on the host side:

Capturing on 'lxdbr0'
    1 0.000000000      0.0.0.0 → 255.255.255.255 DHCP 342 DHCP Discover - Transaction ID 0x1a425451
    2 6.693280259      0.0.0.0 → 255.255.255.255 DHCP 342 DHCP Discover - Transaction ID 0xee3ec48
    3 9.412126540      0.0.0.0 → 255.255.255.255 DHCP 342 DHCP Discover - Transaction ID 0x1a425451
    4 18.159773989      0.0.0.0 → 255.255.255.255 DHCP 342 DHCP Discover - Transaction ID 0xee3ec48
    5 18.406228488      0.0.0.0 → 255.255.255.255 DHCP 342 DHCP Discover - Transaction ID 0x1a425451

No replies are visible
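
The imbalance can be quantified from a capture. The following is only a sketch, not part of the original report: the helper name `dhcp_summary` is made up for illustration, and it assumes tshark's one-line summaries use the `DHCP Discover` / `DHCP Offer` wording seen above.

```shell
# Count DHCP message types in tshark one-line summaries read on stdin.
# Assumes summary lines contain "DHCP Discover", "DHCP Offer", etc.
dhcp_summary() {
    awk '
        / DHCP Discover / { d++ }
        / DHCP Offer /    { o++ }
        / DHCP Request /  { r++ }
        / DHCP ACK /      { a++ }
        END { printf "Discover=%d Offer=%d Request=%d ACK=%d\n", d, o, r, a }
    '
}

# Intended use (as root), capturing DHCP traffic on the bridge for 30 seconds:
#   tshark -i lxdbr0 -f 'udp port 67 or udp port 68' -a duration:30 | dhcp_summary
```

A healthy exchange should show at least one Offer and one ACK; a count of zero for both, next to a non-zero Discover count, matches the symptom described here.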

Within the container the network interface does not receive an IPv4 address:

# ip addr
1: lo: <loopback> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
15: eth0: <broadcast> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:52:0e:61 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fd42:cfbf:5e80:3066:216:3eff:fe52:e61/64 scope global mngtmpaddr dynamic 
       valid_lft 3345sec preferred_lft 3345sec
    inet6 fe80::216:3eff:fe52:e61/64 scope link 
       valid_lft forever preferred_lft forever

Steps to reproduce

  • In a separate terminal, run tshark -i lxdbr0.
  • Run lxc launch ubuntu:16.04 test and wait for the container to launch.
  • Run lxc exec test -- systemctl status networking and observe that the container is continuing to send DHCPDISCOVER requests.
  • Observe that tshark reports DHCP Discover, but no replies.

lxd.log container-config.txt container-log.txt

This question originates from the open-source project: lxc/lxd

10 answers

  • weixin_39934675, 4 months ago

    ls: cannot access '/proc/sys/net/bridge/bridge-nf-call-iptables': No such file or directory

    I completely reset the firewall rules and restarted lxd. I also destroyed and re-created the container.

    
    Chain INPUT (policy ACCEPT)
    target     prot opt source               destination         
    ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:53 /* generated for LXD network lxdbr0 */
    ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:53 /* generated for LXD network lxdbr0 */
    ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp dpt:67 /* generated for LXD network lxdbr0 */
    
    Chain FORWARD (policy ACCEPT)
    target     prot opt source               destination         
    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* generated for LXD network lxdbr0 */
    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            /* generated for LXD network lxdbr0 */
    
    Chain OUTPUT (policy ACCEPT)
    target     prot opt source               destination         
    ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp spt:53 /* generated for LXD network lxdbr0 */
    ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp spt:53 /* generated for LXD network lxdbr0 */
    ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            udp spt:67 /* generated for LXD network lxdbr0 */
    
    Chain TCP (0 references)
    target     prot opt source               destination         
    
    Chain UDP (0 references)
    target     prot opt source               destination         
    

    Behavior is unchanged.

  • weixin_39688875, 4 months ago

    Hmm, ok, now that's pretty weird. Can you show:

      • lxc network show lxdbr0
      • lxc config show --expanded NAME-OF-CONTAINER
      • ps fauxww
      • tcpdump -ni lxdbr0 as you start a stopped container

  • weixin_39934675, 4 months ago
    
    $ lxc network show lxdbr0
    config:
      ipv4.address: 10.243.191.1/24
      ipv4.nat: "true"
      ipv6.address: fd42:cfbf:5e80:3066::1/64
      ipv6.nat: "true"
    description: ""
    name: lxdbr0
    type: bridge
    used_by:
    - /1.0/containers/test
    managed: true
    
    
    $ lxc config show --expanded test
    architecture: x86_64
    config:
      image.architecture: amd64
      image.description: ubuntu 16.04 LTS amd64 (release) (20180418)
      image.label: release
      image.os: ubuntu
      image.release: xenial
      image.serial: "20180418"
      image.version: "16.04"
      volatile.base_image: e3898cc6c4b53b2943baf340996bf0438af3b85bf43bb1afb3c8b273dd9077e2
      volatile.eth0.hwaddr: 00:16:3e:a1:df:19
      volatile.eth0.name: eth0
      volatile.idmap.base: "0"
      volatile.idmap.next: '[{"Isuid":true,"Isgid":false,"Hostid":100000,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":100000,"Nsid":0,"Maprange":65536}]'
      volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":100000,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":100000,"Nsid":0,"Maprange":65536}]'
      volatile.last_state.power: RUNNING
    devices:
      eth0:
        nictype: bridged
        parent: lxdbr0
        type: nic
      root:
        path: /
        pool: default
        type: disk
    ephemeral: false
    profiles:
    - default
    stateful: false
    description: ""
    
    
    # tcpdump -ni lxdbr0
    tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
    listening on lxdbr0, link-type EN10MB (Ethernet), capture size 262144 bytes
    21:32:47.547688 IP6 fe80::78a1:c2ff:fe5c:fc2b > ff02::16: HBH ICMP6, multicast listener report v2, 7 group record(s), length 148
    21:32:47.547793 IP6 :: > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
    21:32:48.087635 IP6 fe80::78a1:c2ff:fe5c:fc2b > ff02::16: HBH ICMP6, multicast listener report v2, 7 group record(s), length 148
    21:32:48.275690 IP6 :: > ff02::1:ffa1:df19: ICMP6, neighbor solicitation, who has fe80::216:3eff:fea1:df19, length 24
    21:32:48.287677 IP6 :: > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
    21:32:49.044258 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:3e:a1:df:19, length 300
    21:32:49.275735 IP6 fe80::216:3eff:fea1:df19 > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
    21:32:49.275795 IP6 fe80::216:3eff:fea1:df19 > ff02::2: ICMP6, router solicitation, length 16
    21:32:49.435710 IP6 fe80::216:3eff:fea1:df19 > ff02::16: HBH ICMP6, multicast listener report v2, 1 group record(s), length 28
    21:32:52.762591 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:3e:a1:df:19, length 300
    21:32:53.283703 IP6 fe80::216:3eff:fea1:df19 > ff02::2: ICMP6, router solicitation, length 16
    21:32:57.293159 IP6 fe80::216:3eff:fea1:df19 > ff02::2: ICMP6, router solicitation, length 16
    21:32:59.590604 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:3e:a1:df:19, length 300
    21:33:19.958967 IP 0.0.0.0.68 > 255.255.255.255.67: BOOTP/DHCP, Request from 00:16:3e:a1:df:19, length 300
    
    
    USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
    root         2  0.0  0.0      0     0 ?        S    20:43   0:00 [kthreadd]
    root         1  0.1  0.2  38028  6080 ?        Ss   20:43   0:03 /sbin/init
    root       186  0.0  0.1  30768  3632 ?        Ss   20:43   0:00 /lib/systemd/systemd-journald
    root       269  0.0  0.1  44288  3852 ?        Ss   20:43   0:00 /lib/systemd/systemd-udevd
    systemd+   319  0.0  0.1 100324  2568 ?        Ssl  20:43   0:00 /lib/systemd/systemd-timesyncd
    syslog     518  0.0  0.1 256392  3204 ?        Ssl  20:43   0:00 /usr/sbin/rsyslogd -n
    root       552  0.0  0.0   4396   696 ?        Ss   20:43   0:00 /usr/sbin/acpid
    root       561  0.0  0.1  29008  2964 ?        Ss   20:43   0:00 /usr/sbin/cron -f
    root       570  0.0  0.0 160904  1584 ?        Ssl  20:43   0:00 /usr/bin/lxcfs /var/lib/lxcfs/
    root       809  0.0  0.3  65508  6180 ?        Ss   20:43   0:00 /usr/sbin/sshd -D
    root       939  0.0  0.3  90448  6284 ?        Ss   20:43   0:00  \_ sshd: MY_USER [priv]
    MY_USER    987  0.0  0.1  90448  3944 ?        S    20:43   0:00      \_ sshd: MY_USER/0
    MY_USER   3708  0.0  0.1  37508  3508 ?        Rs   21:39   0:00          \_ ps fauxww
    MY_USER    972  0.0  0.1  20716  2536 ?        S    20:43   0:00 /usr/lib/gamin/gam_server
    MY_USER    995  0.0  0.1  21416  4004 pts/1    Ss+  20:43   0:00  \_ -bash
    root      1065  0.0  1.6 712948 33804 ?        Ssl  20:43   0:01 /usr/bin/lxd --group lxd --logfile=/var/log/lxd/lxd.log
    lxd       1153  0.0  0.1  52864  2740 ?        S    20:43   0:00 dnsmasq --strict-order --bind-interfaces --pid-file=/var/lib/lxd/networks/lxdbr0/dnsmasq.pid --except-interface=lo --interface=lxdbr0 --quiet-dhcp --quiet-dhcp6 --quiet-ra --listen-address=10.243.191.1 --dhcp-no-override --dhcp-authoritative --dhcp-leasefile=/var/lib/lxd/networks/lxdbr0/dnsmasq.leases --dhcp-hostsfile=/var/lib/lxd/networks/lxdbr0/dnsmasq.hosts --dhcp-range 10.243.191.2,10.243.191.254,1h --listen-address=fd42:cfbf:5e80:3066::1 --enable-ra --dhcp-range ::,constructor:lxdbr0,ra-stateless,ra-names -s lxd -S /lxd/ --conf-file=/var/lib/lxd/networks/lxdbr0/dnsmasq.raw -u lxd
    root      3159  0.0  0.2 102908  6072 ?        Ss   21:32   0:00 [lxc monitor] /var/lib/lxd/containers test
    100000    3175  0.1  0.2  37620  5796 ?        Ss   21:32   0:00  \_ /sbin/init
    100000    3260  0.0  0.1  35272  3344 ?        Ss   21:32   0:00      \_ /lib/systemd/systemd-journald
    100000    3269  0.0  0.1  41720  3356 ?        Ss   21:32   0:00      \_ /lib/systemd/systemd-udevd
    100000    3554  0.0  0.0  20096  1124 ?        Ss   21:37   0:00      \_ /lib/systemd/systemd-logind
    100107    3559  0.0  0.1  42892  3800 ?        Ss   21:37   0:00      \_ /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
    100000    3564  0.0  0.2  65508  5428 ?        Ss   21:37   0:00      \_ /usr/sbin/sshd -D
    100001    3568  0.0  0.1  26044  2100 ?        Ss   21:37   0:00      \_ /usr/sbin/atd -f
    100000    3572  0.0  1.1 147472 22536 ?        Ssl  21:37   0:00      \_ /usr/lib/snapd/snapd
    100000    3577  0.0  0.1  27728  2968 ?        Ss   21:37   0:00      \_ /usr/sbin/cron -f
    100000    3579  0.0  0.3 274488  6296 ?        Ssl  21:37   0:00      \_ /usr/lib/accountsservice/accounts-daemon
    100104    3581  0.0  0.1 186896  3324 ?        Ssl  21:37   0:00      \_ /usr/sbin/rsyslogd -n
    100000    3651  0.0  0.3 277176  6224 ?        Ssl  21:37   0:00      \_ /usr/lib/policykit-1/polkitd --no-debug
    

    I've elided a bunch of kernel threads and some unrelated local processes (mailserver stuff).

  • weixin_39688875, 4 months ago

    Ok, there's nothing obviously wrong here except for the complete lack of response from dnsmasq to any of the traffic (DHCPv4 and IPv6 router solicitation).

    Can you run systemctl restart lxd and then confirm that the dnsmasq process has been restarted?

    Then try restarting the container and see if you get the same tcpdump output. Looking for messages from dnsmasq in /var/log/syslog may also help.
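
One way to confirm the restart, sketched under the assumption that dnsmasq records its PID in the pid file named on its command line above (`/var/lib/lxd/networks/lxdbr0/dnsmasq.pid`). The function name `dnsmasq_restarted` is hypothetical, and the check may need a short delay after the restart before the new pid file appears.

```shell
# Path taken from the --pid-file option in the dnsmasq command line above.
PIDFILE=/var/lib/lxd/networks/lxdbr0/dnsmasq.pid

# Succeeds if the PID in the pid file differs from the PID passed as $1.
# An optional second argument overrides the pid file path.
# Returns 2 if the pid file cannot be read.
dnsmasq_restarted() {
    new_pid=$(cat "${2:-$PIDFILE}" 2>/dev/null) || return 2
    [ -n "$new_pid" ] && [ "$new_pid" != "$1" ]
}

# Intended use (as root):
#   old_pid=$(cat "$PIDFILE")
#   systemctl restart lxd
#   dnsmasq_restarted "$old_pid" && echo "dnsmasq was restarted"
#   grep dnsmasq /var/log/syslog | tail -n 20
```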

  • weixin_39688875, 4 months ago

    Oh, and just to rule out more potential firewall issues, can you also post the output of iptables-save, ip6tables-save and ebtables-save?

  • weixin_39934675, 4 months ago

    That did it:

    iptables-save

    
    *raw
    :PREROUTING ACCEPT [0:0]
    :OUTPUT ACCEPT [0:0]
    -A PREROUTING -m rpfilter --invert -j DROP
    COMMIT
    

    I dropped that rule and things are humming along now. I'll have to take a look at what all my firewall scripts are actually doing.

    It appears some NixOS folks ran into this as well (https://github.com/NixOS/nixpkgs/issues/10101). They considered modifying their firewall script to allow DHCP just before the rpfilter rule in the raw table. If it were just my weirdo script I'd fix it and move on, but it looks like other people are potentially running into this.

    Thank you for your assistance in tracking this down.
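
For readers who hit the same symptom, here is a minimal sketch of how the offending rule could be spotted and worked around without dropping it. Both function names are hypothetical. The ACCEPT rule is an assumption modelled on the NixOS idea of allowing DHCP just before the rpfilter rule, and it has not been tested on the reporter's host. The Discover packets in the captures above come from 0.0.0.0, which is presumably why the reverse-path check rejected them.

```shell
# Read `iptables-save` output on stdin and print every raw-table rule
# that drops packets via the rpfilter match.
find_rpfilter_drops() {
    awk '
        /^\*/ { table = substr($0, 2) }    # table header: "*raw", "*filter", ...
        table == "raw" && /-m rpfilter/ && /-j DROP/ { print }
    '
}

# Hypothetical workaround (run as root): accept DHCP requests arriving on
# the bridge ahead of the rpfilter rule. "lxdbr0" is the bridge from this
# report; inserting at position 1 places the rule first in PREROUTING.
allow_dhcp_before_rpfilter() {
    iptables -t raw -I PREROUTING 1 -i lxdbr0 -p udp --sport 68 --dport 67 -j ACCEPT
}

# Intended use (as root):
#   iptables-save | find_rpfilter_drops
#   allow_dhcp_before_rpfilter
```

Inserting an exception keeps reverse-path filtering in place for all other traffic, whereas deleting the rule removes that protection entirely.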

  • weixin_39688875, 4 months ago

    Oh, a raw table rule, that's pretty uncommon :)

    Good that we figured it out.

  • weixin_39688875, 4 months ago

    In the past such symptoms would usually indicate a firewall.

    Can you paste the output of iptables -L -n -v?

  • weixin_39934675, 4 months ago
    
    Chain INPUT (policy DROP 0 packets, 0 bytes)
     pkts bytes target     prot opt in     out     source               destination         
        0     0 ACCEPT     tcp  --  lxdbr0 *       0.0.0.0/0            0.0.0.0/0            tcp dpt:53 /* generated for LXD network lxdbr0 */
        0     0 ACCEPT     udp  --  lxdbr0 *       0.0.0.0/0            0.0.0.0/0            udp dpt:53 /* generated for LXD network lxdbr0 */
        0     0 ACCEPT     udp  --  lxdbr0 *       0.0.0.0/0            0.0.0.0/0            udp dpt:67 /* generated for LXD network lxdbr0 */
    53848   17M ACCEPT     all  --  lo     *       0.0.0.0/0            0.0.0.0/0           
     338K  295M ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            ctstate RELATED,ESTABLISHED
    10524  509K DROP       all  --  *      *       0.0.0.0/0            0.0.0.0/0            ctstate INVALID
     7343 1109K UDP        udp  --  *      *       0.0.0.0/0            0.0.0.0/0            ctstate NEW
    16781  810K TCP        tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp flags:0x17/0x02 ctstate NEW
     5926  823K REJECT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-port-unreachable
     9796  404K REJECT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            reject-with tcp-reset
       46  1620 REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-proto-unreachable
    
    Chain FORWARD (policy DROP 0 packets, 0 bytes)
     pkts bytes target     prot opt in     out     source               destination         
        0     0 ACCEPT     all  --  *      lxdbr0  0.0.0.0/0            0.0.0.0/0            /* generated for LXD network lxdbr0 */
        0     0 ACCEPT     all  --  lxdbr0 *       0.0.0.0/0            0.0.0.0/0            /* generated for LXD network lxdbr0 */
        0     0 ACCEPT     all  --  eth0   zt+     0.0.0.0/0            xxxxxxxx/24        state RELATED,ESTABLISHED
        0     0 ACCEPT     all  --  zt+    eth0    xxxxxxxx/24        0.0.0.0/0           
    
    Chain OUTPUT (policy ACCEPT 403K packets, 108M bytes)
     pkts bytes target     prot opt in     out     source               destination         
        0     0 ACCEPT     tcp  --  *      lxdbr0  0.0.0.0/0            0.0.0.0/0            tcp spt:53 /* generated for LXD network lxdbr0 */
        0     0 ACCEPT     udp  --  *      lxdbr0  0.0.0.0/0            0.0.0.0/0            udp spt:53 /* generated for LXD network lxdbr0 */
        0     0 ACCEPT     udp  --  *      lxdbr0  0.0.0.0/0            0.0.0.0/0            udp spt:67 /* generated for LXD network lxdbr0 */
    53848   17M ACCEPT     all  --  *      lo      0.0.0.0/0            0.0.0.0/0           
    
    Chain TCP (1 references)
     pkts bytes target     prot opt in     out     source               destination         
     6989  407K ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            multiport dports 22,25,587,993,443
    
    Chain UDP (1 references)
     pkts bytes target     prot opt in     out     source               destination         
     1417  287K ACCEPT     udp  --  *      *       0.0.0.0/0            0.0.0.0/0            multiport dports 9993
    
  • weixin_39688875, 4 months ago

    There is indeed a firewall of some kind running on this machine, as the INPUT and FORWARD chains have non-default policies (DROP rather than ACCEPT) and there are a number of additional firewall rules in place.

    The counters above look a bit off, though. I'd have expected to see at least one hit for that DHCP request, but the count for INPUT udp/67 is 0, so I wonder what might have dropped it even earlier.
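
A small sketch of how that counter could be pulled out of the listing automatically. The function name `dhcp_input_hits` is illustrative only, and it assumes the `iptables -L -n -v` layout pasted in this thread, where the packet count is the first column.

```shell
# Print the packet counter of every INPUT-chain rule matching "udp dpt:67"
# in `iptables -L -n -v` output read on stdin.
dhcp_input_hits() {
    awk '
        /^Chain / { chain = $2 }                       # track the current chain
        chain == "INPUT" && /udp dpt:67/ { print $1 }  # first column is pkts
    '
}

# Intended use (as root), before and after starting a container:
#   iptables -L -n -v | dhcp_input_hits
```

If the value stays at 0 while Discover packets are visible on the bridge, the traffic is being dropped before it reaches the filter table's INPUT chain.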

    One potential source of the problem would be br-netfilter. If it's enabled on your system, then all interfaces, even those that are part of a bridge, will have your firewall applied to them, in this case likely leading to the packet being dropped.

    On modern kernels, that'd be indicated by /proc/sys/net/bridge/bridge-nf-call-iptables existing on your system and containing 1 to indicate it's enabled (which I believe is the default when the module is loaded).
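
This check can be scripted. The following is a minimal sketch; the function name `bridge_nf_status` and its optional path argument are illustrative, not part of any LXD or kernel tooling.

```shell
# Report whether bridged traffic is passed through iptables.
# Prints "absent" if the sysctl file does not exist, "enabled" if it
# contains 1, and "disabled" otherwise.
# An optional argument overrides the sysctl path (useful for testing).
bridge_nf_status() {
    nf_file="${1:-/proc/sys/net/bridge/bridge-nf-call-iptables}"
    if [ ! -e "$nf_file" ]; then
        echo "absent"
    elif [ "$(cat "$nf_file")" = "1" ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}

# Intended use:
#   bridge_nf_status
```

On the reporter's machine this would print `absent`, consistent with the `ls: cannot access` error quoted elsewhere in the thread.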

    I'm closing this issue as I don't believe there's anything wrong with LXD itself; instead, container traffic is getting blocked by firewalling that's configured by something else. We do monitor closed issues, so if this somehow turns back into an LXD issue, we'll re-open.

