E2E tests regularly run this basic container networking DNS test after building a cluster:
$ kubectl describe pod validate-dns-linux-4p75n -n default completed in 913.265905ms
2020/02/21 16:22:35
Name: validate-dns-linux-4p75n
Namespace: default
Priority: 0
Node: k8s-agentpool1-13396981-vmss000000/10.240.0.34
Start Time: Fri, 21 Feb 2020 16:20:31 +0000
Labels: controller-uid=74e04685-aed4-4943-91d4-17eb49e6cd5d
job-name=validate-dns-linux
Annotations: kubernetes.io/psp: privileged
Status: Running
IP: 10.240.0.52
IPs:
IP: 10.240.0.52
Controlled By: Job/validate-dns-linux
Containers:
validate-bing-google:
Container ID: containerd://9ea0e6c78af111ff70224d4722d9ce6f0f8303e819bddffad3ebdfe3c73ac61d
Image: library/busybox
Image ID: docker.io/library/busybox:6915be4043561d64e0ab0f8f098dc2ac48e077fe23f488ac24b665166898115a
Port: <none>
Host Port: <none>
Command:
sh
-c
until nslookup www.bing.com || nslookup google.com; do echo waiting for DNS resolution; sleep 1; done;
State: Running
Started: Fri, 21 Feb 2020 16:20:35 +0000
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-rnh6k (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
default-token-rnh6k:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-rnh6k
Optional: false
QoS Class: BestEffort
Node-Selectors: beta.kubernetes.io/os=linux
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned default/validate-dns-linux-4p75n to k8s-agentpool1-13396981-vmss000000
Normal Pulling 2m3s kubelet, k8s-agentpool1-13396981-vmss000000 Pulling image "library/busybox"
Normal Pulled 2m kubelet, k8s-agentpool1-13396981-vmss000000 Successfully pulled image "library/busybox"
Normal Created 2m kubelet, k8s-agentpool1-13396981-vmss000000 Created container validate-bing-google
Normal Started 2m kubelet, k8s-agentpool1-13396981-vmss000000 Started container validate-bing-google
On clusters running the Azure-built containerd, the pod above intermittently fails to reach a terminal state with exit code 0:
$ k get nodes -o json
2020/02/21 16:14:45
NAME                                 STATUS   ROLES    AGE   VERSION           INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
k8s-agentpool1-13396981-vmss000000   Ready    <none>   46s   v1.18.0-alpha.5   10.240.0.34    <none>        Ubuntu 16.04.6 LTS   4.15.0-1069-azure   containerd://1.3.2+azure
k8s-agentpool1-13396981-vmss000001   Ready    <none>   46s   v1.18.0-alpha.5   10.240.0.65    <none>        Ubuntu 16.04.6 LTS   4.15.0-1069-azure   containerd://1.3.2+azure
k8s-master-13396981-0                Ready    <none>   46s   v1.18.0-alpha.5   10.255.255.5   <none>        Ubuntu 16.04.6 LTS   4.15.0-1069-azure   containerd://1.3.2+azure
The errors:
$ k logs validate-dns-linux-4p75n -c validate-bing-google -n default
;; connection timed out; no servers could be reached
;; connection timed out; no servers could be reached
waiting for DNS resolution
;; connection timed out; no servers could be reached
;; connection timed out; no servers could be reached
waiting for DNS resolution
;; connection timed out; no servers could be reached
;; connection timed out; no servers could be reached
waiting for DNS resolution
;; connection timed out; no servers could be reached
;; connection timed out; no servers could be reached
waiting for DNS resolution
;; connection timed out; no servers could be reached
;; connection timed out; no servers could be reached
waiting for DNS resolution
;; connection timed out; no servers could be reached
;; connection timed out; no servers could be reached
waiting for DNS resolution
;; connection timed out; no servers could be reached
;; connection timed out; no servers could be reached
waiting for DNS resolution
;; connection timed out; no servers could be reached
;; connection timed out; no servers could be reached
waiting for DNS resolution
;; connection timed out; no servers could be reached
;; connection timed out; no servers could be reached
waiting for DNS resolution
;; connection timed out; no servers could be reached
;; connection timed out; no servers could be reached
waiting for DNS resolution
;; connection timed out; no servers could be reached
We wait up to 2 minutes before throwing an error in E2E.
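For anyone trying to reproduce this outside the E2E suite, the bounded wait can be approximated with a small shell helper. This is only a sketch of the "retry until a deadline" idea, not the actual E2E implementation; the function name `retry_until` and its arguments are made up for illustration.

```shell
#!/bin/sh
# retry_until (hypothetical helper, not part of aks-engine):
# run a command about once per second until it succeeds or the
# timeout (in seconds) has elapsed.
# Usage: retry_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as the command succeeds, 1 if the deadline passes first.
retry_until() {
  timeout_s=$1
  shift
  deadline=$(( $(date +%s) + timeout_s ))
  while :; do
    # Try the command; stop immediately on success.
    if "$@"; then
      return 0
    fi
    # Give up once the deadline has been reached.
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep 1
  done
}
```

Run from inside a pod, something like `retry_until 120 nslookup www.bing.com` would mirror the 2 minute limit described above. Unlike the `until` loop in the Job's command, it gives up with a non-zero status instead of looping forever.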
This question comes from the open-source project: Azure/aks-engine