weixin_39976748
2020-12-02 14:43

Only 1 CPU is used even if more CPUs are available.

Description of problem

Only 1 CPU is used even if more CPUs are available. The setup is k8s + cri-containerd + kata. I launched a Kubernetes container with 4 CPUs, ran `stress --cpu 4 --timeout 60s`, and expected all 4 CPUs to show usage; instead only 1 CPU shows 100%.

Kubernetes resources definition:

```yaml
resources:
  requests:
    memory: "2Gi"
    cpu: "4"
  limits:
    memory: "2Gi"
    cpu: "4"
args:
  - -cpus
  - "4"
```
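A minimal reproduction sketch from the host, assuming `stress` is installed in the container image and a pod name like `testcpu` in namespace `kata` as in the YAML later in this thread (names are illustrative):

```sh
# CPUs the container is actually allowed to run on vs. CPUs present in the guest
kubectl exec -n kata testcpu -- nproc          # prints 1 (unexpected)
kubectl exec -n kata testcpu -- nproc --all    # prints 5

# Load 4 CPUs; only one guest CPU reaches 100% in htop
kubectl exec -n kata testcpu -- stress --cpu 4 --timeout 60s
```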

Expected result

All available CPUs should be used, in this case all 4. Expected the QEMU summary (hwconfig) to show:

Summary:    QEMU Standard PC, 5 x Xeon Gold 6140 2.30GHz, 13.9GB / 2GB RAM
Processors: 5 x Xeon Gold 6140 2.30GHz (HT missing, 5 cores, 5 threads) - Sky Lake-E, 14nm, L3: 36MB

Actual result

htop inside container: (screenshot)

nproc inside container: 1

nproc --all inside container: 5

hwconfig inside container:


Summary:    QEMU Standard PC, 1 x Xeon Gold 6140 2.30GHz, 13.9GB / 2GB RAM
System:     QEMU Standard PC, C-xK/2/0
Processors: 1 x Xeon Gold 6140 2.30GHz (HT missing, 7 cores, 7 threads) - Sky Lake-E, 14nm, L3: 36MB
Memory:     13.9GB / 2GB RAM == 1 x 2GB
Disk:       sda (virtio_scsi0): 107GB (0%) == 1 x 107GB QEMU-QEMU-HARDDISK
Disk:       sdb (virtio_scsi0): 107GB (3%) == 1 x 107GB QEMU-QEMU-HARDDISK
Disk-Control:   00:01.1: Bochs/Intel 82371SB PIIX3 ATA/33
Disk-Control:   virtio-pci1: Red Hat, . Virtio SCSI
Network:    00:07.0 (virtio-pci4): Red Hat Virtio Network Device
Chipset:    Intel 440FX (Natoma), 82371SB (PIIX3)
OS:     RHEL Server 7.8, Linux 4.19.86-7.1.container x86_64, 64-bit
BIOS:       SeaBIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014, rev 0.0

lscpu inside container:


Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                5
On-line CPU(s) list:   0-4
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             5
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 85
Model name:            Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
Stepping:              4
CPU MHz:               2294.608
BogoMIPS:              4589.21
Hypervisor vendor:     KVM
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              4096K
L3 cache:              16384K
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 arat pku md_clear

kata-collect-data.sh

Show kata-collect-data.sh details

# Meta details

Running `kata-collect-data.sh` version `1.11.1 (commit 984ccea48a1badd2863a68f91c8d95e229898095)` at `2020-06-29.06:00:53.076170759+0000`.

---

Runtime is `/usr/bin/kata-runtime`.

# `kata-env`

Output of "`/usr/bin/kata-runtime kata-env`":

toml
[Meta]
  Version = "1.0.23"

[Runtime]
  Debug = false
  Trace = false
  DisableGuestSeccomp = true
  DisableNewNetNs = false
  SandboxCgroupOnly = false
  Path = "/usr/bin/kata-runtime"
  [Runtime.Version]
    Semver = "1.10.2"
    Commit = "29c489e64d5b76e7624e8fbcc40c26402f95bf64"
    OCI = "1.0.1-dev"
  [Runtime.Config]
    Path = "/etc/kata-containers/configuration.toml"

[Hypervisor]
  MachineType = "pc"
  Version = "QEMU emulator version 4.1.0\nCopyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers"
  Path = "/usr/bin/qemu-vanilla-system-x86_64"
  BlockDeviceDriver = "virtio-scsi"
  EntropySource = "/dev/urandom"
  Msize9p = 8192
  MemorySlots = 10
  Debug = false
  UseVSock = false
  SharedFS = "virtio-9p"

[Image]
  Path = "/usr/share/kata-containers/kata-containers-image_clearlinux_1.10.2_agent_9ecf1b6c06.img"

[Kernel]
  Path = "/usr/share/kata-containers/vmlinuz-4.19.86.60-7.1.container"
  Parameters = "systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket vsyscall=emulate init=/usr/bin/kata-agent"

[Initrd]
  Path = ""

[Proxy]
  Type = "kataProxy"
  Version = "kata-proxy version 1.10.2-4fe00a9"
  Path = "/usr/libexec/kata-containers/kata-proxy"
  Debug = false

[Shim]
  Type = "kataShim"
  Version = "kata-shim version 1.10.2-f75e584"
  Path = "/usr/libexec/kata-containers/kata-shim"
  Debug = false

[Agent]
  Type = "kata"
  Debug = false
  Trace = false
  TraceMode = ""
  TraceType = ""

[Host]
  Kernel = "3.10.0-1062.12.1.el7.YAHOO.20200205.52.x86_64"
  Architecture = "amd64"
  VMContainerCapable = true
  SupportVSocks = true
  [Host.Distro]
    Name = "Red Hat Enterprise Linux Server"
    Version = "7.7"
  [Host.CPU]
    Vendor = "GenuineIntel"
    Model = "Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz"

[Netmon]
  Version = "kata-netmon version 1.10.2"
  Path = "/usr/libexec/kata-containers/kata-netmon"
  Debug = false
  Enable = false
---

# Runtime config files

## Runtime default config files

/etc/kata-containers/configuration.toml
/usr/share/defaults/kata-containers/configuration.toml
## Runtime config file contents

Output of "`cat "/etc/kata-containers/configuration.toml"`":
toml
# Copyright (c) 2017-2019 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#

# XXX: WARNING: this file is auto-generated.
# XXX:
# XXX: Source file: "cli/config/configuration-qemu.toml.in"
# XXX: Project:
# XXX:   Name: Kata Containers
# XXX:   Type: kata

[hypervisor.qemu]
path = "/usr/bin/qemu-vanilla-system-x86_64"
kernel = "/usr/share/kata-containers/vmlinuz.container"
image = "/usr/share/kata-containers/kata-containers.img"
machine_type = "pc"

# Optional space-separated list of options to pass to the guest kernel.
# For example, use `kernel_params = "vsyscall=emulate"` if you are having
# trouble running pre-2.15 glibc.
#
# WARNING: - any parameter specified here will take priority over the default
# parameter value of the same name used to start the virtual machine.
# Do not set values here unless you understand the impact of doing so as you
# may stop the virtual machine from booting.
# To see the list of default parameters, enable hypervisor debug, create a
# container and look for 'default-kernel-parameters' log entries.
kernel_params = "vsyscall=emulate init=/usr/bin/kata-agent"

# Path to the firmware.
# If you want that qemu uses the default firmware leave this option empty
firmware = ""

# Machine accelerators
# comma-separated list of machine accelerators to pass to the hypervisor.
# For example, `machine_accelerators = "nosmm,nosmbus,nosata,nopit,static-prt,nofw"`
machine_accelerators=""

# Default number of vCPUs per SB/VM:
# unspecified or 0                --> will be set to 1
# < 0                             --> will be set to the actual number of physical cores
# > 0 <= number of physical cores --> will be set to the specified number
# > number of physical cores      --> will be set to the actual number of physical cores
default_vcpus = 0

# Default maximum number of vCPUs per SB/VM:
# unspecified or == 0             --> will be set to the actual number of physical cores or to the maximum number
#                                     of vCPUs supported by KVM if that number is exceeded
# > 0 <= number of physical cores --> will be set to the specified number
# > number of physical cores      --> will be set to the actual number of physical cores or to the maximum number
#                                     of vCPUs supported by KVM if that number is exceeded
# WARNING: Depending of the architecture, the maximum number of vCPUs supported by KVM is used when
# the actual number of physical cores is greater than it.
# WARNING: Be aware that this value impacts the virtual machine's memory footprint and CPU
# the hotplug functionality. For example, `default_maxvcpus = 240` specifies that until 240 vCPUs
# can be added to a SB/VM, but the memory footprint will be big. Another example, with
# `default_maxvcpus = 8` the memory footprint will be small, but 8 will be the maximum number of
# vCPUs supported by the SB/VM. In general, we recommend that you do not edit this variable,
# unless you know what are you doing.
default_maxvcpus = 0

# Bridges can be used to hot plug devices.
# Limitations:
# * Currently only pci bridges are supported
# * Until 30 devices per bridge can be hot plugged.
# * Until 5 PCI bridges can be cold plugged per VM.
#   This limitation could be a bug in qemu or in the kernel
# Default number of bridges per SB/VM:
# unspecified or 0   --> will be set to 1
# > 1 <= 5           --> will be set to the specified number
# > 5                --> will be set to 5
default_bridges = 1

# Default memory size in MiB for SB/VM.
# If unspecified then it will be set 2048 MiB.
default_memory = 2048

#
# Default memory slots per SB/VM.
# If unspecified then it will be set 10.
# This is will determine the times that memory will be hotadded to sandbox/VM.
#memory_slots = 10

# The size in MiB will be plused to max memory of hypervisor.
# It is the memory address space for the NVDIMM devie.
# If set block storage driver (block_device_driver) to "nvdimm",
# should set memory_offset to the size of block device.
# Default 0
#memory_offset = 0

# Disable block device from being used for a container's rootfs.
# In case of a storage driver like devicemapper where a container's
# root file system is backed by a block device, the block device is passed
# directly to the hypervisor for performance reasons.
# This flag prevents the block device from being passed to the hypervisor,
# 9pfs is used instead to pass the rootfs.
disable_block_device_use = false

# Shared file system type:
#   - virtio-9p (default)
#   - virtio-fs
shared_fs = "virtio-9p"

# Path to vhost-user-fs daemon.
virtio_fs_daemon = "/opt/kata/bin/virtiofsd"

# Default size of DAX cache in MiB
virtio_fs_cache_size = 1024

# Extra args for virtiofsd daemon
#
# Format example:
#   ["-o", "arg1=xxx,arg2", "-o", "hello world", "--arg3=yyy"]
#
# see `virtiofsd -h` for possible options.
virtio_fs_extra_args = []

# Cache mode:
#
#  - none
#    Metadata, data, and pathname lookup are not cached in guest. They are
#    always fetched from host and any changes are immediately pushed to host.
#
#  - auto
#    Metadata and pathname lookup cache expires after a configured amount of
#    time (default is 1 second). Data is cached while the file is open (close
#    to open consistency).
#
#  - always
#    Metadata, data, and pathname lookup are cached in guest and never expire.
virtio_fs_cache = "always"

# Block storage driver to be used for the hypervisor in case the container
# rootfs is backed by a block device. This is virtio-scsi, virtio-blk
# or nvdimm.
block_device_driver = "virtio-scsi"

# Specifies cache-related options will be set to block devices or not.
# Default false
#block_device_cache_set = true

# Specifies cache-related options for block devices.
# Denotes whether use of O_DIRECT (bypass the host page cache) is enabled.
# Default false
#block_device_cache_direct = true

# Specifies cache-related options for block devices.
# Denotes whether flush requests for the device are ignored.
# Default false
#block_device_cache_noflush = true

# Enable iothreads (data-plane) to be used. This causes IO to be
# handled in a separate IO thread. This is currently only implemented
# for SCSI.
#
enable_iothreads = false

# Enable pre allocation of VM RAM, default false
# Enabling this will result in lower container density
# as all of the memory will be allocated and locked
# This is useful when you want to reserve all the memory
# upfront or in the cases where you want memory latencies
# to be very predictable
# Default false
enable_mem_prealloc = true

# Enable huge pages for VM RAM, default false
# Enabling this will result in the VM memory
# being allocated using huge pages.
# This is useful when you want to use vhost-user network
# stacks within the container. This will automatically
# result in memory pre allocation
#enable_hugepages = true

# Enable file based guest memory support. The default is an empty string which
# will disable this feature. In the case of virtio-fs, this is enabled
# automatically and '/dev/shm' is used as the backing folder.
# This option will be ignored if VM templating is enabled.
#file_mem_backend = ""

# Enable swap of vm memory. Default false.
# The behaviour is undefined if mem_prealloc is also set to true
#enable_swap = true

# This option changes the default hypervisor and kernel parameters
# to enable debug output where available. This extra output is added
# to the proxy logs, but only when proxy debug is also enabled.
#
# Default false
#enable_debug = true

# Disable the customizations done in the runtime when it detects
# that it is running on top a VMM. This will result in the runtime
# behaving as it would when running on bare metal.
#
#disable_nesting_checks = true

# This is the msize used for 9p shares. It is the number of bytes
# used for 9p packet payload.
#msize_9p = 8192

# If true and vsocks are supported, use vsocks to communicate directly
# with the agent and no proxy is started, otherwise use unix
# sockets and start a proxy to communicate with the agent.
# Default false
#use_vsock = true

# VFIO devices are hotplugged on a bridge by default.
# Enable hotplugging on root bus. This may be required for devices with
# a large PCI bar, as this is a current limitation with hotplugging on
# a bridge. This value is valid for "pc" machine type.
# Default false
#hotplug_vfio_on_root_bus = true

# If vhost-net backend for virtio-net is not desired, set to true. Default is false, which trades off
# security (vhost-net runs ring0) for network I/O performance.
#disable_vhost_net = true

#
# Default entropy source.
# The path to a host source of entropy (including a real hardware RNG)
# /dev/urandom and /dev/random are two main options.
# Be aware that /dev/random is a blocking source of entropy.  If the host
# runs out of entropy, the VMs boot time will increase leading to get startup
# timeouts.
# The source of entropy /dev/urandom is non-blocking and provides a
# generally acceptable source of entropy. It should work well for pretty much
# all practical purposes.
#entropy_source= "/dev/urandom"

# Path to OCI hook binaries in the *guest rootfs*.
# This does not affect host-side hooks which must instead be added to
# the OCI spec passed to the runtime.
#
# You can create a rootfs with hooks by customizing the osbuilder scripts:
# https://github.com/kata-containers/osbuilder
#
# Hooks must be stored in a subdirectory of guest_hook_path according to their
# hook type, i.e. "guest_hook_path/{prestart,postart,poststop}".
# The agent will scan these directories for executable files and add them, in
# lexicographical order, to the lifecycle of the guest container.
# Hooks are executed in the runtime namespace of the guest. See the official documentation:
# https://github.com/opencontainers/runtime-spec/blob/v1.0.1/config.md#posix-platform-hooks
# Warnings will be logged if any error is encountered will scanning for hooks,
# but it will not abort container execution.
#guest_hook_path = "/usr/share/oci/hooks"

[factory]
# VM templating support. Once enabled, new VMs are created from template
# using vm cloning. They will share the same initial kernel, initramfs and
# agent memory by mapping it readonly. It helps speeding up new container
# creation and saves a lot of memory if there are many kata containers running
# on the same host.
#
# When disabled, new VMs are created from scratch.
#
# Note: Requires "initrd=" to be set ("image=" is not supported).
#
# Default false
#enable_template = true

# Specifies the path of template.
#
# Default "/run/vc/vm/template"
#template_path = "/run/vc/vm/template"

# The number of caches of VMCache:
# unspecified or == 0   --> VMCache is disabled
# > 0                   --> will be set to the specified number
#
# VMCache is a function that creates VMs as caches before using it.
# It helps speed up new container creation.
# The function consists of a server and some clients communicating
# through Unix socket.  The protocol is gRPC in protocols/cache/cache.proto.
# The VMCache server will create some VMs and cache them by factory cache.
# It will convert the VM to gRPC format and transport it when gets
# requestion from clients.
# Factory grpccache is the VMCache client.  It will request gRPC format
# VM and convert it back to a VM.  If VMCache function is enabled,
# kata-runtime will request VM from factory grpccache when it creates
# a new sandbox.
#
# Default 0
#vm_cache_number = 0

# Specify the address of the Unix socket that is used by VMCache.
#
# Default /var/run/kata-containers/cache.sock
#vm_cache_endpoint = "/var/run/kata-containers/cache.sock"

[proxy.kata]
path = "/usr/libexec/kata-containers/kata-proxy"

# If enabled, proxy messages will be sent to the system log
# (default: disabled)
#enable_debug = true

[shim.kata]
path = "/usr/libexec/kata-containers/kata-shim"

# If enabled, shim messages will be sent to the system log
# (default: disabled)
#enable_debug = true

# If enabled, the shim will create opentracing.io traces and spans.
# (See https://www.jaegertracing.io/docs/getting-started).
#
# Note: By default, the shim runs in a separate network namespace. Therefore,
# to allow it to send trace details to the Jaeger agent running on the host,
# it is necessary to set 'disable_new_netns=true' so that it runs in the host
# network namespace.
#
# (default: disabled)
#enable_tracing = true

[agent.kata]
# If enabled, make the agent display debug-level messages.
# (default: disabled)
#enable_debug = true

# Enable agent tracing.
#
# If enabled, the default trace mode is "dynamic" and the
# default trace type is "isolated". The trace mode and type are set
# explicity with the `trace_type=` and `trace_mode=` options.
#
# Notes:
#
# - Tracing is ONLY enabled when `enable_tracing` is set: explicitly
#   setting `trace_mode=` and/or `trace_type=` without setting `enable_tracing`
#   will NOT activate agent tracing.
#
# - See https://github.com/kata-containers/agent/blob/master/TRACING.md for
#   full details.
#
# (default: disabled)
#enable_tracing = true
#
#trace_mode = "dynamic"
#trace_type = "isolated"

# Comma separated list of kernel modules and their parameters.
# These modules will be loaded in the guest kernel using modprobe(8).
# The following example can be used to load two kernel modules with parameters
#  - kernel_modules=["e1000e InterruptThrottleRate=3000,3000,3000 EEE=1", "i915 enable_ppgtt=0"]
# The first word is considered as the module name and the rest as its parameters.
# Container will not be started when:
#  * A kernel module is specified and the modprobe command is not installed in the guest
#    or it fails loading the module.
#  * The module is not available in the guest or it doesn't met the guest kernel
#    requirements, like architecture and version.
#
kernel_modules=[]


[netmon]
# If enabled, the network monitoring process gets started when the
# sandbox is created. This allows for the detection of some additional
# network being added to the existing network namespace, after the
# sandbox has been created.
# (default: disabled)
#enable_netmon = true

# Specify the path to the netmon binary.
path = "/usr/libexec/kata-containers/kata-netmon"

# If enabled, netmon messages will be sent to the system log
# (default: disabled)
#enable_debug = true

[runtime]
# If enabled, the runtime will log additional debug messages to the
# system log
# (default: disabled)
#enable_debug = true
#
# Internetworking model
# Determines how the VM should be connected to the
# the container network interface
# Options:
#
#   - macvtap
#     Used when the Container network interface can be bridged using
#     macvtap.
#
#   - none
#     Used when customize network. Only creates a tap device. No veth pair.
#
#   - tcfilter
#     Uses tc filter rules to redirect traffic from the network interface
#     provided by plugin to a tap interface connected to the VM.
#
internetworking_model="tcfilter"

# disable guest seccomp
# Determines whether container seccomp profiles are passed to the virtual
# machine and applied by the kata agent. If set to true, seccomp is not applied
# within the guest
# (default: true)
disable_guest_seccomp=true

# If enabled, the runtime will create opentracing.io traces and spans.
# (See https://www.jaegertracing.io/docs/getting-started).
# (default: disabled)
#enable_tracing = true

# If enabled, the runtime will not create a network namespace for shim and hypervisor processes.
# This option may have some potential impacts to your host. It should only be used when you know what you're doing.
# `disable_new_netns` conflicts with `enable_netmon`
# `disable_new_netns` conflicts with `internetworking_model=tcfilter` and `internetworking_model=macvtap`. It works only
# with `internetworking_model=none`. The tap device will be in the host network namespace and can connect to a bridge
# (like OVS) directly.
# If you are using docker, `disable_new_netns` only works with `docker run --net=none`
# (default: false)
#disable_new_netns = true

# if enabled, the runtime will add all the kata processes inside one dedicated cgroup.
# The container cgroups in the host are not created, just one single cgroup per sandbox.
# The sandbox cgroup is not constrained by the runtime
# The runtime caller is free to restrict or collect cgroup stats of the overall Kata sandbox.
# The sandbox cgroup path is the parent cgroup of a container with the PodSandbox annotation.
# See: https://godoc.org/github.com/kata-containers/runtime/virtcontainers#ContainerType
sandbox_cgroup_only=false

# Enabled experimental feature list, format: ["a", "b"].
# Experimental features are features not stable enough for production,
# They may break compatibility, and are prepared for a big version bump.
# Supported experimental features:
# 1. "newstore": new persist storage driver which breaks backward compatibility,
#               expected to move out of experimental in 2.0.0.
# (default: [])
experimental=[]
Config file `/opt/kata/share/defaults/kata-containers/configuration.toml` not found

Output of "`cat "/usr/share/defaults/kata-containers/configuration.toml"`":
toml
# Copyright (c) 2017-2019 Intel Corporation
#
# SPDX-License-Identifier: Apache-2.0
#

# XXX: WARNING: this file is auto-generated.
# XXX:
# XXX: Source file: "cli/config/configuration-qemu.toml.in"
# XXX: Project:
# XXX:   Name: Kata Containers
# XXX:   Type: kata

[hypervisor.qemu]
path = "/usr/bin/qemu-vanilla-system-x86_64"
kernel = "/usr/share/kata-containers/vmlinuz.container"
image = "/usr/share/kata-containers/kata-containers.img"
machine_type = "pc"

# Optional space-separated list of options to pass to the guest kernel.
# For example, use `kernel_params = "vsyscall=emulate"` if you are having
# trouble running pre-2.15 glibc.
#
# WARNING: - any parameter specified here will take priority over the default
# parameter value of the same name used to start the virtual machine.
# Do not set values here unless you understand the impact of doing so as you
# may stop the virtual machine from booting.
# To see the list of default parameters, enable hypervisor debug, create a
# container and look for 'default-kernel-parameters' log entries.
kernel_params = ""

# Path to the firmware.
# If you want that qemu uses the default firmware leave this option empty
firmware = ""

# Machine accelerators
# comma-separated list of machine accelerators to pass to the hypervisor.
# For example, `machine_accelerators = "nosmm,nosmbus,nosata,nopit,static-prt,nofw"`
machine_accelerators=""

# Default number of vCPUs per SB/VM:
# unspecified or 0                --> will be set to 1
# < 0                             --> will be set to the actual number of physical cores
# > 0 <= number of physical cores --> will be set to the specified number
# > number of physical cores      --> will be set to the actual number of physical cores
default_vcpus = 1

# Default maximum number of vCPUs per SB/VM:
# unspecified or == 0             --> will be set to the actual number of physical cores or to the maximum number
#                                     of vCPUs supported by KVM if that number is exceeded
# > 0 <= number of physical cores --> will be set to the specified number
# > number of physical cores      --> will be set to the actual number of physical cores or to the maximum number
#                                     of vCPUs supported by KVM if that number is exceeded
# WARNING: Depending of the architecture, the maximum number of vCPUs supported by KVM is used when
# the actual number of physical cores is greater than it.
# WARNING: Be aware that this value impacts the virtual machine's memory footprint and CPU
# the hotplug functionality. For example, `default_maxvcpus = 240` specifies that until 240 vCPUs
# can be added to a SB/VM, but the memory footprint will be big. Another example, with
# `default_maxvcpus = 8` the memory footprint will be small, but 8 will be the maximum number of
# vCPUs supported by the SB/VM. In general, we recommend that you do not edit this variable,
# unless you know what are you doing.
default_maxvcpus = 0

# Bridges can be used to hot plug devices.
# Limitations:
# * Currently only pci bridges are supported
# * Until 30 devices per bridge can be hot plugged.
# * Until 5 PCI bridges can be cold plugged per VM.
#   This limitation could be a bug in qemu or in the kernel
# Default number of bridges per SB/VM:
# unspecified or 0   --> will be set to 1
# > 1 <= 5           --> will be set to the specified number
# > 5                --> will be set to 5
default_bridges = 1

# Default memory size in MiB for SB/VM.
# If unspecified then it will be set 2048 MiB.
default_memory = 2048
#
# Default memory slots per SB/VM.
# If unspecified then it will be set 10.
# This is will determine the times that memory will be hotadded to sandbox/VM.
#memory_slots = 10

# The size in MiB will be plused to max memory of hypervisor.
# It is the memory address space for the NVDIMM devie.
# If set block storage driver (block_device_driver) to "nvdimm",
# should set memory_offset to the size of block device.
# Default 0
#memory_offset = 0

# Disable block device from being used for a container's rootfs.
# In case of a storage driver like devicemapper where a container's 
# root file system is backed by a block device, the block device is passed
# directly to the hypervisor for performance reasons. 
# This flag prevents the block device from being passed to the hypervisor, 
# 9pfs is used instead to pass the rootfs.
disable_block_device_use = false

# Shared file system type:
#   - virtio-9p (default)
#   - virtio-fs
shared_fs = "virtio-9p"

# Path to vhost-user-fs daemon.
virtio_fs_daemon = "/usr/bin/virtiofsd"

# Default size of DAX cache in MiB
virtio_fs_cache_size = 1024

# Extra args for virtiofsd daemon
#
# Format example:
#   ["-o", "arg1=xxx,arg2", "-o", "hello world", "--arg3=yyy"]
#
# see `virtiofsd -h` for possible options.
virtio_fs_extra_args = []

# Cache mode:
#
#  - none
#    Metadata, data, and pathname lookup are not cached in guest. They are
#    always fetched from host and any changes are immediately pushed to host.
#
#  - auto
#    Metadata and pathname lookup cache expires after a configured amount of
#    time (default is 1 second). Data is cached while the file is open (close
#    to open consistency).
#
#  - always
#    Metadata, data, and pathname lookup are cached in guest and never expire.
virtio_fs_cache = "always"

# Block storage driver to be used for the hypervisor in case the container
# rootfs is backed by a block device. This is virtio-scsi, virtio-blk
# or nvdimm.
block_device_driver = "virtio-scsi"

# Specifies cache-related options will be set to block devices or not.
# Default false
#block_device_cache_set = true

# Specifies cache-related options for block devices.
# Denotes whether use of O_DIRECT (bypass the host page cache) is enabled.
# Default false
#block_device_cache_direct = true

# Specifies cache-related options for block devices.
# Denotes whether flush requests for the device are ignored.
# Default false
#block_device_cache_noflush = true

# Enable iothreads (data-plane) to be used. This causes IO to be
# handled in a separate IO thread. This is currently only implemented
# for SCSI.
#
enable_iothreads = false

# Enable pre allocation of VM RAM, default false
# Enabling this will result in lower container density
# as all of the memory will be allocated and locked
# This is useful when you want to reserve all the memory
# upfront or in the cases where you want memory latencies
# to be very predictable
# Default false
#enable_mem_prealloc = true

# Enable huge pages for VM RAM, default false
# Enabling this will result in the VM memory
# being allocated using huge pages.
# This is useful when you want to use vhost-user network
# stacks within the container. This will automatically 
# result in memory pre allocation
#enable_hugepages = true

# Enable file based guest memory support. The default is an empty string which
# will disable this feature. In the case of virtio-fs, this is enabled
# automatically and '/dev/shm' is used as the backing folder.
# This option will be ignored if VM templating is enabled.
#file_mem_backend = ""

# Enable swap of vm memory. Default false.
# The behaviour is undefined if mem_prealloc is also set to true
#enable_swap = true

# This option changes the default hypervisor and kernel parameters
# to enable debug output where available. This extra output is added
# to the proxy logs, but only when proxy debug is also enabled.
# 
# Default false
#enable_debug = true

# Disable the customizations done in the runtime when it detects
# that it is running on top a VMM. This will result in the runtime
# behaving as it would when running on bare metal.
# 
#disable_nesting_checks = true

# This is the msize used for 9p shares. It is the number of bytes 
# used for 9p packet payload.
#msize_9p = 8192

# If true and vsocks are supported, use vsocks to communicate directly
# with the agent and no proxy is started, otherwise use unix
# sockets and start a proxy to communicate with the agent.
# Default false
#use_vsock = true

# VFIO devices are hotplugged on a bridge by default. 
# Enable hotplugging on root bus. This may be required for devices with
# a large PCI bar, as this is a current limitation with hotplugging on 
# a bridge. This value is valid for "pc" machine type.
# Default false
#hotplug_vfio_on_root_bus = true

# If vhost-net backend for virtio-net is not desired, set to true. Default is false, which trades off
# security (vhost-net runs ring0) for network I/O performance. 
#disable_vhost_net = true

#
# Default entropy source.
# The path to a host source of entropy (including a real hardware RNG)
# /dev/urandom and /dev/random are two main options.
# Be aware that /dev/random is a blocking source of entropy.  If the host
# runs out of entropy, the VMs boot time will increase leading to get startup
# timeouts.
# The source of entropy /dev/urandom is non-blocking and provides a
# generally acceptable source of entropy. It should work well for pretty much
# all practical purposes.
#entropy_source= "/dev/urandom"

# Path to OCI hook binaries in the *guest rootfs*.
# This does not affect host-side hooks which must instead be added to
# the OCI spec passed to the runtime.
#
# You can create a rootfs with hooks by customizing the osbuilder scripts:
# https://github.com/kata-containers/osbuilder
#
# Hooks must be stored in a subdirectory of guest_hook_path according to their
# hook type, i.e. "guest_hook_path/{prestart,postart,poststop}".
# The agent will scan these directories for executable files and add them, in
# lexicographical order, to the lifecycle of the guest container.
# Hooks are executed in the runtime namespace of the guest. See the official documentation:
# https://github.com/opencontainers/runtime-spec/blob/v1.0.1/config.md#posix-platform-hooks
# Warnings will be logged if any error is encountered will scanning for hooks,
# but it will not abort container execution.
#guest_hook_path = "/usr/share/oci/hooks"

[factory]
# VM templating support. Once enabled, new VMs are created from template
# using vm cloning. They will share the same initial kernel, initramfs and
# agent memory by mapping it readonly. It helps speeding up new container
# creation and saves a lot of memory if there are many kata containers running
# on the same host.
#
# When disabled, new VMs are created from scratch.
#
# Note: Requires "initrd=" to be set ("image=" is not supported).
#
# Default false
#enable_template = true

# Specifies the path of template.
#
# Default "/run/vc/vm/template"
#template_path = "/run/vc/vm/template"

# The number of caches of VMCache:
# unspecified or == 0   --> VMCache is disabled
# > 0                   --> will be set to the specified number
#
# VMCache is a function that creates VMs as caches before using it.
# It helps speed up new container creation.
# The function consists of a server and some clients communicating
# through Unix socket.  The protocol is gRPC in protocols/cache/cache.proto.
# The VMCache server will create some VMs and cache them by factory cache.
# It will convert the VM to gRPC format and transport it when gets
# requestion from clients.
# Factory grpccache is the VMCache client.  It will request gRPC format
# VM and convert it back to a VM.  If VMCache function is enabled,
# kata-runtime will request VM from factory grpccache when it creates
# a new sandbox.
#
# Default 0
#vm_cache_number = 0

# Specify the address of the Unix socket that is used by VMCache.
#
# Default /var/run/kata-containers/cache.sock
#vm_cache_endpoint = "/var/run/kata-containers/cache.sock"

[proxy.kata]
path = "/usr/libexec/kata-containers/kata-proxy"

# If enabled, proxy messages will be sent to the system log
# (default: disabled)
#enable_debug = true

[shim.kata]
path = "/usr/libexec/kata-containers/kata-shim"

# If enabled, shim messages will be sent to the system log
# (default: disabled)
#enable_debug = true

# If enabled, the shim will create opentracing.io traces and spans.
# (See https://www.jaegertracing.io/docs/getting-started).
#
# Note: By default, the shim runs in a separate network namespace. Therefore,
# to allow it to send trace details to the Jaeger agent running on the host,
# it is necessary to set 'disable_new_netns=true' so that it runs in the host
# network namespace.
#
# (default: disabled)
#enable_tracing = true

[agent.kata]
# If enabled, make the agent display debug-level messages.
# (default: disabled)
#enable_debug = true

# Enable agent tracing.
#
# If enabled, the default trace mode is "dynamic" and the
# default trace type is "isolated". The trace mode and type are set
# explicity with the `trace_type=` and `trace_mode=` options.
#
# Notes:
#
# - Tracing is ONLY enabled when `enable_tracing` is set: explicitly
#   setting `trace_mode=` and/or `trace_type=` without setting `enable_tracing`
#   will NOT activate agent tracing.
#
# - See https://github.com/kata-containers/agent/blob/master/TRACING.md for
#   full details.
#
# (default: disabled)
#enable_tracing = true
#
#trace_mode = "dynamic"
#trace_type = "isolated"

# Comma separated list of kernel modules and their parameters.
# These modules will be loaded in the guest kernel using modprobe(8).
# The following example can be used to load two kernel modules with parameters
#  - kernel_modules=["e1000e InterruptThrottleRate=3000,3000,3000 EEE=1", "i915 enable_ppgtt=0"]
# The first word is considered as the module name and the rest as its parameters.
# Container will not be started when:
#  * A kernel module is specified and the modprobe command is not installed in the guest
#    or it fails loading the module.
#  * The module is not available in the guest or it doesn't met the guest kernel
#    requirements, like architecture and version.
#
kernel_modules=[]


[netmon]
# If enabled, the network monitoring process gets started when the
# sandbox is created. This allows for the detection of some additional
# network being added to the existing network namespace, after the
# sandbox has been created.
# (default: disabled)
#enable_netmon = true

# Specify the path to the netmon binary.
path = "/usr/libexec/kata-containers/kata-netmon"

# If enabled, netmon messages will be sent to the system log
# (default: disabled)
#enable_debug = true

[runtime]
# If enabled, the runtime will log additional debug messages to the
# system log
# (default: disabled)
#enable_debug = true
#
# Internetworking model
# Determines how the VM should be connected to the
# the container network interface
# Options:
#
#   - macvtap
#     Used when the Container network interface can be bridged using
#     macvtap.
#
#   - none
#     Used when customize network. Only creates a tap device. No veth pair.
#
#   - tcfilter
#     Uses tc filter rules to redirect traffic from the network interface
#     provided by plugin to a tap interface connected to the VM.
#
internetworking_model="tcfilter"

# disable guest seccomp
# Determines whether container seccomp profiles are passed to the virtual
# machine and applied by the kata agent. If set to true, seccomp is not applied
# within the guest
# (default: true)
disable_guest_seccomp=true

# If enabled, the runtime will create opentracing.io traces and spans.
# (See https://www.jaegertracing.io/docs/getting-started).
# (default: disabled)
#enable_tracing = true

# If enabled, the runtime will not create a network namespace for shim and hypervisor processes.
# This option may have some potential impacts to your host. It should only be used when you know what you're doing.
# `disable_new_netns` conflicts with `enable_netmon`
# `disable_new_netns` conflicts with `internetworking_model=tcfilter` and `internetworking_model=macvtap`. It works only
# with `internetworking_model=none`. The tap device will be in the host network namespace and can connect to a bridge
# (like OVS) directly.
# If you are using docker, `disable_new_netns` only works with `docker run --net=none`
# (default: false)
#disable_new_netns = true

# if enabled, the runtime will add all the kata processes inside one dedicated cgroup.
# The container cgroups in the host are not created, just one single cgroup per sandbox.
# The sandbox cgroup is not constrained by the runtime
# The runtime caller is free to restrict or collect cgroup stats of the overall Kata sandbox.
# The sandbox cgroup path is the parent cgroup of a container with the PodSandbox annotation.
# See: https://godoc.org/github.com/kata-containers/runtime/virtcontainers#ContainerType
sandbox_cgroup_only=false

# Enabled experimental feature list, format: ["a", "b"].
# Experimental features are features not stable enough for production,
# They may break compatibility, and are prepared for a big version bump.
# Supported experimental features:
# 1. "newstore": new persist storage driver which breaks backward compatibility,
#               expected to move out of experimental in 2.0.0.
# (default: [])
experimental=[]
---

# KSM throttler

## version

Output of "`/usr/libexec/kata-ksm-throttler/kata-ksm-throttler --version`":

kata-ksm-throttler version 1.10.2-7d69920
Output of "`/usr/lib/systemd/system/kata-ksm-throttler.service --version`":

./kata-collect-data.sh: line 178: /usr/lib/systemd/system/kata-ksm-throttler.service: Permission denied
## systemd service

# Image details
yaml
---
osbuilder:
  url: "https://github.com/kata-containers/osbuilder"
  version: "unknown"
rootfs-creation-time: "2020-03-19T14:22:02.874378806+0000Z"
description: "osbuilder rootfs"
file-format-version: "0.0.2"
architecture: "x86_64"
base-distro:
  name: "Clear"
  version: "32640"
  packages:
    default:
      - "chrony"
      - "iptables-bin"
      - "kmod-bin"
      - "libudev0-shim"
      - "systemd"
      - "util-linux-bin"
    extra:

agent:
  url: "https://github.com/kata-containers/agent"
  name: "kata-agent"
  version: "1.10.2-9ecf1b6c06059d31bc22d469e80d73a70232be04"
  agent-is-init-daemon: "no"
---

# Initrd details

No initrd

---

# Logfiles

## Runtime logs

Recent runtime problems found in system journal:

time="2020-06-29T05:29:27.57784593Z" level=error msg="Invalid command \"kata-collect-data\"" arch=amd64 name=kata-runtime pid=3759572 source=runtime
## Proxy logs

No recent proxy problems found in system journal.

## Shim logs

No recent shim problems found in system journal.

## Throttler logs

No recent throttler problems found in system journal.

---

# Container manager details

No `docker`
Have `kubectl`

## Kubernetes

Output of "`kubectl version`":

Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.9", GitCommit:"2e808b7cb054ee242b68e62455323aa783991f03", GitTreeState:"archive", BuildDate:"2020-01-21T20:41:54Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
The connection to the server 127.0.0.1:9999 was refused - did you specify the right host or port?
Output of "`kubectl config view`":

apiVersion: v1
clusters: []
contexts: []
current-context: ""
kind: Config
preferences: {}
users: []
Output of "`systemctl show kubelet`":

Type=simple
Restart=on-failure
NotifyAccess=none
RestartUSec=100ms
TimeoutStartUSec=1min 30s
TimeoutStopUSec=1min 30s
WatchdogUSec=0
WatchdogTimestamp=Mon 2020-06-29 06:00:44 UTC
WatchdogTimestampMonotonic=5377148030930
StartLimitInterval=10000000
StartLimitBurst=5
StartLimitAction=none
FailureAction=none
PermissionsStartOnly=no
RootDirectoryStartOnly=no
RemainAfterExit=no
GuessMainPID=yes
MainPID=3789183
ControlPID=0
FileDescriptorStoreMax=0
StatusErrno=0
Result=success
ExecMainStartTimestamp=Mon 2020-06-29 06:00:44 UTC
ExecMainStartTimestampMonotonic=5377148030884
ExecMainExitTimestampMonotonic=0
ExecMainPID=3789183
ExecMainCode=0
ExecMainStatus=0
ExecStart={ path=/usr/bin/kubelet ; argv[]=/usr/bin/kubelet $KUBE_LOGTOSTDERR $KUBE_LOG_LEVEL $KUBELET_ADDRESS $KUBELET_PORT $KUBELET_HOSTNAME $KUBE_ALLOW_PRIV $KUBELET_ARGS ; ignore_errors=no ; start_time=[Mon 2020-06-29 06:00:44 UTC] ; stop_time=[n/a] ; pid=3789183 ; code=(null) ; status=0/0 }
Slice=system.slice
ControlGroup=/system.slice/kubelet.service
MemoryCurrent=46546944
TasksCurrent=65
Delegate=no
CPUAccounting=no
CPUShares=18446744073709551615
StartupCPUShares=18446744073709551615
CPUQuotaPerSecUSec=infinity
BlockIOAccounting=no
BlockIOWeight=18446744073709551615
StartupBlockIOWeight=18446744073709551615
MemoryAccounting=no
MemoryLimit=18446744073709551615
DevicePolicy=auto
TasksAccounting=no
TasksMax=18446744073709551615
EnvironmentFile=/etc/sysconfig/kubelet (ignore_errors=yes)
UMask=0022
LimitCPU=18446744073709551615
LimitFSIZE=18446744073709551615
LimitDATA=18446744073709551615
LimitSTACK=18446744073709551615
LimitCORE=18446744073709551615
LimitRSS=18446744073709551615
LimitNOFILE=65536
LimitAS=18446744073709551615
LimitNPROC=767294
LimitMEMLOCK=65536
LimitLOCKS=18446744073709551615
LimitSIGPENDING=767294
LimitMSGQUEUE=819200
LimitNICE=0
LimitRTPRIO=0
LimitRTTIME=18446744073709551615
WorkingDirectory=/var/lib/kubelet
OOMScoreAdjust=0
Nice=0
IOScheduling=0
CPUSchedulingPolicy=0
CPUSchedulingPriority=0
TimerSlackNSec=50000
CPUSchedulingResetOnFork=no
NonBlocking=no
StandardInput=null
StandardOutput=journal
StandardError=inherit
TTYReset=no
TTYVHangup=no
TTYVTDisallocate=no
SyslogPriority=30
SyslogLevelPrefix=yes
SecureBits=0
CapabilityBoundingSet=18446744073709551615
AmbientCapabilities=0
MountFlags=0
PrivateTmp=no
PrivateNetwork=no
PrivateDevices=no
ProtectHome=no
ProtectSystem=no
SameProcessGroup=no
IgnoreSIGPIPE=yes
NoNewPrivileges=no
SystemCallErrorNumber=0
RuntimeDirectoryMode=0755
KillMode=control-group
KillSignal=15
SendSIGKILL=yes
SendSIGHUP=no
Id=kubelet.service
Names=kubelet.service
Requires=kube-proxy.service basic.target -.mount system.slice containerd.service
WantedBy=multi-user.target
Conflicts=shutdown.target
Before=shutdown.target multi-user.target
After=systemd-journald.socket basic.target -.mount system.slice containerd.service
RequiresMountsFor=/var/lib/kubelet
Documentation=https://github.com/kubernetes/kubernetes
Description=Kubernetes Kubelet Server
LoadState=loaded
ActiveState=active
SubState=running
FragmentPath=/usr/lib/systemd/system/kubelet.service
UnitFileState=enabled
UnitFilePreset=disabled
InactiveExitTimestamp=Mon 2020-06-29 06:00:44 UTC
InactiveExitTimestampMonotonic=5377148030945
ActiveEnterTimestamp=Mon 2020-06-29 06:00:44 UTC
ActiveEnterTimestampMonotonic=5377148030945
ActiveExitTimestamp=Mon 2020-06-29 06:00:44 UTC
ActiveExitTimestampMonotonic=5377147970746
InactiveEnterTimestamp=Mon 2020-06-29 06:00:44 UTC
InactiveEnterTimestampMonotonic=5377147979376
CanStart=yes
CanStop=yes
CanReload=no
CanIsolate=no
StopWhenUnneeded=no
RefuseManualStart=no
RefuseManualStop=no
AllowIsolate=no
DefaultDependencies=yes
OnFailureJobMode=replace
IgnoreOnIsolate=no
IgnoreOnSnapshot=no
NeedDaemonReload=no
JobTimeoutUSec=0
JobTimeoutAction=none
ConditionResult=yes
AssertResult=yes
ConditionTimestamp=Mon 2020-06-29 06:00:44 UTC
ConditionTimestampMonotonic=5377148017445
AssertTimestamp=Mon 2020-06-29 06:00:44 UTC
AssertTimestampMonotonic=5377148017445
Transient=no
No `crio`
Have `containerd`

## containerd

Output of "`containerd --version`":

containerd github.com/containerd/containerd v1.3.3 d76c121f76a5fc8a462dc64594aea72fe18e1178
Output of "`systemctl show containerd`":

Type=simple
Restart=always
NotifyAccess=none
RestartUSec=5s
TimeoutStartUSec=1min 30s
TimeoutStopUSec=1min 30s
WatchdogUSec=0
WatchdogTimestamp=Mon 2020-06-29 06:00:44 UTC
WatchdogTimestampMonotonic=5377148015546
StartLimitInterval=10000000
StartLimitBurst=5
StartLimitAction=none
FailureAction=none
PermissionsStartOnly=no
RootDirectoryStartOnly=no
RemainAfterExit=no
GuessMainPID=yes
MainPID=3789181
ControlPID=0
FileDescriptorStoreMax=0
StatusErrno=0
Result=success
ExecMainStartTimestamp=Mon 2020-06-29 06:00:44 UTC
ExecMainStartTimestampMonotonic=5377148015509
ExecMainExitTimestampMonotonic=0
ExecMainPID=3789181
ExecMainCode=0
ExecMainStatus=0
ExecStartPre={ path=/sbin/modprobe ; argv[]=/sbin/modprobe overlay ; ignore_errors=no ; start_time=[Mon 2020-06-29 06:00:44 UTC] ; stop_time=[Mon 2020-06-29 06:00:44 UTC] ; pid=3789179 ; code=exited ; status=0 }
ExecStart={ path=/usr/local/bin/containerd ; argv[]=/usr/local/bin/containerd ; ignore_errors=no ; start_time=[Mon 2020-06-29 06:00:44 UTC] ; stop_time=[n/a] ; pid=3789181 ; code=(null) ; status=0/0 }
Slice=system.slice
ControlGroup=/system.slice/containerd.service
MemoryCurrent=63042719744
TasksCurrent=380
Delegate=yes
CPUAccounting=no
CPUShares=18446744073709551615
StartupCPUShares=18446744073709551615
CPUQuotaPerSecUSec=infinity
BlockIOAccounting=no
BlockIOWeight=18446744073709551615
StartupBlockIOWeight=18446744073709551615
MemoryAccounting=no
MemoryLimit=18446744073709551615
DevicePolicy=auto
TasksAccounting=no
TasksMax=18446744073709551615
UMask=0022
LimitCPU=18446744073709551615
LimitFSIZE=18446744073709551615
LimitDATA=18446744073709551615
LimitSTACK=18446744073709551615
LimitCORE=18446744073709551615
LimitRSS=18446744073709551615
LimitNOFILE=1048576
LimitAS=18446744073709551615
LimitNPROC=18446744073709551615
LimitMEMLOCK=65536
LimitLOCKS=18446744073709551615
LimitSIGPENDING=767294
LimitMSGQUEUE=819200
LimitNICE=0
LimitRTPRIO=0
LimitRTTIME=18446744073709551615
OOMScoreAdjust=-999
Nice=0
IOScheduling=0
CPUSchedulingPolicy=0
CPUSchedulingPriority=0
TimerSlackNSec=50000
CPUSchedulingResetOnFork=no
NonBlocking=no
StandardInput=null
StandardOutput=journal
StandardError=inherit
TTYReset=no
TTYVHangup=no
TTYVTDisallocate=no
SyslogPriority=30
SyslogLevelPrefix=yes
SecureBits=0
CapabilityBoundingSet=18446744073709551615
AmbientCapabilities=0
MountFlags=0
PrivateTmp=no
PrivateNetwork=no
PrivateDevices=no
ProtectHome=no
ProtectSystem=no
SameProcessGroup=no
IgnoreSIGPIPE=yes
NoNewPrivileges=no
SystemCallErrorNumber=0
RuntimeDirectoryMode=0755
KillMode=process
KillSignal=15
SendSIGKILL=yes
SendSIGHUP=no
Id=containerd.service
Names=containerd.service
Requires=basic.target system.slice
RequiredBy=kubelet.service
WantedBy=multi-user.target
Conflicts=shutdown.target
Before=shutdown.target kubelet.service multi-user.target
After=systemd-journald.socket basic.target system.slice network.target
Documentation=https://containerd.io
Description=containerd container runtime
LoadState=loaded
ActiveState=active
SubState=running
FragmentPath=/etc/systemd/system/containerd.service
UnitFileState=enabled
UnitFilePreset=disabled
InactiveExitTimestamp=Mon 2020-06-29 06:00:44 UTC
InactiveExitTimestampMonotonic=5377148000779
ActiveEnterTimestamp=Mon 2020-06-29 06:00:44 UTC
ActiveEnterTimestampMonotonic=5377148015590
ActiveExitTimestamp=Mon 2020-06-29 06:00:44 UTC
ActiveExitTimestampMonotonic=5377147991726
InactiveEnterTimestamp=Mon 2020-06-29 06:00:44 UTC
InactiveEnterTimestampMonotonic=5377147999237
CanStart=yes
CanStop=yes
CanReload=no
CanIsolate=no
StopWhenUnneeded=no
RefuseManualStart=no
RefuseManualStop=no
AllowIsolate=no
DefaultDependencies=yes
OnFailureJobMode=replace
IgnoreOnIsolate=no
IgnoreOnSnapshot=no
NeedDaemonReload=no
JobTimeoutUSec=0
JobTimeoutAction=none
ConditionResult=yes
AssertResult=yes
ConditionTimestamp=Mon 2020-06-29 06:00:44 UTC
ConditionTimestampMonotonic=5377148000346
AssertTimestamp=Mon 2020-06-29 06:00:44 UTC
AssertTimestampMonotonic=5377148000346
Transient=no
Output of "`cat /etc/containerd/config.toml`":

# Originally version 1 was referred from https://github.com/kata-containers/documentation/blob/master/how-to/containerd-kata.md#configure-containerd-to-use-kata-containers
# Used containerd config dump to generate version 2 config
# Details of this config is in https://github.com/containerd/cri/blob/master/docs/config.md
version = 2
root = "/var/lib/containerd"
state = "/run/containerd"
plugin_dir = ""
disabled_plugins = []
required_plugins = []
# The service unit file already have -999, keep it here for redundant
oom_score = -999

[grpc]
  address = "/run/containerd/containerd.sock"
  tcp_address = ""
  tcp_tls_cert = ""
  tcp_tls_key = ""
  uid = 0
  gid = 0
  max_recv_message_size = 16777216
  max_send_message_size = 16777216

[ttrpc]
  address = ""
  uid = 0
  gid = 0

[debug]
  address = ""
  uid = 0
  gid = 0
  level = "info"

[metrics]
  address = "0.0.0.0:1338"
  grpc_histogram = false

[cgroup]
  path = ""

[timeouts]
  "io.containerd.timeout.shim.cleanup" = "5s"
  "io.containerd.timeout.shim.load" = "5s"
  "io.containerd.timeout.shim.shutdown" = "3s"
  "io.containerd.timeout.task.state" = "2s"

[plugins]
  [plugins."io.containerd.gc.v1.scheduler"]
    pause_threshold = 0.02
    deletion_threshold = 0
    mutation_threshold = 100
    schedule_delay = "0s"
    startup_delay = "100ms"
  [plugins."io.containerd.grpc.v1.cri"]
    disable_tcp_service = true
    stream_server_address = "127.0.0.1"
    stream_server_port = "0"
    stream_idle_timeout = "4h0m0s"
    enable_selinux = false
    sandbox_image = "docker.ouroath.com:4443/yahoo-cloud/k8s.gcr.io/pause:3.1"
    stats_collect_period = 10
    enable_tls_streaming = false
    max_container_log_line_size = 16384
    disable_cgroup = false
    disable_apparmor = false
    restrict_oom_score_adj = false
    max_concurrent_downloads = 3
    disable_proc_mount = false
    [plugins."io.containerd.grpc.v1.cri".containerd]
      snapshotter = "devmapper"
      default_runtime_name = "runc"
      no_pivot = false
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata]
          runtime_type = "io.containerd.kata.v2"
          privileged_without_host_devices = true
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata.options]
            ConfigPath = "/etc/kata-containers/configuration.toml"
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
          runtime_type = "io.containerd.runc.v2"
          privileged_without_host_devices = true
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
            NoPivotRoot = false
            NoNewKeyring = false
            ShimCgroup = ""
            IoUid = 0
            IoGid = 0
            BinaryName = ""
            Root = ""
            CriuPath = ""
            SystemdCgroup = true
            CriuImagePath = ""
            CriuWorkPath = ""
    [plugins."io.containerd.grpc.v1.cri".cni]
      bin_dir = "/opt/cni/bin"
      conf_dir = "/etc/cni/net.d"
      max_conf_num = 1
      conf_template = ""
    [plugins."io.containerd.grpc.v1.cri".registry]
      [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
          endpoint = ["https://registry-1.docker.io"]
    [plugins."io.containerd.grpc.v1.cri".x509_key_pair_streaming]
      tls_cert_file = ""
      tls_key_file = ""
  [plugins."io.containerd.internal.v1.opt"]
    path = "/opt/containerd"
  [plugins."io.containerd.internal.v1.restart"]
    interval = "10s"
  [plugins."io.containerd.metadata.v1.bolt"]
    content_sharing_policy = "shared"
  [plugins."io.containerd.monitor.v1.cgroups"]
    no_prometheus = false
  [plugins."io.containerd.runtime.v1.linux"]
    shim = "containerd-shim"
    runtime = "runc"
    runtime_root = ""
    no_shim = false
    shim_debug = false
    systemd_cgroup = true
  [plugins."io.containerd.runtime.v2.task"]
    platforms = ["linux/amd64"]
  [plugins."io.containerd.service.v1.diff-service"]
    default = ["walking"]
  [plugins."io.containerd.snapshotter.v1.devmapper"]
    root_path = ""
    pool_name = "sys-docker--pool"
    base_image_size = "100GB"
---

# Packages

No `dpkg`
Have `rpm`

Output of "`rpm -qa|egrep "(cc-oci-runtimecc-runtimerunv|kata-proxy|kata-runtime|kata-shim|kata-ksm-throttler|kata-containers-image|linux-container|qemu-)"`":

kata-containers-image-1.10.2-6.1.x86_64
kata-linux-container-4.19.86.60-7.1.x86_64
kata-shim-bin-1.10.2-6.1.x86_64
qemu-vanilla-4.1.0+git.9e06029aea-7.1.x86_64
kata-ksm-throttler-1.10.2-6.1.x86_64
kata-proxy-1.10.2-6.1.x86_64
kata-shim-1.10.2-6.1.x86_64
qemu-vanilla-data-4.1.0+git.9e06029aea-7.1.x86_64
qemu-vanilla-bin-4.1.0+git.9e06029aea-7.1.x86_64
kata-proxy-bin-1.10.2-6.1.x86_64
kata-runtime-1.10.2-6.1.x86_64
---

This question originates from the open source project: kata-containers/runtime


14 replies

  • weixin_39976748 · 5 months ago

    Inside the container, when I set cpuset.cpus to 0-3 in /sys/fs/cgroup/cpuset/cpuset.cpus, the CPUs are used properly (see the sketch below). Earlier the value was 0 and only 1 CPU was used. How do I set this from the k8s pod?

    (screenshot)
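    A minimal sketch of the manual workaround described above, run inside the guest/container and assuming cgroup v1 with the cpuset controller mounted at /sys/fs/cgroup/cpuset:

    ```sh
    # The cpuset the container is currently confined to (was "0" here)
    cat /sys/fs/cgroup/cpuset/cpuset.cpus

    # Manually widen the cpuset to all four vCPUs (a workaround, not a fix)
    echo 0-3 > /sys/fs/cgroup/cpuset/cpuset.cpus

    # Re-run the load; all four CPUs should now show usage
    stress --cpu 4 --timeout 60s
    ```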

  • weixin_39693971 · 5 months ago

    That's weird; cpuset.cpus shouldn't be 0, it should be set to 0-3 by default.

  • weixin_39693971 · 5 months ago

    No, used this: `stress --cpu 4 --timeout 60s`. In your test, is default_vcpus set to 1?

    Yes, default_vcpus is set to 1, and my pod YAML is below:

    
    apiVersion: v1
    kind: Pod
    metadata:
      name: fupan-kata
    spec:
      runtimeClassName: kata
      containers:
      - name: fupancount
        image: ubuntu
        resources:
            limits:
                cpu: "4"
            requests:
                cpu: "4"
    
  • weixin_39976748 · 5 months ago

    Thanks, am I missing any configuration? https://github.com/kata-containers/runtime/issues/2809#issuecomment-651327965

  • weixin_39693971 · 5 months ago

    uhmm ok, that explains why I couldn't reproduce it with the CLI. Any thoughts?

    It seems I couldn't reproduce it. I tried to launch a pod with 4 CPUs and then ran stress with "--cpu 4"; all of the CPUs were fully used, as shown in the htop screenshot below.

  • weixin_39637363 · 5 months ago

    Thanks. Could you elaborate on this?

    * What container are you using?
    * Are you using pod overhead?
    * RuntimeClass?
    * The full YAML would be very helpful.

  • weixin_39693971 · 5 months ago

    Have you set a cpuset for your stress program? It seems that all of the stress threads are running on a single CPU.
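    A quick way to check where the stress workers are allowed to run and to spread them explicitly (illustrative only; `taskset` comes from util-linux and may not be present in a minimal image, and `<pid>` is a placeholder):

    ```sh
    # CPUs the container's cpuset currently allows
    cat /sys/fs/cgroup/cpuset/cpuset.cpus

    # Affinity list of one running stress worker
    taskset -pc <pid>

    # Explicitly allow the workers to use CPUs 0-3
    taskset -c 0-3 stress --cpu 4 --timeout 60s
    ```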

  • weixin_39976748 · 5 months ago

    Are you using pod overhead? No

    k8s version:

    
    Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.9", GitCommit:"2e808b7cb054ee242b68e62455323aa783991f03", GitTreeState:"clean", BuildDate:"2020-01-18T23:33:14Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"darwin/amd64"}
    Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.9", GitCommit:"2e808b7cb054ee242b68e62455323aa783991f03", GitTreeState:"archive", BuildDate:"2020-01-21T20:41:54Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
    

    k8s runtime class:

    
    apiVersion: node.k8s.io/v1beta1
    kind: RuntimeClass
    metadata:
      name: kata
    handler: kata
    

    k8s yaml:

    
    apiVersion: v1
    kind: Pod
    metadata:
      name: testcpu
      namespace: kata
    spec:
      runtimeClassName: kata
      serviceAccount: sd
      terminationGracePeriodSeconds: 30
      containers:
      - name: "testcpu"
        image: centos:7
        imagePullPolicy: IfNotPresent
        command: ['sh', '-c', 'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 10; done']
        resources:
          requests:
            memory: "2Gi"
            cpu: "4"
          limits:
            memory: "2Gi"
            cpu: "4"
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /workspace
          name: workspace
      volumes:
        - name: workspace
          emptyDir: {}
      restartPolicy: Always
      nodeSelector:
        kubernetes.io/hostname: node
      tolerations:
      - key: dedicated
        operator: Equal
        value: sd-kata
        effect: NoSchedule
    

    No, I used stress --cpu 4 --timeout 60s. In your test, is default_vcpus set to 1?

  • weixin_39976748 · 5 months ago

    When I set default_vcpus = 4, htop listed 8 CPUs (4 from the k8s pod definition + 4 default vCPUs). Running stress --cpu 4 --timeout 60s then shows usage on all 4 CPUs.
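
    A sketch of the configuration change behind this observation, assuming the stock /etc/kata-containers/configuration.toml shown above (default_vcpus is the cold-boot vCPU count; the pod's CPU limit is hot-plugged on top of it):

    # boot each sandbox VM with 4 vCPUs instead of 1 (back up the file first)
    sudo sed -i 's/^default_vcpus = 1/default_vcpus = 4/' /etc/kata-containers/configuration.toml
    # with a 4-CPU pod limit, 4 more vCPUs are hot-added, so htop in the guest shows 8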

  • weixin_39637363 · 5 months ago

    > Running kata-collect-data.sh version 1.11.1 (commit 984ccea48a1badd2863a68f91c8d95e229898095) at 2020-06-29.06:00:53.076170759+0000. kata-containers-image-1.10.2-6.1.x86_64 kata-linux-container-4.19.86.60-7.1.x86_64

    uhmm.. kata 1.11.1 and 1.10.2? Could you enable full debug, try again with the latest stable (1.11.1), and run kata-collect-data.sh?
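
    A sketch of how full debug could be switched on, following the commented enable_debug options visible in the configuration.toml dumps in this thread (take a backup before editing):

    # uncomment every enable_debug option in the kata config
    sudo sed -i -e 's/^# *\(enable_debug\).*=.*$/\1 = true/g' /etc/kata-containers/configuration.toml
    # re-create the pod, reproduce the issue, then collect data again
    sudo kata-collect-data.sh > kata-collect-data.log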

  • weixin_39976748 · 5 months ago

    kata-collect-data.sh 1.11.1

    Show kata-collect-data.sh details

    # Meta details Running `kata-collect-data.sh` version `1.11.1 (commit 7b6f3b64873bbeed63f2f7937d83f1ed6ccd258a)` at `2020-06-29.19:41:25.653689313+0000`. --- Runtime is `/usr/bin/kata-runtime`. # `kata-env` Output of "`/usr/bin/kata-runtime kata-env`":

    toml
    [Meta]
      Version = "1.0.24"
    
    [Runtime]
      Debug = true
      Trace = false
      DisableGuestSeccomp = true
      DisableNewNetNs = false
      SandboxCgroupOnly = false
      Path = "/usr/bin/kata-runtime"
      [Runtime.Version]
        OCI = "1.0.1-dev"
        [Runtime.Version.Version]
          Semver = "1.11.1"
          Major = 1
          Minor = 11
          Patch = 1
          Commit = "7b6f3b64873bbeed63f2f7937d83f1ed6ccd258a"
      [Runtime.Config]
        Path = "/etc/kata-containers/configuration.toml"
    
    [Hypervisor]
      MachineType = "pc"
      Version = "QEMU emulator version 4.1.1\nCopyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers"
      Path = "/usr/bin/qemu-vanilla-system-x86_64"
      BlockDeviceDriver = "virtio-scsi"
      EntropySource = "/dev/urandom"
      SharedFS = "virtio-9p"
      VirtioFSDaemon = "/usr/bin/virtiofsd"
      Msize9p = 8192
      MemorySlots = 10
      PCIeRootPort = 0
      HotplugVFIOOnRootBus = false
      Debug = true
      UseVSock = false
    
    [Image]
      Path = "/usr/share/kata-containers/kata-containers-image_clearlinux_1.11.1_agent_34a000d0cf.img"
    
    [Kernel]
      Path = "/usr/share/kata-containers/vmlinuz-5.4.32.73-5.1.container"
      Parameters = "systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket scsi_mod.scan=none agent.log=debug vsyscall=emulate init=/usr/bin/kata-agent agent.log=debug initcall_debug"
    
    [Initrd]
      Path = ""
    
    [Proxy]
      Type = "kataProxy"
      Path = "/usr/libexec/kata-containers/kata-proxy"
      Debug = true
      [Proxy.Version]
        Semver = "1.11.1-f61beff"
        Major = 1
        Minor = 11
        Patch = 1
        Commit = "<<unknown>>"
    
    [Shim]
      Type = "kataShim"
      Path = "/usr/libexec/kata-containers/kata-shim"
      Debug = true
      [Shim.Version]
        Semver = "1.11.1-9938269"
        Major = 1
        Minor = 11
        Patch = 1
        Commit = "<<unknown>>"
    
    [Agent]
      Type = "kata"
      Debug = true
      Trace = false
      TraceMode = ""
      TraceType = ""
    
    [Host]
      Kernel = "3.10.0-1062.12.1.el7.YAHOO.20200205.52.x86_64"
      Architecture = "amd64"
      VMContainerCapable = true
      SupportVSocks = true
      [Host.Distro]
        Name = "Red Hat Enterprise Linux Server"
        Version = "7.7"
      [Host.CPU]
        Vendor = "GenuineIntel"
        Model = "Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz"
    
    [Netmon]
      Path = "/usr/libexec/kata-containers/kata-netmon"
      Debug = true
      Enable = false
      [Netmon.Version]
        Semver = "1.11.1"
        Major = 1
        Minor = 11
        Patch = 1
        Commit = "<<unknown>>"
    --- # Runtime config files ## Runtime default config files
    
    /etc/kata-containers/configuration.toml
    /usr/share/defaults/kata-containers/configuration.toml
    
    ## Runtime config file contents Output of "`cat "/etc/kata-containers/configuration.toml"`":
    toml
    # Copyright (c) 2017-2019 Intel Corporation
    #
    # SPDX-License-Identifier: Apache-2.0
    #
    
    # XXX: WARNING: this file is auto-generated.
    # XXX:
    # XXX: Source file: "cli/config/configuration-qemu.toml.in"
    # XXX: Project:
    # XXX:   Name: Kata Containers
    # XXX:   Type: kata
    
    [hypervisor.qemu]
    path = "/usr/bin/qemu-vanilla-system-x86_64"
    kernel = "/usr/share/kata-containers/vmlinuz.container"
    image = "/usr/share/kata-containers/kata-containers.img"
    machine_type = "pc"
    
    # Optional space-separated list of options to pass to the guest kernel.
    # For example, use `kernel_params = "vsyscall=emulate"` if you are having
    # trouble running pre-2.15 glibc.
    #
    # WARNING: - any parameter specified here will take priority over the default
    # parameter value of the same name used to start the virtual machine.
    # Do not set values here unless you understand the impact of doing so as you
    # may stop the virtual machine from booting.
    # To see the list of default parameters, enable hypervisor debug, create a
    # container and look for 'default-kernel-parameters' log entries.
    kernel_params = "vsyscall=emulate init=/usr/bin/kata-agent agent.log=debug initcall_debug"
    
    # Path to the firmware.
    # If you want that qemu uses the default firmware leave this option empty
    firmware = ""
    
    # Machine accelerators
    # comma-separated list of machine accelerators to pass to the hypervisor.
    # For example, `machine_accelerators = "nosmm,nosmbus,nosata,nopit,static-prt,nofw"`
    machine_accelerators=""
    
    # Default number of vCPUs per SB/VM:
    # unspecified or 0                --> will be set to 1
    # < 0                             --> will be set to the actual number of physical cores
    # > 0 <= number of physical cores --> will be set to the specified number
    # > number of physical cores      --> will be set to the actual number of physical cores
    default_vcpus = 1
    
    # Default maximum number of vCPUs per SB/VM:
    # unspecified or == 0             --> will be set to the actual number of physical cores or to the maximum number
    #                                     of vCPUs supported by KVM if that number is exceeded
    # > 0 <= number of physical cores --> will be set to the specified number
    # > number of physical cores      --> will be set to the actual number of physical cores or to the maximum number
    #                                     of vCPUs supported by KVM if that number is exceeded
    # WARNING: Depending of the architecture, the maximum number of vCPUs supported by KVM is used when
    # the actual number of physical cores is greater than it.
    # WARNING: Be aware that this value impacts the virtual machine's memory footprint and CPU
    # the hotplug functionality. For example, `default_maxvcpus = 240` specifies that until 240 vCPUs
    # can be added to a SB/VM, but the memory footprint will be big. Another example, with
    # `default_maxvcpus = 8` the memory footprint will be small, but 8 will be the maximum number of
    # vCPUs supported by the SB/VM. In general, we recommend that you do not edit this variable,
    # unless you know what are you doing.
    default_maxvcpus = 0
    
    # Bridges can be used to hot plug devices.
    # Limitations:
    # * Currently only pci bridges are supported
    # * Until 30 devices per bridge can be hot plugged.
    # * Until 5 PCI bridges can be cold plugged per VM.
    #   This limitation could be a bug in qemu or in the kernel
    # Default number of bridges per SB/VM:
    # unspecified or 0   --> will be set to 1
    # > 1 <= 5           --> will be set to the specified number
    # > 5                --> will be set to 5
    default_bridges = 1
    
    # Default memory size in MiB for SB/VM.
    # If unspecified then it will be set 2048 MiB.
    default_memory = 2048
    #
    # Default memory slots per SB/VM.
    # If unspecified then it will be set 10.
    # This is will determine the times that memory will be hotadded to sandbox/VM.
    #memory_slots = 10
    
    # The size in MiB will be plused to max memory of hypervisor.
    # It is the memory address space for the NVDIMM devie.
    # If set block storage driver (block_device_driver) to "nvdimm",
    # should set memory_offset to the size of block device.
    # Default 0
    #memory_offset = 0
    
    # Specifies virtio-mem will be enabled or not.
    # Please note that this option should be used with the command
    # "echo 1 > /proc/sys/vm/overcommit_memory".
    # Default false
    #enable_virtio_mem = true
    
    # Disable block device from being used for a container's rootfs.
    # In case of a storage driver like devicemapper where a container's 
    # root file system is backed by a block device, the block device is passed
    # directly to the hypervisor for performance reasons. 
    # This flag prevents the block device from being passed to the hypervisor, 
    # 9pfs is used instead to pass the rootfs.
    disable_block_device_use = false
    
    # Shared file system type:
    #   - virtio-9p (default)
    #   - virtio-fs
    shared_fs = "virtio-9p"
    
    # Path to vhost-user-fs daemon.
    virtio_fs_daemon = "/usr/bin/virtiofsd"
    
    # Default size of DAX cache in MiB
    virtio_fs_cache_size = 1024
    
    # Extra args for virtiofsd daemon
    #
    # Format example:
    #   ["-o", "arg1=xxx,arg2", "-o", "hello world", "--arg3=yyy"]
    #
    # see `virtiofsd -h` for possible options.
    virtio_fs_extra_args = []
    
    # Cache mode:
    #
    #  - none
    #    Metadata, data, and pathname lookup are not cached in guest. They are
    #    always fetched from host and any changes are immediately pushed to host.
    #
    #  - auto
    #    Metadata and pathname lookup cache expires after a configured amount of
    #    time (default is 1 second). Data is cached while the file is open (close
    #    to open consistency).
    #
    #  - always
    #    Metadata, data, and pathname lookup are cached in guest and never expire.
    virtio_fs_cache = "always"
    
    # Block storage driver to be used for the hypervisor in case the container
    # rootfs is backed by a block device. This is virtio-scsi, virtio-blk
    # or nvdimm.
    block_device_driver = "virtio-scsi"
    
    # Specifies cache-related options will be set to block devices or not.
    # Default false
    #block_device_cache_set = true
    
    # Specifies cache-related options for block devices.
    # Denotes whether use of O_DIRECT (bypass the host page cache) is enabled.
    # Default false
    #block_device_cache_direct = true
    
    # Specifies cache-related options for block devices.
    # Denotes whether flush requests for the device are ignored.
    # Default false
    #block_device_cache_noflush = true
    
    # Enable iothreads (data-plane) to be used. This causes IO to be
    # handled in a separate IO thread. This is currently only implemented
    # for SCSI.
    #
    enable_iothreads = false
    
    # Enable pre allocation of VM RAM, default false
    # Enabling this will result in lower container density
    # as all of the memory will be allocated and locked
    # This is useful when you want to reserve all the memory
    # upfront or in the cases where you want memory latencies
    # to be very predictable
    # Default false
    #enable_mem_prealloc = true
    
    # Enable huge pages for VM RAM, default false
    # Enabling this will result in the VM memory
    # being allocated using huge pages.
    # This is useful when you want to use vhost-user network
    # stacks within the container. This will automatically 
    # result in memory pre allocation
    #enable_hugepages = true
    
    # Enable vhost-user storage device, default false
    # Enabling this will result in some Linux reserved block type
    # major range 240-254 being chosen to represent vhost-user devices.
    enable_vhost_user_store = false
    
    # The base directory specifically used for vhost-user devices.
    # Its sub-path "block" is used for block devices; "block/sockets" is
    # where we expect vhost-user sockets to live; "block/devices" is where
    # simulated block device nodes for vhost-user devices to live.
    vhost_user_store_path = "/var/run/kata-containers/vhost-user"
    
    # Enable file based guest memory support. The default is an empty string which
    # will disable this feature. In the case of virtio-fs, this is enabled
    # automatically and '/dev/shm' is used as the backing folder.
    # This option will be ignored if VM templating is enabled.
    #file_mem_backend = ""
    
    # Enable swap of vm memory. Default false.
    # The behaviour is undefined if mem_prealloc is also set to true
    #enable_swap = true
    
    # This option changes the default hypervisor and kernel parameters
    # to enable debug output where available. This extra output is added
    # to the proxy logs, but only when proxy debug is also enabled.
    # 
    # Default false
    enable_debug = true
    
    # Disable the customizations done in the runtime when it detects
    # that it is running on top a VMM. This will result in the runtime
    # behaving as it would when running on bare metal.
    # 
    #disable_nesting_checks = true
    
    # This is the msize used for 9p shares. It is the number of bytes 
    # used for 9p packet payload.
    #msize_9p = 8192
    
    # If true and vsocks are supported, use vsocks to communicate directly
    # with the agent and no proxy is started, otherwise use unix
    # sockets and start a proxy to communicate with the agent.
    # Default false
    #use_vsock = true
    
    # If false and nvdimm is supported, use nvdimm device to plug guest image.
    # Otherwise virtio-block device is used.
    # Default is false
    #disable_image_nvdimm = true
    
    # VFIO devices are hotplugged on a bridge by default. 
    # Enable hotplugging on root bus. This may be required for devices with
    # a large PCI bar, as this is a current limitation with hotplugging on 
    # a bridge. This value is valid for "pc" machine type.
    # Default false
    #hotplug_vfio_on_root_bus = true
    
    # Before hot plugging a PCIe device, you need to add a pcie_root_port device.
    # Use this parameter when using some large PCI bar devices, such as Nvidia GPU
    # The value means the number of pcie_root_port
    # This value is valid when hotplug_vfio_on_root_bus is true and machine_type is "q35"
    # Default 0
    #pcie_root_port = 2
    
    # If vhost-net backend for virtio-net is not desired, set to true. Default is false, which trades off
    # security (vhost-net runs ring0) for network I/O performance. 
    #disable_vhost_net = true
    
    #
    # Default entropy source.
    # The path to a host source of entropy (including a real hardware RNG)
    # /dev/urandom and /dev/random are two main options.
    # Be aware that /dev/random is a blocking source of entropy.  If the host
    # runs out of entropy, the VMs boot time will increase leading to get startup
    # timeouts.
    # The source of entropy /dev/urandom is non-blocking and provides a
    # generally acceptable source of entropy. It should work well for pretty much
    # all practical purposes.
    #entropy_source= "/dev/urandom"
    
    # Path to OCI hook binaries in the *guest rootfs*.
    # This does not affect host-side hooks which must instead be added to
    # the OCI spec passed to the runtime.
    #
    # You can create a rootfs with hooks by customizing the osbuilder scripts:
    # https://github.com/kata-containers/osbuilder
    #
    # Hooks must be stored in a subdirectory of guest_hook_path according to their
    # hook type, i.e. "guest_hook_path/{prestart,postart,poststop}".
    # The agent will scan these directories for executable files and add them, in
    # lexicographical order, to the lifecycle of the guest container.
    # Hooks are executed in the runtime namespace of the guest. See the official documentation:
    # https://github.com/opencontainers/runtime-spec/blob/v1.0.1/config.md#posix-platform-hooks
    # Warnings will be logged if any error is encountered will scanning for hooks,
    # but it will not abort container execution.
    #guest_hook_path = "/usr/share/oci/hooks"
    
    [factory]
    # VM templating support. Once enabled, new VMs are created from template
    # using vm cloning. They will share the same initial kernel, initramfs and
    # agent memory by mapping it readonly. It helps speeding up new container
    # creation and saves a lot of memory if there are many kata containers running
    # on the same host.
    #
    # When disabled, new VMs are created from scratch.
    #
    # Note: Requires "initrd=" to be set ("image=" is not supported).
    #
    # Default false
    #enable_template = true
    
    # Specifies the path of template.
    #
    # Default "/run/vc/vm/template"
    #template_path = "/run/vc/vm/template"
    
    # The number of caches of VMCache:
    # unspecified or == 0   --> VMCache is disabled
    # > 0                   --> will be set to the specified number
    #
    # VMCache is a function that creates VMs as caches before using it.
    # It helps speed up new container creation.
    # The function consists of a server and some clients communicating
    # through Unix socket.  The protocol is gRPC in protocols/cache/cache.proto.
    # The VMCache server will create some VMs and cache them by factory cache.
    # It will convert the VM to gRPC format and transport it when gets
    # requestion from clients.
    # Factory grpccache is the VMCache client.  It will request gRPC format
    # VM and convert it back to a VM.  If VMCache function is enabled,
    # kata-runtime will request VM from factory grpccache when it creates
    # a new sandbox.
    #
    # Default 0
    #vm_cache_number = 0
    
    # Specify the address of the Unix socket that is used by VMCache.
    #
    # Default /var/run/kata-containers/cache.sock
    #vm_cache_endpoint = "/var/run/kata-containers/cache.sock"
    
    [proxy.kata]
    path = "/usr/libexec/kata-containers/kata-proxy"
    
    # If enabled, proxy messages will be sent to the system log
    # (default: disabled)
    enable_debug = true
    
    [shim.kata]
    path = "/usr/libexec/kata-containers/kata-shim"
    
    # If enabled, shim messages will be sent to the system log
    # (default: disabled)
    enable_debug = true
    
    # If enabled, the shim will create opentracing.io traces and spans.
    # (See https://www.jaegertracing.io/docs/getting-started).
    #
    # Note: By default, the shim runs in a separate network namespace. Therefore,
    # to allow it to send trace details to the Jaeger agent running on the host,
    # it is necessary to set 'disable_new_netns=true' so that it runs in the host
    # network namespace.
    #
    # (default: disabled)
    #enable_tracing = true
    
    [agent.kata]
    # If enabled, make the agent display debug-level messages.
    # (default: disabled)
    enable_debug = true
    
    # Enable agent tracing.
    #
    # If enabled, the default trace mode is "dynamic" and the
    # default trace type is "isolated". The trace mode and type are set
    # explicity with the `trace_type=` and `trace_mode=` options.
    #
    # Notes:
    #
    # - Tracing is ONLY enabled when `enable_tracing` is set: explicitly
    #   setting `trace_mode=` and/or `trace_type=` without setting `enable_tracing`
    #   will NOT activate agent tracing.
    #
    # - See https://github.com/kata-containers/agent/blob/master/TRACING.md for
    #   full details.
    #
    # (default: disabled)
    #enable_tracing = true
    #
    #trace_mode = "dynamic"
    #trace_type = "isolated"
    
    # Comma separated list of kernel modules and their parameters.
    # These modules will be loaded in the guest kernel using modprobe(8).
    # The following example can be used to load two kernel modules with parameters
    #  - kernel_modules=["e1000e InterruptThrottleRate=3000,3000,3000 EEE=1", "i915 enable_ppgtt=0"]
    # The first word is considered as the module name and the rest as its parameters.
    # Container will not be started when:
    #  * A kernel module is specified and the modprobe command is not installed in the guest
    #    or it fails loading the module.
    #  * The module is not available in the guest or it doesn't met the guest kernel
    #    requirements, like architecture and version.
    #
    kernel_modules=[]
    
    
    [netmon]
    # If enabled, the network monitoring process gets started when the
    # sandbox is created. This allows for the detection of some additional
    # network being added to the existing network namespace, after the
    # sandbox has been created.
    # (default: disabled)
    #enable_netmon = true
    
    # Specify the path to the netmon binary.
    path = "/usr/libexec/kata-containers/kata-netmon"
    
    # If enabled, netmon messages will be sent to the system log
    # (default: disabled)
    enable_debug = true
    
    [runtime]
    # If enabled, the runtime will log additional debug messages to the
    # system log
    # (default: disabled)
    enable_debug = true
    #
    # Internetworking model
    # Determines how the VM should be connected to the
    # the container network interface
    # Options:
    #
    #   - macvtap
    #     Used when the Container network interface can be bridged using
    #     macvtap.
    #
    #   - none
    #     Used when customize network. Only creates a tap device. No veth pair.
    #
    #   - tcfilter
    #     Uses tc filter rules to redirect traffic from the network interface
    #     provided by plugin to a tap interface connected to the VM.
    #
    internetworking_model="tcfilter"
    
    # disable guest seccomp
    # Determines whether container seccomp profiles are passed to the virtual
    # machine and applied by the kata agent. If set to true, seccomp is not applied
    # within the guest
    # (default: true)
    disable_guest_seccomp=true
    
    # If enabled, the runtime will create opentracing.io traces and spans.
    # (See https://www.jaegertracing.io/docs/getting-started).
    # (default: disabled)
    #enable_tracing = true
    
    # If enabled, the runtime will not create a network namespace for shim and hypervisor processes.
    # This option may have some potential impacts to your host. It should only be used when you know what you're doing.
    # `disable_new_netns` conflicts with `enable_netmon`
    # `disable_new_netns` conflicts with `internetworking_model=tcfilter` and `internetworking_model=macvtap`. It works only
    # with `internetworking_model=none`. The tap device will be in the host network namespace and can connect to a bridge
    # (like OVS) directly.
    # If you are using docker, `disable_new_netns` only works with `docker run --net=none`
    # (default: false)
    #disable_new_netns = true
    
    # if enabled, the runtime will add all the kata processes inside one dedicated cgroup.
    # The container cgroups in the host are not created, just one single cgroup per sandbox.
    # The runtime caller is free to restrict or collect cgroup stats of the overall Kata sandbox.
    # The sandbox cgroup path is the parent cgroup of a container with the PodSandbox annotation.
    # The sandbox cgroup is constrained if there is no container type annotation.
    # See: https://godoc.org/github.com/kata-containers/runtime/virtcontainers#ContainerType
    sandbox_cgroup_only=false
    
    # Enabled experimental feature list, format: ["a", "b"].
    # Experimental features are features not stable enough for production,
    # they may break compatibility, and are prepared for a big version bump.
    # Supported experimental features:
    # (default: [])
    experimental=[]
    
    Output of "`cat "/usr/share/defaults/kata-containers/configuration.toml"`":
    toml
    # Copyright (c) 2017-2019 Intel Corporation
    #
    # SPDX-License-Identifier: Apache-2.0
    #
    
    # XXX: WARNING: this file is auto-generated.
    # XXX:
    # XXX: Source file: "cli/config/configuration-qemu.toml.in"
    # XXX: Project:
    # XXX:   Name: Kata Containers
    # XXX:   Type: kata
    
    [hypervisor.qemu]
    path = "/usr/bin/qemu-vanilla-system-x86_64"
    kernel = "/usr/share/kata-containers/vmlinuz.container"
    image = "/usr/share/kata-containers/kata-containers.img"
    machine_type = "pc"
    
    # Optional space-separated list of options to pass to the guest kernel.
    # For example, use `kernel_params = "vsyscall=emulate"` if you are having
    # trouble running pre-2.15 glibc.
    #
    # WARNING: - any parameter specified here will take priority over the default
    # parameter value of the same name used to start the virtual machine.
    # Do not set values here unless you understand the impact of doing so as you
    # may stop the virtual machine from booting.
    # To see the list of default parameters, enable hypervisor debug, create a
    # container and look for 'default-kernel-parameters' log entries.
    kernel_params = ""
    
    # Path to the firmware.
    # If you want that qemu uses the default firmware leave this option empty
    firmware = ""
    
    # Machine accelerators
    # comma-separated list of machine accelerators to pass to the hypervisor.
    # For example, `machine_accelerators = "nosmm,nosmbus,nosata,nopit,static-prt,nofw"`
    machine_accelerators=""
    
    # Default number of vCPUs per SB/VM:
    # unspecified or 0                --> will be set to 1
    # < 0                             --> will be set to the actual number of physical cores
    # > 0 <= number of physical cores --> will be set to the specified number
    # > number of physical cores      --> will be set to the actual number of physical cores
    default_vcpus = 1
    
    # Default maximum number of vCPUs per SB/VM:
    # unspecified or == 0             --> will be set to the actual number of physical cores or to the maximum number
    #                                     of vCPUs supported by KVM if that number is exceeded
    # > 0 <= number of physical cores --> will be set to the specified number
    # > number of physical cores      --> will be set to the actual number of physical cores or to the maximum number
    #                                     of vCPUs supported by KVM if that number is exceeded
    # WARNING: Depending of the architecture, the maximum number of vCPUs supported by KVM is used when
    # the actual number of physical cores is greater than it.
    # WARNING: Be aware that this value impacts the virtual machine's memory footprint and CPU
    # the hotplug functionality. For example, `default_maxvcpus = 240` specifies that until 240 vCPUs
    # can be added to a SB/VM, but the memory footprint will be big. Another example, with
    # `default_maxvcpus = 8` the memory footprint will be small, but 8 will be the maximum number of
    # vCPUs supported by the SB/VM. In general, we recommend that you do not edit this variable,
    # unless you know what are you doing.
    default_maxvcpus = 0
    
    # Bridges can be used to hot plug devices.
    # Limitations:
    # * Currently only pci bridges are supported
    # * Until 30 devices per bridge can be hot plugged.
    # * Until 5 PCI bridges can be cold plugged per VM.
    #   This limitation could be a bug in qemu or in the kernel
    # Default number of bridges per SB/VM:
    # unspecified or 0   --> will be set to 1
    # > 1 <= 5           --> will be set to the specified number
    # > 5                --> will be set to 5
    default_bridges = 1
    
    # Default memory size in MiB for SB/VM.
    # If unspecified then it will be set 2048 MiB.
    default_memory = 2048
    #
    # Default memory slots per SB/VM.
    # If unspecified then it will be set 10.
    # This is will determine the times that memory will be hotadded to sandbox/VM.
    #memory_slots = 10
    
    # The size in MiB will be plused to max memory of hypervisor.
    # It is the memory address space for the NVDIMM devie.
    # If set block storage driver (block_device_driver) to "nvdimm",
    # should set memory_offset to the size of block device.
    # Default 0
    #memory_offset = 0
    
    # Specifies virtio-mem will be enabled or not.
    # Please note that this option should be used with the command
    # "echo 1 > /proc/sys/vm/overcommit_memory".
    # Default false
    #enable_virtio_mem = true
    
    # Disable block device from being used for a container's rootfs.
    # In case of a storage driver like devicemapper where a container's 
    # root file system is backed by a block device, the block device is passed
    # directly to the hypervisor for performance reasons. 
    # This flag prevents the block device from being passed to the hypervisor, 
    # 9pfs is used instead to pass the rootfs.
    disable_block_device_use = false
    
    # Shared file system type:
    #   - virtio-9p (default)
    #   - virtio-fs
    shared_fs = "virtio-9p"
    
    # Path to vhost-user-fs daemon.
    virtio_fs_daemon = "/usr/bin/virtiofsd"
    
    # Default size of DAX cache in MiB
    virtio_fs_cache_size = 1024
    
    # Extra args for virtiofsd daemon
    #
    # Format example:
    #   ["-o", "arg1=xxx,arg2", "-o", "hello world", "--arg3=yyy"]
    #
    # see `virtiofsd -h` for possible options.
    virtio_fs_extra_args = []
    
    # Cache mode:
    #
    #  - none
    #    Metadata, data, and pathname lookup are not cached in guest. They are
    #    always fetched from host and any changes are immediately pushed to host.
    #
    #  - auto
    #    Metadata and pathname lookup cache expires after a configured amount of
    #    time (default is 1 second). Data is cached while the file is open (close
    #    to open consistency).
    #
    #  - always
    #    Metadata, data, and pathname lookup are cached in guest and never expire.
    virtio_fs_cache = "always"
    
    # Block storage driver to be used for the hypervisor in case the container
    # rootfs is backed by a block device. This is virtio-scsi, virtio-blk
    # or nvdimm.
    block_device_driver = "virtio-scsi"
    
    # Specifies cache-related options will be set to block devices or not.
    # Default false
    #block_device_cache_set = true
    
    # Specifies cache-related options for block devices.
    # Denotes whether use of O_DIRECT (bypass the host page cache) is enabled.
    # Default false
    #block_device_cache_direct = true
    
    # Specifies cache-related options for block devices.
    # Denotes whether flush requests for the device are ignored.
    # Default false
    #block_device_cache_noflush = true
    
    # Enable iothreads (data-plane) to be used. This causes IO to be
    # handled in a separate IO thread. This is currently only implemented
    # for SCSI.
    #
    enable_iothreads = false
    
    # Enable pre allocation of VM RAM, default false
    # Enabling this will result in lower container density
    # as all of the memory will be allocated and locked
    # This is useful when you want to reserve all the memory
    # upfront or in the cases where you want memory latencies
    # to be very predictable
    # Default false
    #enable_mem_prealloc = true
    
    # Enable huge pages for VM RAM, default false
    # Enabling this will result in the VM memory
    # being allocated using huge pages.
    # This is useful when you want to use vhost-user network
    # stacks within the container. This will automatically 
    # result in memory pre allocation
    #enable_hugepages = true
    
    # Enable vhost-user storage device, default false
    # Enabling this will result in some Linux reserved block type
    # major range 240-254 being chosen to represent vhost-user devices.
    enable_vhost_user_store = false
    
    # The base directory specifically used for vhost-user devices.
    # Its sub-path "block" is used for block devices; "block/sockets" is
    # where we expect vhost-user sockets to live; "block/devices" is where
    # simulated block device nodes for vhost-user devices to live.
    vhost_user_store_path = "/var/run/kata-containers/vhost-user"
    
    # Enable file based guest memory support. The default is an empty string which
    # will disable this feature. In the case of virtio-fs, this is enabled
    # automatically and '/dev/shm' is used as the backing folder.
    # This option will be ignored if VM templating is enabled.
    #file_mem_backend = ""
    
    # Enable swap of vm memory. Default false.
    # The behaviour is undefined if mem_prealloc is also set to true
    #enable_swap = true
    
    # This option changes the default hypervisor and kernel parameters
    # to enable debug output where available. This extra output is added
    # to the proxy logs, but only when proxy debug is also enabled.
    # 
    # Default false
    #enable_debug = true
    
    # Disable the customizations done in the runtime when it detects
    # that it is running on top a VMM. This will result in the runtime
    # behaving as it would when running on bare metal.
    # 
    #disable_nesting_checks = true
    
    # This is the msize used for 9p shares. It is the number of bytes 
    # used for 9p packet payload.
    #msize_9p = 8192
    
    # If true and vsocks are supported, use vsocks to communicate directly
    # with the agent and no proxy is started, otherwise use unix
    # sockets and start a proxy to communicate with the agent.
    # Default false
    #use_vsock = true
    
    # If false and nvdimm is supported, use nvdimm device to plug guest image.
    # Otherwise virtio-block device is used.
    # Default is false
    #disable_image_nvdimm = true
    
    # VFIO devices are hotplugged on a bridge by default. 
    # Enable hotplugging on root bus. This may be required for devices with
    # a large PCI bar, as this is a current limitation with hotplugging on 
    # a bridge. This value is valid for "pc" machine type.
    # Default false
    #hotplug_vfio_on_root_bus = true
    
    # Before hot plugging a PCIe device, you need to add a pcie_root_port device.
    # Use this parameter when using some large PCI bar devices, such as Nvidia GPU
    # The value means the number of pcie_root_port
    # This value is valid when hotplug_vfio_on_root_bus is true and machine_type is "q35"
    # Default 0
    #pcie_root_port = 2
    
    # If vhost-net backend for virtio-net is not desired, set to true. Default is false, which trades off
    # security (vhost-net runs ring0) for network I/O performance. 
    #disable_vhost_net = true
    
    #
    # Default entropy source.
    # The path to a host source of entropy (including a real hardware RNG)
    # /dev/urandom and /dev/random are two main options.
    # Be aware that /dev/random is a blocking source of entropy.  If the host
    # runs out of entropy, the VMs boot time will increase leading to get startup
    # timeouts.
    # The source of entropy /dev/urandom is non-blocking and provides a
    # generally acceptable source of entropy. It should work well for pretty much
    # all practical purposes.
    #entropy_source= "/dev/urandom"
    
    # Path to OCI hook binaries in the *guest rootfs*.
    # This does not affect host-side hooks which must instead be added to
    # the OCI spec passed to the runtime.
    #
    # You can create a rootfs with hooks by customizing the osbuilder scripts:
    # https://github.com/kata-containers/osbuilder
    #
    # Hooks must be stored in a subdirectory of guest_hook_path according to their
    # hook type, i.e. "guest_hook_path/{prestart,postart,poststop}".
    # The agent will scan these directories for executable files and add them, in
    # lexicographical order, to the lifecycle of the guest container.
    # Hooks are executed in the runtime namespace of the guest. See the official documentation:
    # https://github.com/opencontainers/runtime-spec/blob/v1.0.1/config.md#posix-platform-hooks
    # Warnings will be logged if any error is encountered will scanning for hooks,
    # but it will not abort container execution.
    #guest_hook_path = "/usr/share/oci/hooks"
    
    [factory]
    # VM templating support. Once enabled, new VMs are created from template
    # using vm cloning. They will share the same initial kernel, initramfs and
    # agent memory by mapping it readonly. It helps speeding up new container
    # creation and saves a lot of memory if there are many kata containers running
    # on the same host.
    #
    # When disabled, new VMs are created from scratch.
    #
    # Note: Requires "initrd=" to be set ("image=" is not supported).
    #
    # Default false
    #enable_template = true
    
    # Specifies the path of template.
    #
    # Default "/run/vc/vm/template"
    #template_path = "/run/vc/vm/template"
    
    # The number of caches of VMCache:
    # unspecified or == 0   --> VMCache is disabled
    # > 0                   --> will be set to the specified number
    #
    # VMCache is a function that creates VMs as caches before using it.
    # It helps speed up new container creation.
    # The function consists of a server and some clients communicating
    # through Unix socket.  The protocol is gRPC in protocols/cache/cache.proto.
    # The VMCache server will create some VMs and cache them by factory cache.
    # It will convert the VM to gRPC format and transport it when gets
    # requestion from clients.
    # Factory grpccache is the VMCache client.  It will request gRPC format
    # VM and convert it back to a VM.  If VMCache function is enabled,
    # kata-runtime will request VM from factory grpccache when it creates
    # a new sandbox.
    #
    # Default 0
    #vm_cache_number = 0
    
    # Specify the address of the Unix socket that is used by VMCache.
    #
    # Default /var/run/kata-containers/cache.sock
    #vm_cache_endpoint = "/var/run/kata-containers/cache.sock"
    
    [proxy.kata]
    path = "/usr/libexec/kata-containers/kata-proxy"
    
    # If enabled, proxy messages will be sent to the system log
    # (default: disabled)
    #enable_debug = true
    
    [shim.kata]
    path = "/usr/libexec/kata-containers/kata-shim"
    
    # If enabled, shim messages will be sent to the system log
    # (default: disabled)
    #enable_debug = true
    
    # If enabled, the shim will create opentracing.io traces and spans.
    # (See https://www.jaegertracing.io/docs/getting-started).
    #
    # Note: By default, the shim runs in a separate network namespace. Therefore,
    # to allow it to send trace details to the Jaeger agent running on the host,
    # it is necessary to set 'disable_new_netns=true' so that it runs in the host
    # network namespace.
    #
    # (default: disabled)
    #enable_tracing = true
    
    [agent.kata]
    # If enabled, make the agent display debug-level messages.
    # (default: disabled)
    #enable_debug = true
    
    # Enable agent tracing.
    #
    # If enabled, the default trace mode is "dynamic" and the
    # default trace type is "isolated". The trace mode and type are set
    # explicity with the `trace_type=` and `trace_mode=` options.
    #
    # Notes:
    #
    # - Tracing is ONLY enabled when `enable_tracing` is set: explicitly
    #   setting `trace_mode=` and/or `trace_type=` without setting `enable_tracing`
    #   will NOT activate agent tracing.
    #
    # - See https://github.com/kata-containers/agent/blob/master/TRACING.md for
    #   full details.
    #
    # (default: disabled)
    #enable_tracing = true
    #
    #trace_mode = "dynamic"
    #trace_type = "isolated"
    
    # Comma separated list of kernel modules and their parameters.
    # These modules will be loaded in the guest kernel using modprobe(8).
    # The following example can be used to load two kernel modules with parameters
    #  - kernel_modules=["e1000e InterruptThrottleRate=3000,3000,3000 EEE=1", "i915 enable_ppgtt=0"]
    # The first word is considered as the module name and the rest as its parameters.
    # Container will not be started when:
    #  * A kernel module is specified and the modprobe command is not installed in the guest
    #    or it fails loading the module.
    #  * The module is not available in the guest or it doesn't met the guest kernel
    #    requirements, like architecture and version.
    #
    kernel_modules=[]
    
    
    [netmon]
    # If enabled, the network monitoring process gets started when the
    # sandbox is created. This allows for the detection of some additional
    # network being added to the existing network namespace, after the
    # sandbox has been created.
    # (default: disabled)
    #enable_netmon = true
    
    # Specify the path to the netmon binary.
    path = "/usr/libexec/kata-containers/kata-netmon"
    
    # If enabled, netmon messages will be sent to the system log
    # (default: disabled)
    #enable_debug = true
    
    [runtime]
    # If enabled, the runtime will log additional debug messages to the
    # system log
    # (default: disabled)
    #enable_debug = true
    #
    # Internetworking model
    # Determines how the VM should be connected to the
    # the container network interface
    # Options:
    #
    #   - macvtap
    #     Used when the Container network interface can be bridged using
    #     macvtap.
    #
    #   - none
    #     Used when customize network. Only creates a tap device. No veth pair.
    #
    #   - tcfilter
    #     Uses tc filter rules to redirect traffic from the network interface
    #     provided by plugin to a tap interface connected to the VM.
    #
    internetworking_model="tcfilter"
    
    # disable guest seccomp
    # Determines whether container seccomp profiles are passed to the virtual
    # machine and applied by the kata agent. If set to true, seccomp is not applied
    # within the guest
    # (default: true)
    disable_guest_seccomp=true
    
    # If enabled, the runtime will create opentracing.io traces and spans.
    # (See https://www.jaegertracing.io/docs/getting-started).
    # (default: disabled)
    #enable_tracing = true
    
    # If enabled, the runtime will not create a network namespace for shim and hypervisor processes.
    # This option may have some potential impacts to your host. It should only be used when you know what you're doing.
    # `disable_new_netns` conflicts with `enable_netmon`
    # `disable_new_netns` conflicts with `internetworking_model=tcfilter` and `internetworking_model=macvtap`. It works only
    # with `internetworking_model=none`. The tap device will be in the host network namespace and can connect to a bridge
    # (like OVS) directly.
    # If you are using docker, `disable_new_netns` only works with `docker run --net=none`
    # (default: false)
    #disable_new_netns = true
    
    # if enabled, the runtime will add all the kata processes inside one dedicated cgroup.
    # The container cgroups in the host are not created, just one single cgroup per sandbox.
    # The runtime caller is free to restrict or collect cgroup stats of the overall Kata sandbox.
    # The sandbox cgroup path is the parent cgroup of a container with the PodSandbox annotation.
    # The sandbox cgroup is constrained if there is no container type annotation.
    # See: https://godoc.org/github.com/kata-containers/runtime/virtcontainers#ContainerType
    sandbox_cgroup_only=false
    
    # Enabled experimental feature list, format: ["a", "b"].
    # Experimental features are features not stable enough for production,
    # they may break compatibility, and are prepared for a big version bump.
    # Supported experimental features:
    # (default: [])
    experimental=[]
    
    --- # KSM throttler ## version Output of "`/usr/libexec/kata-ksm-throttler/kata-ksm-throttler --version`":
    
    kata-ksm-throttler version 1.11.1-60526f8
    
    Output of "`/usr/lib/systemd/system/kata-ksm-throttler.service --version`":
    
    ./kata-collect-data.sh: line 178: /usr/lib/systemd/system/kata-ksm-throttler.service: Permission denied
    
    ## systemd service # Image details
    yaml
    ---
    osbuilder:
      url: "https://github.com/kata-containers/osbuilder"
      version: "unknown"
    rootfs-creation-time: "2020-06-09T05:36:02.623014927+0000Z"
    description: "osbuilder rootfs"
    file-format-version: "0.0.2"
    architecture: "x86_64"
    base-distro:
      name: "Clear"
      version: "33320"
      packages:
        default:
          - "chrony"
          - "iptables-bin"
          - "kmod-bin"
          - "libudev0-shim"
          - "systemd"
          - "util-linux-bin"
        extra:
    
    agent:
      url: "https://github.com/kata-containers/agent"
      name: "kata-agent"
      version: "1.11.1-34a000d0cf10c2e6bf1b46cd2bbfaa911de62679"
      agent-is-init-daemon: "no"
    
    --- # Initrd details No initrd --- # Logfiles ## Runtime logs Recent runtime problems found in system journal:
    
    time="2020-06-29T05:29:27.57784593Z" level=error msg="Invalid command \"kata-collect-data\"" arch=amd64 name=kata-runtime pid=3759572 source=runtime
    
    ## Proxy logs No recent proxy problems found in system journal. ## Shim logs No recent shim problems found in system journal. ## Throttler logs No recent throttler problems found in system journal. --- # Container manager details No `docker` Have `kubectl` ## Kubernetes Output of "`kubectl version`":
    
    Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.9", GitCommit:"2e808b7cb054ee242b68e62455323aa783991f03", GitTreeState:"archive", BuildDate:"2020-01-21T20:41:54Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
    The connection to the server 127.0.0.1:9999 was refused - did you specify the right host or port?
    
    Output of "`kubectl config view`":
    
    apiVersion: v1
    clusters: []
    contexts: []
    current-context: ""
    kind: Config
    preferences: {}
    users: []
    
    Output of "`systemctl show kubelet`":
    
    Type=simple
    Restart=on-failure
    NotifyAccess=none
    RestartUSec=100ms
    TimeoutStartUSec=1min 30s
    TimeoutStopUSec=1min 30s
    WatchdogUSec=0
    WatchdogTimestamp=Mon 2020-06-29 19:40:47 UTC
    WatchdogTimestampMonotonic=5426351159947
    StartLimitInterval=10000000
    StartLimitBurst=5
    StartLimitAction=none
    FailureAction=none
    PermissionsStartOnly=no
    RootDirectoryStartOnly=no
    RemainAfterExit=no
    GuessMainPID=yes
    MainPID=347646
    ControlPID=0
    FileDescriptorStoreMax=0
    StatusErrno=0
    Result=success
    ExecMainStartTimestamp=Mon 2020-06-29 19:40:47 UTC
    ExecMainStartTimestampMonotonic=5426351159899
    ExecMainExitTimestampMonotonic=0
    ExecMainPID=347646
    ExecMainCode=0
    ExecMainStatus=0
    ExecStart={ path=/usr/bin/kubelet ; argv[]=/usr/bin/kubelet $KUBE_LOGTOSTDERR $KUBE_LOG_LEVEL $KUBELET_ADDRESS $KUBELET_PORT $KUBELET_HOSTNAME $KUBE_ALLOW_PRIV $KUBELET_ARGS ; ignore_errors=no ; start_time=[Mon 2020-06-29 19:40:47 UTC] ; stop_time=[n/a] ; pid=347646 ; code=(null) ; status=0/0 }
    Slice=system.slice
    ControlGroup=/system.slice/kubelet.service
    MemoryCurrent=56438784
    TasksCurrent=59
    Delegate=no
    CPUAccounting=no
    CPUShares=18446744073709551615
    StartupCPUShares=18446744073709551615
    CPUQuotaPerSecUSec=infinity
    BlockIOAccounting=no
    BlockIOWeight=18446744073709551615
    StartupBlockIOWeight=18446744073709551615
    MemoryAccounting=no
    MemoryLimit=18446744073709551615
    DevicePolicy=auto
    TasksAccounting=no
    TasksMax=18446744073709551615
    EnvironmentFile=/etc/sysconfig/kubelet (ignore_errors=yes)
    UMask=0022
    LimitCPU=18446744073709551615
    LimitFSIZE=18446744073709551615
    LimitDATA=18446744073709551615
    LimitSTACK=18446744073709551615
    LimitCORE=18446744073709551615
    LimitRSS=18446744073709551615
    LimitNOFILE=65536
    LimitAS=18446744073709551615
    LimitNPROC=767294
    LimitMEMLOCK=65536
    LimitLOCKS=18446744073709551615
    LimitSIGPENDING=767294
    LimitMSGQUEUE=819200
    LimitNICE=0
    LimitRTPRIO=0
    LimitRTTIME=18446744073709551615
    WorkingDirectory=/var/lib/kubelet
    OOMScoreAdjust=0
    Nice=0
    IOScheduling=0
    CPUSchedulingPolicy=0
    CPUSchedulingPriority=0
    TimerSlackNSec=50000
    CPUSchedulingResetOnFork=no
    NonBlocking=no
    StandardInput=null
    StandardOutput=journal
    StandardError=inherit
    TTYReset=no
    TTYVHangup=no
    TTYVTDisallocate=no
    SyslogPriority=30
    SyslogLevelPrefix=yes
    SecureBits=0
    CapabilityBoundingSet=18446744073709551615
    AmbientCapabilities=0
    MountFlags=0
    PrivateTmp=no
    PrivateNetwork=no
    PrivateDevices=no
    ProtectHome=no
    ProtectSystem=no
    SameProcessGroup=no
    IgnoreSIGPIPE=yes
    NoNewPrivileges=no
    SystemCallErrorNumber=0
    RuntimeDirectoryMode=0755
    KillMode=control-group
    KillSignal=15
    SendSIGKILL=yes
    SendSIGHUP=no
    Id=kubelet.service
    Names=kubelet.service
    Requires=kube-proxy.service basic.target -.mount system.slice containerd.service
    WantedBy=multi-user.target
    Conflicts=shutdown.target
    Before=shutdown.target multi-user.target
    After=systemd-journald.socket basic.target -.mount system.slice containerd.service
    RequiresMountsFor=/var/lib/kubelet
    Documentation=https://github.com/kubernetes/kubernetes
    Description=Kubernetes Kubelet Server
    LoadState=loaded
    ActiveState=active
    SubState=running
    FragmentPath=/usr/lib/systemd/system/kubelet.service
    UnitFileState=enabled
    UnitFilePreset=disabled
    InactiveExitTimestamp=Mon 2020-06-29 19:40:47 UTC
    InactiveExitTimestampMonotonic=5426351159966
    ActiveEnterTimestamp=Mon 2020-06-29 19:40:47 UTC
    ActiveEnterTimestampMonotonic=5426351159966
    ActiveExitTimestamp=Mon 2020-06-29 19:40:47 UTC
    ActiveExitTimestampMonotonic=5426351081702
    InactiveEnterTimestamp=Mon 2020-06-29 19:40:47 UTC
    InactiveEnterTimestampMonotonic=5426351090272
    CanStart=yes
    CanStop=yes
    CanReload=no
    CanIsolate=no
    StopWhenUnneeded=no
    RefuseManualStart=no
    RefuseManualStop=no
    AllowIsolate=no
    DefaultDependencies=yes
    OnFailureJobMode=replace
    IgnoreOnIsolate=no
    IgnoreOnSnapshot=no
    NeedDaemonReload=no
    JobTimeoutUSec=0
    JobTimeoutAction=none
    ConditionResult=yes
    AssertResult=yes
    ConditionTimestamp=Mon 2020-06-29 19:40:47 UTC
    ConditionTimestampMonotonic=5426351141054
    AssertTimestamp=Mon 2020-06-29 19:40:47 UTC
    AssertTimestampMonotonic=5426351141054
    Transient=no
    
    No `crio` Have `containerd` ## containerd Output of "`containerd --version`":
    
    containerd github.com/containerd/containerd v1.3.3 d76c121f76a5fc8a462dc64594aea72fe18e1178
    
    Output of "`systemctl show containerd`":
    
    Type=simple
    Restart=always
    NotifyAccess=none
    RestartUSec=5s
    TimeoutStartUSec=1min 30s
    TimeoutStopUSec=1min 30s
    WatchdogUSec=0
    WatchdogTimestamp=Mon 2020-06-29 19:40:47 UTC
    WatchdogTimestampMonotonic=5426351139349
    StartLimitInterval=10000000
    StartLimitBurst=5
    StartLimitAction=none
    FailureAction=none
    PermissionsStartOnly=no
    RootDirectoryStartOnly=no
    RemainAfterExit=no
    GuessMainPID=yes
    MainPID=347640
    ControlPID=0
    FileDescriptorStoreMax=0
    StatusErrno=0
    Result=success
    ExecMainStartTimestamp=Mon 2020-06-29 19:40:47 UTC
    ExecMainStartTimestampMonotonic=5426351139320
    ExecMainExitTimestampMonotonic=0
    ExecMainPID=347640
    ExecMainCode=0
    ExecMainStatus=0
    ExecStartPre={ path=/sbin/modprobe ; argv[]=/sbin/modprobe overlay ; ignore_errors=no ; start_time=[Mon 2020-06-29 19:40:47 UTC] ; stop_time=[Mon 2020-06-29 19:40:47 UTC] ; pid=347638 ; code=exited ; status=0 }
    ExecStart={ path=/usr/local/bin/containerd ; argv[]=/usr/local/bin/containerd ; ignore_errors=no ; start_time=[Mon 2020-06-29 19:40:47 UTC] ; stop_time=[n/a] ; pid=347640 ; code=(null) ; status=0/0 }
    Slice=system.slice
    ControlGroup=/system.slice/containerd.service
    MemoryCurrent=58553622528
    TasksCurrent=355
    Delegate=yes
    CPUAccounting=no
    CPUShares=18446744073709551615
    StartupCPUShares=18446744073709551615
    CPUQuotaPerSecUSec=infinity
    BlockIOAccounting=no
    BlockIOWeight=18446744073709551615
    StartupBlockIOWeight=18446744073709551615
    MemoryAccounting=no
    MemoryLimit=18446744073709551615
    DevicePolicy=auto
    TasksAccounting=no
    TasksMax=18446744073709551615
    UMask=0022
    LimitCPU=18446744073709551615
    LimitFSIZE=18446744073709551615
    LimitDATA=18446744073709551615
    LimitSTACK=18446744073709551615
    LimitCORE=18446744073709551615
    LimitRSS=18446744073709551615
    LimitNOFILE=1048576
    LimitAS=18446744073709551615
    LimitNPROC=18446744073709551615
    LimitMEMLOCK=65536
    LimitLOCKS=18446744073709551615
    LimitSIGPENDING=767294
    LimitMSGQUEUE=819200
    LimitNICE=0
    LimitRTPRIO=0
    LimitRTTIME=18446744073709551615
    OOMScoreAdjust=-999
    Nice=0
    IOScheduling=0
    CPUSchedulingPolicy=0
    CPUSchedulingPriority=0
    TimerSlackNSec=50000
    CPUSchedulingResetOnFork=no
    NonBlocking=no
    StandardInput=null
    StandardOutput=journal
    StandardError=inherit
    TTYReset=no
    TTYVHangup=no
    TTYVTDisallocate=no
    SyslogPriority=30
    SyslogLevelPrefix=yes
    SecureBits=0
    CapabilityBoundingSet=18446744073709551615
    AmbientCapabilities=0
    MountFlags=0
    PrivateTmp=no
    PrivateNetwork=no
    PrivateDevices=no
    ProtectHome=no
    ProtectSystem=no
    SameProcessGroup=no
    IgnoreSIGPIPE=yes
    NoNewPrivileges=no
    SystemCallErrorNumber=0
    RuntimeDirectoryMode=0755
    KillMode=process
    KillSignal=15
    SendSIGKILL=yes
    SendSIGHUP=no
    Id=containerd.service
    Names=containerd.service
    Requires=basic.target system.slice
    RequiredBy=kubelet.service
    WantedBy=multi-user.target
    Conflicts=shutdown.target
    Before=shutdown.target kubelet.service multi-user.target
    After=systemd-journald.socket basic.target system.slice network.target
    Documentation=https://containerd.io
    Description=containerd container runtime
    LoadState=loaded
    ActiveState=active
    SubState=running
    FragmentPath=/etc/systemd/system/containerd.service
    UnitFileState=enabled
    UnitFilePreset=disabled
    InactiveExitTimestamp=Mon 2020-06-29 19:40:47 UTC
    InactiveExitTimestampMonotonic=5426351126999
    ActiveEnterTimestamp=Mon 2020-06-29 19:40:47 UTC
    ActiveEnterTimestampMonotonic=5426351139392
    ActiveExitTimestamp=Mon 2020-06-29 19:40:47 UTC
    ActiveExitTimestampMonotonic=5426351111933
    InactiveEnterTimestamp=Mon 2020-06-29 19:40:47 UTC
    InactiveEnterTimestampMonotonic=5426351125544
    CanStart=yes
    CanStop=yes
    CanReload=no
    CanIsolate=no
    StopWhenUnneeded=no
    RefuseManualStart=no
    RefuseManualStop=no
    AllowIsolate=no
    DefaultDependencies=yes
    OnFailureJobMode=replace
    IgnoreOnIsolate=no
    IgnoreOnSnapshot=no
    NeedDaemonReload=no
    JobTimeoutUSec=0
    JobTimeoutAction=none
    ConditionResult=yes
    AssertResult=yes
    ConditionTimestamp=Mon 2020-06-29 19:40:47 UTC
    ConditionTimestampMonotonic=5426351126565
    AssertTimestamp=Mon 2020-06-29 19:40:47 UTC
    AssertTimestampMonotonic=5426351126565
    Transient=no
    
    Output of "`cat /etc/containerd/config.toml`":
    
    # Version 1 of this config was originally taken from https://github.com/kata-containers/documentation/blob/master/how-to/containerd-kata.md#configure-containerd-to-use-kata-containers
    # `containerd config dump` was used to generate this version 2 config
    # Details of this config are in https://github.com/containerd/cri/blob/master/docs/config.md
    version = 2
    root = "/var/lib/containerd"
    state = "/run/containerd"
    plugin_dir = ""
    disabled_plugins = []
    required_plugins = []
    # The service unit file already sets -999; kept here for redundancy
    oom_score = -999
    
    [grpc]
      address = "/run/containerd/containerd.sock"
      tcp_address = ""
      tcp_tls_cert = ""
      tcp_tls_key = ""
      uid = 0
      gid = 0
      max_recv_message_size = 16777216
      max_send_message_size = 16777216
    
    [ttrpc]
      address = ""
      uid = 0
      gid = 0
    
    [debug]
      address = ""
      uid = 0
      gid = 0
      level = "info"
    
    [metrics]
      address = "0.0.0.0:1338"
      grpc_histogram = false
    
    [cgroup]
      path = ""
    
    [timeouts]
      "io.containerd.timeout.shim.cleanup" = "5s"
      "io.containerd.timeout.shim.load" = "5s"
      "io.containerd.timeout.shim.shutdown" = "3s"
      "io.containerd.timeout.task.state" = "2s"
    
    [plugins]
      [plugins."io.containerd.gc.v1.scheduler"]
        pause_threshold = 0.02
        deletion_threshold = 0
        mutation_threshold = 100
        schedule_delay = "0s"
        startup_delay = "100ms"
      [plugins."io.containerd.grpc.v1.cri"]
        disable_tcp_service = true
        stream_server_address = "127.0.0.1"
        stream_server_port = "0"
        stream_idle_timeout = "4h0m0s"
        enable_selinux = false
        sandbox_image = "docker.ouroath.com:4443/yahoo-cloud/k8s.gcr.io/pause:3.1"
        stats_collect_period = 10
        enable_tls_streaming = false
        max_container_log_line_size = 16384
        disable_cgroup = false
        disable_apparmor = false
        restrict_oom_score_adj = false
        max_concurrent_downloads = 3
        disable_proc_mount = false
        [plugins."io.containerd.grpc.v1.cri".containerd]
          snapshotter = "devmapper"
          default_runtime_name = "runc"
          no_pivot = false
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
            [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata]
              runtime_type = "io.containerd.kata.v2"
              privileged_without_host_devices = true
              [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata.options]
                ConfigPath = "/etc/kata-containers/configuration.toml"
            [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
              runtime_type = "io.containerd.runc.v2"
              privileged_without_host_devices = true
              [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
                NoPivotRoot = false
                NoNewKeyring = false
                ShimCgroup = ""
                IoUid = 0
                IoGid = 0
                BinaryName = ""
                Root = ""
                CriuPath = ""
                SystemdCgroup = true
                CriuImagePath = ""
                CriuWorkPath = ""
        [plugins."io.containerd.grpc.v1.cri".cni]
          bin_dir = "/opt/cni/bin"
          conf_dir = "/etc/cni/net.d"
          max_conf_num = 1
          conf_template = ""
        [plugins."io.containerd.grpc.v1.cri".registry]
          [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
            [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
              endpoint = ["https://registry-1.docker.io"]
        [plugins."io.containerd.grpc.v1.cri".x509_key_pair_streaming]
          tls_cert_file = ""
          tls_key_file = ""
      [plugins."io.containerd.internal.v1.opt"]
        path = "/opt/containerd"
      [plugins."io.containerd.internal.v1.restart"]
        interval = "10s"
      [plugins."io.containerd.metadata.v1.bolt"]
        content_sharing_policy = "shared"
      [plugins."io.containerd.monitor.v1.cgroups"]
        no_prometheus = false
      [plugins."io.containerd.runtime.v1.linux"]
        shim = "containerd-shim"
        runtime = "runc"
        runtime_root = ""
        no_shim = false
        shim_debug = false
        systemd_cgroup = true
      [plugins."io.containerd.runtime.v2.task"]
        platforms = ["linux/amd64"]
      [plugins."io.containerd.service.v1.diff-service"]
        default = ["walking"]
      [plugins."io.containerd.snapshotter.v1.devmapper"]
        root_path = ""
        pool_name = "sys-docker--pool"
        base_image_size = "100GB"
    
    ---

    # Packages

    No `dpkg`
    Have `rpm`

    Output of "`rpm -qa|egrep "(cc-oci-runtime|cc-runtime|runv|kata-proxy|kata-runtime|kata-shim|kata-ksm-throttler|kata-containers-image|linux-container|qemu-)"`":
    
    qemu-vanilla-data-4.1.1+git.99c5874a9b-5.1.x86_64
    qemu-vanilla-bin-4.1.1+git.99c5874a9b-5.1.x86_64
    kata-proxy-1.11.1-5.1.x86_64
    kata-linux-container-debug-5.4.32.73-5.1.x86_64
    kata-shim-bin-1.11.1-5.1.x86_64
    qemu-vanilla-4.1.1+git.99c5874a9b-5.1.x86_64
    kata-proxy-bin-1.11.1-5.1.x86_64
    kata-containers-image-1.11.1-5.1.x86_64
    kata-runtime-1.11.1-5.1.x86_64
    kata-shim-1.11.1-5.1.x86_64
    kata-linux-container-5.4.32.73-5.1.x86_64
    kata-ksm-throttler-1.11.1-5.1.x86_64
    
    ---
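
For reference, the `runtimes.kata` entry in the containerd config above is what a Kubernetes RuntimeClass handler has to point at. Below is a minimal sketch of selecting it from a pod spec, assuming RuntimeClass is enabled on the cluster; the pod name and stress image are illustrative, not taken from this cluster:

```sh
# Hypothetical RuntimeClass + test pod; the handler must match the containerd
# runtime name ("kata") configured in /etc/containerd/config.toml above.
kubectl apply -f - <<'EOF'
apiVersion: node.k8s.io/v1beta1
kind: RuntimeClass
metadata:
  name: kata
handler: kata
---
apiVersion: v1
kind: Pod
metadata:
  name: kata-cpu-test
spec:
  runtimeClassName: kata
  containers:
  - name: stress
    image: polinux/stress          # illustrative stress image
    command: ["stress", "--cpu", "4", "--timeout", "60s"]
    resources:
      requests: { cpu: "4", memory: "2Gi" }
      limits:   { cpu: "4", memory: "2Gi" }
EOF
```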
  • weixin_39637363 5 months ago

    are you running kata-shimv2 ?
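
    As a quick check on the worker node (a sketch, assuming the default shim binary name and config path):

```sh
# The kata v2 shim process should show up while the pod is running:
ps -ef | grep -v grep | grep containerd-shim-kata-v2

# And the CRI config should map the kata runtime to the v2 shim:
grep -A1 'runtimes\.kata\]' /etc/containerd/config.toml
```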

  • weixin_39976748 5 months ago

    yes

  • weixin_39637363 5 months ago

    Hmm, OK, that explains why I couldn't reproduce it with the CLI. Any thoughts?
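
    If it helps narrow things down, a direct shimv2 repro without Kubernetes (a sketch, assuming `ctr` is available on the node and the busybox image is reachable) would be:

```sh
# Illustrative: run a throwaway container through the kata v2 shim and
# count the CPUs the guest exposes to the workload.
sudo ctr image pull docker.io/library/busybox:latest
sudo ctr run --rm --runtime io.containerd.kata.v2 \
    docker.io/library/busybox:latest kata-nproc-test nproc
```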
