weixin_39669163
2021-01-09 07:57

Buildah hangs on buildah commit when called via buildah unshare

Description

Buildah hangs on buildah commit when it is called via buildah unshare.

Steps to reproduce the issue:

1. Use this script:


#!/usr/bin/env bash
set -o errexit

FEDORA_VERSION=30
CONTAINER_LANG=en_US.UTF-8
CONTAINER=$(buildah from scratch)

MOUNTPOINT=$(buildah mount ${CONTAINER})

dnf install --disablerepo=* --enablerepo=fedora,updates \
            --installroot ${MOUNTPOINT} \
            --releasever ${FEDORA_VERSION} \
            bash coreutils dnf fedora-release-container glibc-minimal-langpack glibc-langpack-en \
            libcrypt procps-ng rootfiles rpm shadow-utils tar util-linux vim-minimal \
            --nodocs \
            --setopt install_weak_deps=false -y
dnf -y clean all --installroot=${MOUNTPOINT}

printf "tsflags=nodocs\n" >> ${MOUNTPOINT}/etc/dnf/dnf.conf
echo "%_install_langs $CONTAINER_LANG" > ${MOUNTPOINT}/etc/rpm/macros.image-language-conf
echo "# fstab intentionally empty for containers" > ${MOUNTPOINT}/etc/fstab

rm -fv ${MOUNTPOINT}/etc/localtime
ln -s ${MOUNTPOINT}/usr/share/zoneinfo/UTC ${MOUNTPOINT}/etc/localtime

rm -rfv ${MOUNTPOINT}/var/cache/* ${MOUNTPOINT}/var/log/* ${MOUNTPOINT}/tmp/*

buildah config --author ops.com \
               --env LANG=$CONTAINER_LANG \
               --user 1001 ${CONTAINER}

buildah commit ${CONTAINER} localhost/fedora:${FEDORA_VERSION}

buildah unmount ${CONTAINER}
buildah rm ${CONTAINER}

and run it with buildah unshare scripts/build_base_container.sh

2. Wait for the buildah commit step.

3. Press Ctrl+C to stop the hung buildah.

Describe the results you received:


$ ps aux |grep buildah
jdoss    22194  0.3  0.2 684960 40948 pts/0    Sl+  19:00   0:00 buildah commit working-container localhost/fedora:30
jdoss    22229  0.0  0.0 215744   816 pts/1    S+   19:00   0:00 grep --color=auto buildah
jdoss    23614  0.0  0.1 750940 38340 pts/0    Sl+  18:56   0:00 buildah unshare scripts/build_base_container.sh
jdoss    23672  0.0  0.1 743832 38048 pts/0    Sl+  18:56   0:00 buildah-in-a-user-namespace unshare scripts/build_base_container.sh

You can see that it hung on buildah commit working-container localhost/fedora:30. It never finishes. Hitting Ctrl+C results in a [buildah] <defunct> zombie process.

Describe the results you expected:

A super sweet localhost/fedora:30 container built via Buildah

Output of rpm -q buildah or apt list buildah:


$ rpm -q buildah
buildah-1.9.0-1.git00eb895.fc30.x86_64

This happens on 1.8.2-1.gite23314b.fc30 and buildah-1.7-18.git873f001.fc30.x86_64 too

Output of buildah version:


$ buildah version
Version:         1.9.0
Go Version:      go1.12.5
Image Spec:      1.0.0
Runtime Spec:    1.0.0
CNI Spec:        0.4.0
libcni Version:  
Git Commit:      
Built:           Wed Dec 31 18:00:00 1969
OS/Arch:         linux/amd64

Output of podman version if reporting a podman build issue:


$ podman version
Version:            1.4.0
RemoteAPI Version:  1
Go Version:         go1.12.5
OS/Arch:            linux/amd64

Output of cat /etc/*release:


$ cat /etc/*release
Fedora release 30 (Thirty)
NAME=Fedora
VERSION="30 (Workstation Edition)"
ID=fedora
VERSION_ID=30
VERSION_CODENAME=""
PLATFORM_ID="platform:f30"
PRETTY_NAME="Fedora 30 (Workstation Edition)"
ANSI_COLOR="0;34"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:30"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f30/system-administrators-guide/"
SUPPORT_URL="https://fedoraproject.org/wiki/Communicating_and_getting_help"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=30
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=30
PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
VARIANT="Workstation Edition"
VARIANT_ID=workstation
Fedora release 30 (Thirty)
Fedora release 30 (Thirty)

Output of uname -a:


$ uname -a
Linux sts71 5.1.8-300.fc30.x86_64 #1 SMP Sun Jun 9 17:09:32 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Output of cat /etc/containers/storage.conf:


$ cat /etc/containers/storage.conf 
# This file is is the configuration file for all tools
# that use the containers/storage library.
# See man 5 containers-storage.conf for more information
# The "container storage" table contains all of the server options.
[storage]

# Default Storage Driver
driver = "overlay"

# Temporary storage location
runroot = "/var/run/containers/storage"

# Primary Read/Write location of container storage
graphroot = "/var/lib/containers/storage"

[storage.options]
# Storage options to be passed to underlying storage drivers

# AdditionalImageStores is used to pass paths to additional Read/Only image stores
# Must be comma separated list.
additionalimagestores = [
]

# Size is used to set a maximum size of the container image.  Only supported by
# certain container storage drivers.
size = ""

# Path to an helper program to use for mounting the file system instead of mounting it
# directly.
#mount_program = "/usr/bin/fuse-overlayfs"

# OverrideKernelCheck tells the driver to ignore kernel checks based on kernel version
override_kernel_check = "true"

# mountopt specifies comma separated list of extra mount options
mountopt = "nodev,metacopy=on"

# Remap-UIDs/GIDs is the mapping from UIDs/GIDs as they should appear inside of
# a container, to UIDs/GIDs as they should appear outside of the container, and
# the length of the range of UIDs/GIDs.  Additional mapped sets can be listed
# and will be heeded by libraries, but there are limits to the number of
# mappings which the kernel will allow when you later attempt to run a
# container.
#
# remap-uids = 0:1668442479:65536
# remap-gids = 0:1668442479:65536

# Remap-User/Group is a name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid or /etc/subgid file.  Mappings are set up starting
# with an in-container ID of 0 and the a host-level ID taken from the lowest
# range that matches the specified name, and using the length of that range.
# Additional ranges are then assigned, using the ranges which specify the
# lowest host-level IDs first, to the lowest not-yet-mapped container-level ID,
# until all of the entries have been used for maps.
#
# remap-user = "storage"
# remap-group = "storage"

[storage.options.thinpool]
# Storage Options for thinpool

# autoextend_percent determines the amount by which pool needs to be
# grown. This is specified in terms of % of pool size. So a value of 20 means
# that when threshold is hit, pool will be grown by 20% of existing
# pool size.
# autoextend_percent = "20"

# autoextend_threshold determines the pool extension threshold in terms
# of percentage of pool size. For example, if threshold is 60, that means when
# pool is 60% full, threshold has been hit.
# autoextend_threshold = "80"

# basesize specifies the size to use when creating the base device, which
# limits the size of images and containers.
# basesize = "10G"

# blocksize specifies a custom blocksize to use for the thin pool.
# blocksize="64k"

# directlvm_device specifies a custom block storage device to use for the
# thin pool. Required if you setup devicemapper.
# directlvm_device = ""

# directlvm_device_force wipes device even if device already has a filesystem.
# directlvm_device_force = "True"

# fs specifies the filesystem type to use for the base device.
# fs="xfs"

# log_level sets the log level of devicemapper.
# 0: LogLevelSuppress 0 (Default)
# 2: LogLevelFatal
# 3: LogLevelErr
# 4: LogLevelWarn
# 5: LogLevelNotice
# 6: LogLevelInfo
# 7: LogLevelDebug
# log_level = "7"

# min_free_space specifies the min free space percent in a thin pool require for
# new device creation to succeed. Valid values are from 0% - 99%.
# Value 0% disables
# min_free_space = "10%"

# mkfsarg specifies extra mkfs arguments to be used when creating the base.
# device.
# mkfsarg = ""

# use_deferred_removal marks devicemapper block device for deferred removal.
# If the thinpool is in use when the driver attempts to remove it, the driver 
# tells the kernel to remove it as soon as possible. Note this does not free
# up the disk space, use deferred deletion to fully remove the thinpool.
# use_deferred_removal = "True"

# use_deferred_deletion marks thinpool device for deferred deletion.
# If the device is busy when the driver attempts to delete it, the driver
# will attempt to delete device every 30 seconds until successful.
# If the program using the driver exits, the driver will continue attempting
# to cleanup the next time the driver is used. Deferred deletion permanently
# deletes the device and all data stored in device will be lost.
# use_deferred_deletion = "True"

# xfs_nospace_max_retries specifies the maximum number of retries XFS should
# attempt to complete IO when ENOSPC (no space) error is returned by
# underlying storage device.
# xfs_nospace_max_retries = "0"

# If specified, use OSTree to deduplicate files with the overlay backend
ostree_repo = ""

# Set to skip a PRIVATE bind mount on the storage home directory.  Only supported by
# certain container storage drivers
skip_mount_home = "false"

This question comes from the open-source project: containers/buildah


11 answers

  • weixin_39669163 4 months ago

    A quick follow-up: the script works fine when run as the root user. It also worked with buildah unshare on Sat, May 11, 2019, which is the last time I built this base image with it.

  • weixin_39945816 4 months ago

    Thanks for the great report. I can reproduce the issue with the script you provided.

  • weixin_39669163 4 months ago

    No problem! Let me know if you need anything else to help track it down.

  • weixin_39669163 4 months ago

    I was building another image whose script does not include this line:

    rm -rfv ${MOUNTPOINT}/var/cache/* ${MOUNTPOINT}/var/log/* ${MOUNTPOINT}/tmp/*

    It was able to finish without issues. It seems that removing files from the mount point is what causes the hang.

    
    $ uname -a 
    Linux sts71 5.1.16-300.fc30.x86_64 #1 SMP Wed Jul 3 15:06:51 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
    $ buildah --version
    buildah version 1.9.0 (image-spec 1.0.0, runtime-spec 1.0.0)
    
  • weixin_39945816 4 months ago

    Thanks for checking, and apologies for my inactivity. I will have a look now.

    Could this be a fuse-overlay issue?

  • weixin_39625975 4 months ago

    "Could this be a fuse-overlay issue?"

    It could be an issue there. A few months ago we had an issue with fuse-overlayfs hanging on flocks.

    Were you able to reproduce it with the provided script?

  • weixin_39625975 4 months ago

    The issue seems to be related to dnf install creating a symlink /etc/localtime that fully resolves to ~/.local/share/containers/storage/overlay/aaa537b35ac6f6aaf8396189c7bdd9d681cf6b454c1ae0a63877e1c7630dc321/merged/usr/share/zoneinfo/UTC.

    This causes fuse-overlayfs to hang: when it tries to open that file, the kernel sends another request to the fuse-overlayfs process, which cannot complete it because it is already blocked on the first request.

    The fuse-overlayfs hang can be solved by using threads. I am working on supporting them in https://github.com/containers/fuse-overlayfs/pull/88 and just added a patch to address this specific case.

    It is still a problem that there are symlinks in the container image that point to an absolute host path. Even when running as root, I get the following (I changed only the image name in your script):

    
    $ sudo podman run --rm tmp readlink  /etc/localtime
    /var/lib/containers/storage/overlay/c944874a04aab9f005f692b854e1121ac97b1c4bc79901d8097a34a9c8874910/merged/usr/share/zoneinfo/UTC
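
A symlink like the one above can be caught before committing by scanning the mounted rootfs for links whose stored target begins with the mount point's own host path. The following is only a sketch: find_host_symlinks is a hypothetical helper name, not part of buildah, and it assumes file names without embedded newlines.

```shell
# find_host_symlinks ROOTFS
# Hypothetical helper (not a buildah command): prints every symlink under
# ROOTFS whose stored target starts with the host path of ROOTFS itself.
# Assumes file names do not contain newlines.
find_host_symlinks() {
    rootfs=${1%/}   # drop a single trailing slash, if any
    find "${rootfs}" -type l | while IFS= read -r link; do
        target=$(readlink "${link}")
        case "${target}" in
            # The quoted prefix is matched literally; only the final * is a glob.
            "${rootfs}"/*) printf '%s\n' "${link}" ;;
        esac
    done
}
```

In the reporter's script this could be called as find_host_symlinks "${MOUNTPOINT}" just before buildah commit; any output would indicate a link that leaks the host path into the image.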
    
  • weixin_39625975 4 months ago

    Thanks for spotting it.

    So the issue is not in dnf; the symlink is created by the script itself:

    
    ln -s ${MOUNTPOINT}/usr/share/zoneinfo/UTC ${MOUNTPOINT}/etc/localtime
    

    That is not correct as the target is outside of the container rootfs.

    It must be changed to:

    
    ln -s /usr/share/zoneinfo/UTC ${MOUNTPOINT}/etc/localtime
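
The difference between the two forms can be seen without buildah. This is a minimal sketch that uses a temporary directory as a stand-in for the mount point; the ROOTFS variable and the localtime.bad file name are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for ${MOUNTPOINT}; no buildah is involved here.
ROOTFS=$(mktemp -d)
mkdir -p "${ROOTFS}/usr/share/zoneinfo" "${ROOTFS}/etc"
touch "${ROOTFS}/usr/share/zoneinfo/UTC"

# Incorrect form: the stored target embeds the host path of the mount point.
ln -s "${ROOTFS}/usr/share/zoneinfo/UTC" "${ROOTFS}/etc/localtime.bad"

# Corrected form: the stored target is the path as seen inside the container.
ln -s /usr/share/zoneinfo/UTC "${ROOTFS}/etc/localtime"

# readlink prints the stored target without resolving it.
BAD_TARGET=$(readlink "${ROOTFS}/etc/localtime.bad")
GOOD_TARGET=$(readlink "${ROOTFS}/etc/localtime")
printf 'bad:  %s\ngood: %s\n' "${BAD_TARGET}" "${GOOD_TARGET}"
```

The first argument to ln -s is stored verbatim as the link target, so only the second argument (where the link itself is created) should carry the mount point prefix.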
    
  • weixin_39669163 4 months ago

    This makes total sense. I just tested my script with the fixed symlink and it works as expected. Thanks a lot!!

    It was working a few months ago without issue with the incorrect symlink, so I suspect my symlink only worked because of a bug of some sort that has since been fixed?

  • weixin_39625975 4 months ago

    "It was working a few months ago without issue with the incorrect symlink, so I suspect my symlink only worked because of a bug of some sort that has since been fixed?"

    Yes, I think it is related to changes in fuse-overlayfs. There were a few that could have triggered the issue.

  • weixin_39669163 4 months ago

    Thank you both!!

