weixin_39699163
2021-01-09 08:13

BUG: buildah does not use the layer cache on COPY commands when building Dockerfiles with --layers

BUG REPORT INFORMATION

Description

buildah bud does not use the layer cache for COPY instructions if the build does not run from the same directory as the previous build.

In build environments the GitHub repo is checked out into a different location for each build, but the layer cache for COPY commands is ignored when the build command runs in a different directory than the original one.

Steps to reproduce the issue:

1. Create a Dockerfile with a COPY command and run buildah bud --layers.
2. After it builds, rerun the command to see that the caching worked.
3. Copy the build context to a new location, rebuild, and see that the cache is not used on the COPY commands.
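
A minimal shell sketch of these steps (directory names and Dockerfile contents are illustrative, not from the original report):

$ cat repro/Dockerfile
FROM alpine
COPY tmp /tmp/
$ buildah bud --layers repro             # first build populates the layer cache
$ buildah bud --layers repro             # rerun in place: the COPY step reports "Using cache"
$ cp -R repro /elsewhere/repro           # same content, new location (and new timestamps)
$ buildah bud --layers /elsewhere/repro  # the COPY step re-runs instead of using the cache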

Describe the results you received:

No cache is used

Describe the results you expected:

Cache should be used

Output of rpm -q buildah or apt list buildah:


buildah-1.11.6-4.module+el8.1.1+5259+bcdd613a.x86_64

Output of buildah version:


Version:         1.11.6
Go Version:      go1.12.12
Image Spec:      1.0.1-dev
Runtime Spec:    1.0.1-dev
CNI Spec:        0.4.0
libcni Version:  
image Version:   5.0.0
Git Commit:      
Built:           Wed Dec 31 16:00:00 1969
OS/Arch:         linux/amd64


Output of cat /etc/*release:


NAME="Red Hat Enterprise Linux"
VERSION="8.1 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.1"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.1 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8.1:GA"
HOME_URL="https://www.redhat.com/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.1
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.1"
Red Hat Enterprise Linux release 8.1 (Ootpa)
Red Hat Enterprise Linux release 8.1 (Ootpa)

Output of uname -a:


Linux terefah1.fyre.ibm.com 4.18.0-147.5.1.el8_1.x86_64 #1 SMP Tue Jan 14 15:50:19 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Output of cat /etc/containers/storage.conf:


# This file is is the configuration file for all tools
# that use the containers/storage library.
# See man 5 containers-storage.conf for more information
# The "container storage" table contains all of the server options.
[storage]

# Default Storage Driver
driver = "overlay"

# Temporary storage location
runroot = "/var/run/containers/storage"

# Primary Read/Write location of container storage
graphroot = "/var/lib/containers/storage"

[storage.options]
# Storage options to be passed to underlying storage drivers

# AdditionalImageStores is used to pass paths to additional Read/Only image stores
# Must be comma separated list.
additionalimagestores = [
]

# Size is used to set a maximum size of the container image.  Only supported by
# certain container storage drivers.
size = ""

# Path to an helper program to use for mounting the file system instead of mounting it
# directly.
#mount_program = "/usr/bin/fuse-overlayfs"

# OverrideKernelCheck tells the driver to ignore kernel checks based on kernel version
override_kernel_check = "true"

# mountopt specifies comma separated list of extra mount options
# mountopt = "nodev,metacopy=on"

# Remap-UIDs/GIDs is the mapping from UIDs/GIDs as they should appear inside of
# a container, to UIDs/GIDs as they should appear outside of the container, and
# the length of the range of UIDs/GIDs.  Additional mapped sets can be listed
# and will be heeded by libraries, but there are limits to the number of
# mappings which the kernel will allow when you later attempt to run a
# container.
#
# remap-uids = 0:1668442479:65536
# remap-gids = 0:1668442479:65536

# Remap-User/Group is a name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid or /etc/subgid file.  Mappings are set up starting
# with an in-container ID of 0 and the a host-level ID taken from the lowest
# range that matches the specified name, and using the length of that range.
# Additional ranges are then assigned, using the ranges which specify the
# lowest host-level IDs first, to the lowest not-yet-mapped container-level ID,
# until all of the entries have been used for maps.
#
# remap-user = "storage"
# remap-group = "storage"

[storage.options.thinpool]
# Storage Options for thinpool

# autoextend_percent determines the amount by which pool needs to be
# grown. This is specified in terms of % of pool size. So a value of 20 means
# that when threshold is hit, pool will be grown by 20% of existing
# pool size.
# autoextend_percent = "20"

# autoextend_threshold determines the pool extension threshold in terms
# of percentage of pool size. For example, if threshold is 60, that means when
# pool is 60% full, threshold has been hit.
# autoextend_threshold = "80"

# basesize specifies the size to use when creating the base device, which
# limits the size of images and containers.
# basesize = "10G"

# blocksize specifies a custom blocksize to use for the thin pool.
# blocksize="64k"

# directlvm_device specifies a custom block storage device to use for the
# thin pool. Required if you setup devicemapper.
# directlvm_device = ""

# directlvm_device_force wipes device even if device already has a filesystem.
# directlvm_device_force = "True"

# fs specifies the filesystem type to use for the base device.
# fs="xfs"

# log_level sets the log level of devicemapper.
# 0: LogLevelSuppress 0 (Default)
# 2: LogLevelFatal
# 3: LogLevelErr
# 4: LogLevelWarn
# 5: LogLevelNotice
# 6: LogLevelInfo
# 7: LogLevelDebug
# log_level = "7"

# min_free_space specifies the min free space percent in a thin pool require for
# new device creation to succeed. Valid values are from 0% - 99%.
# Value 0% disables
# min_free_space = "10%"

# mkfsarg specifies extra mkfs arguments to be used when creating the base.
# device.
# mkfsarg = ""

# use_deferred_removal marks devicemapper block device for deferred removal.
# If the thinpool is in use when the driver attempts to remove it, the driver 
# tells the kernel to remove it as soon as possible. Note this does not free
# up the disk space, use deferred deletion to fully remove the thinpool.
# use_deferred_removal = "True"

# use_deferred_deletion marks thinpool device for deferred deletion.
# If the device is busy when the driver attempts to delete it, the driver
# will attempt to delete device every 30 seconds until successful.
# If the program using the driver exits, the driver will continue attempting
# to cleanup the next time the driver is used. Deferred deletion permanently
# deletes the device and all data stored in device will be lost.
# use_deferred_deletion = "True"

# xfs_nospace_max_retries specifies the maximum number of retries XFS should
# attempt to complete IO when ENOSPC (no space) error is returned by
# underlying storage device.
# xfs_nospace_max_retries = "0"

# If specified, use OSTree to deduplicate files with the overlay backend
ostree_repo = ""

# Set to skip a PRIVATE bind mount on the storage home directory.  Only supported by
# certain container storage drivers
skip_mount_home = "false"

This question comes from the open-source project: containers/buildah


19 replies

  • weixin_39998795 4 months ago

    Yes. One of the patches in the #2480 WIP will change that.

  • weixin_39961943 4 months ago

    This is intentional. If you want to change the default behaviour, you can set the BUILDAH_LAYERS environment variable: export BUILDAH_LAYERS=true

  • weixin_39961943 4 months ago

    Looks like I misunderstood the error.

  • weixin_39699163 4 months ago

    Sorry, I updated the title to make it a bit clearer.

  • weixin_39699163 4 months ago

    Unfortunately, this defect makes a build that should take 5 minutes take 2 hours. And when you have hundreds of builds a day, that breaks the infrastructure.

  • weixin_39961943 4 months ago

    Does Docker do this correctly?

    I think what is happening is that the timestamps are different in the new directory, and we re-copy when the timestamps change.

  • weixin_39699163 4 months ago

    Docker does this correctly; otherwise the cache is pretty useless. I'm assuming Docker uses a checksum, but maybe they use the timestamp of the file, not of some parent directory. Because, to be clear, the timestamp on the file being copied isn't different.
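
    For illustration only (this is not how either tool is documented to work here), a cache key derived purely from file paths and contents would be stable across directories and timestamps. A rough sketch:

    $ cd repro
    $ find . -type f -print0 | sort -z | xargs -0 sha256sum | sha256sum
    # The digest depends only on relative paths and file contents, so an
    # identical tree copied or re-cloned elsewhere yields the same key.
    # (A real implementation would also fold in modes and ownership.)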

  • weixin_39961943 4 months ago

    OK, I attempted to recreate what you are describing. I copied content into a directory, then built with a Dockerfile that looked like:

    
    cat /tmp/Dockerfile 
    FROM alpine
    COPY . /tmp
    RUN echo hello
    

    Now I run the following:

    
    $ cd test1
    $ buildah bud --layers -f /tmp/Dockerfile .
    STEP 1: FROM alpine
    STEP 2: COPY . /tmp
    --> c8f5b1f8d21
    STEP 3: RUN echo hello
    hello
    STEP 4: COMMIT
    --> 2fa582fe16f
    2fa582fe16f42912fd998382f4b5b9419bf79eed78a247c5467cdbe27809ac31
    [dwalsh test1]$ buildah bud --layers -f /tmp/Dockerfile .
    STEP 1: FROM alpine
    STEP 2: COPY . /tmp
    --> Using cache c8f5b1f8d21a485dd1ccff84ff115dc1fb2c7811532b0d92689b30ef948aa7a3
    STEP 3: RUN echo hello
    --> Using cache 2fa582fe16f42912fd998382f4b5b9419bf79eed78a247c5467cdbe27809ac31
    2fa582fe16f42912fd998382f4b5b9419bf79eed78a247c5467cdbe27809ac31
    

    Now I create a new directory and mv all of the content into it, preserving the timestamps:

    
    $ cd ../test2/
    $ mv ../test1/* .
    $ buildah bud --layers -f /tmp/Dockerfile .
    STEP 1: FROM alpine
    STEP 2: COPY . /tmp
    --> Using cache c8f5b1f8d21a485dd1ccff84ff115dc1fb2c7811532b0d92689b30ef948aa7a3
    STEP 3: RUN echo hello
    --> Using cache 2fa582fe16f42912fd998382f4b5b9419bf79eed78a247c5467cdbe27809ac31
    2fa582fe16f42912fd998382f4b5b9419bf79eed78a247c5467cdbe27809ac31
    

    This looks like Buildah is doing the correct thing: the cache is used, no?

  • weixin_39699163 4 months ago

    I guess my steps are a bit different. Here is a repro I just tried:

    
    [~]$ sudo buildah bud --layers repro
    STEP 1: FROM alpine
    Getting image source signatures
    Copying blob c9b1b535fdd9 done
    Copying config e7d92cdc71 done
    Writing manifest to image destination
    Storing signatures
    STEP 2: RUN echo "test1"
    test1
    631cd7e913a15875055fa301357c7dea4d1615c213c8970fb3b8f2ef99d52699
    STEP 3: COPY tmp /tmp/
    7a9b7e84ada3e6018dc12968bdc33f75077f002c7277434319fdc3fd6c237375
    STEP 4: RUN echo "test2"
    test2
    STEP 5: COMMIT
    4d78da16191c4780486a056b3497426c5881566207d53d55520b9f6cae179679
    [~]$ mkdir tempdir
    [~]$ cp -R repro/ tempdir
    [~]$ sudo buildah bud --layers tempdir/repro
    STEP 1: FROM alpine
    STEP 2: RUN echo "test1"
    --> Using cache 631cd7e913a15875055fa301357c7dea4d1615c213c8970fb3b8f2ef99d52699
    STEP 3: COPY tmp /tmp/
    5c6015c7affb7bd6c9ed54ff983f11073c190460df2f0b0f1cd8b5acdb5f849d
    STEP 4: RUN echo "test2"
    test2
    STEP 5: COMMIT
    54719a513ecdd4bfbf1e8692a418a72d3ba281dd9c3cb4a780ef4692db78f631
    
  • weixin_39699163 4 months ago

    So you can imagine that if this were a GitHub repo, it would be the same as just checking out the repo in another directory.

  • weixin_39961943 4 months ago

    Right, your cp -R is creating files/directories with new timestamps.
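
    That is easy to confirm with stat (paths here are illustrative): cp -R without -p or -a assigns fresh timestamps, while mv on the same filesystem, and cp -a, preserve them.

    $ cp -R test1 test2     # new mtimes on every copied file and directory
    $ cp -a test1 test3     # -a preserves timestamps, modes, and ownership
    $ stat -c '%y %n' test1/Dockerfile test2/Dockerfile test3/Dockerfile
    # test2's Dockerfile shows a current mtime; test1 and test3 match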

  • weixin_39699163 4 months ago

    Why is the cache based on timestamps anyway? It should be based on a checksum.

  • weixin_39699163 4 months ago

    That's the main issue here, as a git clone will always produce new timestamps.

  • weixin_39669163 4 months ago

    Anything we can do to work around this issue? We are seeing the same thing with our CI/CD server, where we check the code out from git and it always has new timestamps on the directories.

    
    STEP 1: FROM quay.io/redacted/ruby:2.5.5 AS build
    STEP 2: USER root
    --> Using cache abcd59f7743cfa5fe1c719ac54acd8c79a5d69d3869b5cc64daea3978f8e88fb
    STEP 3: RUN dnf install -y gcc gcc-c++ make libcurl make mysql-devel nodejs python tzdata nodejs-yarn &&     ln -s /usr/bin/yarnpkg /usr/bin/yarn
    --> Using cache a7cc164bcb56ef649daf8fa7e15c0a99b347c913fb9316450ed0a650a78d36fd
    STEP 4: WORKDIR /opt/apps/fooapp
    --> Using cache 922c202af0471fe87406bdf044c43588e51267d874657e7fc000daf995b5a30c
    STEP 5: COPY Gemfile Gemfile.common Gemfile.lock /opt/apps/fooapp/
    --> Using cache 68b55b90c1436719901dad40f5eb643620d39e0210bbcef8d5610af6d3d5956c
    STEP 6: COPY vendor /opt/apps/fooapp/vendor
    --> Using cache 9d5c9fe80444b6013302965a4e1d6d85418af1911e891e63b2f54f80d50291e4
    STEP 7: RUN bundle install --clean --local --jobs $(nproc) --path vendor/bundle
    --> Using cache 9485c9b3355820ea2bec06f83f13a6142ab3a5718d924ab6c2e0101a4269e39a
    STEP 8: COPY . /opt/apps/fooapp
    e217f2c47e3c683048f2dbc12f65629aa33c4b71433ff7ce27a61d7167cc5e07
    

    So rather than using the cache when it is already present, it re-copies. We tested in the same directory, and it built the cache and then reused it, since nothing changed.
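
    One possible CI-side workaround, assuming the cache really does key on mtimes: normalize the timestamps of the checkout to a deterministic value (here, the last commit date) before building. A sketch, not verified against this buildah version:

    $ cd "$CHECKOUT_DIR"   # hypothetical checkout location
    $ git ls-files -z | xargs -0 touch --date="$(git log -1 --format=%ci)"
    $ buildah bud --layers .
    # Every tracked file now carries the committer date of HEAD as its
    # mtime, so identical commits produce identical mtimes no matter when
    # or where they were cloned.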

  • weixin_39699163 4 months ago

    The workaround is to use Docker instead.

  • weixin_39993623 4 months ago

    This feels like a big barrier to using buildah in a CI/CD system. I've been wondering why only the first layer in my build file was getting cached, and now I think I see why.

  • weixin_39844525 4 months ago

    This would be nice to change: it would improve performance by not rebuilding the whole image every time in a CI/CD pipeline.

  • weixin_39961943 4 months ago

    Here is what I see.

    
    $ ls -l test
    total 4
    -rw-r--r--. 1 dwalsh dwalsh 65 Aug  6 10:46 Dockerfile
    $ cat test/Dockerfile 
    FROM docker.io/library/alpine:latest
    RUN echo hello
    COPY . /test
    $ buildah bud --layers=true test
    STEP 1: FROM docker.io/library/alpine:latest
    STEP 2: RUN echo hello
    STEP 3: COPY . /test
    STEP 4: COMMIT
    --> b5f67517c95
    b5f67517c95ad7c9e862ced0bfd35f14ac731f2245f134cb9e6cb05033bb7d57
    

    Now I build for a second time and I see the cache working.

    
    $ buildah bud --layers=true test
    STEP 1: FROM docker.io/library/alpine:latest
    STEP 2: RUN echo hello
    --> Using cache a150f9fab2a152d0cb32d3e887264f753a6375239a299ed20503683b018a1ca0
    STEP 3: COPY . /test
    --> Using cache b5f67517c95ad7c9e862ced0bfd35f14ac731f2245f134cb9e6cb05033bb7d57
    b5f67517c95ad7c9e862ced0bfd35f14ac731f2245f134cb9e6cb05033bb7d57
    

    Now I am going to copy the directory to another directory and run the build against the new directory.

    
    $ cp -R test test1
    $ buildah bud --layers=true test1
    STEP 1: FROM docker.io/library/alpine:latest
    STEP 2: RUN echo hello
    --> Using cache a150f9fab2a152d0cb32d3e887264f753a6375239a299ed20503683b018a1ca0
    STEP 3: COPY . /test
    STEP 4: COMMIT
    --> 466d048d293
    466d048d293833b7ed8bcaef298acde2e20181c4486a15faab012bfa08b82616
    

    Notice that the RUN step uses the cache and the COPY step does not.

    You are saying that the cache should still have been used?

  • weixin_39961943 4 months ago

    Is the COPY command looking at file creation dates for its cache check?

