weixin_39517797
weixin_39517797
2020-12-02 11:51

Build dependencies without codegen ("cargo check") where possible

It would be great if we could build libstd as well as dependent crates (in cargo miri) without codegen ("check" mode). This would (a) save time, (b) avoid having to generate MIR that works both in Miri and when compiled normally, and (c) hopefully drastically simplify cross-building.

looked into this mostly due to (b), as this turned into quite a challenge for unwinding. I am copying from his post:

The long-term solution is to switch to using cargo check. This completely disables codegen in a way that's built into the compiler. We will no longer need to worry about any assumptions made by the codegen backend, since that code will simply never be executed.

However, this turned out to be more complicated than I anticipated. There are several moving parts here:

  1. Making xargo run cargo check

Currently, xargo unconditionally uses cargo build when compiling libstd. I've opened a PR to add a cargo_mode = "check" option to Xargo.toml: https://github.com/japaric/xargo/pull/267

  1. Making cargo-miri run cargo check.

This is complicated by the fact that we currently run cargo rustc in order to pass arguments to the last (and only to the last) invocation of rustc. Sadly, there appears to be no way to combine cargo check and cargo rustc - you can either use cargo rustc and have no way to skip codegen, or use cargo check and have no way to pass final-crate-specific rustc arguments.

To work around this issue, I've added a hack to serialize arguments to an environment variable. When we compile the final crate, we deserialize these arguments from the environment variable, and manually pass them to the rustc invocation.

  1. Making build dependencies use the proper libstd.

This is by far the trickiest part of this entire PR. When Cargo invokes our cargo-miri wrapper, we have three cases to worry about:

  1. Build dependencies (including build scripts): We pass through all arguments completely unmodified to rustc. Miri does not interact in any way with build scripts, so we want to treat them as if we were doing a normal run of cargo.

  2. Normal dependencies: We add our custom sysroot, but still invoke rustc. Since we are in cargo check mode, this will cause rustc to produce metadata for the normal dependenices of our runtime crate, using our custom libstd.

  3. The target itself (e.g. a test or a binary): We invoke miri, and actually begin execution.

Handles these three cases ensures that build dependencies are built using the normal platform libstd (as if cargo-miri did not exist), while normal dependencies and our target crate are built against our custom libstd).

Unfortunately, distinguishing between these three cases is a huge pain. I'm currently relying on the following tricks:

  1. Use CARGO_MANIFEST_DIR to detect when our target crate is being built.

The CARGO_MANIFEST_DIR is set by Cargo to the directory containing the manifest of the package currently being built. During the initial, non-wrapper invocation of cargo miri (e.g. when the user types cargo miri run on the command line), we determine the manifest directly for the crate they are building. We then compare this to CARGO_MANIFEST_DIR when we are being invoked by Cargo as a wrapper.

However, this fails to distinguish between a build.rs and the actual compilation, since both use the same manifest directory. This brings us to the second trick.

  1. Inspect the --emit= flag passed to rustc by Cargo.

This trick relies on the fact that we are using cargo check to build the crate. When Cargo compiles a build dependency, the --emit= flag will always contain link- this is because we always need to produce a runnable binary for build scripts.

When we are building normal dependencies, the --emit= flag will not contain link. This is how cargo check tells rustc not to perform codegen - the --emit= flag will be --emit=dep-info,metadata.

By checking for the presence of link, we can determine whether or not Cargo is trying to compile a build dependency. Note that the same crate could theoretically be build as both - e.g. you could add [dependencies] cc=x.y.z and [build-dependencies] cc=x.y.z to your Cargo.toml.

Adding more information to Cargo

While I believe assumptions behind the above Cargo hacks are fairly sound, this is not really a viable long-term solution. For example, Cargo could choose to stop passing the --emit flag if rustc would have it default to what Cargo already wanted.

Ideally, Cargo would set an environment variable to let us know which of the three cases we are in - target crate, build dependency, or normal dependency. I'm currently working on a PR that does just that.

Conclusion

I think our best path forward is to:

  1. Land cargo check support in xargo
  2. Merge some form of this PR, possible in multiple pieces. Depending on how long review takes, it might make sense to merge the eh_catch_typeinfo hack into rustc so that nightly users can have a (somewhat) working Miri again
  3. Work with the Cargo team to expose the information we need in a cleaner way. I think this information could be of use to other Cargo wrappers. In particular, build dependencies have very different semantics from regular dependencies (e.g. they will be built for a different target during cross-compilation), but wrappers have no clean way of determining which they are building.

该提问来源于开源项目:rust-lang/miri

  • 点赞
  • 写回答
  • 关注问题
  • 收藏
  • 复制链接分享
  • 邀请回答

12条回答

  • weixin_39762464 weixin_39762464 5月前

    Does that work when you use --target, but specify whatever target would otherwise have been used?

    Yes, cargo is dumb :)

    点赞 评论 复制链接分享
  • weixin_39548968 weixin_39548968 5月前

    Yes, cargo is dumb :)

    Things like that are why I'd (eventually) like to get Cargo to explicitly give us the information we need via an environment variable. I'm worried that Cargo could accidentally or deliberately break any of the weird quirks we might want to rely on, leaving us with no path forward.

    点赞 评论 复制链接分享
  • weixin_39517797 weixin_39517797 5月前

    Indeed, cargo build --target $host is not the same as cargo build. In fact, the --target variant is in some sense more sane as it properly distinguishes intermediate artifacts (build scripts) from the final artifact. Also see https://github.com/rust-lang/cargo/issues/4423.

    点赞 评论 复制链接分享
  • weixin_39762464 weixin_39762464 5月前

    I'm worried that Cargo could accidentally or deliberately break any of the weird quirks we might want to rely on, leaving us with no path forward.

    Xargo and rustc's build system currently depend on this to prevent build dependencies (linked to a different libstd) from becoming part of the sysroot.

    点赞 评论 复制链接分享
  • weixin_39548968 weixin_39548968 5月前

    Xargo and rustc's build system currently depend on this to prevent build dependencies (linked to a different libstd) from becoming part of the sysroot.

    Ah, okay. If we're already depending on that, I guess it doesn't matter if we add another dependency on it (at least for the time being).

    I'll see if I can update check to use that.

    点赞 评论 复制链接分享
  • weixin_39517797 weixin_39517797 5月前

    's xargo-check is now available in the latest released xargo.

    点赞 评论 复制链接分享
  • weixin_39517797 weixin_39517797 5月前

    So, as already mentioned I generally agree that we want "cargo check". However, I am not quite sure about the approach you chose to distinguish, inside cargo miri, the three cases. As mentioned, when --target is set, cargo already does the work for us of distinguishing "build dependencies" host architecture) from "normal" dependencies (target architecture); I think we should use that. That probably involves using RUSTFLAGS for some things where we currently immediately inject arguments in the rustc wrapper.

    Right now, we are using the presence of the flags passed via cargo rustc -- flags to distinguish the final binary (that we want to run) from everything else. What is the reason for you to switch to CARGO_MANIFEST_DIR for that?

    Also is there any chance cargo could tell us a bit more ("inside" cargo rustc) about what the crate is being built for?

    点赞 评论 复制链接分享
  • weixin_39548968 weixin_39548968 5月前

    What is the reason for you to switch to CARGO_MANIFEST_DIR for that?

    There's no way to combine cargo check and cargo rustc, so if we're doing a cago check build of the target crate, we need some other way to detect when we get invoked for the final crate.

    点赞 评论 复制链接分享
  • weixin_39517797 weixin_39517797 5月前

    But we could do cargo rustc and adjust the flags passed to rustc such that these are check-only builds, right?

    点赞 评论 复制链接分享
  • weixin_39548968 weixin_39548968 5月前

    : I tried that at first - it turned out to require even more hacks that this current PR does.

    点赞 评论 复制链接分享
  • weixin_39517797 weixin_39517797 5月前

    How do check mode and -Zno-codegen interact?

    Using cargo check will also require a careful check of things like crates with multiple binaries and cargo miri test to make sure what we do is at least not less sensible than before. We do not have a test suite of such cases.

    Also, what about using the crate type (bin vs lib) to distinguish things? Assuming we already recognize build dependencies because we use --target and RUSTFLAGS; binary crates should be the ones we actually want to run.

    点赞 评论 复制链接分享
  • weixin_39548968 weixin_39548968 5月前

    How do check mode and -Zno-codegen interact?

    I gave up on -Z no-codegen after seeing it produce an enormous number of linker errors on Windows. My guess is that it broke at some point in the past, and no one noticed.

    Also, what about using the crate type (bin vs lib) to distinguish things? Assuming we already recognize build dependencies because we use --target

    Does that work when you use --target, but specify whatever target would otherwise have been used? If not, is there some special Miri target we should use?

    点赞 评论 复制链接分享

相关推荐