2020-11-24 16:09

running inference on wasm32?

I hope this isn't too off topic.

Is there any way to run inference on wasm32?

The XY problem / use case is:

  1. define a single model in tch-rs
  2. do training on Cuda
  3. save model to file
  4. compile same tch-rs model to wasm32
  5. load model in tch-rs/wasm32, run inference in wasm32/browser



5 replies

  • weixin_39777967 2020-11-24 16:09

    I recognize the answer is probably 'no', but instead of 'no' prefer an answer of the form: "you need to overcome the following obstacles ... to make this work"

  • weixin_39640444 2020-11-24 16:09

    I'm not at all a wasm expert. That said, the main difficulty would probably be linking to the libtorch shared library, if that's your plan (recompiling the full PyTorch codebase to wasm is likely to be harder). I think the nimtorch project was able to do something like this using nim + pytorch, so overall it's likely to be possible. In any case, you'll probably want to search around a bit, as there may already be an existing proof of concept showing Rust compiled to wasm with a C shared library.

  • weixin_39777967 2020-11-24 16:09

    I have previously, via LLVM + Emscripten, gotten Rust crates with C dependencies to work on wasm32. (Modify build.rs to target wasm32; get it to recompile the C dependency from source; pray.)

    Thank you for the nimtorch link -- leveraging that, it should be straightforward to get 'aten' to compile to wasm32.

    For inference, I don't need autodiff or Adam -- all I really need is basic tensor ops.

    I don't know how tightly coupled tch-rs and torch-sys are. I also don't know how you managed to auto generate the bindings.

    Is there a way to leverage your work to generate a "mini-tch-rs" that only generates bindings to aten?

    I think I can split my code into:

      • crate foo-model: can do inference using either tch-rs or aten-rs
      • crate foo-train: imports foo-model, uses tch-rs for training
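    The split above could be sketched as a feature flag in the model crate's manifest (hypothetical layout -- "aten-rs" doesn't exist yet, and the crate names come from the post):

    ```toml
    # foo-model/Cargo.toml (hypothetical)
    [package]
    name = "foo-model"

    [features]
    default = ["tch-backend"]
    tch-backend = ["dep:tch"]
    # aten-backend = ["dep:aten-rs"]  # the imagined minimal aten bindings

    [dependencies]
    tch = { version = "0.3", optional = true }
    ```

    foo-train would then depend on foo-model with the tch-backend feature enabled, while the wasm32 build would select the aten backend.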

  • weixin_39640444 2020-11-24 16:09

    Splitting the aten vs non-aten parts of tch-rs is probably not super straightforward (but certainly doable). Maybe you can first try recompiling all the current dependencies for wasm32, if that's what it takes? Also, if your goal is inference, it would probably be nice to support TorchScript modules: https://pytorch.org/tutorials/advanced/cpp_export.html An advantage is that such modules pack both the architecture and the weights. You can see how these are used in examples/jit or tests/jit_tests.rs.
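    For reference, inference from a TorchScript module in tch-rs looks something like this (a sketch; it assumes a traced/scripted module has already been exported from Python to "model.pt", and the output shape depends on that model):

    ```rust
    use tch::{CModule, Device, Kind, Tensor};

    fn main() -> Result<(), Box<dyn std::error::Error>> {
        // Load a TorchScript module; the file packs architecture + weights.
        let module = CModule::load("model.pt")?;

        // forward_ts takes a slice of input tensors.
        let input = Tensor::randn(&[1, 784], (Kind::Float, Device::Cpu));
        let output = module.forward_ts(&[input])?;
        println!("output shape: {:?}", output.size());
        Ok(())
    }
    ```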

  • weixin_39777967 2020-11-24 16:09

    I believe aten -> wasm32 is doable because the nimtorch project has a line:

    conda create -n aten -c fragcolor aten={version} wasm

    I'm not confident about my ability to do "libtorch -> wasm32". In my efforts to reduce the 4-5 second CUDA load time, I tried compiling libtorch from source -- and only got it working by compiling all of PyTorch and copying over the libtorch portion.

    I don't know anything about TorchScript, but compiling anything involving JIT to wasm32 seems non-trivial.

    Thanks for the insights, I will think over this more.
