weixin_39647180 2020-11-22 04:48

numpy.ndarray vs. torch.Tensor

Issue by hbredin Friday Oct 30, 2020 at 17:10 GMT Originally opened as https://github.com/hbredin/pyannote-audio-v2/issues/55

I have noticed over the last few PRs that you often use numpy arrays instead of torch tensors. Is this because of a performance benefit, or are you just more familiar with the numpy API? If we want to boost performance at some stage, it might be a good idea to convert everything to torch tensors, including the random number operations, since the numpy ones are much slower than their torch counterparts.

Can you please clarify your thoughts on this in a dedicated issue?
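
For context, the claim about random number generation can be checked with a minimal timing sketch (the sample size and repetition count below are arbitrary choices, not from the thread; relative speed will depend on tensor size, device, and threading):

```python
import timeit

import numpy as np
import torch

N = 10_000    # arbitrary sample size
REPS = 1_000  # arbitrary repetition count

# Time drawing N uniform samples with each library's RNG
np_time = timeit.timeit(lambda: np.random.rand(N), number=REPS)
torch_time = timeit.timeit(lambda: torch.rand(N), number=REPS)

print(f"numpy.random.rand: {np_time:.4f}s")
print(f"torch.rand:        {torch_time:.4f}s")
```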

Originally posted in https://github.com/hbredin/pyannote-audio-v2/pull/37#issuecomment-719658266

This question comes from the open-source project: pyannote/pyannote-audio


5 replies

  • weixin_39647180 2020-11-22 04:48

    Comment by mogwai Saturday Oct 31, 2020 at 15:02 GMT

    At some point, the library will benefit massively from torchscripting as many areas as possible. I'm going to double-check this now, but I don't think we can JIT much of the code if it contains numpy calls:

    ```python
    import torch

    @torch.jit.script
    def test_method():
        t = torch.rand(1, 3, 40)
        t = t.numpy()  # numpy call breaks TorchScript compilation
        t += 1
        return t
    ```

    ```
    Traceback (most recent call last):
      File "play.py", line 10, in <module>
        def test_method():
      File "/home/harry/miniconda3/envs/v2/lib/python3.8/site-packages/torch/jit/_script.py", line 939, in script
        fn = torch._C._jit_script_compile(
    RuntimeError:
    Tried to access nonexistent attribute or method 'numpy' of type 'Tensor'.:
      File "play.py", line 12
    def test_method():
        t = torch.rand(1,3,40)
        t = t.numpy()
            ~~~~~~~ <--- HERE
    ```
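
    For comparison, a torch-only version of the same function scripts cleanly (a minimal sketch, not from the thread; the function name is hypothetical):

    ```python
    import torch

    @torch.jit.script
    def test_method_torch_only() -> torch.Tensor:
        # Staying inside the torch API keeps the function TorchScript-compatible
        t = torch.rand(1, 3, 40)
        t = t + 1.0
        return t
    ```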

    Another reason is that the torch.Tensor API is almost identical to the numpy one, so given that we have to use torch.Tensors for training anyway, we may as well stick with them the whole time?
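
    To illustrate how closely the two APIs track each other, here is a small sketch (not from the thread); the main naming difference in these calls is numpy's `axis=` vs torch's `dim=`:

    ```python
    import numpy as np
    import torch

    a = np.arange(6, dtype=np.float32).reshape(2, 3)
    t = torch.arange(6, dtype=torch.float32).reshape(2, 3)

    # Many common operations share names and semantics
    assert a.mean() == t.mean().item()
    assert a.sum(axis=0).tolist() == t.sum(dim=0).tolist()  # axis= vs dim=
    assert a.T.shape == tuple(t.T.shape)
    ```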

