2020-12-05 22:09

Debugging tips

With regular PyTorch tensors, I can insert a breakpoint somewhere inside the model and look at the values (I'm using PyCharm and VS Code, though debugging with VS Code is painfully slow). With XLA tensors, I can never see the tensor values, and something like logits.cpu()[0, 0] just results in a timeout...

Does anyone have good debugging tips for torch-xla? (Other than print statements; I know how to print.)



  • weixin_39618956, 5 months ago

    Timeout? It is likely compiling.

  • weixin_39609770, 5 months ago

    I guess the timeout comes from PyCharm/VS Code? Could you try increasing that value a bit, since the first compile is expected to take much longer than later runs? I'm working in a terminal, where logits.cpu() returns without a timeout. One trick I've found is that querying fewer elements takes much less time, e.g. logits[0].cpu(), so I usually avoid querying the whole tensor when debugging.
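
The slice-before-transfer tip can be sketched as a small helper. This is only an illustration: `peek` is a hypothetical name, not part of torch or torch-xla, and it assumes nothing beyond the tensor supporting indexing and `.cpu()`, as in the `logits[0].cpu()` call above.

```python
def peek(tensor, index):
    """Return only tensor[index], copied to the host.

    `peek` is a hypothetical helper for illustration, not a torch-xla API.
    """
    # Index first, so that only the selected slice is passed to .cpu()
    # instead of the whole tensor.
    small = tensor[index]
    return small.cpu()
```

From a pdb prompt, `p peek(logits, (0, 0))` would then stand in for `logits.cpu()[0, 0]`.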

  • weixin_39801356, 5 months ago

    FWIW, I look at XLA tensors pretty often in an SSH session inside a terminal using pdb, and it works reliably for me.
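
One way to get such a terminal pdb session is Python's built-in `breakpoint()`, which by default calls `pdb.set_trace()` (Python 3.7+). A minimal sketch, where `run_step` and `model_fn` are hypothetical names standing in for your own forward or training step:

```python
def run_step(model_fn, batch):
    """Run one forward step and pause afterwards for inspection."""
    logits = model_fn(batch)
    # Calls sys.breakpointhook(), which defaults to pdb.set_trace().
    # In a plain terminal, pdb evaluates only the expressions you type.
    breakpoint()
    return logits
```

Run the script from a plain terminal (for example over SSH) rather than through the IDE debugger, then type something like `p logits[0].cpu()` at the `(Pdb)` prompt.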

  • weixin_39611340, 5 months ago

    OK, I think it could actually be due to the variable inspectors in VS Code and PyCharm, which try to load all kinds of info about the tensors (pdb + terminal clearly doesn't have that problem). I'll try disabling all variable inspectors and report my findings. Thanks all!
