dtmjqyfz21793 2017-05-09 23:18

Debugging/running a Go program freezes a Windows computer

I am facing a weird issue and am struggling to find the cause. I have a Go program that iterates over a number of EC2 instances, fetches CPU utilization metrics for them, then iterates again to average the values and print a report. There are several loops and several calls out to external endpoints (Prometheus, Influx, CloudWatch, AWS APIs, etc.).

The issue is that when I debug the application in Visual Studio Code, it sometimes causes my entire computer to freeze. The freezes are random in the sense that I have not been able to associate any particular change or trigger with causing or removing the issue. By "freeze" I mean the machine locks up completely: no mouse, no keyboard, and nothing I can do other than force a power cycle.

I believe it originates from my Go code, for two reasons:
1. The freeze has occurred about 7 times now, and only when I run the code.
2. The freeze reproduced consistently when I ran the code through the debugger (after each restart).

However, stepping through the code that resulted in a freeze does not reproduce the freeze at all.

Finally, I ran the code outside of the Visual Studio Code environment (post build, as an exe). After several runs without incident, I ran into a modified form of the freeze. In this case, my computer did not freeze completely: the mouse pointer was still responsive, but everything else froze (no keyboard, no Ctrl-Alt-Del), and I still had to force a power cycle.

I had logging in my code, and I found a pattern when the freeze happens. My log output is a repeating pattern of 4 lines like this (where the value of i is the iteration count):

2017/05/09 17:31:28 Computing avg & normalized util for i=506, s=0, i-s=506, i-beg=506, len(avgs)=4049, len(ut.Values)=3278, pre time-point avg: 0.000000, pre overall avg: 0.111194, metricValue: 0.084000
2017/05/09 17:31:28 
Running avg for time-point raw avg:0.084000, normalized avg:5.376000
Overall running avg=> raw:0.111140, normalized:7.112963 2017/05/09

In the instance where the freeze occurred, the last log entry consisted of this unfinished snippet followed by a long stream of NUL characters (number of NULs truncated):

17:31:28 Computing avg & normalized util for i=507, s=0, i-s=507, i-beg=507, len(avgs)=4049, len(ut.Values)=3278, pre time-point avg: 0.000000, pre overall avg: 0.111140, metricValue: 0.104000 
2017/05/09 17:31:28
Running avg for time-point raw avg:0.104000, normalized avg:6.656000 Overall running avg=> raw:0.111126, normalNULNULNULNULNUL

In this particular case, it occurred on the 507th iteration. In a previous run, when the freeze occurred, the NULs were printed at a different iteration, and the code has previously processed up to 5000 iterations successfully.

Because the freeze locks me out of the machine, I am unable to debug the code conclusively. I am running this with Go 1.7.4 on Windows 7 Pro 64-bit, and my laptop is a fairly powerful machine with 16GB RAM and an Intel i7. I did not have many programs open when the freezes repeated (like today, when I had to power-cycle my machine 3 times), so resource usage during subsequent freezes would have been minimal.

Any help or pointers on how I can debug this issue would be greatly appreciated.

--Edit 05/10/2017--
I have had a few more freezes since posting this question. A few new observations:
1. Running as a binary (as opposed to debugging in Visual Studio Code) does produce complete freezes too.
2. Sometimes the freeze is instantaneous (everything seems to be moving along fine, and then suddenly everything freezes), while at other times it develops slowly (first the cursor in the command window freezes, then the tooltips freeze, then windows cannot be switched, the mouse freezes, and finally Task Manager freezes).
3. I am now able to reproduce the freeze: running certain workloads always causes it, although at different points in time and in the code. All of the workloads that produce a freeze have more iterations than the others. My logic processes a collection of containers, where each container is itself a collection of instances and each instance has a collection of metrics, and these workloads all had a larger number of top-level collections.

The above observations make it very likely that this is a resource-related issue. I was able to get a few pictures of Task Manager after the machine had frozen. However, Task Manager and Resource Manager do not indicate any unusual resource utilization.

[Image: Task Manager view when the machine had frozen]
[Image: Resource Manager & Task Manager view during another freeze]
