douzhushen_9776 2018-04-09 08:01
浏览 58

了解去pprof结果

I have go code and I have written unittest for it. The test runs for 6.5 sec and it creates huge cpu load. I am running tests in docker container and its CPU usage reaches at 400-500%. I got cpuprofile and it shows following result:

(pprof) top
Showing nodes accounting for 4260ms, 69.49% of 6130ms total
Dropped 80 nodes (cum <= 30.65ms)
Showing top 10 nodes out of 79
      flat  flat%   sum%        cum   cum%
    1250ms 20.39% 20.39%     1250ms 20.39%  runtime._ExternalCode /usr/local/go/src/runtime/proc.go
     640ms 10.44% 30.83%      960ms 15.66%  runtime.scanobject /usr/local/go/src/runtime/mgcmark.go
     580ms  9.46% 40.29%      580ms  9.46%  runtime.memmove /usr/local/go/src/runtime/memmove_amd64.s
     420ms  6.85% 47.15%      730ms 11.91%  runtime.mapassign_faststr /usr/local/go/src/runtime/hashmap_fast.go
     310ms  5.06% 52.20%      310ms  5.06%  runtime.duffcopy /usr/local/go/src/runtime/duff_amd64.s
     260ms  4.24% 56.44%      260ms  4.24%  runtime.heapBitsForObject /usr/local/go/src/runtime/mbitmap.go
     220ms  3.59% 60.03%      220ms  3.59%  runtime.scanobject /usr/local/go/src/runtime/mbitmap.go
     200ms  3.26% 63.30%     1630ms 26.59%  runtime.mallocgc /usr/local/go/src/runtime/malloc.go
     200ms  3.26% 66.56%      200ms  3.26%  runtime.memclrNoHeapPointers /usr/local/go/src/runtime/memclr_amd64.s
     180ms  2.94% 69.49%     2250ms 36.70%  runtime.systemstack /usr/local/go/src/runtime/asm_amd64.s

I don't understand what is ExternalCode here. Also there is none of my actual code's function in this result. I understand some of my functions might be calling these built-in go stuff and could end up spending a lot of time there. But then it does not give me direction to work on improving performance in my code. Is there anything obvious from this result that I need to give attention to?

EDIT: I try to use list command in pprof interactive mode for the functions which were showing high cpu usage.

(pprof) list _ExternalCode
Total: 6.13s
ROUTINE ======================== runtime._ExternalCode in /usr/local/go/src/runtime/proc.go
     1.25s      1.25s (flat, cum) 20.39% of Total
         .          .   3231:   signalLock uint32
         .          .   3232:   hz         int32
         .          .   3233:}
         .          .   3234:
         .          .   3235:func _System()                    { _System() }
     1.25s      1.25s   3236:func _ExternalCode()              { _ExternalCode() }
         .          .   3237:func _LostExternalCode()          { _LostExternalCode() }
         .          .   3238:func _GC()                        { _GC() }
         .          .   3239:func _LostSIGPROFDuringAtomic64() { _LostSIGPROFDuringAtomic64() }
         .          .   3240:
         .          .   3241:// Counts SIGPROFs received while in atomic64 critical section, on mips{,le}
(pprof) 
(pprof) 
(pprof) 
(pprof) list scanobject
Total: 6.13s
ROUTINE ======================== runtime.scanobject in /usr/local/go/src/runtime/mbitmap.go
     220ms      220ms (flat, cum)  3.59% of Total
         .          .    479:
         .          .    480:// The caller can test morePointers and isPointer by &-ing with bitScan and bitPointer.
         .          .    481:// The result includes in its higher bits the bits for subsequent words
         .          .    482:// described by the same bitmap byte.
         .          .    483:func (h heapBits) bits() uint32 {
         .          .    484:   // The (shift & 31) eliminates a test and conditional branch
         .          .    485:   // from the generated code.
     200ms      200ms    486:   return uint32(*h.bitp) >> (h.shift & 31)
         .          .    487:}
         .          .    488:
         .          .    489:// morePointers returns true if this word and all remaining words in this object
         .          .    490:// are scalars.
         .          .    491:// h must not describe the second word of the object.
ROUTINE ======================== runtime.scanobject in /usr/local/go/src/runtime/mgcmark.go
     640ms      960ms (flat, cum) 15.66% of Total
         .          .   1155:           n = maxObletBytes
         .          .   1156:       }
         .          .   1157:   }
         .          .   1158:
         .          .   1159:   var i uintptr
     100ms      100ms   1160:   for i = 0; i < n; i += sys.PtrSize {
         .          .   1161:       // Find bits for this word.
      10ms       10ms   1162:       if i != 0 {
         .          .   1163:           // Avoid needless hbits.next() on last iteration.
      10ms       10ms   1164:           hbits = hbits.next()
         .          .   1165:       }
         .          .   1166:       // Load bits once. See CL 22712 and issue 16973 for discussion.
         .          .   1167:       bits := hbits.bits()
         .          .   1168:       // During checkmarking, 1-word objects store the checkmark
         .          .   1169:       // in the type bit for the one word. The only one-word objects
         .          .   1170:       // are pointers, or else they'd be merged with other non-pointer
         .          .   1171:       // data into larger allocations.
      80ms       80ms   1172:       if i != 1*sys.PtrSize && bits&bitScan == 0 {
         .          .   1173:           break // no more pointers in this object
         .          .   1174:       }
      10ms       10ms   1175:       if bits&bitPointer == 0 {
         .          .   1176:           continue // not a pointer
         .          .   1177:       }
         .          .   1178:
         .          .   1179:       // Work here is duplicated in scanblock and above.
         .          .   1180:       // If you make changes here, make changes there too.
      20ms       20ms   1181:       obj := *(*uintptr)(unsafe.Pointer(b + i))
         .          .   1182:
         .          .   1183:       // At this point we have extracted the next potential pointer.
         .          .   1184:       // Check if it points into heap and not back at the current object.
     340ms      340ms   1185:       if obj != 0 && arena_start <= obj && obj < arena_used && obj-b >= n {
         .          .   1186:           // Mark the object.
      60ms      300ms   1187:           if obj, hbits, span, objIndex := heapBitsForObject(obj, b, i); obj != 0 {
      10ms       90ms   1188:               greyobject(obj, b, i, hbits, span, gcw, objIndex)
         .          .   1189:           }
         .          .   1190:       }
         .          .   1191:   }
         .          .   1192:   gcw.bytesMarked += uint64(n)
         .          .   1193:   gcw.scanWork += int64(i)
  • 写回答

0条回答 默认 最新

    报告相同问题?

    悬赏问题

    • ¥15 八爪鱼爬数据为什么自己停了
    • ¥15 交替优化波束形成和ris反射角使保密速率最大化
    • ¥15 树莓派与pix飞控通信
    • ¥15 自动转发微信群信息到另外一个微信群
    • ¥15 outlook无法配置成功
    • ¥30 这是哪个作者做的宝宝起名网站
    • ¥60 版本过低apk如何修改可以兼容新的安卓系统
    • ¥25 由IPR导致的DRIVER_POWER_STATE_FAILURE蓝屏
    • ¥50 有数据,怎么建立模型求影响全要素生产率的因素
    • ¥50 有数据,怎么用matlab求全要素生产率