weixin_39949607
2020-12-01 22:39

isis crash


(gdb) bt
#0 0x00007f8a5c930067 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f8a5c931448 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007f8a5c96e1b4 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3 0x00007f8a5c97398e in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x00007f8a5c974696 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#5 0x000055dda631e615 in lsp_clear_data (lsp=0x55dda73daaa0) at isis_lsp.c:123
#6 0x000055dda631e731 in lsp_destroy (lsp=0x55dda73daaa0) at isis_lsp.c:154
#7 0x000055dda6322ea9 in lsp_tick (thread=) at isis_lsp.c:2670
#8 0x00007f8a5d74539d in thread_call (thread=0x7ffde5c4a0c8) at thread.c:1462
#9 0x000055dda631d070 in main (argc=4, argv=, envp=) at isis_main.c:390
(gdb)

[6:35]


2016/12/14 23:27:32.188829 ISIS: lan hello on non broadcast circuit
2016/12/14 23:27:32.189054 ISIS: %ADJCHANGE: Adjacency to 1921.6810.0009 (swp1) changed from Unknown to Initializing, unspecified
2016/12/14 23:27:32.189064 ISIS: %ADJCHANGE: Adjacency to 1921.6810.0009 (swp1) changed from Initializing to Up, unspecified
2016/12/14 23:27:37.186777 ISIS: ISIS-Upd (FOO): LSP 0000.0000.0000.00-00 seq 0x00000001 with confused checksum received.
2016/12/14 23:27:37.186876 ISIS: ISIS-Spf: TENT is empty SPF-root:r10
2016/12/14 23:27:37.247257 ISIS: ISIS-Upd (FOO): LSP 1921.6810.0009.00-00 invalid LSP is type 0
2016/12/14 23:27:49.675359 ISIS: ISIS-Spf: TENT is empty SPF-root:r10
2016/12/14 23:27:50.250427 ISIS: ISIS-Spf: TENT is empty SPF-root:r10
2016/12/14 23:27:51.189042 ISIS: ISIS-Spf: TENT is empty SPF-root:r10
2016/12/14 23:28:02.189823 ISIS: ISIS-Spf: TENT is empty SPF-root:r10
2016/12/14 23:28:03.252071 ISIS: ISIS-Spf: TENT is empty SPF-root:r10
2016/12/14 23:28:38.532282 ISIS: ISIS-Upd (FOO): L1 LSP 0000.0000.0000.00-00 seq 0x00000001 aged out
2016/12/14 23:30:39.319943 ZEBRA: client 12 disconnected. 9 isis routes removed from the rib
2016/12/14 23:31:09.433950 ZEBRA: Terminating on signal
2016/12/14 23:31:09.433991 ZEBRA: IRDP: Received shutdown notification.

[6:35]


 r6 ---- r9 --- r10
 |\      |       |
 | \     |       |
 |  \    r8 --- r11
 |   r7
 r5  |
 | \ |
 |  r3 --- r2
 | /        |
 r4        r1

[6:36]
In the above topology we are seeing isis crashes on r11, r4, and r7.

[6:36]
config on r10:

[6:37]

!
interface lo
ip router isis FOO
isis circuit-type level-1
isis passive
!
interface swp1
ip router isis FOO
isis circuit-type level-1
isis network point-to-point
!
interface swp2
ip router isis FOO
isis circuit-type level-1
isis network point-to-point
!
router isis FOO
net 49.0003.1921.6810.0010.00
metric-style wide
is-type level-1
log-adjacency-changes
!

[6:37]
Oh yeah, there is a crash on r10 as well.

This question originates from the open-source project FRRouting/frr.


7 answers

  • weixin_39952031, 5 months ago

    Do we have any pcap files for this?

  • weixin_39949607, 5 months ago

    output.swp2.pcap.gz output.swp1.pcap.gz

    The two pcap files should include multiple iterations of the crash. They were captured on r11.

  • weixin_39949607, 5 months ago

    Valgrind caught it this time:

    ==7946== Invalid free() / delete / delete[] / realloc()
    ==7946==    at 0x4C29E90: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==7946==    by 0x119634: lsp_clear_data (isis_lsp.c:119)
    ==7946==    by 0x119750: lsp_destroy (isis_lsp.c:150)
    ==7946==    by 0x11DEC8: lsp_tick (isis_lsp.c:2639)
    ==7946==    by 0x4E611CB: thread_call (thread.c:1442)
    ==7946==    by 0x11808F: main (isis_main.c:389)
    ==7946==  Address 0x7d94fdc is 28 bytes inside a block of size 43 alloc'd
    ==7946==    at 0x4C28C20: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==7946==    by 0x4E7E9B8: qmalloc (memory.c:61)
    ==7946==    by 0x4E69901: stream_new (stream.c:105)
    ==7946==    by 0x4E69AB3: stream_dup (stream.c:149)
    ==7946==    by 0x11990B: lsp_update_data (isis_lsp.c:485)
    ==7946==    by 0x11A579: lsp_update (isis_lsp.c:554)
    ==7946==    by 0x12628F: process_lsp (isis_pdu.c:1526)
    ==7946==    by 0x12679F: isis_handle_pdu (isis_pdu.c:2116)
    ==7946==    by 0x12679F: isis_receive (isis_pdu.c:2157)
    ==7946==    by 0x4E611CB: thread_call (thread.c:1442)
    ==7946==    by 0x11808F: main (isis_main.c:389)
    ==7946==
    ==7946== Invalid free() / delete / delete[] / realloc()
    ==7946==    at 0x4C29E90: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==7946==    by 0x119657: lsp_clear_data (isis_lsp.c:121)
    ==7946==    by 0x119750: lsp_destroy (isis_lsp.c:150)
    ==7946==    by 0x11DEC8: lsp_tick (isis_lsp.c:2639)
    ==7946==    by 0x4E611CB: thread_call (thread.c:1442)
    ==7946==    by 0x11808F: main (isis_main.c:389)
    ==7946==  Address 0x7d94fe7 is 39 bytes inside a block of size 43 alloc'd
    ==7946==    at 0x4C28C20: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==7946==    by 0x4E7E9B8: qmalloc (memory.c:61)
    ==7946==    by 0x4E69901: stream_new (stream.c:105)
    ==7946==    by 0x4E69AB3: stream_dup (stream.c:149)
    ==7946==    by 0x11990B: lsp_update_data (isis_lsp.c:485)
    ==7946==    by 0x11A579: lsp_update (isis_lsp.c:554)
    ==7946==    by 0x12628F: process_lsp (isis_pdu.c:1526)
    ==7946==    by 0x12679F: isis_handle_pdu (isis_pdu.c:2116)
    ==7946==    by 0x12679F: isis_receive (isis_pdu.c:2157)
    ==7946==    by 0x4E611CB: thread_call (thread.c:1442)
    ==7946==    by 0x11808F: main (isis_main.c:389)
    ==7946==

  • weixin_39949607, 5 months ago

    So in isis_tlv.c we have this:

    case DYNAMIC_HOSTNAME:
      *found |= TLVFLAG_DYN_HOSTNAME;
    #ifdef EXTREME_TLV_DEBUG
      zlog_debug ("ISIS-TLV (%s): Dynamic Hostname length %d",
              areatag, length);
    #endif /* EXTREME_TLV_DEBUG */
      if (*expected & TLVFLAG_DYN_HOSTNAME)
        {
          /* the length is also included in the pointed struct */
          tlvs->hostname = (struct hostname *) (pnt - 1);
        }
      pnt += length;
      break;

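    pnt points into the buffer produced by the stream_dup that lsp->pdu is set from, so tlvs->hostname ends up pointing into the middle of that PDU buffer rather than at a separate allocation. That matches the Valgrind report of a free() on an address inside a stream_dup block. But we have lsp->own_lsp set to true, so lsp_clear_data tries to free the pointer, hence the crash.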
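
The failure mode described above can be sketched in isolation. This is a minimal illustration, not the FRR code and not the actual fix: `struct demo_lsp`, `demo_lsp_init`, `demo_parse_hostname`, `demo_clear`, `points_into`, and the `hostname_owned` flag are all hypothetical names invented for the example. It assumes the parser stores a borrowed pointer into the duplicated PDU buffer, and shows one way a cleanup routine could avoid calling free() on such an interior pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an LSP; not the FRR struct. */
struct demo_lsp {
    uint8_t *pdu;          /* heap copy of the received PDU (like stream_dup) */
    size_t pdu_len;
    uint8_t *hostname_tlv; /* points at the TLV length byte, then the name */
    int hostname_owned;    /* 1 only if hostname_tlv was allocated separately */
};

/* Copy the wire bytes into a heap buffer owned by the LSP. */
static int demo_lsp_init(struct demo_lsp *lsp, const uint8_t *wire, size_t len)
{
    lsp->pdu = malloc(len);
    if (lsp->pdu == NULL)
        return -1;
    memcpy(lsp->pdu, wire, len);
    lsp->pdu_len = len;
    lsp->hostname_tlv = NULL;
    lsp->hostname_owned = 0;
    return 0;
}

/* Returns 1 if p lies inside the buffer [base, base + len). */
static int points_into(const void *p, const uint8_t *base, size_t len)
{
    uintptr_t a = (uintptr_t)p;
    uintptr_t b = (uintptr_t)base;
    return a >= b && a < b + len;
}

/* Like "(pnt - 1)" in the parser: borrow a pointer into the PDU buffer.
 * len_off is the offset of the TLV length byte within the PDU. */
static void demo_parse_hostname(struct demo_lsp *lsp, size_t len_off)
{
    assert(len_off < lsp->pdu_len);
    lsp->hostname_tlv = lsp->pdu + len_off;
    lsp->hostname_owned = 0; /* borrowed, must never be passed to free() */
}

/* Free only what was allocated on its own. Calling free() on a borrowed
 * interior pointer is the "Invalid free()" that Valgrind reports.
 * Returns how many TLV members were freed separately. */
static int demo_clear(struct demo_lsp *lsp)
{
    int freed = 0;
    if (lsp->hostname_tlv != NULL && lsp->hostname_owned) {
        free(lsp->hostname_tlv);
        freed++;
    }
    lsp->hostname_tlv = NULL;
    free(lsp->pdu); /* the borrowed pointer dies together with the buffer */
    lsp->pdu = NULL;
    lsp->pdu_len = 0;
    return freed;
}
```

With a buffer such as `{137, 3, 'r', '1', '0'}` (TLV type 137, length 3), calling `demo_parse_hostname(&lsp, 1)` leaves `hostname_tlv` pointing one byte into the PDU copy, and `demo_clear` then releases only the PDU buffer itself. An alternative design would copy the hostname into its own allocation at parse time, so that freeing it later is always valid.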

  • weixin_39949607, 5 months ago

    Commit 4fedc05c8895ae5400a13c17b7 prevents the issue from happening. But before closing, I would like Christian to comment on the further debugging I provided, to see whether he thinks his change has sufficiently closed the loopholes.

  • weixin_39949607, 5 months ago

    Actually I was wrong: I just rechecked my test setup and am seeing the same core files.

  • weixin_39949607, 5 months ago

    Resolved via 07f2fb1374661390f48bc1fd748b5a72ecd4f60b
