Description of problem:
After the peer's link group (lgr) is destroyed, the local host crashes with very low probability. The relevant stack trace is:

  [133.963677] Call Trace:
  [133.963685] smc_ism_unset_conn+0x24/0x60 [smc]
  [133.963702] smc_conn_kill+0x8b/0xf0 [smc]
  [133.963716] __smc_lgr_terminate.part.36+0x98/0x110 [smc]
  [133.963731] process_one_work+0x1a7/0x360
  [133.963741] ? create_worker+0x1a0/0x1a0
  [133.963749] worker_thread+0x30/0x390
  [133.963758] ? create_worker+0x1a0/0x1a0
  [133.963766] kthread+0x10a/0x120
  [133.963775] ? set_kthread_struct+0x50/0x50
  [133.963785] ret_from_fork+0x1f/0x40

When the peer's lgr is destroyed, the local host triggers smcr_link_down. Inside smcr_link_down, the flow started from smc_switch_conns runs asynchronously with respect to smcr_link_clear. If smcr_link_clear runs before the __smc_lgr_terminate call that smc_switch_conns schedules asynchronously, the lgr is freed in smcr_link_clear. __smc_lgr_terminate then crashes when it calls smc_conn_kill(conn, soft) on the freed lgr.

Version-Release number of selected component (if applicable):

How reproducible:
The probability of hitting this under normal conditions is low. It can be increased by adding a delay before the __smc_lgr_terminate call in smc_lgr_terminate_work and then destroying, on the peer side, the lgr associated with this node.

Steps to Reproduce:
1. On the local node, add a delay before the __smc_lgr_terminate call in smc_lgr_terminate_work (smc_core.c) to widen the race window.
2. On the peer side, where an SMC-R connection is already established, destroy the link group.

Actual results:
smcr_link_clear runs before the __smc_lgr_terminate call scheduled asynchronously from smc_switch_conns, so the lgr is freed in smcr_link_clear. __smc_lgr_terminate then crashes when it calls smc_conn_kill(conn, soft), with the same call trace as shown in the description.

Expected results:
The local host triggers smcr_link_down. The asynchronous flow started from smc_switch_conns destroys the lgr first, and only then does smcr_link_clear in smcr_link_down clean up the link.

Additional info:
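
The ordering requirement under Expected results can be illustrated with a small userspace model. This is only a sketch of one common pattern (the queued work holds its own reference so the object outlives an early link clear), not the actual kernel code or a confirmed fix. Every name below (lgr_model, lgr_hold, lgr_put, schedule_terminate, link_clear, terminate_work) is hypothetical and only loosely mirrors the kernel functions named in this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's link group object. */
struct lgr_model {
    int refcnt;      /* number of holders; the object is freed when it hits 0 */
    bool terminated; /* set by the deferred terminate work */
};

/* Demo-only flag recording that the model object has been released. */
static bool lgr_freed;

static struct lgr_model *lgr_alloc(void)
{
    struct lgr_model *lgr = calloc(1, sizeof(*lgr));

    if (lgr)
        lgr->refcnt = 1; /* initial reference, owned by the link */
    lgr_freed = false;
    return lgr;
}

static void lgr_hold(struct lgr_model *lgr)
{
    lgr->refcnt++;
}

static void lgr_put(struct lgr_model *lgr)
{
    if (--lgr->refcnt == 0) {
        free(lgr);
        lgr_freed = true;
    }
}

/* Models queueing the terminate work: the work takes its own reference. */
static struct lgr_model *schedule_terminate(struct lgr_model *lgr)
{
    lgr_hold(lgr);
    return lgr;
}

/* Models the link-clear path dropping the link's reference. */
static void link_clear(struct lgr_model *lgr)
{
    lgr_put(lgr);
}

/* Models the deferred work: it touches the object, then drops its reference. */
static void terminate_work(struct lgr_model *lgr)
{
    lgr->terminated = true; /* safe: the work still holds a reference */
    lgr_put(lgr);
}

/*
 * Replays the problematic ordering (link clear before the deferred work).
 * Returns 0 if the object stayed alive until the work finished,
 * 1 if it was released too early, 2 if it was never released,
 * and -1 on allocation failure.
 */
int demo_clear_before_terminate(void)
{
    struct lgr_model *lgr = lgr_alloc();
    struct lgr_model *queued;

    if (!lgr)
        return -1;
    queued = schedule_terminate(lgr);
    link_clear(lgr);
    if (lgr_freed)
        return 1; /* the work would now touch freed memory */
    terminate_work(queued);
    return lgr_freed ? 0 : 2;
}
```

In this model, the early link_clear call only drops one of two references, so the object is released by terminate_work after its last access. Without the lgr_hold in schedule_terminate, link_clear would free the object first, which corresponds to the crash reported above.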