[Home]

Summary:ASTERISK-24556: Asterisk 13 core dumps when calling from pjsip extension to another pjsip extension
Reporter:Abhay Gupta (agupta)Labels:
Date Opened:2014-11-25 06:47:48.000-0600Date Closed:2014-12-08 10:44:09.000-0600
Priority:MajorRegression?
Status:Closed/CompleteComponents:Channels/chan_pjsip
Versions:13.1.0 Frequency of
Occurrence
Constant
Related
Issues:
duplicatesASTERISK-24448 Crash on multi-leg call with directmedia=yes using pjsip
duplicatesASTERISK-23864 Asterisk crashing on an outbound call over PJSIP channel tech
Environment:ubuntu 12.04 , pjproject 2.3 and asterisk 13.1.0Attachments:( 0) abhay.txt
( 1) asterisk_cli.docx
( 2) clidebug.txt
( 3) dump.txt
( 4) pjsip.conf
( 5) pjsip.txt
( 6) valgrind_new.txt
( 7) valgrind.txt
( 8) valgrind.txt
( 9) valgrind.txt
Description:When i make a call from pjsip extension 6002 to 6001 it always crashes with the same core dump .

#0  0x00007ffff4228595 in pj_pool_alloc (pool=0xb7260610b, size=13) at ../include/pj/pool_i.h:60
Comments:By: Abhay Gupta (agupta) 2014-11-25 06:48:59.406-0600

This is the backtrace of the problem

By: Abhay Gupta (agupta) 2014-11-25 07:04:22.770-0600

Invalid read of size 4
==28948==    at 0xBCF0E38: pjsip_msg_find_hdr (sip_msg.c:347)
==28948==    by 0xBCFFC54: pjsip_get_request_dest (sip_util.c:902)
==28948==    by 0xBD16F27: pjsip_tsx_create_uac2 (sip_transaction.c:1376)
==28948==    by 0xBD16C03: pjsip_tsx_create_uac (sip_transaction.c:1270)
==28948==    by 0xBD1C7D3: pjsip_dlg_send_request (sip_dialog.c:1214)
==28948==    by 0xBACF7F9: pjsip_inv_send_msg (sip_inv.c:3102)
==28948==    by 0xB8BEA50: send_delayed_request (res_pjsip_session.c:530)
==28948==    by 0xB8BEB8C: really_resend_reinvite (res_pjsip_session.c:1716)
==28948==    by 0x5E587E: ast_taskprocessor_execute (taskprocessor.c:769)
==28948==    by 0x5EDE64: execute_tasks (threadpool.c:1157)
==28948==    by 0x5E587E: ast_taskprocessor_execute (taskprocessor.c:769)
==28948==    by 0x5EC66F: threadpool_execute (threadpool.c:351)
==28948==  Address 0x74468a8 is 2,712 bytes inside a block of size 4,000 free'd
==28948==    at 0x4C2A82E: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==28948==    by 0xC3DEEEB: default_block_free (pool_policy_malloc.c:78)
==28948==    by 0xC3E6B4A: reset_pool (pool.c:254)
==28948==    by 0xC3E6BD2: pj_pool_reset (pool.c:276)
==28948==    by 0xC3E740C: cpool_release_pool (pool_caching.c:247)
==28948==    by 0xC3E66BC: pj_pool_release (pool_i.h:92)
==28948==    by 0xBCFD184: pjsip_endpt_release_pool (sip_endpoint.c:688)
==28948==    by 0xBD03A85: tx_data_destroy (sip_transport.c:484)
==28948==    by 0xBD03AB7: pjsip_tx_data_dec_ref (sip_transport.c:495)
==28948==    by 0xBD19A2B: tsx_on_state_proceeding_uac (sip_transaction.c:3015)
==28948==    by 0xBD18F1A: tsx_on_state_calling (sip_transaction.c:2493)
==28948==    by 0xBD17961: pjsip_tsx_recv_msg (sip_transaction.c:1765)
==28948==
==28948== Invalid read of size 8
==28948==    at 0xBCF0E4A: pjsip_msg_find_hdr (sip_msg.c:346)
==28948==    by 0xBCFFC54: pjsip_get_request_dest (sip_util.c:902)
==28948==    by 0xBD16F27: pjsip_tsx_create_uac2 (sip_transaction.c:1376)
==28948==    by 0xBD16C03: pjsip_tsx_create_uac (sip_transaction.c:1270)
==28948==    by 0xBD1C7D3: pjsip_dlg_send_request (sip_dialog.c:1214)
==28948==    by 0xBACF7F9: pjsip_inv_send_msg (sip_inv.c:3102)
==28948==    by 0xB8BEA50: send_delayed_request (res_pjsip_session.c:530)
==28948==    by 0xB8BEB8C: really_resend_reinvite (res_pjsip_session.c:1716)
==28948==    by 0x5E587E: ast_taskprocessor_execute (taskprocessor.c:769)
==28948==    by 0x5EDE64: execute_tasks (threadpool.c:1157)
==28948==    by 0x5E587E: ast_taskprocessor_execute (taskprocessor.c:769)
==28948==    by 0x5EC66F: threadpool_execute (threadpool.c:351)
==28948==  Address 0x74468a0 is 2,704 bytes inside a block of size 4,000 free'd
==28948==    at 0x4C2A82E: free (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==28948==    by 0xC3DEEEB: default_block_free (pool_policy_malloc.c:78)
==28948==    by 0xC3E6B4A: reset_pool (pool.c:254)
==28948==    by 0xC3E6BD2: pj_pool_reset (pool.c:276)
==28948==    by 0xC3E740C: cpool_release_pool (pool_caching.c:247)
==28948==    by 0xC3E66BC: pj_pool_release (pool_i.h:92)
==28948==    by 0xBCFD184: pjsip_endpt_release_pool (sip_endpoint.c:688)
==28948==    by 0xBD03A85: tx_data_destroy (sip_transport.c:484)
==28948==    by 0xBD03AB7: pjsip_tx_data_dec_ref (sip_transport.c:495)
==28948==    by 0xBD19A2B: tsx_on_state_proceeding_uac (sip_transaction.c:3015)
==28948==    by 0xBD18F1A: tsx_on_state_calling (sip_transaction.c:2493)
==28948==    by 0xBD17961: pjsip_tsx_recv_msg (sip_transaction.c:1765)
==28948==
==28948== Invalid read of size 8
==28948==    at 0xC3E6595: pj_pool_alloc (pool_i.h:60)
==28948==    by 0xC3F052F: pj_strdup (string_i.h:40)
==28948==    by 0xBCFFB80: pjsip_get_dest_info (sip_util.c:857)
==28948==    by 0xBCFFCA7: pjsip_get_request_dest (sip_util.c:910)
==28948==    by 0xBD16F27: pjsip_tsx_create_uac2 (sip_transaction.c:1376)
==28948==    by 0xBD16C03: pjsip_tsx_create_uac (sip_transaction.c:1270)
==28948==    by 0xBD1C7D3: pjsip_dlg_send_request (sip_dialog.c:1214)
==28948==    by 0xBACF7F9: pjsip_inv_send_msg (sip_inv.c:3102)
==28948==    by 0xB8BEA50: send_delayed_request (res_pjsip_session.c:530)
==28948==    by 0xB8BEB8C: really_resend_reinvite (res_pjsip_session.c:1716)
==28948==    by 0x5E587E: ast_taskprocessor_execute (taskprocessor.c:769)
==28948==    by 0x5EDE64: execute_tasks (threadpool.c:1157)
==28948==  Address 0xb72606163 is not stack'd, malloc'd or (recently) free'd
==28948==
==28948==
==28948== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==28948==  Access not within mapped region at address 0xB72606163
==28948==    at 0xC3E6595: pj_pool_alloc (pool_i.h:60)
==28948==    by 0xC3F052F: pj_strdup (string_i.h:40)
==28948==    by 0xBCFFB80: pjsip_get_dest_info (sip_util.c:857)
==28948==    by 0xBCFFCA7: pjsip_get_request_dest (sip_util.c:910)
==28948==    by 0xBD16F27: pjsip_tsx_create_uac2 (sip_transaction.c:1376)
==28948==    by 0xBD16C03: pjsip_tsx_create_uac (sip_transaction.c:1270)
==28948==    by 0xBD1C7D3: pjsip_dlg_send_request (sip_dialog.c:1214)
==28948==    by 0xBACF7F9: pjsip_inv_send_msg (sip_inv.c:3102)
==28948==    by 0xB8BEA50: send_delayed_request (res_pjsip_session.c:530)
==28948==    by 0xB8BEB8C: really_resend_reinvite (res_pjsip_session.c:1716)
==28948==    by 0x5E587E: ast_taskprocessor_execute (taskprocessor.c:769)
==28948==    by 0x5EDE64: execute_tasks (threadpool.c:1157)
==28948==  If you believe this happened as a result of a stack
==28948==  overflow in your program's main thread (unlikely but
==28948==  possible), you can try to increase the size of the
==28948==  main thread stack using the --main-stacksize= flag.
==28948==  The main thread stack size used in this run was 8388608.
==28948==
==28948== HEAP SUMMARY:
==28948==     in use at exit: 16,405,451 bytes in 121,390 blocks
==28948==   total heap usage: 1,774,100 allocs, 1,652,710 frees, 140,712,292 bytes allocated
==28948==
==28948== LEAK SUMMARY:
==28948==    definitely lost: 638 bytes in 30 blocks
==28948==    indirectly lost: 65 bytes in 2 blocks
==28948==      possibly lost: 4,658,204 bytes in 18,279 blocks
==28948==    still reachable: 11,746,544 bytes in 103,079 blocks
==28948==         suppressed: 0 bytes in 0 blocks
==28948== Rerun with --leak-check=full to see details of leaked memory
==28948==
==28948== For counts of detected and suppressed errors, rerun with: -v
==28948== Use --track-origins=yes to see where uninitialised values come from
==28948== ERROR SUMMARY: 136579 errors from 39 contexts (suppressed: 2 from 2)

By: Abhay Gupta (agupta) 2014-11-25 07:04:57.383-0600

valgrind.txt

By: Abhay Gupta (agupta) 2014-11-25 07:11:47.238-0600

This is the correct valgrind output file

By: Abhay Gupta (agupta) 2014-11-26 05:59:05.311-0600

Adding more files

By: Abhay Gupta (agupta) 2014-11-26 07:42:09.040-0600

I have simulated and found that the problem is caused by wrong codec negotiation . Attached text file has Asterisk CLI dump when only alaw was enabled in pjsip profile and also when both alaw and ulaw were applied and it created a segmentation fault .

By: Rusty Newton (rnewton) 2014-12-01 09:29:47.242-0600

[~agupta] can you attach a backtrace from the dump from the seg fault?

[Wiki instructions for getting a backtrace|https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace]

By: Matt Jordan (mjordan) 2014-12-01 09:37:05.766-0600

Please attach:
* Your {{pjsip.conf}} configuration for the endpoints
* A debug log with {{pjsip set logger on}} showing the message traffic leading up to the crash.

Thanks!

By: Joshua C. Colp (jcolp) 2014-12-01 09:39:35.138-0600

I've tracked this down, no further information is needed.

The underlying issue is that when rescheduling as a result of receiving a 491 response we don't increase the reference count of the message. Since the reference is held by the transaction that received the 491 it eventually releases the reference and the message is no longer valid.

By: Abhay Gupta (agupta) 2015-01-23 01:03:30.758-0600

The problem still exist , whenever the call is answered from one extension to another it core dumps . Attaching valgrind , pjsip.conf and asterisk cli output with pjsip logger on .

By: Abhay Gupta (agupta) 2015-01-23 01:03:49.929-0600

The problem still exist . The bug is not closed

By: Abhay Gupta (agupta) 2015-01-23 01:11:57.810-0600

This is gdb dump

By: Abhay Gupta (agupta) 2015-01-23 01:43:49.518-0600

Raising the issue as a separate bug as it related to min-se timers