Summary: | ASTERISK-24731: res_pjsip_session cannot be unloaded | ||
Reporter: | Corey Farrell (coreyfarrell) | Labels: | |
Date Opened: | 2015-01-28 08:17:46.000-0600 | Date Closed: | 2015-03-26 12:48:43 |
Priority: | Major | Regression? | |
Status: | Closed/Complete | Components: | Resources/res_pjsip_session |
Versions: | SVN 13.1.0 | Frequency of Occurrence | |
Related Issues: | |||
Environment: | Attachments: | ( 0) backtrace_15784.txt ( 1) chan_pjsip-frack.txt ( 2) chan_pjsip-ref-fixes.patch ( 3) chan_pjsip-ref-fixes-r2.patch | |
Description: | res_pjsip_session cannot be unloaded or shut down, causing huge numbers of leaks to be reported by REF_DEBUG or valgrind. This makes it impossible to do automated checks for memory leaks against chan_pjsip. All testsuite tests fail if REF_DEBUG is enabled and res_pjsip_session is loaded.
This is a follow-up to ASTERISK-24485. As with that bug, it is important for the module to clean itself up on graceful shutdown; allowing users to unload the module without a shutdown is less important. | ||
Comments: | By: Corey Farrell (coreyfarrell) 2015-03-12 08:41:56.761-0500
First attempt at a patch to allow all pjsip modules to load and unload with no reference leaks. Unfortunately it results in many segmentation faults with the testsuite.
By: Corey Farrell (coreyfarrell) 2015-03-12 08:47:42.218-0500
All backtraces so far look the same. {{pjsip_endpt_destroy(ast_pjsip_endpoint);}} segfaults during unload of res_pjsip. Also I noticed AO2 FRACKs from tests/channels/pjsip/ami/show_registrations_outbound. Not sure what to do from here.
By: Corey Farrell (coreyfarrell) 2015-03-12 10:09:33.533-0500
Just completed a run of tests/channels/pjsip with the patch. Out of 148 tests I got 67 total failures. 64 of those failures had reference leaks, and 27 had backtraces. I've just attached a second (different) backtrace caused by my patch.
By: Corey Farrell (coreyfarrell) 2015-03-12 20:03:47.654-0500
In case it will help anyone to know the sources, here's a list of the 27 backtraces I got.
Segfaults from {{unload_pjsip}}:
{noformat}
logs/channels/pjsip/basic_calls/two_parties/nominal/alice_initiated/bob_hangs_up/backtrace_9915.txt
logs/channels/pjsip/basic_calls/outgoing/nominal/auth/backtrace_7724.txt
logs/channels/pjsip/basic_calls/outgoing/nominal/echo/backtrace_14690.txt
logs/channels/pjsip/basic_calls/outgoing/nominal/nat/backtrace_10822.txt
logs/channels/pjsip/basic_calls/outgoing/off-nominal/bob_incompatible_codecs/backtrace_2804.txt
logs/channels/pjsip/endpoint_identify/backtrace_859.txt
logs/channels/pjsip/user_eq_phone/backtrace_1818.txt
logs/channels/pjsip/hold_inactive/backtrace_1655.txt
logs/channels/pjsip/call_pickup/backtrace_5239.txt
logs/channels/pjsip/transfers/blind_transfer/caller_refer_only/backtrace_13185.txt
logs/channels/pjsip/transfers/blind_transfer/caller_direct_media/backtrace_10342.txt
logs/channels/pjsip/transfers/blind_transfer/callee_direct_media/backtrace_15444.txt
logs/channels/pjsip/transfers/blind_transfer/callee_refer_only/backtrace_10715.txt
logs/channels/pjsip/accountcode/backtrace_3296.txt
logs/channels/pjsip/hold/backtrace_9193.txt
logs/channels/pjsip/sdp_offer_answer/attribute_passthrough/backtrace_11103.txt
logs/channels/pjsip/message/message_in_dialog/backtrace_12521.txt
logs/channels/pjsip/hold_ice/backtrace_1852.txt
logs/channels/pjsip/diversion/diversion_basic/backtrace_9554.txt
logs/channels/pjsip/diversion/diversion_request/backtrace_1883.txt
logs/channels/pjsip/diversion/diversion_caller_id/backtrace_7468.txt
logs/channels/pjsip/diversion/diversion_response/backtrace_9173.txt
{noformat}
Segfaults from {{pjsip_endpt_destroy(ast_pjsip_endpoint)}}:
{noformat}
logs/channels/pjsip/transfers/attended_transfer/nominal/callee_remote/backtrace_15608.txt
logs/channels/pjsip/transfers/attended_transfer/nominal/caller_local/backtrace_7937.txt
logs/channels/pjsip/transfers/attended_transfer/nominal/callee_local/backtrace_10844.txt
logs/channels/pjsip/transfers/blind_transfer/callee_with_hold/backtrace_15381.txt
logs/channels/pjsip/refer_send_to_vm/backtrace_9935.txt
{noformat}
By: Matt Jordan (mjordan) 2015-03-12 20:07:28.967-0500
You're getting crashes due to calling PJSIP functions from a non-PJSIP registered thread:
{noformat}
#0 0x00007fd5ac56bcc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
#0 0x00007fd5ac56bcc9 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
        resultvar = 0
        pid = 30776
        selftid = 30876
#1 0x00007fd5ac56f0d8 in __GI_abort () at abort.c:89
        save_stage = 2
        act = {__sigaction_handler = {sa_handler = 0x7fff58f2f2db, sa_sigaction = 0x7fff58f2f2db}, sa_mask = {__val = {140555697465532, 140555312626520, 692, 4294967295, 140555696104675, 4294967296, 140555736862448, 38654705664, 0, 3519, 0, 0, 0, 21474836480, 140555738025984, 140555697480656}}, sa_flags = -1787101968, sa_restorer = 0x7fd5957aff01 <__PRETTY_FUNCTION__.5427>}
        sigs = {__val = {32, 0 <repeats 15 times>}}
#2 0x00007fd5ac564b86 in __assert_fail_base (fmt=0x7fd5ac6b63d0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x7fd5957afcf0 "!\"Calling pjlib from unknown/external thread. You must \" \"register external threads with pj_thread_register() \" \"before calling any pjlib functions.\"", file=file@entry=0x7fd5957afb58 "../src/pj/os_core_unix.c", line=line@entry=692, function=function@entry=0x7fd5957aff01 <__PRETTY_FUNCTION__.5427> "pj_thread_this") at assert.c:92
        str = 0x7fd5a400da50 "\200", <incomplete sequence \333>
        total = 4096
#3 0x00007fd5ac564c32 in __GI___assert_fail (assertion=0x7fd5957afcf0 "!\"Calling pjlib from unknown/external thread. You must \" \"register external threads with pj_thread_register() \" \"before calling any pjlib functions.\"", file=0x7fd5957afb58 "../src/pj/os_core_unix.c", line=692, function=0x7fd5957aff01 <__PRETTY_FUNCTION__.5427> "pj_thread_this") at assert.c:101
No locals.
#4 0x00007fd59579758c in pj_thread_this () from /usr/lib/libpj.so.2
No symbol table info available.
#5 0x00007fd5957a115a in pj_log () from /usr/lib/libpj.so.2
No symbol table info available.
#6 0x00007fd5957a169b in pj_log_4 () from /usr/lib/libpj.so.2
No symbol table info available.
#7 0x00007fd595c40da5 in unload_module () from /usr/lib/libpjsip.so.2
No symbol table info available.
#8 0x00007fd595c40c3c in pjsip_endpt_unregister_module () from /usr/lib/libpjsip.so.2
No symbol table info available.
{noformat}
You'll need to marshal the call to {{pjsip_endpt_unregister_module}} onto a PJSIP thread, synchronize on it completing, then continue the unload process.
By: Corey Farrell (coreyfarrell) 2015-03-12 20:39:25.087-0500
So this PJSIP thread issue affected 5 of the 27 backtraces. I just moved most of {{res_pjsip.c:module_unload}} to {{unload_pjsip}}, which resolved the {{pjsip_endpt_unregister_module}} crash. Unfortunately 4 of the 5 went on to crash at {{pjsip_endpt_destroy(ast_pjsip_endpoint)}}.
By: Corey Farrell (coreyfarrell) 2015-03-15 00:45:27.010-0500
Revision 2 of the patch. Contains fixes to res_pjsip_outbound_registration. Each change seems to fix a FRACK, but about every other run of tests/channels/pjsip/ami/show_registrations_outbound still has a FRACK. There is still an extra or missing ao2_ref somewhere, or something is going out of order with a task processor. All other tests now succeed, though some still have leaks.
By: Corey Farrell (coreyfarrell) 2015-03-15 00:54:56.336-0500
Attached are the refs entries for an object that FRACK'ed (unref after free) and the backtrace. I added the last ref line (error) using information in the backtrace. |
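Editorial illustration of the "marshal, synchronize, continue" advice in the 2015-03-12 20:07 comment. This is a minimal, generic pthread sketch, not the code from either attached patch. Every name in it ({{sync_task}}, {{run_task_synchronous}}, {{unregister_module_task}}) is hypothetical, and the task body only sets a flag where a real unload would call {{pjsip_endpt_unregister_module}}. It also spawns a fresh thread for simplicity, whereas in Asterisk the work would need to run on a thread already registered with pjlib. As far as we know, res_pjsip offers {{ast_sip_push_task_synchronous}} for that purpose, but the sketch does not depend on it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical task record shared between the caller and the worker thread. */
struct sync_task {
    int (*fn)(void *data);  /* work that must run on the worker thread */
    void *data;             /* argument passed to fn */
    int result;             /* return value of fn, valid once done != 0 */
    int done;               /* set to 1 by the worker when fn has returned */
    pthread_mutex_t lock;
    pthread_cond_t cond;
};

/* Worker side: run the task, then publish the result and wake the caller. */
static void *worker_main(void *arg)
{
    struct sync_task *task = arg;
    int res = task->fn(task->data);

    pthread_mutex_lock(&task->lock);
    task->result = res;
    task->done = 1;
    pthread_cond_signal(&task->cond);
    pthread_mutex_unlock(&task->lock);
    return NULL;
}

/*
 * Caller side: hand fn(data) to another thread and block until it completes.
 * Returns fn's result, or -1 if the worker thread could not be started.
 */
static int run_task_synchronous(int (*fn)(void *), void *data)
{
    struct sync_task task = { .fn = fn, .data = data };
    pthread_t tid;
    int res;

    pthread_mutex_init(&task.lock, NULL);
    pthread_cond_init(&task.cond, NULL);

    if (pthread_create(&tid, NULL, worker_main, &task) != 0) {
        pthread_cond_destroy(&task.cond);
        pthread_mutex_destroy(&task.lock);
        return -1;
    }

    /* Synchronize on completion. With a long-lived worker there is no
     * thread to join, so the condition variable is what the caller waits on. */
    pthread_mutex_lock(&task.lock);
    while (!task.done) {
        pthread_cond_wait(&task.cond, &task.lock);
    }
    res = task.result;
    pthread_mutex_unlock(&task.lock);

    /* Only needed here because this sketch created the thread itself. */
    pthread_join(tid, NULL);
    pthread_cond_destroy(&task.cond);
    pthread_mutex_destroy(&task.lock);
    return res;
}

/* Hypothetical stand-in for the unregister step; it only records that it ran. */
static int unregister_module_task(void *data)
{
    int *unregistered = data;
    *unregistered = 1;
    return 0;
}
```

An unload routine following this pattern would call {{run_task_synchronous(unregister_module_task, &flag)}}, check the return value, and only then continue tearing down the remaining module state.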