[Home]

Summary:ASTERISK-26699: res_pjsip: Assertion when sending OPTIONS request to endpoint
Reporter:Ross Beer (rossbeer)Labels:
Date Opened:2017-01-06 04:11:14.000-0600Date Closed:2017-02-02 09:36:26.000-0600
Priority:MajorRegression?
Status:Closed/CompleteComponents:Resources/res_pjsip
Versions:13.13.1 Frequency of
Occurrence
Occasional
Related
Issues:
is duplicated byASTERISK-26727 FRACK!, Failed assertion bad magic number 0x0
is duplicated byASTERISK-26725 astobj2.c: FRACK!, Failed assertion bad magic number
Environment:Fedora 23, PJSIP Trunk 2.5.5 (Commit 5510) or Bundled PJSIPAttachments:( 0) assertions-1.txt
( 1) assertions-2.txt
( 2) backtrace_20170126_clean.txt
( 3) backtrace_20170131_clean.txt
( 4) hpbx06_backtrace_options_clean.txt
( 5) syslog.txt
Description:Every so often an asserting happens when sending OPTIONS pings to endpoints.

The follwoing snippit shows the asterisk log output:

{noformat}
[2017-01-06 08:44:47] ERROR[12271]: res_pjsip.c:3613 endpt_send_request: Error                                                                                                                                                                120001 'Operation not permitted' sending OPTIONS request to endpoint offnetVo                                                                                                                                                               xboneIA02
[2017-01-06 08:44:47] ERROR[12271]: astobj2.c:131 INTERNAL_OBJ: FRACK!, Failed                                                                                                                                                                assertion bad magic number 0x0 for object 0x7f2f48fbe190 (0)
Got 19 backtrace records
#0: [0x606b37] main/utils.c:2471 __ast_assert_failed() (0x606aaa+8D)
#1: [0x45dea2] main/astobj2.c:132 INTERNAL_OBJ()
#2: [0x45defd] main/astobj2.c:152 __ao2_lock() (0x45ded2+2B)
#3: [0x7f2f20319189] res/res_pjsip.c:3622 endpt_send_request()
#4: [0x7f2f2031b838] res/res_pjsip.c:3742 ast_sip_send_out_of_dialog_request()                                                                                                                                                                (0x7f2f2031b6d0+168)
#5: [0x7f2f2031eb5a] res_pjsip/pjsip_options.c:444 qualify_contact()
{noformat}
Comments:By: Asterisk Team (asteriskteam) 2017-01-06 04:11:16.142-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Rusty Newton (rnewton) 2017-01-06 20:00:00.562-0600

If we can get some more of the debug log including the SIP trace of the OPTIONS that would help us out. Thanks!

By: Ross Beer (rossbeer) 2017-01-09 05:21:56.154-0600

Unfortunately, I can't reproduce on the fly, the issue happens randomly so I can't keep a SIP trace running due to the vast amount of traffic

By: Rusty Newton (rnewton) 2017-01-10 14:01:38.817-0600

Okay, opening this one up as triage from [~jcolp] determined we may be able to investigate with what we have here. Obviously any further data, debug or SIP trace would be helpful if it turns out you can get it. The more the better, the faster a resolution will be found.

Thanks!

By: Ross Beer (rossbeer) 2017-01-13 10:15:00.452-0600

Just had another crash, the following log was shown on the CLI:

{noformat}
[2017-01-12 15:39:28] ERROR[182933]: res_pjsip.c:3614 endpt_send_request: Error 120001 'Operation not permitted' sending OPTIONS request to endpoint
[2017-01-12 15:39:28] ERROR[182933]: astobj2.c:131 INTERNAL_OBJ: FRACK!, Failed assertion bad magic number 0x0 for object 0x7f0ac407fc00 (0)
Got 19 backtrace records
#0: [0x606dba] main/utils.c:2471 __ast_assert_failed() (0x606d2d+8D)
#1: [0x45e0be] main/astobj2.c:132 INTERNAL_OBJ()
#2: [0x45e119] main/astobj2.c:152 __ao2_lock() (0x45e0ee+2B)
#3: [0x7f0af4ea7209] res/res_pjsip.c:3623 endpt_send_request()
{noformat}

By: Ross Beer (rossbeer) 2017-01-13 10:39:08.209-0600


This issue is happening across multiple asterisk instances.

Back trace shows:

{noformat}
Program terminated with signal SIGSEGV, Segmentation fault.
#0  pj_atomic_dec_and_get (atomic_var=0x30) at ../src/pj/os_core_unix.c:962
962    pj_mutex_lock( atomic_var->mutex );
[Current thread is 1 (Thread 0x7f0a797e0700 (LWP 182933))]
#0  0x00007f0b1cab0279 in pj_atomic_dec_and_get (atomic_var=0x30) at ../src/pj/os_core_unix.c:962
       new_value = <optimized out>
#1  0x00007f0b1cab583d in pj_grp_lock_dec_ref (glock=0x7f0ac4753d00) at ../src/pj/lock.c:554
       cnt = <optimized out>
#2  0x00007f0b1cab583d in pj_grp_lock_dec_ref (glock=0x7f0ac4753d00) at ../src/pj/lock.c:631
#3  0x00007f0b1cabf790 in cancel_timer (ht=0x155e500, entry=0x7f0ac4000078, flags=flags@entry=6, id_val=id_val@entry=0) at ../src/pj/timer.c:588
       grp_lock = <optimized out>
#4  0x00007f0b1cabfa3c in pj_timer_heap_cancel_if_active (id_val=id_val@entry=0, flags=6, entry=entry@entry=0x7f0ac4000078, ht=<optimized out>) at ../src/pj/timer.c:606
#5  0x00007f0b1cabfa3c in pj_timer_heap_cancel_if_active (ht=<optimized out>, entry=entry@entry=0x7f0ac4000078, id_val=id_val@entry=0) at ../src/pj/timer.c:605
#6  0x00007f0af4ea7222 in endpt_send_request (endpoint=endpoint@entry=0x23fddc8, tdata=tdata@entry=0x7f0aac3d78e8, timeout=timeout@entry=3000, token=token@entry=0x7f0ac430c978, cb=0x7f0af4ea7290 <send_request_cb>) at res_pjsip.c:3623
       timers_cancelled = <optimized out>
       errmsg = "Operation not permitted\000hK2\002\000\000\000\000\210\206=\254\n\177\000\000@x=\254\n\177\000\000\004\000\000\000\005\000\000\000PK2\002\000\000\000\000\360\372}y\n\177\000\000r\352E\000\000\000\000"
       req_wrapper = 0x7f0ac407fc00
       ret_val = 120001
       endpt = 0x155e218
       selector = {type = PJSIP_TPSELECTOR_TRANSPORT, u = {transport = 0x2373548, listener = 0x2373548, ptr = 0x2373548}}
#7  0x00007f0af4ea98c8 in ast_sip_send_out_of_dialog_request (tdata=0x7f0aac3d78e8, endpoint=endpoint@entry=0x23fddc8, timeout=3000, token=token@entry=0x2324b68, callback=callback@entry=0x7f0af4eacec0 <qualify_contact_cb>) at res_pjsip.c:3743
       supplement = 0x0
       contact = 0x2324b68
       __PRETTY_FUNCTION__ = "ast_sip_send_out_of_dialog_request"
#8  0x00007f0af4eacbea in qualify_contact (endpoint=endpoint@entry=0x0, contact=contact@entry=0x2324b68) at res_pjsip/pjsip_options.c:444
       tdata = 0x7f0aac3d78e8
       endpoint_local = 0x23fddc8
       __PRETTY_FUNCTION__ = "qualify_contact"
#9  0x00007f0af4eace63 in qualify_contact_task (obj=0x2324b68) at res_pjsip/pjsip_options.c:519
       contact = 0x2324b68
       res = <optimized out>
#10 0x00000000005ee6fc in ast_taskprocessor_execute (tps=0x155cfb0) at taskprocessor.c:967
       local = {local_data = 0x7f0a797e0700, data = 0x5f5e7f <ast_threadstorage_set_ptr+60>}
       t = 0x7f0aa80126b0
       size = 22400944
       __PRETTY_FUNCTION__ = "ast_taskprocessor_execute"
#11 0x00000000005f82b1 in execute_tasks (data=0x155cfb0) at threadpool.c:1322
       tps = 0x155cfb0
#12 0x00000000005ee6fc in ast_taskprocessor_execute (tps=0x155c3c0) at taskprocessor.c:967
       local = {local_data = 0x7f0a797dfc80, data = 0x155aa90}
       t = 0x7f0aa8005f80
       size = 22391496
       __PRETTY_FUNCTION__ = "ast_taskprocessor_execute"
#13 0x00000000005f6566 in threadpool_execute (pool=0x155aae0) at threadpool.c:351
       __PRETTY_FUNCTION__ = "threadpool_execute"
#14 0x00000000005f7c1d in worker_active (worker=0x7f0ae40061d0) at threadpool.c:1105
       alive = 0
#15 0x00000000005f79d5 in worker_start (arg=0x7f0ae40061d0) at threadpool.c:1024
       worker = 0x7f0ae40061d0
       saved_state = (DEAD | unknown: 32520)
       __PRETTY_FUNCTION__ = "worker_start"
#16 0x0000000000603e98 in dummy_start (data=0x7f0ae40064d0) at utils.c:1235
       __cancel_buf = {__cancel_jmp_buf = {{__cancel_jmp_buf = {0, -2806034405047432239, 139686362798719, 139682964702976, 507904, 507904, -2806034405072598063, 2817569640175733713}, __mask_was_saved = 0}}, __pad = {0x7f0a797dfdf0, 0x0, 0x1, 0x7f0bf3c326e8 <__pthread_keys+1032>}}
       __cancel_routine = 0x4527db <ast_unregister_thread>
       __cancel_arg = 0x7f0a797e0700
       __not_first_call = 0
       ret = 0x7f0bf30118d8
       a = {start_routine = 0x5f794e <worker_start>, data = 0x7f0ae40061d0, name = 0x7f0ae40020e0 "worker_start         started at [ 1079] threadpool.c worker_thread_start()"}
#17 0x00007f0bf3a2061a in start_thread () at /lib64/libpthread.so.0
#18 0x00007f0bf2d5c5fd in clone () at /lib64/libc.so.6
{noformat}

By: Dmitriy Serov (Demon) 2017-01-18 13:57:56.551-0600

Installing asterisk 13.13.1 on ubuntu server with bundled pjproject.
Asterisk segfaults a lot of time (attachment syslog.txt)

syslog.txt - grep from syslog
assertions-1.txt, assertions-2.txt - cut from asterisk logs.

FRACK was after each "res_pjsip.c: Error 320047 'No answer record in the DNS response (PJLIB_UTIL_EDNSNOANSWERREC)' sending OPTIONS request" (why it's ERROR?)
but does not after:
res_pjsip.c: Error 171005 'Missing route set (for tel: URI) (PJSIP_ENOROUTESET)' sending OPTIONS request to endpoint

I tried compile and run without optimizations and better trace... * dosn't start at all. Hangs on startup. Systemd restarts it.

By: Ross Beer (rossbeer) 2017-01-26 10:23:59.179-0600

Just had another crash, using:

Asterisk GIT-13-13.12.2-250-gd30bef1
PJSUIP Trunk 2.5.5 (Commit 5510)

Please see attached backtrace.

By: Ross Beer (rossbeer) 2017-01-31 08:10:22.263-0600

This issue is still occurring, the version in use now is:

Asterisk GIT-13-13.12.2-311-g56e925f
PJSIP Bundled

Please see attached latest backtrace.

By: Friendly Automation (friendly-automation) 2017-02-02 09:36:27.036-0600

Change 4847 merged by Joshua Colp:
res_pjsip: Handle invocation of callback on outgoing request when error occurs.

[https://gerrit.asterisk.org/4847|https://gerrit.asterisk.org/4847]

By: Friendly Automation (friendly-automation) 2017-02-02 10:45:54.438-0600

Change 4848 merged by zuul:
res_pjsip: Handle invocation of callback on outgoing request when error occurs.

[https://gerrit.asterisk.org/4848|https://gerrit.asterisk.org/4848]

By: Friendly Automation (friendly-automation) 2017-02-02 10:50:53.733-0600

Change 4846 merged by Joshua Colp:
res_pjsip: Handle invocation of callback on outgoing request when error occurs.

[https://gerrit.asterisk.org/4846|https://gerrit.asterisk.org/4846]

By: Friendly Automation (friendly-automation) 2017-02-02 11:52:18.685-0600

Change 4868 merged by zuul:
res_pjsip: Handle invocation of callback on outgoing request when error occurs.

[https://gerrit.asterisk.org/4868|https://gerrit.asterisk.org/4868]