
Summary: ASTERISK-26506: [patch]res_pjsip_outbound_publish: Crash when publishing, in publisher_client_send at res_pjsip_outbound_publish.c
Reporter: Matt Krokosz (mkrokosz@vonage.com)
Labels:
Date Opened: 2016-10-26 12:28:26
Date Closed: 2016-11-01 18:33:17
Priority: Minor
Regression?
Status: Closed/Complete
Components: Resources/res_pjsip_outbound_publish
Versions: 14.0.2
Frequency of Occurrence: Frequent
Related Issues:
Environment: Linux 3.10.0-229.14.1.el7.x86_64 on AWS EC2 c4.xlarge
Attachments:
( 0) ASTERISK-26506.patch
( 1) backtrace.txt
( 2) extensions_test.conf
( 3) pjsip_devstate.conf
( 4) pjsip_presence.conf
( 5) pjsip.conf
Description: When multiple instances of Asterisk share device state via SIP PUBLISH using PJSIP, Asterisk core dumps. The crash is easier to reproduce under high-CPS (100+) testing but also occurs at low call rates.

The setup currently uses PJSIP for device state but chan_sip for calls.
Comments:

By: Asterisk Team (asteriskteam) 2016-10-26 12:28:27.510-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Matt Krokosz (mkrokosz@vonage.com) 2016-10-26 12:29:16.930-0500

Backtrace of core dump.

By: Joshua C. Colp (jcolp) 2016-10-26 12:35:53.911-0500

Thank you for taking the time to report this bug and helping to make Asterisk better. Unfortunately, we cannot work on this bug because your description did not include enough information. Please read over the Asterisk Issue Guidelines [1], which discuss the information necessary for your issue to be resolved and the format that information needs to be in. We would be grateful if you would then provide a more complete description of the problem. At a minimum, we need:

1. The specific steps or actions you took that caused you to encounter the problem.
2. The behavior you expected and the location of documentation that led you to that expectation.
3. The behavior you actually encountered.

To demonstrate the issue in detail, please include Asterisk log files generated per the instructions on the wiki [2]. If applicable, please ensure that protocol-level trace debugging is enabled, e.g., 'sip set debug on' if the issue involves chan_sip, and configuration information such as dialplan and channel configuration.

Thanks!

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines

[2] https://wiki.asterisk.org/wiki/display/AST/Collecting+Debug+Information



By: Matt Krokosz (mkrokosz@vonage.com) 2016-10-26 12:43:56.841-0500

PJSIP conf files.

By: Matt Krokosz (mkrokosz@vonage.com) 2016-10-26 12:50:05.276-0500

Dialplan

By: Matt Krokosz (mkrokosz@vonage.com) 2016-10-26 16:03:14.480-0500

Some more information which may help is that I am using realtime sip peers (chan_sip) to generate call traffic with rtautoclear set to 300.  This means the peer I am using for testing will get cleared and reloaded every 5 mins during the testing.  I will disable rtautoclear and see if that changes the behavior.

By: Matt Krokosz (mkrokosz@vonage.com) 2016-10-27 12:41:28.874-0500

Changing the behavior of realtime peer caching had no effect; the problem still occurs with a peer that is cached without expiration.

By: Rusty Newton (rnewton) 2016-10-28 07:47:33.595-0500

[~mkrokosz@vonage.com] can you attach a log captured during the reproduction as well?

Try to get a log with: WARNING, ERROR, NOTICE, VERBOSE, DEBUG. Make sure Verbose and Debug are both turned up to 5.
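
A minimal sketch of one way to capture such a log, assuming a stock logger.conf. The log file name "full" is an assumption, not something specified in this issue; any name works.

```ini
; logger.conf (sketch; the file name "full" is an assumption)
[logfiles]
; Write all five requested levels to a single file.
full => notice,warning,error,verbose,debug

; Then, from the Asterisk CLI, apply the change and raise both levels to 5:
;   logger reload
;   core set verbose 5
;   core set debug 5
```

Under high CPS this file grows quickly, so it may help to enable it only for the duration of the reproduction.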

Thanks!

By: Matt Krokosz (mkrokosz@vonage.com) 2016-10-28 07:52:21.432-0500

I'll work on getting logs; I just had some issues dealing with the volume of logs, as I am reproducing under high CPS. I do have a patch that appears to be working. I will attach it as well and can submit it via Gerrit, but I want to let the tests run for a few more hours.

By: Matt Krokosz (mkrokosz@vonage.com) 2016-10-28 07:53:29.619-0500

Potential patch for fix.

By: Matt Krokosz (mkrokosz@vonage.com) 2016-10-28 07:58:22.872-0500

Possible patch for fix.

By: Rusty Newton (rnewton) 2016-10-28 08:37:59.895-0500

Awesome!

By: Friendly Automation (friendly-automation) 2016-10-28 13:30:42.078-0500

Change 4221 had a related patch set uploaded by Matt Krokosz:
res_pjsip_outbound_publish: Fix crash when publishing device state.

[https://gerrit.asterisk.org/4221|https://gerrit.asterisk.org/4221]

By: Friendly Automation (friendly-automation) 2016-11-01 13:38:25.774-0500

Change 4269 had a related patch set uploaded by George Joseph:
res_pjsip_outbound_publish: Fix crash when publishing device state.

[https://gerrit.asterisk.org/4269|https://gerrit.asterisk.org/4269]

By: Friendly Automation (friendly-automation) 2016-11-01 18:33:18.589-0500

Change 4221 merged by Joshua Colp:
res_pjsip_outbound_publish: Fix crash when publishing device state.

[https://gerrit.asterisk.org/4221|https://gerrit.asterisk.org/4221]

By: Friendly Automation (friendly-automation) 2016-11-02 01:22:18.590-0500

Change 4269 merged by zuul:
res_pjsip_outbound_publish: Fix crash when publishing device state.

[https://gerrit.asterisk.org/4269|https://gerrit.asterisk.org/4269]