Summary: | ASTERISK-26291: res_pjsip_session: segfault on already disconnected session | ||
Reporter: | Alexei Gradinari (alexei gradinari) | Labels: | |
Date Opened: | 2016-08-11 16:47:08 | Date Closed: | 2017-03-03 06:20:24.000-0600 |
Priority: | Major | Regression? | |
Status: | Closed/Complete | Components: | Resources/res_pjsip_session |
Versions: | 13.10.0 | Frequency of Occurrence | |
Related Issues: | |||
Environment: | Attachments: | ( 0) bt_20160812.txt ( 1) bt_full_208160811.txt ( 2) pjproject_log.txt | |
Description: | On a heavily loaded system, incoming TCP/TLS calls can be disconnected by pjproject while Asterisk is still processing them and may still use the session's memory pools. If the session is already in the disconnected state, its memory pools have already been freed, so the access causes a segfault. | ||
Comments: | By: Asterisk Team (asteriskteam) 2016-08-11 16:47:09
Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report. Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Alexei Gradinari (alexei gradinari) 2016-08-11 16:47:59.017-0500
Full backtrace

By: Alexei Gradinari (alexei gradinari) 2016-08-11 16:49:21.578-0500
pjproject WARNING/ERROR log showing "Failed sending because of Broken pipe" before the segfault

By: Alexei Gradinari (alexei gradinari) 2016-08-12 15:23:28.234-0500
New segfault backtrace in handle_incoming_sdp

By: Joshua C. Colp (jcolp) 2016-08-15 05:15:04.869-0500
Per my comment on the review, I think we need a full Asterisk log and a full backtrace with all threads to understand how exactly the off-nominal situation happened and whether it's the appropriate fix or not.

By: Joshua C. Colp (jcolp) 2016-08-17 05:10:07.670-0500
Copy/pasting from Gerrit:
{quote}
I used SIPp to stress test asterisk using TLS. The scenario:
SIPp-sender: INVITE transport:TLS -> ASTERISK
ASTERISK: INVITE transport:TLS -> SIPp-receiver
SIPp-receiver: 200 OK with sdp -> ASTERISK
ASTERISK: 200 OK with sdp -> SIPp-sender
If SIPp-sender terminates the TCP connection, then pjproject calls on_tsx_state_changed with state PJSIP_EVENT_TRANSPORT_ERROR. I think session_inv_on_tsx_state_changed is run on the pjsip monitor thread; at the same time there may be a task in the queue of the session serializer. So when the taskprocessor executes the function new_invite, the session is already in the disconnected state. I see a difference between PJSIP_EVENT_TRANSPORT_ERROR and PJSIP_EVENT_TIMER in the function session_inv_on_tsx_state_changed.
{quote}

By: Joshua C. Colp (jcolp) 2016-08-17 05:11:26.839-0500
[~alexei gradinari] If you could attach what I mentioned it would be great, so that others can take a look and come to a complete solution. If not, someone else will have to lab it up like you have and see.

By: Rusty Newton (rnewton) 2016-08-24 09:51:58.825-0500
Opening this up since discussion is happening in Gerrit. |
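For readers following the thread, the ordering problem described in the report and in the Gerrit quote can be sketched in plain C. This is an illustrative, single-threaded model only, not the patch under review and not the res_pjsip_session code: every name prefixed with `sketch_` is hypothetical, the heap buffer merely stands in for the session's memory pools, and a real fix would also have to handle the two threads racing, which this model does not attempt.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the invite session state; only two states matter here. */
enum sketch_inv_state { SKETCH_INV_ACTIVE, SKETCH_INV_DISCONNECTED };

/* Hypothetical session: 'pool' stands in for the session's memory pools. */
struct sketch_session {
    enum sketch_inv_state state;
    char *pool;
    size_t pool_size;
};

/* Allocate a session with a live pool. Returns NULL on allocation failure. */
struct sketch_session *sketch_session_create(size_t pool_size)
{
    struct sketch_session *s = malloc(sizeof(*s));
    if (!s) {
        return NULL;
    }
    s->pool = malloc(pool_size);
    if (!s->pool) {
        free(s);
        return NULL;
    }
    s->pool_size = pool_size;
    s->state = SKETCH_INV_ACTIVE;
    return s;
}

/* Models the transport-error path: the session is marked disconnected
 * and its pool is released before any queued task gets to run. */
void sketch_on_transport_error(struct sketch_session *s)
{
    s->state = SKETCH_INV_DISCONNECTED;
    free(s->pool);
    s->pool = NULL;
    s->pool_size = 0;
}

/* Models a task that was queued on the session serializer earlier.
 * Returns 0 if the session was processed, -1 if it was skipped because
 * the session had already been disconnected. Without this check the
 * memset below would touch freed memory. */
int sketch_new_invite_task(struct sketch_session *s)
{
    if (s->state == SKETCH_INV_DISCONNECTED || s->pool == NULL) {
        return -1;
    }
    memset(s->pool, 0, s->pool_size); /* stand-in for work that uses the pool */
    return 0;
}

/* Release the session; safe to call after a transport error. */
void sketch_session_destroy(struct sketch_session *s)
{
    if (!s) {
        return;
    }
    free(s->pool);
    free(s);
}
```

Calling `sketch_new_invite_task()` on a fresh session returns 0; calling it again after `sketch_on_transport_error()` returns -1 instead of touching the released buffer. Whether a state check of this kind is the appropriate fix for the real code is exactly the open question in the comments above.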