Summary: | ASTERISK-27061: res_pjsip: Crash/segfault during T.38 reinvite / negotiation | ||
Reporter: | Michael Maier (micha) | Labels: | fax pjsip |
Date Opened: | 2017-06-18 10:30:05 | Date Closed: | |
Priority: | Major | Regression? | |
Status: | Open/New | Components: | Resources/res_pjsip Resources/res_pjsip_session |
Versions: | 13.16.0 | Frequency of Occurrence | Constant |
Related Issues: | |||
Environment: | Centos 6, 64 bit | Attachments: | ( 0) 2018-04-04T11-19-06-0600-brief.txt ( 1) 2018-04-04T11-19-06-0600-full.txt ( 2) 2018-04-04T11-19-06-0600-locks.txt ( 3) 2018-04-04T11-19-06-0600-thread1.txt ( 4) asterisk-sigseg.coredump ( 5) callflow-1.png ( 6) callflow-2.png ( 7) core.myfw-2017-06-20T18-35-52+0200-brief.txt ( 8) core.myfw-2017-06-20T18-35-52+0200-full.txt ( 9) core.myfw-2017-06-20T18-35-52+0200-locks.txt (10) core.myfw-2017-06-20T18-35-52+0200-thread1.txt (11) core.tar.gz (12) sigseg-on-reinvite |
Description: | sigseg happens during sending of a fax / t.38 the following way:
fax client -> asterisk -> telekom -> easybell -> asterisk -> fax server Fax server sends t.38 reinvite via asterisk to easybell. {noformat} Session Description Protocol Version (v): 0 Owner/Creator, Session Id (o): - 2447581897 4 IN IP4 46.17.15.23 Session Name (s): Asterisk Connection Information (c): IN IP4 46.17.15.23 Time Description, active time (t): 0 0 Media Description, name and address (m): image 4573 udptl t38 Media Attribute (a): T38FaxVersion:0 Media Attribute (a): T38MaxBitRate:14400 Media Attribute (a): T38FaxRateManagement:transferredTCF Media Attribute (a): T38FaxMaxDatagram:397 Media Attribute (a): T38FaxUdpEC:t38UDPRedundancy {noformat} This reinvite is received by asterisk via telekom: {noformat} Session Description Protocol Version (v): 0 Owner/Creator, Session Id (o): - 1811299599 2925027276 IN IP4 0.0.0.0 Session Name (s): - Time Description, active time (t): 0 0 Media Description, name and address (m): image 0 udptl t38 Media Attribute (a): sendrecv Media Attribute (a): T38FaxVersion:0 Media Attribute (a): T38MaxBitRate:14400 Media Attribute (a): T38FaxRateManagement:transferredTCF Media Attribute (a): T38FaxMaxDatagram:397 Media Attribute (a): T38FaxUdpEC:t38UDPRedundancy {noformat} And asterisk gives it to the fax client: {noformat} Session Description Protocol Version (v): 0 Owner/Creator, Session Id (o): - 1497774025 5 IN IP4 192.168.12.13 Session Name (s): Asterisk Connection Information (c): IN IP4 192.168.12.13 Time Description, active time (t): 0 0 Media Description, name and address (m): image 4284 udptl t38 Media Attribute (a): T38FaxVersion:0 Media Attribute (a): T38MaxBitRate:14400 Media Attribute (a): T38FaxRateManagement:transferredTCF Media Attribute (a): T38FaxMaxDatagram:393 Media Attribute (a): T38FaxUdpEC:t38UDPRedundancy {noformat} Completely ignoring, that telekom doesn't support it (port and ip addresses are set to 0). On completing the negotiation after 200 ok SDP and ACK from fax client, asterisk crashes. Stack trace and asterisk log is attached! | ||
Comments: | By: Asterisk Team (asteriskteam) 2017-06-18 10:30:07.198-0500 Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report. Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process]. By: Michael Maier (micha) 2017-06-18 10:33:14.123-0500 See attached file By: Michael Maier (micha) 2017-06-18 10:35:42.911-0500 Core dump By: George Joseph (gjoseph) 2017-06-20 10:19:36.382-0500 There's not enough info in the backtrace to debug. Please follow the steps located at https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines#AsteriskIssueGuidelines-Submittingthebugreport Specifically https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace By: Michael Maier (micha) 2017-06-20 12:10:47.788-0500 Hope this coredump provides better information. I missed DEBUG_THREAD. If this should be necessary, too, I may redo the action. By: Rusty Newton (rnewton) 2017-06-22 17:44:07.366-0500 If you used DONT_OPTIMIZE and BETTER_BACKTRACES you should be good. Please attach files directly to the issue, use the .txt format so we can easily view from browser and so on. You can read the guidelines for what we want generally: https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines You may want to include a pcap that correlates to a backtrace, as well as the (sanitized) configuration files you used when reproducing the issue. By: Michael Maier (micha) 2017-06-23 00:47:08.924-0500 If you don't want to have tar.gz files, please don't write at https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace that you would like to have them ... . Anyway, I provided them separately now. Following, I'm trying again to describe the problem. Fax-client sends a fax to a fax server. The fax-server sends a t.38 reinvite to the fax-client via asterisk, t.38 enabled provider (easybell), t.38 disabled provider (Telekom), asterisk, fax client. The interesting path starts up here: Asterisk gets the initial reinivate from fax server: {noformat} <--- Received SIP request (961 bytes) from UDP:192.168.13.27:5060 ---> INVITE sip:asterisk@192.168.17.45:5061 SIP/2.0 CSeq: 1 INVITE Via: SIP/2.0/UDP 192.168.13.27:5060;branch=z9hG4bK8e713751-a052-e711-86c0-20aa4beaa0d8;rport User-Agent: T38Modem/3.15.2 From: <sip:91@192.168.13.27>;tag=784a9850-a052-e711-86c0-20aa4beaa0d8 Call-ID: 00956b20-8c05-4291-a7bf-6de53ecb151b Supported: 100rel,replaces Organization: Frolov,Holtschneider,Davidson To: "11111111111" <sip:11111111111@192.168.17.45>;tag=8203967c-17d0-4702-a505-cd87d60e1e1f Contact: "root" <sip:91@192.168.13.27:5060> Allow: INVITE,ACK,OPTIONS,BYE,CANCEL,SUBSCRIBE,NOTIFY,REFER,MESSAGE,INFO,PING,PRACK Content-Length: 292 Content-Type: application/sdp Max-Forwards: 70 v=0 o=- 1497796215 2 IN IP4 192.168.13.27 s=T38Modem/3.15.2 c=IN IP4 192.168.13.27 t=0 0 m=image 10020 udptl t38 a=sendrecv a=T38FaxVersion:0 a=T38FaxRateManagement:transferredTCF a=T38FaxMaxBuffer:2000 a=T38FaxMaxDatagram:1400 a=T38FaxUdpEC:t38UDPRedundancy a=T38MaxBitRate:14400 {noformat} Asterisk sends this reinvite to easybell, supporting t.38 {noformat} <--- Transmitting SIP request (1026 bytes) to UDP:195.185.37.60:5060 ---> INVITE sip:2441B741-59468E4C0005AFB2-57B1D700@195.185.37.60;transport=udp SIP/2.0 Via: SIP/2.0/UDP 47.23.24.13:5061;rport;branch=z9hG4bKPj90b9bd66-aac4-42d2-a573-fc899bddd4df From: "Anonymous" <sip:+4933333333333@sip.easybell.de>;tag=8a945beb-6a79-4b48-ab32-dea8993cd1dd To: <sip:11111111111@sip.easybell.de>;tag=621ADAB6-59468E4C0005A1B0-00F57700 Contact: <sip:47.23.24.13:5061> Call-ID: fba5dd2a8960e150-Acc1802-B2b15@10.233.181.52_b2b-1 CSeq: 2634 INVITE Allow: OPTIONS, SUBSCRIBE, NOTIFY, PUBLISH, INVITE, ACK, BYE, CANCEL, UPDATE, PRACK, REGISTER, REFER, MESSAGE Supported: 100rel, timer, replaces, norefersub Session-Expires: 1800 Min-SE: 90 Max-Forwards: 70 User-Agent: FPBX-13.0.191.11(13.16.0) Content-Type: application/sdp Content-Length: 249 v=0 o=- 3615070031 4 IN IP4 47.23.24.13 s=Asterisk c=IN IP4 47.23.24.13 t=0 0 m=image 4403 udptl t38 a=T38FaxVersion:0 a=T38MaxBitRate:14400 a=T38FaxRateManagement:transferredTCF a=T38FaxMaxDatagram:397 a=T38FaxUdpEC:t38UDPRedundancy {noformat} Now, easybell puts this reinivite to Telekom, which does not support t.38. Telekom sends it to asterisk, "zeroing" IP and port in the SDP (I call it "zeroed" reinvite). Probably it would have been better (correct?) by telekom to send back a t.38 not acceptable at this point to easybell instead of passing a "zeroed" reinvite to the client side. Asterisk now gets the "zeroed" reinivite from Telekom: {noformat} <--- Received SIP request (1025 bytes) from UDP:217.0.18.36:5060 ---> INVITE sip:+4911111111111@47.23.24.13:5061 SIP/2.0 Max-Forwards: 59 Via: SIP/2.0/UDP 217.0.18.36:5060;branch=z9hG4bKg3Zqkv7iqilhwxg8ldenht4q1dnp7osfh To: <sip:+4911111111111@tel.t-online.de>;tag=9d0638f9-36ac-4029-9d2e-7cbedaf60acb From: <sip:222222222222@tel.t-online.de>;tag=h7g4Esbg_p65547t1497796172m158005c212284176s1_3637090326-573937032 Call-ID: 8c76f245-0d1f-41e7-884d-951fb2e511dd CSeq: 17718 INVITE Contact: <sip:sgc_c@217.0.18.36;transport=udp>;+g.3gpp.icsi-ref="urn%3Aurn-7%3A3gpp-service.ims.icsi.mmtel" Min-Se: 900 Session-Expires: 1800;refresher=uac Supported: timer Supported: norefersub Content-Type: application/sdp Content-Length: 230 X-Hmr: iis12!ois12! Allow: REGISTER, REFER, NOTIFY, SUBSCRIBE, PRACK, PUBLISH, UPDATE, INVITE, ACK, OPTIONS, CANCEL, BYE v=0 o=- 1958103834 3638078137 IN IP4 0.0.0.0 s=- t=0 0 m=image 0 udptl t38 a=sendrecv a=T38FaxVersion:0 a=T38MaxBitRate:14400 a=T38FaxRateManagement:transferredTCF a=T38FaxMaxDatagram:397 a=T38FaxUdpEC:t38UDPRedundancy {noformat} At this point, the problem is going on: Asterisk is sending a completely wrong reinivte to the fax client, because it is ignoring, that telekom doesn't support t.38 at all: {noformat} o=- 1958103834 3638078137 IN IP4 0.0.0.0 m=image 0 udptl t38 {noformat} At this point, it isn't a good idea to negotiate with the extension t.38. What should be done after wards? Asterisk will get t.38 packages from fax extension. But where should they sent to? It just won't work! Anyway - asterisk didn't recognize the problem and starts t.38 negotiation with the fax extension. {noformat} <--- Transmitting SIP request (976 bytes) to UDP:192.168.13.27:5060 ---> INVITE sip:91@192.168.13.27:5060 SIP/2.0 Via: SIP/2.0/UDP 192.168.17.45:5061;rport;branch=z9hG4bKPj24004eb2-8488-4af0-bf5b-c53c1cfaf9c5 From: "CID:+4911111111111" <sip:222222222222@192.168.17.45>;tag=6129c79c-5ff4-4170-9474-e22c3e70a93d To: "root" <sip:91@192.168.17.45>;tag=6cd7314f-a052-e711-86c0-20aa4beaa0d8 Contact: <sip:192.168.17.45:5061> Call-ID: dcda314f-a052-e711-86c0-20aa4beaa0d8@notebook2 CSeq: 5710 INVITE Allow: OPTIONS, SUBSCRIBE, NOTIFY, PUBLISH, INVITE, ACK, BYE, CANCEL, UPDATE, PRACK, REGISTER, REFER, MESSAGE Supported: 100rel, timer, replaces, norefersub Session-Expires: 1800 Min-SE: 90 Max-Forwards: 70 User-Agent: FPBX-13.0.191.11(13.16.0) Content-Type: application/sdp Content-Length: 247 v=0 o=- 1497796213 5 IN IP4 192.168.17.45 s=Asterisk c=IN IP4 192.168.17.45 t=0 0 m=image 4206 udptl t38 a=T38FaxVersion:0 a=T38MaxBitRate:14400 a=T38FaxRateManagement:transferredTCF a=T38FaxMaxDatagram:393 a=T38FaxUdpEC:t38UDPRedundancy {noformat} Asterisk now successfully performs a complete t.38 negotiation with the fax client. After the negotiation with the fax client has been successfully completed, asterisk resumes the processing of the negotiation towards Telekom. At this point, the crash happens, mostly because asterisk didn't realise at all at the very beginning receiving the zeroed package from Telekom, that it was completely wrong what it did towards the fax client. Reached this point now, asterisk can't match those two negotiations: Firstly, asterisk told the fax client to use t.38, but this successfull negotiation can't be sent to Telekom, because Telekom already disagreed to use t.38. At this moment, asterisk crashes. It can't build an answer to Telekom. To summarize it, or: How to reproduce it Given is an asterisk server, providing a t.38 enabled trunk and extension. The t.38 enabled trunk gets a "zeroed" t.38 reinvite for the t.38 enabled extension. The fax extension connected to the t.38 extension of asterisk is t.38 enabled, too. From my point of view, the key question boils down to: how does asterisk handle the initial incoming *zeroed* reinivite. I don't think that it is handled correctly at this time. Create a test scenario like this and you most probably will see the crash, too. By: Michael Maier (micha) 2017-06-23 00:49:16.564-0500 txt files extracted from core.tar.gz By: Michael Maier (micha) 2017-06-23 04:22:12.171-0500 The config: nothing special. As the machine is a multihomed machine (local and global IP), NAT isn't relevant at all. {noformat} [telekomPJSIP-111] type=endpoint transport=0.0.0.0-udp context=from-trunk disallow=all allow=alaw aors=telekomPJSIP-111 language=de outbound_auth=telekomPJSIP-111 from_domain=tel.t-online.de t38_udptl=yes t38_udptl_ec=redundancy fax_detect=no t38_udptl_nat=no or yes dtmf_mode=rfc4733 tos_audio=0xb8 direct_media=no [91] type=endpoint aors=91 auth=91-auth disallow=all allow=alaw,ulaw context=from-fax-extension callerid=device <91> dtmf_mode=rfc4733 aggregate_mwi=yes use_avpf=no ice_support=no media_use_received_transport=no trust_id_inbound=yes media_encryption=no timers=yes media_encryption_optimistic=no rtp_symmetric=yes rewrite_contact=yes force_rport=yes language=de t38_udptl=yes t38_udptl_ec=redundancy direct_media=no t38_udptl_maxdatagram=400 t38_udptl_nat=no or yes [easybellPJSIP] type=endpoint transport=0.0.0.0-udp context=from-trunk disallow=all allow=g722,alaw aors=easybellPJSIP language=de outbound_auth=easybellPJSIP from_domain=sip.easybell.de t38_udptl=yes t38_udptl_ec=redundancy fax_detect=no t38_udptl_nat=no or yes send_rpid=yes send_pai=yes dtmf_mode=auto {noformat} If you want to rebuild my binaries, I could provide you with the src.rpm-file for CentOS 6.9. But it exceeds 10 MB (31 MB). Therefore, if you want to have it, please drop me a note where to send it to. The relevant part of the spec-File: {noformat} menuselect/menuselect --disable-category MENUSELECT_CORE_SOUNDS --disable-category MENUSELECT_EXTRA_SOUNDS --disable-category MENUSELECT_MOH --enable-category MENUSELECT_ADDONS --enable res_pktccops --enable chan_mgcp --enable chan_motif --enable app_meetme --enable app_page --enable res_snmp --enable res_srtp --enable DONT_OPTIMIZE --enable BETTER_BACKTRACES --disable BUILD_NATIVE --enable res_statsd --enable res_chan_stats --enable res_endpoint_stats menuselect.makeopts {noformat} {noformat} ps -C asterisk u USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND asterisk 28280 2.6 4.5 2022472 91640 ? Sl Jun20 104:25 /usr/sbin/asterisk -f -U asterisk -G asterisk -g -v -v -vvvg -c {noformat} -> I don't know how to provide better stack traces. The pcap trace file wouldn't contain anything more as already described in this bug report. For me, it contains sensible data I'm not willing to provide - sorry. But if you have a piece of code, which *completely* anonymizes the trace ... . By: Michael Maier (micha) 2017-06-23 07:08:07.910-0500 Graphical overview of connections as seen by wireshark until the crash happens. 192.168.13.27 is fax client and fax server. 192.168.17.45 and 47.23.24.13 represents the multihomed asterisk server 217.0.18.36 is the sip telekom server 195.185.37.60 is the sip and rtp easybell server 217.0.4.230 is the telekom rtp server Hope this helps a little bit to more easily understand what's going on until the crash happens. By: Rusty Newton (rnewton) 2017-07-01 13:13:56.414-0500 [~micha] , thanks for the detailed description and analysis as well as debug information. bq. If you don't want to have tar.gz files, please don't write at https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace that you would like to have them ... . Sorry about the confusion there. I forgot momentarily that we changed that requirement a while back. I'm used to doing it a particular way for years - hard to change the habit. :) I'm opening the issue up as you've provided plenty of information for the issue to be investigated. By: Bryan Nelson (bnelsonfs) 2018-04-04 18:34:08.409-0500 We experienced a possibly related crash today, with a bit of a simpler setup. I am attempting to build a SIPP testing scenario now to reliably reproduce the crash. Incoming call to asterisk, asterisk answers, fax detect sends to fax extension, FaxReceive() begins, asterisk sends re-invite to T.38, far end responds with an OK that lacks an ip address in the c header of the SDP: {quote} v=0 o=Sonus_UAC 29312 16158 IN IP4 *redacted* s=SIP Media Capabilities *c=IN IP4* t=0 0 m=image 0 udptl t38 a=T38FaxVersion:0 a=T38MaxBitRate:14400 a=T38FaxRateManagement:transferredTCF a=T38FaxMaxDatagram:1400 a=T38FaxUdpEC:t38UDPRedundancy a=sendrecv {quote} Attached are the results of of core dump. By: Bryan Nelson (bnelsonfs) 2018-04-04 18:35:03.082-0500 ast_coredumper results |