
Summary: ASTERISK-25744: res_pjsip: Segfaults in ssl3_write_bytes, pj_ssl_sock_send, tls_send_msg
Reporter: Dmitriy Serov (Demon)
Labels:
Date Opened: 2016-02-04 03:32:48.000-0600
Date Closed:
Priority: Major
Regression?
Status: Open/New
Components: Resources/res_pjsip
Versions: 13.7.0
Frequency of Occurrence:
Related Issues:
Environment: openssl-1.0.1q-x86_64 pjproject 2.4.5
Attachments:
( 0) 2016_02_03__17_00_07.backtrace-threads.txt
( 1) 2016_02_03__17_00_07.full.tail.txt
( 2) 2016_02_03__23_38_07.backtrace-threads.txt
( 3) 2016_02_03__23_38_07.full.tail.txt
( 4) 2016_02_03__23_52_07.backtrace-threads.txt
( 5) 2016_02_03__23_52_07.full.tail.txt
( 6) 2016_02_04__00_42_07.backtrace-threads.txt
( 7) 2016_02_04__00_42_07.full.tail.txt
( 8) 2016_02_04__10_36_07.backtrace-threads.txt
( 9) 2016_02_04__10_36_07.full.tail.txt
(10) 2016_02_10__00_04_07.backtrace-threads.txt
(11) 2016_02_10__00_04_07.full.tail.txt
(12) 2016_02_10__06_18_07.backtrace-threads.txt
(13) 2016_02_10__06_18_07.full.tail.txt
(14) 2016_02_13__02_54_07.backtrace-threads.txt
(15) 2016_02_13__02_54_07.full.tail.txt
(16) 2016_02_14__21_44_07.backtrace-threads.txt
(17) 2016_02_14__21_44_07.full.tail.txt
Description: Asterisk segfaults regularly at the same place in the program, 3-4 times per day.

Several logs attached.
Comments:

By: Asterisk Team (asteriskteam) 2016-02-04 03:32:50.416-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Dmitriy Serov (Demon) 2016-02-04 03:34:22.437-0600

Each fault is represented by two log files: a backtrace and the tail of the full log.

By: Joshua C. Colp (jcolp) 2016-02-04 06:43:34.580-0600

We're going to need more information in addition to this to try to narrow it down more.

How many TLS connections are active? Are there few, are there a lot? Are they connecting/disconnecting a lot? Is there a lot of traffic going on?

By: Dmitriy Serov (Demon) 2016-02-04 07:08:22.319-0600

cmd "pjsip show contacts like transport=TLS" / grep -c Contact
Result: 18.
This number changes as devices connect and disconnect.


By: Rusty Newton (rnewton) 2016-02-04 17:56:46.804-0600

Dmitriy, is this a new install, or did it start after an upgrade?

By: Dmitriy Serov (Demon) 2016-02-05 06:04:58.720-0600

This error occurred before, but much less frequently. It is hard to say for sure, though, because the server kept crashing for a different reason: ASTERISK-25439.
After that cause was eliminated, the server started crashing in TLS.
Most likely this error is a consequence or manifestation of some other problem.

I'm ready to add asserts in the right places to catch the problem earlier. For example, I don't understand how the backtraces can show an IP address like this, even though the length of the value is 12 bytes:

{noformat}
#7  0x00007fb2d40c5eab in pjsip_endpt_respond_stateless (endpt=0x1fb8a28, rdata=0x7fb2900c23c8, st_code=501, st_text=0x0, hdr_list=0x0, body=0x0) at ../src/pjsip/sip_util.c:1869
       status = 0
       res_addr = {transport = 0x7fb2902b9928, addr = {addr = {sa_family = 2}, ipv4 = {sin_family = 2, sin_port = 53779, sin_addr = {s_addr = 624595237}, sin_zero = "\000\000\000\000\000\000\000"}, ipv6 = {sin6_family = 2, sin6_port = 53779, sin6_flowinfo = 624595237, sin6_addr = {s6_addr = '\000' <repeats 15 times>, u6_addr32 = {0, 0, 0, 0}}, sin6_scope_id = 0}}, addr_len = 28, dst_host = {flag = 3, type = PJSIP_TRANSPORT_TLS, addr = {host = {ptr = 0x7fb28ae31708 "37.145.58.3710)", slen = 12}, port = 5074}}}
       tdata = 0x7fb28ae30bb8
{noformat}

{noformat}
#11 0x00007f3d4a0baeab in pjsip_endpt_respond_stateless (endpt=0x1fb7d38, rdata=0x7f3ce442c158, st_code=501, st_text=0x0, hdr_list=0x0, body=0x0) at ../src/pjsip/sip_util.c:1869
       status = 0
       res_addr = {transport = 0x7f3ce41cdc28, addr = {addr = {sa_family = 2}, ipv4 = {sin_family = 2, sin_port = 53779, sin_addr = {s_addr = 624595237}, sin_zero = "\000\000\000\000\000\000\000"}, ipv6 = {sin6_family = 2, sin6_port = 53779, sin6_flowinfo = 624595237, sin6_addr = {s6_addr = '\000' <repeats 15 times>, u6_addr32 = {0, 0, 0, 0}}, sin6_scope_id = 0}}, addr_len = 28, dst_host = {flag = 3, type = PJSIP_TRANSPORT_TLS, addr = {host = {ptr = 0x7f3ce4a4ced8 "37.145.58.37", slen = 12}, port = 5074}}}
       tdata = 0x7f3ce4a4c388
{noformat}
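The raw socket address fields in these frames can be cross-checked against the `dst_host` values. The sketch below assumes a little-endian host (the environment is x86_64), where gdb prints `sin_port` and `s_addr` as they sit in memory in network byte order. The helper names `swap16` and `format_s_addr` are made up for this illustration and are not part of pjproject or Asterisk.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: swap the two bytes of a 16-bit value.
 * On a little-endian host this is what ntohs() does to sin_port. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v >> 8) | (v << 8));
}

/* Hypothetical helper: format a raw s_addr, as gdb prints it on a
 * little-endian host, into dotted-quad notation. The lowest byte of
 * the integer is the first octet of the address. */
static void format_s_addr(uint32_t s_addr, char *buf, size_t len)
{
    snprintf(buf, len, "%u.%u.%u.%u",
             (unsigned)(s_addr & 0xffu),
             (unsigned)((s_addr >> 8) & 0xffu),
             (unsigned)((s_addr >> 16) & 0xffu),
             (unsigned)((s_addr >> 24) & 0xffu));
}
```

Under that assumption, `format_s_addr(624595237u, ...)` yields `37.145.58.37` and `swap16(53779)` yields `5074`, so `res_addr.addr` and `dst_host` agree with each other. The same holds for the excerpt with `port = 5061`, where `sin_port` is `50451`.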

{noformat}
#8  0x00007fca268f5eab in pjsip_endpt_respond_stateless (endpt=0x1fb7df8, rdata=0x7fca08303be8, st_code=501, st_text=0x0, hdr_list=0x0, body=0x0) at ../src/pjsip/sip_util.c:1869
       status = 0
       res_addr = {transport = 0x7fca08253eb8, addr = {addr = {sa_family = 2}, ipv4 = {sin_family = 2, sin_port = 50451, sin_addr = {s_addr = 624595237}, sin_zero = "\000\000\000\000\000\000\000"}, ipv6 = {sin6_family = 2, sin6_port = 50451, sin6_flowinfo = 624595237, sin6_addr = {s6_addr = '\000' <repeats 15 times>, u6_addr32 = {0, 0, 0, 0}}, sin6_scope_id = 0}}, addr_len = 28, dst_host = {flag = 3, type = PJSIP_TRANSPORT_TLS, addr = {host = {ptr = 0x7fca5810dea8 "37.145.58.37\312\177", slen = 12}, port = 5061}}}
       tdata = 0x7fca5810d358
{noformat}
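A possible explanation for the extra characters after the address: pjlib's `pj_str_t` is a pointer plus a length (`ptr`, `slen`) and is not required to be NUL-terminated, while gdb prints `ptr` as a C string and keeps reading until it reaches a zero byte. If that is what happens here, the bytes after the first `slen = 12` characters are whatever follows in memory and are not part of the host name, so they do not by themselves indicate corruption. The sketch below illustrates the idea with a stand-in struct; `counted_str` and `counted_str_to_cstr` are hypothetical names for this example and not pjproject API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in that mirrors the layout idea of pj_str_t:
 * a pointer and a byte count, with no NUL terminator guaranteed. */
struct counted_str {
    const char *ptr;   /* may be followed by unrelated bytes */
    ptrdiff_t   slen;  /* number of valid bytes at ptr */
};

/* Copy only the slen valid bytes into dst and NUL-terminate the copy.
 * Returns 0 on success, -1 on invalid input or if dst is too small. */
static int counted_str_to_cstr(const struct counted_str *s,
                               char *dst, size_t dst_len)
{
    if (s == NULL || s->ptr == NULL || dst == NULL)
        return -1;
    if (s->slen < 0 || (size_t)s->slen >= dst_len)
        return -1;
    memcpy(dst, s->ptr, (size_t)s->slen);
    dst[s->slen] = '\0';
    return 0;
}
```

Applied to the first excerpt, a `counted_str` with `ptr` pointing at `"37.145.58.3710)"` and `slen = 12` converts to `"37.145.58.37"`. In gdb, an expression of the form `p *host.ptr@host.slen` prints only the counted bytes.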


By: Dmitriy Serov (Demon) 2016-02-05 13:37:06.879-0600

pjproject config_site.h
{noformat}
#define NDEBUG
#define PJ_ENABLE_EXTRA_CHECK 1
#define PJ_MAX_HOSTNAME 1024
#define PJSIP_MAX_URL_SIZE 1024
#define PJ_IOQUEUE_MAX_HANDLES       5000
#define PJSIP_MAX_TSX_COUNT          ((640*1024)-1)
#define PJSIP_MAX_DIALOG_COUNT       ((640*1024)-1)
#define PJSIP_UDP_SO_SNDBUF_SIZE     (24*1024*1024)
#define PJSIP_UDP_SO_RCVBUF_SIZE     (24*1024*1024)
#define PJSUA_MAX_CALLS              512
{noformat}

By: Dmitriy Serov (Demon) 2016-02-10 01:47:21.588-0600

I had the idea that the cause might be shared with res_xmpp (iksemel-1.5).
XMPP uses TLS, and I have seen many references to xmpp in the backtraces.
I disabled this module and was even happy when the server then ran for more than a day.
But after that it segfaulted again; the logs are attached (2016_02_10).

By: Dmitriy Serov (Demon) 2016-02-12 01:28:03.680-0600

The server crashes at least 2-3 times a day with this error. I have accumulated enough logs to publish them as separate books.
Is something wrong with the libraries or the server settings?
Is no one using res_pjsip with TLS?

By: Dmitriy Serov (Demon) 2016-02-13 08:33:16.435-0600

2016_02_13__02_54_07.backtrace-threads.txt
A different segfault stack in SSL.


By: Dmitriy Serov (Demon) 2016-02-14 13:17:48.640-0600

One more backtrace, with a deeper stack:
2016_02_14__21_44_07.backtrace-threads.txt