
Summary: ASTERISK-27533: res_pjsip: Large number of inbound/outbound registrations exhausts memory pool
Reporter: Dmitriy Serov (Demon)
Labels: pjsip
Date Opened: 2017-12-26 11:56:33.000-0600
Date Closed:
Priority: Minor
Regression? Yes
Status: Open/New
Components: Resources/res_pjsip
Versions: 15.2.0
Frequency of Occurrence: Frequent
Related Issues:
Environment:
Attachments:
( 0) core.asterisk.32135.talk37.1514731124.zip
( 1) core.asterisk.32229.talk37.1514314350-brief.txt
( 2) core.asterisk.32229.talk37.1514314350-full.txt
( 3) core.asterisk.32229.talk37.1514314350-thread1.txt
( 4) core.asterisk.6926.talk37.1514706246.zip
( 5) core.asterisk.8524.talk37.1514306431-brief.txt
( 6) core.asterisk.8524.talk37.1514306431-full.txt
( 7) core.asterisk.8524.talk37.1514306431-thread1.txt
( 8) pjsip_transport_get_flag_from_type.zip
Description: Asterisk periodically (twice a day) segfaults in libasteriskpj.so. This error was not observed in versions prior to 15.2.0.

Thread 1 (Thread 0x7f05b1815700 (LWP 8566)):
#0  0x00007f0625f7f039 in pjsip_endpt_create_pool (endpt=0xc88f, pool_name=0x7f06260411a7 "tls", initial=512, increment=512) at ../src/pjsip/sip_endpoint.c:664
       pool = 0x797469726f687475
#1  0x00007f0625f91f1c in tls_create (listener=0x7f055893a5e8, pool=0x0, ssock=0x7f05c25393e8, is_server=1, local=0x7f05b18148bc, remote=0x7f05b1814890, remote_name=0x0, p_tls=0x7f05b1814858)
Comments:

By: Asterisk Team (asteriskteam) 2017-12-26 11:56:34.116-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

By: Dmitriy Serov (Demon) 2017-12-26 11:58:19.156-0600

Backtrace attached
core.asterisk.8524.*

By: Dmitriy Serov (Demon) 2017-12-26 12:59:32.603-0600

Backtrace of another segfault in libasteriskpj.so
core.asterisk.32229.*


By: Joshua C. Colp (jcolp) 2018-01-02 05:45:19.712-0600

One of these crashes is due to an inability to allocate a memory pool, and the other seems to be because a transport has gone away.

In both cases we need to see the console output at the time, as well as further information about the system: its usage patterns, how many TCP/TLS connections are up, and how often they are dropping.

By: Dmitriy Serov (Demon) 2018-01-02 10:53:45.097-0600

Attached two .zip archives with segfault logs.

Asterisk segfaults 1-3 times per day.

~8000 inbound registrations.
~700 outbound registrations.

~130 ESTABLISHED TCP/TLS connections.
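
A connection count like the one above can be gathered repeatedly to see how often connections drop. The following is a minimal sketch, not part of the original report: it assumes the text comes from `ss -tan` (where established sockets are shown with state `ESTAB`) and that the SIP transports listen on ports 5060 and 5061. The function name `count_established` and the sample addresses are hypothetical.

```python
from collections import Counter

# Assumption: 5060 (SIP over TCP) and 5061 (SIP over TLS) are the ports in use;
# adjust to match the transports configured in pjsip.conf.
SIP_PORTS = {"5060", "5061"}


def count_established(ss_output, ports=SIP_PORTS):
    """Count established connections per local port in `ss -tan` style text."""
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        # Expected columns: State Recv-Q Send-Q Local:Port Peer:Port
        if len(fields) < 5 or fields[0] != "ESTAB":
            continue
        local_port = fields[3].rsplit(":", 1)[-1]
        if local_port in ports:
            counts[local_port] += 1
    return counts


# Made-up sample using documentation address ranges, not data from this issue.
SAMPLE = """\
State  Recv-Q Send-Q Local Address:Port  Peer Address:Port
ESTAB  0      0      192.0.2.10:5061     198.51.100.7:49152
ESTAB  0      0      192.0.2.10:5061     198.51.100.8:50211
ESTAB  0      0      192.0.2.10:22       203.0.113.5:51000
LISTEN 0      128    0.0.0.0:5060        0.0.0.0:*
"""

if __name__ == "__main__":
    print(dict(count_established(SAMPLE)))
```

Running this at intervals and comparing the totals would give a rough picture of connection churn between crashes.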

By: Joshua C. Colp (jcolp) 2018-01-02 17:04:38.949-0600

That is not a level of use that PJSIP support has been tested against, so I don't know how it would handle such a thing. It is entirely possible that the memory pool configuration needs further tweaking to allow such a high level of use.

By: Dmitriy Serov (Demon) 2018-01-11 10:29:35.160-0600

Today there were 9 server crashes in less than a day.
Analysis of the dumps showed that the vast majority of the crashes occur in one place: pjsip_transport_get_flag_from_type, which is called from pjsip_endpt_acquire_transport2.
The stack usually contains the code responsible for subscription_notify.

Attached zip file with several dumps: pjsip_transport_get_flag_from_type.zip

By: Dmitriy Serov (Demon) 2018-01-22 04:14:27.823-0600

One week ago I added the following to modules.conf:
noload => res_resolver_unbound.so
noload => res_pjsip_pubsub.so
noload => res_pjsip_exten_state.so
noload => res_pjsip_dialog_info_body_generator.so
noload => res_pjsip_mwi_body_generator.so
noload => res_pjsip_pidf_body_generator.so
noload => res_pjsip_xpidf_body_generator.so

As a result, the server now segfaults once per day, with a different stack.