[Home]

Summary:ASTERISK-29230: pjsip: Asterisk goes crazy and massively spams logfile if registration can't be send
Reporter:Michael Maier (micha)Labels:fax patch
Date Opened:2020-12-30 11:45:22.000-0600Date Closed:2021-01-11 09:28:11.000-0600
Priority:BlockerRegression?
Status:Closed/CompleteComponents:Channels/chan_pjsip
Versions:18.1.0 Frequency of
Occurrence
Related
Issues:
Environment:CentOS 7, x86_64Attachments:( 0) easybeltrace.png
( 1) forced-tcp-kill.tar.gz
( 2) no-abort.patch.tar.gz
( 3) obr_abort.patch
Description:If registration is impossible because of an error sending a packet, asterisk goes crazy and spams the log with > 6740(!) identical entries during 2 seconds like this one (-> no mediasec involved!):

[2020-12-29 22:34:09] ERROR[4871] res_pjsip_outbound_registration.c: easybellPJSIP: Failed send registration to server 'sip:secure.sip.easybell.de' from client 'sip:[number]@secure.sip.easybell.de'

There is a warning at the beginning of the spam attack:
[2020-12-29 22:34:09] WARNING[3661] taskprocessor.c: The 'pjsip/outreg/easybellPJSIP-00000069' task processor queue reached 500 scheduled tasks.

Another problem was at this time: The reRegistration started 14s too early (can't see any reason why)!  The first package seen in the pcap maps to the end of the timeline of the spam entries, means:
- 6740 spam entries
- first package seen in the pcap trace (written by asterisk)

The account is a standard account, configured using tcp/tls.

I consider this problem as a security problem, because it has the ability to kill a machine completely, depending on the machine's resources and the duration of the problem and the affected numbers in parallel (I could see the same problem some time later affecting 2(!) other numbers in parallel, but it luckily ended after about 0.1 seconds).
Comments:By: Asterisk Team (asteriskteam) 2020-12-30 11:45:23.534-0600

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. Please note that log messages and other files should not be sent to the Sangoma Asterisk Team unless explicitly asked for. All files should be placed on this issue in a sanitized fashion as needed.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

Please note that by submitting data, code, or documentation to Sangoma through JIRA, you accept the Terms of Use present at [https://www.asterisk.org/terms-of-use/|https://www.asterisk.org/terms-of-use/].

By: Asterisk Team (asteriskteam) 2020-12-30 11:45:24.736-0600

This issue has been automatically restricted and set to a blocker due to being a security type issue. If this is not a security vulnerability issue it will be moved to the appropriate issue type when triaged.

Please DO NOT put a code review up for this change at this time. Attach any applicable patches to this issue.

By: Joshua C. Colp (jcolp) 2020-12-30 12:09:07.209-0600

Please attach the packet capture. As well, to ensure that there is overlap on registration outbound registration will re-register 15 seconds early.

As well what does "can't be send" actually mean? What happened? Did the TLS connection drop? Do you have an Asterisk log as well?

By: Joshua C. Colp (jcolp) 2020-12-30 13:08:42.441-0600

Also, what is the complete outbound registration configuration?

By: Michael Maier (micha) 2020-12-30 14:26:59.857-0600

About register interval:
the expire interval is 90 s. The registration usually takes place each 80 s. Can be seen perfectly in Wireshark trace. But this time, the registration started 14 s earlier, i.e. after 66 s (where the first error messages in the log file came up). After 67 s, the first package can be seen in the pcap trace. See the attached screenshot.

The TLS connection seems not to be dropped - because I would have expected an error like this:
WARNING[3661] pjproject:                     SSL SSL_ERROR_SSL (Read): Level: 0 err: <336151538> <SSL routines-ssl3_read_bytes-sslv3 alert unexpected message> len: 65535 peer: 212.172.58.207:5061

I don't have any additional logs.

About the registration configuration:
[easybellPJSIP]
type=registration
transport=0.0.0.0-tls
outbound_auth=easybellPJSIP
retry_interval=60
fatal_retry_interval=0
forbidden_retry_interval=10
max_retries=10000
expiration=3600
auth_rejection_permanent=yes
line=yes
endpoint=easybellPJSIP
contact_user=+49....
server_uri=sip:secure.sip.easybell.de
client_uri=sip:$number@secure.sip.easybell.de

About "can't be send": That's said by the log entry: "Failed *send* registration to ..."

By: Michael Maier (micha) 2020-12-30 14:32:49.924-0600

The attached screenshot of the trace shows the error situation. Packet number 14207 is the reRegister packet coming up much too early. Before this packet, the spam in the logfile happened.

By: George Joseph (gjoseph) 2020-12-31 07:52:33.368-0600

Michael,

Can you send the actual pcap file to asteriskteam @ digium.com or host it on Google Drive, DropBox, etc and send us the link?  I'm going through the code to see if recent changes I made to outbound registration might be a factor but in the mean time, if you catch the issue happening, can you run the following command and provide the output?

{{pjsip show scheduled_tasks like outreg}}

thanks




By: George Joseph (gjoseph) 2020-12-31 10:53:04.622-0600

On Monday, can you set up a test instance of Asterisk and another account at easybell that you can leave running with just an outbound registration?  I'd like to give you a test build that will cause Asterisk to abort with a stack trace when the "Failed send" message is triggered.




By: Michael Maier (micha) 2020-12-31 15:25:59.215-0600

You have mail to asteriskteam!

By: George Joseph (gjoseph) 2020-12-31 16:04:14.187-0600

Thanks Michael,

You can do 2 things to get better logging...

In  pjproject.conf set
{code}
[log_mappings]  ; NOTE: The id of this object MUST be log_mappings.
type = log_mappings
asterisk_notice = 0,1
asterisk_debug = 2,3,4,5

[startup]
type = startup
log_level = 4
{code}
pjproject log level 4 has tons of stuff in it but it also has the registration debugging.  The output will be sent to the "debug" log channel so you can set the following in logger.conf...
{code}
pjsip.log => debug
{code}

You can also turn on debugging for res_pjsip_outbound_registration like so:
{code}
CLI> core set debug 3 res_pjsip_outbound_registration.so
{code}

That will also send our debug messages to the same pjsip.log file.

You can add some lines to cli.conf to always turn on that debugging...
{code}
[startup_commands]
core set debug 3 res_pjsip_outbound_registration.so = yes
{code}

Anyway, I _think_ what's happening is that the asterisk transport manager, which watches over tcp and tls connections, may be triggering the registration to re-establish and somehow it gets into a tight loop.  Now why it's doing that is still a mystery.  I hope the pcap can shed some light on it.,

Happy New Year!



By: George Joseph (gjoseph) 2021-01-04 09:07:28.387-0600

Michael, is {{support_mediasec}} turned on for this endpoint?
Can you give me the complete output of {{pjsip show endpoint easybellPJSIP}}?

EDIT:  I'm guessing it's not turned on since I don't see the header in the outgoing REGISTER requests but just wanted to make sure.


By: George Joseph (gjoseph) 2021-01-04 09:50:31.511-0600

One more question...  It looks like easybell requires reauthentication about every 8 minutes.  Does this make sense?


By: Michael Maier (micha) 2021-01-04 14:05:01.032-0600

Regarding mediasec: no, easybell doesn't require mediasec. that's why it's turned off. I will send you an example how mediasec looks like in a pcap trace.

In the configuration, it would look like this if it would be enabled:
{code}
[telekomPJSIP-1]
type=registration
transport=0.0.0.0-tls
outbound_auth=telekomPJSIP-1
retry_interval=60
fatal_retry_interval=0
forbidden_retry_interval=10
max_retries=10000
expiration=660
auth_rejection_permanent=yes
line=yes
endpoint=telekomPJSIP-1
contact_user=+4912341234123
server_uri=sip:tel.t-online.de
client_uri=sip:+4912341234123@tel.t-online.de
support_mediasec=true
^^^^^^^^^^^^^^^^^^^^^^
{code}
{code}
[telekomPJSIP-1]
type=endpoint
transport=0.0.0.0-tls
context=from-pstn-remove-plus
disallow=all
allow=alaw,ulaw
aors=telekomPJSIP-1
send_connected_line=false
language=de
outbound_auth=telekomPJSIP-1
from_domain=tel.t-online.de
from_user=+4912341234123
contact_user=+4912341234123
user_eq_phone=no
t38_udptl=no
t38_udptl_ec=none
fax_detect=no
trust_id_inbound=no
t38_udptl_nat=no
direct_media=no
rewrite_contact=yes
media_encryption=sdes
rtp_symmetric=yes
dtmf_mode=rfc4733
support_mediasec=1
^^^^^^^^^^^^^^^^^^^^
{code}

Well, about reauthentication: Yes, you're right - but I never thought about this (it just works). The whole reRegistertime is quite short in case of easybell - if you compare it with Deutsche Telekom (see the pcap trace I sent you via email).

Hope I could answer your questions.

By: George Joseph (gjoseph) 2021-01-04 16:36:37.395-0600

Makes sense now, thanks!


By: George Joseph (gjoseph) 2021-01-05 14:27:40.456-0600

I completely lost track of whether you can or have set up a test server, sorry :)

If you have or can, can you apply the patch I supplied above?  It will cause an abort in asterisk when the "Failed send registration" message happens.  Compile with DONT_OPTIMIZE and let it run.  At the same time, can you get a pcap using wireshark or tcpdump as well as using the internal "pjsip set logger pcap" facility?  The external wireshark/tcpdump is needed to see what's happening on the TCP/TLS layer.  The payload will be encrypted but that's fine because the unencrypted payload will be in the pcap generated by asterisk itself.

Tnanks!


By: Michael Maier (micha) 2021-01-05 16:43:01.793-0600

I tried to force a broken connection with tcpkill. tcpkill just resets an existing connection [1].

I'm not sure if this is the problem I'm searching for - there is no SSL error. But you may take a look at the attached traces anyway. I did add only the tcpdump trace just to show how tcpkill acts. The asterisk trace is unspectacular.

Do you by chance know of any tool, which can inject data to a running tcp connection? I would like to force an SSL error.

[1] https://rtomaszewski.blogspot.com/2012/11/how-to-forcibly-kill-established-tcp.html

By: Michael Maier (micha) 2021-01-06 12:40:25.802-0600

While trying to force an SSL_ERROR by different means, I could find following:
- pjsip uses a [default tls tcp keep alive|https://www.pjsip.org/pjsip/docs/html/group__PJSIP__CONFIG.htm#ga9d874db29a45e1908e92444c82edd719] of 90s (this is not the SIP qualify or the Registers).

I tried to force it by changing the system time. No chance. The keep alives are working perfectly - regardless if the clock is set ahead or back.
The registers behave different after switching the clock: If the clock is set back, there isn't any register any more until the next reRegister time is arrived again.
If the clock is set ahead and the next planned reRegister time is overrun by this step, no more reRegister will take place at all. If you check this on the CLI, you get the following output:
{code}
virtast*CLI> pjsip show registrations

<Registration/ServerURI..............................>  <Auth..........>  <Status.......>  <Last Reg..>  <Intvl>  <Next Start.....secs>
========================================================================================================================================

easybellPJSIP/sip:secure.sip.easybell.de                easybellPJSIP     Registered       Wed 18:24:57       80  Wed 19:18:02        0

Objects found: 1

virtast*CLI> pjsip show registrations

<Registration/ServerURI..............................>  <Auth..........>  <Status.......>  <Last Reg..>  <Intvl>  <Next Start.....secs>
========================================================================================================================================

easybellPJSIP/sip:secure.sip.easybell.de                easybellPJSIP     Registered       Wed 18:24:57       80  Wed 19:18:06        0

Objects found: 1

virtast*CLI> pjsip show registrations

<Registration/ServerURI..............................>  <Auth..........>  <Status.......>  <Last Reg..>  <Intvl>  <Next Start.....secs>
========================================================================================================================================

easybellPJSIP/sip:secure.sip.easybell.de                easybellPJSIP     Registered       Wed 18:24:57       80  Wed 19:18:09        0

Objects found: 1

[2021-01-06 19:18:09] DEBUG[1295]: res_pjsip_registrar.c:1293 check_expiration_thread: Expiring 0 contacts
virtast*CLI> pjsip show registrations

<Registration/ServerURI..............................>  <Auth..........>  <Status.......>  <Last Reg..>  <Intvl>  <Next Start.....secs>
========================================================================================================================================

easybellPJSIP/sip:secure.sip.easybell.de                easybellPJSIP     Registered       Wed 18:24:57       80  Wed 19:18:14        0
{code}
-> The "Next Start" is always the actual time and secs is always 0 - but reRegister never starts again.
What does this mean? Even a very small time step could lead to a complete outage of a trunk. I'm wondering that keep alives are completely resistant - but not the reRegisters.
It's even impossible to force an unRegister - it is just ignored! No - wrong. "pjsip unregister *all" did work - but "pjsip unregister easybellPJSIP" didn't work.

By: George Joseph (gjoseph) 2021-01-06 13:23:20.657-0600

Sigh.   Added to the list. :(


By: Michael Maier (micha) 2021-01-06 14:45:53.029-0600

I'm so sorry - please excuse me!

By: George Joseph (gjoseph) 2021-01-06 15:22:40.125-0600

Heh, not _your_ fault.  It's mine.  I think we're going to revert that commit from November that seems to have caused these issues.


By: Michael Maier (micha) 2021-01-07 01:38:34.483-0600

Next test I did - just to verify, if the tcpkill forcing the "Failed send registration to server" is a means to reproduce the original problem.
Therefore I redid the tcpkill w/o the abort patch applied. I just wanted to see, if I'm getting more than one "Failed send registration" - entries. As you can see in the attached traces: yes, I'm getting more of them: 9 times. There have been 9 reRegistrations before (and one initial Registration). Chance?
{code}
grep -e "in 80000 ms" -e "Asterisk 18.1.1 built by test" -e "Failed send" /var/log/asterisk/full
[2021-01-07 07:20:26] Asterisk 18.1.1 built by test @ localhost.localdomain on a x86_64 running Linux on 2020-12-23 15:31:26 UTC
[2021-01-07 07:20:39] DEBUG[1386] res_pjsip_outbound_registration.c: easybellPJSIP: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80000 ms
[2021-01-07 07:21:59] DEBUG[1439] res_pjsip_outbound_registration.c: easybellPJSIP: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80000 ms
[2021-01-07 07:23:19] DEBUG[1480] res_pjsip_outbound_registration.c: easybellPJSIP: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80000 ms
[2021-01-07 07:24:39] DEBUG[1485] res_pjsip_outbound_registration.c: easybellPJSIP: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80000 ms
[2021-01-07 07:25:59] DEBUG[1490] res_pjsip_outbound_registration.c: easybellPJSIP: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80000 ms
[2021-01-07 07:27:19] DEBUG[1495] res_pjsip_outbound_registration.c: easybellPJSIP: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80000 ms
[2021-01-07 07:28:39] DEBUG[1500] res_pjsip_outbound_registration.c: easybellPJSIP: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80000 ms
[2021-01-07 07:29:59] DEBUG[1507] res_pjsip_outbound_registration.c: easybellPJSIP: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80000 ms
[2021-01-07 07:31:19] DEBUG[1513] res_pjsip_outbound_registration.c: easybellPJSIP: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80000 ms
[2021-01-07 07:32:39] DEBUG[1522] res_pjsip_outbound_registration.c: easybellPJSIP: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80000 ms
[2021-01-07 07:33:59] ERROR[1532] res_pjsip_outbound_registration.c: easybellPJSIP: Failed send registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de'
[2021-01-07 07:33:59] ERROR[1532] res_pjsip_outbound_registration.c: easybellPJSIP: Failed send registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de'
[2021-01-07 07:33:59] ERROR[1532] res_pjsip_outbound_registration.c: easybellPJSIP: Failed send registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de'
[2021-01-07 07:33:59] ERROR[1532] res_pjsip_outbound_registration.c: easybellPJSIP: Failed send registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de'
[2021-01-07 07:33:59] ERROR[1532] res_pjsip_outbound_registration.c: easybellPJSIP: Failed send registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de'
[2021-01-07 07:33:59] ERROR[1532] res_pjsip_outbound_registration.c: easybellPJSIP: Failed send registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de'
[2021-01-07 07:33:59] ERROR[1532] res_pjsip_outbound_registration.c: easybellPJSIP: Failed send registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de'
[2021-01-07 07:33:59] ERROR[1532] res_pjsip_outbound_registration.c: easybellPJSIP: Failed send registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de'
[2021-01-07 07:33:59] ERROR[1532] res_pjsip_outbound_registration.c: easybellPJSIP: Failed send registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de'
{code}
The attached full trace of asterisk starts at the moment tcpkill gets active.

To sum it up:
- tcpkill seems to be a valid means to reproduce the problem.
- The number of "Failed send registration" entries seems to be dependent on the number of the reRegisters done before? That's most probably why there have been much less spamming as the same problem happened with t-online hours later (I wrote it directly to you via mail), because the lifetime of a register is much longer with t-online (660s vs. 90s).


Next I will test, how Asterisk 18.0 behaves (before your changes from November you mentioned). I'll be back.

By: Michael Maier (micha) 2021-01-07 03:50:27.877-0600

Ok, back again. Testing 18.0.1 in relation of the problems detected here:

Testcases:
- tcpkill
   slightly different - but finally same behavior
{code}
   [2021-01-07 08:59:01] Asterisk 18.0.1 built by test @ localhost.localdomain on a x86_64 running Linux on 2020-11-06 05:39:17 UTC
   [2021-01-07 08:59:08] DEBUG[2069] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:00:28] DEBUG[2122] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:01:48] DEBUG[2140] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:03:08] DEBUG[2147] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:04:28] DEBUG[2152] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:05:48] DEBUG[2157] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:07:08] DEBUG[2162] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:08:28] DEBUG[2167] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:09:49] DEBUG[2172] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:11:09] DEBUG[2177] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:12:29] DEBUG[2182] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:13:49] DEBUG[2209] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:15:09] DEBUG[2215] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:16:29] DEBUG[2220] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:17:49] DEBUG[2225] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:19:09] DEBUG[2230] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:20:29] DEBUG[2239] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:21:49] DEBUG[2241] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:23:09] DEBUG[2246] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:24:29] DEBUG[2255] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   tcpkill takes effect
   [2021-01-07 09:25:49] DEBUG[2068] res_pjsip/pjsip_transport_events.c: Reliable transport 'tlsc0x7f2da82ef328' state:SHUTDOWN
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown
   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Outbound registration transport to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' shutdown

   [2021-01-07 09:25:49] DEBUG[2263] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:27:09] DEBUG[2268] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:28:29] DEBUG[2273] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:29:49] DEBUG[2293] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
{code}
   => spam (21) seems to be dependent on the number of Registrations done before (21), too.

- CLI: unregister / register / show registrations
   unregister to trunk name: works
   pjsip show registrations w/o an active registration: works
   pjsip send register trunk name: works

- stepping the system time backward and forward
   *forward*: at ~ 09:47:05 set system time to 09:50:00
   from full - starting with last registration bevor the system time changed:
{code}
   [2021-01-07 09:46:17] DEBUG[2416] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds

   [Jan  7 09:50:00 virtast systemd: Time has been changed] (09:50:00 is already the new time!)

   [2021-01-07 09:50:13] DEBUG[2425] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:51:33] DEBUG[2435] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:52:53] DEBUG[2442] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 09:54:13] DEBUG[2450] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
{code}
   -> stepping forward seems to be handled fine
   BTW: chronyd realized the forward step here:
{code}
   [Jan  7 09:57:22 virtast chronyd[876]: System clock wrong by -156.407752 seconds, adjustment started]
{code}
   *backward*: at ~ 10:01:50 (-> the time which has been set before) via chronyd makestep
   from full - starting with last registration before the system time changed:
{code}
   [2021-01-07 09:59:33] DEBUG[2476] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 10:00:54] DEBUG[2481] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   
   [Jan  7 09:59:44 virtast chronyd[876]: System clock was stepped by -131.509575 seconds] (09:59:44 is already the corrected system time)
   [Jan  7 09:59:44 virtast systemd: Time has been changed]

   [2021-01-07 10:00:02] DEBUG[2504] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 10:01:22] DEBUG[2524] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
   [2021-01-07 10:02:42] DEBUG[2554] res_pjsip_outbound_registration.c: Scheduling outbound registration to server 'sip:secure.sip.easybell.de' from client 'sip:004912345678990@secure.sip.easybell.de' in 80 seconds
{code}
   -> stepping backwards seems to be handled fine


*To sum it up*:
A *revert* would *not fix* the original reason for opening this ticket - but it would fix the other problems detected by chance while trying to reproduce / force the problem.


A few more considerations:
- Registration and all subsequent actions performed must always go to the [same destination|https://www.mail-archive.com/asterisk-dev@lists.digium.com/msg45021.html]
- Think about the handling of more than one number registered to the same destination - please ensure, that these registers are really independent though they are using the same transport and the exactly same connection! I think there are problems, too (if one number is unregistered and registered again, sometimes suddenly all numbers are reRegistered - at least with 18.1.x - I didn't realize this before. Why did I [detect it|https://www.mail-archive.com/asterisk-dev@lists.digium.com/msg45207.html]? Because I realized that the problem described in the [mentioned link|https://www.mail-archive.com/asterisk-dev@lists.digium.com/msg45207.html] seems to disappear as long as the registrations of the 3 numbers aren't done at the same time - therefore I'm trying to serialize them with enough time (more than 30s) between the registration of each number - I didn't saw any problem any more since serialization). The problem itself (-> faulty warning messages) could already be seen with asterisk 16.x.

By: George Joseph (gjoseph) 2021-01-07 08:33:43.558-0600

Are you sure that the multiple messages aren't caused by tcpkill continuing to kill new connections?   The last change to res_pjsip_outbound_registration before my commit in November was in February of last year and it only involved retrying when the upstream provider used SRV records.


By: Michael Maier (micha) 2021-01-07 08:55:53.736-0600

Yes, I'm pretty sure as tcpkill needs the local port - this port is changed after the old connection is dropped and the new connection is started.
That's how tcpkill is started:
{code}
tcpkill -i eth0 port 48067
{code}
48067 is the local port of the actual tls tcp connection to the ISP's server. After the new connection is built up, the local port changes - therefore, tcpkill doesn't work any more (even if you don't switch it off - one time I forgot it to switch off - the few RST-packages sent by tcpkill had no effect any more after there was a new connection with a new local port). Besides that, in the logfile of asterisk, you get only one time the connection reset info, e.g.:
{code}
[2021-01-07 06:59:03] DEBUG[4178] res_pjsip/pjsip_transport_events.c: Reliable transport 'tlsc0x7fdfe02f0388' state:DISCONNECTED
[2021-01-07 06:59:03] DEBUG[4178] res_pjsip/pjsip_transport_events.c: Reliable transport 'tlsc0x7fdfe02f0388' state:SHUTDOWN
...
[2021-01-07 06:59:21] DEBUG[8015] res_pjsip/pjsip_transport_events.c: Reliable transport 'tlsc0x7faf882eff68' state:CONNECTED
{code}

Regarding last changes - I have to take a look at my old logs. I fear, this problem isn't new ... .

By: Michael Maier (micha) 2021-01-07 09:26:54.872-0600

Regarding old logs (I have many of them until 2016):
Well, before asterisk 18.1, there wasn't any log output besides *DEBUG* (normally, I disable DEBUG output, because my device runs on a APU2 and SD-Card - that's why a massive amount of logentries is critical - they may break the complete system / SD-Card (I already had such a situation in the past)). The actual log output is an *ERROR* message which seems to didn't exist before 18.1. (I didn't search for it in the code so far).

The relation between the amount of ERROR lines (or DEBUG lines before in the same error situation) and the amount of Registers done before is highly suspicious to me. From my point of view, there aren't all old entries removed from an old register after a new reRegister has been performed.

By: Friendly Automation (friendly-automation) 2021-01-11 09:28:12.955-0600

Change 15292 merged by Friendly Automation:
Revert "res_pjsip_outbound_registration.c:  Use our own scheduler and other stuff"

[https://gerrit.asterisk.org/c/asterisk/+/15292|https://gerrit.asterisk.org/c/asterisk/+/15292]

By: Friendly Automation (friendly-automation) 2021-01-11 09:28:47.790-0600

Change 15290 merged by Friendly Automation:
Revert "res_pjsip_outbound_registration.c:  Use our own scheduler and other stuff"

[https://gerrit.asterisk.org/c/asterisk/+/15290|https://gerrit.asterisk.org/c/asterisk/+/15290]

By: Friendly Automation (friendly-automation) 2021-01-11 09:30:30.756-0600

Change 15291 merged by Joshua Colp:
Revert "res_pjsip_outbound_registration.c:  Use our own scheduler and other stuff"

[https://gerrit.asterisk.org/c/asterisk/+/15291|https://gerrit.asterisk.org/c/asterisk/+/15291]