Summary: ASTERISK-16479: [patch] Strange booting behavior depending on nice value
Reporter: Gabriel Ortiz Lour (elbriga)
Labels:
Date Opened: 2010-07-30 07:29:51
Date Closed: 2011-07-27 13:14:01
Priority: Critical
Regression? No
Status: Closed/Complete
Components: Core/General
Versions:
Frequency of Occurrence:
Related Issues:
Environment:
Attachments: ( 0) backtrace.txt
( 1) GSD-res_config_pgsql-fix-notConnError.patch
Description: We're having some weird issues here with Asterisk 1.6.2.10 (which wasn't on the list of versions).

After many trial-and-error attempts we concluded that the problem involves res_config_pgsql and the nice value used when starting Asterisk through safe_asterisk.

When the nice value is 0 (the default in safe_asterisk) or lower, Asterisk won't boot correctly (I assume that not all modules get loaded, since core commands like "reload" don't work), but when we use a higher nice value (lower priority), like 10 or more, it boots correctly. Curiously, when it is in the "hung" CLI state and we run "/etc/init.d/postgresql-8.3 restart", Asterisk will "un-hang" and continue the normal boot, eventually reconnecting to postgres and resuming normal operation.

Another thing we did was poke around in res_config_pgsql, because it appeared to be executing queries without being connected (probably the cause of the hang). We added a call to "pgsql_reconnect" in the "find_table" function (because this function runs queries without calling the reconnect function that makes sure we're connected), and that seemed to resolve the problem.
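To illustrate the idea (this is not the actual res_config_pgsql code), here is a minimal sketch of a lookup function that confirms the connection before it queries. The names mock_conn, mock_reconnect and mock_find_table are hypothetical stand-ins invented for this example; no real database is contacted.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a database handle; not the Asterisk or libpq type. */
struct mock_conn {
    bool connected;       /* true once a session is established */
    int connect_attempts; /* how many times we tried to (re)connect */
    int queries_run;      /* queries that actually reached the "server" */
};

/* Hypothetical reconnect helper: returns true if the handle is usable afterwards. */
static bool mock_reconnect(struct mock_conn *c)
{
    if (c == NULL) {
        return false;
    }
    if (!c->connected) {
        c->connect_attempts++;
        c->connected = true; /* a real implementation could fail here */
    }
    return true;
}

/* Table lookup that refuses to query until the connection is confirmed. */
static int mock_find_table(struct mock_conn *c, const char *table)
{
    if (table == NULL || !mock_reconnect(c)) {
        return -1; /* do not issue a query on a dead handle */
    }
    c->queries_run++; /* the table metadata query would go here */
    return 0;
}
```

With the guard inside the lookup, a caller that forgets to reconnect first still cannot send a query over a handle that was never opened.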

****** ADDITIONAL INFORMATION ******

Eventually it will crash instead of hanging, with this CLI output:
(Oddly, if I turn debug on with the -d flag it does not crash!)

.
.
.
 == Registered custom function 'SIPCHANINFO'
 == Registered custom function 'CHECKSIPDOMAIN'
 == Manager registered action SIPpeers
 == Manager registered action SIPshowpeer
 == Manager registered action SIPqualifypeer
 == Manager registered action SIPshowregistry
 == Manager registered action SIPnotify
message type 0x54 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x44 arrived from server while idle
message type 0x43 arrived from server while idle
message type 0x5a arrived from server while idle
[Jul 30 09:25:20] WARNING[13220]: res_config_pgsql.c:351 realtime_pgsql: PostgreSQL RealTime: Failed to query 'sip@asterisk'. Check debug for more info.
core dumped

Comments:

By: Paul Belanger (pabelanger) 2010-07-30 07:41:34

We'll need a backtrace (see below) to triage the issue. I suggest dropping safe_asterisk from your test and seeing whether you can reproduce the problem when calling asterisk directly.

---
Thank you for your bug report. In order to move your issue forward, we require a backtrace from the core file produced after the crash. Please see the doc/backtrace.txt file in your Asterisk source directory.

Also, be sure you have DONT_OPTIMIZE enabled in menuselect within the Compiler Flags section, then:

make install

after enabling, reproduce the crash, and then execute the instructions in doc/backtrace.txt.

When complete, attach that file to this issue report. Thanks!

By: Gabriel Ortiz Lour (elbriga) 2010-07-30 08:11:26

Backtrace generated with the instructions provided.

By: Leif Madsen (lmadsen) 2010-08-05 14:21:39

I'm acknowledging this issue. Thanks for the report!

By: Gabriel Ortiz Lour (elbriga) 2010-08-06 09:18:15

We've made a patch here in which we moved the call to the "pgsql_reconnect()" function up, before the call to "find_table()", in every other function that uses "find_table()" [ update_pgsql() - update2_pgsql() - require_pgsql() ].

We are now using this modified version for testing, and it seems to have solved the issue (we restart every day to test).
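The ordering change can be sketched roughly as follows. This is an illustration only, not the attached patch: demo_db, demo_reconnect, demo_find_table and demo_update are hypothetical names, and the "server" is just a flag in a struct.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical handle used only for this illustration. */
struct demo_db {
    bool server_up; /* whether a connect attempt can succeed */
    bool connected; /* current session state */
    int lookups;    /* table lookups performed */
    int updates;    /* updates performed */
};

/* Tries to establish a session if there is none; reports the resulting state. */
static bool demo_reconnect(struct demo_db *db)
{
    if (!db->connected && db->server_up) {
        db->connected = true;
    }
    return db->connected;
}

/* Assumes the caller has already verified the connection. */
static bool demo_find_table(struct demo_db *db, const char *table)
{
    (void) table;  /* the name is unused in this mock */
    db->lookups++; /* the lookup query would be sent here */
    return true;
}

/* Caller-side ordering: reconnect first, and only then look up the table. */
static int demo_update(struct demo_db *db, const char *table)
{
    if (!demo_reconnect(db)) {
        return -1; /* bail out before any query is attempted */
    }
    if (!demo_find_table(db, table)) {
        return -1;
    }
    db->updates++;
    return 0;
}
```

When the server is unreachable, demo_update returns -1 and the lookup counter stays at zero, which is the behavior the reordering is meant to guarantee.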

By: Gabriel Ortiz Lour (elbriga) 2011-01-13 12:10:52.000-0600

The bug persists on the 1.6.2.15 version

By: Gabriel Ortiz Lour (elbriga) 2011-03-15 13:19:16

Just uploaded a patch that fixes the issue on the 1.6.2.15 version

By: Russell Bryant (russell) 2011-07-27 13:13:55.009-0500

Per the Asterisk maintenance timeline page at http://www.asterisk.org/asterisk-versions, maintenance (bug) support for the 1.4 and 1.6.x branches has ended. For continued maintenance support, please move to the 1.8 branch, which is a long term support (LTS) branch. For more information about branch support, please see https://wiki.asterisk.org/wiki/display/AST/Asterisk+Versions

If this is still an issue, please open a new issue so it can be re-triaged appropriately. Thanks!