[Home]

Summary:ASTERISK-29119: res_odbc: Hang in libodbc when connecting
Reporter:Dennis Haney (CineCine)Labels:
Date Opened:2020-10-09 11:39:30Date Closed:2020-11-04 12:00:02.000-0600
Priority:MinorRegression?No
Status:Closed/CompleteComponents:Resources/res_odbc
Versions:17.5.1 Frequency of
Occurrence
Frequent
Related
Issues:
Environment:Linux 4.19.0-4-amd64 #1 SMP Debian 4.19.28-2 (2019-03-15) x86_64 GNU/Linux Attachments:( 0) config.rar
Description:We recently upgraded our servers and we are now getting hangs in startup of asterisk.

Stack trace shows that it is stuck in ODBC. It does not happen every time, but about 70% of the time. kill -9 and restarting over and over until success solves the problem

{noformat}
Thread 55 (Thread 0x7f6c7146f700 (LWP 24622)):
#0  0x00007f6c9c80229c in __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
#1  0x00007f6c9c7fb7d1 in __GI___pthread_mutex_lock (mutex=0x7f6c9d152968 <_rtld_global+2312>) at ../nptl/pthread_mutex_lock.c:115
#2  0x00007f6c9c813400 in __dlsym (handle=<optimized out>, name=<optimized out>) at dlsym.c:68
#3  0x00007f6c9962f61c in  () at /lib/x86_64-linux-gnu/libltdl.so.7
#4  0x00007f6c9962e8c5 in lt_dlsym () at /lib/x86_64-linux-gnu/libltdl.so.7
#5  0x00007f6c996414a7 in  () at /lib/x86_64-linux-gnu/libodbc.so.2
#6  0x00007f6c996442b0 in SQLConnect () at /lib/x86_64-linux-gnu/libodbc.so.2
#7  0x00007f6c9d12416e in odbc_obj_connect (obj=0x7f6c7c007610) at res_odbc.c:1059
#8  0x00007f6c9d12416e in _ast_odbc_request_obj2 (name=name@entry=0x7f6c7146c3d0 "catsql-sqlserver", flags=..., flags@entry=..., file=file@entry=0x7f6c9d126a40 "res_odbc.c", function=function@entry=0x7f6c9d127030 <__PRETTY_FUNCTION__.17755> "ast_odbc_find_table", lineno=lineno@entry=263) at res_odbc.c:944
#9  0x00007f6c9d1247df in _ast_odbc_request_obj (name=name@entry=0x7f6c7146c3d0 "catsql-sqlserver", check=check@entry=0, file=file@entry=0x7f6c9d126a40 "res_odbc.c", function=function@entry=0x7f6c9d127030 <__PRETTY_FUNCTION__.17755> "ast_odbc_find_table", lineno=lineno@entry=263) at res_odbc.c:992
#10 0x00007f6c9d12487a in ast_odbc_find_table (database=database@entry=0x7f6c7146c3d0 "catsql-sqlserver", tablename=tablename@entry=0x7f6c7146c4d0 "Asterisk_Login") at res_odbc.c:263
#11 0x00007f6c998aaaa2 in update_odbc (database=0x7f6c7146c3d0 "catsql-sqlserver", table=0x7f6c7146c4d0 "Asterisk_Login", keyfield=0x7f6c7bfc2c30 "name", lookup=0x55f3b5e727c0 "mysipprovider-out", fields=0x7f6c7c007540) at res_config_odbc.c:533
#12 0x000055f3b53f7c0d in ast_update_realtime_fields (family=family@entry=0x7f6c7bfc28c7 "sippeers", keyfield=keyfield@entry=0x7f6c7bfc2c30 "name", lookup=lookup@entry=0x55f3b5e727c0 "mysipprovider-out", fields=0x7f6c7c007540) at config.c:3478
#13 0x000055f3b53f7d09 in ast_update_realtime (family=0x7f6c7bfc28c7 "sippeers", keyfield=keyfield@entry=0x7f6c7bfc2c30 "name", lookup=lookup@entry=0x55f3b5e727c0 "mysipprovider-out") at config.c:3502
#14 0x00007f6c7bfa8f96 in handle_response_peerpoke (resp=<optimized out>, req=0x7f6c7146cb90, p=0x7f6c7c003430) at chan_sip.c:25012
#15 0x00007f6c7bfa8f96 in handle_response (p=p@entry=0x7f6c7c003430, resp=<optimized out>, rest=<optimized out>, req=req@entry=0x7f6c7146e260, seqno=<optimized out>) at chan_sip.c:25265
#16 0x00007f6c7bfb7990 in handle_incoming (p=0x7f6c7c003430, req=0x7f6c7146e260, addr=0x7f6c7146e1d0, recount=0x7f6c7146e198, nounlock=0x7f6c7146e19c) at chan_sip.c:29183
#17 0x00007f6c7bfb9538 in handle_request_do (req=req@entry=0x7f6c7146e260, addr=addr@entry=0x7f6c7146e1d0) at chan_sip.c:29493
#18 0x00007f6c7bfbaef2 in sipsock_read (id=<optimized out>, fd=<optimized out>, events=<optimized out>, ignore=<optimized out>) at chan_sip.c:29424
#19 0x000055f3b5318dc2 in ast_io_wait (ioc=0x55f3b62a98f0, howlong=<optimized out>) at io.c:297
#20 0x00007f6c7bf90e29 in do_monitor (data=data@entry=0x0) at chan_sip.c:30071
#21 0x000055f3b53c5fec in dummy_start (data=<optimized out>) at utils.c:1249
#22 0x00007f6c9c7f8fa3 in start_thread (arg=<optimized out>) at pthread_create.c:486
#23 0x00007f6c9c3cf4cf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
{noformat}

{noformat}
Thread 1 (Thread 0x7f6c9a18b780 (LWP 24555)):
#0  0x00007f6c9c80229c in __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
#1  0x00007f6c9c7fb7d1 in __GI___pthread_mutex_lock (mutex=mutex@entry=0x55f3b5b35a58) at ../nptl/pthread_mutex_lock.c:115
#2  0x000055f3b5324678 in __ast_pthread_mutex_lock (filename=filename@entry=0x7f6c9d126a40 "res_odbc.c", lineno=lineno@entry=834, func=func@entry=0x7f6c9d126ed0 <__PRETTY_FUNCTION__.17932> "ast_odbc_release_obj", mutex_name=mutex_name@entry=0x7f6c9d126aec "&class->lock", t=t@entry=0x55f3b5b35a58) at lock.c:326
#3  0x00007f6c9d123cd4 in ast_odbc_release_obj (obj=obj@entry=0x55f3b5b36230) at res_odbc.c:834
#4  0x00007f6c9d1248fc in ast_odbc_find_table (database=database@entry=0x7fff5ebc9bb0 "catsql-sqlserver", tablename=tablename@entry=0x7fff5ebc9cb0 "Asterisk_Login") at res_odbc.c:337
#5  0x00007f6c998a8560 in require_odbc (database=0x7fff5ebc9bb0 "catsql-sqlserver", table=0x7fff5ebc9cb0 "Asterisk_Login", ap=0x7fff5ebc9b98) at res_config_odbc.c:1078
#6  0x000055f3b53f798c in ast_realtime_require_field (family=0x7f6c7bfc28c7 "sippeers") at config.c:3394
#7  0x00007f6c7bfbbe77 in load_module () at chan_sip.c:35699
#8  0x000055f3b5322196 in start_resource (mod=mod@entry=0x55f3b5d815c0) at loader.c:1711
#9  0x000055f3b5322cf8 in start_resource_attempt (mod=mod@entry=0x55f3b5d815c0, count=count@entry=0x7fff5ebca184) at loader.c:1887
#10 0x000055f3b5323b0a in start_resource_list (mod_count=0x7fff5ebca184, resources=0x7fff5ebca1a0) at loader.c:1984
#11 0x000055f3b5323b0a in load_resource_list (load_order=0x7fff5ebca190, load_order=0x7fff5ebca190, mod_count=<synthetic pointer>) at loader.c:2166
#12 0x000055f3b5323b0a in load_modules () at loader.c:2379
#13 0x000055f3b525e266 in asterisk_daemon (isroot=<optimized out>, rungroup=<optimized out>, runuser=<optimized out>) at asterisk.c:4155
#14 0x000055f3b525e266 in main (argc=<optimized out>, argv=<optimized out>) at asterisk.c:3923
{noformat}
Comments:By: Asterisk Team (asteriskteam) 2020-10-09 11:39:31.112-0500

Thanks for creating a report! The issue has entered the triage process. That means the issue will wait in this status until a Bug Marshal has an opportunity to review the issue. Once the issue has been reviewed you will receive comments regarding the next steps towards resolution. Please note that log messages and other files should not be sent to the Sangoma Asterisk Team unless explicitly asked for. All files should be placed on this issue in a sanitized fashion as needed.

A good first step is for you to review the [Asterisk Issue Guidelines|https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines] if you haven't already. The guidelines detail what is expected from an Asterisk issue report.

Then, if you are submitting a patch, please review the [Patch Contribution Process|https://wiki.asterisk.org/wiki/display/AST/Patch+Contribution+Process].

Please note that once your issue enters an open state it has been accepted. As Asterisk is an open source project there is no guarantee or timeframe on when your issue will be looked into. If you need expedient resolution you will need to find and pay a suitable developer. Asking for an update on your issue will not yield any progress on it and will not result in a response. All updates are posted to the issue when they occur.

Please note that by submitting data, code, or documentation to Sangoma through JIRA, you accept the Terms of Use present at [https://www.asterisk.org/terms-of-use/|https://www.asterisk.org/terms-of-use/].

By: Kevin Harwell (kharwell) 2020-10-12 15:09:49.549-0500

When you say "upgraded our servers" do you mean you upgraded Asterisk, or the OS itself, or both?

If Asterisk what version did you upgrade from?

By: Dennis Haney (CineCine) 2020-10-16 04:48:09.795-0500

We took an existing VM from another hosting environment and copied to this. Then it started hanging.
I have tried both the Microsoft ODBC driver from https://docs.microsoft.com/en-us/sql/connect/odbc/linux-mac/installing-the-microsoft-odbc-driver-for-sql-server?view=sql-server-ver15#debian17
and the FreeTDS driver, they behave in the exact same way.
When asterisk is blocking it is also still possible to run the exact same query asterisk runs using the isql tool with no issue.
I have since removed the usage of ODBC, since I found out it was only used for authentication and later AMI was used for another kind of authentication, so can skip first stage by just setting allowGuest=>yes

By: Joshua C. Colp (jcolp) 2020-10-16 05:04:04.051-0500

What version of UnixODBC was present, and what was the res_odbc.conf configuration?

The issue itself doesn't seem to be in Asterisk, but in UnixODBC.

By: Dennis Haney (CineCine) 2020-10-18 21:16:55.507-0500

This should be the relevant odbc config files.
Originally we just used freetds, and didnt have the ms driver installed, but tried it to see if it was a driver issue, but no difference.
the sanity sql runs fine using the command line unixodbc tools.
unixodbc=2.3.7

By: Dennis Haney (CineCine) 2020-10-18 23:30:52.736-0500

Forgot to mention that the SQL is a 2014 latest SP (12.0.6118.4), setup with an Always On Failover Cluster Instance (FCI).
Tried connecting both to the virtual IP and the name, no difference.

By: Joshua C. Colp (jcolp) 2020-10-19 04:21:08.719-0500

You also mentioned that you upgraded and this started happening. What version was previously in use? Did anything else change?

By: Dennis Haney (CineCine) 2020-10-20 04:25:58.169-0500

The physical machines the VM was running on was upgraded, and we changed from VMware to Nutanix.
It was working the first couple of days after that change though.

By: Joshua C. Colp (jcolp) 2020-10-20 04:35:36.557-0500

Did you change Asterisk versions, or strictly underlying infrastructure?

By: Dennis Haney (CineCine) 2020-10-20 22:28:04.232-0500

We took a functioning VM from one hosting environment and restored in another hosting environment.
This was working perfectly for the first few days, until suddenly we got these hangs at startup.
No settings seems to have been changed on either the asterisk, or the database between it was working in the new environment until it didnt, but they seem very likely to have causality.

By: Joshua C. Colp (jcolp) 2020-10-21 04:07:50.740-0500

The only thing that comes to mind is native optimization. Do you deselect "BUILD_NATIVE" in the Compiler Flags of menuselect? It's possible changing environments changed the disclosed CPU in some way altering the environment and causing problems with any native optimizations of the previous build.

Failing that I'm not sure there's anything we can do, as the issue would seem to be environmental outside of Asterisk itself. Our usage of UnixODBC is really simple and even in your backtrace we just ask UnixODBC to connect, that's it. Going down to that layer to investigate would likely be the next step but we're not experts on it.

By: Asterisk Team (asteriskteam) 2020-11-04 12:00:01.349-0600

Suspended due to lack of activity. This issue will be automatically re-opened if the reporter posts a comment. If you are not the reporter and would like this re-opened please create a new issue instead. If the new issue is related to this one a link will be created during the triage process. Further information on issue tracker usage can be found in the Asterisk Issue Guidlines [1].

[1] https://wiki.asterisk.org/wiki/display/AST/Asterisk+Issue+Guidelines