Tuesday, January 23, 2007

Debugging mysql5 on Nexenta

Due to some very favorable benchmarking results, I am planning to migrate some of our production databases to Myqsl5 on Nexenta. Previously the database was running on a Debian Linux server, and the I/O subsystem performed much worse on that system.

I ran into a very strange problem with mysql5 under Nexenta, however. After a certain number of clients connected, sometimes the server would begin to refuse connections in a very strange way. It would accept the connection to the mysql port, and then immediately close the connection.

Fortunately, one of the other reasons I want to move to Nexenta is the more robust toolchain for troubleshooting just these kinds of problems. I started out by using 'truss' on thread 1 of the mysql daemon under the assumption that it was the thread responsible for managing incoming client connections - not a bad guess. Here is a trace of a mysql connection that works correctly vs one that breaks:

Works OK:

root@perftest-db01:~# truss -w all -p 6650/1
/1: pollsys(0x080473C0, 2, 0x00000000, 0x00000000) (sleeping...)
/1: pollsys(0x080473C0, 2, 0x00000000, 0x00000000) = 1
/1: fcntl(11, F_SETFL, FWRITE|FNONBLOCK) = 0
/1: accept(11, 0x08047948, 0x08047958, SOV_DEFAULT) = 57
/1: fcntl(11, F_SETFL, FWRITE) = 0
/1: sigaction(SIGCLD, 0x08047420, 0x080474A0) = 0
/1: getpid() = 6650 [6589]
/1: getpeername(57, 0xFEF67A90, 0x080474B8, SOV_DEFAULT) = 0
/1: getsockname(57, 0xFEF67A80, 0x080474B8, SOV_DEFAULT) = 0
/1: open("/etc/hosts.allow", O_RDONLY) = 58
/1: fstat64(58, 0x08046B20) = 0
/1: fstat64(58, 0x08046A50) = 0
/1: ioctl(58, TCGETA, 0x08046AEC) Err#25 ENOTTY
/1: read(58, " # / e t c / h o s t s".., 8192) = 677
/1: read(58, 0x504EA88C, 8192) = 0
/1: llseek(58, 0, SEEK_CUR) = 677
/1: close(58) = 0
/1: open("/etc/hosts.deny", O_RDONLY) = 58
/1: fstat64(58, 0x08046B20) = 0
/1: fstat64(58, 0x08046A50) = 0
/1: ioctl(58, TCGETA, 0x08046AEC) Err#25 ENOTTY
/1: read(58, " # / e t c / h o s t s".., 8192) = 901
/1: read(58, 0x504EA88C, 8192) = 0
/1: llseek(58, 0, SEEK_CUR) = 901
/1: close(58) = 0
/1: getsockname(57, 0x08047938, 0x08047958, SOV_DEFAULT) = 0
/1: fcntl(57, F_SETFL, (no flags)) = 0
/1: fcntl(57, F_GETFL) = 2
/1: fcntl(57, F_SETFL, FWRITE|FNONBLOCK) = 0
/1: setsockopt(57, ip, 3, 0x0804748C, 4, SOV_DEFAULT) = 0
/1: setsockopt(57, tcp, TCP_NODELAY, 0x0804748C, 4, SOV_DEFAULT) = 0
/1: time() = 1169599669
/1: lwp_kill(73, SIG#0) Err#3 ESRCH
/1: lwp_create(0x08047240, LWP_DETACHED|LWP_SUSPENDED, 0x08047464) = 243
/1: lwp_continue(243) = 0
/1: pollsys(0x080473C0, 2, 0x00000000, 0x00000000) (sleeping...)

Immediately closes connection:

root@perftest-db01:~# truss  -w all  -p 6650/1
/1: pollsys(0x080473C0, 2, 0x00000000, 0x00000000) (sleeping...)
/1: pollsys(0x080473C0, 2, 0x00000000, 0x00000000) = 1
/1: fcntl(11, F_SETFL, FWRITE|FNONBLOCK) = 0
/1: accept(11, 0x08047948, 0x08047958, SOV_DEFAULT) = 255
/1: fcntl(11, F_SETFL, FWRITE) = 0
/1: sigaction(SIGCLD, 0x08047420, 0x080474A0) = 0
/1: getpid() = 6650 [6589]
/1: getpeername(255, 0xFEF67A90, 0x080474B8, SOV_DEFAULT) = 0
/1: getsockname(255, 0xFEF67A80, 0x080474B8, SOV_DEFAULT) = 0
/1: open("/etc/hosts.allow", O_RDONLY) = 257
/1: close(257) = 0
/1: fxstat(2, 256, 0x08045DF8) = 0
/1: time() = 1169599778
/1: getpid() = 6650 [6589]
/1: putmsg(256, 0x080467B8, 0x080467C4, 0) = 0
/1: open("/var/run/syslog_door", O_RDONLY) = 257
/1: door_info(257, 0x08045C10) = 0
/1: getpid() = 6650 [6589]
/1: door_call(257, 0x08045C48) = 0
/1: close(257) = 0
/1: fxstat(2, 256, 0x080459B8) = 0
/1: time() = 1169599778
/1: getpid() = 6650 [6589]
/1: putmsg(256, 0x08046378, 0x08046384, 0) = 0
/1: open("/var/run/syslog_door", O_RDONLY) = 257
/1: door_info(257, 0x080457D0) = 0
/1: getpid() = 6650 [6589]
/1: door_call(257, 0x08045808) = 0
/1: close(257) = 0
/1: fxstat(2, 256, 0x08046A88) = 0
/1: time() = 1169599778
/1: getpid() = 6650 [6589]
/1: putmsg(256, 0x08047448, 0x08047454, 0) = 0
/1: open("/var/run/syslog_door", O_RDONLY) = 257
/1: door_info(257, 0x080468A0) = 0
/1: getpid() = 6650 [6589]
/1: door_call(257, 0x080468D8) = 0
/1: close(257) = 0
/1: shutdown(255, SHUT_RDWR, SOV_DEFAULT) = 0
/1: close(255) = 0
/1: pollsys(0x080473C0, 2, 0x00000000, 0x00000000) (sleeping...)

Looks like the main difference starts here:
/1:     open("/etc/hosts.allow", O_RDONLY)              = 257
/1: close(257) = 0
That explains a lot -- the "hosts.allow" file is part of the tcpwrappers system, which controls access to various daemons on the system based on access control rules set by the system administrator. No wonder I am getting a connection but then immediately getting booted. It is trying to open the hosts.allow file, but then is immediately closing it, vs actually reading and processing the file as seen in the working connection. Does the process not have enough filehandles?
root@perftest-db01:~# pfiles  6650 |head -2
6650: /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql
Current rlimit: 8192 file descriptors
Nope, doesn't look that way -- it's configured to use 8192 filehandles. My next clue was the file descriptor number that was returned by the "open" system call, 257. That's awfully near one of those magic "power of 2" boundaries. I started snooping around in google.

It turns out that under Solaris, and maybe *BSD also, the tcpwrappers library (libwrap) uses the "stdio" library to manage IO. This library does not understand file handles above 255, therefore, as the mysql server continues to collect client processes and open tables for reading, eventually this file descriptor boundary is crossed and calls to open "hosts.allow"appear to fail because they return too high a file descriptor number. tcpwrappers appears to fail closed, so since it cannot read the "hosts.allow" file, it denies access to the service by immediately closing the communication channel.

Fortunately, there is a fix. Giri Mandalika has a blog entry that references the issue and is a good resource on the problem. The solution is to use the extendedFILE library that's provided in Solaris Express 06/06 or later (So this is included in Nexenta Alpha 6, and possibly earlier):

root@perftest-db01:~# export LD_PRELOAD_32=/usr/lib/extendedFILE.so.1
root@perftest-db01:~# /etc/init.d/mysql restart

(Obviously I will also need to modify the /etc/init.d/mysql startup script to include the LD_PRELOAD_32). Now, I start up a test program to artificially create a bunch of connections to the database, and see what a truss looks like now:

root@perftest-db01:~# truss -w all -p 6846/1
/1: pollsys(0x08047390, 2, 0x00000000, 0x00000000) (sleeping...)
/1: pollsys(0x08047390, 2, 0x00000000, 0x00000000) = 1
/1: fcntl(11, F_SETFL, FWRITE|FNONBLOCK) = 0
/1: accept(11, 0x08047918, 0x08047928, SOV_DEFAULT) = 294
/1: fcntl(11, F_SETFL, FWRITE) = 0
/1: sigaction(SIGCLD, 0x080473F0, 0x08047470) = 0
/1: getpid() = 6846 [6785]
/1: getpeername(294, 0xFEF47A90, 0x08047488, SOV_DEFAULT) = 0
/1: getsockname(294, 0xFEF47A80, 0x08047488, SOV_DEFAULT) = 0
/1: open("/etc/hosts.allow", O_RDONLY) = 295
/1: fstat64(295, 0x08046AF0) = 0
/1: fstat64(295, 0x08046A20) = 0
/1: ioctl(295, TCGETA, 0x08046ABC) Err#25 ENOTTY
/1: read(295, " # / e t c / h o s t s".., 8192) = 677
/1: read(295, 0x5122F9D4, 8192) = 0
/1: llseek(295, 0, SEEK_CUR) = 677
/1: close(295) = 0
/1: open("/etc/hosts.deny", O_RDONLY) = 295
/1: fstat64(295, 0x08046AF0) = 0
/1: fstat64(295, 0x08046A20) = 0
/1: ioctl(295, TCGETA, 0x08046ABC) Err#25 ENOTTY
/1: read(295, " # / e t c / h o s t s".., 8192) = 901
/1: read(295, 0x5122F9D4, 8192) = 0
/1: llseek(295, 0, SEEK_CUR) = 901
/1: close(295) = 0
/1: getsockname(294, 0x08047908, 0x08047928, SOV_DEFAULT) = 0
/1: fcntl(294, F_SETFL, (no flags)) = 0
/1: fcntl(294, F_GETFL) = 2
/1: fcntl(294, F_SETFL, FWRITE|FNONBLOCK) = 0
/1: setsockopt(294, ip, 3, 0x0804745C, 4, SOV_DEFAULT) = 0
/1: setsockopt(294, tcp, TCP_NODELAY, 0x0804745C, 4, SOV_DEFAULT) = 0
/1: time() = 1169601081
/1: lwp_kill(273, SIG#0) Err#3 ESRCH
/1: lwp_create(0x08047210, LWP_DETACHED|LWP_SUSPENDED, 0x08047434) = 274
/1: lwp_continue(274) = 0
/1: pollsys(0x08047390, 2, 0x00000000, 0x00000000) (sleeping...)
As you can see above - the "open" command on the "hosts.allow" file is returning a filehandle greater than 255, but reading and processing the hosts.allow file proceeds normally, and the connection is accepted.

Yay for truss!

Technorati Tags: , , ,

No comments: