ruby_connect's error handling

From: Martin Dorey <mdorey@...>
Date: 2004-05-06 02:01:22 UTC
List: ruby-core #2851
This site:

http://www.eleves.ens.fr:8080/home/madore/computers/connect-intr.html

and the illuminating thread that it links to:

http://www.google.com/groups?threadm=tON4XI9hD5aY%24comp.unix.programmer%40clipper.ens.fr

discuss how to call connect(2) on blocking and non-blocking sockets
according to the Single Unix Specification and various implementations.  The
consensus seems to be that the portable way to do it is to:

connect(fd)
  if EINPROGRESS (non-blocking only) or EINTR
    select(fd)
      - until bored or select indicates the fd is ready

My linux box's man 2 connect page (dating from Linux 2.2, 1998-10-03)
expands on how to then check for errors:

  EINPROGRESS
    The  socket  is  non-blocking  and the connection cannot be com-
    pleted immediately.  It is possible to select(2) or poll(2)  for
    completion  by  selecting  the  socket for writing. After select
    indicates writability, use getsockopt(2) to  read  the  SO_ERROR
    option  at  level  SOL_SOCKET  to determine whether connect com-
    pleted  successfully  (SO_ERROR  is  zero)   or   unsuccessfully
    (SO_ERROR  is one of the usual error codes listed here, explain-
    ing the reason for the failure).

The Single Unix Specification, as quoted by the web page above, states that
a subsequent connect attempt on a socket which is already being connected
should return EALREADY.  However, the code in ext/socket/socket.c says that
this isn't always true in practice on non-blocking sockets (the only kind
that should return EINPROGRESS):

		    /*
		     * connect() after EINPROGRESS returns EINVAL on
		     * some platforms, need to check true error
		     * status.
		     */

I can add to that from my experience today, when my Linux box returned
EPIPE in this situation.  Here are what I hope are all of the relevant parts
of an strace of a Ruby program running on:

Linux doozer 2.6.5 #1 SMP Tue Apr 6 16:27:39 BST 2004 i686 GNU/Linux
glibc 2.3.2.ds1-12
ruby 1.8.1 (2004-02-03) [i386-linux]

12336 socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 25
12336 fcntl64(25, F_GETFL)              = 0x2 (flags O_RDWR)
12336 fcntl64(25, F_SETFL, O_RDWR|O_NONBLOCK) = 0
12336 connect(25, {sa_family=AF_INET, sin_port=htons(2003), sin_addr=inet_addr("10.1.6.22")}, 16) = -1 EINPROGRESS (Operation now in progress)
...
12336 gettimeofday({1083765243, 419797}, NULL) = 0
12336 select(27, [], [25], [], {0, 0})  = 0 (Timeout)
...
12336 select(27, [], [25], [], {0, 0})  = 1 (out [25], left {0, 0})
...
12336 connect(25, {sa_family=AF_INET, sin_port=htons(2003), sin_addr=inet_addr("10.1.6.22")}, 16) = -1 EPIPE (Broken pipe)
12336 fcntl64(25, F_SETFL, O_RDWR)      = 0
12336 close(25)                         = 0

I'm by no means a sockets expert, but I haven't found any suggestion in any
man page or on Google that connect needs to be called again on a socket for
which a connect is already in progress, whether the first connect returned
EINPROGRESS or EINTR.  The comment above and my results today suggest that,
indeed, connect shouldn't be called a second time, at least on a
non-blocking socket.  The thread linked to above was started by someone who
thinks that he should be able to call connect multiple times on a blocking
socket if he gets EINTR.  He has been unable to find any portable way of
making that work.

There are three other things that I wonder about in the existing
ruby_connect:

Firstly, I wonder if this whole case would be necessary if connect weren't
called more than once on non-blocking sockets.  Perhaps the only reason that
the wait was necessary was that the first connect hadn't yet finished when a
second connect was tried?

#if WAIT_IN_PROGRESS > 0
	      case EINVAL:
		if (wait_in_progress-- > 0) {
		    /*
		     * connect() after EINPROGRESS returns EINVAL on
		     * some platforms, need to check true error
		     * status.
		     */
		    sockerrlen = sizeof(sockerr);
		    status = getsockopt(fd, SOL_SOCKET, SO_ERROR,
					(void *)&sockerr, &sockerrlen);
		    if (!status && !sockerr) {
			struct timeval tv = {0, 100000};
			rb_thread_wait_for(tv);
			continue;
		    }
		    status = -1;
		    errno = sockerr;
		}
		break;
#endif

Secondly, I wonder if this case was intended to break out of the for(;;)
loop as well as the switch?

#ifdef EISCONN
	      case EISCONN:
		status = 0;
		errno = 0;
		break;
#endif
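
To illustrate why I ask: in C, a break inside a switch leaves only the switch, not an enclosing for(;;).  Whether that matters in ruby_connect depends on what follows the switch inside the loop, which I haven't quoted here.  The following is a standalone demonstration of the language rule only (iterations_until_exit is a hypothetical name and the loop body is made up, not taken from socket.c):

```c
/*
 * Hypothetical demonstration: returns how many passes through the
 * for(;;) loop are made before it exits.
 */
static int
iterations_until_exit(void)
{
    int status = -1;
    int iterations = 0;

    for (;;) {
        iterations++;
        if (status >= 0)
            break;              /* leaves the for(;;) loop */
        switch (status) {
          case -1:
            status = 0;
            break;              /* leaves only the switch */
        }
        /* nothing follows the switch, so the loop goes round again */
    }
    return iterations;
}
```

Here the break in the case needs a second pass through the loop before the function can return, so iterations_until_exit() returns 2 rather than 1.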

Thirdly, the current ruby_connect code doesn't call getsockopt for SO_ERROR
on Linux, saying:

#ifdef __linux__
/* returns correct error */
#define WAIT_IN_PROGRESS 0
#endif

The quote from the manual page above, however, indicates that SO_ERROR is
the supported way of getting the error code on Linux.

Perhaps the code in ruby_connect could be simplified if the connect call
were outside the loop.  Perhaps it could even be simplified this far:

#ifdef O_NDELAY
# define NONBLOCKING O_NDELAY
#else
#ifdef O_NBIO
# define NONBLOCKING O_NBIO
#else
# define NONBLOCKING O_NONBLOCK
#endif
#endif

static int
ruby_set_non_blocking_fd_mode(fd)
    int fd;
{
#ifdef HAVE_FCNTL
    int mode;

    mode = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, mode|NONBLOCKING);
    return mode;
#else
    return 0;
#endif
}

static void
ruby_restore_fd_mode(fd, mode)
    int fd;
    int mode;
{
#ifdef HAVE_FCNTL
    fcntl(fd, F_SETFL, mode);
#endif
}

static int
ruby_get_so_error(fd)
    int fd;
{
    int status;
/*
 * Although this is non-standard, perhaps it is reasonable to assume
 * that, if defined, SO_ERROR works the same way on all systems.
 * Previously we assumed this for __CYGWIN__ and __APPLE__ and it
 * appears true for __linux__ and BSD too.
 */
#ifdef SO_ERROR
    int sockerr, sockerrlen;

    sockerrlen = sizeof(sockerr);
    status = getsockopt(fd, SOL_SOCKET, SO_ERROR, (void *)&sockerr,
                        &sockerrlen);
    if (status == 0 && sockerr != 0) {
        status = -1;
        errno = sockerr;
    }
#else
    /*
     * http://cr.yp.to/docs/connect.html suggests that we can't do
     * better than just letting io fail on older systems.
     */
    status = 0;
#endif
    return status;
}

static int
ruby_connect(fd, sockaddr, len, socks)
    int fd;
    struct sockaddr *sockaddr;
    int len;
    int socks;
{
    int nonblocking;
    int mode;
    int status;

#if defined(HAVE_FCNTL) && defined(EINPROGRESS)
    nonblocking = 1;
#else
    nonblocking = 0;
#endif

#ifdef SOCKS5
    if (socks) {
        nonblocking = 0;
    }
#endif

    if (nonblocking) {
        mode = ruby_set_non_blocking_fd_mode(fd);
    }
#if defined(SOCKS) && !defined(SOCKS5)
    if (socks) {
        status = Rconnect(fd, sockaddr, len);
    }
    else
#endif
    {
        status = connect(fd, sockaddr, len);
    }
    while (status < 0 && (
#ifdef EINPROGRESS
      errno == EINPROGRESS ||
#endif
      errno == EINTR)) {
        thread_write_select(fd);
        status = ruby_get_so_error(fd);
    }
    if (nonblocking) {
        ruby_restore_fd_mode(fd, mode);
    }
    return status;
}

-- 
Martin, BlueArc Engineering 

