[ruby-core:112919] [Ruby master Bug#19412] Socket starts queueing and not responding after a certain amount of requests
From:
"brodock (Gabriel Mazetto) via ruby-core" <ruby-core@...>
Date:
2023-03-16 15:45:06 UTC
List:
ruby-core #112919
Issue #19412 has been updated by brodock (Gabriel Mazetto).
Some other updates. I've been trying to undestand the `TIME_WAIT` and why it does behave the way it doesn on MacOS (but not on Linux).
This response got me ideas to try setting `linger` on:
https://stackoverflow.com/questions/3757289/when-is-tcp-option-so-linger-0-required
This modified code now does "release" the socket immediately, which allows me to respond to that huge ammount of requests.
My understanding is that if I set linger to `1` that would mean 1 tick/cycle or 1 second depending on the OS (worst case would be 1 second) but setting that on MacOS brings the old behavior back of having `TIME_WAIT` all over the place.
```ruby
require 'socket'
PORT = 8080
BACKLOG = 64
LISTEN = '127.0.0.1'
LINGER = 0
@processed = 0
def handle_connection(connection, _addrinfo)
request = connection.gets
connection.write "HTTP/1.1 200\r\n"
connection.write "Content-Type: text/html\r\n"
connection.write "\r\n"
connection.write "Hello world! Current time is #{Time.now}"
@processed += 1
puts "#{request} - #{@processed}"
ensure
connection.setsockopt(Socket::Option.linger(true, 0))
connection.close
end
begin
socket = TCPServer.new(LISTEN, PORT)
socket.setsockopt(:SOCKET, :REUSEADDR, true)
socket.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY, 1)
socket.listen(BACKLOG)
puts "Requested queue size: #{BACKLOG} bigger then limit: #{Socket::SOMAXCONN}" if BACKLOG > Socket::SOMAXCONN
loop do
handle_connection(socket.accept, nil)
end
ensure
socket.close
end
```
----------------------------------------
Bug #19412: Socket starts queueing and not responding after a certain amount of requests
https://bugs.ruby-lang.org/issues/19412#change-102438
* Author: brodock (Gabriel Mazetto)
* Status: Open
* Priority: Normal
* ruby -v: ruby 3.2.1 (2023-02-08 revision 31819e82c8) [arm64-darwin22]
* Backport: 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN, 3.2: UNKNOWN
----------------------------------------
Here is a an example code:
```ruby
require 'socket'
PORT = 8080
BACKLOG = 50
LISTEN = '127.0.0.1'
def handle_connection(connection, _addrinfo)
request = connection.gets
puts request
connection.write "HTTP/1.1 200\r\n"
connection.write "Content-Type: text/html\r\n"
connection.write "\r\n"
connection.write "Hello world! Current time is #{Time.now}"
ensure
connection.close
end
begin
socket = Socket.new(:INET, :STREAM)
socket.bind(Addrinfo.tcp(LISTEN, PORT))
socket.listen(BACKLOG)
puts "Requested queue size: #{BACKLOG} bigger then limit: #{Socket::SOMAXCONN}" if BACKLOG > Socket::SOMAXCONN
loop do
listening, = IO.select([socket])
io, = listening
connection, addrinfo = io.accept
handle_connection(connection, addrinfo)
end
ensure
socket.close
end
```
This tries to simulate a TCP server that responds as if it was an HTTP server.
The amount of requests it can sustain seems to depend on the OS. On a Linux machine running ubuntu 20.04 I get something around 7.6K to 7.8K until it stops responding.
```
$ uname -a
Linux ... 5.17.5-x86_64-linode154 #1 SMP PREEMPT Mon May 2 15:07:22 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux
$ ab -n 20000 -c 50 http://127.0.0.1:8080/
This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking 127.0.0.1 (be patient)
Completed 2000 requests
Completed 4000 requests
Completed 6000 requests
apr_pollset_poll: The timeout specified has expired (70007)
Total of 7883 requests completed
```
on MacOS Ventura I get around 16K:
```
$ uname -a
Darwin ... 22.2.0 Darwin Kernel Version 22.2.0: Fri Nov 11 02:04:44 PST 2022; root:xnu-8792.61.2~4/RELEASE_ARM64_T8103 arm64
$ ab -n 20000 -c 50 http://127.0.0.1:8080/
This is ApacheBench, Version 2.3 <$Revision: 1901567 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking 127.0.0.1 (be patient)
Completed 2000 requests
Completed 4000 requests
Completed 6000 requests
Completed 8000 requests
Completed 10000 requests
Completed 12000 requests
Completed 14000 requests
Completed 16000 requests
apr_socket_recv: Operation timed out (60)
Total of 16375 requests completed
```
in both cases when that limit reaches, if I abort the program (CTRL-C) and try to run it again it takes a while until the port is released:
```
...
GET / HTTP/1.0
GET / HTTP/1.0
GET / HTTP/1.0
^Csocketserver.rb:29:in `select': Interrupt
from socketserver.rb:29:in `block in <main>'
from socketserver.rb:28:in `loop'
from socketserver.rb:28:in `<main>'
$ ruby socketserver.rb
socketserver.rb:23:in `bind': Address already in use - bind(2) for 127.0.0.1:8080 (Errno::EADDRINUSE)
from socketserver.rb:23:in `<main>'
```
After killing the process it seems no process is holding the port:
```
lsof -wni tcp:8080
```
Running the command above does not return anything (it does when the program is still running).
I think we may be failing to release something when interacting with the Socket on syscalls/kernel level and we endup filling up some queue/buffer that eventually gets freed.
---Files--------------------------------
clipboard-202303161214-zhq7o.png (1.02 MB)
clipboard-202303161214-7xx91.png (683 KB)
clipboard-202303161215-5wjbo.png (1.02 MB)
--
https://bugs.ruby-lang.org/
______________________________________________
ruby-core mailing list -- ruby-core@ml.ruby-lang.org
To unsubscribe send an email to ruby-core-leave@ml.ruby-lang.org
ruby-core info -- https://ml.ruby-lang.org/mailman3/postorius/lists/ruby-core.ml.ruby-lang.org/