ruby
Vayladion Gognazdiak, 2020-11-18 18:07:38

What is the correct way to use HTTP persistent connections in Ruby?

Good day.

The task: query an external API that returns a specific set of data for each date up to 180+ days ahead of the current date. Each request is a POST with an XML body that contains 3 parameters: "FROM CITY", "TO CITY", "DATE".
The API supports processing multiple destinations (up to 40) on the same date in one request.
The API supports multiple connections (up to 10 simultaneous) and multiple requests (up to 100 per second).
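
One way to stay under the stated limit of 100 requests per second is to space requests out with a small shared throttle. The sketch below is only an assumption about how that could look; it is not part of the API or of net-http-persistent. The `Throttle` class and its `wait` method are hypothetical names, and it coordinates threads inside a single process only, so forked processes would each need their own share of the limit.

```ruby
# Hypothetical helper: hands out time slots at least 1/rate seconds apart.
class Throttle
  def initialize(rate_per_sec)
    @interval  = 1.0 / rate_per_sec
    @mutex     = Mutex.new
    @next_slot = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end

  # Blocks the calling thread until its slot arrives.
  def wait
    delay = @mutex.synchronize do
      now  = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      slot = [@next_slot, now].max   # never schedule in the past
      @next_slot = slot + @interval  # reserve the following slot
      slot - now
    end
    sleep(delay) if delay > 0
  end
end

# Usage idea: create one Throttle.new(100) shared by all threads and call
# throttle.wait right before each request is sent.
```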

Since there are approximately 75_000 directions, the number of requests is 13_500_000 if each direction is polled for every day across 180 days from the current date.
To save time on polling, it was decided to use an HTTP persistent connection, which avoids reopening the connection for each request, and to request 40 directions at once. The work is also spread across threads / forks.
The HTTP client is the gem 'net-http-persistent', '4.0.0'.
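
For a rough sense of scale, the figures above can be plugged into a few lines of Ruby. This sketch assumes that any 40 directions for one date can share a request and that the 100 requests per second limit is the only bottleneck; the constant and variable names are illustrative.

```ruby
# Figures taken from the question.
DIRECTIONS = 75_000
DAYS       = 180
BATCH_SIZE = 40    # directions per request
MAX_RATE   = 100   # requests per second allowed by the API

pairs           = DIRECTIONS * DAYS                      # direction/date pairs
batches_per_day = (DIRECTIONS.to_f / BATCH_SIZE).ceil    # round up for a partial batch
requests        = batches_per_day * DAYS                 # batched requests in total
min_seconds     = requests.to_f / MAX_RATE               # lower bound at the rate limit

puts "pairs:       #{pairs}"
puts "requests:    #{requests}"
puts "min minutes: #{(min_seconds / 60).round(2)}"
```

Under these assumptions, batching cuts 13_500_000 single-direction requests down to 337_500, which takes at least about 56 minutes at the rate limit.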

The question: how do I do this correctly? Polling with a Python script shows much better performance.
In Ruby, it is done like this:

dates = ['2020-11-10', '2020-11-11', '2020-11-12', .... ]
directions.each_slice(7500) do |slice|
  fork do
    fork_counter = 10
    Signal.trap("CLD")  { fork_counter += 1 } 
    #
    uri = URI("https://api.url/")
    http = Net::HTTP::Persistent.new(name: "slice_#{rand(1..10000)}")
    #
    http.max_requests = 100000000
    http.keep_alive = 600
    http.read_timeout = 5
    http.reuse_ssl_sessions = true
    #
    post = Net::HTTP::Post.new(uri)
    #
    slice.each_slice(40) do |directions_to_ask|
      Process.wait if fork_counter <= 0
      fork_counter -= 1
      headers = { 'Content-Type' => 'text/xml', 'Accept-Encoding' => 'gzip', 'Connection' => 'Keep-Alive' }
      # []= overwrites, so the reused request object does not pile up duplicate headers
      headers.each { |k, v| post[k] = v }
      segments = directions_to_ask.map { |x|
        "<OriginDestination date='#{date}' origin='#{x[0]}' destination='#{x[1]}'></OriginDestination>"
      }
      data = <<-SOAP
        <SOAP-ENV:Envelope>
         .....
         .....
          <SOAP-ENV:Body><SchedulesAvailability>#{segments.join("\n")}</SchedulesAvailability></SOAP-ENV:Body>
        </SOAP-ENV:Envelope>
      SOAP

      post.body = data
      response = http.request(uri, post)
    end
    http.shutdown
  end
end
Process.waitall


What can be fixed/improved?
Perhaps there is an error. Thanks in advance for your reply.


2 answer(s)
Zaporozhchenko Oleg, 2020-11-18
@c3gdlk

Try setting up a connection pool first, and then take a free connection from it on each loop iteration. Also, a thread is faster than a fork. Forking only makes sense if each process runs a lot of threads, and you have a maximum of 10.
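
A minimal sketch of that suggestion, assuming 10 worker threads that each open one keep-alive connection and pull batches from a shared queue. It uses the standard library's Net::HTTP rather than net-http-persistent; `poll_batches`, `connect`, and `send_batch` are hypothetical names, and the URL is the placeholder from the question.

```ruby
require "net/http"
require "uri"

# Runs every batch through `workers` threads. Each thread builds one client
# with `connect` and reuses it for all the batches it handles.
def poll_batches(batches, workers: 10, connect:, send_batch:)
  queue = Queue.new
  batches.each { |batch| queue << batch }
  queue.close                      # pop returns nil once the queue is drained

  results = Queue.new
  threads = Array.new(workers) do
    Thread.new do
      client = connect.call        # one connection per thread
      while (batch = queue.pop)
        results << send_batch.call(client, batch)
      end
    end
  end
  threads.each(&:join)             # re-raises any error from a worker
  Array.new(results.size) { results.pop }
end

# Assumed wiring for a real run (nothing is sent until poll_batches is called):
uri = URI("https://api.url/")
connect = -> { Net::HTTP.start(uri.host, uri.port, use_ssl: true) }
send_batch = lambda do |http, xml|
  post = Net::HTTP::Post.new(uri, "Content-Type" => "text/xml")
  post.body = xml
  http.request(post).code          # same open connection on every call
end
# statuses = poll_batches(xml_bodies, connect: connect, send_batch: send_batch)
```

Results come back in completion order, not submission order, so a real script would probably return the batch together with its response. Each connection should also be closed with `finish` once its worker is done.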

Roman Mirilaczvili, 2020-11-30
@2ord

"In Ruby, it is done like this:"
That is quite a contraption.
Try concurrent-ruby or async-http.
If I find the code, I'll add it.
As for fork, Oleg Zaporozhchenko's point is correct.
