Python

lucifer_jr, 2020-08-07 12:44:43

Why does requests-per-second performance drop?

In short: there is a service rewritten in Japronto. If it does nothing except return a conditional "Hello world", I get ~132,000 req/s with 48 connections. Then I add two API requests (to TensorFlow Serving), and TensorFlow Serving itself easily sustains 3000-4000 req/s. The requests were implemented with the requests library, and I get 150 requests per second!!! Such a wild drawdown. I rewrote the API calls with pycurl and the situation improved slightly: 170 req/s.

How can this happen? How can the service be sped up? What could the architectural mistake be?

Right now much of the code makes no sense and is cluttered with hardcoded values, but the essence of the problem is performance. I have also tried Flask and Sanic.

import json
from io import BytesIO

import numpy as np
import pycurl


def calculate_response(request):
    X_main = <large np.array, hardcoded for now>
    X_rnn_addi = <small np.array, hardcoded for now>
    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, 'http://localhost/v1/models/model_name:predict')
    curl.setopt(pycurl.HTTPHEADER, ['Accept: application/json',
                                    'Content-Type: application/json'])
    curl.setopt(pycurl.POST, 1)
    # json.dumps cannot serialize numpy arrays; convert them with .tolist()
    data = json.dumps({
        "inputs": {
            "X_main": X_main.tolist(),
            "X_rnn_addi": X_rnn_addi.tolist()
        }
    })
    # pycurl reads the request body as bytes, so encode it and use BytesIO
    body_as_file_object = BytesIO(data.encode('utf-8'))
    response_as_file_object = BytesIO()
    curl.setopt(pycurl.WRITEDATA, response_as_file_object)
    curl.setopt(pycurl.READDATA, body_as_file_object)
    curl.setopt(pycurl.POSTFIELDSIZE, len(data))
    curl.perform()
    resp = response_as_file_object.getvalue()

    X_spmts = <small np.array, hardcoded for now>
    lnzspmts = np.array([])  # np.array() with no argument raises TypeError
    lnzimobs = np.array([])
    data = json.dumps({
        "inputs": {
            "X_rnn_addi": np.append(lnzspmts, lnzimobs)[np.newaxis, :].tolist(),
            "X_spmts": X_spmts[np.newaxis, :, np.newaxis].tolist()
        }
    })

    curl.setopt(pycurl.URL, 'http://localhost/v1/models/model_name:predict')
    body_as_file_object = BytesIO(data.encode('utf-8'))
    response_as_file_object = BytesIO()
    curl.setopt(pycurl.WRITEDATA, response_as_file_object)
    curl.setopt(pycurl.READDATA, body_as_file_object)
    curl.setopt(pycurl.POSTFIELDSIZE, len(data))
    curl.perform()
    resp = response_as_file_object.getvalue()
    curl.close()
    response = request.Response(json={
        "rec_id": 'lsdkf1213',
    }, code=200 if 0 < 0.5 < 1 else 202)  # placeholder condition, always true
    return response


Here are the benchmark results with an empty service:
(base) [[email protected] tmp]# bombardier -c 48 -n 10000 --method=POST --body-file=/tmp/body_file.txt http://localhost:4229
Bombarding http://localhost:4229 with 10000 request(s) using 48 connection(s)
 10000 / 10000 [============================================================================================================================] 100.00% 49025/s 0s
Done!
Statistics        Avg      Stdev        Max
  Reqs/sec    132600.72   63939.29  182236.34
  Latency      357.94us   709.83us    17.01ms
  HTTP codes:
    1xx - 0, 2xx - 10000, 3xx - 0, 4xx - 0, 5xx - 0
    others - 0
  Throughput:   363.68MB/s


Results with the two API requests added (code above):
(base) [[email protected] tmp]# bombardier -c 48 -n 10000 --method=POST --body-file=/tmp/body_file.txt http://localhost:4229
Bombarding http://localhost:4229 with 10000 request(s) using 48 connection(s)
 10000 / 10000 [=============================================================================================================================] 100.00% 170/s 58s
Done!
Statistics        Avg      Stdev        Max
  Reqs/sec       171.05     125.58    1150.73
  Latency      280.16ms    92.62ms      1.27s
  HTTP codes:
    1xx - 0, 2xx - 10000, 3xx - 0, 4xx - 0, 5xx - 0
    others - 0
  Throughput:   497.85KB/s
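The numbers above are consistent with Little's law: with 48 connections each waiting on two sequential blocking upstream calls, throughput is bounded by concurrency divided by latency, no matter how fast the framework itself is. A quick sanity check (the 48 connections and ~280 ms figures are taken straight from the benchmark output):

```python
# Little's law: throughput ceiling = concurrency / latency.
# 48 bombardier connections, each request averaging ~280 ms because it
# waits on two sequential blocking TF Serving calls:
connections = 48
avg_latency_s = 0.280  # 280.16 ms average latency from the output above

max_rps = connections / avg_latency_s
print(round(max_rps))  # ~171, matching the measured 171.05 Reqs/sec
```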

Sergey Gornostaev, 2020-08-07
@lucifer_jr

You need to use asynchronous handlers and an asynchronous library for the requests to external services. You are simply stopping the event loop with a blocking call, hence the drawdown.
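A stdlib-only sketch of this point, with `time.sleep` standing in for a blocking requests/pycurl call and `asyncio.sleep` for an awaitable client: ten concurrent handlers finish in roughly one latency period when they await, but serialize into ten periods when they block the loop.

```python
import asyncio
import time

DELAY = 0.05  # stand-in for one upstream call's latency

async def blocking_handler():
    time.sleep(DELAY)  # blocks the whole event loop, like requests/pycurl

async def async_handler():
    await asyncio.sleep(DELAY)  # yields to the loop, like an aiohttp call

def run_many(handler, n=10):
    async def main():
        await asyncio.gather(*(handler() for _ in range(n)))
    start = time.perf_counter()
    asyncio.run(main())
    return time.perf_counter() - start

blocking_time = run_many(blocking_handler)  # ~n * DELAY: calls serialize
async_time = run_many(async_handler)        # ~DELAY: calls overlap
print(f"blocking: {blocking_time:.2f}s, async: {async_time:.2f}s")
```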

Dimonchik, 2020-08-07
@dimonchik2013

Replace Japronto with Flask/Django,
or replace the HTTP client with aiohttp.
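If rewriting the client with aiohttp is too invasive, a stdlib alternative (my addition, not from the answer above) is to keep the blocking client code but push it off the event loop into a thread pool with `asyncio.to_thread` (Python 3.9+), so concurrent handlers no longer serialize. The `blocking_predict` stub below is hypothetical and only simulates network latency:

```python
import asyncio
import time

def blocking_predict(payload):
    """Stand-in for a blocking requests/pycurl call to TF Serving."""
    time.sleep(0.05)  # simulated network latency
    return {"outputs": payload}

async def handler(i):
    # to_thread runs the blocking call in a worker thread, leaving the
    # event loop free to service other requests in the meantime
    return await asyncio.to_thread(blocking_predict, i)

async def main():
    start = time.perf_counter()
    results = await asyncio.gather(*(handler(i) for i in range(10)))
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(main())
# 10 calls complete in roughly one latency period, not ten
print(f"{len(results)} calls in {elapsed:.2f}s")
```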
