Why does requests-per-second performance drop?
In short: there is a service rewritten in japronto. If it does nothing except return a hardcoded "Hello world", I get ~132,000 req/sec with 48 connections. Then I add two calls to an API (TensorFlow Serving); TensorFlow Serving itself easily sustains 3,000-4,000 req/sec. The calls are made with the requests library, and I get 150 requests per second!!! A wild drop. I rewrote the API calls with pycurl and the situation improved slightly: ~170 req/sec.
How can this happen? How can the service be sped up? What is the architectural mistake?
Right now much of the code makes no sense and is cluttered with hardcoded values, but the essence of the problem is performance. I have also tried Flask and Sanic.
import json
from io import BytesIO, StringIO

import numpy as np
import pycurl


def calculate_response(request):
    X_main = <large np.array, hardcoded for now>
    X_rnn_addi = <small np.array, hardcoded for now>

    # First request to TensorFlow Serving
    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, 'http://localhost/v1/models/model_name:predict')
    curl.setopt(pycurl.HTTPHEADER, ['Accept: application/json',
                                    'Content-Type: application/json'])
    curl.setopt(pycurl.POST, 1)
    data = json.dumps({
        "inputs": {
            "X_main": X_main,        # numpy arrays need .tolist() to be JSON-serializable
            "X_rnn_addi": X_rnn_addi
        }
    })
    body_as_file_object = StringIO(data)
    response_as_file_object = BytesIO()
    curl.setopt(curl.WRITEDATA, response_as_file_object)
    curl.setopt(pycurl.READDATA, body_as_file_object)
    curl.setopt(pycurl.POSTFIELDSIZE, len(data))
    curl.perform()  # blocking network call
    resp = response_as_file_object.getvalue()

    # Second request to TensorFlow Serving
    X_spmts = <small np.array, hardcoded for now>
    lnzspmts = np.array([])  # np.array() with no argument raises TypeError
    lnzimobs = np.array([])
    data = json.dumps({
        "inputs": {
            "X_rnn_addi": np.append(lnzspmts, lnzimobs)[np.newaxis, :].tolist(),
            "X_spmts": X_spmts[np.newaxis, :, np.newaxis].tolist()
        }
    })
    curl.setopt(pycurl.URL, 'http://localhost/v1/models/model_name:predict')
    body_as_file_object = StringIO(data)
    response_as_file_object = BytesIO()
    curl.setopt(curl.WRITEDATA, response_as_file_object)
    curl.setopt(pycurl.READDATA, body_as_file_object)
    curl.setopt(pycurl.POSTFIELDSIZE, len(data))
    curl.perform()  # blocking network call
    resp = response_as_file_object.getvalue()
    curl.close()

    response = request.Response(json={
        "rec_id": 'lsdkf1213',
    }, code=200 if ((0.5 > 0) | (0.5 < 1)) else 202)  # this condition is always True
    return response
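One detail worth noting: TensorFlow Serving's REST predict endpoint expects plain JSON, so numpy arrays must be converted with `.tolist()` before `json.dumps` (the second request above does this, the first passes raw arrays, which `json.dumps` cannot serialize). A minimal sketch of building the payload, with nested lists standing in for the hardcoded arrays:

```python
import json

# Plain nested lists standing in for X_main.tolist() / X_rnn_addi.tolist();
# the shapes here are made up for illustration.
payload = json.dumps({
    "inputs": {
        "X_main": [[1.0, 2.0], [3.0, 4.0]],
        "X_rnn_addi": [0.5, 0.25],
    }
})

# The round trip preserves structure, so the server sees ordinary JSON arrays.
decoded = json.loads(payload)
print(decoded["inputs"]["X_main"][1][1])  # 4.0
```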
(base) [[email protected] tmp]# bombardier -c 48 -n 10000 --method=POST --body-file=/tmp/body_file.txt http://localhost:4229
Bombarding http://localhost:4229 with 10000 request(s) using 48 connection(s)
10000 / 10000 [============================================================================================================================] 100.00% 49025/s 0s
Done!
Statistics Avg Stdev Max
Reqs/sec 132600.72 63939.29 182236.34
Latency 357.94us 709.83us 17.01ms
HTTP codes:
1xx - 0, 2xx - 10000, 3xx - 0, 4xx - 0, 5xx - 0
others - 0
Throughput: 363.68MB/s
(base) [[email protected] tmp]# bombardier -c 48 -n 10000 --method=POST --body-file=/tmp/body_file.txt http://localhost:4229
Bombarding http://localhost:4229 with 10000 request(s) using 48 connection(s)
10000 / 10000 [=============================================================================================================================] 100.00% 170/s 58s
Done!
Statistics Avg Stdev Max
Reqs/sec 171.05 125.58 1150.73
Latency 280.16ms 92.62ms 1.27s
HTTP codes:
1xx - 0, 2xx - 10000, 3xx - 0, 4xx - 0, 5xx - 0
others - 0
Throughput: 497.85KB/s
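The second run's numbers are internally consistent: with 48 concurrent connections and ~280 ms average latency, Little's law (throughput ≈ concurrency / latency) predicts almost exactly the observed rate, which means the handler is fully serialized by its blocking calls:

```python
# Little's law: throughput = concurrency / latency.
# Numbers taken from the bombardier output above.
concurrency = 48
latency_s = 0.28016          # 280.16 ms average latency
throughput = concurrency / latency_s
print(round(throughput, 1))  # ~171.3 req/sec, matching "Reqs/sec 171.05"
```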
You need to use asynchronous handlers and an asynchronous HTTP client for the calls to external services. A blocking call simply stalls the event loop, and that is what causes the drop.
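A minimal sketch of the principle, using `asyncio.sleep` to stand in for a ~50 ms TensorFlow Serving call (in a real service you would use an async HTTP client such as aiohttp instead of requests/pycurl; the function names here are illustrative):

```python
import asyncio
import time

# Stand-in for one TensorFlow Serving call: asyncio.sleep simulates
# ~50 ms of network latency without blocking the event loop.
async def fake_predict() -> str:
    await asyncio.sleep(0.05)
    return '{"outputs": []}'

async def handle_request() -> str:
    # The handler awaits two model calls, like the original code,
    # but the event loop keeps serving other requests meanwhile.
    await fake_predict()
    return await fake_predict()

async def main() -> float:
    start = time.perf_counter()
    # 48 "connections" in flight at once, as in the bombardier run.
    await asyncio.gather(*(handle_request() for _ in range(48)))
    return time.perf_counter() - start

elapsed = asyncio.run(main())
# All 48 requests overlap: total time is roughly two sleeps (~0.1 s),
# not 48 * 2 * 0.05 s = 4.8 s as with fully blocking calls.
print(f"{elapsed:.2f}s")
```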