How to optimize an API for a Python parser?
Hello, friends! As a warm-up exercise, I'm writing a small multi-threaded parser, along with an API for accessing it over the web. The resource I'll be parsing provides its own API, but it is heavily restricted for security reasons, so I want to parse everything myself and serve the data as XML/JSON. As tools I'll use requests + lxml + PostgreSQL + nginx + uWSGI, plus the standard threading module for multi-threaded fetching and page parsing. The question is how to cache data in the database so that a repeated request is served from the cache. Should I take the Last-Modified header from the server's response and compare it on each new request, so that the cache stays correct?
Thank you.
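A minimal sketch of the conditional-request caching the question describes, assuming a hypothetical PostgreSQL table page_cache(url TEXT PRIMARY KEY, last_modified TEXT, body TEXT); the connection string is a placeholder. The idea: store the Last-Modified value alongside the cached body, send it back as If-Modified-Since on the next fetch, and serve the cached copy when the server answers 304 Not Modified:

```python
import psycopg2
import requests

conn = psycopg2.connect("dbname=parser")  # placeholder connection string


def fetch_cached(url: str) -> str:
    """Return the page body, revalidating the cache via Last-Modified."""
    with conn.cursor() as cur:
        cur.execute(
            "SELECT last_modified, body FROM page_cache WHERE url = %s", (url,)
        )
        row = cur.fetchone()

    headers = {}
    if row and row[0]:
        # Ask the server whether the page changed since we cached it.
        headers["If-Modified-Since"] = row[0]

    resp = requests.get(url, headers=headers, timeout=10)
    if resp.status_code == 304 and row:
        return row[1]  # not modified: serve the cached copy

    # New or changed page: upsert the fresh body and its validator.
    with conn.cursor() as cur:
        cur.execute(
            """INSERT INTO page_cache (url, last_modified, body)
               VALUES (%s, %s, %s)
               ON CONFLICT (url) DO UPDATE
               SET last_modified = EXCLUDED.last_modified,
                   body = EXCLUDED.body""",
            (url, resp.headers.get("Last-Modified"), resp.text),
        )
    conn.commit()
    return resp.text
```

Note that not every server sends Last-Modified; when it is absent, the ETag header with If-None-Match works the same way, and if neither exists you have to fall back to a time-to-live of your own choosing.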
Rethink your outlook on life: why step on a rake that someone has already stepped on? Look at ready-made solutions, for example Scrapy, an excellent scraping framework that comes with its own server.
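To make the suggestion concrete, here is a minimal spider sketch; the domain and CSS selectors are placeholders, not the asker's actual resource:

```python
import scrapy


class PageSpider(scrapy.Spider):
    name = "pages"
    start_urls = ["https://example.com/catalog"]  # placeholder start page

    def parse(self, response):
        # Extract one record per listed item.
        for item in response.css("div.item"):
            yield {
                "title": item.css("a::text").get(),
                "url": response.urljoin(item.css("a::attr(href)").get()),
            }
        # Follow pagination, if the page has it.
        next_page = response.css("a.next::attr(href)").get()
        if next_page:
            yield response.follow(next_page, callback=self.parse)
```

Relevant to the original question: Scrapy ships an HTTP cache middleware, and enabling HTTPCACHE_ENABLED = True with HTTPCACHE_POLICY = "scrapy.extensions.httpcache.RFC2616Policy" in settings.py makes it honor Last-Modified/ETag revalidation without hand-rolled caching code.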
As tools, I will use ... the standard threading module for multi-threaded requests and page parsing.
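For reference, a minimal sketch of that approach using only the standard library's threading and queue modules plus requests (the URLs are placeholders). Threads suit this workload because fetching pages is I/O-bound and the GIL is released while waiting on the network:

```python
import queue
import threading

import requests

urls = queue.Queue()
results = {}
lock = threading.Lock()


def worker():
    # Each worker pulls URLs until the queue is drained.
    while True:
        url = urls.get()
        try:
            body = requests.get(url, timeout=10).text
            with lock:  # guard dict writes from several threads
                results[url] = body
        finally:
            urls.task_done()


# Daemon threads exit with the process once urls.join() returns.
threads = [threading.Thread(target=worker, daemon=True) for _ in range(4)]
for t in threads:
    t.start()

for u in ["https://example.com/a", "https://example.com/b"]:  # placeholder URLs
    urls.put(u)
urls.join()  # block until every queued page has been fetched
print(sorted(results))
```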