How to parse tags?
Hello.
There is a third-party resource, and I am writing a parser plus a REST API for it.
The question is how to handle tags correctly. The first thing that comes to mind is the naive approach: parse the tags from a page, check whether each tag is already in the table, insert it if it is not, reuse the existing row if it is, then repeat on the next page.
That is simple, but I do not like the large number of database queries. I have very little experience with the web, and this solution seems clumsy to me.
As an option, to reduce the number of database accesses, I could keep a dictionary of the tags in memory; there are about a thousand of them.
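For illustration, the in-memory dictionary idea could look roughly like the sketch below. It assumes SQLite through Python's standard `sqlite3` module and a hypothetical `tags(id, name)` table; the `TagCache` class and its `get_id` method are invented names for this example, not part of any existing code.

```python
import sqlite3


class TagCache:
    """Maps tag names to row ids, touching the database only on a cache miss."""

    def __init__(self, conn):
        self.conn = conn
        # Hypothetical schema: one row per tag, names must be unique.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS tags ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
        )
        # Load every known tag once; about a thousand rows fits easily in memory.
        self.ids = {
            name: tag_id
            for tag_id, name in self.conn.execute("SELECT id, name FROM tags")
        }

    def get_id(self, name):
        tag_id = self.ids.get(name)
        if tag_id is None:
            # Unknown tag: insert it and remember the new id.
            cur = self.conn.execute("INSERT INTO tags (name) VALUES (?)", (name,))
            tag_id = cur.lastrowid
            self.ids[name] = tag_id
        return tag_id


conn = sqlite3.connect(":memory:")
cache = TagCache(conn)

# Tags as they might come from one parsed page (made-up values).
page_tags = ["python", "parsing", "python"]
tag_ids = [cache.get_id(t) for t in page_tags]
```

With this shape, a tag that has already been seen costs only a dictionary lookup, and the database is written to only when a new tag appears. The sketch assumes a single parser process; with several writers the dictionary could go stale.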
Look at the load and at how fast the rest of the pipeline is. If everything else is very fast, you can use the dictionary. Otherwise you can simply query the database: if it is configured properly, it will cache frequent queries anyway. If the tag name is a unique value with a unique index on it, the lookup will be cheap. If there turns out to be a lot of data, you can create an in-memory table and read from it, or use any other caching mechanism. In any case, I think tag processing, like the parsing itself, will not be the bottleneck. The main slowdown will be page loading.
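As a rough sketch of the "just query the database" route, a unique index on the tag name lets the database do the existence check itself, so each page costs one batched insert and one select. The example assumes SQLite (its `INSERT OR IGNORE` syntax) via Python's `sqlite3` module; the `tags` table, the `idx_tags_name` index and the `resolve_tags` function are hypothetical names chosen for illustration.

```python
import sqlite3


def resolve_tags(conn, names):
    """Return {name: id} for the given tag names, inserting missing ones."""
    unique = sorted(set(names))
    if not unique:
        return {}
    # The unique index makes the database skip names that already exist.
    conn.executemany(
        "INSERT OR IGNORE INTO tags (name) VALUES (?)",
        [(n,) for n in unique],
    )
    # One query fetches the ids for every tag on the page.
    placeholders = ",".join("?" * len(unique))
    rows = conn.execute(
        f"SELECT name, id FROM tags WHERE name IN ({placeholders})", unique
    )
    return dict(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("CREATE UNIQUE INDEX idx_tags_name ON tags (name)")

# Two pages with overlapping tags (made-up values).
first = resolve_tags(conn, ["python", "sql"])
second = resolve_tags(conn, ["sql", "cache", "sql"])
```

Other databases spell the "insert if missing" step differently, so the SQL here would need adapting outside SQLite.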