linux
Pisarev_OS, 2018-07-13 14:49:09

What is the best way to do browser emulation (linux)?

Hello.
There is a page containing 200 iframes, loaded from 200 different domains. Inside each iframe is a small generated news feed. Please do not ask why so many, why iframes, or why any of this: it is required, and there is no other option.
When this page is opened in desktop Chrome (Win7 x64, 32 GB RAM, i5 3.5 GHz), the full load takes about 1 minute. For convenience, let's call it 60 seconds.
During all this time, of course, the whole system slows down. So the question arose: how best to do this with optimal performance?
The main point is that the plan is now to send a total of 5,000 visits per day (non-live, of course) to this page through a proxy list.
We are now discussing generating this traffic by emulating visits in a virtual browser, using proxies and user agents. It would be written in C++ using the official Qt WebEngine browser engine and the V8 JavaScript engine that comes with it.
Server config:
CPU: Intel 2x Xeon E5-2630v3 - 16c/32t - 2.4GHz /3.2GHz
RAM: 128GB DDR4 ECC 1866 MHz

The program would visit each of the 200 sites individually, going to the page with the news feed.
By our estimate, this way the program will deliver 1 visit to all 200 sites in about 10 minutes, i.e. 144 visits in 24 hours with one thread. We need 5,000, i.e. 35 threads.
And the subject of the debate itself: which option is better overall in this situation? Will the server as a whole slow down with these 35 threads, or will it have enough capacity? Or might it be easier to build something like a desktop version with the same multithreading? After all, if loading those 200 frames inside one page takes 60 seconds, instead of visiting each site separately, that gives 1,440 visits per day with one thread, which is 10 times more. Then the goal of 5,000 needs only 4 threads.
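The thread counts above can be checked with a short calculation. This is only a rough sketch: it assumes each thread runs visits back to back with no idle time, and that the time per visit (10 minutes or 60 seconds) does not grow when several threads run in parallel, which is exactly what is in doubt here. The function names are made up for illustration.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def visits_per_thread_per_day(seconds_per_visit: float) -> int:
    """Whole visits one sequential thread can complete in 24 hours."""
    return int(SECONDS_PER_DAY // seconds_per_visit)


def threads_needed(target_visits_per_day: int, seconds_per_visit: float) -> int:
    """Threads required to reach the daily target, rounded up."""
    per_thread = visits_per_thread_per_day(seconds_per_visit)
    return math.ceil(target_visits_per_day / per_thread)


# Option 1: visit the 200 sites one by one, about 10 minutes per full pass.
separate = threads_needed(5000, 10 * 60)

# Option 2: open the single page with 200 iframes, about 60 seconds per load.
combined = threads_needed(5000, 60)

print(separate, combined)
```

If the per-visit time rises under parallel load, the real thread count will be higher than these figures.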
I would be grateful if someone could suggest which direction to move in and what to focus on.
Task: bring 5,000 visitors either to each of the 200 sites separately, to the desired page, or to a single page, in which case those 200 sites must be embedded in it as iframes.

2 answer(s)
922j, 2018-07-13
@922j

PhantomJS, SlimerJS

alekssamos, 2018-07-13
@alekssamos

Either use a headless browser, that is, a browser without a user interface, as already suggested above,
or, another idea I had: load the frames not all at once but gradually, as they scroll into view. This probably will not fit the current task, though.
