JavaScript
Semisonic, 2016-04-17 18:01:02

How do I automate interaction with a website that uses an HTML5 canvas?

Hello!
For fun and self-development, I'm trying to automate playing a web-based game. The game is a website with an HTML5 canvas element on which the playing field is drawn.
The required sequence of actions is as follows:

  1. Log in to the gaming site.
  2. Open the game page.
  3. Scan the input game data displayed on the canvas.
  4. Simulate user activity on the canvas (mouse movements, clicks, etc.).
  5. Extract the result of the match, which is served as a regular HTML page, and adjust the game strategy based on it.
  6. Return to step 2 and repeat.
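
For illustration, steps 3 and 4 might look roughly like the sketch below when run inside the page (browser console or a userscript). It assumes the game draws with a 2D context and that the canvas is not scaled by CSS. Neither is known for the real game, and `readPixel` and `clickCanvas` are made-up helper names.

```javascript
// Step 3: read the colour of a single canvas pixel.
// Assumes a 2D canvas; getContext("2d") returns null if the game
// already created a different context type (e.g. WebGL).
function readPixel(canvas, x, y) {
  const ctx = canvas.getContext("2d");
  if (!ctx) {
    throw new Error("Canvas does not use a 2D context");
  }
  const [r, g, b, a] = ctx.getImageData(x, y, 1, 1).data;
  return { r, g, b, a };
}

// Translate canvas-relative coordinates into the viewport coordinates
// that mouse events carry. `rect` is canvas.getBoundingClientRect().
function toClientPoint(rect, x, y) {
  return { clientX: rect.left + x, clientY: rect.top + y };
}

// Step 4: dispatch the mousedown / mouseup / click sequence that a
// real click would produce at canvas position (x, y).
function clickCanvas(canvas, x, y) {
  const point = toClientPoint(canvas.getBoundingClientRect(), x, y);
  for (const type of ["mousedown", "mouseup", "click"]) {
    canvas.dispatchEvent(new MouseEvent(type, {
      bubbles: true,
      cancelable: true,
      view: window,
      ...point,
    }));
  }
}
```

Events created this way have `isTrusted` set to `false`, so a game that checks that flag could ignore them.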

I should say up front that I'm not interested in methods that treat the browser and the data displayed in it as an external object. That is, analyzing the state of the site with machine vision and simulating user activity through window messages are not suitable. I would like to work directly with the site's client code, focusing as much as possible on the business logic (where to get which data, how to process it, what actions to perform as a result).
I would also like the chosen solution to let me watch the game in progress, that is, to automate a real browser. However, I'm also willing to consider headless browsers for the sake of self-education.
I would be grateful if you could recommend tools that satisfy all of the above conditions and are certain to solve the problem. Answers along the lines of "I haven't tried it myself, but I heard something about toolXYZ, try it, maybe it will work out" are not what I'm after; I'm interested in first-hand experience.
Thanks in advance to everyone who responds!

2 answers
Alexander Aksentiev, 2016-04-17
@Sanasol

Your question reads like a freelance job posting.
It seems to me that the easiest and surest way to make the bot you describe is to study the API through which the client talks to the server.
Then imitate the client in any convenient language; in this case a browser, even a headless one, is entirely optional.
Since the browser is not a dedicated client as such, there is no protection against emulating it with something else. In other words, the server does not care at all whether the data comes from the browser or from a script in the console.
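
A minimal sketch of what "imitating the client" could look like, assuming Node.js 18 or later (where `fetch` is global) and a JSON API. The host, the `/api/move` path, the cookie and the payload shape are all invented for illustration; the real ones would have to be read from the browser's network tab while playing by hand.

```javascript
// Hypothetical base URL; replace with whatever the real client calls.
const BASE_URL = "https://game.example.com";

// Build the request the client would send for one move.
// The path and the JSON body are assumptions, not the game's real API.
function buildMoveRequest(sessionCookie, move) {
  return {
    url: `${BASE_URL}/api/move`,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Cookie": sessionCookie,
      },
      body: JSON.stringify(move),
    },
  };
}

// Send the move and return the parsed reply. `fetchImpl` is injectable
// so the logic can be exercised without touching the network.
async function sendMove(sessionCookie, move, fetchImpl = globalThis.fetch) {
  const { url, options } = buildMoveRequest(sessionCookie, move);
  const response = await fetchImpl(url, options);
  if (!response.ok) {
    throw new Error(`Server rejected the move: HTTP ${response.status}`);
  }
  return response.json();
}
```

The session cookie here would come from a login request made the same way, which covers step 1 of the question without a browser.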

Yuri, 2016-11-14
@riky

I did this by searching for a picture within a picture. It works quite quickly; machine vision and pattern recognition are not needed here. It is convenient that you don't need to study the inner workings of the game and don't depend on code changes or new versions. It is also handy for closed-source (Flash) games.
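
To make "a picture within a picture" concrete, here is a naive sketch of such a search. It is an illustration only, not the code I actually used: it compares RGB bytes pixel by pixel, and `findSubImage`, `matchesAt` and the `tolerance` parameter are made-up names. Both images are expected in the shape of a browser `ImageData` object (`width`, `height`, and `data` with RGBA bytes row by row).

```javascript
// Return the top-left corner {x, y} of the first place where `needle`
// appears inside `haystack`, or null if it does not appear.
// `tolerance` is the allowed per-channel difference, an assumed knob
// for absorbing anti-aliasing or compression noise.
function findSubImage(haystack, needle, tolerance = 0) {
  const maxX = haystack.width - needle.width;
  const maxY = haystack.height - needle.height;
  for (let y = 0; y <= maxY; y++) {
    for (let x = 0; x <= maxX; x++) {
      if (matchesAt(haystack, needle, x, y, tolerance)) {
        return { x, y };
      }
    }
  }
  return null; // also reached when the needle is larger than the haystack
}

// Check whether `needle` matches `haystack` at the given offset.
function matchesAt(haystack, needle, offsetX, offsetY, tolerance) {
  for (let ny = 0; ny < needle.height; ny++) {
    for (let nx = 0; nx < needle.width; nx++) {
      const h = ((offsetY + ny) * haystack.width + offsetX + nx) * 4;
      const n = (ny * needle.width + nx) * 4;
      // Compare R, G and B; alpha is ignored in this sketch.
      for (let c = 0; c < 3; c++) {
        if (Math.abs(haystack.data[h + c] - needle.data[n + c]) > tolerance) {
          return false;
        }
      }
    }
  }
  return true;
}
```

The brute-force scan costs on the order of haystack area times needle area in the worst case, which is usually tolerable for small sprites on a game-sized field.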
But if you want to work at the level of business logic, the options are few:
Easier: watch which requests go to the server and write a script that sends the same ones. The downside is that you won't see the changes on screen until you reload the page, although a browser is essentially not required here at all.
A little harder but cooler: extract the JS code and look at its internal API. The difficulty is that the code is most likely minified and possibly obfuscated, so it can be harder to follow without meaningful variable names. After analyzing the code, you can change it (add your own business logic) and substitute it in the browser; you can also add new buttons and functions to the interface.
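
As a rough illustration of adding your own logic to the game's code, one common trick is to wrap one of its functions so your code sees every call. This is only a sketch: `hookMethod` is a made-up helper, and which object and method are worth wrapping only becomes clear after reading the de-minified source.

```javascript
// Replace obj[methodName] with a wrapper that reports the arguments and
// the result of every call to `advise`, then returns the result unchanged.
function hookMethod(obj, methodName, advise) {
  const original = obj[methodName];
  if (typeof original !== "function") {
    throw new TypeError(`${methodName} is not a function on the target`);
  }
  obj[methodName] = function (...args) {
    const result = original.apply(this, args);
    advise({ args, result });
    return result;
  };
  // Return an undo function so the original can be restored.
  return () => {
    obj[methodName] = original;
  };
}

// Usage with a stand-in object. `exampleGame` and `applyServerState`
// are hypothetical names, not taken from any real game.
const exampleGame = {
  applyServerState(state) {
    return state.turn;
  },
};
const observedCalls = [];
const unhook = hookMethod(exampleGame, "applyServerState", (call) => {
  observedCalls.push(call); // own business logic would go here
});
exampleGame.applyServerState({ turn: 3 });
unhook();
```

The game's own behaviour stays intact because the wrapper always calls the original first and passes its result through.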
I use all of these methods, depending on the task.
