Neural networks
Alexander, 2017-11-08 14:55:48

How hard is it to build a system that counts a person's repeated actions in a video?

There is a database of 180,000 short videos, each about 3 minutes long, in which people perform the same type of action, for example squats. The number of repetitions is known for every video.
How difficult would it be, and how much would it cost, to develop a system that automatically processes a video and counts how many times the person performed the action?


5 answers
Philipp, 2017-11-08
@zoonman

https://www.youtube.com/watch?v=pW6nZXeWlGM

Alexander Taratin, 2017-11-08
@Taraflex

{crazy way}
If the action makes the person's head move with sufficient amplitude, it is enough to take any library that detects faces in images (tracking face positions accurately is a long-solved problem) and measure the cyclical fluctuations of the face position.
{/crazy way}
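
A rough sketch of the counting half of this idea, assuming some face detector has already given you the vertical centre of the face (in pixels) for every frame. The detector is not shown, and `count_cycles` and `min_amplitude` are names made up for this illustration. It counts a cycle each time the position swings from clearly below the average to clearly above it, so small jitter is ignored:

```python
def count_cycles(face_y, min_amplitude=10.0):
    """Count oscillations in per-frame face centre heights (pixels).

    A cycle is counted each time the signal travels from below
    (mean - min_amplitude) to above (mean + min_amplitude).
    `min_amplitude` is an assumed tuning value, not a known constant.
    """
    if not face_y:
        return 0
    mean = sum(face_y) / len(face_y)
    low, high = mean - min_amplitude, mean + min_amplitude
    cycles = 0
    armed = False  # True once the signal has dipped below `low`
    for y in face_y:
        if y < low:
            armed = True
        elif y > high and armed:
            cycles += 1
            armed = False
    return cycles
```

The dead band between `low` and `high` is what stands in for "sufficient amplitude"; it would have to be tuned to the frame size and camera distance.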

asd111, 2017-11-08
@asd111

You can try taking an ordinary image classifier trained on ImageNet, which can tell from a picture what is in it, for example whether the person is squatting or standing.
Then split the video into frames, run each frame through the classifier, and count the repetitions from the sequence of classifier results.
For example, Google labels this picture as "static squats". (image: 5a035ba2c25fb883814062.jpeg)
Take a screenshot from the video and upload it to Google; if Google identifies what is happening in the picture, the method I described may work.
Ask about the price on ods.ai, where you can also look for contractors. It is a Russian community of AI specialists, and some of its members have ranked highly on Kaggle.
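
The step of turning per-frame classifier results into a count could look roughly like this. It is only a sketch: it assumes the classifier has already produced one label per frame, and the label names `"squat"` and `"standing"`, the function name `count_reps` and the `min_run` value are all invented for the example.

```python
from itertools import groupby

def count_reps(labels, down="squat", up="standing", min_run=3):
    """Count down -> up transitions in per-frame classifier labels.

    Runs shorter than `min_run` frames are treated as classifier
    noise and dropped before counting.
    """
    # Collapse the frame labels into a sequence of stable phases,
    # e.g. ["standing", "squat", "standing", ...].
    stable = []
    for label, run in groupby(labels):
        if len(list(run)) >= min_run and (not stable or stable[-1] != label):
            stable.append(label)
    # One repetition is completed each time a "down" phase is
    # followed by an "up" phase.
    return sum(
        1 for prev, cur in zip(stable, stable[1:]) if prev == down and cur == up
    )
```

Dropping short runs matters because a per-frame classifier will occasionally flip its answer for a frame or two, and each flip would otherwise be counted as an extra repetition.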

Alexander Popov, 2017-11-09
@popov654

I may be writing complete nonsense, but perhaps there are mathematical methods for extracting people's skeletons from the video (start by finding high-contrast regions, separate the people from the background, then try to fit limb vectors), so that a squat can be defined as a particular movement of the skeleton in space? In the late nineties this was how pornography was detected in video, with very good accuracy. There is even an article about it on Habr. IMHO this would be simpler and faster to develop than a neural network that has to be trained, and, most importantly, it would not require a huge database of training samples.
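
To make "a squat as a particular movement of the skeleton" concrete, here is one possible sketch. It assumes that 2D hip, knee and ankle points are already available for each frame, however the skeleton was obtained, and it defines a squat through the knee angle. The function names and the 100 and 160 degree thresholds are illustrative guesses, not values from any article.

```python
import math

def joint_angle(a, b, c):
    """Angle at point b (degrees) between the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0:
        return 0.0  # degenerate limb: two keypoints coincide
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def count_squats(knee_angles, bent_below=100.0, straight_above=160.0):
    """Count bent -> straight cycles in a sequence of knee angles."""
    reps = 0
    bent = False  # True while the knee has been seen deeply bent
    for angle in knee_angles:
        if angle < bent_below:
            bent = True
        elif angle > straight_above and bent:
            reps += 1
            bent = False
    return reps
```

For each frame you would call `joint_angle(hip, knee, ankle)` and pass the resulting list to `count_squats`. Using two separate thresholds means that a person hovering around one angle does not get counted several times.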

Andrey Dugin, 2017-11-09
@adugin

The simplest hypothesis, which I would test first, is to measure the periodic fluctuation of the average brightness and/or color of the frame. For simplicity, you can reduce the frame size and even blur the image further. Then no neural networks are required.
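
One way to test this hypothesis, sketched under the assumption that you already have a list with the average brightness of every (downscaled) frame: estimate the dominant period with a plain autocorrelation and divide the video length by it. The function names are made up for the example, and the method only makes sense if the repetitions happen at a roughly constant pace.

```python
def estimate_period(signal):
    """Estimate the dominant period (in frames) via autocorrelation.

    Returns None if no oscillation is found. Runs in O(n^2) time,
    which is fine for a few thousand frames.
    """
    n = len(signal)
    if n < 4:
        return None
    mean = sum(signal) / n
    centred = [s - mean for s in signal]
    # Unnormalised autocorrelation for lags 0 .. n/2 - 1.
    acf = [
        sum(centred[i] * centred[i + lag] for i in range(n - lag))
        for lag in range(n // 2)
    ]
    if acf[0] <= 0:
        return None  # constant signal, nothing to measure
    # Skip the lobe around lag 0: advance until the ACF stops being positive.
    lag = 1
    while lag < len(acf) and acf[lag] > 0:
        lag += 1
    if lag == len(acf):
        return None
    # The strongest peak after that point is taken as the period.
    return max(range(lag, len(acf)), key=lambda k: acf[k])

def count_repetitions(brightness):
    """Approximate repetition count as video length / dominant period."""
    period = estimate_period(brightness)
    return 0 if period is None else round(len(brightness) / period)
```

Since the number of repetitions is known for every video in the database, the estimate can be checked against the real counts before investing in anything more complicated.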
