How to create a program to capture video from the screen?

GPGPU

Legebocker, 2019-03-09 21:03:04

Hello, I decided to try my hand at GPGPU and write a program that captures video from the screen and then encodes it with a CUDA kernel. Since I have never worked with video files, I have a lot of questions.
1. How do I create a video file? Creating the file itself is easy, but how do I write anything into it? As I understand it, video is a stream, but a stream of what? What kind of stream should I use for video files, and what type of data should I write to it?
2. How do I capture the image? As far as I know, OBS uses DirectX for capture, while a simple screenshot uses a function from user32.dll. Which is better, and what would you recommend?
3. How should I encode the video? I don't really understand how codecs work, and I'm afraid that a naive approach ("gather frames into a batch, pass them to the CUDA cores, let them convert the frames, and pass them back") won't work. I'd be glad if you explained how encoders work. And how do I control compression? I don't really want a gigabyte-sized file for 5 minutes of video.
That's it, three questions, but without answers to them I can't get off the ground.

2 answer(s)

GavriKos, 2019-03-09
@GavriKos

1. Read the specifications. There are different formats. The general answer is that you write bytes to the file.
2. It depends on the platform. You appear to be on Windows, so yes, use DirectX.
3. It depends on the format, but in general you don't have to write the encoder yourself; you can use ready-made libraries.
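
To make "you write bytes" from point 1 concrete, here is a minimal sketch that serializes frames in the uncompressed YUV4MPEG2 (.y4m) format: one text header line, then for every frame a `FRAME` marker followed by the raw Y, U and V planes. The format was picked here only because it is simple, not because the answer names it; the type `Yuv420Frame` and all function names are hypothetical, and the frame content is a solid color rather than a real screen capture.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// One frame in planar YUV 4:2:0: a full-resolution luma plane and two
// chroma planes at half resolution in each direction.
struct Yuv420Frame {
    int width;
    int height;
    std::vector<std::uint8_t> y;  // width * height bytes
    std::vector<std::uint8_t> u;  // (width / 2) * (height / 2) bytes
    std::vector<std::uint8_t> v;  // (width / 2) * (height / 2) bytes
};

// Hypothetical helper: a uniformly colored frame standing in for a capture.
Yuv420Frame make_solid_frame(int width, int height,
                             std::uint8_t y, std::uint8_t u, std::uint8_t v) {
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    const std::size_t luma = static_cast<std::size_t>(width) * height;
    const std::size_t chroma = luma / 4;
    return {width, height,
            std::vector<std::uint8_t>(luma, y),
            std::vector<std::uint8_t>(chroma, u),
            std::vector<std::uint8_t>(chroma, v)};
}

// Writes the stream header once, before the first frame.
void write_y4m_header(std::ostream& out, int width, int height, int fps) {
    out << "YUV4MPEG2 W" << width << " H" << height
        << " F" << fps << ":1 Ip A1:1 C420jpeg\n";
}

// Writes one frame: the "FRAME" marker line, then the raw planes.
void write_y4m_frame(std::ostream& out, const Yuv420Frame& f) {
    const std::size_t luma = static_cast<std::size_t>(f.width) * f.height;
    assert(f.y.size() == luma && f.u.size() == luma / 4 && f.v.size() == luma / 4);
    out << "FRAME\n";
    out.write(reinterpret_cast<const char*>(f.y.data()),
              static_cast<std::streamsize>(f.y.size()));
    out.write(reinterpret_cast<const char*>(f.u.data()),
              static_cast<std::streamsize>(f.u.size()));
    out.write(reinterpret_cast<const char*>(f.v.data()),
              static_cast<std::streamsize>(f.v.size()));
}

// Hypothetical convenience wrapper: serializes a whole clip into memory.
std::string build_y4m_clip(const std::vector<Yuv420Frame>& frames, int fps) {
    assert(!frames.empty());
    std::ostringstream out;
    write_y4m_header(out, frames[0].width, frames[0].height, fps);
    for (const Yuv420Frame& f : frames) write_y4m_frame(out, f);
    return out.str();
}
```

The same two write functions accept a `std::ofstream`; on Windows it should be opened with `std::ios::binary` so the newline bytes are not translated. Nothing is compressed here, so the result is large. A real codec replaces the raw planes with encoded data, but the overall idea of a header followed by per-frame byte blocks stays similar.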

rPman, 2019-03-10
@rPman

For a programmer, GPGPU currently looks like this: you write code in a language very similar to C++ (OpenCL is supported by every vendor, and NVIDIA additionally has its own CUDA, which is conceptually similar) with a single callback function, the kernel. That function is called, sequentially or in parallel (the video card driver decides, and you have almost no control over it), for each element of your array, which is in fact a texture in memory (you don't need to worry about that low-level detail here), and it stores the result in another array.
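
The per-element model described above can be sketched on the CPU in plain C++. This is only an emulation under stated assumptions: `luma_kernel` and `run_luma` are hypothetical names, the pixel layout is assumed to be BGRA with 8 bits per channel, and the loop stands in for a kernel launch. In CUDA each thread would typically derive `i` from `blockIdx.x * blockDim.x + threadIdx.x` instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// The "kernel": computes one output element from one input element.
// It depends only on its own index, so calls may run in any order,
// or all at once on different cores.
void luma_kernel(std::size_t i, const std::uint8_t* bgra, std::uint8_t* luma) {
    const unsigned b = bgra[4 * i + 0];
    const unsigned g = bgra[4 * i + 1];
    const unsigned r = bgra[4 * i + 2];
    // Integer approximation of 0.299 R + 0.587 G + 0.114 B;
    // the weights 77 + 150 + 29 sum to 256, hence the shift by 8.
    luma[i] = static_cast<std::uint8_t>((77 * r + 150 * g + 29 * b) >> 8);
}

// CPU stand-in for a kernel launch: one call per pixel, input array in,
// separate output array out.
std::vector<std::uint8_t> run_luma(const std::vector<std::uint8_t>& bgra) {
    assert(bgra.size() % 4 == 0);
    const std::size_t pixels = bgra.size() / 4;
    std::vector<std::uint8_t> luma(pixels);
    for (std::size_t i = 0; i < pixels; ++i) {
        luma_kernel(i, bgra.data(), luma.data());
    }
    return luma;
}
```

Because no call reads what another call writes, the order of iterations does not matter, which is the property that lets the driver schedule the work freely.
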
Transferring data between ordinary RAM and the video card is objectively the most expensive operation, so such copies are usually kept to a minimum (i.e. one at the start and one at the end, to fetch the result). It is so expensive that, for example, reading the screen contents into RAM and copying them back in a format your CUDA application understands will take 99% of the time, if not more (you want Full HD or 4K at 60 fps? The throughput may simply not be enough), not to mention the format conversion itself.
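
To put rough numbers on the amount of data involved, here is a small sketch that computes the uncompressed data rate. It assumes 4 bytes per pixel (for example BGRA); the function names are hypothetical, and the figures say nothing about what a particular bus or driver actually sustains.

```cpp
#include <cassert>
#include <cstdint>

// Bytes per second produced by uncompressed frames of the given size.
constexpr std::uint64_t raw_bytes_per_second(std::uint64_t width,
                                             std::uint64_t height,
                                             std::uint64_t bytes_per_pixel,
                                             std::uint64_t fps) {
    return width * height * bytes_per_pixel * fps;
}

// Total bytes for a recording of the given length in seconds.
constexpr std::uint64_t raw_bytes_total(std::uint64_t bytes_per_second,
                                        std::uint64_t seconds) {
    return bytes_per_second * seconds;
}

// Compile-time sanity check: a single Full HD BGRA frame is 8,294,400 bytes.
static_assert(raw_bytes_per_second(1920, 1080, 4, 1) == 8294400ULL,
              "unexpected Full HD frame size");
```

Under these assumptions Full HD at 60 fps comes to 497,664,000 bytes per second and 4K at 60 fps to 1,990,656,000 bytes per second. Five minutes of raw Full HD would be roughly 149 GB, which is also why the frames have to be compressed rather than stored as they are.
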
So you will have to solve an engineering problem: how to share data between the texture in video memory that holds the screen and your CUDA application. I'm afraid that will be a quest of its own; you will almost certainly run into access-rights issues, dependence on the specific GPU chip, and other surprises.
If you succeed, you will then have to develop (or find) an efficient algorithm that encodes video on the video card's many-core processor (thousands of not very fast cores). That means diving very deep into video encoding, deep enough that questions worded the way you asked them would no longer come up, because the task requires a different (higher) level of knowledge.
P.S. A few years ago NVIDIA seemed to be promoting the fact that it had developed such algorithms for efficiently streaming application and game screens over the network with minimal latency, including in hardware.
https://developer.nvidia.com/nvidia-video-codec-sdk