How to parallelize functions in CUDA?
Hello, I am writing a neural network for digit recognition in CUDA and want to get the maximum speedup. In Python, 10,000 training iterations take 20 seconds; in C++, 10 seconds. Now it's CUDA's turn. There is a neuralNet class with three functions: a constructor, training, and querying. How can I run several training calls at the same time? I understand that threads and memory locking will be needed, but I haven't worked with them much and can't put it all together.
Thanks in advance for your reply.
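To make the question concrete, here is a minimal sketch of what "several training calls at once" might look like in CUDA. A common pattern is to launch one kernel in which each GPU thread processes one training sample, rather than calling the training function several times from the host. Everything in it is an assumption for illustration and not the actual neuralNet class: the model is a single linear neuron with a squared-error loss, and `accumulateGradients`, `applyGradients`, and `trainBatch` are hypothetical names. The "locked memory" part is handled with `atomicAdd`, which lets many threads add into the same gradient buffer safely.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical toy model: out = sum_j weights[j] * x[j], loss = 0.5 * (out - target)^2.
// One thread per training sample: each thread runs the forward pass for its own
// sample and adds that sample's gradient into the shared buffer `grad`.
__global__ void accumulateGradients(const float* inputs, const float* targets,
                                    const float* weights, float* grad,
                                    int numSamples, int numInputs)
{
    int s = blockIdx.x * blockDim.x + threadIdx.x;
    if (s >= numSamples) return;              // guard threads past the last sample

    const float* x = inputs + s * numInputs;  // this thread's sample (row-major)
    float out = 0.0f;
    for (int j = 0; j < numInputs; ++j) out += weights[j] * x[j];

    float err = out - targets[s];
    for (int j = 0; j < numInputs; ++j)
        atomicAdd(&grad[j], err * x[j]);      // safe concurrent accumulation
}

// One thread per weight: apply the averaged gradient.
__global__ void applyGradients(float* weights, const float* grad,
                               float learningRate, int numSamples, int numInputs)
{
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (j < numInputs) weights[j] -= learningRate * grad[j] / numSamples;
}

// Hypothetical host helper: runs one training step over the whole batch on the GPU
// and returns the updated weights. Error checking is omitted to keep the sketch short.
std::vector<float> trainBatch(std::vector<float> weights,
                              const std::vector<float>& inputs,
                              const std::vector<float>& targets,
                              float learningRate)
{
    int numInputs  = static_cast<int>(weights.size());
    int numSamples = static_cast<int>(targets.size());

    float *dIn, *dTgt, *dW, *dGrad;
    cudaMalloc(&dIn,   inputs.size()  * sizeof(float));
    cudaMalloc(&dTgt,  targets.size() * sizeof(float));
    cudaMalloc(&dW,    weights.size() * sizeof(float));
    cudaMalloc(&dGrad, weights.size() * sizeof(float));

    cudaMemcpy(dIn,  inputs.data(),  inputs.size()  * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dTgt, targets.data(), targets.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dW,   weights.data(), weights.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(dGrad, 0, weights.size() * sizeof(float));

    const int threads = 256;
    accumulateGradients<<<(numSamples + threads - 1) / threads, threads>>>(
        dIn, dTgt, dW, dGrad, numSamples, numInputs);
    // Kernels on the same stream run in order, so the gradients are complete here.
    applyGradients<<<(numInputs + threads - 1) / threads, threads>>>(
        dW, dGrad, learningRate, numSamples, numInputs);

    // cudaMemcpy waits for the kernels above before copying the result back.
    cudaMemcpy(weights.data(), dW, weights.size() * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(dIn); cudaFree(dTgt); cudaFree(dW); cudaFree(dGrad);
    return weights;
}
```

Calling `trainBatch(weights, inputs, targets, 0.1f)` from `main` with the samples stored row by row in `inputs` performs one update over the whole batch. In a real training loop the device buffers would normally be allocated once and reused across iterations, since repeated allocation and host-device copies can easily outweigh the time spent in the kernels.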