Which branches of mathematics are relevant to neural networks and AI?
Hello.
I got a job in a laboratory dealing with neural networks.
Next semester I have to take the state exam in mathematical analysis.
I would like to combine business with pleasure and pay special attention to the branches of mathematics that are actively used in the analysis and design of neural networks.
What I have so far: 3 semesters of mathematical analysis, 2 of differential equations, 1 of complex analysis, 1 of equations of mathematical physics (a second one is coming), 1 of linear algebra and 1 of differential geometry. I have not taken discrete mathematics.
I won't lie: my grades are "good" in some of these and "satisfactory" in others (mostly the latter).
My English is good enough to read foreign articles without much strain.
Please tell me:
1) What exactly should I learn and pay attention to?
2) Which branches are the required minimum?
3) What mathematical books to read for professional development in the field?
4) What literature do you generally recommend for studying in this area?
Ultimate goal: to lay the foundations for professionalism.
Minimum goal: understand the math in articles.
Areas of interest: swarm intelligence, machine learning and computer vision.
Special thanks to anyone who points out how the various branches relate to one another.
None and all.
A lyrical digression first. Neural networks are a very poorly understood thing. A network classifies well and seems able to learn, but it does not know how to make decisions. In a sense it does make decisions during classification, when it matches an image to a pattern, but in reality things are rather sadder. It is also completely incapable of executing algorithms in the classical sense of the word. Everything a neural network does is determined by its parameters and its input data. That resembles an algorithm, but an algorithm can be defined 1) without input parameters (the classical case) and 2) by a set of actions rather than by parameters. A neural network is more... analog.
From this it follows that we will definitely need the machinery of differential calculus and functional analysis. Unfortunately, modern computers cannot work adequately with infinitely small and infinitely large quantities, so we will definitely need approximation methods: numerical methods as such, plus some optimization methods (which will be needed for other reasons as well). Number theory, statistics, mathematical logic and, without question, the more general probability theory and theory of random processes will not be superfluous either.
But we must not forget that neural networks are an interdisciplinary field. You will definitely need the narrower branches used in cybernetics (mainly signals) and information theory (mainly discrete mathematics combined with probability theory). Finally, narrow but deep knowledge of neurophysiology is required. Mathematical physics shows up everywhere.
Keep in mind that all these branches are very closely intertwined. You cannot pull one part out separately from the others; everything is interconnected. In general, I highly recommend Khan Academy, it is the coolest thing. It appears to cover everything I have described here.
Pressed Ctrl+Enter too early...
As you can see, all of mathematics is needed in general. If some branch is not mentioned, it is likely touched on within the others.
However, neural networks are a very new... practice. In effect they work like this: we take a bunch of integral and differential equations, choose some approximation, apply quantization and discretization, and then run it. While the computation is running everything seems fine: the processes run, the error propagates back. However... there is a small problem: between these updates our model does not seem to exist. In the real world the sampling rate is incredibly high, so high that the world seems smooth (it may even really be smooth; it is not entirely clear right now what kind of world we live in), but in our model this rate turns out to be very, very low. So scientists simply pray that nothing bad happens between the moments at which their model exists.
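To make the point about discretization concrete, here is a minimal sketch (my own illustration, not something from the answer above). It assumes the simplest possible "model", the equation dx/dt = -x, and integrates it with the explicit Euler method. The function names and step sizes are hypothetical, chosen only to show how a low "sampling rate" distorts the result.

```python
import math

def euler(f, x0, dt, steps):
    """Explicit Euler: repeat x <- x + dt * f(x) for the given number of steps."""
    x = x0
    for _ in range(steps):
        x = x + dt * f(x)
    return x

def decay(x):
    """Right-hand side of the toy equation dx/dt = -x (an assumed example)."""
    return -x

exact = math.exp(-1.0)                      # true solution at t = 1 for x(0) = 1

coarse = euler(decay, 1.0, 0.5, 2)          # two large steps up to t = 1
fine = euler(decay, 1.0, 0.001, 1000)       # a thousand small steps up to t = 1
unstable = euler(decay, 1.0, 2.5, 10)       # step so large the iteration blows up

err_coarse = abs(coarse - exact)
err_fine = abs(fine - exact)

print("coarse step error:", err_coarse)
print("fine step error:  ", err_fine)
print("too-large step:   ", unstable)
```

With the coarse step the discrete model is noticeably off, and with a step above 2 the true solution decays while the discrete one grows. Nothing in the update rule itself warns about what happens "between the steps".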
So, unless you want to advance the science, I strongly advise you to drop this business. You can build a couple of simple perceptrons, but I would not count on more. The field is still raw.
For ANNs, first of all, you need:
- linear algebra (working with matrices: most ANNs can be written very compactly in matrix form);
- mathematical analysis (differentiation: many popular learning algorithms are gradient-based; the concepts of extremum, convergence, etc.);
- probability theory (many ANNs have a probabilistic interpretation, without which it is hard to understand their architecture);
- numerical methods (for example, when a derivative or an integral cannot be computed analytically, which happens often, there is nothing left but to evaluate it numerically, and you need to know how to do that correctly).
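The first, second and fourth items above can be seen working together in a small sketch. It assumes a single dense layer y = tanh(Wx + b) with a squared-error loss (my choice for illustration, not a specific network from this thread), computes the gradient with respect to W by the chain rule, and checks it against a central finite difference. All names and toy values are hypothetical.

```python
import math

def forward(W, b, x):
    """One dense layer in matrix form: y = tanh(W x + b)."""
    return [math.tanh(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
            for row, b_i in zip(W, b)]

def loss(W, b, x, t):
    """Squared error: 0.5 * ||y - t||^2."""
    y = forward(W, b, x)
    return 0.5 * sum((y_i - t_i) ** 2 for y_i, t_i in zip(y, t))

def analytic_grad_W(W, b, x, t):
    """Chain rule: dL/dW[i][j] = (y_i - t_i) * (1 - y_i^2) * x_j."""
    y = forward(W, b, x)
    delta = [(y_i - t_i) * (1.0 - y_i ** 2) for y_i, t_i in zip(y, t)]
    return [[d_i * x_j for x_j in x] for d_i in delta]

def numeric_grad_W(W, b, x, t, h=1e-5):
    """Central difference (L(w + h) - L(w - h)) / (2h) for every weight."""
    grad = [[0.0] * len(row) for row in W]
    for i in range(len(W)):
        for j in range(len(W[i])):
            original = W[i][j]
            W[i][j] = original + h
            plus = loss(W, b, x, t)
            W[i][j] = original - h
            minus = loss(W, b, x, t)
            W[i][j] = original  # restore the weight before moving on
            grad[i][j] = (plus - minus) / (2.0 * h)
    return grad

# Hypothetical toy values, chosen only for illustration.
W = [[0.5, -0.3], [0.1, 0.8]]
b = [0.0, 0.1]
x = [1.0, 2.0]
t = [0.0, 1.0]

ga = analytic_grad_W(W, b, x, t)
gn = numeric_grad_W(W, b, x, t)
max_diff = max(abs(a - n) for ra, rn in zip(ga, gn) for a, n in zip(ra, rn))
print("largest disagreement between the two gradients:", max_diff)
```

The central difference is used rather than the one-sided one because its error shrinks with the square of the step h; this is the kind of detail the "numerical methods" item refers to.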
For more advanced or specific use of ANNs, you need:
- graph theory (first, an ANN is itself a graph, although at a basic level of use this does not need to be taken into account; second, a number of ANN architectures are conveniently represented as probabilistic generative models, more specifically, graphical models);
- theory of differential equations (recurrent ANNs correspond to systems of differential equations);
- theory of dynamical chaos (a number of recurrent ANNs, for example Hopfield networks, are well described in terms of phase portraits, attractors and bifurcations, but on the whole this area of mathematics is not needed very often);
- predicate calculus (traditionally, networks of binary neurons were treated as a way of representing predicates);
- theory of algorithms (finite automata, formal grammars, Turing machines, etc.; neurosymbolic networks are also popular, and they draw on these areas of mathematics).
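As an illustration of the attractor language mentioned for Hopfield networks, here is a minimal sketch. It assumes the textbook Hebbian storage rule and asynchronous sign updates; the pattern and the function names are hypothetical. A stored pattern acts as a fixed point, and a slightly corrupted state falls back into it.

```python
def train_hopfield(patterns):
    """Hebbian rule: W[i][j] = sum over patterns of p_i * p_j, with a zero diagonal."""
    n = len(patterns[0])
    W = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += p[i] * p[j]
    return W

def recall(W, state, max_sweeps=10):
    """Update neurons one at a time until a full sweep changes nothing (a fixed point)."""
    s = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            field = sum(W[i][j] * s[j] for j in range(len(s)))
            new_value = 1 if field >= 0 else -1
            if new_value != s[i]:
                s[i] = new_value
                changed = True
        if not changed:
            break
    return s

# Hypothetical 8-neuron pattern with values +1 / -1.
stored = [1, -1, 1, -1, 1, -1, 1, -1]
W = train_hopfield([stored])

# Corrupt two of the eight neurons and let the dynamics run.
noisy = list(stored)
noisy[0] = -noisy[0]
noisy[3] = -noisy[3]
recovered = recall(W, noisy)

print("noisy:    ", noisy)
print("recovered:", recovered)
```

In phase-portrait terms, the stored pattern is an attractor and the corrupted state lies inside its basin of attraction.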
The listed areas (with the exception of the theory of dynamical chaos) are strictly necessary in machine learning in general. The basics of functional analysis and the calculus of variations are often needed for a deeper understanding of existing methods or for developing your own. Measure theory is needed in the probabilistic approach to machine learning, although people often try to avoid it (I have more than once come across statements like: with measure theory this could be shown more rigorously and more generally, but we will do without it for the sake of clarity).
If we talk about AI in general, then almost all of mathematics is needed there (although this is an exaggeration in a certain sense: pure mathematicians solve their internal problems with very sophisticated methods; in AI, results from any branch of mathematics can be useful, but these branches are extremely rarely needed in their entirety). Harmonic analysis (Fourier, wavelets), which can more or less be dispensed with in machine learning, is strictly necessary in computer vision and speech recognition. Combinatorics, very rarely needed for ANNs, is the basis of classical AI methods (search in a state space). The theory of algorithms, which finds only occasional applications in ANNs, is strictly necessary (in part or almost in full) in other subfields of machine learning and AI (for example, in automatic programming or machine translation). Set theory is strictly needed in knowledge representation, fuzzy logic, etc. Even category theory can come in handy (for example, in probabilistic programming languages, a subfield of machine learning, when functional languages like Haskell are used, its use turns out to be quite appropriate). And so on.
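To give a feel for why harmonic analysis matters in vision and speech, here is a minimal sketch of a naive discrete Fourier transform. The signal (a sine with 3 cycles over 32 samples) and all names are my assumptions for illustration; real code would use an FFT library rather than this O(n^2) version.

```python
import cmath
import math

def dft(signal):
    """Naive DFT: X[k] = sum over t of x[t] * exp(-2*pi*i*k*t/n)."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

# Hypothetical test signal: a sine wave with exactly 3 cycles in 32 samples.
n = 32
signal = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]

spectrum = dft(signal)
# For a real signal the second half of the spectrum mirrors the first half.
magnitudes = [abs(c) for c in spectrum[: n // 2]]
peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)

print("dominant frequency bin:", peak_bin)
```

The transform turns "a list of samples" into "which frequencies are present", which is the representation many classical vision and speech methods work with.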
In general, each subfield of AI has its own minimum required set of mathematical knowledge, without which it is very difficult to start studying that subfield. For ANNs this is linear algebra, mathematical analysis and numerical methods, and probability theory is highly desirable. In other areas it may be combinatorics, graphs, the theory of algorithms or mathematical logic. There are branches of mathematics that you will unavoidably have to study and use only if you choose a specific subfield or even specific methods within it; otherwise they are either not needed at all, or only their most basic facts are needed.