String as input for neural network?
I have words 4 to 7 characters long, made up of lowercase English letters. Each word is labeled 1 or 0. I need to train an ANN so that it can evaluate new words on its own.
I tried encoding each letter as a 26-dimensional vector and feeding these vectors to the input of a perceptron. The network learned nothing: it reports an almost zero error, but in practice it gives a huge error even on the training set.
Which direction should I look in? Is a different network architecture needed? Or a different way of mapping a word to the input?
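A minimal sketch of this kind of encoding, for reference. It is not the exact code from the question: it assumes each letter becomes a one-hot 26-dimensional vector, that words shorter than 7 letters are padded with all-zero vectors, and that the 7 vectors are concatenated into a single 182-dimensional input. The name `one_hot_word` is only illustrative.

```python
import string

ALPHABET = string.ascii_lowercase  # 26 lowercase English letters
MAX_LEN = 7                        # assumed: longest word length from the question


def one_hot_word(word, max_len=MAX_LEN):
    """Concatenate one one-hot vector per letter; unused positions stay all-zero."""
    vec = [0.0] * (max_len * len(ALPHABET))
    for pos, ch in enumerate(word):
        # Each position owns its own block of 26 slots in the flat vector.
        vec[pos * len(ALPHABET) + ALPHABET.index(ch)] = 1.0
    return vec


x = one_hot_word("cool")
print(len(x), sum(x))
```

With this layout every word gives an input of the same length, so it can be fed to a perceptron with a fixed number of inputs.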
UPD
I will describe the task itself to make it clearer. I want to teach an ANN to predict a subjective rating of how "beautiful" a set of letters looks. That is, some combinations of letters look good and others look bad. For example, I like the way the word "cool" looks, but I really dislike the way "kkrt" looks. This is subjective, so it is unlikely that anything other than an ANN can help here.
Text is usually encoded in one of two ways:
1) Bag-of-words (or bag-of-characters, or some other unit): each example is encoded as a vector of length N (the dictionary size), where the i-th element is the number of times the i-th word/character occurs in the text. There are many variations: TF-IDF (a weighted metric instead of a raw count), n-grams (sequences of 2, 3 or more words instead of individual words), skip-grams (combinations of words separated by some distance, which lets you skip prepositions, articles, etc.), and so on. Text encoded this way can be fed to a Naive Bayes classifier, an SVM (usually gives the best results with roughly 10,000-100,000 examples), or an MLP.
2) A vector of length N (the maximum allowed text length), where each element is the index of a word/character. If the text is shorter than N, it is padded with zeros. Convolutional, recurrent and recursive networks work with this encoding.
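Both options can be sketched at the character level for the words from the question. This is only an illustration under a few assumptions: option 1 is shown as a bag of character bigrams (one of the n-gram variations), option 2 reserves index 0 for padding so letters get indices 1 to 26, and the names `bag_of_bigrams` and `index_sequence` are made up for the example.

```python
from collections import Counter
from itertools import product
import string

ALPHABET = string.ascii_lowercase  # 26 lowercase English letters
MAX_LEN = 7                        # longest word length mentioned in the question

# Dictionary of all 26 * 26 = 676 possible character bigrams, in a fixed order.
BIGRAMS = ["".join(p) for p in product(ALPHABET, repeat=2)]
BIGRAM_INDEX = {bg: i for i, bg in enumerate(BIGRAMS)}


def bag_of_bigrams(word):
    """Option 1: count vector of length N = dictionary size (here 676)."""
    counts = Counter(word[i:i + 2] for i in range(len(word) - 1))
    vec = [0] * len(BIGRAMS)
    for bg, n in counts.items():
        vec[BIGRAM_INDEX[bg]] = n
    return vec


def index_sequence(word, max_len=MAX_LEN):
    """Option 2: letter indices (1..26), padded on the right with zeros."""
    ids = [ALPHABET.index(ch) + 1 for ch in word]
    return ids + [0] * (max_len - len(ids))


print(sum(bag_of_bigrams("cool")))  # a 4-letter word contains 3 bigrams
print(index_sequence("cool"))       # [3, 15, 15, 12, 0, 0, 0]
```

The first vector suits Naive Bayes, an SVM or an MLP. The second is meant to go through an embedding layer into a convolutional or recurrent network, since the raw index values carry no numeric meaning by themselves.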