What neural-network project involving text (linguistics or a related field) could a school student implement?

Python

Alexander Fedorov, 2017-10-03 23:25:53

Greetings, colleagues.
The kids are in school, and one of them wants to take on something "technical" as a project.
They know how to learn and enjoy it (and can code a little in Python), so I would like to suggest that they dig into neural networks.
I have collected introductory material and links to literature for them to study.
Please advise how to start such training (what task to take as a project), preferably something involving text, since the other child is fond of philology.
I would also be grateful for any links to decent tutorials that a 10th-grade student (with a good level of mathematics) can understand.

5 answers

Vyacheslav Shindin, 2017-10-04
@pro_co_ru

You could try something like predicting the grade for an essay from the author's age (grade level) and the text of the essay itself.
To make it even more interesting, you could specialize the neural network to predict grades only for essays on Tolstoy's War and Peace. Just right for 10th grade.
Admittedly, you will need to find a large amount of training data somewhere: essays with a range of grades, from failing ones to excellent ones.

xdgadd, 2017-10-04
@xdgadd

First, explain to the students what machine learning is and how it works. It would be wise to start with the simplest case, linear regression and gradient descent, then move on to classification and logistic regression, and explain why linear models do not always (in fact, almost never) cope. Next, cover ordinary fully connected networks and better optimization methods (SGD, momentum, etc.).
After that, your students will be ready for convolutional and recurrent networks. Along the way you can cover text representations and word embeddings (word2vec, bag-of-words, tf-idf, etc.).
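To illustrate the very first step (linear regression trained by gradient descent), here is a minimal sketch in plain Python. The toy data, the learning rate `lr`, and the number of steps are assumptions chosen for the example; they do not come from any of the course materials.

```python
# Minimal sketch: fit y = w*x + b by gradient descent on mean squared error.
# The data, learning rate, and step count are illustrative assumptions.

def fit_line(xs, ys, lr=0.05, steps=2000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2) with respect to w and b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # toy data generated from y = 2x + 1

w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```

Students can try changing `lr`: if it is too large, the loss diverges; if it is too small, training takes many more steps. That makes a natural lead-in to the better optimizers.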
Links:
1) https://github.com/goto-ru/Basic_ML - the tasks are designed for school students in grades 10-11 and first- and second-year university students.
2) word2vec
3) RNN labs: 1, 2.
4) karpathy.github.io/2015/05/21/rnn-effectiveness - a very clear explanation of recurrent networks.
5) CNN in NLP: 1, 2, 3.
6) https://distill.pub/ - complex things explained in simple language.

Vladimir Olohtonov, 2017-10-04
@sgjurano

Of the tasks I know, the easiest to solve is detecting the language of a document: compare its character frequencies (using MSE, mean squared error) with those known from a training corpus.
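A rough sketch of how that could look in Python. The two tiny "training corpora" and the function names (`char_profile`, `mse`, `detect_language`) are made up for illustration; a real project would build the profiles from much larger texts per language.

```python
from collections import Counter

def char_profile(text):
    """Relative frequency of each letter in the text (case-insensitive)."""
    letters = [c for c in text.lower() if c.isalpha()]
    if not letters:
        return {}
    total = len(letters)
    return {c: n / total for c, n in Counter(letters).items()}

def mse(p, q, alphabet):
    """Mean squared error between two frequency profiles over a shared alphabet."""
    return sum((p.get(c, 0.0) - q.get(c, 0.0)) ** 2 for c in alphabet) / len(alphabet)

def detect_language(text, profiles):
    """Return the language whose profile has the smallest MSE to the text's profile."""
    probe = char_profile(text)
    # Use one alphabet for every comparison so the errors are comparable
    alphabet = set(probe).union(*profiles.values())
    return min(profiles, key=lambda lang: mse(probe, profiles[lang], alphabet))

# Toy "training corpus" (an assumption for the example; far too small for real use)
corpus = {
    "english": "the quick brown fox jumps over the lazy dog and then runs away through the woods",
    "german": "der schnelle braune fuchs springt und der hund liegt faul in der sonne vor dem haus",
}
profiles = {lang: char_profile(text) for lang, text in corpus.items()}

print(detect_language("the dog runs through the woods", profiles))
```

With profiles this small the result on short inputs is unreliable, which is itself a useful thing for students to discover and measure.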

evrog, 2017-10-04
@evrog

My first-year students analyze spam; there are many examples on the Internet. They seem to understand it.
You can also download jokes and compare them with, for example, random excerpts from literature and news. They can try downloading the texts and cutting out random snippets themselves. Along the way, they will get a taste of all the hardships of linguistic work :)
In 11th grade, if they haven't run away by then, you can introduce them to word embeddings and feed the neural networks not just keywords but words "with meaning" (that is, vectors).
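For the keyword stage, a minimal sketch of a spam classifier might look like this. I am assuming a naive Bayes model with add-one smoothing here, which is just one common baseline and not necessarily what my students use; the four training messages are invented for the example.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"\w+", text.lower())

def train(examples):
    """examples: list of (text, label). Returns per-label word counts and document counts."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    return word_counts, doc_counts

def classify(text, word_counts, doc_counts):
    """Pick the label with the highest log-probability under naive Bayes."""
    vocab = set().union(*word_counts.values())
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        score = math.log(doc_counts[label] / total_docs)  # log prior
        denom = sum(counts.values()) + len(vocab)          # add-one smoothing
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in training
                score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented toy data (an assumption for the example)
examples = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting moved to monday", "ham"),
    ("please review the project report", "ham"),
]
word_counts, doc_counts = train(examples)
print(classify("free prize money", word_counts, doc_counts))
```

The same code works for jokes versus literature or news: only the labels and the texts change.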

xmoonlight, 2017-10-04
@xmoonlight

"I would like to suggest that they dig towards neural networks."

Do you HEAR yourself?!
PS: neural networks are formulas from higher (university-level) mathematics.
Project topic: determining all possible characteristics of a word: part of speech, number, case, etc.