Algorithms
oxaoo, 2017-03-10 01:24:26

What algorithms exist for classifying data domains?

What algorithms exist for classifying data into domains? By a domain I mean a finite set of properties, each of which has a weight.
One application of such an algorithm is classifying sentences in a text. For example, a sentence like this one can be assigned to the "time" domain:

When was the last time Vesuvius erupted?

And the following sentence to the "geographical" domain:
Dublin is the capital of Ireland

The domain of the first example can be determined from the subordinating conjunction "when", which belongs to the temporal category, and from the phrase "last time".
Note that the same property can belong to several domains and have a completely different weight in each of them. The input is a set of properties, and a classifier has to decide which domain it belongs to, for example based on the total weight of the properties, although it would be wrong to rely on this parameter alone.
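To make the "total weight" baseline concrete, here is a minimal sketch in Java. It is not an existing library or a recommended final solution: the class `DomainScorer`, its methods, and all weights are hypothetical and chosen only for illustration. It sums, per domain, the weights of the input properties that the domain knows and picks the domain with the largest sum.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names and weights are illustrative assumptions.
class DomainScorer {

    // domain name -> (property -> weight of that property inside this domain)
    private final Map<String, Map<String, Double>> domains = new LinkedHashMap<>();

    void addDomain(String name, Map<String, Double> weights) {
        domains.put(name, weights);
    }

    // Sum of the weights of the input properties known to the given domain.
    double score(String domain, List<String> properties) {
        Map<String, Double> weights = domains.getOrDefault(domain, Map.of());
        double total = 0.0;
        for (String property : properties) {
            total += weights.getOrDefault(property, 0.0);
        }
        return total;
    }

    // Returns the highest-scoring domain, or null when no property matches.
    String classify(List<String> properties) {
        String best = null;
        double bestScore = 0.0;
        for (String domain : domains.keySet()) {
            double s = score(domain, properties);
            if (s > bestScore) {
                bestScore = s;
                best = domain;
            }
        }
        return best;
    }

    // Two toy domains; "erupted" appears in both with different weights.
    static DomainScorer example() {
        DomainScorer scorer = new DomainScorer();
        scorer.addDomain("time",
                Map.of("when", 0.6, "last time", 0.3, "erupted", 0.1));
        scorer.addDomain("geographical",
                Map.of("capital", 0.6, "country name", 0.3, "erupted", 0.2));
        return scorer;
    }

    public static void main(String[] args) {
        DomainScorer scorer = example();
        // Properties extracted from "When was the last time Vesuvius erupted?"
        System.out.println(scorer.classify(List.of("when", "last time", "erupted"))); // prints "time"
    }
}
```

With hash-map lookups this costs roughly O(k·n) for k domains and n input properties. As the question already notes, the summed weight alone is a weak criterion, so this is only a starting point to compare other approaches against.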
Algorithm requirements. Ideally its complexity should not exceed O(kn^2), where k is the number of domains and n is the size of the input set of properties. The number of domains is not expected to exceed 20, each with no more than 7 properties.
Unfortunately, I have no large training data set, so supervised learning algorithms (e.g. SVM) are ruled out straight away. I am leaning towards options such as Tikhonov regularization. Perhaps there are other approaches.
I would like an algorithm whose software implementation is not very resource-intensive; ideally there would already be a ready-made solution (preferably in Java).
