Clustering
ironalibay, 2014-02-17 15:00:47

How to isolate product features from description text?

Hey!
We receive products through an API as part of an affiliate program for online stores. The products range from baby strollers to telephones. As a rule, the same product comes from different sellers, and each seller fills in the product information in their own way. Most often we come across something like this:
"PackageDimensions: Height: 60, Length: 63, Weight: 35, Width: 20",
or
"Screen 15.6" (1366x768) HD LED, glossy / AMD Quad-Core A4-5000M (1.5 GHz) / RAM 4 GB / HDD 500 GB".
Are there any ways, methods, or algorithms to convert this plain-text data into some kind of key-value representation?

1 answer(s)
xandox, 2014-02-17
@ironalibay

There are unlikely to be ready-made algorithms for this (more precisely, working ones do exist, but you are unlikely to find open ones).
One way or another, you will need machine learning :)
I would suggest something like this: first, find the separator (a comma or a slash; others probably exist), split the string, and then for each substring determine which characteristic it describes and what its value is. The task is not trivial, but it is very interesting :)
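
To make the split-then-classify idea concrete, here is a minimal Python sketch. It is only an illustration under some assumptions: the separator list, the `KNOWN_KEYS` keywords, and the function names are all hypothetical, and the hand-written rules in `classify` are a stand-in for the machine-learning step, which a real system would train on labeled product descriptions.

```python
import re

# Candidate separators in priority order (an assumption; real feeds may use others).
SEPARATORS = [" / ", ";", ","]

# Hypothetical keyword list standing in for a trained classifier.
KNOWN_KEYS = ("Screen", "RAM", "HDD")


def split_segments(text):
    """Split on the first candidate separator that occurs in the text."""
    for sep in SEPARATORS:
        if sep in text:
            return [part.strip() for part in text.split(sep) if part.strip()]
    return [text.strip()]


def classify(segment):
    """Return a (key, value) pair for one segment, or ("unknown", segment)."""
    # Explicit "Key: value" form; keep only the label nearest the value.
    if ":" in segment:
        labels = [p.strip() for p in segment.split(":")]
        return labels[-2], labels[-1]
    # Keyword-prefixed form such as "RAM 4 GB".
    for key in KNOWN_KEYS:
        match = re.match(rf"{key}\s+(.+)", segment)
        if match:
            return key, match.group(1)
    return "unknown", segment


def extract_features(text):
    """Split a description and classify every segment."""
    return [classify(seg) for seg in split_segments(text)]
```

On the two strings from the question, `extract_features` yields `Height`, `Length`, `Weight`, and `Width` pairs for the first, and `Screen`, `RAM`, and `HDD` pairs for the second, while the processor segment comes back as `unknown`. Segments like that one are exactly where hand-written rules run out and a learned classifier would have to take over.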
