How to isolate product features from description text?
Hey!
We receive products through an API as part of an affiliate program for online stores. Products range from baby strollers to phones. They usually come from different sellers, and each seller fills in the product information in their own way. Most often we come across something like this:
"PackageDimensions: Height: 60, Length: 63, Weight: 35, Width: 20",
or
"Screen 15.6" (1366x768) HD LED, glossy / AMD Quad-Core A4-5000M (1.5 GHz) / RAM 4 GB / HDD 500 GB".
Are there any methods or algorithms for converting this plain-text data into some kind of key-value representation?
You are unlikely to find ready-made algorithms (more precisely, working ones already exist, but you are unlikely to find open ones).
One way or another, you will need machine learning :)
I would suggest something like this: first, find the separator (a comma or a slash; others probably exist), split the string, and for each substring determine which characteristic it describes and what its value is. The task itself is not trivial, but it is very interesting :)
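Before any machine learning, the split-and-classify idea can be tried with plain rules as a baseline. The sketch below is only that: a rule-based stand-in, not a learned model. The names KNOWN_KEYS, pick_separator and parse_features are hypothetical, and the keyword list is an assumption made up to fit the two sample strings from the question.

```python
import re

# Hypothetical keyword list (an assumption); in practice it would be
# curated per product category or replaced by a trained classifier.
KNOWN_KEYS = {"screen", "ram", "hdd"}


def pick_separator(text):
    # Prefer '/' when present, since commas often occur inside one feature.
    return "/" if "/" in text else ","


def parse_features(text):
    features = {}
    unknown = []
    for segment in text.split(pick_separator(text)):
        segment = segment.strip()
        if not segment:
            continue
        # Case 1: explicit "Key: value". An optional group prefix such as
        # "PackageDimensions:" is dropped.
        match = re.match(r"^(?:\w+:\s*)?(\w+):\s*(.+)$", segment)
        if match:
            features[match.group(1).lower()] = match.group(2).strip()
            continue
        # Case 2: the first word is a known attribute name.
        head, _, rest = segment.partition(" ")
        if head.lower() in KNOWN_KEYS and rest:
            features[head.lower()] = rest.strip()
        else:
            # Anything unrecognized is kept for later review or for a model.
            unknown.append(segment)
    if unknown:
        features["unknown"] = unknown
    return features


dims = parse_features(
    "PackageDimensions: Height: 60, Length: 63, Weight: 35, Width: 20"
)
laptop = parse_features(
    'Screen 15.6" (1366x768) HD LED, glossy / AMD Quad-Core A4-5000M '
    "(1.5 GHz) / RAM 4 GB / HDD 500 GB"
)
```

With the sample strings, the dimensions come out as separate keys, while the processor substring lands under "unknown" because nothing in it names the attribute. Substrings like that are where a classifier trained on labeled examples would be needed.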