Parsing
Vasily Vasilyev, 2021-03-31 22:39:34

How to check whether a string occurs in a large CSV file?

There is a CSV file of 50 MB+ and a list of keys (a list of strings). I need to determine, as cheaply as possible, whether each key occurs in the CSV. The CSV is hosted on a remote server (GitHub) and is updated regularly.
Example:
There is a key "O. Henry". You need to determine if there is at least one occurrence of this key in the csv file.


2 answers
Igor Makhov, 2021-04-01
@Igorgro

Well, there is only one adequate solution: stop using CSV for things it was not meant for, learn any SQL database, and use that instead.
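
As a rough illustration of that suggestion, here is a minimal sketch using Python's built-in sqlite3 module. It assumes a key must equal a whole CSV cell (not a substring), and the table name `cells` and the helpers `load_cells` and `key_exists` are hypothetical names chosen for this example. Because the source file is updated regularly, the table would have to be reloaded after each update.

```python
import csv
import io
import sqlite3
from typing import TextIO


def load_cells(conn: sqlite3.Connection, csv_file: TextIO) -> None:
    """Store every CSV cell as one row in a hypothetical 'cells' table."""
    conn.execute("CREATE TABLE IF NOT EXISTS cells (value TEXT)")
    reader = csv.reader(csv_file)
    conn.executemany(
        "INSERT INTO cells (value) VALUES (?)",
        ((cell,) for row in reader for cell in row),
    )
    # The index lets each lookup avoid scanning the whole table.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_cells_value ON cells (value)")
    conn.commit()


def key_exists(conn: sqlite3.Connection, key: str) -> bool:
    """Return True if at least one cell is exactly equal to key."""
    row = conn.execute(
        "SELECT EXISTS(SELECT 1 FROM cells WHERE value = ?)", (key,)
    ).fetchone()
    return bool(row[0])


if __name__ == "__main__":
    # Tiny in-memory sample standing in for the real 50 MB file.
    sample = io.StringIO("title,author\nThe Gift of the Magi,O. Henry\n")
    conn = sqlite3.connect(":memory:")
    load_cells(conn, sample)
    print(key_exists(conn, "O. Henry"))
```

If keys may also appear as part of a longer cell, the exact-match query would not be enough, and a substring search or a full-text index would be needed instead.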

Vasily Vasilyev, 2021-04-01
@Basil_Dev

Perhaps it will help someone:
So far I have found one option: parsing the file as a stream with scramjet. It is not lightning fast, but at the current file size it is quite tolerable: with about a hundred keys and ~50 MB of CSV data, processing took less than a minute.
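
The same streaming idea can be sketched without scramjet. The Python example below is not the scramjet API, only an illustration of scanning the file line by line as it downloads. It assumes a key counts as present if it is a substring of some line, that keys never span a line break, and that the file is UTF-8. The function names `find_keys` and `find_keys_in_url` are hypothetical.

```python
import io
import urllib.request
from typing import Iterable


def find_keys(lines: Iterable[str], keys: Iterable[str]) -> set[str]:
    """Return the keys that occur as a substring of at least one line."""
    remaining = set(keys)
    found: set[str] = set()
    for line in lines:
        hits = {key for key in remaining if key in line}
        if hits:
            found |= hits
            remaining -= hits
            if not remaining:
                break  # every key has been seen, so stop reading early
    return found


def find_keys_in_url(url: str, keys: Iterable[str]) -> set[str]:
    """Scan a remote CSV while it downloads, without saving it to disk."""
    with urllib.request.urlopen(url) as response:
        # Decode the byte stream lazily so only one line is held at a time.
        text = io.TextIOWrapper(response, encoding="utf-8", newline="")
        return find_keys(text, keys)
```

Keys that have already been found are dropped from the set of remaining keys, so each later line is checked against fewer strings, and the download is abandoned as soon as all keys are found.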
