O
O
Otakukz172022-01-13 15:27:54
Java
Otakukz17, 2022-01-13 15:27:54

How to find out how long it will take to search for a value in a txt list?

There is a txt list consisting of 4 million values.
It is required to determine if the given value is in this list.
How to understand how long it will take to search for a given value and what is the most efficient way to parse a file?

Answer the question

In order to leave comments, you need to log in

3 answer(s)
R
Roman Kitaev, 2022-01-13
@deliro

How to understand how long it will take to search for a given value and what is the most efficient way to parse a file?

What is known about the file besides what it is?
If the values ​​are in random order, then the complexity will always be O (N) in the worst case.
If the values ​​\u200b\u200bin it are sorted by some attribute and there is an explicit separator (carriage return, for example), then you can do an analogue of binary search (for O (log N) ). Well, that is, we poke in the middle of the file (seek(file_size_in_bytes / 2)), look forward to the nearest delimiter (for example, carriage return), read to the next delimiter (we get a string), compare well and then as usual. But you need to take into account that if this is an HDD, then the movement of the disk head is not free and the random seek will be slower than the sequential one, so comparing the "number of lines per second" in the forehead will not work.
If the RAM allows, then you can subtract the entire file there and throw it into a structure that is more suitable. Be it a tree, a sorted array, or a hash table. If only the answer to the question "is there?" is required, then a hash table will be the most performant option with O(1) complexity
. exists" and never scan if it is "definitely not" (actually, these answers are given by the Bloom filter)
If the RAM does not allow, but the file can be changed, then you can sort it once and then binary search
Yes, you can even upload the contents of the file to sqlite and solve all problems, let the driver figure out what to store in RAM

J
Jacen11, 2022-01-13
@Jacen11

f54446a54f3d52d20e95ba5c5495644f.png
b911bcca9ca9f9d8b0fa781a49118553.png
https://habr.com/ru/post/188010/

D
Dmtm, 2022-01-14
@Dmtm

this is a substring search task and there are fast specialized algorithms for it
https://ru.wikipedia.org/wiki/%D0%9F%D0%BE%D0%B8%D...

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question