M
M
m1t9.(;⌣̀_⌣́)2019-10-02 22:14:47
Mathematics
m1t9.(;⌣̀_⌣́), 2019-10-02 22:14:47

What is the fastest algorithm for finding the maximum value in a large file?

Hello.
Let's say there is a file that contains a number on each line. In this case, the file itself can be extremely large (for example, several hundred gigabytes in size). What is the fastest way to find the maximum (minimum) value in a given file?
Thank you.

Answer the question

In order to leave comments, you need to log in

5 answer(s)
R
Rsa97, 2019-10-02
@Rsa97

Looking at all values. Your K.O.

E
Evgeny Samsonov, 2019-10-02
@bitniks

MapReduce

D
Dmitry, 2019-10-02
@dimoff66

The main optimization, provided that the numbers can be of any size, is to read the dimension of the numbers, storing the numbers with the largest at the moment in the array, or to read their positions if the file is available for reading completely and not sequentially. As soon as a number of greater bit depth is found, all other previously saved data can be reset without comparison.
If there are too many numbers of the highest dimension to store them in the RAM, then you can loop through them and leave the largest one ... And so on.

A
Alexey, 2019-10-02
@le2xx

I think the best option is to go through the file once using reduce, so the algorithm's running time will be linear to the number of elements. O(n). Sorting and other methods require many passes through the list, which will require more time for the algorithm to run. It's probably better to loop though, since reduce is recursive in nature, and you have +100500 items in the list, which will result in a huge memory consumption.

X
xmoonlight, 2019-10-03
@xmoonlight

1. Divide the file by offset, adjusted for the end of the line, into as many parts as the number of threads we will use to search for values ​​in asynchronous mode.
2. Run the required number of threads asynchronously: 1 (or several!) threads for each part of the file.
3. After everything works out, we compare the maximum and minimum values ​​​​between threads.

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question