Analytics
kr_ilya, 2019-09-15 10:47:14

Analyzing an array of integers?

Good afternoon.
I need to analyze an array of numbers in the range 0-1000000 and determine the following:
- how each next number behaves relative to the current one: does it increase or decrease;
- which ranges of values contain the most numbers;
- between which ranges the numbers most often move (for example, from 0-100k to 400k-500k);
- and similar statistics.
Are there ready-made services for this, so that I don't have to reinvent the wheel?


2 answer(s)
dmshar, 2019-09-15
@dmshar

You don't need to invent anything. Taking the questions in order:
1. Compute the differences (next number minus previous number), then build a histogram of the resulting series of differences.
2. Build a histogram of the original series.
3. Divide the range of possible values into the fragments you need. Build a two-dimensional array of counts, indexed by the pair (fragment number of the previous number, fragment number of the next number), where each element is how many times that transition occurs. Build a heat map of the resulting array.
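The three steps above can be sketched in plain Python without any external service. This is only an illustration under assumptions: the bucket width of 100k is taken from the example in the question, the names `bin_index` and `analyze` are made up here, and the differences are summarized by direction (up, down, same) rather than plotted as a full histogram.

```python
from collections import Counter

BIN_WIDTH = 100_000    # assumed bucket size, taken from the "0-100k" example
MAX_VALUE = 1_000_000  # upper bound of the values, as stated in the question


def bin_index(value, width=BIN_WIDTH, max_value=MAX_VALUE):
    """Map a value in [0, max_value] to a bucket number.

    The value max_value itself is placed in the last bucket.
    """
    n_bins = max_value // width
    return min(value // width, n_bins - 1)


def analyze(values, width=BIN_WIDTH, max_value=MAX_VALUE):
    n_bins = max_value // width

    # Step 1: differences (next minus previous) and their direction.
    diffs = [b - a for a, b in zip(values, values[1:])]
    direction = Counter(
        "up" if d > 0 else "down" if d < 0 else "same" for d in diffs
    )

    # Step 2: histogram of the original series.
    histogram = [0] * n_bins
    for v in values:
        histogram[bin_index(v, width, max_value)] += 1

    # Step 3: transitions[i][j] counts moves from bucket i to bucket j.
    transitions = [[0] * n_bins for _ in range(n_bins)]
    for a, b in zip(values, values[1:]):
        i = bin_index(a, width, max_value)
        j = bin_index(b, width, max_value)
        transitions[i][j] += 1

    return diffs, direction, histogram, transitions


# Hypothetical sample data, only to show the shape of the results.
sample = [50_000, 450_000, 30_000, 420_000, 420_000, 990_000]
diffs, direction, histogram, transitions = analyze(sample)
print(direction["up"], direction["down"], direction["same"])  # 3 1 1
print(histogram)          # [2, 0, 0, 0, 3, 0, 0, 0, 0, 1]
print(transitions[0][4])  # 2 moves from 0-100k to 400k-500k
```

The `transitions` table is exactly the array a heat map would be drawn from; any plotting tool that accepts a 2D grid of numbers should be able to display it.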
In total, you need two kinds of tools:
1. Histogram plotting, which is available in practically any tool: from Excel to SPSS, from MATLAB to SAS, from R to Tableau.
2. Heat map plotting, which is just as widely available. A short list of possible tools is here, for example:
https://ru.wikipedia.org/wiki/Heat_Map

#, 2019-09-15
@mindtester

(This started out as a comment.)
Let's start with:

how the next number behaves in relation to the current one: increases or decreases
A priori you have N-1 answers (999 999), and the graph... something tells me that this is a graph of the derivative. You don't need an analytical form to plot it; a pass over the raw data is more than enough. The rest goes in the same spirit.
- in which range of values there are more numbers
This is called a histogram, which means the answer is easy to google.
- from which range to which the numbers most often move (for example: from 0-100k to 400k-500k, etc.)
This is solvable in one pass, just like the very first point. Essentially it is a kind of histogram, but for the derivative. Upd: no, I got carried away, it is not a histogram of the derivative. Nevertheless, it is still computed in that same single pass, with the same level of complexity ))
As a result, everything can be packed into one pass, which means there is no sin in reinventing the wheel:
- it will clear your head and help you better understand what you are working with (and whatever else you want);
- you can win on performance (if your coding is good enough, although the question itself does not inspire much optimism about that).
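To illustrate the single-pass idea, here is a minimal sketch in plain Python that collects all three statistics while reading the data once, so it also works on a stream that does not fit in memory. The function name `single_pass`, the 100k bucket width, and the sample data are assumptions made for this example, not something given in the question.

```python
def single_pass(stream, width=100_000, max_value=1_000_000):
    """Collect direction counts, a histogram and a transition table in one pass.

    `width` is an assumed bucket size; max_value itself goes to the last bucket.
    """
    n_bins = max_value // width
    ups = downs = same = 0
    histogram = [0] * n_bins
    # transitions[i][j] counts moves from bucket i to bucket j.
    transitions = [[0] * n_bins for _ in range(n_bins)]

    prev = None
    prev_bin = None
    for value in stream:
        cur_bin = min(value // width, n_bins - 1)
        histogram[cur_bin] += 1
        if prev is not None:
            # Sign of the "derivative": compare with the previous value.
            if value > prev:
                ups += 1
            elif value < prev:
                downs += 1
            else:
                same += 1
            transitions[prev_bin][cur_bin] += 1
        prev, prev_bin = value, cur_bin

    return {
        "up": ups,
        "down": downs,
        "same": same,
        "histogram": histogram,
        "transitions": transitions,
    }


# Hypothetical data; an iterator is used to show that one pass is enough.
result = single_pass(iter([10, 250_000, 240_000, 900_000, 1_000_000]))
print(result["up"], result["down"], result["same"])  # 3 1 0
print(result["histogram"])  # [1, 0, 2, 0, 0, 0, 0, 0, 0, 2]
```

The memory cost depends only on the number of buckets (here a 10 x 10 table), not on the length of the input.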
