SQL
65520, 2012-10-16 15:33:19

Algorithm for processing information from sensors (noise filtering)

I have the following problem.

I receive information from a number of sensors. The number of sensors is not fixed: they can be added and removed, but each one always has its own unique identifier. Data arrives unevenly in time: a sensor can be silent for a week, then report two values a second apart and then disappear altogether; or a sensor may suddenly appear and start sending data regularly. For simplicity, assume that each sensor reports a value from some fixed set of values, that is, the sensors are discrete. The sensors are also calibrated against each other: if any sensor reports A, then the value really is A. The data from the sensors is written to a log in the format (time, identifier, value).

The task is to filter this information. I need to select ONE entry from the log that I consider to be the current truth. For example, if I receive a fresh record from one sensor saying the value is A, but before that 10 sensors said the value is actually B, then the truth at the moment (!) is B. There can be no value "A and a half": everything is discrete. If other sensors later start confirming that it is A after all, then over time A becomes the truth. At the same time, the task does not reduce to ordinary noise filtering: if one sensor has been repeating every second for a whole week that the value is A, and then five sensors appear that have recently said the value is actually B, then the truth immediately becomes B, despite all the history. In other words, the number of independent sources also carries weight.

So I end up with some function of two variables: time (how stale the information is) and the number of unique sensors at a given moment. Again, everything is complicated by the fact that I cannot discretize time: I cannot take the data for one hour and draw conclusions from it, because my data arrives unevenly, and the whole point may lie just beyond that interval. What I have worked out so far is that the weight of a record (and in the end I just need to select the record with the largest weight) depends not only on the values in that record itself (its time and value) but also on the values in neighboring records. That is, the table would need to be traversed N times, once for each of the N records. Well, that is what I think; I am not sure.

I tried to explain this as clearly as I could; if anything is unclear, ask and I will answer. The data is written to an SQL database, so you can use SQL terms if that is more convenient. But in general I am interested in the algorithm itself; I can handle the implementation.

PS I do not rule out that there is a well-known algorithm for this...


3 answer(s)
nerudo, 2012-10-16
@nerudo

Start from the median: come up with some kind of weight function that would "score" the distance from the median value together with staleness. From the results obtained, choose the one that maximizes the weight function. How all these algorithms fit into SQL, I have no idea. I would load the arrays (value + timestamp) into memory and work there ;)
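
A weight function of this kind could be sketched in memory as follows. This is only an illustration under stated assumptions, not the exact method suggested above: because the values are discrete, it scores each candidate value directly instead of measuring distance from a median, counts every sensor once (by its latest reading), and discounts that vote by an exponential decay. The name `current_truth` and the `half_life` parameter (in seconds) are hypothetical and would need tuning.

```python
from collections import defaultdict


def current_truth(log, now, half_life=3600.0):
    """Pick the value with the highest decayed, per-sensor weight.

    log: iterable of (time, sensor_id, value), times in seconds.
    half_life: assumed tuning knob; a vote loses half its weight
    every half_life seconds.
    """
    # Keep only the most recent reading of each sensor, so that a sensor
    # repeating itself for a week still counts as one source.
    latest = {}
    for t, sensor_id, value in log:
        if t > now:
            continue  # ignore readings from the future
        if sensor_id not in latest or t > latest[sensor_id][0]:
            latest[sensor_id] = (t, value)

    # Each sensor contributes one vote, discounted by the age of its reading.
    scores = defaultdict(float)
    for t, value in latest.values():
        scores[value] += 0.5 ** ((now - t) / half_life)

    if not scores:
        return None, {}
    best = max(scores, key=scores.get)
    return best, dict(scores)


# One sensor repeats A every second, then five sensors report B.
log = [(t, "s0", "A") for t in range(100)]
log += [(100 + i, f"s{i + 1}", "B") for i in range(5)]
best, scores = current_truth(log, now=105)
```

Here `best` comes out as `"B"`: the hundred readings from `s0` collapse into a single vote, while the five independent sensors contribute five. The log is read once, so no per-record rescan of the table is needed.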

bagyr, 2012-10-16
@bagyr

Is this real-time data or some kind of ready-made table?
> I cannot take the data for one hour
The last N readings? Interpolation?
It feels like the simplest quorum with a handful of heuristics is just right here. Or simple training on some window, choosing the single most plausible sensor.
You can watch the talk on time series from the latest YAC; they look for various spikes, dips, and drops on charts using a heavily simplified cross-correlation analysis, and it may give you some ideas.
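
One way to read the quorum idea is sketched below. It is an assumption-laden illustration: the function name `quorum_value`, the `window` and `quorum` parameters, and the rule "keep the previous truth when no value reaches the quorum" are all hypothetical heuristics chosen for the example.

```python
def quorum_value(log, now, window=600.0, quorum=3, previous=None):
    """Return the value backed by at least `quorum` distinct sensors.

    log: iterable of (time, sensor_id, value), times in seconds.
    Only readings within `window` seconds before `now` may vote, and
    each sensor votes once, with its most recent reading.
    """
    latest = {}
    for t, sensor_id, value in log:
        if now - window <= t <= now:
            if sensor_id not in latest or t >= latest[sensor_id][0]:
                latest[sensor_id] = (t, value)

    # Count distinct sensors per value.
    votes = {}
    for _, value in latest.values():
        votes[value] = votes.get(value, 0) + 1

    if not votes:
        return previous
    winner = max(votes, key=votes.get)
    # Without enough independent confirmations, keep what we believed before.
    return winner if votes[winner] >= quorum else previous


log = [(0, "s1", "A"), (10, "s2", "B"), (20, "s3", "B"),
       (30, "s4", "B"), (31, "s1", "A")]
truth = quorum_value(log, now=40, previous="A")
```

In this run three distinct sensors back B and one backs A, so `truth` is `"B"`. The hard `window` cutoff is a simplification; with data arriving this unevenly, a smooth age-based discount may be a better fit than a fixed interval.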

65520, 2012-10-16
@65520

Real time. The last N readings may also be unrepresentative, because there may be N records in a row from one sensor, and just beyond that limit there will be 3 records, all from different sensors; in the end those should carry more weight.
Training? Unlikely. Of course, the coefficients in the formula will have to be tuned. OK, I'll take a look, thanks!
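
The difference between "the last N rows" and "the latest row per sensor" can be shown with a small in-memory SQLite sketch. The table name `sensor_log` and the columns `ts`, `sensor_id`, and `value` are assumptions made for the example; the real schema is not given in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sensor_log (ts INTEGER, sensor_id TEXT, value TEXT)")

# Three different sensors report B, then one sensor reports A five times.
rows = [(1, "s1", "B"), (2, "s2", "B"), (3, "s3", "B")]
rows += [(10 + i, "s0", "A") for i in range(5)]
conn.executemany("INSERT INTO sensor_log VALUES (?, ?, ?)", rows)

# Naive approach: the last 5 rows all come from the same sensor.
last_n = conn.execute(
    "SELECT value FROM sensor_log ORDER BY ts DESC LIMIT 5"
).fetchall()

# Alternative: keep only each sensor's newest row, then count sensors per value.
counts = conn.execute(
    """
    SELECT l.value, COUNT(*) AS sensors
    FROM sensor_log AS l
    WHERE l.ts = (SELECT MAX(m.ts) FROM sensor_log AS m
                  WHERE m.sensor_id = l.sensor_id)
    GROUP BY l.value
    ORDER BY sensors DESC
    """
).fetchall()
```

With this data `last_n` contains only A, while `counts` ranks B first with three sensors against one for A. The query ignores the age of the readings, and a sensor with two rows sharing the same newest `ts` would be counted twice, so it is a starting point rather than a complete filter.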
