Python
Sardar, 2014-01-22 19:15:31

A fast-write service/database for logs?

Good day. I may be reinventing the wheel, but I need to build a service that collects
and processes frequent events. Events arrive as REST/POST requests carrying
JSON data. Periodically, a group of worker tasks goes through the collected
events and runs calculations on them.
I need a service/database optimized for logging. Ideally, such a
service would be one large canvas with two
cursors on it: one cursor only writes, and behind it another cursor only reads. This
might look like simple file handling, but there are requirements:
* The write cursor must be extremely fast, preferably doing no real I/O and
returning control immediately. Occasionally losing events is not a problem. The database
must regularly flush the accumulated data to disk in some asynchronous way.
* Literally hundreds of processes may write events at the same time.
Because the store is append-only, no locks are acceptable.
* Reads happen in large blocks, and the data is deleted automatically
right after it is read. No more than a dozen processes read at the same time.
Each block must always be read by a single process, or there must be a mechanism
by which two processes can find out that they have read overlapping data.
* There can be a lot of data. Looking ahead, if such a service
can be clustered, that would be ideal. Otherwise, reads will have to happen
more often.
* Since there is a lot of data, there is no point in keeping it in memory,
except for the write buffer.
Has anyone faced a similar problem and can advise how/where to
store this canvas of logs?
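
To make the requirements above concrete, here is a minimal Python sketch of the two-cursor idea. It illustrates the technique and does not stand in for any of the products mentioned in this thread. Everything named here (`BufferedEventLog`, `claim_segment`, `read_and_delete`, the `.seg` and `.claimed` suffixes, one-JSON-object-per-line segment files) is hypothetical. The sketch assumes each writer process keeps its own in-memory buffer and periodically flushes it to a uniquely named segment file, so writers never share a lock. A reader claims a whole segment with an atomic rename, so a segment is read by only one process.

```python
import json
import os
import queue
import threading
import time
import uuid


class BufferedEventLog:
    """Hypothetical append-only log: write() only enqueues, a thread flushes."""

    def __init__(self, directory, flush_interval=1.0, max_buffered=100_000):
        self.directory = directory
        self.flush_interval = flush_interval
        self._queue = queue.Queue(maxsize=max_buffered)
        self._stop = threading.Event()
        os.makedirs(directory, exist_ok=True)
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def write(self, event):
        """Return immediately without disk I/O; drop the event if the buffer is full."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            return False  # occasional loss is acceptable per the requirements

    def flush(self):
        """Drain the buffer into one new segment file; return its path or None."""
        events = []
        while True:
            try:
                events.append(self._queue.get_nowait())
            except queue.Empty:
                break
        if not events:
            return None
        # Unique name per flush, so concurrent writer processes never collide.
        name = f"{time.time_ns()}-{uuid.uuid4().hex}"
        tmp = os.path.join(self.directory, name + ".tmp")
        final = os.path.join(self.directory, name + ".seg")
        with open(tmp, "w", encoding="utf-8") as f:
            for event in events:
                f.write(json.dumps(event) + "\n")
        os.replace(tmp, final)  # readers only ever see complete segments
        return final

    def _run(self):
        # Flush every flush_interval seconds until close() is called.
        while not self._stop.wait(self.flush_interval):
            self.flush()

    def close(self):
        self._stop.set()
        self._thread.join()
        self.flush()


def claim_segment(directory, reader_id):
    """Claim one segment via atomic rename; return the claimed path or None."""
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".seg"):
            continue
        src = os.path.join(directory, name)
        dst = os.path.join(directory, f"{name}.{reader_id}.claimed")
        try:
            os.rename(src, dst)
        except FileNotFoundError:
            continue  # another reader claimed this segment first
        return dst
    return None


def read_and_delete(path):
    """Read a claimed segment in one block, then delete it."""
    with open(path, encoding="utf-8") as f:
        events = [json.loads(line) for line in f]
    os.remove(path)
    return events
```

In this sketch, events still in the buffer are lost if the process crashes, which matches the "infrequent loss is not a problem" requirement. Clustering is out of scope here.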

6 answer(s)
egor_nullptr, 2014-01-22
@Sardar

I can recommend Scribe.

sajgak, 2014-01-22
@sajgak

Elasticsearch + Logstash.
If the message rate exceeds 10k per second, it is advisable to put a queue in front.
Pros: Elasticsearch is a search index built on top of Apache Lucene, with the corresponding search and filtering capabilities. Logstash is a service that filters the data passing through it using a dynamically generated set of filters. It is very handy for enriching or stripping down messages.
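
A rough Python sketch of what "a queue in front" could look like in-process: producers put events on a bounded queue, and a consumer drains them in batches before handing them to the indexer. The helper `drain_batch` and its `max_items` and `max_wait` parameters are made up for illustration and do not belong to Elasticsearch or Logstash. In practice the queue would more likely be a separate message broker.

```python
import queue
import time


def drain_batch(q, max_items=500, max_wait=0.5):
    """Collect up to max_items from q, waiting at most max_wait seconds in total."""
    batch = []
    deadline = time.monotonic() + max_wait
    while len(batch) < max_items:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(q.get(timeout=remaining))
        except queue.Empty:
            break  # nothing more arrived before the deadline
    return batch


# Usage: a bounded queue absorbs bursts; the consumer forwards whole batches.
events = queue.Queue(maxsize=10_000)
for i in range(7):
    events.put_nowait({"id": i})
first = drain_batch(events, max_items=5, max_wait=0.1)
```

Batching keeps the number of requests to the downstream index low when the incoming rate spikes.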

Sergey, 2014-01-22
@begemot_sun

You can write your own in Erlang, including a version distributed across several machines. What are your volume/performance requirements?

deep_orange, 2015-12-20
@deep_orange

InfluxDB
