.NET
MilkyCoder, 2015-01-29 18:18:48

How to increase the speed of writing random blocks?

In the code below, if I remove the manipulations with the file cursor, writing a megabyte takes about 20 ms. But if I write the blocks to random positions, a megabyte takes an indecent 800 ms, almost a second, and this despite the fact that I have a fancy SSD. Even shamanism with FileOptions.RandomAccess does not help. The Position changes during sequential writing too, so why does performance drop so much when I set it manually? The difference is 40 times. Please advise something.

        static string path = @"...\ConsoleApplication10\bin\Debug\1.dat";
        static int count = 1;
        static int len = 1024 * 1024;

        static void Test()
        {
            var rnd = new Random();
            var sw = Stopwatch.StartNew();

            var b = new byte[4];
            var fs = new FileStream(path, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.ReadWrite, 4096, FileOptions.RandomAccess | FileOptions.SequentialScan);

            if (fs.Length == 0)
            {
                fs.SetLength(len);
            }
            
            Console.WriteLine("Test started");

            for (var i = 0; i < len / 4; ++i)
            {
                var ind = 20;

                b[0] = (byte)ind;
                b[1] = (byte)(ind >> 8);
                b[2] = (byte)(ind >> 16);
                b[3] = (byte)(ind >> 24);

                //fs.Position = i * 4;
                //fs.Seek(i * 4, SeekOrigin.Begin);
                fs.Seek(rnd.Next(0, len / 4) * 4, SeekOrigin.Begin);

                fs.Write(b, 0, 4);
            }

            fs.Close();

            Console.WriteLine("Test end time - " + sw.ElapsedMilliseconds);
        }

5 answers
mayorovp, 2015-01-30
@MilkyCoder

SQLite uses a WAL (write-ahead log) to speed up writing to the database. You can do the same, turning random writes into sequential ones.
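
A minimal sketch of that idea (the WriteAheadLog helper and its record format below are my own illustration, not SQLite's actual code): every random write is appended sequentially to a log file, and a later Checkpoint() replays the log into the data file in one batch, sorted by offset.

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical WAL helper: random writes become sequential appends to a log.
class WriteAheadLog : IDisposable
{
    readonly FileStream log;
    readonly string dataPath;

    public WriteAheadLog(string logPath, string dataPath)
    {
        this.dataPath = dataPath;
        log = new FileStream(logPath, FileMode.Create, FileAccess.ReadWrite);
    }

    // Sequential append instead of a random write into the data file.
    public void Write(long offset, int value)
    {
        log.Write(BitConverter.GetBytes(offset), 0, 8);
        log.Write(BitConverter.GetBytes(value), 0, 4);
    }

    // Replay the accumulated records, sorted by offset, so the data file
    // is written in ascending order instead of jumping around.
    public void Checkpoint()
    {
        log.Flush();
        log.Seek(0, SeekOrigin.Begin);

        var records = new List<KeyValuePair<long, int>>();
        var buf = new byte[12];
        while (log.Read(buf, 0, 12) == 12)
            records.Add(new KeyValuePair<long, int>(
                BitConverter.ToInt64(buf, 0), BitConverter.ToInt32(buf, 8)));

        using (var data = new FileStream(dataPath, FileMode.OpenOrCreate, FileAccess.Write))
        {
            foreach (var rec in records.OrderBy(r => r.Key))
            {
                data.Seek(rec.Key, SeekOrigin.Begin);
                data.Write(BitConverter.GetBytes(rec.Value), 0, 4);
            }
        }

        log.SetLength(0);
    }

    public void Dispose()
    {
        log.Dispose();
    }
}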

Alexey Nemiro, 2015-01-29
@AlekseyNemiro

You can try using MemoryMappedFile (from System.IO.MemoryMappedFiles):

string path = @"1.dat";
int len = 1024 * 1024;

var rnd = new Random();
var sw = Stopwatch.StartNew();

var b = new byte[4];

Console.WriteLine("Test started");

using (var map = MemoryMappedFile.CreateFromFile(path, FileMode.Create, path, len))
{
  using (var accessor = map.CreateViewAccessor())
  {
    for (var i = 0; i < len / 4; ++i)
    {
      b[0] = (byte)rnd.Next(0, 255);
      b[1] = (byte)rnd.Next(0, 255);
      b[2] = (byte)rnd.Next(0, 255);
      b[3] = (byte)rnd.Next(0, 255);

      accessor.WriteArray(rnd.Next(0, len / 4) * 4, b, 0, 4);
    }
  }
}
      
Console.WriteLine("Test end time - " + sw.ElapsedMilliseconds);
Console.ReadKey();

Denis Antonenko, 2015-01-29
@dabrahabra

You can try to bypass the system cache using the WriteThrough flag (see MSDN).
BUT! An SSD gives huge performance on random reads, but it is not nearly as friendly with random writes.
Yes, compared to an HDD it will be faster, but you will pay with its lifetime. Here's why (I don't guarantee 100% accuracy): when you force the disk into random writes, it actually operates on large blocks internally, even if you write byte by byte.
If I were you, I would rely on caching: read all the data into memory, modify it, then write it back to disk.
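
A rough sketch of that last suggestion (my own illustration; it reuses path and len from the question's Test() method and assumes 1.dat already exists at its full length):

// All random access happens against a byte array in RAM; the disk only
// sees one sequential read and one sequential write.
var buffer = File.ReadAllBytes(path);

var rnd = new Random();
var ind = 20;

for (var i = 0; i < len / 4; ++i)
{
    var offset = rnd.Next(0, len / 4) * 4;   // random position, but in memory

    buffer[offset]     = (byte)ind;
    buffer[offset + 1] = (byte)(ind >> 8);
    buffer[offset + 2] = (byte)(ind >> 16);
    buffer[offset + 3] = (byte)(ind >> 24);
}

File.WriteAllBytes(path, buffer);            // one sequential write-back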

Vitaly Pukhov, 2015-01-30
@Neuroware

I don’t know the exact nature of the task, but if you need high speed, you can mount a RAM disk of the required size and write into it at about 8 GB per second with almost zero latency (since it lives entirely in RAM). Once you are done working with the file, it can be dumped to disk (sequentially this time, and as fast as the disk allows).
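
A tiny sketch of that workflow (the drive letters are purely an example and assume a RAM disk is already mounted at R:\):

// Hypothetical paths: R:\ is an already mounted RAM disk,
// D:\data\1.dat is the final location on the real disk.
var ramPath = @"R:\1.dat";
var diskPath = @"D:\data\1.dat";

// Do all the random-access work against the RAM disk...
using (var fs = new FileStream(ramPath, FileMode.OpenOrCreate, FileAccess.ReadWrite))
{
    fs.SetLength(1024 * 1024);
    // ...random Seek()/Write() calls go here, at RAM speed...
}

// ...then dump the finished file to the real disk in one sequential copy.
File.Copy(ramPath, diskPath, true);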

Oxoron, 2015-01-30
@Oxoron

Seek() backward is much slower than Seek() forward. Try replacing
fs.Seek(rnd.Next(0, len / 4) * 4, SeekOrigin.Begin);
with
fs.Seek(len - 4 * i, SeekOrigin.Begin);
and you will be quite disappointed.
If you really need to "randomize" random blocks, calculate the probability that a given block should be "randomized", then run through the file sequentially and "randomize" blocks with that probability. In the worst case the algorithm takes about 25 percent longer than a plain sequential run.
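
As I understand it, the pass looks roughly like this (my own sketch, reusing path and len from the question; p = 0.25 is just an example value):

// One forward-only pass: each 4-byte block is overwritten with probability p,
// so Seek() never moves backward. p = 1.0 would touch every block.
var rnd = new Random();
var p = 0.25;
var b = BitConverter.GetBytes(20);

using (var fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
{
    for (var i = 0; i < len / 4; ++i)
    {
        if (rnd.NextDouble() >= p)
            continue;                           // leave this block untouched

        fs.Seek((long)i * 4, SeekOrigin.Begin); // forward-only seek
        fs.Write(b, 0, 4);
    }
}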
