How do I work with an SSD at a low level?


Highload

mitaichik, 2016-04-08 09:19:01

Hello!
There are 500 million records of about 2 KB each, roughly a terabyte of data in total. The main operation is fetching a record by id, and we need to sustain 20,000 reads per second.
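For scale, the figures quoted above work out as follows. This is only a back-of-envelope sketch that assumes every record is exactly 2048 bytes; the variable names are illustrative.

```python
# Back-of-envelope sizing for the numbers quoted in the question.
RECORDS = 500_000_000    # 500 million records
RECORD_SIZE = 2 * 1024   # ~2 KB each, assumed here to be exactly 2048 bytes
READS_PER_SEC = 20_000   # required random reads per second

total_bytes = RECORDS * RECORD_SIZE
read_bandwidth = READS_PER_SEC * RECORD_SIZE

print(f"total data: {total_bytes / 10**12:.3f} TB")
print(f"read bandwidth: {read_bandwidth / 10**6:.2f} MB/s")
```

Under that assumption the data set is a little over one decimal terabyte and the required read bandwidth is about 41 MB/s, so the requirement is mostly about the number of random reads per second rather than raw throughput.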
After some thought, we decided the best approach is to take a dedicated SSD (no OS, nothing else on it) and lay the data out so that a record's address can be computed directly from its id.
When a request comes in, we compute the address from the id, read 2 KB of data (records will be fixed-size), and return it.
This requires low-level work with disks, and I have no idea where to start digging (the application itself is written in Java, but we are considering writing this part in C).
If anyone knows how to do this, please point me in the right direction.
Thanks in advance.


4 answer(s)


Melkij, 2016-04-08
@mitaichik

Hmm. First question: why do you need low-level access at all? Ordinary file-level operations are more than enough.
Take any *nix. Everything is a file, so the problem is already solved: open /dev/sd? for reading and writing with ordinary file operations, move around with fseek, and read and write in 2 KB chunks. Leave the rest of the interaction with the drive to the kernel.
Just work something out with your system administrator so that these operations are allowed without running the whole application as root.
Second question: do you and your colleagues really have enough experience designing and operating DBMSs and file systems to implement the whole journaling layer and to guarantee disaster recovery and data consistency?
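The approach described in this answer (open the device like a file, jump to id × record size, read one fixed-size record) might look roughly like the sketch below. It is written in Python for brevity; `os.pread` and `os.pwrite` are thin wrappers over the POSIX calls of the same name, so the same shape carries over to C. The names `RECORD_SIZE`, `record_offset`, `write_record` and `read_record` are made up for illustration, ids are assumed to be dense and to start at 0, and a temporary file stands in for the real device.

```python
import os
import tempfile

RECORD_SIZE = 2048  # fixed record size in bytes (assumption: exactly 2 KB)

def record_offset(record_id):
    """Byte offset of a record; ids are assumed dense and starting at 0."""
    return record_id * RECORD_SIZE

def write_record(fd, record_id, payload):
    """Pad the payload with zero bytes to RECORD_SIZE and write it at its slot."""
    if len(payload) > RECORD_SIZE:
        raise ValueError("payload larger than the fixed record size")
    os.pwrite(fd, payload.ljust(RECORD_SIZE, b"\0"), record_offset(record_id))

def read_record(fd, record_id):
    """Read exactly one fixed-size record by id with a single positioned read."""
    data = os.pread(fd, RECORD_SIZE, record_offset(record_id))
    if len(data) != RECORD_SIZE:
        raise IOError(f"short read for record {record_id}")
    return data

# Demo on a temporary file. On a real system the path would be the block
# device (for example /dev/sdX, a placeholder), opened with suitable permissions.
with tempfile.NamedTemporaryFile() as tmp:
    fd = os.open(tmp.name, os.O_RDWR)
    try:
        write_record(fd, 0, b"first")
        write_record(fd, 7, b"eighth")
        demo_record = read_record(fd, 7)
    finally:
        os.close(fd)
```

Positioned reads are used here instead of a seek followed by a read because they do not touch the shared file position, so several threads can serve requests from one descriptor. The sketch does none of the journaling or recovery work the answer warns about.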


Artem @Jump, 2016-04-08

How do I work with an SSD at a low level?

You can't. The drive's firmware is what works with the storage itself, and you can talk to the drive only through standard interface commands.
Until you take the SSD apart and pull out its memory, processor, and other electronics, the drive itself does the low-level work with the memory and accepts only standard read and write commands from the computer.

When a request comes in, we compute the address from the id, read 2 KB of data (records will be fixed-size), and return it.

The disk cannot read just 2 KB of data. It reads a whole block that contains the 2 KB you need, and the block itself is much larger than 2 KB.
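To make the block point concrete, here is a small sketch of which block-aligned range has to be transferred to satisfy a read. The 4096-byte block size is only an assumption for illustration (the real value has to be queried from the device, and the flash page size inside the SSD is a separate matter hidden by the firmware); `aligned_span` and `blocks_touched` are hypothetical helper names.

```python
BLOCK_SIZE = 4096   # assumed block size; query the real device for its value
RECORD_SIZE = 2048  # fixed record size from the question

def aligned_span(offset, length, block_size=BLOCK_SIZE):
    """Return (start, size) of the smallest block-aligned range covering the bytes."""
    start = (offset // block_size) * block_size
    end = -(-(offset + length) // block_size) * block_size  # round up to a block boundary
    return start, end - start

def blocks_touched(record_id):
    """Number of blocks that must be read to fetch one fixed-size record."""
    _, size = aligned_span(record_id * RECORD_SIZE, RECORD_SIZE)
    return size // BLOCK_SIZE
```

With these assumed sizes, a record stored at a multiple of 2048 bytes always fits inside a single 4096-byte block, while a 2 KB read at an unaligned offset such as 3000 would span two blocks.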

This requires low-level work with disks, and I have no idea where to start digging (the application itself is written in Java, but we are considering writing this part in C).

Judging by the task description, you need a database, not direct access to the hardware.


GavriKos, 2016-04-08
@GavriKos

Look into writing drivers and your own file system.


Alexander Chekalin, 2016-04-08
@achekalin

Is there really not a single database (NoSQL at least) that would give you adequate speed while sparing you from working with the hardware directly? First, some data will always be hotter and some colder, and a database will cache the hot part anyway. Second, you will be the one catching hardware failures, and will you handle them well? That is what the OS, the drivers, and the other layers are for: to shield you from that mess. Besides, an SSD is not just a flat field of bytes; it has its own logic and "mechanics" (albeit electronic).
And if you really want to go that way, what stops you from putting a 1.5 TB swap on the SSD, setting up the OS, and working with a terabyte-sized chunk of memory, letting the OS push data out to disk when necessary? Brutal, of course, but the algorithms there are far from the dumbest and were not written by clumsy hands, so it will turn out no worse than writing it yourself.
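The swap suggestion hands paging over to the OS. A related but different way to get a similar effect, not what the answer literally proposes, is to memory-map the data file and let the kernel's page cache decide what stays in RAM. A minimal sketch, with a small temporary file standing in for the terabyte-sized data file and `read_mapped` as a hypothetical helper:

```python
import mmap
import tempfile

RECORD_SIZE = 2048  # fixed record size in bytes (assumption: exactly 2 KB)

def read_mapped(view, record_id):
    """Return a copy of one fixed-size record from the mapped file."""
    start = record_id * RECORD_SIZE
    return bytes(view[start:start + RECORD_SIZE])

# Demo on a small temporary file standing in for the real data file.
with tempfile.TemporaryFile() as f:
    f.truncate(16 * RECORD_SIZE)            # 16 empty record slots
    with mmap.mmap(f.fileno(), 0) as view:  # length 0 maps the whole file
        view[3 * RECORD_SIZE:3 * RECORD_SIZE + 5] = b"hello"
        mapped_record = read_mapped(view, 3)
```

Whether this keeps up with 20,000 reads per second on a real data set would have to be measured; the sketch only shows the access pattern.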