How to implement storage of files in memory using a repository?

Nikita, 2021-12-06 19:17:24

OOP

I have a C# lab that requires implementing a mechanism for creating backups. I ran into difficulties with storing the copies. Here is what the assignment says:

Copy storage

The lab assumes that backups will be created locally on the file system. However, the execution logic must be abstracted from this: an abstraction, a repository, must be introduced (see the DIP principle from SOLID). For example, in tests it is worth implementing storage in memory, otherwise the tests will create a lot of garbage, will require additional configuration, and may also start failing unexpectedly. Expected structure:

- Root directory
- Job directories, which are located in the root directory
- Backup files, which are located in a job directory

I still don’t understand why there will be problems if the tests use the local file system; and I don't understand how to implement a repository that will work with memory. I only realized that I need to use the MemoryMappedFile library.

3 answer(s)

Griboks, 2021-12-07
@Nikitos2002

I still don’t understand why there will be problems if the tests use the local file system;

Because you're testing the serialization logic, not the save-to-file logic. That is, your test subject, environment, and initial conditions are completely different.

backups will be created locally on the file system

in tests it is worth implementing storage in memory

Cool ... the program works with files, and they offer RAM for testing. Very useful activity, right?

I don't understand how to implement a repository that will work with memory.

Create a class that stores the data but doesn't save it to files. In the report, write that the class fields are, in effect, virtual files.
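A minimal sketch of such a class, assuming the root/job/file layout from the assignment. The class name `InMemoryBackupStorage` and its method names are hypothetical, not something the lab prescribes; the outer dictionary stands in for the root directory and each inner dictionary for a job directory.

```csharp
using System.Collections.Generic;

// Hypothetical in-memory storage: job name -> (file name -> file bytes).
// The outer dictionary plays the role of the root directory,
// each inner dictionary plays the role of a job directory.
public sealed class InMemoryBackupStorage
{
    private readonly Dictionary<string, Dictionary<string, byte[]>> _jobs =
        new Dictionary<string, Dictionary<string, byte[]>>();

    // Stores a copy of the bytes under the given job and file name.
    public void Save(string jobName, string fileName, byte[] content)
    {
        if (!_jobs.TryGetValue(jobName, out var files))
        {
            files = new Dictionary<string, byte[]>();
            _jobs[jobName] = files;
        }
        files[fileName] = (byte[])content.Clone();
    }

    // Returns a copy of the stored bytes; throws if the "file" does not exist.
    public byte[] Load(string jobName, string fileName)
    {
        if (_jobs.TryGetValue(jobName, out var files) &&
            files.TryGetValue(fileName, out var content))
        {
            return (byte[])content.Clone();
        }
        throw new KeyNotFoundException($"{jobName}/{fileName} not found");
    }

    // Lists the "files" inside one job "directory" (empty if the job is unknown).
    public List<string> ListFiles(string jobName) =>
        _jobs.TryGetValue(jobName, out var files)
            ? new List<string>(files.Keys)
            : new List<string>();
}
```

Copying the byte arrays on the way in and out is a design choice: it keeps a caller from changing a stored "file" by mutating its own array afterwards, which is closer to how a real file behaves.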

freeExec, 2021-12-06
@freeExec

I would guess that the following is meant:
1) The main implementation uses the file system to store files;
2) Yours (so as not to litter the disk with files) keeps all those bytes in memory.
That is, instead of FileRepository you make MemoryRepository.
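Roughly, that split could look like the sketch below. The interface `IRepository`, its `Write`/`Read` methods, and the `BackupJob` class are assumptions made for illustration; only the idea of a file-based and a memory-based repository comes from the answer above.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical abstraction that the backup logic depends on (DIP).
public interface IRepository
{
    void Write(string jobName, string fileName, byte[] content);
    byte[] Read(string jobName, string fileName);
}

// Stores copies on disk as <root>/<job>/<file>.
public sealed class FileRepository : IRepository
{
    private readonly string _root;

    public FileRepository(string root) => _root = root;

    public void Write(string jobName, string fileName, byte[] content)
    {
        string jobDir = Path.Combine(_root, jobName);
        Directory.CreateDirectory(jobDir); // does nothing if it already exists
        File.WriteAllBytes(Path.Combine(jobDir, fileName), content);
    }

    public byte[] Read(string jobName, string fileName) =>
        File.ReadAllBytes(Path.Combine(_root, jobName, fileName));
}

// Keeps the same bytes in a dictionary; nothing touches the disk.
public sealed class MemoryRepository : IRepository
{
    private readonly Dictionary<(string Job, string File), byte[]> _files =
        new Dictionary<(string Job, string File), byte[]>();

    public void Write(string jobName, string fileName, byte[] content) =>
        _files[(jobName, fileName)] = (byte[])content.Clone();

    public byte[] Read(string jobName, string fileName) =>
        (byte[])_files[(jobName, fileName)].Clone();
}

// The backup logic only sees IRepository and does not know where bytes end up.
public sealed class BackupJob
{
    private readonly string _name;
    private readonly IRepository _repository;

    public BackupJob(string name, IRepository repository)
    {
        _name = name;
        _repository = repository;
    }

    public void Backup(string fileName, byte[] content) =>
        _repository.Write(_name, fileName, content);
}
```

In a test you would construct `new BackupJob("job1", new MemoryRepository())`, while the real program passes a `FileRepository` pointing at the root directory; the job code itself stays the same in both cases.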

rPman, 2021-12-06
@rPman

DIP from SOLID

I'm not sure how deep you are prepared to go into the problem, and the approach to development depends on that. Choosing whether or not to support almost any of the items below will change the structure and the algorithm almost completely.
1. File names and paths: what about encodings?
Different OSes have different rules, different delimiter characters, and different case sensitivity in names.
2. Symbolic links and hard links can drive you crazy
They are a huge headache for anyone involved in copying data. The behavior differs, for example, depending on whether the path falls within the directories included in the copy or not.
3. Features like sparse files or reflinks (a kind of hard link, not for the file itself but for its blocks)
There are tasks in which failing to preserve and account for these things can complicate data recovery enormously (for example, if data is stored in sparse files that are logically petabytes in size but actually occupy orders of magnitude less, it will be almost impossible to restore and copy them without taking this into account).
4. Extended attributes: few people use them
(but those who do may rely on them so heavily that not backing them up would be fatal), yet you need to remember them, especially when you have to abstract away from how each OS implements them.
5. Access rights
Very few people bother to back up this information, and it is often no less important than the data itself. Otherwise, restoring data with a complex permission structure and a large number of users can turn into hell, and it even carries the risk of leaking important data.
6. Incremental backups
They are of course not mandatory, but backup storage systems without this feature are inconvenient or too expensive.
Besides, the probability that you will need to restore an entire backup is very low compared to other scenarios, and forcing a person to unpack petabyte archives for the sake of a one-megabyte file that was deleted by mistake and now has to be restored from the backup...
P.S. Don't decide this yourself; ask your supervisor how deep the rabbit hole goes, since doing everything properly could, for example, amount to a diploma project or something even bigger.
P.P.S. Advice: do not invent data storage formats. Store everything in files and let the file system itself be the container (do not try to store files in a database, for example). Something else will have to be responsible for the file names, though (this is where a database comes in). It is not recommended to exclude file and directory names from the archive completely; it is enough to define a list of allowed characters (common to most operating systems and the main encoding). This may impose limits on the data structure (for example, different operating systems have different limits on nesting depth or on the number of characters in a file path). Store extended attributes right away as well (also as files with their own names).
Everything else (settings, the structure of incremental backups, access rights, the presence of holes, symlinks/hard links, reflinks, etc.) is also stored in the database. This may not be as convenient as it seems at first glance, but it will make restoring easier.
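As an illustration of the "list of allowed characters" advice, here is a small sketch of a name check. The class `ArchiveNames`, the particular whitelist, and the length limit of 255 are assumptions chosen for the example; the actual set would have to be agreed on for the operating systems you target.

```csharp
using System.Linq;

// Hypothetical helper that decides whether a name may be used as-is in the archive.
public static class ArchiveNames
{
    // Assumed whitelist: ASCII letters, digits, '.', '_' and '-'.
    private const string Allowed =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789._-";

    // Assumed limit, chosen for illustration only.
    private const int MaxLength = 255;

    // True if the name is non-empty, short enough, is not "." or "..",
    // and consists only of whitelisted characters.
    public static bool IsPortable(string name) =>
        !string.IsNullOrEmpty(name)
        && name.Length <= MaxLength
        && name != "." && name != ".."
        && name.All(c => Allowed.IndexOf(c) >= 0);
}
```

Names that fail the check would not be dropped; per the advice above, the original name can be kept in the database while the file itself is stored under a generated, portable name.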