linux
Sergey Melodin, 2018-09-25 22:22:51

Is it possible to get a unique file id?

I want to build file navigation (simply cross-references inside a document) so that a file's contents can be changed and the file can be renamed or moved to another directory while the links keep working.
As I imagined it: when a file is created, you can either set some kind of metadata on it or get a system-issued identifier (ID) that stays attached to the file for its entire lifetime. When creating links I specify the ID, and once editing is finished, for example before pushing to the repository, I run a script that finds the files by ID and writes the correct links to them into the text of the main document.
I spent a lot of time googling today but could not find a suitable solution, because everything came down to inodes and UUIDs.
The inode turned out to be very unreliable: it changed after I edited the file, and if someone deploys my project on another virtual machine, the inodes will be assigned anew by that file system. When a file is deleted, its inode number can be reused for a new file. Too unreliable, so that's not it.
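One possible explanation for the inode changing after an edit (an assumption, the thread does not confirm it) is that some editors save by writing a new file and renaming it over the old one. A minimal Python sketch comparing the two ways of saving; the function names and the `.tmp` suffix are made up for illustration:

```python
import os
import tempfile


def inode_after_in_place_write(path: str) -> tuple[int, int]:
    """Overwrite the file in place and return (inode_before, inode_after)."""
    before = os.stat(path).st_ino
    with open(path, "w", encoding="utf-8") as f:  # truncates and rewrites the same file
        f.write("edited in place\n")
    return before, os.stat(path).st_ino


def inode_after_replace_save(path: str) -> tuple[int, int]:
    """Save via a temporary file plus rename and return (inode_before, inode_after)."""
    before = os.stat(path).st_ino
    tmp = path + ".tmp"  # hypothetical temp name, for illustration only
    with open(tmp, "w", encoding="utf-8") as f:
        f.write("edited via replace\n")
    os.replace(tmp, path)  # the name now points at a different file
    return before, os.stat(path).st_ino


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        p = os.path.join(d, "doc.md")
        with open(p, "w", encoding="utf-8") as f:
            f.write("original\n")
        print("in place:", inode_after_in_place_write(p))
        print("replace: ", inode_after_replace_save(p))
```

With an in-place write the inode number stays the same; with the replace-style save the name ends up attached to a different inode.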
A UUID looks like a great alternative for unique identification, but I have not found a way to see the UUID of an individual file.
I thought I might be able to set some custom metadata (like data-attributes in HTML), but it turned out that this is a very limited area and not all tools can work with such metadata. An additional complication is that I wanted to write the automation script in a familiar language, JS or PHP, but in the documentation of the latter I did not find any way to mark files with unique identifiers either.
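For reference, custom per-file metadata on Linux usually means extended attributes. A minimal Python sketch, assuming a Linux system and a file system that allows `user.*` attributes; the attribute name `user.doc_id` and both function names are made up for this example:

```python
import os
from typing import Optional

# Hypothetical attribute name; "user." is the Linux namespace for user-defined attributes.
ATTR_NAME = "user.doc_id"


def tag_file(path: str, file_id: str) -> bool:
    """Try to store file_id in an extended attribute; return False if that is not possible."""
    if not hasattr(os, "setxattr"):  # os.setxattr is only available on Linux
        return False
    try:
        os.setxattr(path, ATTR_NAME, file_id.encode("utf-8"))
        return True
    except OSError:  # e.g. the file system does not support user attributes
        return False


def read_tag(path: str) -> Optional[str]:
    """Return the stored ID, or None if the attribute is missing or unsupported."""
    if not hasattr(os, "getxattr"):
        return None
    try:
        return os.getxattr(path, ATTR_NAME).decode("utf-8")
    except OSError:
        return None
```

The attribute survives a rename on the same file system and edits made in place, but this is exactly the "limited area" described above: Git does not record extended attributes, and FAT32 cannot store them.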
Of course, you can write the ID as the first or last line of the file, but that is so-so. You could just as well give the files unique names, which is not interesting. One workaround is to run a daemon that tracks changes in real time, but that only works if the person has deployed the virtual machine, and they may not do that just to edit documents. The task could also be hooked into git somehow, but again you need to know what to hook and how. Besides, I don't believe my wish list is unique; all of this must have been solved a hundred times already.
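For comparison, the "ID inside the file" variant together with the link-fixing script described earlier could look roughly like this in Python. It is only a sketch: the marker format `<!-- id: ... -->` and the link form `(id:...)` are conventions invented for this example, not something from an existing tool:

```python
import os
import re

# Hypothetical conventions, chosen only for this sketch:
#   a file declares its ID on a line such as  <!-- id: intro-7f3a -->
#   the main document links to it as          [Intro](id:intro-7f3a)
ID_LINE = re.compile(r"<!--\s*id:\s*([\w-]+)\s*-->")
ID_LINK = re.compile(r"\]\(id:([\w-]+)\)")


def build_index(root: str) -> dict[str, str]:
    """Map every declared ID to the file's path relative to root."""
    index: dict[str, str] = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as f:
                    text = f.read()
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files
            match = ID_LINE.search(text)
            if match:
                rel = os.path.relpath(path, root).replace(os.sep, "/")
                index[match.group(1)] = rel
    return index


def resolve_links(text: str, index: dict[str, str]) -> str:
    """Replace (id:...) link targets with current paths; unknown IDs are left as-is."""
    def repl(m: re.Match) -> str:
        target = index.get(m.group(1))
        return f"]({target})" if target else m.group(0)
    return ID_LINK.sub(repl, text)
```

The source document keeps the `id:` links, and the resolved text is generated from it before a push, so renaming or moving a file only requires rerunning the script.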
I also tried hard and symbolic links, but after the file is moved, reading through the hard link returns the previous version of the contents. It looks like some kind of caching; I don't know whether it is worth digging into.
Overall, the picture I have in mind is this:
A unique identifier that is independent of the file system, stored at a level where it can be read without jumping through hoops but not in a "visible place", and that does not change when the file is moved, renamed, or edited. Ideally it should not require a running daemon.
Where can I get such an ID, or how do I generate one? If ready-made solutions exist, please share links, though I'm also interested in figuring it out myself :)

5 answer(s)
Alexander Taratin, 2018-09-25
@melodyn

Of course, you can write the ID as the first or last line of the file, but that is so-so. You could just as well give the files unique names, which is not interesting.

https://habr.com/post/46935/

Roman Mirilaczvili, 2018-09-26
@2ord

I believe everything gets simpler if you build your own information system on top of a layer that works through FUSE.
In your underlying storage you can assign UUIDs to files. A file is an object that can have service information attached to it, such as a file name or, more generally, a URI, and that information is subject to frequent change. Store the set of such objects in a DBMS (for example, SQLite).
When the storage is mounted via FUSE at some directory, the objects appear to the outside world as ordinary files. When a file is renamed, only the service information about the object in the storage changes. The storage can be local or remote. When a document file is deleted, you can mark the object as moved to the recycle bin or simply delete it. When a new version of a document file is saved, the contents of the object in the storage change. The service information (metadata) can also store a hash of the content.
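The storage part of this idea can be sketched in Python with the standard `sqlite3` and `uuid` modules. This is only the object registry, under the assumption that a FUSE layer (not shown) would call functions like these; the table layout and all function names are hypothetical:

```python
import hashlib
import sqlite3
import uuid
from typing import Optional


def open_store(db_path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the object store."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS objects ("
        " id TEXT PRIMARY KEY,"        # UUID, assigned once and never changed
        " name TEXT NOT NULL UNIQUE,"  # visible file name, may change freely
        " content BLOB NOT NULL,"
        " sha256 TEXT NOT NULL,"       # hash of the current content
        " deleted INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def create(conn: sqlite3.Connection, name: str, content: bytes) -> str:
    """Store a new object and return its permanent UUID."""
    object_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO objects (id, name, content, sha256) VALUES (?, ?, ?, ?)",
        (object_id, name, content, hashlib.sha256(content).hexdigest()),
    )
    conn.commit()
    return object_id


def rename(conn: sqlite3.Connection, object_id: str, new_name: str) -> None:
    """Renaming touches only the service information, not the ID."""
    conn.execute("UPDATE objects SET name = ? WHERE id = ?", (new_name, object_id))
    conn.commit()


def write(conn: sqlite3.Connection, object_id: str, content: bytes) -> None:
    """Saving a new version changes content and hash, not the ID."""
    conn.execute(
        "UPDATE objects SET content = ?, sha256 = ? WHERE id = ?",
        (content, hashlib.sha256(content).hexdigest(), object_id),
    )
    conn.commit()


def mark_deleted(conn: sqlite3.Connection, object_id: str) -> None:
    """Soft delete: the object is flagged, not removed."""
    conn.execute("UPDATE objects SET deleted = 1 WHERE id = ?", (object_id,))
    conn.commit()


def lookup_name(conn: sqlite3.Connection, object_id: str) -> Optional[str]:
    """Resolve a UUID to the current name, or None if unknown or deleted."""
    row = conn.execute(
        "SELECT name FROM objects WHERE id = ? AND deleted = 0", (object_id,)
    ).fetchone()
    return row[0] if row else None
```

A link stores the UUID, and `lookup_name` resolves it to whatever the file is called at the moment.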

Dimonchik, 2018-09-25
@dimonchik2013

A unique identifier that *does not change when* its content changes

You have to pick one here:
either SHA-256 of the first gigabyte, with all the consequences that follow,
or the file name,
or (your case) 42.
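The first option can be sketched in Python as follows. The function name and the chunk size are made up for this example; only the "SHA-256 of the first gigabyte" idea comes from the answer:

```python
import hashlib


def content_id(path: str, limit: int = 1 << 30, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of at most the first `limit` bytes of the file."""
    digest = hashlib.sha256()
    remaining = limit
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:  # end of file reached before the limit
                break
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()
```

Such an ID survives renaming, moving, and copying to any file system, but it changes as soon as the hashed part of the content changes, and two identical copies get the same ID. Those are the consequences mentioned above.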

Stalker_RED, 2018-09-26
@Stalker_RED

There is no ready-made solution for your requirements, and none is expected in the near future.
There are plenty of file systems in use today that will not preserve your super-labels, and the whole beautiful idea falls apart.
As a warm-up, imagine what happens if the file is archived and then unpacked.
If you create multiple copies of the same file, which one will your superlink point to?
What if you copy the files to a FAT32 USB flash drive and then back?
What if you send them over the network?
In general, the URI was invented to solve this problem, but, again, it does not work everywhere or always.

Sergey Melodin, 2018-09-26
@melodyn

Since different people in different places keep writing about the same thing, I'll give a shortened version of the original question:
There are hard links in Linux. Whatever you do to the file, a hard link keeps referring to it. If I understand correctly, a hard link is bound to an inode, which behaves somewhat unpredictably. For example, on my home PC, when the file is changed, the hard link works correctly.
On my work PC, however, changing the file for some reason changes the inode, and the hard link then shows the wrong file contents.
Apparently this is because, inside a virtual machine, the system handles files differently, relying on the host machine's file system.
If I can make a hard link behave the same way regardless of the environment, I could create, for example, a links directory, put hard links there under unique names, reference them in the text like [next file](#my_awesome_hard_link), and the problem is solved.
So answers saying that nobody needs this, that it doesn't exist in nature, or “what on earth did you come up with, lol” are, to be honest, not much of an answer.
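The links-directory idea and the failure described above can be reproduced with a small Python sketch. It assumes a file system that supports hard links; the file names are made up, and treating a replace-style save as the cause of the work-PC behaviour is only a guess, not something established in this thread:

```python
import os
import tempfile


def read_text(path: str) -> str:
    with open(path, encoding="utf-8") as f:
        return f.read()


def hard_link_demo(workdir: str) -> dict[str, str]:
    """Create a hard link in a links directory and see what it shows after edits."""
    links_dir = os.path.join(workdir, "links")
    moved_dir = os.path.join(workdir, "moved")
    os.makedirs(links_dir)
    os.makedirs(moved_dir)

    original = os.path.join(workdir, "doc.md")
    with open(original, "w", encoding="utf-8") as f:
        f.write("v1\n")

    link = os.path.join(links_dir, "my_awesome_hard_link")
    os.link(original, link)  # second name for the same inode

    # Moving the file within one file system does not affect the hard link.
    moved = os.path.join(moved_dir, "doc.md")
    os.rename(original, moved)

    # An in-place edit is visible through the hard link.
    with open(moved, "a", encoding="utf-8") as f:
        f.write("v2\n")
    results = {"link_after_in_place_edit": read_text(link)}

    # A replace-style save gives the name a new inode; the link keeps the old one.
    tmp = moved + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        f.write("v3\n")
    os.replace(tmp, moved)
    results["link_after_replace_save"] = read_text(link)
    results["file_after_replace_save"] = read_text(moved)
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for key, value in hard_link_demo(d).items():
            print(key, repr(value))
```

After the replace-style save the hard link still returns the old contents, which matches the "previous version" symptom. Hard links also cannot cross file systems, and Git does not preserve them, so a cloned repository would contain independent copies.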
