Java
Sergey Karbivnichy, 2018-02-16 19:22:20

How to read file headers in C# or Java?

How do I read information about a file? For example, file formats such as exe, elf, bmp, wbmp, etc. each have their own file structure or header. Under Windows I have seen many utilities for editing PE (exe) files that show, for example, which processor the executable is built for, the import table, SizeOfCode, etc.
[Attached screenshot: 5a87048cc5bf9617900337.png]
How can I access these header fields in C# or Java, given that I have the format specification? A simple example would be enough. Thank you!

4 answer(s)
Dmitry Alexandrov, 2018-02-16
@hottabxp

If there is a file specification, then the offsets, byte sequences, and data structures inside the file are described there.
For example, if it says that there is a header at the very beginning of the file,
work out its size in bytes, read that many bytes, and then convert what you read into int, long, char, and so on. That is all there is to reading it.
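
As a rough illustration of this approach, here is a minimal Java sketch that reads the first 26 bytes of a BMP file and converts them into typed fields. The class name BmpHeader and its field names are made up for this example, and the width and height offsets assume the common BITMAPINFOHEADER layout; other BMP variants place them differently.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper class; names are illustrative, not from any library.
final class BmpHeader {
    static final int SIZE = 26;   // file header (14 bytes) + first 12 bytes of the info header

    final String signature;       // bytes 0-1, "BM" for Windows bitmaps
    final long fileSize;          // bytes 2-5, unsigned 32-bit, little-endian
    final long pixelOffset;       // bytes 10-13, offset of the pixel data
    final int width;              // bytes 18-21 (BITMAPINFOHEADER layout assumed)
    final int height;             // bytes 22-25 (BITMAPINFOHEADER layout assumed)

    private BmpHeader(String signature, long fileSize, long pixelOffset, int width, int height) {
        this.signature = signature;
        this.fileSize = fileSize;
        this.pixelOffset = pixelOffset;
        this.width = width;
        this.height = height;
    }

    // Converts raw header bytes into typed fields using the offsets from the spec.
    static BmpHeader parse(byte[] raw) {
        if (raw.length < SIZE) {
            throw new IllegalArgumentException("need at least " + SIZE + " bytes");
        }
        // BMP stores multi-byte integers in little-endian order.
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        String signature = new String(raw, 0, 2, StandardCharsets.US_ASCII);
        long fileSize = buf.getInt(2) & 0xFFFFFFFFL;     // mask to treat as unsigned
        long pixelOffset = buf.getInt(10) & 0xFFFFFFFFL;
        int width = buf.getInt(18);
        int height = buf.getInt(22);
        return new BmpHeader(signature, fileSize, pixelOffset, width, height);
    }

    // Reads exactly SIZE bytes from the start of the file, then parses them.
    static BmpHeader read(String path) throws IOException {
        byte[] raw = new byte[SIZE];
        try (DataInputStream in = new DataInputStream(Files.newInputStream(Paths.get(path)))) {
            in.readFully(raw);   // throws EOFException if the file is shorter than SIZE
        }
        return parse(raw);
    }

    public static void main(String[] args) throws IOException {
        BmpHeader h = read(args[0]);
        System.out.println(h.signature + " " + h.width + "x" + h.height
                + ", file size " + h.fileSize + ", pixels at offset " + h.pixelOffset);
    }
}
```

The same pattern (read a fixed block, wrap it in a ByteBuffer with the byte order the spec names, pull fields by offset) should carry over to other formats once you have their offsets.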

Alexey Cheremisin, 2018-02-17
@leahch

There is a Java project for exactly this purpose: Apache Tika, tika.apache.org
There is a tutorial here: https://www.tutorialspoint.com/tika/index.htm
Tika itself supports extracting text and metadata from more than a thousand different file types.

#, 2018-02-16
@mindtester

Ahem... a "simple" example still needs to be WRITTEN (and lazy people like me can't be bothered).
The file spec (even the one in your screenshot) is a description of how many bytes (or bits) there are and WHAT they mean.
So, for a start, google "reading binary files" for the language you need.
The next stage of the magic is that some byte can take the values 1, 2, 3 (or others), and each value, according to the spec of the specific format, corresponds to a whole line (or even a paragraph) of explanation.
You will have to build those explanations into the program yourself (in a configuration file or an embedded database); either way, that part is on you.
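
To make the "value to explanation" step concrete, here is a small Java sketch that hard-codes a lookup table for the Machine field of a PE file header. The class name PeMachine and the wording of the descriptions are assumptions for this example; only a few of the codes defined by the PE format are listed, and in a real tool the table could just as well live in a config file.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup table; the class and method names are illustrative.
final class PeMachine {
    private static final Map<Integer, String> NAMES = new HashMap<>();

    static {
        // A few Machine values from the PE format; the list is deliberately incomplete.
        NAMES.put(0x014c, "Intel 386 (x86)");
        NAMES.put(0x8664, "x64 (AMD64)");
        NAMES.put(0x01c0, "ARM little endian");
        NAMES.put(0xaa64, "ARM64 little endian");
    }

    // Maps the 16-bit Machine code to a readable description.
    static String describe(int machine) {
        int code = machine & 0xFFFF;   // the field is an unsigned 16-bit value
        String name = NAMES.get(code);
        return name != null ? name : String.format("unknown (0x%04x)", code);
    }

    public static void main(String[] args) {
        System.out.println(describe(0x8664));
        System.out.println(describe(0x1234));
    }
}
```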

d-stream, 2018-02-17
@d-stream

Actually, the concept of a "header" is a virtual one. The general outline of an implementation could be as follows:
1. based on the file extension, make an assumption about the format
2. read a block from the beginning of the file, as large as the header of the assumed format
3. check the signature, if the format has one (for example, "MZ" in various kinds of executable files, the bytes FF D8 FF in jpeg, "fLaC" in flac, "ID3" in mp3, etc.)
4. interpret the rest with validity checks, so that a coursework text file that happens to start with MZ and has a .dll extension does not mislead you
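
The steps above can be sketched in Java roughly as follows. FormatSniffer and its method names are invented for this example. The caller is expected to have already read the first block of the file (step 2). The validity check for step 4 is shown only for PE files, where the 32-bit value at offset 0x3C of the MZ header should point to the bytes "PE\0\0". Note that "ID3" is present only in mp3 files that carry an ID3v2 tag.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper; a sketch of the outline, not a complete format detector.
final class FormatSniffer {

    // Step 1: guess the format from the file extension.
    static String guessByExtension(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".exe") || lower.endsWith(".dll")) return "pe";
        if (lower.endsWith(".jpg") || lower.endsWith(".jpeg")) return "jpeg";
        if (lower.endsWith(".flac")) return "flac";
        if (lower.endsWith(".mp3")) return "mp3";
        return "unknown";
    }

    private static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    private static boolean startsWith(byte[] data, byte[] signature) {
        if (data.length < signature.length) return false;
        for (int i = 0; i < signature.length; i++) {
            if (data[i] != signature[i]) return false;
        }
        return true;
    }

    // Step 3: compare the start of the block with the signature of the guessed format.
    static boolean signatureMatches(String format, byte[] head) {
        switch (format) {
            case "pe":   return startsWith(head, ascii("MZ"));
            case "jpeg": return startsWith(head, new byte[] {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF});
            case "flac": return startsWith(head, ascii("fLaC"));
            case "mp3":  return startsWith(head, ascii("ID3"));   // ID3v2-tagged files only
            default:     return false;
        }
    }

    // Step 4 (PE only): "MZ" alone is not enough, so follow the offset stored at 0x3C
    // and require the "PE\0\0" signature there. The block must contain that offset.
    static boolean looksLikePe(byte[] head) {
        if (!startsWith(head, ascii("MZ")) || head.length < 0x40) return false;
        int peOffset = ByteBuffer.wrap(head).order(ByteOrder.LITTLE_ENDIAN).getInt(0x3C);
        if (peOffset < 0 || peOffset > head.length - 4) return false;
        return head[peOffset] == 'P' && head[peOffset + 1] == 'E'
                && head[peOffset + 2] == 0 && head[peOffset + 3] == 0;
    }
}
```

A text file that begins with "MZ" passes signatureMatches("pe", ...) but should fail looksLikePe, because the bytes at 0x3C will not point to a "PE\0\0" marker inside the block.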
