How to get the whole database as CSV?
There is some kind of database with a set of tables, for example:

id | instrument
1  | guitar
2  | piano

id | name    | instrument_id
1  | Vasya   | 1
2  | Petya   | 1
3  | Serezha | 2
I need to make a CSV file out of it; it would look something like this:

name, instrument
Vasya, guitar
Petya, guitar
Serezha, piano
And now the interesting part:
- There are several DBMSs and they are of different types (Oracle, Mongo, MySQL, etc.)
- I don't know their architecture and table names in advance
- If there is a many-to-many relationship, then there will be several rows in the CSV differing in one column
I need to invent/find/buy a universal tool that could convert a database to a CSV file.
If I write it myself, then in C# (or at least Python).
With joins it will be difficult, but with everything else:
1. You determine the type of database (since you have already connected to it, you know it).
2. In most DBMSs you can dynamically get the list of all tables/collections (this information is read from service tables whose name and structure are known in advance).
3. Based on the data from step 2, you issue N queries of the form SELECT * FROM {tableName} (I won't tell you how to do this in Mongo), read each table through a DataReader and write it to CSV; a rough sketch is shown below.
For some DBMSs there are also ready-made export tools, for example mysqldump, pg_dump, mongodump, etc., which do exactly the above.
As for joins - I have never seen anything like that, and in the general case it will not work.
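A minimal sketch of steps 2-3 in C#, assuming an ADO.NET provider (MySqlConnector here) and a DBMS that exposes information_schema; the connection string and the metadata query are illustrative and have to be adapted per DBMS:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using MySqlConnector;

class DbToCsv
{
    static void Main()
    {
        // Illustrative connection string; replace with the real one.
        const string connStr = "Server=localhost;Database=mydb;Uid=user;Pwd=pass;";
        using var conn = new MySqlConnection(connStr);
        conn.Open();

        // Step 2: get the list of tables from the service (metadata) tables.
        var tables = new List<string>();
        using (var cmd = conn.CreateCommand())
        {
            cmd.CommandText =
                "SELECT table_name FROM information_schema.tables " +
                "WHERE table_schema = DATABASE()";
            using var r = cmd.ExecuteReader();
            while (r.Read()) tables.Add(r.GetString(0));
        }

        // Step 3: SELECT * from each table and stream it to a CSV file via a DataReader.
        foreach (var table in tables)
        {
            using var cmd = conn.CreateCommand();
            cmd.CommandText = $"SELECT * FROM `{table}`";
            using var reader = cmd.ExecuteReader();
            using var writer = new StreamWriter(table + ".csv");

            // Header row with the column names.
            writer.WriteLine(string.Join(",",
                Enumerable.Range(0, reader.FieldCount).Select(i => Quote(reader.GetName(i)))));

            // Data rows.
            while (reader.Read())
            {
                writer.WriteLine(string.Join(",",
                    Enumerable.Range(0, reader.FieldCount)
                              .Select(i => Quote(reader.IsDBNull(i) ? "" : reader.GetValue(i).ToString()))));
            }
        }
    }

    // Minimal CSV escaping: wrap in quotes and double any embedded quotes.
    static string Quote(string s) => "\"" + s.Replace("\"", "\"\"") + "\"";
}

This covers the "one file per table" case; gluing tables together (joins) is exactly the part that, as noted above, does not work in the general case.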
The CSV format is the least appropriate one for this task: information about the structure is lost. It is especially bad when the number of columns changes from one line to the next, and even worse when it does not change - then nothing tells you which data sits in which column; a machine will not be able to read it, and a person will curse the inventor of this nonsense.
The second problem is denormalization, i.e. joining records together. The reason is simple: relational databases, by definition, do not store enough information to understand what the data means. In some cases you can pull something out of the keys (FK and PK) and the constraints, but in general you cannot. How do you interpret an M-1-M relationship? Which table do you take as the base and which as the secondary, i.e. do you choose LEFT JOIN, RIGHT JOIN or INNER JOIN? Yes, for simple reference tables, when the table is a leaf in the graph of 1-M links, you can "safely" join such a table, duplicating the reference data along the foreign keys, but again - why? After all, when reading the result it will no longer be visible that a reference table was used.
P.S. I can assume that the author's ultimate goal is to work with arbitrary data (many small projects written by different people with very different approaches to development and data storage) and to extract something meaningful from them, for example in a predetermined format.
Once upon a time, in my graduation thesis or thereabouts, there was a project that included a simple self-written utility as an auxiliary tool: you pointed it at a database with an unknown structure, it carried out a simple analysis of that structure, listed the fields in a table in the interface, and showed a short briefing for each selected field (the type of relationship and an example of the data stored there). The purpose of the utility was to indicate the tables and fields from which the data would later be extracted (i.e. to give an interpretation to this data). Without such a utility, the work of figuring out which field is what is rather tedious; in existing database tools you have to click a lot, or even run pre-written queries, etc. Maybe you would be better off building something like that? I searched and did not find one; the ready-made universal solutions are too complicated (and the whole point is the simplicity of the interface).
In C# there is a unified tool for connecting to databases - ADO.NET (all the fuss is in the connection string builder, plus you have to carry drivers for the various databases with you). There is also ODBC (already outdated, but for old databases it is sometimes the only connection method), which is also supported by ADO.NET. A sketch of that unified code path is below.
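A minimal sketch of one code path over several DBMSs via ADO.NET provider factories; the provider names, connection strings and query are illustrative, and each factory has to be registered first (DbProviderFactories.RegisterFactory) - that is the "carrying drivers with you" part:

using System;
using System.Data.Common;

static class GenericDump
{
    public static void Dump(string providerName, string connectionString, string sql)
    {
        // The factory hides which concrete DBMS we talk to.
        DbProviderFactory factory = DbProviderFactories.GetFactory(providerName);
        using DbConnection conn = factory.CreateConnection()!;
        conn.ConnectionString = connectionString;
        conn.Open();

        using DbCommand cmd = conn.CreateCommand();
        cmd.CommandText = sql;
        using DbDataReader reader = cmd.ExecuteReader();
        while (reader.Read())
        {
            // The CSV-writing part is DBMS-independent; only the factory differs.
            Console.WriteLine(reader.GetValue(0));
        }
    }
}

// Usage, assuming the MySqlConnector and Npgsql factories were registered:
//   GenericDump.Dump("MySqlConnector", "Server=...;Database=mydb;...", "SELECT * FROM musicians");
//   GenericDump.Dump("Npgsql", "Host=...;Database=mydb;...", "SELECT * FROM musicians");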
The task is underdetermined.
For example, the term CSV can be understood as a single file with one structure (either plain rows or rows with serialized data), or as several files concatenated together (either during the output process or explicitly at the end of it), each of which has its own structure (and which may additionally store information such as table names and field names/types).
Each of these options requires its own code to solve the problem (and that is without even mentioning the detail that writing CSV is itself done very differently in each DBMS).
So I don't see a problem - provided that on the other side there is code that processes this CSV, interprets the data correctly and decomposes it back into tables. Moreover, if we are talking about creating a universal tool, the only place where any real difficulty can arise is that very last step (outputting the query result as CSV). The rest is simple and flat as a pancake.