Graph algorithm?

M

mrgloom2015-02-02 18:42:40

C++ / C#

mrgloom, 2015-02-02 18:42:40

Graph algorithm?

Let's admit there is a binary file in it to be stored the graph without cycles.
The read structure looks something like this:
Sequentially I read the structures into std::list
str1-str2-str3-..-strN
after reading any str I read its refs.
The str structure is of the form
std::string name;
std::list refs;
Ref is of the form
std::string sname;
data type data;
sname is the name of str in the original list, and I want to get rid of the name, and just put a reference to the structure in the list, i.e. make
Ref
Str* pstr;
data type data;
moreover, different Refs in the list may have the same sname and I want to combine them, in theory this will save some memory (there will be no duplicate links).
Ref
Str* pstr;
std::vectordatavec;
So that's more profitable, immediately count everything in std::list and then work with it somehow or add iteratively one by one.
The algorithm reads and puts iteratively:
1. Considered str was added to the end of the main list.
2. We begin to read ref.
2.1 look through the ref list of the current str, if found with the same name, add it to std::vector, if not found, then go through the list of all str, create a new ref , put down a link to the found str.
First we read, then we work the algorithm:
We read everything, we get a list of str and each str has a list of refs, we must get rid of the lines and put down links.
1. Sort the lists of refs for each by line, combine them with the same names (what's the best way?).
2. We go through the new ref list for each str and for each ref we go through the list str and find the desired str by name and put down the link.
The abundance of string searches, as it were, hints that you need to use std:: map (or what is better?) instead of std:: list.
Or maybe there is some special algorithm for this.
The graph (tree) is rather wide than deep and, according to my observations, is rather sparse.
The number of str structures is about 10k, and the number of ref structures is about 90kk.
The number of str'ov and ref'ov is not known in advance, they are read sequentially.
A simple example for clarity:
1. "structure1" was considered, put into the list std::list strs;
2. Next, read refs from "structure1",
"s_structure2" , "s_structure2" , "s_structure5",
putting std::list str.refs
3. Repeat the same steps for the following structures.
So, first of all, it bothers me that std::list requires an extra for each element. memory + I still need to get rid of links by lowercase names, and put down real links, and for this I have to search at least by strs, and possibly also by refs of the current str, because ref can have repeated names in links.

Reply

Answer the question

In order to leave comments, you need to log in

1 answer(s)

M

Makaleks, 2015-05-18
@Makaleks

Since this huge graph is stored on disk and you are worried about the memory for std::list , then you can, for example, save to disk (or try to memory) the adjacency matrix for which basic algorithms are written. Of course, the task is not entirely clear, but this is a head-on option, for basic things.