How to organize the work of b-tree structure, data file and index file?

C

ccc352015-12-09 18:38:20

C++ / C#

ccc35, 2015-12-09 18:38:20

There is a ready-made b-tree in C++. There are insert, delete and search functions.
You must do the following:
1. The data element stored in the file is a record that has a unique key value.
2. A record in a file is represented by an index, i.e. pair (k,p), where k is a key value, p is a file pointer to the start of a record in the file.
3. The index file structure specified in the job option is supported for the data file.
4. The index file has a page structure. Pages contain indexes of records and have a fixed size.
5. Reading and writing to the index file is carried out page by page.
6. Testing of the file structure is carried out for various values of the parameters:
N is the number of records in the data file, N = 103, 104, 105, 106,
M is the number of indexes on the page of the index file, M = 10,100, 1000.
7. The number of accesses to blocks of the index file in the process of performing operations must correspond to: for the B-tree of the file - 2 + logt (N/M), where t is the power of B - tree.
Question. How to organize the work of b-tree structure, data file and index file?

Reply

Answer the question

In order to leave comments, you need to log in

1 answer(s)

S

Stanislav Makarov, 2015-12-09
@Nipheris

There is a search, insertion and deletion.
Now we need to organize the work with files. Index file and data file.

1) replace pointers to descendants in tree nodes with abstract pointers or numbers. This is necessary because the nodes of the tree can now be both in main memory and on disk, so when accessing any of the nodes, you must first check whether it is loaded or not, and if necessary, load from disk (i.e., organize a simple cache );
2) in leaf nodes you will store the same abstract pointers, only not to tree nodes (which are in the index file), but to real records (which are in the data file). You can do the same with records - check whether they are loaded into the cache or not, if necessary, download and give to the user.
3) change operations should now learn to mark nodes as "dirty", i.e. need to be saved to disk. In the process of modifying the B-tree, you mark as dirty all changed nodes, and then save them to disk (at the end of the operation or at regular intervals), after saving each node, reset the flag.
Actually, the main changes: organizing a cache for data and index records and rewriting the tree to work with nodes through the cache, and not directly by pointer to BTreeNode.