C# How to free RAM of completed Tasks?
Good afternoon.
General information: I have written an application that creates quite a lot of Tasks while it runs. Each task takes its input parameters from a fixed set, creates a new instance of a class and starts it. The class consists of a series of methods that perform the data processing. No data is returned to the application; the result of the work is simply saved to disk.
Problem: Tasks that, by my application's logic, are guaranteed to have finished (that is, they have produced and saved their result) continue to occupy RAM until the entire set of running tasks has ended in success or failure.
For example: in the main thread, in a loop, I create and start 2,000 tasks. I add each one to a List collection, and once they are all queued I wait for them to complete with: Task.WaitAll(taskList.ToArray());
On average about 10 tasks execute simultaneously (judging by the processing log and temp files). About 700 MB of RAM is consumed at the start, but this value gradually grows until all available RAM is used (up to 10 GB after about 50 minutes of running). The data itself is definitely not large enough to need that much RAM, so I concluded that the problem is most likely that used memory is not being freed.
Question: How do I free the RAM of completed Tasks? Or perhaps you can suggest another direction of optimization that could help in this situation, such as passing a cancellation token along, or calling Dispose or something similar from the body of the task.
PS: I am not a professional programmer and my knowledge is rather modest. In this particular case I am rewriting a legacy homegrown tool from a corporate environment to gain knowledge and experience.
In your case, the large memory consumption (and possibly the consumption of other resources) is not caused by Task instances failing to be collected.
In a typical scenario, the default TaskScheduler executes Tasks on the ThreadPool.
When initialized, the ThreadPool creates a small number of worker threads for itself (the initial number is equal to the number of cores in the system). Further, the ThreadPool monitors their use, and if it sees that the current number of worker threads cannot service all incoming tasks, it creates a new worker thread.
Formally, the criterion for creating a new thread is as follows: the queue of incoming tasks is not empty AND more than 500 ms have passed since a task was last taken from the queue for execution AND processor utilization is below 80%.
Thus, when you queue 2,000 Tasks that each take a long time to complete, the ThreadPool slowly starts churning out threads. And threads are by no means as lightweight as a Task.
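To see the limits the pool is working with on your machine, you can query them directly. This is only a small diagnostic sketch; the class and method names are made up for illustration, and it assumes a reasonably recent C# compiler (tuples and `out _`).

```csharp
using System;
using System.Threading;

static class ThreadPoolInfo
{
    // Returns the configured minimum and maximum number of worker threads.
    public static (int Min, int Max) GetWorkerLimits()
    {
        ThreadPool.GetMinThreads(out int minWorkers, out _);
        ThreadPool.GetMaxThreads(out int maxWorkers, out _);
        return (minWorkers, maxWorkers);
    }

    static void Main()
    {
        var (min, max) = GetWorkerLimits();
        Console.WriteLine($"Cores: {Environment.ProcessorCount}");
        Console.WriteLine($"Worker threads: min {min}, max {max}");
    }
}
```

The gap between the minimum and the maximum is the room the pool has to keep adding threads while long-running tasks sit in the queue.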
From an architectural point of view, this is wrong: you should not throw 2,000 tasks at the pool to be executed simultaneously without any throttling. But there is no memory leak here.
In your case, you can write a custom TaskScheduler that limits the degree of parallelism of the tasks handed to the ThreadPool.
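Instead of writing a scheduler from scratch, one option is the built-in ConcurrentExclusiveSchedulerPair, whose ConcurrentScheduler caps how many tasks run at once. A minimal sketch, assuming synchronous task bodies; the class name, the numbers and the Thread.Sleep stand-in for the real processing are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class LimitedSchedulerDemo
{
    // Runs `count` jobs with at most `maxParallel` executing at a time.
    // Returns the highest number of jobs observed running simultaneously.
    public static int Run(int count, int maxParallel)
    {
        var pair = new ConcurrentExclusiveSchedulerPair(TaskScheduler.Default, maxParallel);
        var factory = new TaskFactory(pair.ConcurrentScheduler);

        int running = 0, peak = 0;
        object gate = new object();
        var tasks = new List<Task>();

        for (int i = 0; i < count; i++)
        {
            tasks.Add(factory.StartNew(() =>
            {
                lock (gate) { running++; if (running > peak) peak = running; }
                Thread.Sleep(10); // hypothetical stand-in for the data processing
                lock (gate) { running--; }
            }));
        }

        Task.WaitAll(tasks.ToArray());
        return peak;
    }

    static void Main()
    {
        Console.WriteLine($"Peak concurrency: {Run(200, 10)}");
    }
}
```

All tasks are still created up front, but only `maxParallel` of them are handed to the ThreadPool at any moment, so the pool has no reason to keep adding threads.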
Or you can use Parallel.ForEach, for which you can explicitly specify the degree of parallelism.
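A minimal sketch of the Parallel.ForEach variant, assuming the inputs can be enumerated up front. The integer inputs, the class name and the commented-out Processor call are placeholders for the asker's own parameter set and processing class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class ParallelForEachDemo
{
    // Processes every input with at most `maxParallel` running at once.
    // Returns how many inputs were processed.
    public static int ProcessAll(IEnumerable<int> inputs, int maxParallel)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        int processed = 0;

        Parallel.ForEach(inputs, options, input =>
        {
            // Hypothetical: new Processor(input).Run(); the result goes to disk.
            Interlocked.Increment(ref processed);
        });

        return processed;
    }

    static void Main()
    {
        Console.WriteLine(ProcessAll(Enumerable.Range(0, 2000), 10));
    }
}
```

Parallel.ForEach blocks until every item is done, so it also replaces the Task.WaitAll call and the task list.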
Visual Studio has a good profiling tool; with it you can see where the memory is going.
In general, as long as some variable still refers to an instance of a class, its memory will not be released. From what you wrote, you need to remove each task from taskList when it finishes.
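Removing finished tasks from the list could look roughly like this. It is a sketch only: the class name, the `work` delegate and the limit of 10 are assumptions, and a faulted task is dropped here without its exception being inspected.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class ThrottledLauncher
{
    // Starts one task per input but never holds more than `maxInFlight`
    // task references. Returns the largest size the list ever reached.
    public static int RunAll(IEnumerable<int> inputs, int maxInFlight, Action<int> work)
    {
        var inFlight = new List<Task>();
        int peak = 0;

        foreach (int input in inputs)
        {
            if (inFlight.Count >= maxInFlight)
            {
                // Block until at least one task finishes, then drop every
                // finished task so nothing keeps it (or its closure) alive.
                Task.WaitAny(inFlight.ToArray());
                inFlight.RemoveAll(t => t.IsCompleted);
            }

            inFlight.Add(Task.Run(() => work(input)));
            peak = Math.Max(peak, inFlight.Count);
        }

        Task.WaitAll(inFlight.ToArray());
        return peak;
    }

    static void Main()
    {
        int peak = RunAll(Enumerable.Range(0, 2000), 10, _ => Thread.Sleep(1));
        Console.WriteLine($"Largest number of tracked tasks: {peak}");
    }
}
```

Because new tasks are only started when a slot is free, this also limits how many run at once, not just how many are referenced.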
Task itself is a class with ~10 fields of 4 bytes each. You can calculate how much memory 2,000 of them will take. Obviously they are not the problem. As noted above, the problem is not in the tasks themselves but in the code that is executed through them. And it is not even in the threads. Try creating 2,000 real threads (new System.Threading.Thread(...)) and running trivial code in them; they are unlikely to gobble up 10 GB of memory.
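The thread experiment suggested above can be tried with something like the following. This is a rough sketch with invented names: it reports the change in the process's private bytes while the threads are alive, which is only an approximate measure and will vary by OS and runtime.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class ThreadCostDemo
{
    // Starts `count` real threads that just wait on an event, measures how much
    // the process's private memory grew, then releases and joins them.
    public static (long GrowthBytes, int Completed) Measure(int count)
    {
        int completed = 0;
        var threads = new Thread[count];
        long before, after;

        using (var release = new ManualResetEventSlim(false))
        {
            before = ReadPrivateBytes();
            for (int i = 0; i < count; i++)
            {
                threads[i] = new Thread(() =>
                {
                    release.Wait(); // trivial body: wait until released
                    Interlocked.Increment(ref completed);
                });
                threads[i].IsBackground = true;
                threads[i].Start();
            }
            after = ReadPrivateBytes();

            release.Set();
            foreach (Thread t in threads) t.Join();
        }

        return (after - before, completed);
    }

    static long ReadPrivateBytes()
    {
        using (Process p = Process.GetCurrentProcess())
        {
            return p.PrivateMemorySize64;
        }
    }

    static void Main()
    {
        var (growth, completed) = Measure(2000);
        Console.WriteLine($"{completed} threads, private bytes grew by {growth / (1024 * 1024)} MB");
    }
}
```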
The problem is in the code you run in these tasks. If you are not familiar with automatic garbage collectors, then you need to read about them.
Any C# code is, more or less, a method. When you create a task, you pass it a reference to the method to be executed; whether the method is named or anonymous does not matter. The problem is that when this method completes, the resources it used are not freed. It does not matter where the method is executed: in a task, in a thread, or directly in the main thread of the application. You need to ensure that by the time the method you pass to the task exits, all the resources it used have been released.
If you are unfamiliar with the garbage collector in the CLR, be sure to read about it, and about garbage collection in general.
In this situation, without seeing the code, I can only advise the following. Further, to simplify understanding, we will assume that we have only two sections of code: a certain method (this is what you pass to the task), and the main thread (the rest of the program):
1) If you create an instance of any class (anywhere, but especially in the method) and it has a .Close() or Dispose() method, be sure to call that method once you no longer need the instance.
2) If the method returns a result, check whether it returns more than you need. For example, a class with two fields is returned: one a number, the other an array. You only need the number. Accordingly, the array field must be removed from the returned value.
3) Simplify the returned result as much as possible. For example, you need to calculate the total number of elements in N arrays. You start N threads and return N arrays, one from each method, and then in the main thread you sum up the lengths of all the arrays. In this case it will look exactly like a memory leak. You should return the length of the array right away instead. And so on.
4) If the method adds elements to a collection that is declared in the main thread, check whether this collection is cleared when the method exits, whether too much is being added to it, or whether the arrays being added to it are too large, etc.
5) Almost the same as the previous point. If there are any static collections or static fields, make them non-static wherever possible. Where that is impossible, make sure the method does not add elements to such a static collection. Or, if it does, check the size of the elements; it should be minimal.
6) Make sure you don't create arrays larger than about 80 KB (objects of 85,000 bytes or more are allocated on the large object heap, which is not compacted by default). If you do, shrink them if possible. For example, if the task is to count the number of characters in a file, you do not need to read the whole file into memory. It is enough to read 8 KB at a time in a loop and add up the result.
7) Last. Before exiting the method, insert:
System.Runtime.GCSettings.LargeObjectHeapCompactionMode = System.Runtime.GCLargeObjectHeapCompactionMode.CompactOnce;
System.GC.Collect();
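Points 1, 3 and 6 above can be sketched together with the file example from point 6: the reader is disposed with using, the file is read through a small reusable buffer, and only a number is returned. The class and method names are hypothetical, and the file is assumed to be text in an encoding StreamReader can detect.

```csharp
using System;
using System.IO;

static class FileStats
{
    // Counts the characters in a text file without loading it into memory.
    public static long CountChars(string path)
    {
        long total = 0;
        char[] buffer = new char[4096]; // 8 KB, far below the large object heap threshold

        // `using` guarantees Dispose() is called, even if Read throws.
        using (var reader = new StreamReader(path))
        {
            int read;
            while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                total += read;
            }
        }

        return total; // only the number leaves the method, not the data
    }

    static void Main(string[] args)
    {
        if (args.Length == 1)
        {
            Console.WriteLine(CountChars(args[0]));
        }
    }
}
```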