Heap size limit reached in Apex Batch
Heap memory is runtime memory that holds the state of a program's variables until the program finishes. Because Salesforce is a multi-tenant architecture, it cannot give unlimited heap memory to its users, so the heap is capped per transaction: 12 MB for asynchronous Apex such as a batch job (6 MB for synchronous Apex). If heap usage exceeds the limit, a "Heap Size Limit Reached" error occurs at runtime; Apex reports it as a System.LimitException with the message "Apex heap size too large". In batch Apex, the main cause of hitting this limit is a Database.Stateful class holding a very large collection of data. A batch job's QueryLocator can return up to 50 million records, each execute call receives at most 2,000 of them (200 by default), and every execute call counts against the org's 24-hour limit on asynchronous Apex executions (250,000, or 200 times the number of user licenses if that is greater), all within the 12 MB heap limit. Running close to these limits is not ideal.
Database.Stateful is a marker interface that a batch class can implement to keep the values of its instance member variables in heap memory across transactions. A batch class may call the execute method more than once, depending on how many batches the job is split into, and each execute call runs as a separate Apex transaction. When the job rolls over from one transaction to the next, that is, when one execute call finishes and a new one starts, all variables and collections from the previous call are cleared. With Database.Stateful, instance member variables and collections keep their values across transactions instead of being cleared; static variables are still reset. A batch generally runs slower when it uses the Database.Stateful marker interface, because the state has to be serialized and deserialized between execute calls.
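To make the behavior concrete, here is a minimal sketch of a stateful batch class. The class name AccountCountBatch and the query are illustrative assumptions, not taken from any particular org; the point is only that the instance counter survives from one execute call to the next because the class implements Database.Stateful.

```apex
// Hypothetical example class; the name and the query are illustrative only.
public class AccountCountBatch implements Database.Batchable<SObject>, Database.Stateful {
    // Instance member: kept across execute calls because of Database.Stateful.
    // Without the interface it would be back at 0 in every execute call.
    private Integer processedCount = 0;

    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator('SELECT Id FROM Account');
    }

    public void execute(Database.BatchableContext bc, List<SObject> scope) {
        // Each call is a separate transaction; only the small counter carries over.
        processedCount += scope.size();
    }

    public void finish(Database.BatchableContext bc) {
        System.debug('Total records processed: ' + processedCount);
    }
}
```

It could be started with Database.executeBatch(new AccountCountBatch(), 200);. Keeping only a small counter in state, rather than the records themselves, keeps the serialized state small.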
Suggestions:
- Try not to carry more than 10 thousand records in any collection (list or map) from one execute call to the next.
- Call the clear method on Stateful collections as soon as their data is no longer needed.
- If a collection is growing very large, split the data into groups, then process one group in one batch class and the remaining groups in other batch classes. You can also use batch chaining to process the groups one after another.
- Build related data inside execute: fetch only the primary record through the QueryLocator, then query the data related to that record in the execute method. Each execute call behaves as a separate program with its own heap, so data loaded there is released when the call ends instead of accumulating in Stateful variables.
- Keep track of heap usage in every execute call and act on it, using code like this:
// At the start of execute
Integer heapSizeWhenStartingExecute = Limits.getHeapSize();
// At the end of execute
Integer additionalHeapSizeUsed = Limits.getHeapSize() - heapSizeWhenStartingExecute;
When usage gets close to the limit (Limits.getLimitHeapSize() returns the limit for the current transaction), save the collected data to a temporary object and clear it from memory.
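Several of the suggestions above can be combined in one class. The following is only a sketch under stated assumptions: Batch_Staging__c with its long text field Payload__c, the follow-up class NextGroupBatch, and the 70% threshold are all hypothetical and would have to exist or be adapted in a real org. It measures the heap added by each execute call, flushes the Stateful collection to the temporary object when usage approaches the limit, clears the collection, and chains the next batch from finish.

```apex
// Sketch only: Batch_Staging__c, Payload__c and NextGroupBatch are hypothetical names.
public class ContactCollectBatch implements Database.Batchable<SObject>, Database.Stateful {
    // Stateful collection that grows across execute calls.
    private List<String> collectedEmails = new List<String>();
    // Assumed threshold: flush once 70% of the heap limit is in use.
    private static final Decimal FLUSH_RATIO = 0.7;

    public Database.QueryLocator start(Database.BatchableContext bc) {
        return Database.getQueryLocator('SELECT Id, Email FROM Contact WHERE Email != null');
    }

    public void execute(Database.BatchableContext bc, List<SObject> scope) {
        Integer heapAtStart = Limits.getHeapSize();
        for (SObject row : scope) {
            collectedEmails.add((String) row.get('Email'));
        }
        Integer heapAddedByCall = Limits.getHeapSize() - heapAtStart;
        System.debug('Heap added by this execute call: ' + heapAddedByCall);

        // Flush before the Stateful collection pushes the heap near the limit.
        if (Limits.getHeapSize() > Limits.getLimitHeapSize() * FLUSH_RATIO) {
            flush();
        }
    }

    // Saves the collected values to the temporary object and frees the heap.
    // Assumes Payload__c is large enough for the joined string; a real
    // implementation may need to split it across several staging records.
    private void flush() {
        if (collectedEmails.isEmpty()) {
            return;
        }
        insert new Batch_Staging__c(Payload__c = String.join(collectedEmails, ','));
        collectedEmails.clear();
    }

    public void finish(Database.BatchableContext bc) {
        flush();
        // Batch chaining: start the (hypothetical) batch for the next group.
        Database.executeBatch(new NextGroupBatch(), 200);
    }
}
```

Checking the heap against a ratio of Limits.getLimitHeapSize() rather than a hard-coded byte count keeps the sketch valid if the class is ever run in a context with a different heap limit.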