How does Python deal with memory management? Learn the ins and outs of Python's garbage collection system and how to avoid its pitfalls.
Python grants its users many conveniences, and one of the largest is (nearly) hassle-free memory management. You don't need to manually allocate, track, and dispose of memory for objects and data structures in Python. The runtime does all of that for you, so you can focus on solving your actual problems instead of wrangling machine-level details.
Still, it's good for even modestly experienced Python users to understand how Python's garbage collection and memory management work. Understanding these mechanisms will help you avoid performance issues that can arise with more complex projects. You can also use Python's built-in tooling to monitor your program's memory management behavior.
In this article, we'll take a look at how Python memory management works, how its garbage collection system helps optimize memory in Python programs, and how to use the modules available in the standard library and elsewhere to control memory use and garbage collection.
How Python manages memory
Every Python object has a reference count, also known as a refcount. The refcount is a tally of how many references to a given object currently exist, whether they come from names or from other objects. When you add or remove references to an object, the number goes up or down. When an object's refcount goes to zero, that object is deallocated and its memory is freed up.
What is a reference? Anything that allows an object to be accessed by way of a name, or by way of an accessor in another object.
Here's a simple example:
x = "Hello there"
When we give Python this command, two things happen under the hood:
- The string "Hello there" is created and stored in memory as a Python object.
- The name x is created in the local namespace and pointed at that object, which increases its reference count by 1, to 1.
If we were to say y = x, then the reference count would be raised once again, to 2.
Whenever x and y go out of scope or are deleted from their namespaces, the reference count for the string goes down by 1 for each of those names. Once x and y are both out of scope or deleted, the refcount for the string goes to 0 and the string is removed from memory.
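The counting described above can be observed with sys.getrefcount(). The following is a minimal sketch, assuming CPython; the Greeting class is a hypothetical stand-in for the string, because CPython may share or specially handle literal constants, which makes their counts harder to read. The absolute numbers reported by sys.getrefcount() vary between versions, so the sketch only compares differences.

```python
import sys


class Greeting:
    """Hypothetical stand-in object; not part of the article's example."""

    def __init__(self, text):
        self.text = text


x = Greeting("Hello there")
baseline = sys.getrefcount(x)      # exact value depends on the interpreter

y = x                              # a second name now refers to the same object
after_alias = sys.getrefcount(x)

del y                              # removing the name drops that reference again
after_delete = sys.getrefcount(x)

print("change after y = x:", after_alias - baseline)
print("change after del y:", after_delete - baseline)
```

Only the differences are meaningful here: binding a second name raises the count by one, and deleting that name brings it back to the baseline.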
Now, let's say we create a list with a string in it, like this:
x = ["Hello there", 2, False]
The string remains in memory until either the list itself is removed or the element with the string in it is removed from the list. Either of these actions will cause the only thing holding a reference to the string to vanish.
Now consider this example:
x = "Hello there"
y = [x]
If we remove the first element from y, or delete the list y entirely, the string is still in memory. This is because the name x holds a reference to it.
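One way to watch this happen is with a weak reference, which observes an object without keeping it alive. This is a rough sketch, assuming CPython's immediate deallocation at refcount zero; the Message class is a hypothetical substitute for the string, since plain str objects cannot be weakly referenced.

```python
import weakref


class Message:
    """Hypothetical wrapper used because str does not support weak references."""

    def __init__(self, text):
        self.text = text


x = Message("Hello there")
y = [x]
probe = weakref.ref(x)             # observes the object without owning it

y.clear()                          # the list no longer refers to the object
alive_after_clear = probe() is not None   # the name x still holds a reference

del x                              # the last strong reference goes away
alive_after_del = probe() is not None

print("alive after clearing the list:", alive_after_clear)
print("alive after deleting x:", alive_after_del)
```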
Reference cycles in Python
In most cases, reference counts work fine. But sometimes you have a case where two objects each hold a reference to each other. This is known as a reference cycle. In this case, the reference counts for the objects will never reach zero, so reference counting alone will never remove them from memory.
Here's a contrived example:
x = SomeClass()
y = SomeOtherClass()
x.item = y
y.item = x
Since x and y hold references to each other, reference counting alone will never remove them, even if nothing else has a reference to either of them.
It's actually fairly common for Python's own runtime to generate reference cycles for objects. One example would be an exception with a traceback object that contains references to the exception itself.
In very early versions of Python, this was a problem. Objects with reference cycles could accumulate over time, which was a big issue for long-running applications. But Python has since introduced the cycle detection and garbage collection system, which manages reference cycles.
The Python garbage collector (gc)
Python's garbage collector detects objects with reference cycles. It does this by tracking objects that are "containers" (things like lists, dictionaries, and custom class instances) and determining what objects in them can't be reached anywhere else.
Once those objects are singled out, the garbage collector removes them by ensuring their reference counts can be safely brought down to zero. (For more about how this works, see the Python developer's guide.)
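To see the collector reclaim a cycle that reference counting cannot, something like the following could be used. It is a sketch under assumptions: the Node class is hypothetical, and automatic collection is switched off briefly so the timing is predictable.

```python
import gc
import weakref


class Node:
    """Hypothetical container class used only for this illustration."""

    def __init__(self, name):
        self.name = name
        self.item = None


was_enabled = gc.isenabled()
gc.disable()                       # keep automatic collections out of the way

x = Node("x")
y = Node("y")
x.item = y
y.item = x                         # x and y now form a reference cycle

probe = weakref.ref(x)
del x, y                           # no names are left, but the cycle remains

alive_before = probe() is not None # refcounts never reached zero
found = gc.collect()               # the cycle detector finds the unreachable pair
alive_after = probe() is not None

if was_enabled:
    gc.enable()

print("alive before collection:", alive_before)
print("unreachable objects found:", found)
print("alive after collection:", alive_after)
```

The exact number returned by gc.collect() depends on the interpreter version, so it is printed rather than relied upon.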
The vast majority of Python objects don't have reference cycles, so the garbage collector doesn't need to run 24/7. Instead, the garbage collector uses a few heuristics to run less often and to run as efficiently as possible each time.
While the Python interpreter runs, it tracks how many container objects have been allocated but not yet deallocated. The vast majority of Python objects have a very short lifespan, so they pop in and out of existence quickly. But over time, more long-lived objects hang around. Once allocations outnumber deallocations by more than a set threshold, the garbage collector runs. (The default threshold is 700 as of Python 3.10.)
Every time the garbage collector runs, it takes all the objects that survive the collection and puts them together in a group called a generation. These "generation 1" objects get scanned less often for reference cycles. Any generation 1 objects that survive further collections are eventually migrated into a second generation, where they're scanned even more rarely.
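The thresholds and per-generation counters described above can be inspected from the gc module. This short sketch prints them without assuming particular values, since the defaults differ between Python versions.

```python
import gc

thresholds = gc.get_threshold()    # collection thresholds, one per generation
counts = gc.get_count()            # current counters, one per generation
stats = gc.get_stats()             # per-generation statistics since startup

print("thresholds:", thresholds)
print("current counts:", counts)
for generation, info in enumerate(stats):
    print(
        f"generation {generation}: "
        f"{info['collections']} collections, "
        f"{info['collected']} objects collected"
    )
```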
Again, not everything is tracked by the garbage collector. Complex objects like instances of a user-created class, for instance, are always tracked. But a dictionary that holds only simple objects like integers and strings may be left untracked, because no object in that particular dictionary holds references to other objects. Simple objects that can't hold references to other elements, like integers and strings, are never tracked.
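The distinction can be checked with gc.is_tracked(). This is a small sketch with a hypothetical Point class; dictionaries are left out on purpose, because whether a given dictionary is tracked depends on its contents and on the Python version.

```python
import gc


class Point:
    """Hypothetical user-defined class; its instances can hold references."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


tracked = {
    "int": gc.is_tracked(42),
    "str": gc.is_tracked("Hello there"),
    "list": gc.is_tracked([1, 2, 3]),
    "instance": gc.is_tracked(Point(1, 2)),
}

for kind, is_tracked in tracked.items():
    print(f"{kind}: tracked={is_tracked}")
```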
How to use the gc module
Generally, the garbage collector doesn't need tuning to run well. Python's development team chose defaults that reflect the most common real-world scenarios. But if you do need to tweak the way garbage collection works, you can use Python's gc module. The gc module provides programmatic interfaces to the garbage collector's behaviors, and it provides visibility into what objects are being tracked.
One useful thing gc lets you do is toggle off the garbage collector when you're sure you won't need it. For instance, if you have a short-running script that piles up a lot of objects, you don't need the garbage collector. Everything will just be cleared out when the script ends. To that end, you can disable the garbage collector with the command gc.disable(). Later, you can re-enable it with gc.enable().
You can also run a collection cycle manually with gc.collect(). A common application for this would be to manage a performance-intensive section of your program that generates many temporary objects. You could disable garbage collection during that part of the program, then manually run a collection at the end and re-enable collection.
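One possible way to package the disable, collect, and re-enable pattern is a small context manager. The helper name gc_paused is an assumption made for this sketch and is not part of the gc module.

```python
import gc
from contextlib import contextmanager


@contextmanager
def gc_paused():
    """Hypothetical helper: pause automatic collection for a block of code."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        gc.collect()               # one manual pass over what piled up
        if was_enabled:
            gc.enable()            # restore the previous setting


state_before = gc.isenabled()

with gc_paused():
    enabled_inside = gc.isenabled()
    # Stand-in for a hot section that creates many temporary containers.
    rows = [[i, i * 2] for i in range(10_000)]
    total = sum(row[1] for row in rows)

state_after = gc.isenabled()
print("collector enabled inside the block:", enabled_inside)
print("total:", total)
```

Restoring the earlier state, rather than always calling gc.enable(), keeps the helper from overriding a caller that had already disabled collection.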
Another useful garbage collection optimization is gc.freeze(). When this command is issued, everything currently tracked by the garbage collector is "frozen," or listed as exempt from future collection scans. This way, future scans can skip over those objects. If you have a program that imports libraries and sets up a good deal of internal state before starting, you can issue gc.freeze() after all the work is done. This keeps the garbage collector from having to trawl over things that aren't likely to be removed anyway. (If you want to have garbage collection performed again on frozen objects, use gc.unfreeze().)
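A rough illustration of freezing after startup follows. The startup_state dictionary is a made-up stand-in for real initialization work, and gc.get_freeze_count() is used only to show that objects were moved out of the regular scans.

```python
import gc

# Stand-in for expensive startup work: imports, configuration, caches.
startup_state = {"plugins": [["plugin", i] for i in range(1000)]}

gc.freeze()                        # exempt everything tracked so far
frozen = gc.get_freeze_count()     # how many objects are now frozen

gc.unfreeze()                      # make them eligible for scanning again
after_unfreeze = gc.get_freeze_count()

print("frozen objects:", frozen)
print("frozen after unfreeze:", after_unfreeze)
```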
Debugging garbage collection with gc
You can also use gc to debug garbage collection behaviors. If you have an inordinate number of objects stacking up in memory and not being garbage collected, you can use gc's inspection tools to figure out what might be holding references to those objects.
If you want to know what objects hold a reference to a given object, you can use gc.get_referrers(obj) to list them. You can also use gc.get_referents(obj) to find any objects referred to by a given object.
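As a small sketch of both calls, with a hypothetical Payload class and a list acting as the object that holds it:

```python
import gc


class Payload:
    """Hypothetical object whose referrers we want to find."""


payload = Payload()
holder = [payload]                 # the list holds a reference to payload

referrers = gc.get_referrers(payload)   # objects that refer to payload
referents = gc.get_referents(holder)    # objects that holder refers to

holder_found = any(r is holder for r in referrers)
payload_found = any(r is payload for r in referents)

print("holder is among payload's referrers:", holder_found)
print("payload is among holder's referents:", payload_found)
```

The referrers list usually contains more than the list itself, for example the namespace dictionary that stores the name payload, so it is worth filtering by type or identity when debugging.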
If you're not sure if a given object is a candidate for garbage collection, gc.is_tracked(obj) tells you whether or not that object is tracked by the garbage collector. As noted earlier, keep in mind that the garbage collector doesn't track "atomic" objects (such as integers) or elements that contain only atomic objects.
If you want to see for yourself what objects are being collected, you can set the garbage collector's debugging flags with gc.set_debug(gc.DEBUG_LEAK|gc.DEBUG_STATS). This writes information about garbage collection to stderr. Because DEBUG_LEAK includes DEBUG_SAVEALL, it also keeps every object found to be unreachable in the list gc.garbage instead of freeing it.
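The sketch below uses only the gc.DEBUG_SAVEALL flag, rather than the full DEBUG_LEAK and DEBUG_STATS combination, so that nothing is written to stderr. The Leaky class is hypothetical. Clearing gc.garbage afterward matters, because the saved objects stay alive for as long as they sit in that list.

```python
import gc


class Leaky:
    """Hypothetical class used to build a throwaway reference cycle."""


gc.collect()                       # clear out unrelated garbage first
gc.set_debug(gc.DEBUG_SAVEALL)     # keep unreachable objects instead of freeing

a, b = Leaky(), Leaky()
a.other, b.other = b, a            # build a cycle
del a, b

gc.collect()
saved = [obj for obj in gc.garbage if isinstance(obj, Leaky)]
print("Leaky objects saved in gc.garbage:", len(saved))

gc.set_debug(0)                    # turn debugging back off
gc.garbage.clear()                 # let go of the saved objects
```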
Avoid pitfalls in Python memory management
As noted, objects can pile up in memory and not be collected if you still have references to them somewhere. This isn't a failure of Python's garbage collection as such; the garbage collector can't tell if you accidentally kept a reference to something or not.
Let's end with a few pointers for preventing objects from never being collected.
Pay attention to object scope
If you assign Object 1 to an attribute of Object 2 (such as a class instance), Object 1 can't be freed until Object 2 is freed or the attribute is reassigned:
obj1 = MyClass()
obj2 = MyClass()  # any object that allows attribute assignment
obj2.prop = obj1  # obj2 now keeps obj1 alive
What's more, if this happens in a way that's a side effect of some other operation, like passing Object 2 as an argument to a constructor for Object 1, you might not realize Object 1 is holding a reference:
obj1 = MyClass(obj2)
Another example: If you push an object into a list and forget about the list, the object will remain until it is removed from the list, or until the list itself no longer has any references. And if that list is a module-level object, it'll likely hang around until the program terminates.
In short, be conscious of ways your object might be held by another object that doesn't always look obvious.
Use weakref to avoid reference cycles
Python's weakref module lets you create weak references to other objects. Weak references don't increase an object's reference count, so an object that has only weak references is a candidate for garbage collection.
One common use for weakref would be an object cache. You don't want the referenced object to be preserved just because it has a cache entry, so you use a weakref for the cache entry.
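A cache along these lines could be sketched with weakref.WeakValueDictionary, which drops an entry once its value has no strong references left. The Image class and the "logo" key are invented for the example, and the immediate cleanup shown assumes CPython's reference counting.

```python
import weakref


class Image:
    """Hypothetical cached resource."""

    def __init__(self, name):
        self.name = name


cache = weakref.WeakValueDictionary()

img = Image("logo")
cache["logo"] = img                # the cache entry does not keep img alive
cached_while_alive = "logo" in cache

del img                            # drop the only strong reference
cached_after_delete = "logo" in cache

print("in cache while referenced:", cached_while_alive)
print("in cache after deletion:", cached_after_delete)
```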
Manually break reference cycles
Finally, if you're aware that a given object holds a reference to another object, you can always break the reference to that object manually. For instance, if you have instance_of_class.ref = other_object, you can set instance_of_class.ref = None when you're preparing to remove instance_of_class.
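The effect of breaking a cycle by hand can be sketched as follows. The Holder class is hypothetical, and the cycle collector is paused so that only reference counting is at work.

```python
import gc
import weakref


class Holder:
    """Hypothetical class with a single attribute pointing at another object."""

    def __init__(self):
        self.ref = None


was_enabled = gc.isenabled()
gc.disable()                       # rely on reference counting alone

a = Holder()
b = Holder()
a.ref = b
b.ref = a                          # a and b form a cycle

probe = weakref.ref(a)

a.ref = None                       # break the cycle manually
del a, b                           # refcounts can now fall to zero

freed_without_gc = probe() is None

if was_enabled:
    gc.enable()

print("freed without the cycle collector:", freed_without_gc)
```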


