Python: How to estimate / calculate memory footprint of data structures?

What's a good way to estimate the memory footprint of an object?

Conversely, what's a good way to measure the footprint?

For example, say I have a dictionary whose values are lists of (integer, float) tuples:

d['key'] = [ (1131, 3.11e18), (9813, 2.48e19), (4991, 9.11e18) ]

I have 4 GB of physical memory and would like to figure out approximately how many rows (key: value pairs) I can store in memory before I spill into swap. This is on Linux (Ubuntu 8.04) and OS X 10.5.6.

Also, what's the best way to figure out the actual in-memory footprint of my program? How do I best figure out when it's exhausting physical memory and spilling?

Asked by: Kelsey915 | Posted: 06-12-2021

Answer 1

Guppy has a nice memory profiler (Heapy):

>>> from guppy import hpy
>>> hp = hpy()
>>> hp.setrelheap() # ignore all existing objects
>>> d = {}
>>> d['key'] = [ (1131, 3.11e18), (9813, 2.48e19), (4991, 9.11e18) ]
>>> hp.heap()
 Partition of a set of 24 objects. Total size = 1464 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0      2   8      676  46       676  46 types.FrameType
     1      6  25      220  15       896  61 str
     2      6  25      184  13      1080  74 tuple

Heapy is a little underdocumented, so you might have to dig through the web page or source code a little, but it's very powerful. There are also some articles which might be relevant.
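If installing Guppy isn't an option, a rough estimate can be had from the standard library alone. The following is a minimal sketch, not Heapy's method: `deep_sizeof` is a hypothetical helper that walks containers and sums `sys.getsizeof` for each distinct object. It assumes CPython and only follows dicts, lists, tuples and sets, so the figure is approximate and ignores allocator overhead.

```python
import sys

def deep_sizeof(obj, seen=None):
    """Approximate bytes reachable from obj (hypothetical helper, CPython only)."""
    if seen is None:
        seen = set()
    if id(obj) in seen:          # count shared objects only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)    # shallow size of this object
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_sizeof(key, seen) + deep_sizeof(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += deep_sizeof(item, seen)
    return size

d = {}
d['key'] = [(1131, 3.11e18), (9813, 2.48e19), (4991, 9.11e18)]

# Cost of one row: the key string plus the list and everything inside it.
# The dict's own per-slot overhead is not included here.
row_bytes = sys.getsizeof('key') + deep_sizeof(d['key'])
budget = 4 * 1024 ** 3           # 4 GB of physical memory, from the question
print("approx. bytes per row:", row_bytes)
print("rough upper bound on rows:", budget // row_bytes)
```

The row count is an upper bound at best: the interpreter, the dict's hash table and other processes all need memory too, so the real limit will be noticeably lower.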

Answered by: Catherine804 | Posted: 07-01-2022

Answer 2

You can do this with a memory profiler, of which there are a couple I'm aware of:

  1. PySizer - possibly obsolete, as the homepage now recommends:

  2. Heapy.

This is possibly a duplicate of this question.
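For measuring rather than estimating, newer Pythons ship tools in the standard library. The sketch below is an assumption on my part and not one of the profilers listed above: it uses `tracemalloc` (Python 3.4+) to measure what a sample of rows actually allocates, and `resource.getrusage` (Unix only) to read the process's peak resident set size. `peak_rss_bytes` is a hypothetical helper; `ru_maxrss` is reported in kilobytes on Linux and in bytes on OS X.

```python
import resource
import sys
import tracemalloc

def peak_rss_bytes():
    """Peak resident set size of this process in bytes (hypothetical helper)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is in bytes on OS X and in kilobytes on Linux.
    return peak if sys.platform == "darwin" else peak * 1024

tracemalloc.start()
before, _ = tracemalloc.get_traced_memory()

# Build a sample of rows shaped like the ones in the question.
d = {}
for i in range(10000):
    d[i] = [(i, float(i)), (i + 1, float(i + 1)), (i + 2, float(i + 2))]

after, traced_peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

bytes_per_row = (after - before) / len(d)
print("measured bytes per row: %.1f" % bytes_per_row)
print("peak RSS so far: %d bytes" % peak_rss_bytes())
```

Comparing the peak RSS against physical memory gives a hint of how close the process is to swapping, though the operating system may start paging earlier if other processes are competing for RAM.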

Answered by: Grace587 | Posted: 07-01-2022
