Measuring the CPU and Memory Usage of a Python Subprocess
This article describes a method for measuring the resources used by a process launched with subprocess.run or subprocess.Popen.
We can use psutil to get real-time usage data for a process. However, psutil may not give accurate measurements for short-lived processes, because usage has to be sampled at intervals and brief spikes can be missed. We want a method that reports the final resource usage of a process after it finishes.
The method uses multiprocessing.Process to wrap the call to subprocess.run or subprocess.Popen, and obtains the resource usage by calling resource.getrusage(resource.RUSAGE_CHILDREN) inside the wrapper process. For example:
import resource
import subprocess
from multiprocessing import Process, Manager


# Defined at module level so it can be pickled when the start method
# is spawn or forkserver; a nested function only works with fork.
def _run(cmd, result):
    # Runs inside the wrapper process: launch the command and wait for it.
    p = subprocess.Popen(cmd,
                         shell=True,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    stdout, stderr = p.communicate()
    result['returncode'] = p.returncode
    result['stdout'] = stdout
    result['stderr'] = stderr
    # Usage of the terminated, waited-for children of the wrapper process.
    rusage = resource.getrusage(resource.RUSAGE_CHILDREN)
    result['rusage'] = rusage


def run(cmd):
    '''
    Run a command, for example
    run('stress --timeout 5 -c 2 --vm-bytes 128M')
    '''
    with Manager() as manager:
        result = manager.dict()
        process = Process(target=_run, args=(cmd, result))
        process.start()
        process.join()
        return dict(result)
The result and the usage data are transferred back to the main process through multiprocessing.Manager.
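The value stored under result['rusage'] is a resource.struct_rusage, whose fields are raw numbers. As a minimal sketch of how they might be turned into readable figures, the helper below (summarize_rusage is a hypothetical name, not part of the original method) extracts CPU time and peak memory. It assumes the documented platform difference that ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.

```python
import resource
import sys


def summarize_rusage(rusage):
    """Convert a resource.struct_rusage into a small dict of readable figures.

    Hypothetical helper for illustration only.
    """
    # ru_maxrss is in kilobytes on Linux and in bytes on macOS.
    if sys.platform == "darwin":
        maxrss_bytes = rusage.ru_maxrss
    else:
        maxrss_bytes = rusage.ru_maxrss * 1024
    return {
        "user_cpu_seconds": rusage.ru_utime,
        "system_cpu_seconds": rusage.ru_stime,
        "total_cpu_seconds": rusage.ru_utime + rusage.ru_stime,
        "max_rss_mib": maxrss_bytes / (1024 * 1024),
    }


if __name__ == "__main__":
    # Demonstrated on the current process; the same call works on result['rusage'].
    print(summarize_rusage(resource.getrusage(resource.RUSAGE_SELF)))
```

The same function can be applied to the 'rusage' entry of the dictionary returned by run.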
multiprocessing.Process creates an isolated child process that becomes the direct parent of the subprocess launched via subprocess.Popen. This matters because RUSAGE_CHILDREN accumulates the usage of every child that the calling process has waited for, so calling it directly in the main process would mix in all subprocesses run earlier. The wrapper process starts with empty child statistics and launches only the target command, so the usage it reports corresponds solely to that command and is not affected by unrelated processes.
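The need for the wrapper can be seen by calling getrusage(RUSAGE_CHILDREN) directly in one process while running two commands in a row. The sketch below is only an illustration: measure_twice and the BUSY snippet are made-up names, and the child is a short Python loop chosen so that it burns a measurable amount of CPU.

```python
import resource
import subprocess
import sys

# Hypothetical workload: a short CPU-bound loop run in a child interpreter.
BUSY = "sum(i * i for i in range(2_000_000))"


def children_cpu_seconds():
    """Total CPU time of all children this process has waited for so far."""
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return usage.ru_utime + usage.ru_stime


def measure_twice():
    """Run the same command twice and read the counter after each run."""
    before = children_cpu_seconds()
    subprocess.run([sys.executable, "-c", BUSY], check=True)
    after_first = children_cpu_seconds()
    subprocess.run([sys.executable, "-c", BUSY], check=True)
    after_second = children_cpu_seconds()
    return before, after_first, after_second


if __name__ == "__main__":
    before, after_first, after_second = measure_twice()
    # The second reading includes the first command as well as the second.
    print(before, after_first, after_second)
```

The counter only ever grows within a process, so the second reading is a running total rather than the cost of the second command. Running each command inside its own wrapper process avoids having to subtract earlier readings.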
The method is similar to the time command, but it does not affect the stdout/stderr output of the target subprocess.
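To illustrate that the target's output comes back unchanged next to the usage figures, here is a compact variant of the wrapper idea. It is a sketch under stated assumptions, not the article's code: it pins the fork start method (Unix only), returns the data over a multiprocessing Pipe instead of a Manager, and copies only a few rusage fields into a plain dict. The names _measure and run_measured are hypothetical.

```python
import multiprocessing
import resource
import subprocess


def _measure(cmd, conn):
    # Runs inside the wrapper process, so RUSAGE_CHILDREN covers only `cmd`.
    completed = subprocess.run(cmd, shell=True, capture_output=True)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    conn.send({
        "returncode": completed.returncode,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
        "cpu_seconds": usage.ru_utime + usage.ru_stime,
        "ru_maxrss": usage.ru_maxrss,
    })
    conn.close()


def run_measured(cmd):
    # Assumption: a Unix system where the "fork" start method is available.
    ctx = multiprocessing.get_context("fork")
    receiver, sender = ctx.Pipe(duplex=False)
    wrapper = ctx.Process(target=_measure, args=(cmd, sender))
    wrapper.start()
    sender.close()            # the parent keeps only the receiving end
    result = receiver.recv()  # read before join so a large payload cannot block the child
    wrapper.join()
    return result


if __name__ == "__main__":
    result = run_measured("echo hello")
    print(result["stdout"])  # bytes written by the command, captured as-is
    print(result["cpu_seconds"], result["ru_maxrss"])
```

Because the measurements travel in a separate channel, nothing is appended to the command's own stdout or stderr, whereas time prints its report to the terminal alongside the command's output.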