Measuring CPU Time and Memory Usage of Python Code


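1. Measuring CPU time

Use a decorator to measure how long a function takes to run. Note that time.time() reports wall-clock time, so the result also includes any time the function spends waiting or sleeping.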

snippet.python
import sys
import time
from functools import wraps

import numpy as np

# 1. Define the decorator
def ctimer(function):
    @wraps(function)
    def ftimer(*args, **kwargs):
        t0 = time.time()
        result = function(*args, **kwargs)
        t1 = time.time()
        # __name__ works on Python 2 and 3 (func_name is Python 2 only)
        sys.stderr.write("[INFO] '%s' running cost: %s seconds\n" % (function.__name__, t1 - t0))
        return result
    return ftimer

# 2. Apply the decorator
@ctimer
def targetfunc():
    a = np.arange(1000000)
    return 1000000

# 3. Measure the function's running time
targetfunc()
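
Because the decorator above uses time.time(), it measures elapsed wall-clock time rather than CPU time proper. The variant below is a minimal sketch, not part of the original note, that records both values using the standard-library functions time.perf_counter() and time.process_time(). The names cpu_timer, sleepy_sum, last_wall and last_cpu are made up for illustration.

```python
import sys
import time
from functools import wraps


def cpu_timer(function):
    """Report wall-clock and CPU time for each call (illustrative helper)."""
    @wraps(function)
    def wrapper(*args, **kwargs):
        wall0 = time.perf_counter()   # high-resolution wall-clock timer
        cpu0 = time.process_time()    # CPU time of this process, excludes sleep
        result = function(*args, **kwargs)
        wrapper.last_cpu = time.process_time() - cpu0
        wrapper.last_wall = time.perf_counter() - wall0
        sys.stderr.write("[INFO] '%s' wall: %.4f s, cpu: %.4f s\n"
                         % (function.__name__, wrapper.last_wall, wrapper.last_cpu))
        return result
    # Attributes holding the timings of the most recent call
    wrapper.last_cpu = None
    wrapper.last_wall = None
    return wrapper


@cpu_timer
def sleepy_sum(n):
    time.sleep(0.05)          # adds to wall-clock time only
    return sum(range(n))      # adds to both wall-clock and CPU time


sleepy_sum(100000)
```

For a function that mostly sleeps or waits on I/O, the wall-clock figure should be noticeably larger than the CPU figure. For a purely computational function the two should be close.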

2. Profiling with cProfile

cProfile reports the time spent in each function or method, along with the number of times it was called.

snippet.bash
python -m cProfile -s cumulative test.py
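
cProfile can also be driven from inside a script, which is handy when only part of a program needs profiling. The following is a small sketch using the standard-library cProfile and pstats modules. The functions slow_square_sum and workload are placeholders invented for this example.

```python
import cProfile
import io
import pstats


def slow_square_sum(n):
    # Deliberately simple CPU-bound work
    return sum(i * i for i in range(n))


def workload():
    return [slow_square_sum(20000) for _ in range(5)]


profiler = cProfile.Profile()
profiler.enable()      # start collecting call statistics
workload()
profiler.disable()     # stop collecting

# Format the statistics, sorted by cumulative time like "-s cumulative"
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)   # show at most 10 entries
report = buffer.getvalue()
print(report)
```

The printed table has the same columns as the command-line output (ncalls, tottime, percall, cumtime), restricted to the code that ran between enable() and disable().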

3. Per-line timing

Use the line_profiler module to measure the time spent on each line of code.

snippet.bash
# Install line_profiler
pip install line_profiler

Define the script

snippet.python
import numpy as np

# Apply the decorator (kernprof -l makes `profile` available at run time)
@profile
def test(n):
  l = np.arange(n)
  l.sort()
  return l

test(10000)

Run the script

snippet.bash
kernprof -l -v test.py
 
# Sample output:
Wrote profile results to test.py.lprof # view again later with: python -m line_profiler test.py.lprof
Timer unit: 1e-06 s
 
Total time: 0.001966 s
File: test.py
Function: storedata at line 8
 
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     8                                           @profile
     9                                           def storedata():
    10         1           26     26.0      1.3  	data = np.arange(10)
    11         1            8      8.0      0.4  	label = map(str,range(10))
    12         1          759    759.0     38.6  	datafile = h5py.File('store.h5','w')
    13         1          544    544.0     27.7  	datafile.create_dataset('data', data = data)
    14         1          331    331.0     16.8  	datafile.create_dataset('label', data = label)
    15         1          297    297.0     15.1  	datafile.close()
    16         1            1      1.0      0.1  	return 0
 
## Column descriptions:
#Line #: The line number in the file.
 
#Hits: The number of times that line was executed.
 
#Time: The total amount of time spent executing the line in the timer's units. In the header information before the tables, you will see a line "Timer unit:" giving the conversion factor to seconds. It may be different on different systems.
 
#Per Hit: The average amount of time spent executing the line once in the timer's units.
 
#% Time: The percentage of time spent on that line relative to the total amount of recorded time spent in the function.
 
#Line Contents: The actual source code. Note that this is always read from disk when the formatted results are viewed, not when the code was executed. If you have edited the file in the meantime, the lines will not match up, and the formatter may not even be able to locate the function for display.

4. Measuring memory usage

Fabian Pedregosa wrote a very useful memory profiler modeled on Robert Kern's line_profiler.

Install the modules:

snippet.bash
pip install -U memory_profiler
pip install psutil

As with line_profiler, memory_profiler requires the target function to be decorated with @profile:

snippet.bash
# Show the memory used by the function, line by line:
python -m memory_profiler test.py # this run takes noticeably longer than normal
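
A script decorated with @profile fails with a NameError when run with plain python, because the name profile exists only when a profiler runner supplies it. One common workaround, sketched below on the assumption that the runner makes profile available as a builtin, is to fall back to a no-op decorator. The function build_lists is a made-up example.

```python
try:
    profile  # present when a profiler runner has supplied it
except NameError:
    def profile(func):
        """Fallback no-op decorator used when no profiler is active."""
        return func


@profile
def build_lists(n):
    squares = [i * i for i in range(n)]          # allocates a list of n ints
    evens = [v for v in squares if v % 2 == 0]   # allocates a second list
    return len(evens)


if __name__ == "__main__":
    print(build_lists(100000))
```

With this guard the same file can be run normally, under kernprof -l -v, or under python -m memory_profiler without editing the decorator in and out.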

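5. Memory leaks

The Python interpreter uses reference counting as its primary way of tracking memory. Each object carries a counter that is incremented when a reference to the object is stored somewhere and decremented when a reference is deleted. When the counter reaches zero, the interpreter deletes the object and frees its memory. If the program keeps holding references to objects it no longer uses, a memory leak occurs.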

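The quickest way to find this kind of memory leak is objgraph, written by Marius Gedminas. It shows how many objects of each type are in memory and locates everything in the program that holds a reference to a given object. It is a very effective tool.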

snippet.bash
# Install the module
pip install objgraph
snippet.python
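# Usage
# Drop into the debugger at the point of interest:
import pdb
pdb.set_trace()
 
# At the prompt, list the most common object types in the program (pass limit=20 for the top 20):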
(pdb) import objgraph
(pdb) objgraph.show_most_common_types()
 
MyBigFatObject             20000
tuple                      16938
function                   4310
dict                       2790
wrapper_descriptor         1181
builtin_function_or_method 934
weakref                    764
list                       634
method_descriptor          507
getset_descriptor          451
type                       439
 
# You can also see which object types have grown in number between two points in time
(pdb) import objgraph
(pdb) objgraph.show_growth()
 
# ... let the program run for a while, then call it again ...
(pdb) objgraph.show_growth()   # shows only the changes since the previous show_growth() call
 
traceback                4        +2
KeyboardInterrupt        1        +1
frame                   24        +1
list                   667        +1
tuple                16969        +1
 
# You can also see what holds references to a given object.
x = [1,2,3,4,5,6]
y = [x, [x], {"a":x}]
import pdb; pdb.set_trace()
 
# To see what refers to the variable x, call objgraph.show_backrefs():
(pdb) import objgraph
(pdb) objgraph.show_backrefs([x], filename="x_mem.png")
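
When objgraph is not installed, a rough approximation of the first and last queries is possible with the standard-library gc module. This is a swapped-in sketch, not objgraph's API or implementation. The helpers most_common_types and direct_referrers are hypothetical, and only objects tracked by the garbage collector are visible to them.

```python
import gc
from collections import Counter


def most_common_types(limit=10):
    """Count gc-tracked objects by type name (hypothetical helper)."""
    counts = Counter(type(obj).__name__ for obj in gc.get_objects())
    return counts.most_common(limit)


def direct_referrers(target):
    """Return the gc-tracked objects that directly refer to target."""
    return gc.get_referrers(target)


x = [1, 2, 3, 4, 5, 6]
y = [x, [x], {"a": x}]

# Type name and instance count, most frequent first
for name, count in most_common_types(5):
    print("%-30s %d" % (name, count))

# Types of the objects holding a direct reference to x; besides the three
# containers inside y this usually includes the enclosing namespace
print(sorted(type(r).__name__ for r in direct_referrers(x)))
```

Unlike objgraph.show_backrefs(), this prints only one level of referrers and draws no graph, so it is best treated as a quick first check.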

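In short, objgraph can list the most common object types in memory, show which types have grown in number between two points in time, and show what holds references to a given object.

Reference: http://mg.pov.lt/objgraph/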