A Brief Look at perf, a Powerful Performance Analysis Tool (Part 2)

There are three ways to fix this problem. They are not covered in detail here; these stack walking techniques may get a separate article later:

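using DWARF data to unwind the stack. Many profiling tools, such as gperftools and valgrind, walk the stack with DWARF data (gperftools, for instance, can do so through libunwind)
using last branch record (LBR) if available (a processor feature)
returning the frame pointers (building the binary with frame pointers kept, e.g. -fno-omit-frame-pointer)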
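
The three techniques above correspond to the three modes accepted by the `--call-graph` option of `perf record` (`dwarf`, `lbr`, `fp`). The following is a minimal sketch that only builds the command lines; the helper name `perf_record_cmd` and the target `./my_app` are hypothetical and used purely for illustration.

```python
import shlex

# Modes accepted by `perf record --call-graph`, one per technique listed above.
CALL_GRAPH_MODES = {
    "dwarf": "unwind the stack using DWARF debug data",
    "lbr": "use the last branch record, if the processor provides it",
    "fp": "walk frame pointers (binary built with -fno-omit-frame-pointer)",
}


def perf_record_cmd(mode, target, output="perf.data"):
    """Build the argv for `perf record` using the chosen stack walking mode."""
    if mode not in CALL_GRAPH_MODES:
        raise ValueError(f"unknown call-graph mode: {mode!r}")
    return ["perf", "record", "--call-graph", mode, "-o", output, "--", *target]


if __name__ == "__main__":
    # Print one command per mode; ./my_app is a placeholder program.
    for mode, description in CALL_GRAPH_MODES.items():
        print(f"# {description}")
        print(shlex.join(perf_record_cmd(mode, ["./my_app"])))
```

Which mode works best depends on the binary and the hardware: `fp` needs frame pointers in the compiled code, `dwarf` needs debug information, and `lbr` needs processor support.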

5. Usage
perf provides a series of subcommands for analyzing programs, listed below:
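Subcommand      Description
annotate        Reads perf.data (generated by perf record) and displays annotated code. If the profiled program contains debug symbols, the assembly is shown together with the corresponding source; otherwise only the assembly is shown.
archive         Packs the object files referenced by build-id in perf.data (generated by perf record), so that the data can be analyzed on another machine.
bench           A general framework for perf's benchmark suites; it can evaluate the scheduling, IPC, and memory access performance of the current system.
buildid-cache   Manages the build-id cache and the corresponding binary files.
buildid-list    Lists all build-ids in perf.data.
data            Converts a perf.data file into other formats.
diff            Reads multiple perf.data files and reports the differences between them.
evlist          Lists the events recorded in perf.data.
kmem            Analyzes kernel memory usage.
kvm             Analyzes guest OSes running on virtual machines.
list            Lists all event names supported by the current system, in three categories: hardware events, software events, and tracepoints.
lock            Analyzes kernel lock information, including lock contention and wait latency.
record          Records events while a program runs and writes them to perf.data.
report          Reads perf.data (generated by perf record) and displays the analysis results.
sched           An analysis tool for the scheduler subsystem.
script          Reads perf.data (generated by perf record) and produces trace records for use by other analysis tools.
stat            Collects performance counter statistics while a program runs.
test            Runs sanity tests on the current software and hardware platform; it can be used to check whether the platform supports all of perf's features.
timechart       Produces a visualization of the recorded data; the data must be recorded with perf timechart record.
top             Analyzes system performance in real time, similar to the top command; it can also analyze a single process.
probe           Defines dynamic tracepoints.
trace           Similar to strace; traces the system calls of the target, but with lower overhead than strace.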
Broadly speaking, perf can be used in three ways:

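Counting: counts how many times events occur. This mode does not generate a perf.data file, for example perf stat (perf top also writes no perf.data file, although it works by sampling).
Sampling: samples events and writes them to a kernel buffer, from which they are asynchronously written to the perf.data file. The perf.data file can then be read by perf report or perf script.
BPF programs on events (https://www.ibm.com/developerworks/cn/linux/l-lo-eBPF-history/index.html)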
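
To make the difference between the first two modes concrete, here is a small sketch that builds a counting command (`perf stat`) and a sampling pair (`perf record` followed by `perf report`). The helper names, the 99 Hz frequency, and the target `./my_app` are assumptions chosen for illustration, not recommendations from the article.

```python
import shlex


def counting_cmd(events, target):
    """Counting: perf stat prints event totals and writes no perf.data file."""
    return ["perf", "stat", "-e", ",".join(events), "--", *target]


def sampling_cmds(event, freq_hz, target, output="perf.data"):
    """Sampling: perf record writes samples to a file, perf report reads it back."""
    record = ["perf", "record", "-e", event, "-F", str(freq_hz),
              "-o", output, "--", *target]
    report = ["perf", "report", "-i", output, "--stdio"]
    return record, report


if __name__ == "__main__":
    # ./my_app is a placeholder program.
    print(shlex.join(counting_cmd(["cycles", "instructions"], ["./my_app"])))
    for cmd in sampling_cmds("cycles", 99, ["./my_app"]):
        print(shlex.join(cmd))
```

The counting command gives a cheap overall picture; the sampling pair costs more but shows where in the program the events occur.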

5.1. perf list
The perf list command lists the events currently available to perf:
  cpu-cycles OR cycles                               [Hardware event]
  instructions                                       [Hardware event]
  cache-references                                   [Hardware event]
  cache-misses                                       [Hardware event]
  branch-instructions OR branches                    [Hardware event]
  branch-misses                                      [Hardware event]
  bus-cycles                                         [Hardware event]
  stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]
  stalled-cycles-backend OR idle-cycles-backend      [Hardware event]
  ref-cycles                                         [Hardware event]
  alignment-faults                                   [Software event]
  bpf-output                                         [Software event]
  context-switches OR cs                             [Software event]
  cpu-clock                                          [Software event]
  cpu-migrations OR migrations                       [Software event]
  dummy                                              [Software event]
  emulation-faults                                   [Software event]
  major-faults                                       [Software event]
  minor-faults                                       [Software event]
  page-faults OR faults                              [Software event]
  task-clock                                         [Software event]
  msr/tsc/                                           [Kernel PMU event]
  rNNN                                               [Raw hardware event descriptor]
  cpu/t1=v1[,t2=v2,t3 ...]/modifier                  [Raw hardware event descriptor]
   (see 'man perf-list' on how to encode it)
  mem:<addr>[/len][:access]                          [Hardware breakpoint]

These events fall into three categories (as mentioned at the start of the article when describing how perf works): Hardware Event, Software event, Tracepoint event.
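
As a rough illustration of this categorization, the sketch below groups lines in the `name [Category]` format by their bracketed category. The sample text is a short hand-copied excerpt in the format of the perf list output; the regular expression and the function name `group_events` are assumptions, and the real output format may differ between perf versions.

```python
import re
from collections import defaultdict

# A few lines in the format of the `perf list` output.
SAMPLE = """\
  cpu-cycles OR cycles                               [Hardware event]
  instructions                                       [Hardware event]
  context-switches OR cs                             [Software event]
  page-faults OR faults                              [Software event]
  msr/tsc/                                           [Kernel PMU event]
"""

# Event name(s), then whitespace, then the category in square brackets.
LINE_RE = re.compile(r"^\s*(?P<names>.+?)\s+\[(?P<kind>[^\]]+)\]\s*$")


def group_events(text):
    """Map each bracketed category to a list of events, each with its aliases."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip blank lines and notes without a category
        aliases = [name.strip() for name in match.group("names").split(" OR ")]
        groups[match.group("kind")].append(aliases)
    return dict(groups)


if __name__ == "__main__":
    for kind, events in group_events(SAMPLE).items():
        print(kind, "->", [aliases[0] for aliases in events])
```

In practice the input could come from running `perf list` and capturing its standard output instead of the embedded sample.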
The meaning of each specific event is described in the perf_event_open man page: