如何利用ftrace精确跟踪特定进程调度信息
1、Ftrace 是什么东西?Ftrace是一个直接内置在Linux内核中的跟踪工具。许多发行版在最近的发行版中已经启用了各种各样的Ftrace配置。Ftrace给Linux带来的好处之一是能够看到内核中发生了什么。因此,这使得发现问题区域或简单地跟踪奇怪的bug变得更容易管理。
上面那段话是直接翻译过来的,原文如下,我比较推荐大家直接看英文
2、怎么使用FtraceFtrace is a tracing utility built directly into the Linux kernel. Many distributions already have various configurations of Ftrace enabled in their most recent releases. One of the benefits that Ftrace brings to Linux is the ability to see what is happening inside the kernel. As such, this makes finding problem areas or simply tracking down that strange bug more manageable.
我的实验平台是MT8167、Linux4.4内核
2.1、配置内核宏
weiqifa@bsp-ubuntu1804:~/is10-sdk$ git diff kernel-4.4/
diff --git a/kernel-4.4/arch/arm/configs/xxx_defconfig
b/kernel-4.4/arch/arm/configs/xxx_defconfig
index 8da5210003..a88e53253f 100755
--- a/kernel-4.4/arch/arm/configs/xxx_defconfig
+++ b/kernel-4.4/arch/arm/configs/xxx_defconfig
@@ -400,3 +400,7 @@ CONFIG_USB_RTL8152=y
CONFIG_USB_NET_DRIVERS=y
CONFIG_GPIO_CONTROL=y
#CONFIG_POGO_PIN=y
+CONFIG_FUNCTION_TRACER=y
+CONFIG_FUNCTION_GRAPH_TRACER=y
+CONFIG_STACK_TRACER=y
+CONFIG_DYNAMIC_FTRACE=y
2.2、烧录boot.img 重新开机
2.3、查看是否生效
进入ftrace 目录
xxx:/ # cd /sys/kernel/debug/tracing
xxx:/sys/kernel/debug/tracing # ls
README enabled_functions saved_cmdlines set_graph_function trace_marker
available_events events saved_cmdlines_size set_graph_notrace trace_options
available_filter_functions free_buffer saved_tgids snapshot trace_pipe
available_tracers instances set_event stack_max_size tracing_cpumask
buffer_size_kb max_graph_depth set_event_pid stack_trace tracing_max_latency
buffer_total_size_kb options set_ftrace_filter stack_trace_filter tracing_on
current_tracer per_cpu set_ftrace_notrace trace tracing_thresh
dyn_ftrace_total_info printk_formats set_ftrace_pid trace_clock
xxx:/sys/kernel/debug/tracing #
查看内核支持的跟踪器列表
xxx:/sys/kernel/debug/tracing # cat available_tracers
function_graph function nop
xxx:/sys/kernel/debug/tracing #
Tracer有很多种,主要几大类:
-
函数类:function, function_graph, stack
-
延时类:irqsoff, preemptoff, preemptirqsoff, wakeup, wakeup_rt, waktup_dl
-
其他类:nop, mmiotrace, blk
使能function_graph跟踪器
echo function_graph > current_tracer
查看当前的跟踪器
xxx:/sys/kernel/debug/tracing # cat current_tracer
function_graph
xxx:/sys/kernel/debug/tracing #
使能ftrace 功能
xxx:/sys/kernel/debug/tracing # echo 1 > tracing_on
查看ftrace输出
1|Knowin inSight10:/sys/kernel/debug/tracing # cat trace | head -40
# tracer: function_graph
#
# CPU DURATION FUNCTION CALLS
# | | | | | | |
2) 4.000 us | } /* path_init */
2) | link_path_walk() {
2) | inode_permission2() {
2) | __inode_permission2() {
2) | generic_permission() {
2) 0.384 us | in_group_p();
2) 2.154 us | }
2) | security_inode_permission() {
2) | selinux_inode_permission() {
2) 0.231 us | __rcu_read_lock();
2) 0.308 us | avc_lookup();
2) 0.231 us | __rcu_read_unlock();
2) 5.923 us | }
2) 7.846 us | }
2) + 13.538 us | }
2) + 15.308 us | }
2) | walk_component() {
2) | lookup_fast() {
2) 0.307 us | __d_lookup_rcu();
2) 0.307 us | __lookup_mnt();
2) 4.231 us | }
2) 6.077 us | }
2) | inode_permission2() {
2) | __inode_permission2() {
3) 3.923 us | } /* down_trylock */
2) | generic_permission() {
3) 6.538 us | } /* console_trylock */
2) 0.308 us | in_group_p();
3) | console_unlock() {
3) 0.231 us | _raw_spin_lock_irqsave();
2) 2.231 us | }
2) | security_inode_permission() {
2) | selinux_inode_permission() {
3) 0.308 us | _raw_spin_unlock_irqrestore();
2) 0.231 us | __rcu_read_lock();
3) 0.231 us | _raw_spin_lock_irqsave();
Knowin inSight10:/sys/kernel/debug/tracing #
3、Ftrace 其他使用特点
直接cat trace 文件,内容太多了,我们可以通过设置filter来设置过滤函数。
为了方便使用,set_ftrace_filter 文件还支持简单格式的通配符。
-
value* 选择所有名字以 value字串开头的函数
-
*value* 选择所有名字中包含 value字串的函数
-
*value 选择所有名字以 value字串结尾的函数
设置filter
xxx:/sys/kernel/debug/tracing # echo raw* > set_ftrace_filter
xxx:/sys/kernel/debug/tracing #
再查看trace文件
xxx:/sys/kernel/debug/tracing # cat trace
# tracer: function_graph
#
# CPU DURATION FUNCTION CALLS
# | | | | | | |
2) 1.693 us | raw_local_deliver();
2) 0.308 us | raw_local_deliver();
2) 1.000 us | raw_notifier_call_chain();
2) 2.077 us | raw_notifier_call_chain();
2) 1.923 us | raw_local_deliver();
------------------------------------------
2) ntloop--1393 => <...>-16
------------------------------------------
2) 0.308 us | raw_local_deliver();
------------------------------------------
2) <...>-16 => <idle>-0
------------------------------------------
2) 0.307 us | raw_notifier_call_chain();
2) 0.846 us | raw_notifier_call_chain();
2) 0.385 us | raw_notifier_call_chain();
------------------------------------------
2) <idle>-0 => droid.b-4348
------------------------------------------
2) 1.000 us | raw_notifier_call_chain();
2) 1.384 us | raw_notifier_call_chain();
------------------------------------------
2) droid.b-4348 => Binder:-524
------------------------------------------
2) 0.923 us | raw_notifier_call_chain();
2) 1.077 us | raw_notifier_call_chain();
------------------------------------------
2) Binder:-524 => droid.b-4348
------------------------------------------
2) 1.000 us | raw_notifier_call_chain();
------------------------------------------
2) droid.b-4348 => Binder:-524
------------------------------------------
2) 1.231 us | raw_notifier_call_chain();
------------------------------------------
2) Binder:-524 => droid.b-4348
------------------------------------------
2) 1.077 us | raw_notifier_call_chain();
2) 1.692 us | raw_notifier_call_chain();
2) 2.000 us | raw_notifier_call_chain();
2) 1.000 us | raw_notifier_call_chain();
2) 0.923 us | raw_notifier_call_chain();
2) 0.462 us | raw_notifier_call_chain();
2) 1.000 us | raw_notifier_call_chain();
2) 0.846 us | raw_notifier_call_chain();
查看中间字符串
xxx:/sys/kernel/debug/tracing # echo *touch* > set_ftrace_filter
查看trace
xxx:/sys/kernel/debug/tracing # cat trace |head -40
# tracer: function_graph
#
# CPU DURATION FUNCTION CALLS
# | | | | | | |
0) | /* 6 [cpu_loading] not_reset_cpu_loading */
0) | /* 6 [cpu_loading] update cpu_loading */
0) | /* 6 [cpu_loading] cur_idle_time[0].time:933317139 cur_wall_time[0].time:1008009932
*/
0) | /* 6 [cpu_loading] cur_idle_time[1].time:936071150 cur_wall_time[1].time:1008009942
*/
0) | /* 6 [cpu_loading] cur_idle_time[2].time:917091836 cur_wall_time[2].time:1008009946
*/
0) | /* 6 [cpu_loading] cur_idle_time[3].time:890438730 cur_wall_time[3].time:1008009950
*/
0) | /* 6 [cpu_loading] cur_idle_time[0].time:933317139 cur_wall_time[0].time:1008009932
*/
0) | /* 6 [cpu_loading] cur_idle_time[1].time:936071150 cur_wall_time[1].time:1008009942
*/
0) | /* 6 [cpu_loading] cur_idle_time[2].time:917091836 cur_wall_time[2].time:1008009946
*/
0) | /* 6 [cpu_loading] cur_idle_time[3].time:890438730 cur_wall_time[3].time:1008009950
*/
0) | /* 6 [cpu_loading] tmp_cpu_loading:2 prev_cpu_loading:1 previous state:3
*/
0) | /* 6 [cpu_loading] sent uevent success:lower=2 */
0) | /* 6 [cpu_loading] current state:3
*/
0) | /* 6 [cpu_loading] enable timer */
1) 1.462 us | touch_atime();
1) 0.616 us | touch_atime();
1) 0.385 us | touch_atime();
1) 0.538 us | touch_atime();
1) 0.539 us | touch_atime();
1) 1.461 us | touch_atime();
1) 0.308 us | touch_atime();
1) 0.385 us | touch_atime();
1) 0.385 us | touch_atime();
1) 2.000 us | touch_atime();
1) 0.923 us | touch_atime();
1) 0.615 us | touch_atime();
xxx:/sys/kernel/debug/tracing #
通过该文件还可以指定属于特定模块的函数,这要用到 mod 指令。指定模块的格式为:
echo ':mod:[module_name]' > set_ftrace_filter
查看对应的内核模块
xxx:/sys/kernel/debug/tracing # cat /proc/devices
Character devices:
1 mem
2 pty
3 ttyp
4 ttyS
5 /dev/tty
5 /dev/console
5 /dev/ptmx
10 misc
13 input
14 sound
29 fb
使用过滤器只针对某个模块进行跟踪
xxx:/sys/kernel/debug/tracing # echo ':mod:sound' > set_ftrace_filter
1|xxx:/sys/kernel/debug/tracing #
4.参考资料需要注意的是,这三种形式不能组合使用,比如“beginmiddleend”实际的效果与“begin”相同。另外,使用通配符表达式时,需要用单引号将其括起来,如果使用双引号,shell 可能会对字符‘ * ’进行扩展,这样最终跟踪的对象可能与目标函数不一样。
[1]、 https://lwn.net/Articles/365835/
[2]、https://www.kernel.org/doc/Documentation/trace/ftrace.txt