Commit 2e04247

Merge tag 'ftrace-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ftrace updates from Steven Rostedt:

 - Have fprobes built on top of function graph infrastructure

   The fprobe logic is an optimized kprobe that uses ftrace to attach to
   functions when a probe is needed at the start or end of the function.
   The fprobe and kretprobe logic implements a similar method as the
   function graph tracer to trace the end of the function. That is to
   hijack the return address and jump to a trampoline to do the trace when
   the function exits. To do this, a shadow stack needs to be created to
   store the original return address. Fprobes and function graph do this
   slightly differently. Fprobes (and kretprobes) have slots per callsite
   that are reserved to save the return address. This is fine when just a
   few points are traced. But users of fprobes, such as BPF programs, are
   starting to add many more locations, and this method does not scale.

   The function graph tracer was created to trace all functions in the
   kernel. In order to do this, when function graph tracing is started,
   every task gets its own shadow stack to hold the return address that is
   going to be traced. The function graph tracer has been updated to allow
   multiple users to use its infrastructure. Now have fprobes be one of
   those users. This will also allow for the fprobe and kretprobe methods
   to trace the return address to become obsolete. With new technologies
   like CFI that need to know about these methods of hijacking the return
   address, going toward a solution that has only one method of doing this
   will make the kernel less complex.

 - Cleanup with guard() and free() helpers

   There were several places in the code that had a lot of "goto out" in
   the error paths to either unlock a lock or free some memory that was
   allocated. But this is error prone. Convert the code over to use the
   guard() and free() helpers that let the compiler unlock locks or free
   memory when the function exits.

 - Remove disabling of interrupts in the function graph tracer

   When function graph tracer was first introduced, it could race with
   interrupts and NMIs. To prevent that race, it would disable interrupts
   and not trace NMIs. But the code has changed to allow NMIs and also
   interrupts. This change was done a long time ago, but the disabling of
   interrupts was never removed. Remove the disabling of interrupts in the
   function graph tracer as it is not needed. This greatly improves its
   performance.

 - Allow the :mod: command to enable tracing module functions on the
   kernel command line.

   The function tracer already has a way to enable functions to be traced
   in modules by writing ":mod:<module>" into set_ftrace_filter. That will
   enable either all the functions for the module if it is loaded, or if
   it is not, it will cache that command, and when the module is loaded
   that matches <module>, its functions will be enabled. This also allows
   init functions to be traced. But currently events do not have that
   feature.

   Because enabling function tracing can be done very early at boot up
   (before scheduling is enabled), the commands that can be done when
   function tracing is started are limited. Having the ":mod:" command to
   trace module functions as they are loaded is very useful. Update the
   kernel command line function filtering to allow it.

* tag 'ftrace-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (26 commits)
  ftrace: Implement :mod: cache filtering on kernel command line
  tracing: Adopt __free() and guard() for trace_fprobe.c
  bpf: Use ftrace_get_symaddr() for kprobe_multi probes
  ftrace: Add ftrace_get_symaddr to convert fentry_ip to symaddr
  Documentation: probes: Update fprobe on function-graph tracer
  selftests/ftrace: Add a test case for repeating register/unregister fprobe
  selftests: ftrace: Remove obsolate maxactive syntax check
  tracing/fprobe: Remove nr_maxactive from fprobe
  fprobe: Add fprobe_header encoding feature
  fprobe: Rewrite fprobe on function-graph tracer
  s390/tracing: Enable HAVE_FTRACE_GRAPH_FUNC
  ftrace: Add CONFIG_HAVE_FTRACE_GRAPH_FUNC
  bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled
  tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
  tracing: Add ftrace_fill_perf_regs() for perf event
  tracing: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
  fprobe: Use ftrace_regs in fprobe exit handler
  fprobe: Use ftrace_regs in fprobe entry handler
  fgraph: Pass ftrace_regs to retfunc
  fgraph: Replace fgraph_ret_regs with ftrace_regs
  ...
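
The guard() and free() cleanup described in the pull message relies on the compiler's `cleanup` variable attribute, which runs a callback when a variable goes out of scope. The following is a minimal userspace sketch of that idea only, not the kernel API: every `demo_` and `DEMO_` name is hypothetical, and a plain counter stands in for a real lock. It shows how an early error return can release the lock and free memory without a "goto out" label.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a lock: counts how many holders it has. */
struct demo_lock {
	int held;
};

static struct demo_lock *demo_lock_acquire(struct demo_lock *l)
{
	l->held++;
	return l;
}

/* Cleanup callbacks receive a pointer to the annotated variable. */
static void demo_unlock_cleanup(struct demo_lock **lp)
{
	if (*lp)
		(*lp)->held--;
}

static void demo_free_cleanup(char **p)
{
	free(*p);
}

/* Take the lock now, release it when the enclosing scope ends. */
#define DEMO_GUARD(lockp)						\
	struct demo_lock *demo_guard_var				\
		__attribute__((cleanup(demo_unlock_cleanup), unused)) =	\
		demo_lock_acquire(lockp)

/* Free the annotated pointer when the enclosing scope ends. */
#define DEMO_FREE __attribute__((cleanup(demo_free_cleanup)))

/* Returns 0 on success and -1 on error; neither path needs "goto out". */
static int demo_copy_name(struct demo_lock *lock, const char *src,
			  char *dst, size_t dst_len)
{
	DEMO_GUARD(lock);
	char *tmp DEMO_FREE = malloc(strlen(src) + 1);

	if (!tmp)
		return -1;		/* lock released automatically */
	strcpy(tmp, src);
	if (strlen(tmp) >= dst_len)
		return -1;		/* tmp freed, then lock released */
	strcpy(dst, tmp);
	return 0;
}
```

Calling `demo_copy_name()` with a destination buffer that is too small takes the early return, and the lock counter is back at zero afterwards. The sketch needs GCC or Clang, since `cleanup` is a compiler extension.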
2 parents 0074ade + 31f505d commit 2e04247

57 files changed, +1504 -852 lines changed

Documentation/trace/fprobe.rst

Lines changed: 27 additions & 15 deletions
@@ -9,9 +9,10 @@ Fprobe - Function entry/exit probe
 Introduction
 ============
 
-Fprobe is a function entry/exit probe mechanism based on ftrace.
-Instead of using ftrace full feature, if you only want to attach callbacks
-on function entry and exit, similar to the kprobes and kretprobes, you can
+Fprobe is a function entry/exit probe based on the function-graph tracing
+feature in ftrace.
+Instead of tracing all functions, if you want to attach callbacks on specific
+function entry and exit, similar to the kprobes and kretprobes, you can
 use fprobe. Compared with kprobes and kretprobes, fprobe gives faster
 instrumentation for multiple functions with single handler. This document
 describes how to use fprobe.
@@ -91,12 +92,14 @@ The prototype of the entry/exit callback function are as follows:
 
 .. code-block:: c
 
- int entry_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct pt_regs *regs, void *entry_data);
+ int entry_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct ftrace_regs *fregs, void *entry_data);
 
- void exit_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct pt_regs *regs, void *entry_data);
+ void exit_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct ftrace_regs *fregs, void *entry_data);
 
-Note that the @entry_ip is saved at function entry and passed to exit handler.
-If the entry callback function returns !0, the corresponding exit callback will be cancelled.
+Note that the @entry_ip is saved at function entry and passed to exit
+handler.
+If the entry callback function returns !0, the corresponding exit callback
+will be cancelled.
 
 @fp
         This is the address of `fprobe` data structure related to this handler.
@@ -112,19 +115,28 @@ If the entry callback function returns !0, the corresponding exit callback will
         This is the return address that the traced function will return to,
         somewhere in the caller. This can be used at both entry and exit.
 
-@regs
-        This is the `pt_regs` data structure at the entry and exit. Note that
-        the instruction pointer of @regs may be different from the @entry_ip
-        in the entry_handler. If you need traced instruction pointer, you need
-        to use @entry_ip. On the other hand, in the exit_handler, the instruction
-        pointer of @regs is set to the current return address.
+@fregs
+        This is the `ftrace_regs` data structure at the entry and exit. This
+        includes the function parameters, or the return values. So user can
+        access thos values via appropriate `ftrace_regs_*` APIs.
 
 @entry_data
         This is a local storage to share the data between entry and exit handlers.
         This storage is NULL by default. If the user specify `exit_handler` field
         and `entry_data_size` field when registering the fprobe, the storage is
         allocated and passed to both `entry_handler` and `exit_handler`.
 
+Entry data size and exit handlers on the same function
+======================================================
+
+Since the entry data is passed via per-task stack and it has limited size,
+the entry data size per probe is limited to `15 * sizeof(long)`. You also need
+to take care that the different fprobes are probing on the same function, this
+limit becomes smaller. The entry data size is aligned to `sizeof(long)` and
+each fprobe which has exit handler uses a `sizeof(long)` space on the stack,
+you should keep the number of fprobes on the same function as small as
+possible.
+
 Share the callbacks with kprobes
 ================================
 
@@ -165,8 +177,8 @@ This counter counts up when;
  - fprobe fails to take ftrace_recursion lock. This usually means that a function
    which is traced by other ftrace users is called from the entry_handler.
 
- - fprobe fails to setup the function exit because of the shortage of rethook
-   (the shadow stack for hooking the function return.)
+ - fprobe fails to setup the function exit because of failing to allocate the
+   data buffer from the per-task shadow stack.
 
 The `fprobe::nmissed` field counts up in both cases. Therefore, the former
 skips both of entry and exit callback and the latter skips the exit
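
The new "Entry data size" section of fprobe.rst gives three numbers: a per-probe limit of `15 * sizeof(long)`, alignment to `sizeof(long)`, and one extra `sizeof(long)` slot for each fprobe that has an exit handler. The arithmetic can be sketched as below. This is only an illustration of the documented numbers, assuming a probe with an exit handler; the `demo_` helpers are hypothetical and do not exist in the kernel.

```c
#include <stddef.h>

/* Documented per-probe limit for entry data. */
#define DEMO_MAX_ENTRY_DATA (15 * sizeof(long))

/* Round a size up to the next multiple of sizeof(long). */
static size_t demo_align_long(size_t size)
{
	return (size + sizeof(long) - 1) / sizeof(long) * sizeof(long);
}

/*
 * Bytes of per-task shadow stack that one fprobe with an exit handler
 * would reserve under the documented rules: one sizeof(long) slot plus
 * the aligned entry data. Returns 0 if the request exceeds the limit.
 */
static size_t demo_probe_stack_bytes(size_t entry_data_size)
{
	size_t aligned = demo_align_long(entry_data_size);

	if (aligned > DEMO_MAX_ENTRY_DATA)
		return 0;
	return sizeof(long) + aligned;
}
```

For example, a probe asking for `sizeof(long) + 1` bytes of entry data would reserve three `long`-sized slots in this model: one for the probe itself and two for the aligned data. Several fprobes on the same function add up, which is why the documentation asks to keep their number small.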

arch/arm64/Kconfig

Lines changed: 2 additions & 0 deletions
@@ -217,9 +217,11 @@ config ARM64
 	select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS
 	select HAVE_GUP_FAST
+	select HAVE_FTRACE_GRAPH_FUNC
 	select HAVE_FTRACE_MCOUNT_RECORD
 	select HAVE_FUNCTION_TRACER
 	select HAVE_FUNCTION_ERROR_INJECTION
+	select HAVE_FUNCTION_GRAPH_FREGS
 	select HAVE_FUNCTION_GRAPH_TRACER
 	select HAVE_FUNCTION_GRAPH_RETVAL
 	select HAVE_GCC_PLUGINS

arch/arm64/include/asm/Kbuild

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@ syscall-y += unistd_32.h
 syscall-y += unistd_compat_32.h
 
 generic-y += early_ioremap.h
+generic-y += fprobe.h
 generic-y += mcs_spinlock.h
 generic-y += mmzone.h
 generic-y += qrwlock.h

arch/arm64/include/asm/ftrace.h

Lines changed: 34 additions & 17 deletions
@@ -52,6 +52,8 @@ extern unsigned long ftrace_graph_call;
 extern void return_to_handler(void);
 
 unsigned long ftrace_call_adjust(unsigned long addr);
+unsigned long arch_ftrace_get_symaddr(unsigned long fentry_ip);
+#define ftrace_get_symaddr(fentry_ip) arch_ftrace_get_symaddr(fentry_ip)
 
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
 #define HAVE_ARCH_FTRACE_REGS
@@ -129,6 +131,38 @@ ftrace_override_function_with_return(struct ftrace_regs *fregs)
 	arch_ftrace_regs(fregs)->pc = arch_ftrace_regs(fregs)->lr;
 }
 
+static __always_inline unsigned long
+ftrace_regs_get_frame_pointer(const struct ftrace_regs *fregs)
+{
+	return arch_ftrace_regs(fregs)->fp;
+}
+
+static __always_inline unsigned long
+ftrace_regs_get_return_address(const struct ftrace_regs *fregs)
+{
+	return arch_ftrace_regs(fregs)->lr;
+}
+
+static __always_inline struct pt_regs *
+ftrace_partial_regs(const struct ftrace_regs *fregs, struct pt_regs *regs)
+{
+	struct __arch_ftrace_regs *afregs = arch_ftrace_regs(fregs);
+
+	memcpy(regs->regs, afregs->regs, sizeof(afregs->regs));
+	regs->sp = afregs->sp;
+	regs->pc = afregs->pc;
+	regs->regs[29] = afregs->fp;
+	regs->regs[30] = afregs->lr;
+	return regs;
+}
+
+#define arch_ftrace_fill_perf_regs(fregs, _regs) do { \
+	(_regs)->pc = arch_ftrace_regs(fregs)->pc; \
+	(_regs)->regs[29] = arch_ftrace_regs(fregs)->fp; \
+	(_regs)->sp = arch_ftrace_regs(fregs)->sp; \
+	(_regs)->pstate = PSR_MODE_EL1h; \
+	} while (0)
+
 int ftrace_regs_query_register_offset(const char *name);
 
 int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
@@ -186,23 +220,6 @@ static inline bool arch_syscall_match_sym_name(const char *sym,
 
 #ifndef __ASSEMBLY__
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
-struct fgraph_ret_regs {
-	/* x0 - x7 */
-	unsigned long regs[8];
-
-	unsigned long fp;
-	unsigned long __unused;
-};
-
-static inline unsigned long fgraph_ret_regs_return_value(struct fgraph_ret_regs *ret_regs)
-{
-	return ret_regs->regs[0];
-}
-
-static inline unsigned long fgraph_ret_regs_frame_pointer(struct fgraph_ret_regs *ret_regs)
-{
-	return ret_regs->fp;
-}
 
 void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
 			   unsigned long frame_pointer);

arch/arm64/kernel/asm-offsets.c

Lines changed: 0 additions & 12 deletions
@@ -179,18 +179,6 @@ int main(void)
   DEFINE(FTRACE_OPS_FUNC, offsetof(struct ftrace_ops, func));
 #endif
   BLANK();
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
-  DEFINE(FGRET_REGS_X0, offsetof(struct fgraph_ret_regs, regs[0]));
-  DEFINE(FGRET_REGS_X1, offsetof(struct fgraph_ret_regs, regs[1]));
-  DEFINE(FGRET_REGS_X2, offsetof(struct fgraph_ret_regs, regs[2]));
-  DEFINE(FGRET_REGS_X3, offsetof(struct fgraph_ret_regs, regs[3]));
-  DEFINE(FGRET_REGS_X4, offsetof(struct fgraph_ret_regs, regs[4]));
-  DEFINE(FGRET_REGS_X5, offsetof(struct fgraph_ret_regs, regs[5]));
-  DEFINE(FGRET_REGS_X6, offsetof(struct fgraph_ret_regs, regs[6]));
-  DEFINE(FGRET_REGS_X7, offsetof(struct fgraph_ret_regs, regs[7]));
-  DEFINE(FGRET_REGS_FP, offsetof(struct fgraph_ret_regs, fp));
-  DEFINE(FGRET_REGS_SIZE, sizeof(struct fgraph_ret_regs));
-#endif
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
   DEFINE(FTRACE_OPS_DIRECT_CALL, offsetof(struct ftrace_ops, direct_call));
 #endif

arch/arm64/kernel/entry-ftrace.S

Lines changed: 18 additions & 14 deletions
@@ -329,24 +329,28 @@ SYM_FUNC_END(ftrace_stub_graph)
  * @fp is checked against the value passed by ftrace_graph_caller().
  */
 SYM_CODE_START(return_to_handler)
-	/* save return value regs */
-	sub sp, sp, #FGRET_REGS_SIZE
-	stp x0, x1, [sp, #FGRET_REGS_X0]
-	stp x2, x3, [sp, #FGRET_REGS_X2]
-	stp x4, x5, [sp, #FGRET_REGS_X4]
-	stp x6, x7, [sp, #FGRET_REGS_X6]
-	str x29, [sp, #FGRET_REGS_FP] // parent's fp
+	/* Make room for ftrace_regs */
+	sub sp, sp, #FREGS_SIZE
+
+	/* Save return value regs */
+	stp x0, x1, [sp, #FREGS_X0]
+	stp x2, x3, [sp, #FREGS_X2]
+	stp x4, x5, [sp, #FREGS_X4]
+	stp x6, x7, [sp, #FREGS_X6]
+
+	/* Save the callsite's FP */
+	str x29, [sp, #FREGS_FP]
 
 	mov x0, sp
-	bl ftrace_return_to_handler // addr = ftrace_return_to_hander(regs);
+	bl ftrace_return_to_handler // addr = ftrace_return_to_hander(fregs);
 	mov x30, x0 // restore the original return address
 
-	/* restore return value regs */
-	ldp x0, x1, [sp, #FGRET_REGS_X0]
-	ldp x2, x3, [sp, #FGRET_REGS_X2]
-	ldp x4, x5, [sp, #FGRET_REGS_X4]
-	ldp x6, x7, [sp, #FGRET_REGS_X6]
-	add sp, sp, #FGRET_REGS_SIZE
+	/* Restore return value regs */
+	ldp x0, x1, [sp, #FREGS_X0]
+	ldp x2, x3, [sp, #FREGS_X2]
+	ldp x4, x5, [sp, #FREGS_X4]
+	ldp x6, x7, [sp, #FREGS_X6]
+	add sp, sp, #FREGS_SIZE
 
 	ret
 SYM_CODE_END(return_to_handler)

arch/arm64/kernel/ftrace.c

Lines changed: 77 additions & 1 deletion
@@ -143,6 +143,69 @@ unsigned long ftrace_call_adjust(unsigned long addr)
 	return addr;
 }
 
+/* Convert fentry_ip to the symbol address without kallsyms */
+unsigned long arch_ftrace_get_symaddr(unsigned long fentry_ip)
+{
+	u32 insn;
+
+	/*
+	 * When using patchable-function-entry without pre-function NOPS, ftrace
+	 * entry is the address of the first NOP after the function entry point.
+	 *
+	 * The compiler has either generated:
+	 *
+	 * func+00: func: NOP // To be patched to MOV X9, LR
+	 * func+04: NOP // To be patched to BL <caller>
+	 *
+	 * Or:
+	 *
+	 * func-04: BTI C
+	 * func+00: func: NOP // To be patched to MOV X9, LR
+	 * func+04: NOP // To be patched to BL <caller>
+	 *
+	 * The fentry_ip is the address of `BL <caller>` which is at `func + 4`
+	 * bytes in either case.
+	 */
+	if (!IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS))
+		return fentry_ip - AARCH64_INSN_SIZE;
+
+	/*
+	 * When using patchable-function-entry with pre-function NOPs, BTI is
+	 * a bit different.
+	 *
+	 * func+00: func: NOP // To be patched to MOV X9, LR
+	 * func+04: NOP // To be patched to BL <caller>
+	 *
+	 * Or:
+	 *
+	 * func+00: func: BTI C
+	 * func+04: NOP // To be patched to MOV X9, LR
+	 * func+08: NOP // To be patched to BL <caller>
+	 *
+	 * The fentry_ip is the address of `BL <caller>` which is at either
+	 * `func + 4` or `func + 8` depends on whether there is a BTI.
+	 */
+
+	/* If there is no BTI, the func address should be one instruction before. */
+	if (!IS_ENABLED(CONFIG_ARM64_BTI_KERNEL))
+		return fentry_ip - AARCH64_INSN_SIZE;
+
+	/* We want to be extra safe in case entry ip is on the page edge,
+	 * but otherwise we need to avoid get_kernel_nofault()'s overhead.
+	 */
+	if ((fentry_ip & ~PAGE_MASK) < AARCH64_INSN_SIZE * 2) {
+		if (get_kernel_nofault(insn, (u32 *)(fentry_ip - AARCH64_INSN_SIZE * 2)))
+			return 0;
+	} else {
+		insn = *(u32 *)(fentry_ip - AARCH64_INSN_SIZE * 2);
+	}
+
+	if (aarch64_insn_is_bti(le32_to_cpu((__le32)insn)))
+		return fentry_ip - AARCH64_INSN_SIZE * 2;
+
+	return fentry_ip - AARCH64_INSN_SIZE;
+}
+
 /*
  * Replace a single instruction, which may be a branch or NOP.
  * If @validate == true, a replaced instruction is checked against 'old'.
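
The decision made by `arch_ftrace_get_symaddr()` in the hunk above can be tried out in userspace with a small model. The sketch below is an illustration under stated assumptions, not kernel code: it works on an array of instruction words and returns an index instead of an address, the two booleans stand in for the `CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS` and `CONFIG_ARM64_BTI_KERNEL` checks, and all `demo_`/`DEMO_` names are hypothetical. The NOP and BTI C encodings are the A64 hint encodings as I understand them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NOP	0xd503201fu	/* A64 NOP (hint #0) */
#define DEMO_BTI_C	0xd503245fu	/* A64 BTI C (hint #34) */

/* Matches BTI, BTI C, BTI J and BTI JC by masking the two target bits. */
static bool demo_insn_is_bti(uint32_t insn)
{
	return (insn & 0xffffff3fu) == 0xd503241fu;
}

/*
 * text:       instruction words of the (simulated) kernel text
 * fentry_idx: index of the slot that is patched to `BL <caller>`
 * Returns the index of the function symbol.
 */
static size_t demo_get_sym_index(const uint32_t *text, size_t fentry_idx,
				 bool call_ops, bool bti_kernel)
{
	/* Without call ops or without BTI, the symbol is one insn before. */
	if (!call_ops || !bti_kernel)
		return fentry_idx - 1;

	/* With both, a BTI two instructions back is the function entry. */
	if (fentry_idx >= 2 && demo_insn_is_bti(text[fentry_idx - 2]))
		return fentry_idx - 2;

	return fentry_idx - 1;
}
```

With a `{ BTI C, NOP, NOP }` layout and the patch slot at index 2, the model returns index 0 when both options are enabled. The page-edge check and `get_kernel_nofault()` from the real function are left out because a plain array cannot fault.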
@@ -481,7 +544,20 @@ void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
 void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
 		       struct ftrace_ops *op, struct ftrace_regs *fregs)
 {
-	prepare_ftrace_return(ip, &arch_ftrace_regs(fregs)->lr, arch_ftrace_regs(fregs)->fp);
+	unsigned long return_hooker = (unsigned long)&return_to_handler;
+	unsigned long frame_pointer = arch_ftrace_regs(fregs)->fp;
+	unsigned long *parent = &arch_ftrace_regs(fregs)->lr;
+	unsigned long old;
+
+	if (unlikely(atomic_read(&current->tracing_graph_pause)))
+		return;
+
+	old = *parent;
+
+	if (!function_graph_enter_regs(old, ip, frame_pointer,
+				       (void *)frame_pointer, fregs)) {
+		*parent = return_hooker;
+	}
 }
 #else
 /*
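
The `ftrace_graph_func()` change above swaps the saved return address for `return_to_handler` once the function graph core accepts the entry. The per-task shadow stack behind that swap can be modelled in a few lines. This is a toy model and not the kernel implementation: the stack depth, the trampoline constant, and every `demo_` name are made up for illustration, and real entries carry more than a return address.

```c
#include <stddef.h>

#define DEMO_STACK_DEPTH 4
#define DEMO_TRAMPOLINE  0xdeadbeefUL	/* stands in for return_to_handler */

/* Hypothetical per-task state: one shadow stack per task. */
struct demo_task {
	unsigned long ret_stack[DEMO_STACK_DEPTH];
	size_t depth;
	unsigned long nmissed;
};

/*
 * Entry hook: save the original return address and redirect the return
 * to the trampoline. Returns 0 on success, -1 if the stack is full, in
 * which case *parent is left untouched and the exit is not traced.
 */
static int demo_graph_enter(struct demo_task *t, unsigned long *parent)
{
	if (t->depth == DEMO_STACK_DEPTH) {
		t->nmissed++;
		return -1;
	}
	t->ret_stack[t->depth++] = *parent;
	*parent = DEMO_TRAMPOLINE;
	return 0;
}

/*
 * Exit hook: what the trampoline would call. Pops and returns the
 * original return address, or 0 if nothing was saved.
 */
static unsigned long demo_return_to_handler(struct demo_task *t)
{
	if (t->depth == 0)
		return 0;
	return t->ret_stack[--t->depth];
}
```

Two nested entries push two addresses, and the exits pop them in reverse order, so each function still returns to its real caller. Because the storage belongs to the task rather than to a callsite, adding more probed functions does not need more reserved slots, which is the scaling argument made in the pull message.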

arch/loongarch/Kconfig

Lines changed: 3 additions & 1 deletion
@@ -129,16 +129,18 @@ config LOONGARCH
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_DYNAMIC_FTRACE_WITH_ARGS
+	select HAVE_FTRACE_REGS_HAVING_PT_REGS
 	select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS
 	select HAVE_EBPF_JIT
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS if !ARCH_STRICT_ALIGN
 	select HAVE_EXIT_THREAD
 	select HAVE_GUP_FAST
+	select HAVE_FTRACE_GRAPH_FUNC
 	select HAVE_FTRACE_MCOUNT_RECORD
 	select HAVE_FUNCTION_ARG_ACCESS_API
 	select HAVE_FUNCTION_ERROR_INJECTION
-	select HAVE_FUNCTION_GRAPH_RETVAL if HAVE_FUNCTION_GRAPH_TRACER
+	select HAVE_FUNCTION_GRAPH_FREGS
 	select HAVE_FUNCTION_GRAPH_TRACER
 	select HAVE_FUNCTION_TRACER
 	select HAVE_GCC_PLUGINS

arch/loongarch/include/asm/fprobe.h

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_LOONGARCH_FPROBE_H
+#define _ASM_LOONGARCH_FPROBE_H
+
+/*
+ * Explicitly undef ARCH_DEFINE_ENCODE_FPROBE_HEADER, because loongarch does not
+ * have enough number of fixed MSBs of the address of kernel objects for
+ * encoding the size of data in fprobe_header. Use 2-entries encoding instead.
+ */
+#undef ARCH_DEFINE_ENCODE_FPROBE_HEADER
+
+#endif /* _ASM_LOONGARCH_FPROBE_H */
