| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Replace the own implementation for wildcard (glob) matching with
a function call to fnmatch().
Also, change the return type to 'bool'.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
mod->name is set to the ELF filename with the suffix ".o" stripped.
The current code calls strdup() and free() to manipulate the string,
but a simpler approach is to pass new_module() with the name length
subtracted by 2.
Also, check if the passed filename ends with ".o" before stripping it.
The current code blindly chops the suffix:
tmp[strlen(tmp) - 2] = '\0'
It will cause buffer under-run if strlen(tmp) < 2;
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
scripts/Makefile.build and scripts/link-vmlinux.sh have similar setups
for the objtool arguments.
It was difficult to factor out them because all the vmlinux build rules
were written in a shell script. It is somewhat tedious to touch the two
files every time a new objtool option is supported.
To reduce the code duplication, move the objtool for vmlinux.o into
scripts/Makefile.vmlinux_o. Then, move the common macros to Makefile.lib
so they are shared between Makefile.build and Makefile.vmlinux_o.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
|
|
|
|
|
|
|
| |
This is a preparation for moving the objtool rule in the next commit.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change the "make clean" rule to remove all the .tmp_* files.
.tmp_objdiff is the only exception, which should be removed by
"make mrproper".
Rename the record directory of objdiff, .tmp_objdiff to .objdiff to
avoid the removal by "make clean".
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
|
|
|
|
|
|
|
|
|
|
| |
These are cleaned by the top Makefile.
vmlinux.o and .vmlinux.d matches the '*.[aios]' and '.*.d' patterns
respectively.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When CONFIG_LTO_CLANG or CONFIG_X86_KERNEL_IBT is enabled, objtool for
multi-object modules is postponed until the objects are linked together.
Make sure to re-run objtool and re-link multi-object modules when
objtool is updated.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Separate out the command execution part of if_changed, as we did
for if_changed_dep.
This allows us to reuse it in if_changed_rule.
define rule_foo
$(call cmd_and_savecmd,foo)
$(call cmd,bar)
endef
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Like built-in.a, the command length of the *.mod rule scales with
the depth of the directory times the number of objects in the Makefile.
Add $(obj)/ by the shell command (awk) instead of by Make's builtin
function.
In-tree modules still have some room to the limit (ARG_MAX=2097152),
but this is more future-proof for big modules in a deep directory.
For example, you can build i915 as a module (CONFIG_DRM_I915=m) and
compare drivers/gpu/drm/i915/.i915.mod.cmd with/without this commit.
The issue is more critical for external modules because the M= path
can be very long as Jeff Johnson reported before [1].
[1] https://lore.kernel.org/linux-kbuild/4c02050c4e95e4cb8cc04282695f8404@codeaurora.org/
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Kbuild runs at the top of objtree instead of changing the working
directory to subdirectories. I think this design is nice overall but
some commands have a scalability issue.
The build command of built-in.a is one of them whose length scales with:
O(D * N)
Here, D is the length of the directory path (i.e. $(obj)/ prefix),
N is the number of objects in the Makefile, O() is the big O notation.
The deeper directory the Makefile directory is located, the more easily
it will hit the too long argument error.
We can make it better. Trim the $(obj)/ by Make's builtin function, and
restore it by a shell command (sed).
With this, the command length scales with:
O(D + N)
In-tree modules still have some room to the limit (ARG_MAX=2097152),
but this is more future-proof for big modules in a deep directory.
For example, you can build i915 as builtin (CONFIG_DRM_I915=y) and
compare drivers/gpu/drm/i915/.built-in.a.cmd with/without this commit.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The 'static' specifier and EXPORT_SYMBOL() are an odd combination.
Commit 15bfc2348d54 ("modpost: check for static EXPORT_SYMBOL*
functions") tried to detect it, but this check has false negatives.
Here is the sample code.
Makefile:
obj-y += foo1.o foo2.o
foo1.c:
#include <linux/export.h>
static void foo(void) {}
EXPORT_SYMBOL(foo);
foo2.c:
void foo(void) {}
foo1.c exports the static symbol 'foo', but modpost cannot catch it
because it is fooled by foo2.c, which has a global symbol with the
same name.
s->is_static is cleared if a global symbol with the same name is found
somewhere, but EXPORT_SYMBOL() and the global symbol do not necessarily
belong to the same compilation unit.
This check should be done per compilation unit, but I do not know how
to do it in modpost. modpost runs against vmlinux.o or modules, which
merges multiple objects, then forgets their origin.
modpost cannot parse individual objects because they may not be ELF but
LLVM IR when CONFIG_LTO_CLANG=y.
Add a simple bash script to parse the output from ${NM}. This works for
CONFIG_LTO_CLANG=y because llvm-nm can dump symbols of LLVM IR files.
Revert 15bfc2348d54.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Parisc overrides 'nm' with a shell script. I was hit by a false-positive
error of $(NM) because this script returns the exit status of grep
instead of ${CROSS_COMPILE}nm. (grep returns 1 if no lines were selected)
I tried to fix it, but in the code review, Helge suggested to remove it
entirely. [1]
This script was added in 2003. [2]
Presumably, it was a workaround for old toolchains (but even the parisc
maintainer does not know the detail any more).
Hopefully, recent tools should work fine.
[1]: https://lore.kernel.org/all/1c12cd26-d8aa-4498-f4c0-29478b9578fe@gmx.de/
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=36eaa6e4c0e0b6950136b956b72fd08155b92ca3
Suggested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Helge Deller <deller@gmx.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When CONFIG_LTO_CLANG=y, additional intermediate *.prelink.o is created
for each module. Also, objtool is postponed until LLVM IR is converted
to ELF.
CONFIG_X86_KERNEL_IBT works in a similar way to postpone objtool until
objects are merged together.
This commit stops generating *.prelink.o, so the build flow will look
similar with/without LTO.
The following figures show how the LTO build currently works, and
how this commit is changing it.
Current build flow
==================
[1] single-object module
$(LD)
$(CC) +objtool $(LD)
foo.c --------------------> foo.o -----> foo.prelink.o -----> foo.ko
(LLVM IR) (ELF) | (ELF)
|
foo.mod.o --/
(LLVM IR)
[2] multi-object module
$(LD)
$(CC) $(AR) +objtool $(LD)
foo1.c -----> foo1.o -----> foo.o -----> foo.prelink.o -----> foo.ko
| (archive) (ELF) | (ELF)
foo2.c -----> foo2.o --/ |
(LLVM IR) foo.mod.o --/
(LLVM IR)
One confusion is that foo.o in multi-object module is an archive
despite of its suffix.
New build flow
==============
[1] single-object module
Since there is only one object, there is no need to keep the LLVM IR.
Use $(CC)+$(LD) to generate an ELF object in one build rule. When LTO
is disabled, $(LD) is unneeded because $(CC) produces an ELF object.
$(CC)+$(LD)+objtool $(LD)
foo.c ----------------------------> foo.o ---------> foo.ko
(ELF) | (ELF)
|
foo.mod.o --/
(LLVM IR)
[2] multi-object module
Previously, $(AR) was used to combine LLVM IR files into an archive,
but there was no technical reason to do so. Use $(LD) to merge them
into a single ELF object.
$(LD)
$(CC) +objtool $(LD)
foo1.c ---------> foo1.o ---------> foo.o ---------> foo.ko
| (ELF) | (ELF)
foo2.c ---------> foo2.o ----/ |
(LLVM IR) foo.mod.o --/
(LLVM IR)
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
*.prelink.o is created when CONFIG_LTO_CLANG or CONFIG_X86_KERNEL_IBT
is enabled.
Replace $(linked-object) with $(CONFIG_LTO_CLANG)$(CONFIG_X86_KERNEL_IBT)
so you will get a quick idea of when the --link option is passed.
No functional change is intended.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Guenter Roeck reported the build breakage for parisc and csky.
I confirmed nios2 and openrisc are broken as well.
The reason is that they borrow libgcc.a from the toolchains.
For example, see this line in arch/parisc/Makefile:
LIBGCC := $(shell $(CC) -print-libgcc-file-name)
Some objects in libgcc.a are linked to vmlinux.o, but they do not have
.*.cmd files.
Obviously, there is no EXPORT_SYMBOL in external objects. Ignore them.
(Most of the architectures import library code into the kernel tree.
Perhaps those 4 architectures can do similar, but I do not know how
challenging it is.)
Fixes: f292d875d0dc ("modpost: extract symbol versions from *.cmd files")
Link: https://lore.kernel.org/linux-kbuild/20220528224745.GA2501857@roeck-us.net/T/#mac65c20c71c3e272db0350ecfba53fcd8905b0a0
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
|
|
|
|
|
|
|
|
| |
Similar cleanup to commit 5c8166419acf ("kbuild: replace $(if A,A,B)
with $(or A,B)").
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
if ((addr - sym->st_value) < distance) {
distance = addr - sym->st_value;
near = sym;
} else if ((addr - sym->st_value) == distance) {
near = sym;
}
is equivalent to:
if (addr - sym->st_value <= distance) {
distance = addr - sym->st_value;
near = sym;
}
(The else-if block can overwrite 'distance' with the same value).
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
|
|
|
|
|
|
|
|
|
|
| |
Move ARRAY_SIZE() from file2alias.c to modpost.h to reuse it in
section_mismatch().
Also, move the variable 'check' inside the for-loop.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
|
|
|
|
|
|
|
| |
check_sec_ref() does not use the first parameter 'mod'.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The return value of is_arm_mapping_symbol() is unpredictable when "$"
is passed in.
strchr(3) says:
The strchr() and strrchr() functions return a pointer to the matched
character or NULL if the character is not found. The terminating null
byte is considered part of the string, so that if c is specified as
'\0', these functions return a pointer to the terminator.
When str[1] is '\0', strchr("axtd", str[1]) is not NULL, and str[2] is
referenced (i.e. buffer overrun).
Test code
---------
char str1[] = "abc";
char str2[] = "ab";
strcpy(str1, "$");
strcpy(str2, "$");
printf("test1: %d\n", is_arm_mapping_symbol(str1));
printf("test2: %d\n", is_arm_mapping_symbol(str2));
Result
------
test1: 0
test2: 1
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the `-z unique-symbol` linker flag or any similar mechanism,
it is possible to trigger the following:
ERROR: modpost: "param_set_uint.0" [vmlinux] is a static EXPORT_SYMBOL
The reason is that for now the condition from remove_dot():
if (m && (s[n + m] == '.' || s[n + m] == 0))
which was designed to test if it's a dot or a '\0' after the suffix
is never satisfied.
This is due to that `s[n + m]` always points to the last digit of a
numeric suffix, not on the symbol next to it (from a custom debug
print added to modpost):
param_set_uint.0, s[n + m] is '0', s[n + m + 1] is '\0'
So it's off-by-one and was like that since 2014.
Fix this for the sake of any potential upcoming features, but don't
bother stable-backporting, as it's well hidden -- apart from that
LD flag, it can be triggered only with GCC LTO which never landed
upstream.
Fixes: fcd38ed0ff26 ("scripts: modpost: fix compilation warning")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
|
|
|
|
|
|
| |
The kallsyms program supports --absolute-percpu option but does not display
it in the usage message, fix it.
Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When building an external module, if users don't need to separate the
compilation output and source code, they run the following command:
"make -C $(LINUX_SRC_DIR) M=$(PWD)". At this point, "$(KBUILD_EXTMOD)"
and "$(src)" are the same.
If they need to separate them, they run "make -C $(KERNEL_SRC_DIR)
O=$(KERNEL_OUT_DIR) M=$(OUT_DIR) src=$(PWD)". Before running the
command, they need to copy "Kbuild" or "Makefile" to "$(OUT_DIR)" to
prevent compilation failure.
So the kernel should change the included path to avoid the copy operation.
Signed-off-by: Jing Leng <jleng@ambarella.com>
[masahiro: I do not think "M=$(OUT_DIR) src=$(PWD)" is the official way,
but this patch is a nice clean up anyway.]
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mike Snitzer:
- Enable DM core bioset's per-cpu bio cache if QUEUE_FLAG_POLL set.
This change improves DM's hipri bio polling (REQ_POLLED) performance
by 7 - 20% depending on the system.
- Update DM core to use jump_labels to further reduce cost of unlikely
branches for zoned block devices, dm-stats and swap_bios throttling.
- Various DM core changes to reduce bio-based DM overhead and simplify
IO accounting.
- Fundamental DM core improvements to dm_io reference counting and the
elimination of using bio_split()+bio_chain() -- instead DM's
bio-based IO accounting is updated to account that a split occurred.
- Improve DM core's abnormal bio processing to do less work.
- Improve DM core's hipri polling support to use a single list rather
than an hlist.
- Update DM core to pass NULL bdev to bio_alloc_clone() so that
initialization that isn't useful for DM can be elided.
- Add cond_resched to DM stats' various loops that loop over all
entries.
- Fix incorrect error code return from DM integrity's constructor.
- Make DM crypt's printing of the key constant-time.
- Update bio-based DM multipath to provide high-resolution timer to the
Historical Service Time (HST) path selector.
* tag 'for-5.19/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (26 commits)
dm: pass NULL bdev to bio_alloc_clone
dm cache metadata: remove unnecessary variable in __dump_mapping
dm mpath: provide high-resolution timer to HST for bio-based
dm crypt: make printing of the key constant-time
dm integrity: fix error code in dm_integrity_ctr()
dm stats: add cond_resched when looping over entries
dm: improve abnormal bio processing
dm: simplify bio-based IO accounting further
dm: put all polled dm_io instances into a single list
dm: improve dm_io reference counting
dm: don't grab target io reference in dm_zone_map_bio
dm: improve bio splitting and associated IO accounting
dm: switch to bdev based IO accounting interfaces
dm: pass dm_io instance to dm_io_acct directly
dm: don't pass bio to __dm_start_io_acct and dm_end_io_acct
dm: use bio_sectors in dm_aceept_partial_bio
dm: simplify basic targets
dm: conditionally enable branching for less used features
dm: introduce dm_{get,put}_live_table_bio called from dm_submit_bio
dm: move hot dm_io members to same cacheline as dm_target_io
...
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Most DM targets will remap the clone bio passed to their ->map
function using bio_set_bdev(). So this change to pass NULL bdev to
bio_alloc_clone avoids clone-time work that sets up resources for a
bdev association that will not be used in practice (e.g. clone issued
to underlying device will not use DM device's blk-cgroups resources).
But clone->bi_bdev is still initialized following bio_alloc_clone to
preserve DM target expectations that clone->bi_bdev will be set.
Follow-up work is needed to audit DM targets to remove accesses to a
clone->bi_bdev that the target didn't initialize with bio_set_dev().
Depends-on: 7ecc56c62b27 ("block: allow passing a NULL bdev to bio_alloc_clone/bio_init_clone")
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Fix the following coccicheck warning:
drivers/md/dm-cache-metadata.c:1512:5-6: Unneeded variable: "r".
Return "0" on line 1520.
Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The precision loss of reading IO start_time with jiffies_to_nsecs
instead of using a high resolution timer degrades HST path prediction
for BIO-based mpath on high load workloads.
Below, I show the utilization percentage of a 10 disk multipath with
asymmetrical disk access cost, while being exercised by a randwrite FIO
benchmark with high submission queue depth (depth=64). It is possible
to see that the HST path selection degrades heavily for high-iops in
BIO-mpath, underutilizing the slower paths way beyond expected. This
seems to be caused by the start_time truncation, which makes some IO to
seem much slower than it actually is. In this scenario ST outperforms
HST for bio-mpath, but not for mq-mpath, which already uses ktime_get_ns().
The third column shows utilization with this patch applied. It is easy
to see that now HST prediction is much closer to the ideal distribution
(calculated considering the real cost of each path).
| | ST | HST (orig) | HST(ktime) | Best |
| sdd | 0.17 | 0.20 | 0.17 | 0.18 |
| sde | 0.17 | 0.20 | 0.17 | 0.18 |
| sdf | 0.17 | 0.20 | 0.17 | 0.18 |
| sdg | 0.06 | 0.00 | 0.06 | 0.04 |
| sdh | 0.03 | 0.00 | 0.03 | 0.02 |
| sdi | 0.03 | 0.00 | 0.03 | 0.02 |
| sdj | 0.02 | 0.00 | 0.01 | 0.01 |
| sdk | 0.02 | 0.00 | 0.01 | 0.01 |
| sdl | 0.17 | 0.20 | 0.17 | 0.18 |
| sdm | 0.17 | 0.20 | 0.17 | 0.18 |
This issue was originally discussed [1] when we first merged HST, and
this patch was left as a low hanging fruit to be solved later.
Regarding the implementation, as suggested by Mike in that mail thread,
in order to avoid the overhead of ktime_get_ns for other selectors, this
patch adds a flag for the selector code to request the high-resolution
timer.
I tested this using the same benchmark used in the original HST submission.
Full test and benchmark scripts are available here:
https://people.collabora.com/~krisman/HST-BIO-MPATH/
[1] https://lore.kernel.org/lkml/85tv0am9de.fsf@collabora.com/T/
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
[snitzer: cleaned up various implementation details]
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The device mapper dm-crypt target is using scnprintf("%02x", cc->key[i]) to
report the current key to userspace. However, this is not a constant-time
operation and it may leak information about the key via timing, via cache
access patterns or via the branch predictor.
Change dm-crypt's key printing to use "%c" instead of "%02x". Also
introduce hex2asc() that carefully avoids any branching or memory
accesses when converting a number in the range 0 ... 15 to an ascii
character.
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Tested-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The "r" variable shadows an earlier "r" that has function scope. It
means that we accidentally return success instead of an error code.
Smatch has a warning for this:
drivers/md/dm-integrity.c:4503 dm_integrity_ctr()
warn: missing error code 'r'
Fixes: 7eada909bfd7 ("dm: add integrity target")
Cc: stable@vger.kernel.org
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
dm-stats can be used with a very large number of entries (it is only
limited by 1/4 of total system memory), so add rescheduling points to
the loops that iterate over the entries.
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Read/write/flush are the most common operations, optimize switch in
is_abnormal_io() for those cases. Follows same pattern established in
block perf-wip commit ("block: optimise blk_may_split for normal rw")
Also, push is_abnormal_io() check and blk_queue_split() down from
dm_submit_bio() to dm_split_and_process_bio() and set new
'is_abnormal_io' flag in clone_info. Optimize __split_and_process_bio
and __process_abnormal_io by leveraging ci.is_abnormal_io flag.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Now that io splitting is recorded prior to, or during, ->map IO
accounting can happen immediately rather than defer until after
bio splitting in dm_split_and_process_bio().
Remove the DM_IO_START_ACCT flag and also remove dm_io's map_task
member because there is no longer any need to wait for splitting to
occur before accounting.
Also move dm_io struct's 'flags' member to consolidate struct holes.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Now that bio_split() isn't used by DM's bio splitting, it is a bit
overkill to link dm_io into an hlist given there is only single dm_io
in the list.
Convert to using a single list for holding all dm_io instances
associated with this bio.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Currently each dm_io's reference counter is grabbed before calling
__map_bio(), this way isn't efficient since we can move this grabbing
to initialization time inside alloc_io().
Meantime it becomes typical async io reference counter model: one is
for submission side, the other is for completion side, and the io won't
be completed until both sides are done.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
dm_zone_map_bio() is only called from __map_bio in which the io's
reference is grabbed already, and the reference won't be released
until the bio is submitted, so not necessary to do it dm_zone_map_bio
any more.
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The current DM code (ab)uses late assignment of dm_io->orig_bio (after
__map_bio() returns and any bio splitting is complete) to indicate the
FS bio has been processed and can be accounted. This results in
awkward waiting until ->orig_bio is set in dm_submit_bio_remap().
Also the bio splitting was implemented using bio_split()+bio_chain()
-- a well-worn pattern but it requires bio cloning purely for the
benefit of more natural IO accounting. The bio_split() result was
stored in ->orig_bio to represent the mapped part of the original FS
bio.
DM has switched to the bdev based IO accounting interface. DM's IO
accounting can be implemented in terms of the original FS bio (now
stored early in ->orig_bio) via access to its sectors/bio_op. And
if/when splitting is needed, set a new DM_IO_WAS_SPLIT flag and use
new dm_io fields of .sector_offset & .sectors to allow IO accounting
for split bios _without_ needing to clone a new bio to store in
->orig_bio.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Co-developed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
DM splits flush with data into empty flush followed by bio with data
payload, switch dm_io_acct() to use bdev_{start,end}_io_acct() to do
this accoiunting more naturally (rather than temporarily changing the
bio's bi_size).
This will allow DM to more easily account bios that are split (in
following commit).
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
All the other 4 parameters are retrieved from the 'dm_io' instance, so
it's not necessary to pass all four to dm_io_acct().
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| | |
dm->orig_bio is always passed to __dm_start_io_acct and dm_end_io_acct,
so it isn't necessary to take one bio parameter for the two helpers.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| | |
Rename 'bi_size' to 'bio_sectors' given bi_size is being stored in
sectors. Also, use bio_sectors() rather than open-coding it.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| | |
Remove needless factoring and remap bi_sector regardless of
bio_sectors() being non-zero.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| | |
Use jump_labels to further reduce cost of unlikely branches for zoned
block devices, dm-stats and swap_bios throttling.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
If a bio is marked REQ_NOWAIT optimize dm_submit_bio()'s dm_table RCU
usage to dm_{get,put}_live_table_fast.
DM core offers protection against blocking (via suspend) if REQ_NOWAIT.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| | |
Just saves some cacheline bouncing for members accessed during cloned
bio submission and completion.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| | |
Avoid redundant dereferences in both functions.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| | |
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| | |
Pull common DM_IO_ACCOUNTED check out to beginning of dm_start_io_acct.
Also, use dm_tio_is_normal (and move it to dm-core.h).
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| | |
Use local variable instead of redudant access using ci.io
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| | |
Also eliminate need to use errno_to_blk_status().
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
A bioset's per-cpu alloc cache may have broader utility in the future
but for now constrain it to being tightly coupled to QUEUE_FLAG_POLL.
Also change dm_io_complete() to use bio_clear_polled() so that it
properly clears all associated bio state on requeue.
This commit improves DM's hipri bio polling (REQ_POLLED) perf by
7 - 20% depending on the system.
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
|