
Commit 18efe86

Merge patch series "RISC-V: Detect and report speed of unaligned vector accesses"
Charlie Jenkins <charlie@rivosinc.com> says: Adds support for detecting and reporting the speed of unaligned vector accesses on RISC-V CPUs. Adds vec_misaligned_speed key to the hwprobe adds Zicclsm to cpufeature and fixes the check for scalar unaligned emulated all CPUs. The vec_misaligned_speed key keeps the same format as the scalar unaligned access speed key. This set does not emulate unaligned vector accesses on CPUs that do not support them. Only reports if userspace can run them and speed of unaligned vector accesses if supported. * b4-shazam-merge: RISC-V: hwprobe: Document unaligned vector perf key RISC-V: Report vector unaligned access speed hwprobe RISC-V: Detect unaligned vector accesses supported RISC-V: Replace RISCV_MISALIGNED with RISCV_SCALAR_MISALIGNED RISC-V: Scalar unaligned access emulated on hotplug CPUs RISC-V: Check scalar unaligned access on all CPUs Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-0-5b33500160f8@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2 parents 7727020 + 40e09eb commit 18efe86

File tree: 15 files changed, +474 −38 lines


Documentation/arch/riscv/hwprobe.rst

Lines changed: 16 additions & 0 deletions
@@ -274,3 +274,19 @@ The following keys are defined:
   represent the highest userspace virtual address usable.
 
 * :c:macro:`RISCV_HWPROBE_KEY_TIME_CSR_FREQ`: Frequency (in Hz) of `time CSR`.
+
+* :c:macro:`RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF`: An enum value describing the
+  performance of misaligned vector accesses on the selected set of processors.
+
+  * :c:macro:`RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN`: The performance of misaligned
+    vector accesses is unknown.
+
+  * :c:macro:`RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW`: 32-bit misaligned accesses using vector
+    registers are slower than the equivalent quantity of byte accesses via vector registers.
+    Misaligned accesses may be supported directly in hardware, or trapped and emulated by software.
+
+  * :c:macro:`RISCV_HWPROBE_MISALIGNED_VECTOR_FAST`: 32-bit misaligned accesses using vector
+    registers are faster than the equivalent quantity of byte accesses via vector registers.
+
+  * :c:macro:`RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED`: Misaligned vector accesses are
+    not supported at all and will generate a misaligned address fault.
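Userspace reads this key through the riscv_hwprobe(2) syscall by passing an array of key/value pairs that the kernel fills in. A minimal sketch of the pair layout only (struct riscv_hwprobe is a signed 64-bit key followed by an unsigned 64-bit value; the helper names `pack_query`/`unpack_result` are illustrative, not kernel code):

```python
import struct

# struct riscv_hwprobe from the uapi header: __s64 key, __u64 value
# (16 bytes per pair; RISC-V Linux is little-endian).
RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF = 10
PAIR_FMT = "<qQ"

def pack_query(key: int) -> bytes:
    """Build one pair as userspace would hand it to the hwprobe syscall."""
    return struct.pack(PAIR_FMT, key, 0)

def unpack_result(buf: bytes) -> tuple:
    """Read back the (key, value) pair the kernel filled in."""
    return struct.unpack(PAIR_FMT, buf)

buf = pack_query(RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF)
key, value = unpack_result(buf)
```

On a real system the buffer would be passed to the syscall before unpacking; here the value is still the zero placeholder.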

arch/riscv/Kconfig

Lines changed: 56 additions & 2 deletions
@@ -786,10 +786,24 @@ config THREAD_SIZE_ORDER
 
 config RISCV_MISALIGNED
 	bool
+	help
+	  Embed support for detecting and emulating misaligned
+	  scalar or vector loads and stores.
+
+config RISCV_SCALAR_MISALIGNED
+	bool
+	select RISCV_MISALIGNED
 	select SYSCTL_ARCH_UNALIGN_ALLOW
 	help
 	  Embed support for emulating misaligned loads and stores.
 
+config RISCV_VECTOR_MISALIGNED
+	bool
+	select RISCV_MISALIGNED
+	depends on RISCV_ISA_V
+	help
+	  Enable detecting support for vector misaligned loads and stores.
+
 choice
 	prompt "Unaligned Accesses Support"
 	default RISCV_PROBE_UNALIGNED_ACCESS
@@ -801,7 +815,7 @@ choice
 
 config RISCV_PROBE_UNALIGNED_ACCESS
 	bool "Probe for hardware unaligned access support"
-	select RISCV_MISALIGNED
+	select RISCV_SCALAR_MISALIGNED
 	help
 	  During boot, the kernel will run a series of tests to determine the
 	  speed of unaligned accesses. This probing will dynamically determine
@@ -812,7 +826,7 @@ config RISCV_PROBE_UNALIGNED_ACCESS
 
 config RISCV_EMULATED_UNALIGNED_ACCESS
 	bool "Emulate unaligned access where system support is missing"
-	select RISCV_MISALIGNED
+	select RISCV_SCALAR_MISALIGNED
 	help
 	  If unaligned memory accesses trap into the kernel as they are not
 	  supported by the system, the kernel will emulate the unaligned
@@ -841,6 +855,46 @@ config RISCV_EFFICIENT_UNALIGNED_ACCESS
 
 endchoice
 
+choice
+	prompt "Vector unaligned Accesses Support"
+	depends on RISCV_ISA_V
+	default RISCV_PROBE_VECTOR_UNALIGNED_ACCESS
+	help
+	  This determines the level of support for vector unaligned accesses. This
+	  information is used by the kernel to perform optimizations. It is also
+	  exposed to user space via the hwprobe syscall. The hardware will be
+	  probed at boot by default.
+
+config RISCV_PROBE_VECTOR_UNALIGNED_ACCESS
+	bool "Probe speed of vector unaligned accesses"
+	select RISCV_VECTOR_MISALIGNED
+	depends on RISCV_ISA_V
+	help
+	  During boot, the kernel will run a series of tests to determine the
+	  speed of vector unaligned accesses if they are supported. This probing
+	  will dynamically determine the speed of vector unaligned accesses on
+	  the underlying system if they are supported.
+
+config RISCV_SLOW_VECTOR_UNALIGNED_ACCESS
+	bool "Assume the system supports slow vector unaligned memory accesses"
+	depends on NONPORTABLE
+	help
+	  Assume that the system supports slow vector unaligned memory accesses. The
+	  kernel and userspace programs may not be able to run at all on systems
+	  that do not support unaligned memory accesses.
+
+config RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS
+	bool "Assume the system supports fast vector unaligned memory accesses"
+	depends on NONPORTABLE
+	help
+	  Assume that the system supports fast vector unaligned memory accesses. When
+	  enabled, this option improves the performance of the kernel on such
+	  systems. However, the kernel and userspace programs will run much more
+	  slowly, or will not be able to run at all, on systems that do not
+	  support efficient unaligned memory accesses.
+
+endchoice
+
 source "arch/riscv/Kconfig.vendor"
 
 endmenu # "Platform type"
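A configuration that opts into boot-time probing would end up with something like the following in its generated .config (a sketch; exact contents depend on the rest of the configuration, and RISCV_ISA_V is assumed enabled):

```
CONFIG_RISCV_ISA_V=y
CONFIG_RISCV_PROBE_VECTOR_UNALIGNED_ACCESS=y
CONFIG_RISCV_VECTOR_MISALIGNED=y
CONFIG_RISCV_MISALIGNED=y
```

The last two lines follow from the select chain introduced above: RISCV_PROBE_VECTOR_UNALIGNED_ACCESS selects RISCV_VECTOR_MISALIGNED, which in turn selects RISCV_MISALIGNED.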

arch/riscv/include/asm/cpufeature.h

Lines changed: 9 additions & 1 deletion
@@ -8,6 +8,7 @@
 
 #include <linux/bitmap.h>
 #include <linux/jump_label.h>
+#include <linux/workqueue.h>
 #include <asm/hwcap.h>
 #include <asm/alternative-macros.h>
 #include <asm/errno.h>
@@ -58,8 +59,9 @@ void __init riscv_user_isa_enable(void);
 #define __RISCV_ISA_EXT_SUPERSET_VALIDATE(_name, _id, _sub_exts, _validate) \
 	_RISCV_ISA_EXT_DATA(_name, _id, _sub_exts, ARRAY_SIZE(_sub_exts), _validate)
 
-#if defined(CONFIG_RISCV_MISALIGNED)
 bool check_unaligned_access_emulated_all_cpus(void);
+#if defined(CONFIG_RISCV_SCALAR_MISALIGNED)
+void check_unaligned_access_emulated(struct work_struct *work __always_unused);
 void unaligned_emulation_finish(void);
 bool unaligned_ctl_available(void);
 DECLARE_PER_CPU(long, misaligned_access_speed);
@@ -70,6 +72,12 @@ static inline bool unaligned_ctl_available(void)
 }
 #endif
 
+bool check_vector_unaligned_access_emulated_all_cpus(void);
+#if defined(CONFIG_RISCV_VECTOR_MISALIGNED)
+void check_vector_unaligned_access_emulated(struct work_struct *work __always_unused);
+DECLARE_PER_CPU(long, vector_misaligned_access);
+#endif
+
 #if defined(CONFIG_RISCV_PROBE_UNALIGNED_ACCESS)
 DECLARE_STATIC_KEY_FALSE(fast_unaligned_access_speed_key);

arch/riscv/include/asm/entry-common.h

Lines changed: 0 additions & 11 deletions
@@ -25,18 +25,7 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
 void handle_page_fault(struct pt_regs *regs);
 void handle_break(struct pt_regs *regs);
 
-#ifdef CONFIG_RISCV_MISALIGNED
 int handle_misaligned_load(struct pt_regs *regs);
 int handle_misaligned_store(struct pt_regs *regs);
-#else
-static inline int handle_misaligned_load(struct pt_regs *regs)
-{
-	return -1;
-}
-static inline int handle_misaligned_store(struct pt_regs *regs)
-{
-	return -1;
-}
-#endif
 
 #endif /* _ASM_RISCV_ENTRY_COMMON_H */

arch/riscv/include/asm/hwprobe.h

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
 
 #include <uapi/asm/hwprobe.h>
 
-#define RISCV_HWPROBE_MAX_KEY 9
+#define RISCV_HWPROBE_MAX_KEY 10
 
 static inline bool riscv_hwprobe_key_is_valid(__s64 key)
 {

arch/riscv/include/asm/vector.h

Lines changed: 2 additions & 0 deletions
@@ -21,6 +21,7 @@
 
 extern unsigned long riscv_v_vsize;
 int riscv_v_setup_vsize(void);
+bool insn_is_vector(u32 insn_buf);
 bool riscv_v_first_use_handler(struct pt_regs *regs);
 void kernel_vector_begin(void);
 void kernel_vector_end(void);
@@ -268,6 +269,7 @@ struct pt_regs;
 
 static inline int riscv_v_setup_vsize(void) { return -EOPNOTSUPP; }
 static __always_inline bool has_vector(void) { return false; }
+static __always_inline bool insn_is_vector(u32 insn_buf) { return false; }
 static inline bool riscv_v_first_use_handler(struct pt_regs *regs) { return false; }
 static inline bool riscv_v_vstate_query(struct pt_regs *regs) { return false; }
 static inline bool riscv_v_vstate_ctrl_user_allowed(void) { return false; }
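The newly exported insn_is_vector() lets the misaligned-trap code decide whether a faulting instruction touches the vector unit. A deliberately simplified, illustrative sketch of that kind of check: the kernel's actual helper also matches vector loads/stores encoded under the LOAD-FP/STORE-FP major opcodes, while this sketch only matches the OP-V major opcode (0b1010111) in the low 7 bits of the instruction word.

```python
# Simplified sketch in the spirit of insn_is_vector(); not the kernel's code.
OPCODE_MASK = 0x7f   # the low 7 bits of a 32-bit RISC-V instruction
OPCODE_OP_V = 0x57   # OP-V: major opcode of vector arithmetic instructions

def insn_is_vector(insn: int) -> bool:
    """Return True if the instruction word uses the OP-V major opcode."""
    return (insn & OPCODE_MASK) == OPCODE_OP_V
```

The real helper must also inspect the width/mew fields of LOAD-FP/STORE-FP encodings to separate vector memory accesses from scalar floating-point ones.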

arch/riscv/include/uapi/asm/hwprobe.h

Lines changed: 5 additions & 0 deletions
@@ -88,6 +88,11 @@ struct riscv_hwprobe {
 #define RISCV_HWPROBE_MISALIGNED_SCALAR_SLOW		2
 #define RISCV_HWPROBE_MISALIGNED_SCALAR_FAST		3
 #define RISCV_HWPROBE_MISALIGNED_SCALAR_UNSUPPORTED	4
+#define RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF	10
+#define RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN		0
+#define RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW		2
+#define RISCV_HWPROBE_MISALIGNED_VECTOR_FAST		3
+#define RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED	4
 /* Increase RISCV_HWPROBE_MAX_KEY when adding items. */
 
 /* Flags */
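The value encoding mirrors the scalar key, with a gap at 1: the scalar key uses 1 for emulated accesses, and the vector key leaves it unassigned since this series does not emulate unaligned vector accesses. A small decoding helper a userspace tool might use (the helper name is illustrative; the constant values are copied from the hunk above):

```python
RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF = 10

# Value names from the uapi hunk above; 1 is intentionally unassigned.
VECTOR_PERF_NAMES = {
    0: "UNKNOWN",
    2: "SLOW",
    3: "FAST",
    4: "UNSUPPORTED",
}

def describe_vector_perf(value: int) -> str:
    """Turn a MISALIGNED_VECTOR_PERF value into a human-readable label."""
    return VECTOR_PERF_NAMES.get(value, f"reserved ({value})")
```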

arch/riscv/kernel/Makefile

Lines changed: 2 additions & 1 deletion
@@ -70,7 +70,8 @@ obj-$(CONFIG_MMU) += vdso.o vdso/
 
 obj-$(CONFIG_RISCV_MISALIGNED)	+= traps_misaligned.o
 obj-$(CONFIG_RISCV_MISALIGNED)	+= unaligned_access_speed.o
-obj-$(CONFIG_RISCV_PROBE_UNALIGNED_ACCESS) += copy-unaligned.o
+obj-$(CONFIG_RISCV_PROBE_UNALIGNED_ACCESS)	+= copy-unaligned.o
+obj-$(CONFIG_RISCV_PROBE_VECTOR_UNALIGNED_ACCESS)	+= vec-copy-unaligned.o
 
 obj-$(CONFIG_FPU)		+= fpu.o
 obj-$(CONFIG_FPU)		+= kernel_mode_fpu.o

arch/riscv/kernel/copy-unaligned.h

Lines changed: 5 additions & 0 deletions
@@ -10,4 +10,9 @@
 void __riscv_copy_words_unaligned(void *dst, const void *src, size_t size);
 void __riscv_copy_bytes_unaligned(void *dst, const void *src, size_t size);
 
+#ifdef CONFIG_RISCV_PROBE_VECTOR_UNALIGNED_ACCESS
+void __riscv_copy_vec_words_unaligned(void *dst, const void *src, size_t size);
+void __riscv_copy_vec_bytes_unaligned(void *dst, const void *src, size_t size);
+#endif
+
 #endif /* __RISCV_KERNEL_COPY_UNALIGNED_H */
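These two routines are the measurement primitives for the boot-time probe: the kernel times misaligned word-sized vector copies against the equivalent byte-sized vector copies and classifies the CPU from the result. A sketch of the decision logic only (the threshold and helper here are illustrative; the kernel's timing machinery lives in unaligned_access_speed.c):

```python
# RISCV_HWPROBE_MISALIGNED_VECTOR_{SLOW,FAST} from the uapi header.
SLOW, FAST = 2, 3

def classify(word_cycles: int, byte_cycles: int) -> int:
    """FAST when misaligned vector word copies beat the equivalent
    quantity of byte copies, SLOW otherwise (sketch of the probe's
    comparison, not the kernel's exact code)."""
    return FAST if word_cycles < byte_cycles else SLOW
```

CPUs where misaligned accesses trap (UNSUPPORTED) never reach this comparison; detection happens first, as described in the commit message.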

arch/riscv/kernel/fpu.S

Lines changed: 2 additions & 2 deletions
@@ -170,7 +170,7 @@ SYM_FUNC_END(__fstate_restore)
 __access_func(f31)
 
 
-#ifdef CONFIG_RISCV_MISALIGNED
+#ifdef CONFIG_RISCV_SCALAR_MISALIGNED
 
 /*
  * Disable compressed instructions set to keep a constant offset between FP
@@ -224,4 +224,4 @@ SYM_FUNC_START(get_f64_reg)
 	fp_access_epilogue
 SYM_FUNC_END(get_f64_reg)
 
-#endif /* CONFIG_RISCV_MISALIGNED */
+#endif /* CONFIG_RISCV_SCALAR_MISALIGNED */
