Architecture Flags 3/4
<arch> armv9-a, armv7-a+neon-vfpv4, znver4, core2, skylake
<tune arch> cortex-a9, neoverse-n2, generic, intel
<instr set> sse2, avx512
<fp hw> neon, neon-fp-armv8
• <tune arch> should always be the same as or newer than <arch>
• -mtune defaults to generic when neither -mtune nor -march is specified; on
x86, -march=<arch> also implies -mtune=<arch>
• -march=native, -mtune=native, -mcpu=native: let the compiler detect the
processor type of the build machine (detection is not always accurate)
• Especially with new compilers, prefer auto-vectorization to explicit vector
intrinsics
• GCC Arm options, GCC X86-64 options
• Compiler flags across architectures: -march, -mtune, and -mcpu
• NVIDIA Grace CPU Benchmarking Guide, Arm Vector Instructions: SVE and NEON
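The flags above can be tried on a small example. The following is a minimal sketch, not taken from the slides: the file name saxpy.cpp and the functions saxpy and enabled_simd are hypothetical names chosen for illustration. The loop uses no intrinsics, so the compiler is free to auto-vectorize it with whatever instruction set -march allows, and the predefined macros show which SIMD extensions the chosen flags enabled.

```cpp
// Example builds with GCC (saxpy.cpp is a hypothetical file name):
//   g++ -O3 -march=native -c saxpy.cpp
//   g++ -O3 -march=skylake -mtune=generic -c saxpy.cpp
// With GCC, adding -fopt-info-vec reports which loops were vectorized.
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Plain loop, no intrinsics: at -O3 the compiler may vectorize it using
// the instruction set selected by -march (or by an explicit -m<instr set>).
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    for (std::size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

// Lists the SIMD extensions enabled at compile time, based on the
// predefined macros set by GCC and Clang for the selected target.
std::string enabled_simd() {
    std::string s;
#if defined(__SSE2__)
    s += "sse2 ";
#endif
#if defined(__AVX2__)
    s += "avx2 ";
#endif
#if defined(__AVX512F__)
    s += "avx512f ";
#endif
#if defined(__ARM_NEON)
    s += "neon ";
#endif
#if defined(__ARM_FEATURE_SVE)
    s += "sve ";
#endif
    return s.empty() ? "none detected" : s;
}
```

Printing enabled_simd() after compiling with different -march values is a quick way to confirm what a flag actually turned on, which is especially useful with -march=native, where detection may not match expectations.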