HwyBlockwiseShiftTest.TestAllShiftRightLanes test failing on Graviton3 #1938

Open
bedroge opened this issue Jan 16, 2024 · 7 comments
Labels
help wanted Extra attention is needed

Comments

@bedroge

bedroge commented Jan 16, 2024

I'm trying to compile Highway 1.0.3 with EasyBuild using the following recipe: https://github.com/easybuilders/easybuild-easyconfigs/blob/develop/easybuild/easyconfigs/h/Highway/Highway-1.0.3-GCCcore-12.2.0.eb.
The build completes without any issues, but one test is failing:

211/658 Test #211: HwyBlockwiseShiftTestGroup/HwyBlockwiseShiftTest.TestAllShiftRightLanes/SVE_256  # GetParam() = 33554432 ...............Subprocess aborted***Exception:   0.55 sec
Running main() from /tmp/bot/easybuild/build/googletest/1.12.1/GCCcore-12.2.0/googletest-release-1.12.1/googletest/src/gtest_main.cc
Note: Google Test filter = HwyBlockwiseShiftTestGroup/HwyBlockwiseShiftTest.TestAllShiftRightLanes/SVE_256
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from HwyBlockwiseShiftTestGroup/HwyBlockwiseShiftTest
[ RUN      ] HwyBlockwiseShiftTestGroup/HwyBlockwiseShiftTest.TestAllShiftRightLanes/SVE_256


i32x8 expect [2+ ->]:
  4,0,0,7,8,0,
i32x8 actual [2+ ->]:
  4,0,6,7,8,0,
Abort at /tmp/bot/easybuild/build/Highway/1.0.3/GCCcore-12.2.0/highway-1.0.3/hwy/tests/blockwise_shift_test.cc:135: SVE_256, i32x8 lane 4 mismatch: expected '0', got '6'.

This is on a c7g.4xlarge instance in AWS, and this is the lscpu output:

$ lscpu 
Architecture:           aarch64
  CPU op-mode(s):       32-bit, 64-bit
  Byte Order:           Little Endian
CPU(s):                 16
  On-line CPU(s) list:  0-15
Vendor ID:              ARM
  Model name:           Neoverse-V1
    Model:              1
    Thread(s) per core: 1
    Core(s) per socket: 16
    Socket(s):          1
    Stepping:           r1p1
    BogoMIPS:           2100.00
    Flags:              fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs dcpodp svei8mm svebf16 i8mm bf16 dgh rng
Caches (sum of all):    
  L1d:                  1 MiB (16 instances)
  L1i:                  1 MiB (16 instances)
  L2:                   16 MiB (16 instances)
  L3:                   32 MiB (1 instance)
NUMA:                   
  NUMA node(s):         1
  NUMA node0 CPU(s):    0-15
Vulnerabilities:        
  Itlb multihit:        Not affected
  L1tf:                 Not affected
  Mds:                  Not affected
  Meltdown:             Not affected
  Mmio stale data:      Not affected
  Retbleed:             Not affected
  Spec store bypass:    Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:           Mitigation; __user pointer sanitization
  Spectre v2:           Mitigation; CSV2, BHB
  Srbds:                Not affected
  Tsx async abort:      Not affected

Any idea why this test is failing?

@jan-wassenberg
Member

Hi @bedroge, I think this is caused by a compiler issue, or perhaps by a bug that I do not yet see, rather than by the CPU: it is the expected value that is incorrect here, not the actual one.

As far as I can see, the test and op are unchanged between 1.0.3 and the current code.

Are you able to test with another compiler, in particular clang?
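
To make the "expected value is incorrect" reading concrete, here is a minimal scalar sketch of a right shift by lanes within independent 128-bit blocks. It assumes the test input is the lane values 1..8, which is inferred from the printed values and not confirmed in this thread. `ShiftRightLanesModel` and `MakeIota` are hypothetical names used for illustration; they are not Highway's API or the test's actual code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns {1, 2, ..., n}, the assumed test input.
std::vector<std::int32_t> MakeIota(std::size_t n) {
  std::vector<std::int32_t> v(n);
  for (std::size_t i = 0; i < n; ++i) {
    v[i] = static_cast<std::int32_t>(i + 1);
  }
  return v;
}

// Scalar model of a blockwise shift right by kShift lanes: within each
// 128-bit block, lane i receives the value of lane i + kShift of the same
// block. Lanes that would be read from beyond the end of the block are zero.
template <std::size_t kShift, typename T>
std::vector<T> ShiftRightLanesModel(const std::vector<T>& in) {
  constexpr std::size_t kLanesPerBlock = 16 / sizeof(T);  // 4 for int32_t
  std::vector<T> out(in.size(), T(0));
  for (std::size_t i = 0; i < in.size(); ++i) {
    const std::size_t lane_in_block = i % kLanesPerBlock;
    const bool source_in_block = lane_in_block + kShift < kLanesPerBlock;
    if (source_in_block && i + kShift < in.size()) {
      out[i] = in[i + kShift];
    }
  }
  return out;
}
```

Under these assumptions, `ShiftRightLanesModel<1>(MakeIota(8))` yields 2,3,4,0,6,7,8,0 for an i32x8 vector (two blocks of four lanes). From lane 2 onward that is 4,0,6,7,8,0, which matches the "actual" line in the log above rather than the "expect" line.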

@bedroge
Author

bedroge commented Jan 17, 2024

I have now tried GCC 12.3.0 as well and see the same issue. I also tried GCC 13.2.0, but then the compilation fails with:

/tmp/eb-vtlvwo0l/tmpxuldwk2g/rpath_wrappers/gxx_wrapper/g++ --sysroot=/cvmfs/software.eessi.io/versions/2023.06/compat/linux/aarch64 -O2 -ftree-vectorize -mcpu=native -fno-math-errno -O3 -DNDEBUG -L/cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/neoverse_v1/software/googletest/1.14.0-GCCcore-13.2.0/lib64 -L/cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/neoverse_v1/software/googletest/1.14.0-GCCcore-13.2.0/lib -L/cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/neoverse_v1/software/GCCcore/13.2.0/lib64 -L/cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/neoverse_v1/software/GCCcore/13.2.0/lib -fPIE -pie CMakeFiles/demote_test.dir/hwy/tests/demote_test.cc.o -o tests/demote_test  libhwy.a libhwy_test.a libhwy_contrib.a /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/neoverse_v1/software/googletest/1.14.0-GCCcore-13.2.0/lib/libgtest_main.so.1.14.0 libhwy.a /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/neoverse_v1/software/googletest/1.14.0-GCCcore-13.2.0/lib/libgtest.so.1.14.0
during GIMPLE pass: fre
In file included from /tmp/bedroge/easybuild/build/Highway/1.0.3/GCCcore-13.2.0/highway-1.0.3/hwy/foreach_target.h:136,
                 from /tmp/bedroge/easybuild/build/Highway/1.0.3/GCCcore-13.2.0/highway-1.0.3/hwy/tests/compare_test.cc:22:
/tmp/bedroge/easybuild/build/Highway/1.0.3/GCCcore-13.2.0/highway-1.0.3/hwy/tests/compare_test.cc: In function hwy::N_SVE2_128::Vec<D> hwy::N_SVE2_128::Make128(D, uint64_t, uint64_t) [with D = Simd<long unsigned int, 2, 0>]:
/tmp/bedroge/easybuild/build/Highway/1.0.3/GCCcore-13.2.0/highway-1.0.3/hwy/tests/compare_test.cc:236:28: internal compiler error: in eliminate_stmt, at tree-ssa-sccvn.cc:6870
  236 | static HWY_NOINLINE Vec<D> Make128(D d, uint64_t hi, uint64_t lo) {
      |                            ^~~~~~~
0xf92823 eliminate_dom_walker::eliminate_stmt(basic_block_def*, gimple_stmt_iterator*)
        ../../gcc/tree-ssa-sccvn.cc:6870
0xf92a9f eliminate_dom_walker::before_dom_children(basic_block_def*)
        ../../gcc/tree-ssa-sccvn.cc:7304
0xf92a9f eliminate_dom_walker::before_dom_children(basic_block_def*)
        ../../gcc/tree-ssa-sccvn.cc:7237
0x17d07bb dom_walker::walk(basic_block_def*)
        ../../gcc/domwalk.cc:311
0xf89833 eliminate_with_rpo_vn(bitmap_head*)
        ../../gcc/tree-ssa-sccvn.cc:7484
0xf9926f do_rpo_vn_1
        ../../gcc/tree-ssa-sccvn.cc:8597
0xf99e8b execute
        ../../gcc/tree-ssa-sccvn.cc:8683
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
make[2]: *** [CMakeFiles/compare_test.dir/build.make:79: CMakeFiles/compare_test.dir/hwy/tests/compare_test.cc.o] Error 1

I'm not sure whether I can easily build it with Clang under EasyBuild, but I might give it a try.

But just to be sure: you said that it's the expected value that's incorrect; so that means I can safely ignore this failing test?

@boegel

boegel commented Feb 6, 2024

@jan-wassenberg Can you clarify how "serious" this failing test is?

It is currently blocking us from including Highway in EESSI, and we're wondering whether it really should, or whether we can ignore the failing test for now.

So basically: is it a problem with the test, or is it really a signal of a problem with the installation of Highway?

@jan-wassenberg
Member

Hi @bedroge, sorry to hear the issue still affects GCC 13. Would you like to file a bug in the GCC Bugzilla?

> But just to be sure: you said that it's the expected value that's incorrect; so that means I can safely ignore this failing test?

Yes, that's right. Though it does seem to point to a compiler bug which would be good for them to fix.

Hi @boegel, I think it is reasonable to ignore this test, especially if you are not using the ShiftRightLanes operation it is testing. The test does not signal a problem with the installation. The specific way it fails suggests a compiler bug affecting this test; such compiler bugs are unfortunately reasonably common.
I think you'll find that the error does not happen when building with Clang, in case that is an option?

@bedroge
Author

bedroge commented Feb 12, 2024

Thanks @jan-wassenberg. I've tried building it with Clang, and in that case everything did indeed work fine. But for now we will just ignore this failing test, as we would like to stick to GCC.

@jan-wassenberg
Member

Thanks for confirming clang works. Makes sense :)

jan-wassenberg added the "help wanted" (Extra attention is needed) label on Apr 9, 2024
@jan-wassenberg
Member

I would appreciate it if someone could help report this to GCC so they can fix it :)
