Description
Hi, I benchmarked xt::argmax() recently and found that it takes around 110 ms (averaged over 100 runs), while the NumPy counterpart takes only 35 ms.
The input is a (512, 768, 24) cube, and I take the argmax along axis 0, which produces a (768, 24) matrix:
// C++ (xtensor)
xt::xarray<double> cube = xt::random::randn<double>({512, 768, 24});
xt::argmax(cube, 0);

# Python (NumPy)
cube = np.random.randn(512, 768, 24)
np.argmax(cube, 0)
Compilable C++ source: https://gist.github.com/keeper/f065294442de562a72e41c47f2e18eff
Python scripts: https://gist.github.com/keeper/a664b6836939ee34b9d12eccdef087bd
Compiler version: g++ (Ubuntu 9.3.0-10ubuntu2) 9.3.0
Compile command: g++ -O3 -o main main.cpp
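
For reference, here is a minimal plain-loop sketch of the reduction described above, which might serve as a baseline when comparing the two libraries. It is not taken from xtensor or NumPy: the helper name argmax_axis0 is hypothetical, and it assumes the cube is stored contiguously in row-major order in a std::vector<double>. NaN handling is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of xtensor): argmax along axis 0 of a
// row-major (n0, n1, n2) cube stored contiguously in `data`.
// Returns n1 * n2 indices, i.e. a flattened (n1, n2) matrix.
std::vector<std::size_t> argmax_axis0(const std::vector<double>& data,
                                      std::size_t n0, std::size_t n1, std::size_t n2)
{
    assert(n0 > 0 && data.size() == n0 * n1 * n2);
    const std::size_t plane = n1 * n2;  // number of elements in one axis-0 slice

    // Start with slice 0 as the running maximum for every (j, k) position.
    std::vector<std::size_t> best_idx(plane, 0);
    std::vector<double> best_val(data.begin(),
                                 data.begin() + static_cast<std::ptrdiff_t>(plane));

    // Visit the remaining slices one at a time, so memory is read sequentially.
    for (std::size_t i = 1; i < n0; ++i) {
        const double* slice = data.data() + i * plane;
        for (std::size_t j = 0; j < plane; ++j) {
            if (slice[j] > best_val[j]) {  // strict '>' keeps the first maximum on ties
                best_val[j] = slice[j];
                best_idx[j] = i;
            }
        }
    }
    return best_idx;
}
```

For the cube in this issue, the call would be argmax_axis0(data, 512, 768, 24), returning 768 * 24 indices. The loop order is a design choice: iterating slice by slice keeps the inner loop on contiguous memory, at the cost of holding a running-maximum buffer the size of one slice.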