Python - Fastest way to find non-finite values
This is inspired by: python: combined masking in numpy.
The task is to create a boolean array marking the values that are not finite. Example:
>>> arr = np.array([0, 2, np.inf, -np.inf, np.nan])
>>> ~np.isfinite(arr)
array([False, False,  True,  True,  True], dtype=bool)
To me, this seems like it should be the fastest way to find non-finite values, but there appears to be a faster way. np.isnan(arr - arr) should do the same:
>>> np.isnan(arr - arr)
array([False, False,  True,  True,  True], dtype=bool)
Timing it, we see that it is about twice as fast!
arr = np.random.rand(100000)

%timeit ~np.isfinite(arr)
10000 loops, best of 3: 198 µs per loop

%timeit np.isnan(arr - arr)
10000 loops, best of 3: 85.8 µs per loop
So my question is twofold:
1. Why is the np.isnan(arr - arr) trick faster than the "obvious" ~np.isfinite(arr) version? Is there input it would not work for?
2. Is there a faster way to find non-finite values?
That's hard to answer, because np.isnan and np.isfinite can use different C functions depending on the build. The timings will differ depending on the performance of these C functions, which may in turn depend on the compiler, the system, and how NumPy itself was built.
The ufuncs for both refer to a built-in npy_ function (source (1.11.3)):
/**begin repeat1
 * #kind = isnan, isinf, isfinite, signbit, copysign, nextafter, spacing#
 * #func = npy_isnan, npy_isinf, npy_isfinite, npy_signbit, npy_copysign, nextafter, spacing#
 **/
These functions are defined based on the presence of compile-time constants (source (1.11.3)):
/* use builtins to avoid function calls in tight loops
 * only available if npy_config.h is available (= numpys own build) */
#if HAVE___BUILTIN_ISNAN
    #define npy_isnan(x) __builtin_isnan(x)
#else
    #ifndef NPY_HAVE_DECL_ISNAN
        #define npy_isnan(x) ((x) != (x))
    #else
        #if defined(_MSC_VER) && (_MSC_VER < 1900)
            #define npy_isnan(x) _isnan((x))
        #else
            #define npy_isnan(x) isnan(x)
        #endif
    #endif
#endif

/* only available if npy_config.h is available (= numpys own build) */
#if HAVE___BUILTIN_ISFINITE
    #define npy_isfinite(x) __builtin_isfinite(x)
#else
    #ifndef NPY_HAVE_DECL_ISFINITE
        #ifdef _MSC_VER
            #define npy_isfinite(x) _finite((x))
        #else
            #define npy_isfinite(x) !npy_isnan((x) + (-x))
        #endif
    #else
        #define npy_isfinite(x) isfinite((x))
    #endif
#endif
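The two fallback definitions in that header can be mirrored at the NumPy level, which also shows that the arr - arr trick is essentially the fallback form of npy_isfinite. This is only an illustrative sketch (the variable names are made up here); it does not tell you which branch your own build actually compiles:

```python
import numpy as np

arr = np.array([0.0, 2.0, np.inf, -np.inf, np.nan])

# Fallback for npy_isnan: NaN is the only value that compares unequal to itself.
nan_fallback = arr != arr

# Fallback for npy_isfinite: x + (-x) is 0 for finite x and NaN otherwise.
# inf + (-inf) raises the floating-point "invalid" flag, so silence it locally.
with np.errstate(invalid="ignore"):
    finite_fallback = ~np.isnan(arr + (-arr))

# Both fallbacks agree with the corresponding ufuncs on this input.
assert np.array_equal(nan_fallback, np.isnan(arr))
assert np.array_equal(finite_fallback, np.isfinite(arr))
```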
So it might just be that in your case np.isfinite has to do (much) more work than np.isnan. But it's equally likely that on another computer, or with another build, np.isfinite is faster or both are equally fast.
So there is probably no hard rule for what the "fastest way" is; it depends on too many factors. Personally, I would just go with np.isfinite, because it can be faster (and isn't much slower even in your case) and it makes the intention much clearer.
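Since the outcome depends on the build, the practical approach is to measure on your own machine. A minimal sketch using the standard-library timeit module; the function names and the benchmark helper are made up for this illustration and are not part of NumPy:

```python
import timeit

import numpy as np


def not_finite_isfinite(arr):
    """The 'obvious' version: negate np.isfinite."""
    return ~np.isfinite(arr)


def not_finite_isnan(arr):
    """The subtraction trick: x - x is NaN exactly when x is not finite."""
    return np.isnan(arr - arr)


def benchmark(arr, number=200, repeat=5):
    """Return the best observed time per call, in seconds, for each candidate."""
    results = {}
    for func in (not_finite_isfinite, not_finite_isnan):
        timings = timeit.repeat(lambda: func(arr), number=number, repeat=repeat)
        results[func.__name__] = min(timings) / number
    return results


if __name__ == "__main__":
    arr = np.random.rand(100000)  # finite input, same size as in the question
    for name, seconds in benchmark(arr).items():
        print(f"{name}: {seconds * 1e6:.1f} us per call")
```

The numbers it prints are only meaningful for the machine and NumPy build they were measured on.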
Just in case you're really into optimizing the performance, you can always do the negation in-place. That might reduce time and memory use by avoiding one temporary array:
import numpy as np

arr = np.random.rand(1000000)

def isnotfinite(arr):
    res = np.isfinite(arr)
    np.bitwise_not(res, out=res)  # in-place
    return res

np.testing.assert_array_equal(~np.isfinite(arr), isnotfinite(arr))
np.testing.assert_array_equal(~np.isfinite(arr), np.isnan(arr - arr))

%timeit ~np.isfinite(arr)
# 3.73 ms ± 4.16 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit isnotfinite(arr)
# 2.41 ms ± 29.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit np.isnan(arr - arr)
# 12.5 ms ± 772 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Note that the np.isnan solution is much slower on my computer (Windows 10 64-bit, Python 3.5, NumPy 1.13.1, Anaconda build).
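On the question of inputs the trick might not work for: for float arrays the result matches ~np.isfinite(arr), but inf - inf is an invalid floating-point operation, so with NumPy's default error settings the subtraction typically emits a RuntimeWarning, and arr - arr always allocates an extra temporary array. A small sketch of that behaviour (the variable names are only for illustration):

```python
import warnings

import numpy as np

arr = np.array([0.0, 2.0, np.inf, -np.inf, np.nan])

# Record any warnings raised while computing the mask with default settings.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    mask = np.isnan(arr - arr)

warned = any(issubclass(w.category, RuntimeWarning) for w in caught)
print("RuntimeWarning emitted:", warned)

# The same computation with the "invalid" warning suppressed locally.
with np.errstate(invalid="ignore"):
    quiet_mask = np.isnan(arr - arr)

# Either way the mask agrees with the np.isfinite version on this input.
assert np.array_equal(mask, ~np.isfinite(arr))
assert np.array_equal(quiet_mask, mask)
```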