python - Fastest way to find non-finite values


This is inspired by: Python: combined masking in numpy.

The task is to create a boolean array marking the values that are not finite. For example:

    >>> arr = np.array([0, 2, np.inf, -np.inf, np.nan])
    >>> ~np.isfinite(arr)
    array([False, False,  True,  True,  True], dtype=bool)

To me, this seems like the fastest way to find non-finite values, but it seems there is a faster way. np.isnan(arr - arr) should do the same:

    >>> np.isnan(arr - arr)
    array([False, False,  True,  True,  True], dtype=bool)
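To see why the subtraction trick gives the same mask, here is a small sketch (the helper name nonfinite_via_subtraction is made up for illustration). Under IEEE 754 arithmetic, x - x is 0 for every finite x, while inf - inf and nan - nan are both NaN. NumPy normally emits an "invalid value" RuntimeWarning for inf - inf, so the sketch silences it with np.errstate.

```python
import numpy as np


def nonfinite_via_subtraction(arr):
    """Hypothetical helper: mask of non-finite values via isnan(arr - arr)."""
    # inf - inf is an "invalid" floating-point operation; suppress the warning.
    with np.errstate(invalid="ignore"):
        diff = arr - arr  # 0 where finite, NaN where inf, -inf or NaN
    return np.isnan(diff)


arr = np.array([0.0, 2.0, np.inf, -np.inf, np.nan])
mask = nonfinite_via_subtraction(arr)

# Both approaches should agree on this input.
assert np.array_equal(mask, ~np.isfinite(arr))
```

Note that the subtraction allocates a full temporary float array, which ~np.isfinite(arr) does not need.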

Timing it, we see that it is twice as fast!

    arr = np.random.rand(100000)

    %timeit ~np.isfinite(arr)
    10000 loops, best of 3: 198 µs per loop

    %timeit np.isnan(arr - arr)
    10000 loops, best of 3: 85.8 µs per loop

So my question is twofold:

  1. Why is the np.isnan(arr - arr) trick faster than the "obvious" ~np.isfinite(arr) version? Is there any input it does not work for?

  2. Is there a faster way to find non-finite values?

That's hard to answer because np.isnan and np.isfinite can use different C functions depending on the build. Depending on the performance of these C functions (which may depend on the compiler, the system, and how NumPy itself was built), the timings will differ.


The ufuncs for both refer to a "built-in" npy_ function (source (1.11.3)):

    /**begin repeat1
     * #kind = isnan, isinf, isfinite, signbit, copysign, nextafter, spacing#
     * #func = npy_isnan, npy_isinf, npy_isfinite, npy_signbit, npy_copysign, nextafter, spacing#
     **/

And these functions are defined based on the presence of compile-time constants (source (1.11.3)):

    /* use builtins to avoid function calls in tight loops
     * only available if npy_config.h is available (= numpys own build) */
    #if HAVE___BUILTIN_ISNAN
        #define npy_isnan(x) __builtin_isnan(x)
    #else
        #ifndef NPY_HAVE_DECL_ISNAN
            #define npy_isnan(x) ((x) != (x))
        #else
            #if defined(_MSC_VER) && (_MSC_VER < 1900)
                #define npy_isnan(x) _isnan((x))
            #else
                #define npy_isnan(x) isnan(x)
            #endif
        #endif
    #endif

    /* only available if npy_config.h is available (= numpys own build) */
    #if HAVE___BUILTIN_ISFINITE
        #define npy_isfinite(x) __builtin_isfinite(x)
    #else
        #ifndef NPY_HAVE_DECL_ISFINITE
            #ifdef _MSC_VER
                #define npy_isfinite(x) _finite((x))
            #else
                #define npy_isfinite(x) !npy_isnan((x) + (-x))
            #endif
        #else
            #define npy_isfinite(x) isfinite((x))
        #endif
    #endif

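So it might just be that in your case np.isfinite has to do (much) more work than np.isnan. But it's equally likely that on another computer or with another build np.isfinite is faster, or that both are equally fast.

So there is probably no hard rule about what the "fastest way" is. It depends on too many factors. Personally I would just go with np.isfinite because it can be faster (and isn't much slower even in your case) and it makes the intention much clearer.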
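Since the result depends on the build, the only reliable answer is to measure on your own machine. Here is a minimal sketch using the standard timeit module instead of IPython's %timeit. The function name compare_timings and its parameters are made up for illustration, and the numbers it returns are only meaningful for the machine and NumPy build it runs on.

```python
import timeit

import numpy as np


def compare_timings(arr, number=100, repeat=5):
    """Hypothetical helper: best per-call time in seconds for both variants."""
    candidates = {
        "~np.isfinite(arr)": lambda: ~np.isfinite(arr),
        "np.isnan(arr - arr)": lambda: np.isnan(arr - arr),
    }
    results = {}
    # Suppress the "invalid value" warning in case arr contains infinities.
    with np.errstate(invalid="ignore"):
        for label, func in candidates.items():
            timings = timeit.repeat(func, number=number, repeat=repeat)
            # Take the minimum: it is the run least disturbed by other processes.
            results[label] = min(timings) / number
    return results


if __name__ == "__main__":
    arr = np.random.rand(100000)
    for label, seconds in compare_timings(arr).items():
        print(f"{label}: {seconds * 1e6:.1f} µs per call")
```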


Just in case you're really into optimizing the performance, you can always do the negation in-place. That might decrease the time and memory by avoiding one temporary array:

    import numpy as np
    arr = np.random.rand(1000000)

    def isnotfinite(arr):
        res = np.isfinite(arr)
        np.bitwise_not(res, out=res)  # in-place
        return res

    np.testing.assert_array_equal(~np.isfinite(arr), isnotfinite(arr))
    np.testing.assert_array_equal(~np.isfinite(arr), np.isnan(arr - arr))

    %timeit ~np.isfinite(arr)
    # 3.73 ms ± 4.16 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    %timeit isnotfinite(arr)
    # 2.41 ms ± 29.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    %timeit np.isnan(arr - arr)
    # 12.5 ms ± 772 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Note also that the np.isnan solution is much slower on my computer (Windows 10 64-bit, Python 3.5, NumPy 1.13.1, Anaconda build).
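If the mask is computed repeatedly for arrays of the same shape, the same idea can be taken one step further by reusing a preallocated boolean buffer through the ufunc out argument. This is only a sketch: the name isnotfinite_into and its out parameter are hypothetical, and whether it helps in practice is something you would have to time on your own build.

```python
import numpy as np


def isnotfinite_into(arr, out=None):
    """Hypothetical helper: write the non-finite mask into a reusable buffer."""
    if out is None:
        out = np.empty(arr.shape, dtype=bool)
    np.isfinite(arr, out=out)     # fill the buffer with the "is finite" mask
    np.logical_not(out, out=out)  # negate in-place, no extra temporary
    return out


arr = np.array([0.0, 2.0, np.inf, -np.inf, np.nan])
buf = np.empty(arr.shape, dtype=bool)  # allocated once, reused across calls

mask = isnotfinite_into(arr, out=buf)
assert mask is buf
assert np.array_equal(mask, ~np.isfinite(arr))
```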

