This repository was archived by the owner on May 4, 2019. It is now read-only.

Lightweight wrapper for floating point arrays with NaNs #5

Open
simonster opened this issue Oct 29, 2013 · 3 comments

Comments

@simonster
Member

It'd be nice to have a subtype of AbstractDataArray that wraps a regular floating-point array and treats NaN values as NA. I think this should generally be faster than indexing the BitArray that marks which values are NA, and with #4 and some minor API changes this should eventually give us nansum etc. with performance equivalent to manual loops.
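To make the comparison concrete, here is a minimal sketch in plain Python rather than Julia. It is not the DataArrays API; the function names `nansum` and `masked_sum` are hypothetical and only illustrate the two strategies being weighed: using NaN itself as the missing-value marker versus consulting a separate NA mask for every element.

```python
import math

def nansum(values):
    """Sum floats, treating every NaN as a missing value (NA)."""
    total = 0.0
    for x in values:
        # NaN doubles as the NA marker, so no separate mask lookup is needed.
        if not math.isnan(x):
            total += x
    return total

def masked_sum(values, is_na):
    """Sum floats, skipping entries flagged in a separate boolean NA mask."""
    total = 0.0
    for x, na in zip(values, is_na):
        if not na:
            total += x
    return total

data = [1.0, float("nan"), 2.5]
mask = [False, True, False]   # hypothetical stand-in for the NA mask

nan_result = nansum(data)
mask_result = masked_sum(data, mask)
print(nan_result, mask_result)
```

Both calls skip the middle element and give the same total. The difference is only in where the missingness information lives, which is what the performance argument above is about.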

@johnmyleswhite
Member

This would make @tshort happy I suspect. It's definitely worth adding, but I'm a little worried about how it would fit into the system -- would you always use FloatDataArray instead of DataArray when working with floats?

@simonster
Member Author

I'm not sure, although it seems safer to use a DataArray even when working with floats. With a DataArray you could potentially have values that are NaN but not NA, and these should contaminate the result when NAs are ignored. This wrapper wouldn't be able to express that. On the other hand, such cases probably arise rarely in practice. How does R treat NaN vs. NA?
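The expressiveness gap can be shown with a small sketch, again in plain Python with hypothetical helper names rather than any real DataArrays function. The input holds a NaN that is a genuine computed value, not a missing one. A mask-based container can keep that NaN and let it contaminate the sum, while a NaN-as-NA wrapper has no way to tell it apart from a missing entry.

```python
import math

def sum_skipping_mask(values, is_na):
    """Mask-based semantics: skip only entries flagged as NA."""
    total = 0.0
    for x, na in zip(values, is_na):
        if not na:
            total += x   # a NaN that is not NA propagates into the total
    return total

def sum_skipping_nan(values):
    """Wrapper semantics: every NaN is assumed to mean NA and is skipped."""
    total = 0.0
    for x in values:
        if not math.isnan(x):
            total += x
    return total

values = [1.0, float("nan"), 2.0]   # the NaN is a real result, e.g. of 0/0
is_na = [False, False, False]       # nothing is actually missing

masked = sum_skipping_mask(values, is_na)
wrapped = sum_skipping_nan(values)
print(masked, wrapped)
```

With the mask, the sum is NaN, which correctly signals the invalid value. With the wrapper, the NaN is silently dropped and the sum comes out finite.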

@tshort

tshort commented Oct 29, 2013

In R, NA and NaN have different bit patterns (both of which are NaNs). You could do the same in Julia.

NAs and NaNs can get mixed up if you mix operations that include both. NA + NaN could give a different answer than NaN + NA. I don't think you can avoid those issues. Anyway, I think it's rare to have conflicts between NA and NaN.
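As a rough illustration of the bit-pattern idea, the sketch below inspects IEEE 754 doubles as 64-bit integers in plain Python. It assumes, as is commonly described for R, that the NA marker is a NaN whose low 32 bits hold the value 1954; treat that constant and the helper names as assumptions made for illustration, not as a specification of R or of anything in Julia.

```python
import math
import struct

# Assumption: R-style NA is a NaN whose low 32-bit word equals 1954.
NA_LOW_WORD = 1954
NA_BITS = 0x7FF0000000000000 | NA_LOW_WORD

def float_to_bits(x):
    """Reinterpret a double as an unsigned 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def bits_to_float(bits):
    """Reinterpret an unsigned 64-bit integer as a double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def is_nan_bits(bits):
    """A double is NaN when the exponent is all ones and the mantissa is nonzero."""
    exponent = (bits >> 52) & 0x7FF
    mantissa = bits & ((1 << 52) - 1)
    return exponent == 0x7FF and mantissa != 0

def is_na_bits(bits):
    """NA is the subset of NaNs that carry the marker in the low word."""
    return is_nan_bits(bits) and (bits & 0xFFFFFFFF) == NA_LOW_WORD

na_value = bits_to_float(NA_BITS)
plain_nan_bits = float_to_bits(float("nan"))

print(math.isnan(na_value))        # the NA marker is still a NaN to the hardware
print(is_na_bits(NA_BITS))         # and it carries the NA payload
print(is_na_bits(plain_nan_bits))  # an ordinary NaN does not
```

Which payload survives an operation such as NA + NaN is decided by the hardware and is not guaranteed to be symmetric, which is the mixing problem described above. The sketch therefore only classifies bit patterns and does not try to predict the result of arithmetic on them.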
