Skip to content
This repository was archived by the owner on Dec 4, 2019. It is now read-only.

Non-deterministic results because of aggregation steps #44

Open
thunterdb opened this issue Sep 21, 2016 · 0 comments
Open

Non-deterministic results because of aggregation steps #44

thunterdb opened this issue Sep 21, 2016 · 0 comments

Comments

@thunterdb
Copy link
Contributor

Because the aggregation step is not deterministic, the data may be presented in different orders between runs of the same query. A number of algorithms (DBSCAN) will then give different results because.

The recommended solution is to sort all the data point by lexicographic order before fitting them:

sorted_data = data[numpy.lexsort(data.T)]
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants