Skip to content

Migration, standardizer and a few fixes #72

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Sep 27, 2019
Merged

Migration, standardizer and a few fixes #72

merged 12 commits into from
Sep 27, 2019

Conversation

tlienart
Copy link
Collaborator

@tlienart tlienart commented Sep 26, 2019

Well this was surprisingly difficult...

what's in this PR

primary

  • removes the meta tools and uses the ones from mljbase (@mlj_model etc) reference: todo Towards 0.5.0 #62
  • replaces all use of MLJBase.schema by Tables.schema including in Standardizer which now looks at the machine type instead of the scitype reference: todo Towards 0.5.0 #62 . We can revert this in the future when scitype(X) is slow for large tables ScientificTypes.jl#12 is solved but in the mean time I would strongly prefer it stayed the way it is. Importantly
    • the previous implementation was inconsistent in terms of allowing only Continuous while suggesting that it could be applied on Count (it should) and while the UnivariateStandardizer could be applied on all Real
    • the current implementation by default applies the standardizer on any AbstractFloat and if specified, on anything that is a subtype of Real which is consistent with the way the UnivariateStandardizer works.

Note: closes #63 , also supersedes #69

secondary

  • cleans up the Transformers somewhat as well as their tests
  • bumps up version usage to MLJBase 0.6 reference: todo Towards 0.5.0 #62
  • cleans up the use and definition of global variables AMBIGUOUS_NAMES etc using the usual const trick
  • cleans up where exports and imports should happen
  • updates the model registry though this can probably be checked & possibly done again; reference: todo Towards 0.5.0 #62

Note on meta tools

The tools @mlj_model and metadata_* cannot be used reliably for builtin models (src/builtins/Constant and src/builtins/Transformers). This causes no other problem than irritation, I tried, multiple ways and ended up wasting a fair bit of time; all ways failed, either complaining about the eval in a closed package MLJBase or that things get re-defined. None of these issues arise when using the tools in interfaces.

So in short: use the old approach for anything built-in.

@ablaom ablaom mentioned this pull request Sep 26, 2019
@ablaom ablaom merged commit 609f260 into dev Sep 27, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants