ordinal regression model type & polr engine #6
Here is a complete analysis:
library(tidymodels)
library(ordered)
# disaggregated data & partition
house_data <-
  MASS::housing[rep(seq(nrow(MASS::housing)), MASS::housing$Freq), -5]
house_split <- initial_split(house_data, prop = .8)
house_train <- training(house_split)
house_test <- testing(house_split)
# tunable model & analysis specification
house_rec <- recipe(Sat ~ Infl + Type + Cont, data = house_train)
house_spec <- ordinal_reg() |>
  set_engine("polr") |>
  set_args(method = tune())
house_tune <- extract_parameter_set_dials(house_spec)
(house_grid <- grid_regular(house_tune, levels = Inf))
#> # A tibble: 5 × 1
#>   method
#>   <chr>
#> 1 logistic
#> 2 probit
#> 3 loglog
#> 4 cloglog
#> 5 cauchit
# hyperparameter (link function) optimization
house_res <- tune_grid(
  house_spec,
  preprocessor = house_rec,
  resamples = vfold_cv(house_train),
  grid = house_grid,
  metrics = metric_set(accuracy, roc_auc)
)
(house_link <- select_best(house_res, metric = "accuracy"))
#> # A tibble: 1 × 2
#>   method   .config
#>   <chr>    <chr>
#> 1 logistic Preprocessor1_Model1
# final fit
house_prep <- prep(house_rec)
house_final <- finalize_model(house_spec, house_link)
(house_fit <- fit(house_final, formula(house_prep), data = house_train))
#> parsnip model object
#>
#> Call:
#> MASS::polr(formula = Sat ~ Infl + Type + Cont, data = data, method = ~"logistic")
#>
#> Coefficients:
#>    InflMedium      InflHigh TypeApartment    TypeAtrium   TypeTerrace
#>     0.5103368     1.2315652    -0.4973120    -0.2740917    -0.9533085
#>      ContHigh
#>     0.3576051
#>
#> Intercepts:
#>  Low|Medium Medium|High
#>  -0.4677984   0.7202062
#>
#> Residual Deviance: 2803.47
#> AIC: 2819.47
# evaluation
house_pred_class <- predict(house_fit, new_data = house_test, type = "class")
bind_cols(house_test, house_pred_class) |>
  accuracy(truth = Sat, estimate = .pred_class)
#> # A tibble: 1 × 3
#>   .metric  .estimator .estimate
#>   <chr>    <chr>          <dbl>
#> 1 accuracy multiclass     0.528
house_pred_prob <- predict(house_fit, new_data = house_test, type = "prob")
bind_cols(house_test, house_pred_prob) |>
  roc_auc(truth = Sat, starts_with(".pred_"))
#> # A tibble: 1 × 3
#>   .metric .estimator .estimate
#>   <chr>   <chr>          <dbl>
#> 1 roc_auc hand_till      0.652
Created on 2024-11-04 with reprex v2.1.1
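As a sanity check on the disaggregation step at the top of the example, the sketch below compares a fit on the expanded, one-row-per-respondent data with a fit on the original frequency table using `weights = Freq`. It calls `MASS::polr()` directly rather than going through the new engine, and the object names (`housing_long`, `fit_weighted`, `fit_long`, `n_expected`, `max_coef_diff`) are made up for illustration. It uses the full data set, not the training split, so the numbers will not match the output above.

```r
# Expand the frequency table to one row per respondent and drop Freq,
# mirroring the disaggregation step in the example.
housing <- MASS::housing
housing_long <- housing[rep(seq_len(nrow(housing)), housing$Freq),
                        names(housing) != "Freq"]

# Fit the same proportional-odds model two ways:
# on the aggregated table with frequency weights, and on the expanded rows.
fit_weighted <- MASS::polr(Sat ~ Infl + Type + Cont,
                           weights = Freq, data = housing)
fit_long <- MASS::polr(Sat ~ Infl + Type + Cont, data = housing_long)

# The expanded data should have one row per counted respondent,
# and the two sets of coefficient estimates should agree closely.
n_expected <- sum(housing$Freq)
max_coef_diff <- max(abs(coef(fit_weighted) - coef(fit_long)))
```

If the two fits agree, the expansion is only a convenience for resampling with {rsample}, not a change to the model being estimated.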
I'll try to review this later today. My first thought is that the bare skeleton of
@topepo could this be resumed for a minimal CRAN submission in the next several months? I will join a project in June and hope to make use of this package. : )
This PR addresses #4 by introducing a single model type for ordinal regression and a single deployable engine. My thinking is that we should complete the implementation of one engine before beginning another.
Model type
The model type is ordinal_reg(), per this suggestion. However, as noted in the NEWS, this could be replaced with separate ordinal_*() types for different model structures, per this suggestion.
Engine
The model type comes with one engine, 'polr', which invokes MASS::polr(). The engine has one tuning parameter, called ordinal_link, which mimics survival_link and is passed to the method parameter of polr(). The engine also provides class and prob prediction formats; confidence intervals for predictions seem not to be implemented in {MASS}. The engine is registered on load.
The ordinal_reg branch of {ordered} is coordinated with identically named branches of {parsnip} and of {dials}. In {parsnip}, the model type is registered on load, a basic update() method is provided, and several other brief files or code chunks analogous to those for other model types are included. In {dials}, the ordinal_link parameter tuner is defined.
NB: I am not sure I successfully synchronized ordinal_link to method; in particular, the polr_engine_args tibble is a bit mysterious to me. A unit test with hyperparameter optimization needs to be written. Edit: See the example in a comment below.
Documentation
Package documentation was added to 'ordered-package.R' so that illustrative examples, including of {ordinalForest}, could be included there.
NB: I was unable to install the necessary dependencies to knit 'aaa.Rmd', so I manually wrote 'ordinal_reg_polr.md'.
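For reviewers less familiar with the underlying function, here is a minimal sketch of what the Engine section describes, written directly against `MASS::polr()` and its `predict()` method rather than the engine code in this PR. It only illustrates that the link is chosen through the `method` argument and that `"class"` and `"probs"` are the two prediction types {MASS} offers. The object names (`links`, `fits`, `aic_by_link`, `new_rows`, `pred_class`, `pred_prob`) are hypothetical, and for brevity only three of the link names from the tuning grid are tried.

```r
# Three of the link functions accepted by the `method` argument of polr().
links <- c("logistic", "probit", "cloglog")

# Fit one proportional-odds-style model per link on the frequency table.
fits <- lapply(links, function(link) {
  MASS::polr(Sat ~ Infl + Type + Cont,
             weights = Freq, data = MASS::housing, method = link)
})
names(fits) <- links

# Compare the fits by AIC (lower is better).
aic_by_link <- sapply(fits, AIC)

# Predict for each distinct covariate combination.
new_rows <- unique(MASS::housing[, c("Infl", "Type", "Cont")])

# type = "class" returns the most probable category per row;
# type = "probs" returns a matrix with one column per category.
pred_class <- predict(fits[["logistic"]], newdata = new_rows, type = "class")
pred_prob <- predict(fits[["logistic"]], newdata = new_rows, type = "probs")
```

In the engine, these two prediction types would correspond to the class and prob formats mentioned above; how the parsnip wrapper renames the output columns is not shown here.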