With the latest master, string columns trigger an error when used in a formula to build a ModelMatrix. We should probably treat them as categorical variables, either by converting them to a CategoricalArray or (even better) by building contrasts for them on the fly, without making a copy.
I am not sure what to do with other kinds of non-numeric columns, though. Raise an error? Treat them as categorical by default?
julia> using DataFrames
julia> df = DataFrame(A = 1:4, B = ["M", "F", "F", "M"])
4×2 DataFrames.DataFrame
│ Row │ A │ B │
├─────┼───┼─────┤
│ 1 │ 1 │ "M" │
│ 2 │ 2 │ "F" │
│ 3 │ 3 │ "F" │
│ 4 │ 4 │ "M" │
julia> ModelMatrix(ModelFrame(A ~ B, df))
ERROR: MethodError: Cannot `convert` an object of type String to an object of type Float64
This may have arisen from a call to the constructor Float64(...),
since type constructors fall back to convert methods.
 in copy!(::Base.LinearFast, ::Array{Float64,2}, ::Base.LinearFast, ::Array{String,2}) at ./abstractarray.jl:575
 in modelmat_cols(::Type{Array{Float64,2}}, ::NullableArrays.NullableArray{String,1}) at /home/milan/.julia/DataFrames/src/statsmodels/formula.jl:349
 in #modelmat_cols#122(::Bool, ::Function, ::Type{Array{Float64,2}}, ::Symbol, ::DataFrames.ModelFrame) at /home/milan/.julia/DataFrames/src/statsmodels/formula.jl:342
 in (::DataFrames.#kw##modelmat_cols)(::Array{Any,1}, ::DataFrames.#modelmat_cols, ::Type{Array{Float64,2}}, ::Symbol, ::DataFrames.ModelFrame) at ./<missing>:0
 in DataFrames.ModelMatrix{Array{Float64,2}}(::DataFrames.ModelFrame) at /home/milan/.julia/DataFrames/src/statsmodels/formula.jl:478
 in DataFrames.ModelMatrix{T<:AbstractArray{T<:AbstractFloat,2}}(::DataFrames.ModelFrame) at /home/milan/.julia/DataFrames/src/statsmodels/formula.jl:501
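To make the "contrasts on the fly" idea concrete, here is a rough sketch in plain Julia of what filling the model matrix columns straight from a string vector could look like. Everything in it is an assumption for illustration: `dummy_columns` is a hypothetical helper, not an existing DataFrames function, and it hard-codes treatment (dummy) coding with the first sorted level as the reference.

```julia
# Hypothetical helper, not part of DataFrames: dummy-code a string column
# directly into Float64 columns, without building a CategoricalArray first.
function dummy_columns(v::AbstractVector)
    levels = sort(unique(v))                 # assumption: first sorted level is the reference
    X = zeros(Float64, length(v), length(levels) - 1)
    for (j, lev) in enumerate(levels[2:end]) # one indicator column per non-reference level
        for i in 1:length(v)
            X[i, j] = v[i] == lev ? 1.0 : 0.0
        end
    end
    return levels, X
end

# Same B column as in the reproducer above
levels, X = dummy_columns(["M", "F", "F", "M"])
```

With the `B` column from the reproducer, `levels` is `["F", "M"]` and `X` is a 4×1 matrix holding the indicator for `"M"`. A real implementation would presumably let the user choose the contrast type and level order rather than fixing them as done here.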
@kleinschmidt Comments?