Create formula interface for RMST test #216
Hello John, I am thrilled to see the remarkable progress that has been achieved! I only have one minor comment to add. This topic would make an excellent presentation for Friday's meeting, where we can gather input and suggestions regarding the formula choice between `month ~ event + trt` and `Surv(month, event, trt)`.
Thank you for your diligent efforts in moving this forward. It's truly outstanding work. I am fine with merging this pull request first, considering the numerous changes we already have in place.
R/rmst.R
Outdated
#'
#' # Formula interface
#' rmst(
#'   month ~ evntd + trt,
I have some concerns regarding whether this formula makes statistical sense. Typically, the `~` symbol is used when fitting a linear model, as in `y ~ x1 + x2`. However, in the context of `rmst()`, where the object is a survival object, I am wondering if something like a `Surv(month, event, trt)` object would be more appropriate and meaningful.
Reference: https://www.rdocumentation.org/packages/survival/versions/2.11-4/topics/Surv
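For comparison, here is a minimal sketch of how the survival package itself pairs `Surv()` with a formula: `Surv()` wraps the time and the event indicator on the left-hand side, and the grouping variable sits on the right. The `aml` dataset and its columns (`time`, `status`, `x`) ship with survival; none of this is code from the PR.

```r
library(survival)

# Conventional survival formula: Surv(time, event) ~ group.
# `aml` is an example dataset bundled with the survival package.
fit <- survfit(Surv(time, status) ~ x, data = aml)

# One Kaplan-Meier curve is fitted per level of `x`.
names(fit$strata)
```

Under this convention the treatment variable is not an argument to `Surv()`, which is one point worth raising when the two candidate syntaxes are compared.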
That example is meant to demonstrate that only the order matters: `month ~ evntd + trt` is equivalent to `Surv(month, event, trt)`. I'm open to updating the documentation to make this clearer to end users. I did my best to explain in the Details section that the formula syntax chosen is purely to indicate intent.
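A minimal sketch of what "only the order matters" could look like in code, assuming the variable names are read positionally with `all.vars()`. The helper name `parse_rmst_formula()` and the list element names are hypothetical and are not taken from the PR.

```r
# Hypothetical helper (not part of the package): read variable names from a
# two-sided formula purely by position, ignoring what the operators mean.
parse_rmst_formula <- function(formula) {
  stopifnot(inherits(formula, "formula"), length(formula) == 3L)
  vars <- all.vars(formula)
  if (length(vars) != 3L) {
    stop("Expected exactly three variables, e.g. month ~ evntd + trt")
  }
  # Position 1: time-to-event, position 2: event indicator, position 3: group
  list(time = vars[1], event = vars[2], group = vars[3])
}

res <- parse_rmst_formula(month ~ evntd + trt)
str(res)
```

Because the helper never evaluates the formula, `~` and `+` carry no modeling meaning here; they only separate the three names.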
I attempted to address your feedback in 3ccb493
Thanks, John!
This is a great start. Just a few design and style comments that can be easily addressed once the decision is made:
These are all objections to making
I worry about the implementation. Dealing with formulas/language objects is difficult to do robustly. I think the end users need to know that
Advantages:
* Less complex
* Able to use more informative arg names like `data` and `formula`
* Still easily pipe `data` as first arg even when using formula arg
Looks great to me, thanks!
Looking great! Thanks for the update that simplifies things 💯
Closes #189
xref: #188 (comment)
I added a simple formula interface, `rmst.formula`, and converted `rmst()` to an S3 generic function. I say simple because it only extracts the variable names from the formula, and then uses these to call `rmst.default()`. Converting to the S3 generic required changing the first argument name to something that could stand for a data frame or a formula. I chose `x` since this is a common S3 pattern (e.g., see `?aggregate`). The main downside I see of making `rmst()` an S3 generic is that its first argument is no longer `data` like all the other test functions like `wlr()` and `maxcombo()`.

UPDATE: After feedback, I instead added an additional argument `formula = NULL` to the existing `rmst()` function.

Please give it a try and let me know how it can be improved.
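To make the `formula = NULL` design concrete, here is a rough sketch under stated assumptions. `rmst_sketch()` is a hypothetical stand-in, not the package's `rmst()`; the `time`, `event`, and `group` argument names and their defaults are invented for illustration, and the RMST computation itself is replaced by a placeholder that only reports which columns were resolved.

```r
# Hypothetical stand-in for the real function. Only `data` and `formula`
# come from the discussion; every other argument name is an assumption.
rmst_sketch <- function(data, formula = NULL,
                        time = "tte", event = "event", group = "treatment") {
  if (!is.null(formula)) {
    # The formula only supplies column names, read by position.
    vars <- all.vars(formula)
    stopifnot(length(vars) == 3L)
    time <- vars[1]
    event <- vars[2]
    group <- vars[3]
  }
  missing_cols <- setdiff(c(time, event, group), names(data))
  if (length(missing_cols) > 0L) {
    stop("Columns not found in data: ", paste(missing_cols, collapse = ", "))
  }
  # Placeholder for the actual RMST computation: return the resolved columns.
  c(time = time, event = event, group = group)
}

toy <- data.frame(
  month = c(1, 3, 5, 7),
  evntd = c(1, 0, 1, 1),
  trt = c("a", "a", "b", "b")
)

# Both call styles resolve to the same columns.
by_formula <- rmst_sketch(toy, formula = month ~ evntd + trt)
by_names <- rmst_sketch(toy, time = "month", event = "evntd", group = "trt")

# `data` stays the first argument, so piping still works (native pipe, R >= 4.1).
piped <- toy |> rmst_sketch(formula = month ~ evntd + trt)
```

Keeping `data` first and making the formula optional means no S3 dispatch on the first argument is needed, which is the simplification the review asked for.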