How should we handle missing base qualities in the consensus caller? #1031
@nh13 drive-by comment here. Would there be a way to let the missing qualities truly be missing and make the consensus probability math tolerant of the missing term? When there are no base qualities, a simpler consensus calling algorithm that is unaware of base quality could be used. Then we wouldn't need a dummy value that could affect downstream probabilities and their meaning. Curious what you think.
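A rough sketch of that idea, in Python rather than fgbio's Scala and not based on fgbio's actual implementation. All names here (`call_column`, `quality_aware_call`, `quality_unaware_call`) are hypothetical, the quality-aware branch is a deliberately simplified per-column likelihood, and the quality-unaware branch is a plain majority vote chosen for illustration:

```python
import math
from collections import Counter
from typing import List, Optional


def quality_aware_call(bases: str, quals: List[int]) -> str:
    """Return the candidate base with the highest log-likelihood (simplified model)."""
    best_base, best_ll = "N", -math.inf
    for candidate in "ACGT":
        ll = 0.0
        for base, q in zip(bases, quals):
            # Phred -> error probability, clamped so that very low qualities
            # become uninformative instead of producing log(0).
            p_err = min(10 ** (-q / 10), 0.75)
            ll += math.log(1 - p_err) if base == candidate else math.log(p_err / 3)
        if ll > best_ll:
            best_base, best_ll = candidate, ll
    return best_base


def quality_unaware_call(bases: str) -> str:
    """Majority vote that ignores qualities; ties and empty columns yield N."""
    ranked = Counter(b for b in bases if b in "ACGT").most_common(2)
    if not ranked:
        return "N"
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return "N"
    return ranked[0][0]


def call_column(bases: str, quals: Optional[List[int]] = None) -> str:
    """Dispatch on whether qualities exist; None models truly missing qualities."""
    if quals is None:
        return quality_unaware_call(bases)
    return quality_aware_call(bases, quals)
```

With no qualities, `call_column("AAC")` falls back to the vote and picks `A`. With qualities `[10, 10, 40]`, the single high-quality `C` outweighs the two low-quality `A`s, which shows how the two paths can disagree and why the output probabilities would not be comparable between them.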
I think providing a better exception is the first PR, then we can look at what the best option is.
@nh13 you can't answer @clintval by saying a better exception is the next step ... and then implement a broader solution in the PR 😛 I think there are more options we might consider first, in addition to the two you have:
The more I think about it, the less I like an opt-in default base quality. The user could instead modify the BAM upstream to add default base qualities, and the tool would work. Are we endorsing such a solution if we add an option for a default base quality (#1032)?
Yeah - I think your comment further upstream is a good one: make the error much clearer that the callers don't support reads without base qualities, get that merged, then we can think about whether we want to do anything in addition.
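For illustration only, a clearer failure could look something like the following Python sketch. The `Read` type, the `require_base_qualities` helper, and the message wording are all hypothetical and are not taken from fgbio:

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Read:
    name: str
    bases: str
    quals: Optional[Sequence[int]]  # None models a read with no stored qualities


def require_base_qualities(read: Read) -> Sequence[int]:
    """Return the read's qualities, or fail with a message that names the problem."""
    if read.quals is None:
        raise ValueError(
            f"Read '{read.name}' has no base qualities; the consensus callers "
            "require base qualities on every input read."
        )
    return read.quals
```

Failing on the first such read, with the read name in the message, makes the problem visible without anyone having to read the logs.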
Consider a read where the base qualities are not stored. We could:
None, so base qualities are required, but used only when explicitly specified.
The issue with (1) is that it would force folks to filter out such reads if they want the consensus caller to work. The issue with (2) is that someone could accidentally produce a BAM file with missing qualities, fgbio would handle it gracefully, and they would have preferred to know about it (no one reads the logs).