Add IBL's "noise cutoff" quality metric #3984
Conversation
for more information, see https://pre-commit.ci
some initial comments, suggestions, questions :)
sorting_analyzer=sorting_analyzer,
high_quantile=0.25,
low_quantile=0.10,
n_bins=100,
There is no discussion of what number of bins means for this metric. Might be worth adding. Why not do 10, or 1000 bins?
I have added a comment to the docs. This is also the default value used in IBL's implementation.
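For intuition, here is a small standalone sketch (synthetic amplitudes, illustrative only, not the metric itself) of the trade-off the bin count controls: too few bins smear the low-amplitude tail, too many make per-bin counts noisy.

```python
import numpy as np

rng = np.random.default_rng(0)
amps = rng.normal(50.0, 5.0, size=5000)  # synthetic spike amplitudes

# Fewer bins smear the low-amplitude tail into wide bins; more bins make
# per-bin counts noisy. For a few thousand spikes, 100 bins keeps both
# effects manageable, which matches IBL's default.
for n_bins in (10, 100, 1000):
    counts, _ = np.histogram(amps, bins=n_bins)
    print(f"{n_bins:5d} bins -> mean count/bin = {counts.mean():.1f}")
```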
if peak_sign == "both":
    raise TypeError('peak_sign should either be "pos" or "neg"!')
Shouldn't this error occur right after peak_sign, so we don't waste time running the _get_amplitudes_by_units function?
Also, how does a user know what the peak sign is? Where did it come from? I would consider making this note refer to the peak sign of the spike_amplitudes extension.
I added a comment about where to change the peak_sign in the error message, and I moved the error before _get_amplitudes_by_units.
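A minimal sketch of the fail-fast ordering discussed here (the function name and signature are illustrative, not the PR's actual code):

```python
def compute_noise_cutoffs_sketch(sorting_analyzer=None, peak_sign="neg"):
    # Validate cheap arguments first, so bad input never triggers the
    # expensive per-unit amplitude gathering.
    if peak_sign not in ("pos", "neg"):
        raise ValueError(
            'peak_sign should either be "pos" or "neg". '
            "You can set peak_sign as an argument when you compute spike_amplitudes."
        )
    # ... only now do the expensive amplitude extraction ...
    return {}
```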
amps : array-like
    Spike amplitudes.
high_quantile : float, default: 0.25
    Quantile of the amplitude range above which values are treated as "high" (e.g. 0.25 = top 25%), the reference region.
Optionally, thinking about this: should we also add a note that the top 25% is really everything above the 75th percentile? Not sure about this.
I think 'e.g. 0.25 = top 25%' makes this clear.
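The quantile/percentile relationship can be checked with a toy array (illustrative values, not the PR's code):

```python
import numpy as np

amps = np.arange(1.0, 101.0)  # toy amplitudes 1..100
high_quantile = 0.25

# "top 25%" = everything above the (1 - 0.25) = 0.75 quantile,
# i.e. above the 75th percentile.
threshold = np.quantile(amps, 1.0 - high_quantile)
high_region = amps[amps > threshold]
print(threshold)         # 75.25 for this toy array
print(high_region.size)  # 25 values: 76..100
```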
Co-authored-by: Zach McKenzie <[email protected]>
Co-authored-by: Zach McKenzie <[email protected]>
for more information, see https://pre-commit.ci
Co-authored-by: Zach McKenzie <[email protected]>
for more information, see https://pre-commit.ci
From a profiling perspective, is there any benefit to thinking about a numba/numpy dual strategy? It seems like …
    '`peak_sign` should either be "pos" or "neg". You can set `peak_sign` as an argument when you compute spike_amplitudes.'
)

amplitudes_by_units = _get_amplitudes_by_units(sorting_analyzer, unit_ids, peak_sign)
Let's keep in mind that this _get_amplitudes_by_units is not efficient, because it internally uses a mask over all units. amplitude_extension.get_data(outputs="by_units") is really faster for this; we should use it here and also inside _get_amplitudes_by_units. The only pain is that amplitudes will be split over segments, but concatenating is fast.
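A sketch of the suggested approach, assuming get_data(outputs="by_units") returns one dict per segment mapping unit id to an amplitude array (the exact return shape should be checked against the extension's documentation; the data below is a stand-in):

```python
import numpy as np

# Stand-in for amplitude_extension.get_data(outputs="by_units"):
# assumed shape is one dict per segment, {unit_id: amplitudes}.
amps_by_segment = [
    {"u0": np.array([10.0, 12.0]), "u1": np.array([8.0])},
    {"u0": np.array([11.0]), "u1": np.array([9.0, 7.5])},
]

# Concatenating the per-segment arrays is cheap compared to masking the
# whole spike vector once per unit.
amps_by_unit = {
    unit_id: np.concatenate([seg[unit_id] for seg in amps_by_segment])
    for unit_id in amps_by_segment[0]
}
print(amps_by_unit["u0"])  # [10. 12. 11.]
```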
We just tried to implement amplitude_extension.get_data(outputs="by_units") and then concatenate, but it became very complex. We propose that there should be another PR to make _get_amplitudes_by_units use the suggested approach.
Thank you for this, this is really nice.
Related to this, I don't think I've ever got …
Add a new metric "noise cutoff" based on IBL (https://figshare.com/articles/online_resource/Spike_sorting_pipeline_for_the_International_Brain_Laboratory/19705522?file=49783080) in a new function compute_noise_cutoffs. The original implementation is here: https://github.com/int-brain-lab/ibllib/blob/2e1f91c622ba8dbd04fc53946c185c99451ce5d6/brainbox/metrics/single_units.py.

The "noise cutoff" metric assesses whether a unit's spike-amplitude distribution is truncated at the low end, which may be due to a high amplitude detection threshold in the deconvolution step, i.e., whether low-amplitude spikes were missed. It does not assume a Gaussian shape; instead, it directly compares counts in the low-amplitude bins to counts in the high-amplitude bins (cutoff) and to the peak bin (ratio).

Here are examples with no truncation and with truncation, which give low and high values (cutoff and ratio), respectively.

[Image: noise_cutoff]

I have added documentation and tests.