Implement efficient seeking from non-null trees using tree_pos #2911
+145
−8
Description
Continuing from #2874, we want to finish moving the tree-positioning code over to use `tree_pos` efficiently. At the moment, `tsk_tree_seek` calls either `tsk_tree_seek_from_null` or `tsk_tree_seek_linear`, depending on whether we are starting from the null tree. `seek_linear` repeatedly calls `next` or `prev` until it reaches the given position, with the direction determined by which would cover the shorter distance.

As a first pass, I've implemented `tsk_tree_seek_forward` and `tsk_tree_seek_backward` and incorporated them into `tsk_tree_seek_linear`. We will need to revise some of the `test_highlevel.py` seek tests, because the direction we choose to seek along differs from the old approach in some cases. For example, we now seek forward to go from the first to the last tree in a sequence.

Curiously, my implementation passes all the C tests with no memory issues detected by Valgrind, and it also passes all the `test_highlevel.py` and `test_tree_positioning` tests except for the ones that depend on seeking direction. However, it has caused chaos with other Python tests, causing failures and segfaults in `test_stats.py` and `test_divmat.py`, among others. The failing or crashing tests seem to be primarily associated with LD calculations and divergence.

I'm currently trying to determine whether the problems are due to an error in my implementation (most likely) or to the subtle problems with the time ordering of inserted edges discussed in #2792.
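To make the change in seek direction concrete, here is a rough Python sketch of the two direction rules. It is a toy model, not the C code in this PR: `old_direction` and `new_direction` are hypothetical names, and the wrap-around distance in `old_direction` is an assumption about how "shortest distance" was measured, inferred from the first-to-last-tree example above.

```python
def old_direction(left, right, x, seq_len):
    """Pick the direction with the shorter genomic distance to x.

    (left, right) is the current tree's half-open interval and x is the
    target position. Assumption: stepping past either end of the sequence
    wraps around, so both directions can always reach x.
    """
    if left <= x < right:
        return "none"  # already on the right tree
    if x < left:
        dist_backward = left - x
        dist_forward = seq_len - right + x  # wrap past the end
    else:
        dist_forward = x - right
        dist_backward = left + seq_len - x  # wrap past the start
    return "forward" if dist_forward <= dist_backward else "backward"


def new_direction(left, right, x):
    """Always move directly towards x, never wrapping around."""
    if left <= x < right:
        return "none"
    return "forward" if x >= right else "backward"


# From the first tree [0, 10) to a position in the last tree of a
# sequence of length 100, the two rules disagree.
print(old_direction(0, 10, 95, 100))
print(new_direction(0, 10, 95))
```

Under these assumptions, tests that assert on the direction taken would disagree only when the target is closer "around the end" than it is directly, which matches the first-to-last-tree case.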
PR Checklist: