@@ -673,6 +673,99 @@ <h3 id="_object_positions_for_incremental_midxs">Object positions for incrementa
673
673
< code > m->num_objects_in_base</ code > ).</ p >
674
674
</ div >
675
675
</ div >
676
+ < div class ="sect2 ">
677
+ < h3 id ="_pseudo_pack_order_for_incremental_midxs "> Pseudo-pack order for incremental MIDXs</ h3 >
678
+ < div class ="paragraph ">
679
+ < p > The original implementation of multi-pack reachability bitmaps defined
680
+ the pseudo-pack order in < a href ="../gitformat-pack.html "> gitformat-pack(5)</ a > (see the section
681
+ titled "multi-pack-index reverse indexes") roughly as follows:</ p >
682
+ </ div >
683
+ < div class ="quoteblock ">
684
+ < blockquote >
685
+ < div class ="paragraph ">
686
+ < p > In short, a MIDX’s pseudo-pack is the de-duplicated concatenation of
687
+ objects in packs stored by the MIDX, laid out in pack order, and the
688
+ packs arranged in MIDX order (with the preferred pack coming first).</ p >
689
+ </ div >
690
+ </ blockquote >
691
+ </ div >
692
+ < div class ="paragraph ">
693
+ < p > In the incremental MIDX design, we extend this definition to include
694
+ objects from multiple layers of the MIDX chain. The pseudo-pack order
695
+ for incremental MIDXs is determined by concatenating the pseudo-pack
696
+ ordering for each layer of the MIDX chain in order. Formally, two objects
697
+ < code > o1</ code > and < code > o2</ code > are compared as follows:</ p >
698
+ </ div >
699
+ < div class ="olist arabic ">
700
+ < ol class ="arabic ">
701
+ < li >
702
+ < p > If < code > o1</ code > appears in an earlier layer of the MIDX chain than < code > o2</ code > , then
703
+ < code > o1</ code > sorts ahead of < code > o2</ code > .</ p >
704
+ </ li >
705
+ < li >
706
+ < p > Otherwise, < code > o1</ code > and < code > o2</ code > appear in the same MIDX layer. If that
707
+ MIDX layer has no base and exactly one of < code > pack</ code > (< code > o1</ code > ) and < code > pack</ code > (< code > o2</ code > ) is
708
+ preferred, then the object in the preferred pack sorts ahead of the
709
+ other. If there is a base layer (i.e. the MIDX layer
710
+ is not the first layer in the chain) and < code > pack</ code > (< code > o1</ code > ) appears
711
+ earlier in that MIDX layer’s pack order than < code > pack</ code > (< code > o2</ code > ), then < code > o1</ code > sorts ahead of
712
+ < code > o2</ code > ; likewise, if < code > pack</ code > (< code > o2</ code > ) appears earlier, then < code > o2</ code > sorts
713
+ ahead of < code > o1</ code > .</ p >
714
+ </ li >
715
+ < li >
716
+ < p > Otherwise, < code > o1</ code > and < code > o2</ code > appear in the same pack, and thus in the
717
+ same MIDX layer. Sort < code > o1</ code > and < code > o2</ code > by their offset within their
718
+ containing packfile.</ p >
719
+ </ li >
720
+ </ ol >
721
+ </ div >
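< div class ="paragraph ">
< p > The three comparison rules above can be sketched as a comparator. This is an illustrative Python model, not Git’s implementation: the < code > Obj</ code > record, its field names, and < code > pseudo_pack_cmp</ code > are hypothetical. It also assumes that, within the first layer, objects whose packs are not distinguished by preferred status fall back to that layer’s pack order.</ p >
</ div >

```python
from dataclasses import dataclass
from functools import cmp_to_key


@dataclass(frozen=True)
class Obj:
    """Hypothetical record describing where one object lives in the chain."""
    name: str                # label used only for this illustration
    layer: int               # 0 is the first layer (the one with no base)
    pack: int                # index of the containing pack in its layer's pack order
    offset: int              # byte offset within the containing pack
    preferred: bool = False  # True if the containing pack is the preferred pack


def pseudo_pack_cmp(o1: Obj, o2: Obj) -> int:
    # Rule 1: objects in earlier layers sort first.
    if o1.layer != o2.layer:
        return -1 if o1.layer < o2.layer else 1
    # Rule 2 (layer without a base): the preferred pack's objects sort first.
    if o1.layer == 0 and o1.preferred != o2.preferred:
        return -1 if o1.preferred else 1
    # Rule 2 (continued): compare by pack order within the layer.
    # Assumption: this is also the fallback in the first layer.
    if o1.pack != o2.pack:
        return -1 if o1.pack < o2.pack else 1
    # Rule 3: same pack, so compare offsets within that pack.
    return (o1.offset > o2.offset) - (o1.offset < o2.offset)


objs = [
    Obj("d", layer=1, pack=0, offset=12),
    Obj("b", layer=0, pack=0, offset=40),
    Obj("a", layer=0, pack=1, offset=99, preferred=True),
    Obj("c", layer=0, pack=0, offset=80),
    Obj("e", layer=1, pack=1, offset=12),
]
order = [o.name for o in sorted(objs, key=cmp_to_key(pseudo_pack_cmp))]
```

< div class ="paragraph ">
< p > In this sketch, < code > a</ code > sorts first because it lives in the preferred pack of the first layer, even though its pack index and offset are larger than those of < code > b</ code > and < code > c</ code > .</ p >
</ div >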
722
+ < div class ="paragraph ">
723
+ < p > Note that the preferred pack is a property of the MIDX chain, not the
724
+ individual layers themselves. In principle, we could introduce a
725
+ per-layer preferred pack, but this is less relevant now that we can
726
+ perform multi-pack reuse across the set of packs in a MIDX.</ p >
727
+ </ div >
728
+ </ div >
729
+ < div class ="sect2 ">
730
+ < h3 id ="_reachability_bitmaps_and_incremental_midxs "> Reachability bitmaps and incremental MIDXs</ h3 >
731
+ < div class ="paragraph ">
732
+ < p > Each layer of an incremental MIDX chain may have its objects (and the
733
+ objects from any previous layer in the same MIDX chain) represented in
734
+ its own *.< code > bitmap</ code > file.</ p >
735
+ </ div >
736
+ < div class ="paragraph ">
737
+ < p > The structure of a *.< code > bitmap</ code > file belonging to an incremental MIDX
738
+ chain is identical to that of a non-incremental MIDX bitmap, or a
739
+ classic single-pack bitmap. Since objects are added to the end of the
740
+ incremental MIDX’s pseudo-pack order (see above), it is possible to
741
+ extend a bitmap when appending to the end of a MIDX chain.</ p >
742
+ </ div >
743
+ < div class ="paragraph ">
744
+ < p > (Note: it is likewise possible to compress a contiguous sequence of
745
+ incremental MIDX layers and their *.< code > bitmap</ code > files into a single layer and
746
+ *.< code > bitmap</ code > file, but this is not yet implemented.)</ p >
747
+ </ div >
748
+ < div class ="paragraph ">
749
+ < p > The object positions used are global within the pseudo-pack order, so
750
+ subsequent layers will have, for example, < code > m->num_objects_in_base</ code >
751
+ leading < code > 0</ code > bits in each of their four type bitmaps. This follows from
752
+ the fact that we only write type bitmap entries for objects present in
753
+ the layer immediately corresponding to the bitmap.</ p >
754
+ </ div >
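< div class ="paragraph ">
< p > As a rough illustration of the global positions described above, the following Python sketch builds one type bitmap for a layer as a plain list of bits. The function < code > type_bitmap_for_layer</ code > and the list representation are assumptions made for readability; real bitmaps are stored in compressed form.</ p >
</ div >

```python
def type_bitmap_for_layer(num_objects_in_base, layer_types, wanted):
    """Return a list of bits indexed by global pseudo-pack position.

    num_objects_in_base: number of objects in all earlier layers
    layer_types: object types in this layer, in pseudo-pack order
    wanted: the type this bitmap describes (e.g. "commit")
    """
    # Positions owned by earlier layers are never set in this layer's bitmap.
    bits = [0] * num_objects_in_base
    # Only objects present in this layer contribute type bits.
    bits += [1 if t == wanted else 0 for t in layer_types]
    return bits


base_types = ["commit", "tree", "blob"]  # first layer: positions 0..2
tip_types = ["commit", "blob"]           # second layer: positions 3..4

base_commits = type_bitmap_for_layer(0, base_types, "commit")
tip_commits = type_bitmap_for_layer(len(base_types), tip_types, "commit")
```

< div class ="paragraph ">
< p > Here the second layer’s commit bitmap starts with three zero bits, one per object in the base, and its first possible set bit is at global position 3.</ p >
</ div >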
755
+ < div class ="paragraph ">
756
+ < p > Note also that only the bitmap pertaining to the most recent layer in an
757
+ incremental MIDX chain is used to store reachability information about
758
+ the interesting and uninteresting objects in a reachability query.
759
+ Earlier bitmap layers are only used to look up commit and pseudo-merge
760
+ bitmaps from that layer, as well as the type-level bitmaps for objects
761
+ in that layer.</ p >
762
+ </ div >
763
+ < div class ="paragraph ">
764
+ < p > To simplify the implementation, type-level bitmaps are iterated
765
+ simultaneously, and their results are OR’d together to avoid recursively
766
+ calling internal bitmap functions.</ p >
767
+ </ div >
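< div class ="paragraph ">
< p > One way to picture the OR step is the loop below. It is a minimal Python sketch under stated assumptions: each layer’s type bitmap is modeled as an integer whose bit < code > i</ code > stands for global position < code > i</ code > , and < code > or_type_bitmaps</ code > is a hypothetical helper, not a function from Git.</ p >
</ div >

```python
def or_type_bitmaps(layer_bitmaps):
    """Combine per-layer type bitmaps into one bitmap for the whole chain.

    layer_bitmaps: one integer per layer; bit i corresponds to global
    pseudo-pack position i. Each layer only sets bits for its own objects.
    """
    result = 0
    # Walk the layers in a flat loop instead of recursing into each base.
    for bitmap in layer_bitmaps:
        result |= bitmap
    return result


base_commits = 0b00001  # position 0 is a commit in the first layer
tip_commits = 0b01000   # position 3 is a commit in the second layer
all_commits = or_type_bitmaps([base_commits, tip_commits])
```

< div class ="paragraph ">
< p > Because each layer leaves the positions of earlier layers unset, the per-layer bitmaps never overlap and the OR yields the type bitmap for the full chain.</ p >
</ div >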
768
+ </ div >
676
769
</ div >
677
770
</ div >
678
771
< div class ="sect1 ">
@@ -681,17 +774,6 @@ <h2 id="_future_work">Future Work</h2>
681
774
< div class ="ulist ">
682
775
< ul >
683
776
< li >
684
- < p > The multi-pack-index allows many packfiles, especially in a context
685
- where repacking is expensive (such as a very large repo), or
686
- unexpected maintenance time is unacceptable (such as a high-demand
687
- build machine). However, the multi-pack-index needs to be rewritten
688
- in full every time. We can extend the format to be incremental, so
689
- writes are fast. By storing a small "tip" multi-pack-index that
690
- points to large "base" MIDX files, we can keep writes fast while
691
- still reducing the number of binary searches required for object
692
- lookups.</ p >
693
- </ li >
694
- < li >
695
777
< p > If the multi-pack-index is extended to store a "stable object order"
696
778
(a function Order(hash) = integer that is constant for a given hash,
697
779
even as the multi-pack-index is updated) then MIDX bitmaps could be
@@ -731,7 +813,7 @@ <h2 id="_related_links">Related Links</h2>
731
813
</ div >
732
814
< div id ="footer ">
733
815
< div id ="footer-text ">
734
- Last updated 2025-02-14 21:38:14 -0800
816
+ Last updated 2025-04-08 12:12:27 -0700
735
817
</ div >
736
818
</ div >
737
819
</ body >