Free space management initial PR #5205

Open: wants to merge 41 commits into base: master

Contributor

@royi-luo royi-luo commented Apr 7, 2025

Description

This PR adds the main infrastructure needed to support free space management. In this PR, space is only reclaimed after out-of-place updates of column chunks; reclaiming space after other operations (e.g. dropping columns) will be added in future PRs.

New classes:

  • PageChunkManager: All data file page allocations are routed through this class. Consumers of this class will request sequential chunks of pages from this class, which will either be allocated by appending new pages to the data file or by reallocating free pages. This class is also used by consumers to free pages when they are no longer needed.
  • FreeSpaceManager: Keeps track of chunks of pages that are freed. When page allocations are requested, this class can be used to reallocate previously freed pages. This class should not be used directly; instead, calls should be made to the PageChunkManager, which will handle accesses to this class.
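
To illustrate the division of responsibilities described above, here is a minimal sketch, not the code in this PR. The names `PageChunkManagerSketch`, `FreeSpaceManagerSketch`, `PageChunkEntry`, `popFreeChunk` and `getNumPagesInFile` are hypothetical, the first-fit search is a stand-in for whatever lookup the real FreeSpaceManager uses, and splitting an oversized free chunk is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified types; the real classes in the PR differ.
using page_idx_t = uint64_t;

struct PageChunkEntry {
    page_idx_t startPageIdx;
    page_idx_t numPages;
};

// Tracks freed chunks. Uses a plain first-fit scan purely for illustration.
class FreeSpaceManagerSketch {
public:
    void addFreeChunk(PageChunkEntry entry) { freeChunks.push_back(entry); }

    // Returns a chunk of exactly numPages pages if a large enough free chunk exists.
    std::optional<PageChunkEntry> popFreeChunk(page_idx_t numPages) {
        for (std::size_t i = 0; i < freeChunks.size(); ++i) {
            if (freeChunks[i].numPages >= numPages) {
                const PageChunkEntry found = freeChunks[i];
                freeChunks.erase(freeChunks.begin() + static_cast<std::ptrdiff_t>(i));
                // Assumption: any unused tail goes back to the free list.
                if (found.numPages > numPages) {
                    freeChunks.push_back(
                        {found.startPageIdx + numPages, found.numPages - numPages});
                }
                return PageChunkEntry{found.startPageIdx, numPages};
            }
        }
        return std::nullopt;
    }

private:
    std::vector<PageChunkEntry> freeChunks;
};

// Single entry point for allocating and freeing sequential chunks of pages.
class PageChunkManagerSketch {
public:
    PageChunkEntry allocatePageChunk(page_idx_t numPages) {
        assert(numPages > 0);
        // Prefer reusing freed pages; otherwise append to the end of the file.
        if (auto reused = fsm.popFreeChunk(numPages)) {
            return *reused;
        }
        const PageChunkEntry appended{numPagesInFile, numPages};
        numPagesInFile += numPages;
        return appended;
    }

    void freePageChunk(PageChunkEntry entry) { fsm.addFreeChunk(entry); }

    page_idx_t getNumPagesInFile() const { return numPagesInFile; }

private:
    FreeSpaceManagerSketch fsm; // only reachable through this class
    page_idx_t numPagesInFile = 0;
};
```

Consumers in this sketch only ever hold a `PageChunkManagerSketch`; the free list is a private member, which mirrors the rule that the FreeSpaceManager is not used directly.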

Contributor agreement

@royi-luo royi-luo self-assigned this Apr 7, 2025
@royi-luo royi-luo changed the title from "Free space management for out of place updates" to "Free space management initial PR" Apr 7, 2025
bool isInMemoryMode() const;
};

class PageChunkManager {
Contributor Author

Not sure if this is the best name for this; let me know if you have better ideas.

metadata.numPages += (newNumPages - numPages);
const auto numNewPages =
writeValues(state, metadata.numValues, data, nullChunkData, 0 /*dataOffset*/, numValues);
KU_ASSERT(numNewPages == 0);
Contributor Author

Is there actually a case where we append pages here? I only see this function being called when appending to dictionary chunks for in-place string writes, which I don't think should add new pages to the data file (the pages should all have been added when the data/offset chunks were initially allocated). I also haven't hit the assertion I added when running tests.

I feel like if there is a case where we do append pages, this way is wrong, since updateShadowedPageWithCursor just appends to the end of the file, which doesn't guarantee that all the pages for a data/offset chunk are contiguous. So if we don't use this logic, I'll just remove it.

Contributor

@benjaminwinger can you double check here?

@@ -20,16 +20,16 @@ struct ChunkCheckpointState {
struct ColumnCheckpointState {
ColumnChunkData& persistentData;
std::vector<ChunkCheckpointState> chunkCheckpointStates;
- common::row_idx_t maxRowIdxToWrite;
+ common::row_idx_t endRowIdxToWrite;
Contributor Author

Renamed this from maxRowIdxToWrite to endRowIdxToWrite. This value is exclusive, and the misleading name was causing it to be used incorrectly (inclusively).
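
For readers less familiar with the convention, the following small sketch shows what treating the value as an exclusive end means in practice. The helper names `numRowsToWrite` and `isRowInWriteRange` and the `startRowIdx` parameter are hypothetical and not part of the PR; only the exclusive meaning of `endRowIdxToWrite` comes from the comment above.

```cpp
#include <cassert>
#include <cstdint>

using row_idx_t = uint64_t;

// Rows to write form the half-open range [startRowIdx, endRowIdxToWrite).
inline row_idx_t numRowsToWrite(row_idx_t startRowIdx, row_idx_t endRowIdxToWrite) {
    assert(startRowIdx <= endRowIdxToWrite);
    // No "+ 1" here: that would be the inclusive (max) interpretation.
    return endRowIdxToWrite - startRowIdx;
}

inline bool isRowInWriteRange(row_idx_t rowIdx, row_idx_t startRowIdx,
    row_idx_t endRowIdxToWrite) {
    // The end index itself is NOT part of the range.
    return rowIdx >= startRowIdx && rowIdx < endRowIdxToWrite;
}
```

With an inclusive reading, a range ending at row 10 would cover one row more than intended, which is the kind of off-by-one the rename is meant to prevent.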

codecov bot commented Apr 8, 2025

Codecov Report

Attention: Patch coverage is 85.16129% with 46 lines in your changes missing coverage. Please review.

Project coverage is 86.95%. Comparing base (34852cc) to head (7a3acd2).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/function/table/free_space_info.cpp | 25.80% | 23 Missing ⚠️ |
| src/storage/free_space_manager.cpp | 79.20% | 21 Missing ⚠️ |
| src/include/storage/page_manager.h | 66.66% | 2 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5205      +/-   ##
==========================================
- Coverage   86.97%   86.95%   -0.02%     
==========================================
  Files        1405     1411       +6     
  Lines       61629    61864     +235     
  Branches     7541     7570      +29     
==========================================
+ Hits        53603    53796     +193     
- Misses       7853     7895      +42     
  Partials      173      173              

Contributor Author

royi-luo commented Apr 8, 2025

Benchmarks

I tested three FSM strategies:

  1. If a freed chunk contains n pages, it is put in the largest level whose size doesn't exceed n. For example, a chunk with 5 pages would go to level 2 (size 4 pages). When looking for a chunk of size n, we go to the same level and look for a chunk that is large enough. If there is none, we go up a level and try again.
  2. We only allocate chunk sizes that are powers of 2, so if a chunk with 5 pages is requested we allocate a chunk of size 8. This allows us to use the same level scheme when searching the FSM while avoiding iteration.
  3. FSM is disabled.

I've attached the testing script here: test_fsm.txt
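
The level arithmetic behind strategies 1 and 2 can be sketched as follows. This is only an illustration of the description above, assuming level k corresponds to 2^k pages; the function names `levelForChunk` and `roundUpToPowerOfTwo` are hypothetical and are not taken from the PR.

```cpp
#include <cassert>
#include <cstdint>

// Strategy 1: the largest level whose size (2^level pages) does not exceed numPages,
// i.e. floor(log2(numPages)).
inline uint32_t levelForChunk(uint64_t numPages) {
    assert(numPages > 0);
    uint32_t level = 0;
    // Stop at level 63 so the shift amount never reaches the width of the type.
    while (level < 63 && (numPages >> (level + 1)) != 0) {
        ++level;
    }
    return level;
}

// Strategy 2: round the requested size up to the next power of two, so every
// chunk in a level has exactly the same size and no scan within a level is needed.
inline uint64_t roundUpToPowerOfTwo(uint64_t numPages) {
    assert(numPages > 0 && numPages <= (uint64_t{1} << 63));
    uint64_t size = 1;
    while (size < numPages) {
        size <<= 1;
    }
    return size;
}
```

For a 5-page chunk, `levelForChunk(5)` gives level 2, while `roundUpToPowerOfTwo(5)` gives 8 pages, so strategy 2 spends 3 extra pages on that allocation in exchange for avoiding iteration.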

Baselines

Below are the database sizes of some datasets after a single COPY (no FSM triggered):

| Dataset | # of used pages | data.kz size |
|---|---|---|
| ldbc-sf100 Comment (first 1M rows) | 12052 | 84MB |
| ldbc-sf100 comment_replyOf_comment (first 2M rows) | 3891 | 822MB |
| 100K random values from 0-65535 | 101 | 6.5MB |

Results

| Scenario | Strategy | # of used pages | # of free pages | data.kz size | Execution time | Notes |
|---|---|---|---|---|---|---|
| Start with 300k values from 0-255, update values to 0-65535 in batches of 10 (2 batches) | 1 | 450 | 110 | 15MB | 4.70s | Free entries large (27, 64, 269 pages) |
| ^ | 2 | 608 | 0 | 15MB | 4.75s | |
| ^ | 3 | 450 | 0 | 113MB | 4.76s | |
| Start with 300k values from 0-65535, update values to 0-255 in batches of 10k (30 batches) | 1 | 208 | 96 | 13MB | 0.84s | Free entries large (32, 64 pages) |
| ^ | 2 | 352 | 81 | 14MB | 0.79s | Free entries sizes 1, 16, 64 pages |
| ^ | 3 | 208 | 0 | 20MB | 0.78s | |
| Insert (create) 100K values between 0-65535 in batches of size 1k | 1 | 101 | 132 | 7.0MB | 3.03s | |
| ^ | 2 | 128 | 112 | 7.1MB | 7.66s | |
| ^ | 3 | 101 | 0 | 16MB | 2.81s | |
| Batch insert (copy) 100K values between 0-65535 in batches of size 1k | 1 | 101 | 132 | 7.0MB | 3.46s | |
| ^ | 2 | 128 | 112 | 7.1MB | 6.91s | |
| ^ | 3 | 101 | 0 | 16MB | 3.19s | |
| Batch insert ldbc-sf1 Comment in batches of size 1k | 1 | 2148 | 513 | 20MB | 5.85s | |
| ^ | 2 | 3525 | 690 | 30MB | 12.62s | |
| ^ | 3 | 2148 | 0 | 383MB | 5.94s | |
| Batch insert first 1M rows of ldbc-sf100 Comment in batches of size 20k | 1 | 12153 | 1025 | 95MB | 7.03s | 15/38 free entries are 5 pages or less, 512/1025 free pages are in a single free entry |
| ^ | 2 | 13808 | 608 | 103MB | 13.24s | Less small free entries (4x16 pages, 3x32 pages, 1x64 pages, 1x128 pages, 1x256 pages) |
| ^ | 3 | 12153 | 0 | 277MB | 6.63s | |
| Batch insert first 2M rows of ldbc-sf100 comment_replyOf_comment in batches of size 10k | 1 | 7572 | 925 | 847MB | 174.49s | |
| ^ | 2 | 8936 | 62 | 940MB | 178.14s | |
| ^ | 3 | 7572 | 0 | 1006MB | 173.86s | |

Observations

The space penalty for always allocating a power-of-2 number of pages generally outweighs the benefit of potentially having less fragmentation (even when fewer free pages are left, the database size is still larger). Since the FSM seems to have a minimal impact on query times, strategy 2 is not worth it. Instead I went with the first strategy, since the additional time penalty of needing to iterate is minimal.

One thing to keep in mind, though, is that the first strategy results in many tiny free chunks accumulating over time. Although these chunks may have a minimal impact on the space footprint, their small size means there is a low chance of them being reallocated. We should keep this in mind when we design our VACUUM mechanism.

github-actions bot commented Apr 8, 2025

Benchmark Result

Master commit hash: fd6ac8d83498bd09ba7cde0aedced867afbd1db4
Branch commit hash: db51bf680e39fc08227001fe3aa7ac95740d5598

Query Group Query Name Mean Time - Commit (ms) Mean Time - Master (ms) Diff
aggregation q24 717.04 725.55 -8.51 (-1.17%)
aggregation q28 6556.38 6483.69 72.69 (1.12%)
copy node-Comment 190213.60 N/A N/A
copy node-Forum 13491.67 N/A N/A
copy node-Organisation 1279.59 N/A N/A
copy node-Person 2981.55 N/A N/A
copy node-Place 1237.58 N/A N/A
copy node-Post 91105.19 N/A N/A
copy node-Tag 1377.08 N/A N/A
copy node-Tagclass 1233.70 N/A N/A
copy rel-comment-hasCreator 135628.17 N/A N/A
copy rel-comment-hasTag 106528.00 N/A N/A
copy rel-comment-isLocatedIn 77823.21 N/A N/A
copy rel-containerOf 24739.44 N/A N/A
copy rel-forum-hasTag 5049.19 N/A N/A
copy rel-hasInterest 6022.55 N/A N/A
copy rel-hasMember 93764.37 N/A N/A
copy rel-hasModerator 1373.64 N/A N/A
copy rel-hasType 323.41 N/A N/A
copy rel-isPartOf 337.25 N/A N/A
copy rel-isSubclassOf 359.55 N/A N/A
copy rel-knows 21370.41 N/A N/A
copy rel-likes-comment 173920.34 N/A N/A
copy rel-likes-post 68609.92 N/A N/A
copy rel-organisation-isLocatedIn 385.95 N/A N/A
copy rel-person-isLocatedIn 775.86 N/A N/A
copy rel-post-hasCreator 19817.98 N/A N/A
copy rel-post-hasTag 34774.63 N/A N/A
copy rel-post-isLocatedIn 22752.72 N/A N/A
copy rel-replyOf-comment 68130.25 N/A N/A
copy rel-replyOf-post 52714.71 N/A N/A
copy rel-studyAt 821.44 N/A N/A
copy rel-workAt 1155.65 N/A N/A
filter q14 117.67 125.91 -8.24 (-6.54%)
filter q15 120.81 121.60 -0.79 (-0.65%)
filter q16 336.49 343.84 -7.34 (-2.14%)
filter q17 440.11 446.00 -5.89 (-1.32%)
filter q18 1879.70 1901.36 -21.66 (-1.14%)
filter zonemap-node 81.46 88.75 -7.29 (-8.22%)
filter zonemap-node-lhs-cast 81.17 89.17 -8.00 (-8.97%)
filter zonemap-node-null 80.97 88.61 -7.64 (-8.62%)
filter zonemap-rel 5434.00 5322.62 111.38 (2.09%)
fixed_size_expr_evaluator q07 680.22 678.66 1.56 (0.23%)
fixed_size_expr_evaluator q08 963.39 964.34 -0.95 (-0.10%)
fixed_size_expr_evaluator q09 967.66 959.23 8.43 (0.88%)
fixed_size_expr_evaluator q10 253.54 253.94 -0.40 (-0.16%)
fixed_size_expr_evaluator q11 254.43 255.11 -0.68 (-0.27%)
fixed_size_expr_evaluator q12 232.17 232.77 -0.60 (-0.26%)
fixed_size_expr_evaluator q13 1555.75 1567.36 -11.61 (-0.74%)
fixed_size_seq_scan q23 112.02 112.81 -0.79 (-0.70%)
join q29 717.17 698.13 19.04 (2.73%)
join q30 1707.66 1576.50 131.16 (8.32%)
join q31 4.96 4.33 0.63 (14.50%)
join SelectiveTwoHopJoin 49.59 42.05 7.53 (17.91%)
ldbc_snb_ic q35 7.66 10.83 -3.17 (-29.24%)
ldbc_snb_ic q36 85.25 99.82 -14.57 (-14.60%)
ldbc_snb_is q32 4.00 6.25 -2.25 (-35.97%)
ldbc_snb_is q33 12.27 13.24 -0.97 (-7.33%)
ldbc_snb_is q34 1.20 1.28 -0.09 (-6.86%)
multi-rel multi-rel-large-scan 1678.78 1725.36 -46.58 (-2.70%)
multi-rel multi-rel-lookup 11.86 11.59 0.26 (2.27%)
multi-rel multi-rel-small-scan 204.43 189.32 15.11 (7.98%)
order_by q25 120.31 131.03 -10.71 (-8.18%)
order_by q26 434.19 448.12 -13.92 (-3.11%)
order_by q27 1379.50 1392.45 -12.96 (-0.93%)
recursive_join recursive-join-bidirection 273.33 290.23 -16.91 (-5.82%)
recursive_join recursive-join-dense 5451.30 7036.75 -1585.44 (-22.53%)
recursive_join recursive-join-path 22655.58 23312.03 -656.45 (-2.82%)
recursive_join recursive-join-sparse 633.53 631.92 1.61 (0.25%)
recursive_join recursive-join-trail 5971.96 6994.60 -1022.63 (-14.62%)
scan_after_filter q01 162.45 168.09 -5.64 (-3.35%)
scan_after_filter q02 146.37 153.58 -7.21 (-4.69%)
shortest_path_ldbc100 q37 82.91 83.32 -0.41 (-0.49%)
shortest_path_ldbc100 q38 329.11 281.85 47.26 (16.77%)
shortest_path_ldbc100 q39 59.60 62.24 -2.64 (-4.25%)
shortest_path_ldbc100 q40 394.49 338.04 56.46 (16.70%)
var_size_expr_evaluator q03 2116.59 2114.65 1.93 (0.09%)
var_size_expr_evaluator q04 2205.20 2206.17 -0.97 (-0.04%)
var_size_expr_evaluator q05 2674.12 2682.23 -8.11 (-0.30%)
var_size_expr_evaluator q06 1351.03 1355.14 -4.11 (-0.30%)
var_size_seq_scan q19 1424.87 1439.01 -14.14 (-0.98%)
var_size_seq_scan q20 2332.31 2357.12 -24.81 (-1.05%)
var_size_seq_scan q21 2239.26 2267.05 -27.80 (-1.23%)
var_size_seq_scan q22 123.42 126.04 -2.62 (-2.08%)

@royi-luo royi-luo marked this pull request as ready for review April 8, 2025 21:12
@royi-luo royi-luo requested a review from benjaminwinger as a code owner April 8, 2025 21:12
@royi-luo royi-luo requested a review from ray6080 April 8, 2025 21:12
github-actions bot commented Apr 8, 2025

Benchmark Result

Master commit hash: fd6ac8d83498bd09ba7cde0aedced867afbd1db4
Branch commit hash: 3b117b92c9ba8574a35d2bb83a46c603537d8b6e

Query Group Query Name Mean Time - Commit (ms) Mean Time - Master (ms) Diff
aggregation q24 720.81 725.55 -4.75 (-0.65%)
aggregation q28 6574.38 6483.69 90.69 (1.40%)
copy node-Comment 135576.26 N/A N/A
copy node-Forum 5763.35 N/A N/A
copy node-Organisation 1313.99 N/A N/A
copy node-Person 2226.41 N/A N/A
copy node-Place 1243.94 N/A N/A
copy node-Post 33085.01 N/A N/A
copy node-Tag 1313.26 N/A N/A
copy node-Tagclass 1218.27 N/A N/A
copy rel-comment-hasCreator 131217.38 N/A N/A
copy rel-comment-hasTag 109771.64 N/A N/A
copy rel-comment-isLocatedIn 63897.62 N/A N/A
copy rel-containerOf 25534.72 N/A N/A
copy rel-forum-hasTag 3785.61 N/A N/A
copy rel-hasInterest 6032.19 N/A N/A
copy rel-hasMember 93058.16 N/A N/A
copy rel-hasModerator 1732.29 N/A N/A
copy rel-hasType 336.08 N/A N/A
copy rel-isPartOf 328.58 N/A N/A
copy rel-isSubclassOf 320.91 N/A N/A
copy rel-knows 12956.05 N/A N/A
copy rel-likes-comment 145866.64 N/A N/A
copy rel-likes-post 58082.24 N/A N/A
copy rel-organisation-isLocatedIn 282.52 N/A N/A
copy rel-person-isLocatedIn 596.61 N/A N/A
copy rel-post-hasCreator 18476.69 N/A N/A
copy rel-post-hasTag 22371.23 N/A N/A
copy rel-post-isLocatedIn 19320.90 N/A N/A
copy rel-replyOf-comment 61249.38 N/A N/A
copy rel-replyOf-post 43797.90 N/A N/A
copy rel-studyAt 695.18 N/A N/A
copy rel-workAt 890.82 N/A N/A
filter q14 118.77 125.91 -7.14 (-5.67%)
filter q15 118.75 121.60 -2.86 (-2.35%)
filter q16 336.40 343.84 -7.43 (-2.16%)
filter q17 437.35 446.00 -8.65 (-1.94%)
filter q18 1901.42 1901.36 0.07 (0.00%)
filter zonemap-node 81.36 88.75 -7.39 (-8.33%)
filter zonemap-node-lhs-cast 82.60 89.17 -6.57 (-7.37%)
filter zonemap-node-null 81.62 88.61 -6.99 (-7.89%)
filter zonemap-rel 5617.84 5322.62 295.22 (5.55%)
fixed_size_expr_evaluator q07 678.88 678.66 0.22 (0.03%)
fixed_size_expr_evaluator q08 965.31 964.34 0.98 (0.10%)
fixed_size_expr_evaluator q09 967.54 959.23 8.32 (0.87%)
fixed_size_expr_evaluator q10 253.62 253.94 -0.32 (-0.13%)
fixed_size_expr_evaluator q11 254.52 255.11 -0.59 (-0.23%)
fixed_size_expr_evaluator q12 231.61 232.77 -1.16 (-0.50%)
fixed_size_expr_evaluator q13 1552.61 1567.36 -14.75 (-0.94%)
fixed_size_seq_scan q23 112.51 112.81 -0.30 (-0.27%)
join q29 741.84 698.13 43.70 (6.26%)
join q30 1613.16 1576.50 36.67 (2.33%)
join q31 5.24 4.33 0.91 (21.12%)
join SelectiveTwoHopJoin 41.68 42.05 -0.38 (-0.89%)
ldbc_snb_ic q35 10.46 10.83 -0.37 (-3.42%)
ldbc_snb_ic q36 92.75 99.82 -7.07 (-7.09%)
ldbc_snb_is q32 3.85 6.25 -2.40 (-38.43%)
ldbc_snb_is q33 10.74 13.24 -2.49 (-18.84%)
ldbc_snb_is q34 1.28 1.28 -0.00 (-0.36%)
multi-rel multi-rel-large-scan 1716.07 1725.36 -9.29 (-0.54%)
multi-rel multi-rel-lookup 6.69 11.59 -4.91 (-42.32%)
multi-rel multi-rel-small-scan 206.46 189.32 17.14 (9.05%)
order_by q25 128.62 131.03 -2.41 (-1.84%)
order_by q26 435.77 448.12 -12.34 (-2.75%)
order_by q27 1374.98 1392.45 -17.47 (-1.25%)
recursive_join recursive-join-bidirection 267.30 290.23 -22.94 (-7.90%)
recursive_join recursive-join-dense 6996.73 7036.75 -40.02 (-0.57%)
recursive_join recursive-join-path 23341.97 23312.03 29.94 (0.13%)
recursive_join recursive-join-sparse 655.53 631.92 23.61 (3.74%)
recursive_join recursive-join-trail 6951.42 6994.60 -43.18 (-0.62%)
scan_after_filter q01 163.32 168.09 -4.77 (-2.84%)
scan_after_filter q02 150.17 153.58 -3.40 (-2.22%)
shortest_path_ldbc100 q37 77.33 83.32 -5.99 (-7.19%)
shortest_path_ldbc100 q38 321.01 281.85 39.15 (13.89%)
shortest_path_ldbc100 q39 57.78 62.24 -4.46 (-7.17%)
shortest_path_ldbc100 q40 382.26 338.04 44.22 (13.08%)
var_size_expr_evaluator q03 2147.52 2114.65 32.87 (1.55%)
var_size_expr_evaluator q04 2202.81 2206.17 -3.36 (-0.15%)
var_size_expr_evaluator q05 2679.87 2682.23 -2.36 (-0.09%)
var_size_expr_evaluator q06 1345.27 1355.14 -9.87 (-0.73%)
var_size_seq_scan q19 1423.06 1439.01 -15.95 (-1.11%)
var_size_seq_scan q20 2583.15 2357.12 226.04 (9.59%)
var_size_seq_scan q21 2231.90 2267.05 -35.16 (-1.55%)
var_size_seq_scan q22 123.76 126.04 -2.28 (-1.81%)

const auto pageIdx = entry.startPageIdx + i;
fileHandle->removePageFromFrameIfNecessary(pageIdx);
}
freeSpaceManager->addFreeChunk(entry);
Contributor Author

@royi-luo royi-luo Apr 9, 2025

Making a freed chunk immediately rewritable may not be fully robust, so I may need to change this to only make freed chunks rewritable after a checkpoint completes (the reasoning should be similar to why we have shadow files). This will make the FSM slightly less efficient, though.

Otherwise, a case like the following might happen:
begin out-of-place update -> free existing chunk -> reclaim the same chunk -> start writing -> DB crashes
When the DB is reloaded, part of the existing chunk will already have been overwritten, so there is no way to recover.
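
One way the deferred approach described above could look is sketched below. This is a hypothetical illustration, not what the PR currently does: the class `DeferredFreeListSketch` and its methods are made up for the example, and the sketch assumes the caller signals when a checkpoint has completed or been abandoned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct PageChunkEntry {
    uint64_t startPageIdx;
    uint64_t numPages;
};

// Freed chunks are parked in a pending list and only become reusable once the
// checkpoint that freed them has completed.
class DeferredFreeListSketch {
public:
    // Called when an out-of-place update frees the old chunk. The old pages
    // still hold the last durable copy of the data, so they must not be reused yet.
    void addUncheckpointedFreeChunk(PageChunkEntry entry) { pending.push_back(entry); }

    // Called after the checkpoint has completed: the old pages are no longer
    // needed for recovery, so they can be handed out again.
    void finalizeCheckpoint() {
        reusable.insert(reusable.end(), pending.begin(), pending.end());
        pending.clear();
    }

    // Called if the checkpoint is abandoned: the old chunks are still live,
    // so they never enter the reusable list.
    void rollbackCheckpoint() { pending.clear(); }

    std::size_t numPending() const { return pending.size(); }
    std::size_t numReusable() const { return reusable.size(); }

private:
    std::vector<PageChunkEntry> pending;
    std::vector<PageChunkEntry> reusable;
};
```

In the crash scenario above, the freed chunk would still be in the pending list when the DB crashes, so nothing could have overwritten it and the previous contents remain intact on reload. The cost is that pages freed during a checkpoint cannot be reused within that same checkpoint.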

Contributor

@ray6080 ray6080 left a comment

Thanks! This looks good to me at a high level. I have some initial comments here, and I'd like to take a closer look after they're addressed.

@@ -234,6 +235,9 @@ class KUZU_API ColumnChunkData {

void updateStats(const common::ValueVector* vector, const common::SelectionView& selVector);

virtual void reclaimAllocatedPages(PageChunkManager& pageChunkManager,
Contributor

I don't think this function is necessary as the reclaim is just based on ChunkState, and you don't need to go inside ColumnChunkData for that as the ChunkState is already a function param.

Contributor Author

moved logic to ChunkState

github-actions bot commented Apr 9, 2025

Benchmark Result

Master commit hash: 34852cc8eb1df3a7b2319dc42ffdae293311f910
Branch commit hash: e15c4bbc49b00f8de2222bdfba4fa9d935aee761

Query Group Query Name Mean Time - Commit (ms) Mean Time - Master (ms) Diff
aggregation q24 744.16 741.23 2.93 (0.40%)
aggregation q28 6737.19 6582.37 154.82 (2.35%)
copy node-Comment 191123.58 N/A N/A
copy node-Forum 13288.20 N/A N/A
copy node-Organisation 1302.52 N/A N/A
copy node-Person 3010.10 N/A N/A
copy node-Place 1258.99 N/A N/A
copy node-Post 89912.79 N/A N/A
copy node-Tag 1345.07 N/A N/A
copy node-Tagclass 1169.40 N/A N/A
copy rel-comment-hasCreator 134502.05 N/A N/A
copy rel-comment-hasTag 102792.96 N/A N/A
copy rel-comment-isLocatedIn 75816.40 N/A N/A
copy rel-containerOf 24980.05 N/A N/A
copy rel-forum-hasTag 4921.29 N/A N/A
copy rel-hasInterest 5868.90 N/A N/A
copy rel-hasMember 94779.50 N/A N/A
copy rel-hasModerator 1334.05 N/A N/A
copy rel-hasType 369.85 N/A N/A
copy rel-isPartOf 335.27 N/A N/A
copy rel-isSubclassOf 354.77 N/A N/A
copy rel-knows 22246.31 N/A N/A
copy rel-likes-comment 173303.26 N/A N/A
copy rel-likes-post 72016.15 N/A N/A
copy rel-organisation-isLocatedIn 334.33 N/A N/A
copy rel-person-isLocatedIn 749.64 N/A N/A
copy rel-post-hasCreator 19741.32 N/A N/A
copy rel-post-hasTag 34746.93 N/A N/A
copy rel-post-isLocatedIn 22658.22 N/A N/A
copy rel-replyOf-comment 71180.79 N/A N/A
copy rel-replyOf-post 55058.21 N/A N/A
copy rel-studyAt 824.00 N/A N/A
copy rel-workAt 1198.29 N/A N/A
filter q14 135.61 141.31 -5.70 (-4.03%)
filter q15 130.96 140.33 -9.37 (-6.68%)
filter q16 338.73 358.86 -20.13 (-5.61%)
filter q17 454.11 464.90 -10.79 (-2.32%)
filter q18 1914.58 1928.65 -14.08 (-0.73%)
filter zonemap-node 89.55 97.00 -7.44 (-7.68%)
filter zonemap-node-lhs-cast 90.30 97.37 -7.07 (-7.26%)
filter zonemap-node-null 88.69 96.92 -8.24 (-8.50%)
filter zonemap-rel 5464.30 5644.90 -180.60 (-3.20%)
fixed_size_expr_evaluator q07 680.82 710.71 -29.89 (-4.21%)
fixed_size_expr_evaluator q08 963.67 975.61 -11.94 (-1.22%)
fixed_size_expr_evaluator q09 965.25 976.46 -11.21 (-1.15%)
fixed_size_expr_evaluator q10 255.52 269.91 -14.39 (-5.33%)
fixed_size_expr_evaluator q11 255.75 269.91 -14.16 (-5.25%)
fixed_size_expr_evaluator q12 232.59 248.91 -16.32 (-6.56%)
fixed_size_expr_evaluator q13 1566.92 1583.09 -16.17 (-1.02%)
fixed_size_seq_scan q23 113.39 128.27 -14.88 (-11.60%)
join q29 764.99 689.97 75.02 (10.87%)
join q30 1695.31 1601.88 93.42 (5.83%)
join q31 5.41 6.25 -0.85 (-13.54%)
join SelectiveTwoHopJoin 45.52 55.14 -9.63 (-17.46%)
ldbc_snb_ic q35 8.74 9.88 -1.14 (-11.54%)
ldbc_snb_ic q36 96.63 84.94 11.69 (13.76%)
ldbc_snb_is q32 5.90 3.58 2.31 (64.65%)
ldbc_snb_is q33 12.98 13.59 -0.61 (-4.50%)
ldbc_snb_is q34 1.21 1.21 0.00 (0.22%)
multi-rel multi-rel-large-scan 1693.72 1677.88 15.84 (0.94%)
multi-rel multi-rel-lookup 11.42 9.84 1.57 (15.99%)
multi-rel multi-rel-small-scan 199.02 197.44 1.58 (0.80%)
order_by q25 131.43 141.32 -9.89 (-7.00%)
order_by q26 445.56 464.07 -18.51 (-3.99%)
order_by q27 1376.66 1393.29 -16.63 (-1.19%)
recursive_join recursive-join-bidirection 299.07 293.18 5.89 (2.01%)
recursive_join recursive-join-dense 5379.95 7038.39 -1658.44 (-23.56%)
recursive_join recursive-join-path 22661.27 23147.98 -486.71 (-2.10%)
recursive_join recursive-join-sparse 626.67 636.68 -10.02 (-1.57%)
recursive_join recursive-join-trail 5893.87 7040.40 -1146.54 (-16.29%)
scan_after_filter q01 171.22 184.78 -13.56 (-7.34%)
scan_after_filter q02 154.56 170.13 -15.57 (-9.15%)
shortest_path_ldbc100 q37 98.99 82.61 16.38 (19.83%)
shortest_path_ldbc100 q38 342.76 196.28 146.48 (74.63%)
shortest_path_ldbc100 q39 60.45 52.93 7.52 (14.20%)
shortest_path_ldbc100 q40 385.45 264.86 120.59 (45.53%)
var_size_expr_evaluator q03 2088.66 2091.41 -2.75 (-0.13%)
var_size_expr_evaluator q04 2209.48 2171.33 38.15 (1.76%)
var_size_expr_evaluator q05 2675.56 2611.39 64.16 (2.46%)
var_size_expr_evaluator q06 1362.99 1358.31 4.68 (0.34%)
var_size_seq_scan q19 1423.96 1436.90 -12.95 (-0.90%)
var_size_seq_scan q20 2432.82 2624.75 -191.93 (-7.31%)
var_size_seq_scan q21 2249.13 2270.91 -21.78 (-0.96%)
var_size_seq_scan q22 126.06 130.10 -4.04 (-3.11%)

github-actions bot commented Apr 9, 2025

Benchmark Result

Master commit hash: 34852cc8eb1df3a7b2319dc42ffdae293311f910
Branch commit hash: ee9adc0ed4f4e2faabb8cd05a941579aa2898430

Query Group Query Name Mean Time - Commit (ms) Mean Time - Master (ms) Diff
aggregation q24 727.09 741.23 -14.14 (-1.91%)
aggregation q28 6689.17 6582.37 106.80 (1.62%)
filter q14 125.79 141.31 -15.52 (-10.98%)
filter q15 125.11 140.33 -15.22 (-10.84%)
filter q16 341.57 358.86 -17.29 (-4.82%)
filter q17 445.39 464.90 -19.51 (-4.20%)
filter q18 1915.99 1928.65 -12.66 (-0.66%)
filter zonemap-node 88.85 97.00 -8.14 (-8.39%)
filter zonemap-node-lhs-cast 89.16 97.37 -8.21 (-8.43%)
filter zonemap-node-null 88.89 96.92 -8.04 (-8.29%)
filter zonemap-rel 5496.32 5644.90 -148.57 (-2.63%)
fixed_size_expr_evaluator q07 679.99 710.71 -30.72 (-4.32%)
fixed_size_expr_evaluator q08 964.17 975.61 -11.44 (-1.17%)
fixed_size_expr_evaluator q09 966.77 976.46 -9.69 (-0.99%)
fixed_size_expr_evaluator q10 256.07 269.91 -13.84 (-5.13%)
fixed_size_expr_evaluator q11 257.15 269.91 -12.76 (-4.73%)
fixed_size_expr_evaluator q12 233.31 248.91 -15.59 (-6.26%)
fixed_size_expr_evaluator q13 1567.26 1583.09 -15.83 (-1.00%)
fixed_size_seq_scan q23 115.53 128.27 -12.74 (-9.93%)
join q29 730.30 689.97 40.33 (5.85%)
join q30 1723.02 1601.88 121.13 (7.56%)
join q31 5.51 6.25 -0.74 (-11.89%)
join SelectiveTwoHopJoin 48.64 55.14 -6.50 (-11.79%)
ldbc_snb_ic q35 9.87 9.88 -0.00 (-0.04%)
ldbc_snb_ic q36 95.67 84.94 10.73 (12.63%)
ldbc_snb_is q32 4.48 3.58 0.90 (25.01%)
ldbc_snb_is q33 13.39 13.59 -0.20 (-1.48%)
ldbc_snb_is q34 1.11 1.21 -0.09 (-7.78%)
multi-rel multi-rel-large-scan 1924.95 1677.88 247.07 (14.73%)
multi-rel multi-rel-lookup 12.20 9.84 2.36 (23.94%)
multi-rel multi-rel-small-scan 193.71 197.44 -3.73 (-1.89%)
order_by q25 127.56 141.32 -13.77 (-9.74%)
order_by q26 446.96 464.07 -17.11 (-3.69%)
order_by q27 1371.11 1393.29 -22.18 (-1.59%)
recursive_join recursive-join-bidirection 307.77 293.18 14.59 (4.97%)
recursive_join recursive-join-dense 7098.93 7038.39 60.55 (0.86%)
recursive_join recursive-join-path 23172.07 23147.98 24.09 (0.10%)
recursive_join recursive-join-sparse 623.08 636.68 -13.60 (-2.14%)
recursive_join recursive-join-trail 7101.61 7040.40 61.21 (0.87%)
scan_after_filter q01 171.24 184.78 -13.54 (-7.33%)
scan_after_filter q02 156.26 170.13 -13.87 (-8.15%)
shortest_path_ldbc100 q37 93.38 82.61 10.78 (13.05%)
shortest_path_ldbc100 q38 319.54 196.28 123.26 (62.80%)
shortest_path_ldbc100 q39 55.48 52.93 2.54 (4.81%)
shortest_path_ldbc100 q40 392.80 264.86 127.94 (48.31%)
var_size_expr_evaluator q03 2081.08 2091.41 -10.33 (-0.49%)
var_size_expr_evaluator q04 2208.30 2171.33 36.97 (1.70%)
var_size_expr_evaluator q05 2680.90 2611.39 69.51 (2.66%)
var_size_expr_evaluator q06 1365.95 1358.31 7.65 (0.56%)
var_size_seq_scan q19 1419.30 1436.90 -17.61 (-1.23%)
var_size_seq_scan q20 2438.00 2624.75 -186.75 (-7.12%)
var_size_seq_scan q21 2252.09 2270.91 -18.81 (-0.83%)
var_size_seq_scan q22 122.84 130.10 -7.26 (-5.58%)

@royi-luo royi-luo requested a review from ray6080 April 10, 2025 13:36
Benchmark Result

Master commit hash: 385278eadd223724f100019765442323994d6edd
Branch commit hash: 100070dfcb07ed0cadc48c45ebd9c668b7c52604

Query Group Query Name Mean Time - Commit (ms) Mean Time - Master (ms) Diff
aggregation q24 723.07 741.69 -18.62 (-2.51%)
aggregation q28 6540.64 6647.71 -107.07 (-1.61%)
copy node-Comment 130919.83 N/A N/A
copy node-Forum 5784.10 N/A N/A
copy node-Organisation 1429.94 N/A N/A
copy node-Person 2223.94 N/A N/A
copy node-Place 1266.52 N/A N/A
copy node-Post 36419.14 N/A N/A
copy node-Tag 1304.68 N/A N/A
copy node-Tagclass 1301.58 N/A N/A
copy rel-comment-hasCreator 71519.02 N/A N/A
copy rel-comment-hasTag 90642.29 N/A N/A
copy rel-comment-isLocatedIn 67924.77 N/A N/A
copy rel-containerOf 14121.45 N/A N/A
copy rel-forum-hasTag 3735.94 N/A N/A
copy rel-hasInterest 2751.01 N/A N/A
copy rel-hasMember 54001.01 N/A N/A
copy rel-hasModerator 1521.46 N/A N/A
copy rel-hasType 314.45 N/A N/A
copy rel-isPartOf 383.61 N/A N/A
copy rel-isSubclassOf 372.98 N/A N/A
copy rel-knows 6305.75 N/A N/A
copy rel-likes-comment 107086.49 N/A N/A
copy rel-likes-post 35445.17 N/A N/A
copy rel-organisation-isLocatedIn 323.07 N/A N/A
copy rel-person-isLocatedIn 535.98 N/A N/A
copy rel-post-hasCreator 15388.90 N/A N/A
copy rel-post-hasTag 22787.59 N/A N/A
copy rel-post-isLocatedIn 16309.77 N/A N/A
copy rel-replyOf-comment 52212.76 N/A N/A
copy rel-replyOf-post 43013.97 N/A N/A
copy rel-studyAt 611.03 N/A N/A
copy rel-workAt 809.90 N/A N/A
filter q14 127.65 141.05 -13.40 (-9.50%)
filter q15 130.34 145.35 -15.01 (-10.33%)
filter q16 342.72 359.10 -16.38 (-4.56%)
filter q17 445.98 460.80 -14.83 (-3.22%)
filter q18 1920.39 1877.71 42.68 (2.27%)
filter zonemap-node 91.43 97.61 -6.19 (-6.34%)
filter zonemap-node-lhs-cast 91.51 97.34 -5.82 (-5.98%)
filter zonemap-node-null 90.72 97.82 -7.10 (-7.26%)
filter zonemap-rel 5793.30 5558.56 234.75 (4.22%)
fixed_size_expr_evaluator q07 689.36 697.01 -7.65 (-1.10%)
fixed_size_expr_evaluator q08 968.14 972.57 -4.43 (-0.46%)
fixed_size_expr_evaluator q09 969.03 980.12 -11.09 (-1.13%)
fixed_size_expr_evaluator q10 261.11 270.71 -9.61 (-3.55%)
fixed_size_expr_evaluator q11 260.89 271.64 -10.75 (-3.96%)
fixed_size_expr_evaluator q12 239.43 249.75 -10.32 (-4.13%)
fixed_size_expr_evaluator q13 1565.21 1577.83 -12.62 (-0.80%)
fixed_size_seq_scan q23 116.59 128.12 -11.52 (-8.99%)
join q29 747.91 758.43 -10.52 (-1.39%)
join q30 1703.52 1683.12 20.39 (1.21%)
join q31 6.73 8.37 -1.64 (-19.61%)
join SelectiveTwoHopJoin 49.35 46.26 3.09 (6.67%)
ldbc_snb_ic q35 10.59 10.59 0.00 (0.00%)
ldbc_snb_ic q36 79.22 93.72 -14.49 (-15.46%)
ldbc_snb_is q32 4.90 4.94 -0.04 (-0.74%)
ldbc_snb_is q33 11.39 18.03 -6.65 (-36.85%)
ldbc_snb_is q34 1.24 1.22 0.02 (1.75%)
multi-rel multi-rel-large-scan 1809.57 1682.11 127.46 (7.58%)
multi-rel multi-rel-lookup 12.01 10.88 1.13 (10.39%)
multi-rel multi-rel-small-scan 212.04 193.96 18.08 (9.32%)
order_by q25 128.38 147.49 -19.11 (-12.96%)
order_by q26 444.16 481.34 -37.18 (-7.72%)
order_by q27 1378.31 1389.65 -11.33 (-0.82%)
recursive_join recursive-join-bidirection 314.97 300.24 14.73 (4.91%)
recursive_join recursive-join-dense 6996.41 7035.70 -39.30 (-0.56%)
recursive_join recursive-join-path 23244.59 23406.99 -162.40 (-0.69%)
recursive_join recursive-join-sparse 633.99 626.56 7.43 (1.19%)
recursive_join recursive-join-trail 6970.38 6978.39 -8.01 (-0.11%)
scan_after_filter q01 174.95 186.32 -11.37 (-6.10%)
scan_after_filter q02 155.06 170.96 -15.91 (-9.30%)
shortest_path_ldbc100 q37 83.35 94.43 -11.07 (-11.73%)
shortest_path_ldbc100 q38 356.20 403.81 -47.60 (-11.79%)
shortest_path_ldbc100 q39 60.19 65.42 -5.23 (-7.99%)
shortest_path_ldbc100 q40 396.19 379.80 16.39 (4.32%)
var_size_expr_evaluator q03 2091.66 2093.32 -1.66 (-0.08%)
var_size_expr_evaluator q04 2257.78 2193.39 64.39 (2.94%)
var_size_expr_evaluator q05 2685.09 2607.78 77.31 (2.96%)
var_size_expr_evaluator q06 1403.75 1352.82 50.94 (3.77%)
var_size_seq_scan q19 1425.41 1440.30 -14.89 (-1.03%)
var_size_seq_scan q20 2474.66 2539.25 -64.59 (-2.54%)
var_size_seq_scan q21 2280.11 2274.38 5.73 (0.25%)
var_size_seq_scan q22 125.38 127.58 -2.20 (-1.73%)
