feat(hstore): Hstore support bulkload #2685

Open. haohao0103 wants to merge 8 commits into master.

Conversation

haohao0103
Contributor

@haohao0103 haohao0103 commented Oct 17, 2024

Refer to #2669.

As mentioned in the discussion, this PR contains the code that implements the bulkload feature.

@JackyYangPassion @imbajin @VGalaxies please review, and let me know if you have any questions. Thank you.

@dosubot (bot) added the size:XXL, feature, and store labels on Oct 17, 2024
Hstore bulkload
Hstore bulkload

codecov bot commented Oct 18, 2024

Codecov Report

Attention: Patch coverage is 0% with 443 lines in your changes missing coverage. Please review.

Project coverage is 34.83%. Comparing base (f6f3708) to head (b1c6c9a).
Report is 4 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...java/org/apache/hugegraph/pd/common/HdfsUtils.java | 0.00% | 135 Missing ⚠️ |
| ...java/org/apache/hugegraph/pd/PartitionService.java | 0.00% | 119 Missing ⚠️ |
| ...java/org/apache/hugegraph/pd/StoreNodeService.java | 0.00% | 57 Missing ⚠️ |
| ...a/org/apache/hugegraph/store/HeartbeatService.java | 0.00% | 25 Missing ⚠️ |
| ...hugegraph/store/PartitionInstructionProcessor.java | 0.00% | 24 Missing ⚠️ |
| ...ava/org/apache/hugegraph/pd/meta/TaskInfoMeta.java | 0.00% | 18 Missing ⚠️ |
| ...va/org/apache/hugegraph/store/PartitionEngine.java | 0.00% | 18 Missing ⚠️ |
| .../java/org/apache/hugegraph/pd/client/PDClient.java | 0.00% | 10 Missing ⚠️ |
| ...rg/apache/hugegraph/pd/meta/MetadataKeyHelper.java | 0.00% | 10 Missing ⚠️ |
| ...g/apache/hugegraph/store/pd/DefaultPdProvider.java | 0.00% | 9 Missing ⚠️ |
| ... and 9 more | | |

❗ The number of uploaded reports differs between BASE (f6f3708) and HEAD (b1c6c9a): HEAD has 6 fewer uploads than BASE (7 for BASE, 1 for HEAD).

Additional details and impacted files
@@              Coverage Diff              @@
##             master    #2685       +/-   ##
=============================================
- Coverage     47.68%   34.83%   -12.86%     
+ Complexity      821      383      -438     
=============================================
  Files           719      721        +2     
  Lines         58914    59423      +509     
  Branches       7595     7663       +68     
=============================================
- Hits          28096    20701     -7395     
- Misses        28007    36435     +8428     
+ Partials       2811     2287      -524     


Hstore bulkload
Hstore bulkload
Hstore bulkload
@imbajin imbajin requested a review from Copilot April 1, 2025 07:36
Contributor

@Copilot Copilot AI left a comment


Pull Request Overview

This PR adds bulkload support to Hstore. It introduces new REST endpoints, protobuf message definitions, and task scheduling logic for bulkload operations.

  • Added a new bulkload method in the REST service and corresponding API endpoints.
  • Updated protobuf definitions and task metadata to include bulkload task information.
  • Extended internal services to handle bulkload task creation, reporting, and HDFS file operations.

Reviewed Changes

Copilot reviewed 39 out of 40 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
|---|---|
| hg-pd-service/src/main/java/org/apache/hugegraph/pd/service/PDRestService.java | Adds a bulkload method invoking store node bulkload operations. |
| hg-pd-service/src/main/java/org/apache/hugegraph/pd/rest/TaskAPI.java | Introduces REST endpoints for bulkloading and leader redirection logic. |
| hg-pd-service/src/main/java/org/apache/hugegraph/pd/rest/PartitionAPI.java | Adds an endpoint to retrieve partition and graph ID mapping. |
| hg-pd-service/src/main/java/org/apache/hugegraph/pd/model/BulkloadRestRequest.java | Defines the bulkload request payload. |
| hg-pd-grpc/src/main/proto/pd_pulse.proto, metapb.proto, metaTask.proto | Updates protobuf definitions to include bulkload task and info. |
| hg-pd-core/src/main/java/org/apache/hugegraph/pd/meta/TaskInfoMeta.java, PartitionMeta.java, MetadataKeyHelper.java | Introduces bulkload task methods and key helpers for metadata storage. |
| hg-pd-core/src/main/java/org/apache/hugegraph/pd/TaskScheduleService.java, StoreNodeService.java, PartitionService.java | Implements bulkload task scheduling, processing, and status reporting. |
| hg-pd-common/src/main/java/org/apache/hugegraph/pd/common/HdfsUtils.java | Adds utility methods for parsing HDFS file paths and downloading files. |
| hg-pd-client/src/main/java/org/apache/hugegraph/pd/client/PDClient.java | Provides a new client method for querying leader partitions. |
Files not reviewed (1)
  • hugegraph-pd/hg-pd-common/pom.xml: Language not supported

Comment on lines +209 to +210
while (true) {


Copilot AI Apr 1, 2025


Consider adding a timeout or exit condition to this indefinite loop to prevent a potential hang if the bulkload task status never reaches a terminal state.

Suggested change
- while (true) {
+ int maxRetries = 60; // Maximum number of retries (10 minutes)
+ int retries = 0;
+ while (retries < maxRetries) {
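
A minimal standalone sketch of the bounded-polling idea behind this suggestion, not the PR's actual code. `BoundedPoll`, its `State` enum, and the `Supplier`-based status source are hypothetical stand-ins (the PR uses `MetaTask.TaskState`, which is not reproduced here). Note that 60 retries add up to 10 minutes only if each iteration waits about 10 seconds; the excerpt does not show the sleep interval, so the sketch takes it as a parameter.

```java
import java.time.Duration;
import java.util.function.Supplier;

/** Hypothetical sketch: poll a task status until it is terminal or the retry budget runs out. */
class BoundedPoll {

    /** Illustrative states only; not the PR's MetaTask.TaskState. */
    enum State { RUNNING, SUCCESS, FAILURE }

    static boolean isTerminal(State s) {
        return s == State.SUCCESS || s == State.FAILURE;
    }

    /**
     * Calls statusSource at most maxRetries times, sleeping for interval between attempts.
     * Returns the last observed state; a non-terminal result means the budget was exhausted
     * (or the thread was interrupted).
     */
    static State pollUntilTerminal(Supplier<State> statusSource, int maxRetries, Duration interval) {
        State last = State.RUNNING;
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            last = statusSource.get();
            if (isTerminal(last)) {
                return last;
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag and stop waiting
                return last;
            }
        }
        return last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fake status source: reports RUNNING twice, then SUCCESS.
        State result = pollUntilTerminal(
                () -> ++calls[0] < 3 ? State.RUNNING : State.SUCCESS, 60, Duration.ofMillis(10));
        System.out.println(result + " after " + calls[0] + " polls");
    }
}
```

If the returned state is still non-terminal, the caller can report a timeout for the bulkload task instead of blocking indefinitely.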


Comment on lines +1294 to +1298
if (statusMap.get(partition.getId()).state != null &&
statusMap.get(partition.getId()).state != MetaTask.TaskState.Task_Ready) {
var newTask =
pdMetaTask.toBuilder().setState(statusMap.get(partition.getId()).state).build();


Copilot AI Apr 1, 2025


[nitpick] Store the result of 'statusMap.get(partition.getId())' in a local variable to avoid repeated lookups and to improve code readability.

Suggested change
- if (statusMap.get(partition.getId()).state != null &&
- statusMap.get(partition.getId()).state != MetaTask.TaskState.Task_Ready) {
- var newTask =
- pdMetaTask.toBuilder().setState(statusMap.get(partition.getId()).state).build();
+ var partitionStatus = statusMap.get(partition.getId());
+ if (partitionStatus.state != null &&
+ partitionStatus.state != MetaTask.TaskState.Task_Ready) {
+ var newTask =
+ pdMetaTask.toBuilder().setState(partitionStatus.state).build();
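
Holding the lookup result in a local variable also makes it easy to guard against a missing map entry, which neither version above checks. The following is a hedged sketch of that pattern only; `StatusLookup`, `PartitionStatus`, and the `TaskState` enum are hypothetical stand-ins, and whether `statusMap` can actually lack an entry for a partition is not shown in the excerpt.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: single map lookup with a null guard. */
class StatusLookup {

    /** Illustrative states only; not the PR's MetaTask.TaskState. */
    enum TaskState { READY, DOING, SUCCESS, FAILURE }

    /** Stand-in for the per-partition status object read from statusMap. */
    static final class PartitionStatus {
        final TaskState state;

        PartitionStatus(TaskState state) {
            this.state = state;
        }
    }

    /**
     * Returns the state that should be written back for the partition, or empty when
     * there is no entry, no state, or the state is still READY.
     */
    static Optional<TaskState> stateToPersist(Map<Integer, PartitionStatus> statusMap,
                                              int partitionId) {
        PartitionStatus status = statusMap.get(partitionId); // one lookup instead of three
        if (status == null || status.state == null || status.state == TaskState.READY) {
            return Optional.empty();
        }
        return Optional.of(status.state);
    }

    public static void main(String[] args) {
        Map<Integer, PartitionStatus> statusMap = new HashMap<>();
        statusMap.put(1, new PartitionStatus(TaskState.SUCCESS));
        statusMap.put(2, new PartitionStatus(TaskState.READY));
        for (int id = 1; id <= 3; id++) {
            System.out.println(id + " -> " + stateToPersist(statusMap, id));
        }
    }
}
```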


return handleBulkload(request);
} else {
String leaderAddress = RaftEngine.getInstance().getLeader().getIp();
String url = "http://" + leaderAddress+":8620" + "/v1/task/bulkload";

Copilot AI Apr 1, 2025


[nitpick] Consider extracting the hardcoded port value into a constant or configuration property to simplify future updates and improve maintainability.

Suggested change
- String url = "http://" + leaderAddress+":8620" + "/v1/task/bulkload";
+ String url = "http://" + leaderAddress + ":" + LEADER_PORT + "/v1/task/bulkload";
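
One possible way to follow this suggestion, sketched as standalone Java under stated assumptions: `LeaderUrl`, `DEFAULT_REST_PORT`, and `bulkloadUri` are hypothetical names, and only the value 8620 and the path `/v1/task/bulkload` come from the reviewed line. In the real service the port would more plausibly come from configuration than from a constant. Building the address with `java.net.URI` instead of string concatenation also validates the host and port.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical sketch: build the leader's bulkload URL from a configurable port. */
class LeaderUrl {

    /** Default taken from the hardcoded value in the reviewed line. */
    static final int DEFAULT_REST_PORT = 8620;

    /** Endpoint path taken from the reviewed line. */
    static final String BULKLOAD_PATH = "/v1/task/bulkload";

    static URI bulkloadUri(String leaderHost, int port) {
        try {
            // Arguments: scheme, userInfo, host, port, path, query, fragment.
            return new URI("http", null, leaderHost, port, BULKLOAD_PATH, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid leader address: " + leaderHost, e);
        }
    }

    public static void main(String[] args) {
        // 10.0.0.5 is a placeholder address for illustration.
        System.out.println(bulkloadUri("10.0.0.5", DEFAULT_REST_PORT));
    }
}
```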


Labels: feature, size:XXL, store
Projects
Status: In progress