Finish the GCP pipeline #166

Open: dmitrii-ubskii wants to merge 19 commits into development from gcp

dmitrii-ubskii (Member) commented Apr 4, 2025

What is the goal of this PR?

We add configuration for Neo4j and Postgres and finish the end-to-end pipeline script that runs a benchmark in GCP.

What are the changes implemented in this PR?

@typedb-bot (Member)

PR Review Checklist

Do not edit the content of this comment. The PR reviewer should simply update this comment by ticking each review item below, as they get completed.


Trivial Change

  • This change is trivial and does not require a code or architecture review.

Code

  • Packages, classes, and methods have a single domain of responsibility.
  • Packages, classes, and methods are grouped into cohesive and consistent domain model.
  • The code is canonical and the minimum required to achieve the goal.
  • Modules, libraries, and APIs are easy to use, robust (foolproof and not errorprone), and tested.
  • Logic and naming has clear narrative that communicates the accurate intent and responsibility of each module (e.g. method, class, etc.).
  • The code is algorithmically efficient and scalable for the whole application.

Architecture

  • Any required refactoring is completed, and the architecture does not introduce technical debt incidentally.
  • Any required build and release automations are updated and/or implemented.
  • Any new components follows a consistent style with respect to the pre-existing codebase.
  • The architecture intuitively reflects the application domain, and is easy to understand.
  • The architecture has a well-defined hierarchy of encapsulated components.
  • The architecture is extensible and scalable.

@dmitrii-ubskii dmitrii-ubskii force-pushed the gcp branch 5 times, most recently from 53efc84 to 59c3b5a Compare April 10, 2025 13:27
Comment on lines +11 to +12
(curl -o "$DISTRIBUTION_TARGZ" "$DISTRIBUTION_URL_SNAPSHOT" && tar -xf "$DISTRIBUTION_TARGZ") ||
(curl -o "$DISTRIBUTION_TARGZ" "$DISTRIBUTION_URL_RELEASE" && tar -xf "$DISTRIBUTION_TARGZ")
Member Author

Try downloading the given version from the snapshot URL; on failure, retry from the release URL.
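
The fallback above can be generalised into a small helper. This is only a sketch: `fetch_and_extract` and its argument order are hypothetical names, not part of the PR. It also adds curl's `--fail` flag, which is an assumption about the desired behaviour: without it, curl exits successfully on an HTTP error page, and only the later `tar` failure triggers the fallback.

```bash
# Sketch only: fetch_and_extract is a hypothetical helper, not the PR's code.
# Usage: fetch_and_extract ARCHIVE URL [URL...]
# Tries each URL in order; the first one that downloads AND extracts wins.
fetch_and_extract() {
  local archive=$1 url
  shift
  for url in "$@"; do
    # --fail makes curl exit non-zero on HTTP errors (e.g. 404) instead of
    # saving the error page to the archive file.
    if curl --fail --silent --show-error -o "$archive" "$url" &&
       tar -xf "$archive"; then
      return 0
    fi
    echo "download or extraction failed for $url, trying next source" >&2
  done
  return 1
}
```

With the variables from the diff, a call might look like `fetch_and_extract "$DISTRIBUTION_TARGZ" "$DISTRIBUTION_URL_SNAPSHOT" "$DISTRIBUTION_URL_RELEASE"`.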

Comment on lines 6 to 7
- pip install typedb-driver=="$DRIVER_VERSION" --index-url https://repo.typedb.com/public/public-snapshot/python/simple/
+ pip install typedb-driver=="$DRIVER_VERSION" --extra-index-url https://repo.typedb.com/public/public-snapshot/python/simple/
Member Author

@dmitrii-ubskii dmitrii-ubskii Apr 25, 2025

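Releases are on PyPI, so we don't want to override the default index. With --extra-index-url, pip searches the snapshot repository in addition to PyPI rather than instead of it.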

- sudo mongod --replSet rs0 --bind_ip localhost --config ./tool/mongodb/mongod.conf &
+ sudo mongod --replSet rs0 --bind_ip localhost --config ./tool/mongodb/mongod.conf >&/dev/null &
Member Author

mongod keeps the SSH connection open by holding stdout/stderr open, so the script cannot progress unless we redirect them.
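
The same effect can be seen locally with command substitution, which, like an SSH session, waits until every process holding its output stream has closed it. The sketch below is illustrative only: `start_detached` and `DETACHED_PID` are hypothetical names, and redirecting stdin as well is an extra precaution assumed here, not something the PR does. In bash, `>&/dev/null` is shorthand for `>/dev/null 2>&1`; the longer form is used because it also works in plain POSIX shells.

```bash
# Sketch only: start_detached is a hypothetical helper, not part of the PR.
# Runs a command in the background with all three standard streams detached,
# so a caller waiting on those streams (an ssh session, a pipe, a command
# substitution) is not held open by the background process.
start_detached() {
  "$@" </dev/null >/dev/null 2>&1 &
  DETACHED_PID=$!   # PID of the background job, for a later kill or wait
}
```

For example, `out=$(start_detached sleep 30; echo started)` returns immediately, whereas `out=$(sleep 30 & echo started)` blocks for the full 30 seconds because `sleep` inherits the pipe.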


KEEP_SERVER=
while getopts ":d:w:c:s:t:k" opt; do
Member Author

getopts can only handle single-letter flags. There's also getopt (note: no s), which is a lot more powerful.
However, since getopts is a shell builtin and getopt is a separate binary that behaves differently on different platforms, I'm sticking with the simple one.
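
For readers unfamiliar with the option string, here is a minimal sketch of how a loop over `":d:w:c:s:t:k"` is typically handled. The leading colon selects silent error reporting, a letter followed by a colon takes an argument, and `k` is a plain switch. The PR does not say what each letter means, so the values are stored under placeholder names (`ARG_d`, `ARG_w`, ...), and tying `-k` to `KEEP_SERVER` is an assumption based on the adjacent line of the diff.

```bash
# Sketch only: parse_args and the ARG_* names are hypothetical placeholders.
parse_args() {
  local opt
  OPTIND=1          # reset so the function can be called more than once
  KEEP_SERVER=
  while getopts ":d:w:c:s:t:k" opt; do
    case "$opt" in
      d) ARG_d=$OPTARG ;;
      w) ARG_w=$OPTARG ;;
      c) ARG_c=$OPTARG ;;
      s) ARG_s=$OPTARG ;;
      t) ARG_t=$OPTARG ;;
      k) KEEP_SERVER=1 ;;   # assumption: -k sets KEEP_SERVER
      :) echo "option -$OPTARG requires an argument" >&2; return 2 ;;
      \?) echo "unknown option -$OPTARG" >&2; return 2 ;;
    esac
  done
  # Number of leading arguments consumed as options; the caller can shift by it.
  ARGS_CONSUMED=$((OPTIND - 1))
}
```

In silent mode getopts reports a missing argument as `:` and an unknown flag as `?`, with the offending letter in `OPTARG`, which is what the last two branches rely on.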

Member Author

I've merged the individual stages (setup, load, execute) into this one script. Each one effectively called a single command, and it was way too much indirection to keep track of.
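
As a rough illustration of the single-script layout described above, the stages can live as functions in one file and run in order. Everything here is a placeholder: the function names are hypothetical and the bodies only echo, since the actual commands each stage runs are in the PR's script, not in this comment.

```bash
# Sketch only: stage names and bodies are placeholders, not the PR's code.
setup_stage()   { echo "setup"; }
load_stage()    { echo "load"; }
execute_stage() { echo "execute"; }

# Runs the stages in order and stops at the first failure.
run_pipeline() {
  local stage
  for stage in setup_stage load_stage execute_stage; do
    "$stage" || { echo "$stage failed" >&2; return 1; }
  done
}
```

Keeping the stages as functions preserves the separation between them without requiring a separate file per step.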

Comment on lines -30 to +34
- BENCH_ID=b-$USER-db$DB_SHORT-$SERVER_VERSION_SHORT-$DRIVER_VERSION_SHORT-$MACHINE_TYPE_SHORT-$DISK_SIZE-sf$SCALE_FACTOR-w$WAREHOUSES-c$CLIENTS-dur$DURATION-r$RUN_NUM
+ BENCH_ID=$USER-$DB_SHORT-$SERVER_VERSION_SHORT-$DRIVER_VERSION_SHORT-$MACHINE_TYPE_SHORT-$DISK_SIZE-sf$SCALE_FACTOR-w$WAREHOUSES-c$CLIENTS-dur$DURATION-$ID
Member Author

GCP limits the length of machine names, so I had to make a few cuts.
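
A pre-flight check can catch an over-long name before the instance is requested. Compute Engine resource names must be 1 to 63 characters long and match `[a-z]([-a-z0-9]*[a-z0-9])?`. The helper below is a sketch under that rule; `is_valid_gce_name` is a hypothetical name and is not part of the PR.

```bash
# Sketch only: is_valid_gce_name is a hypothetical helper, not part of the PR.
# Returns 0 if the name satisfies the Compute Engine naming rule, 1 otherwise.
is_valid_gce_name() {
  local LC_ALL=C                            # keep [a-z] strictly ASCII lowercase
  local name=$1
  local re='^[a-z]([-a-z0-9]*[a-z0-9])?$'   # lowercase start, no trailing hyphen
  [[ ${#name} -le 63 && $name =~ $re ]]
}
```

It could be called as `is_valid_gce_name "$BENCH_ID" || exit 1` right after the name is built. Note that the pattern also rejects dots and uppercase letters, so every component of the name has to be free of them.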

Comment on lines +24 to +26
# four digit random number with zero padding
ID=0000$RANDOM
ID=${ID:(-4)}
Member Author

This lets us run several instances of the same configuration at once.
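
To make the padding idiom easier to follow, the sketch below splits it into a deterministic step and a random step. `last4_padded` and `random_id` are hypothetical names used only for illustration. bash's `$RANDOM` ranges from 0 to 32767, so for five-digit values the idiom keeps the last four digits.

```bash
# Sketch only: hypothetical helpers that mirror the two lines in the diff.
# Prefix with zeros, then keep the last four characters.
last4_padded() {
  local id="0000$1"
  printf '%s\n' "${id:(-4)}"
}

# Four-digit identifier derived from bash's $RANDOM.
random_id() {
  last4_padded "$RANDOM"
}
```

For instance, `last4_padded 7` gives `0007` and `last4_padded 32767` gives `2767`. With only 10,000 possible values, two concurrent runs can still collide, so this reduces name clashes rather than ruling them out.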

@dmitrii-ubskii dmitrii-ubskii changed the title Gcp Finish the GCP pipeline Apr 25, 2025
@dmitrii-ubskii dmitrii-ubskii marked this pull request as ready for review April 25, 2025 11:39
@dmitrii-ubskii dmitrii-ubskii requested a review from lolski April 25, 2025 11:39