
Commit 0bce447: Implement query tests
1 parent 204084e

File tree: 10 files changed, +250 / -114 lines

README.md

Lines changed: 12 additions & 3 deletions
@@ -22,7 +22,7 @@ All tests were run on an empty database.
 
 Upon execution the helm chart found in `deployment/` is installed on the cluster. By default we execute 3 runs and average out the results to compensate for fluctuations in performance. A run consists of the workers and one collector instance being spawned in individual pods inside the same k3s cluster. Due to node selectors workers could not run on the same nodes as DB instances to avoid interference.
 Each worker generates and writes the configured amount of events into the database. Event schemata and API usage are found in `simulator/modules/{database_name}.py` (note: for cockroachdb and yugabytedb we use the postgres module), event generation logic may be reviewed under `simulator/modules/event_generator.py`.
-After each run the workers reports statistics to the collector instance. The database is wiped inbetween separate runs to have a reproducible baseline.
+After each run the workers report statistics to the collector instance. The database is wiped in between separate runs to have a reproducible baseline.
 
 For generating primary keys for the events we have two modes: Calculating it on the client side based on some of the fields that from a functional perspective guarantee uniqueness, or having the database increment a `SERIAL` primary key field. The exception here is Cassandra as using serial primary keys for rows opposes the main concepts of Cassandra we omitted this step and always relied on a db-generated unique partition key (device_id, timestamp).
 
@@ -83,11 +83,11 @@ Note: The provided values are for a k3s cluster. If you use another distribution
 
 ### Run the test
 
-To run the test use `python run.py`. You can use the following options:
+To run the insert test use `python run.py insert`. You can use the following options:
 
 * `--target`: The target database to use (name must correspond to the target name in `config.yaml`). This is required
 * `--workers`: Set of worker counts to try, default is `1,4,8,12,16` meaning the test will try with 1 concurrent worker, then with 4, then 8, then 12 and finally 16
-* `--runs`: How often should the test be repeated for each worker count, default is `3`.
+* `--runs`: How often the test should be repeated for each worker count, default is `3`
 * `--primary-key`: Defines how the primary key should be generated, see below for choices. Defaults to `db`
 * `--tables`: To simulate how the databases behave if inserts are done to several tables this option can be changed from `single` to `multiple` to have the test write into four instead of just one table
 * `--num-inserts`: The number of inserts each worker should do, by default 10000 to get a quick result. Increase this to see how the databases behave under constant load. Also increase the timout option accordingly
@@ -100,6 +100,15 @@ To run the test use `python run.py`. You can use the following options:
 
 If the test takes too long and the timeout is reached or the script runs into any problems it will crash. To clean up you must then manually uninstall the simulator by running `helm uninstall dbtest`.
 
+The query test can be run using `python run.py query`. You can use the following options:
+
+* `--target`: The target database to use (name must correspond to the target name in `config.yaml`). This is required
+* `--workers`: Number of concurrent workers that should issue queries
+* `--runs`: How often each query should be issued, default is `3`
+* `--timeout`: How long the script should wait for the query run to complete, in seconds. Default is `0`, which disables the timeout; increase it accordingly for larger data sets
+
+Before running the query test, use the insert test to load an appropriate amount of data into the database.
+
 ### Primary key
 
 There are several options on how the primary key for the database table can be generated, defined by the `--primary-key` option:
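Editor's note: the `--target` value must match an entry under `targets` in `config.yaml`. The commit does not include a sample config, so the sketch below is an assumption pieced together from the keys the CLI and modules actually read (`namespace`, `targets`, `module`, and `replication_factor` for Cassandra); any other per-target keys are simply passed through to the database module.

```python
# Hypothetical config.yaml structure, loaded the same way _read_config() does.
# Only the keys named in the lead-in are grounded in the code in this commit.
import yaml

example = yaml.safe_load("""
namespace: dbtest            # optional, defaults to "default"
targets:
  cassandra:                 # name used with --target
    module: cassandra        # module in simulator/modules/
    replication_factor: 3    # read by simulator/modules/cassandra.py
""")

target_config = example["targets"]["cassandra"]
assert target_config["module"] == "cassandra"
```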

cli/__init__.py

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+import click
+
+@click.group()
+def commands():
+    pass
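A side note on the wiring, since it is easy to miss: `@commands.command()` registers each decorated function on this group at import time, which is why `run.py` (further down) only needs to import `cli.insert` and `cli.query` to make the subcommands available. A minimal sketch, assuming click's standard `CliRunner` test helper, of how the group can be exercised:

```python
from click.testing import CliRunner

from cli import commands
import cli.insert   # importing registers the "insert" subcommand on the group
import cli.query    # importing registers the "query" subcommand on the group

runner = CliRunner()
result = runner.invoke(commands, ["--help"])
print(result.output)   # the help text lists both registered subcommands
```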

cli/insert.py

Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
+import base64
+import json
+import sys
+import click
+import yaml
+from . import commands
+from .run import one_run
+
+
+
+@commands.command()
+@click.option('-t', '--target', required=True, help="Name of the target")
+@click.option('-c', '--config', default="config.yaml", help="Name of the config file to use")
+@click.option('-w', '--workers', default="1,4,8,12,16", help="Sets of worker counts to use, separate by comma without space, default='1,4,8,12,16'")
+@click.option('-r', '--runs', default=3, help='Number of runs per worker count, default=3')
+@click.option("--primary-key", default="db", type=click.Choice(['sql', 'db', 'client', 'uuid'], case_sensitive=False))
+@click.option("--tables", default="single", type=click.Choice(['single', 'multiple'], case_sensitive=False))
+@click.option("--num-inserts", default=10000, help="Number of inserts per worker, default=10000")
+@click.option("--prefill", default=0, help="Insert this number of events into the table before starting the test run, default=0")
+@click.option("--extra-option", multiple=True, help="Extra options for the database module")
+@click.option("--timeout", default=0, help="Timeout in seconds to wait for one run to complete. Increase this if you use higher number of inserts, or set to 0 to disable timeout. default=0")
+@click.option("--batch", default=0, help="Number of events to insert in one batch, default 0 disables batch mode")
+@click.option('--clean/--no-clean', default=True, help="Clean up the database before each run, enabled by default")
+@click.option('--steps', default=0, help="TODO")
+def insert(target, config, workers, runs, primary_key, tables, num_inserts, prefill, extra_option, timeout, batch, clean, steps):
+    worker_counts = list(map(lambda el: int(el), workers.split(",")))
+    config = _read_config(config)
+    target_config = config["targets"][target]
+    namespace = config.get("namespace", "default")
+    if steps and len(worker_counts) > 1:
+        print("ERROR: If using the --steps option only one worker count can be used")
+        sys.exit(1)
+    if steps and runs > 1:
+        print("ERROR: If using the --steps option only one run is allowed")
+        sys.exit(1)
+    if steps and prefill:
+        print("ERROR: --steps and --prefill cannot be used at the same time")
+        sys.exit(1)
+
+    if steps:
+        _steps_test(target_config, worker_counts[0], namespace, primary_key, tables, num_inserts, extra_option, timeout, batch, clean, steps)
+    else:
+        _normal_test(target_config, worker_counts, namespace, runs, primary_key, tables, num_inserts, prefill, extra_option, timeout, batch, clean)
+
+
+def _normal_test(target_config, worker_counts, namespace, runs, primary_key, tables, num_inserts, prefill, extra_option, timeout, batch, clean):
+    run_config, target_module = _prepare_run_config(target_config, primary_key, tables, num_inserts, prefill, int(batch) if batch else None, clean, extra_option)
+
+    print(f"Workers\tMin\tMax\tAvg")
+    for worker_count in worker_counts:
+        run_results = [one_run(worker_count, run_config, target_module, timeout, namespace)["sum"]["ops_per_second"] for _ in range(runs)]
+        result_min = round(min(run_results))
+        result_max = round(max(run_results))
+        result_avg = round(int(sum(run_results)/len(run_results)))
+        print(f"{worker_count:2}\t{result_min:6}\t{result_max:6}\t{result_avg:6}")
+
+
+def _steps_test(target_config, workers, namespace, primary_key, tables, num_inserts, extra_option, timeout, batch, clean, steps):
+    run_config, target_module = _prepare_run_config(target_config, primary_key, tables, num_inserts, 0, int(batch) if batch else None, clean, extra_option)
+    run_config_continued, _ = _prepare_run_config(target_config, primary_key, tables, num_inserts, 0, int(batch) if batch else None, False, extra_option)
+    stepsize = workers*num_inserts
+    width = len(f"{stepsize*steps}")
+    print(f"Stepsize: {stepsize}")
+    print(f"Level".rjust(width)+"\tInserts/s")
+    for step in range(steps):
+        fill = f"{step*stepsize}".rjust(width)
+        inserts = int(round(one_run(workers, run_config, target_module, timeout, namespace)["sum"]["ops_per_second"], -1))
+        run_config = run_config_continued
+        print(f"{fill}\t{inserts:6}")
+
+
+def _prepare_run_config(target_config, primary_key, tables, num_inserts, prefill, batch, clean, extra_options):
+    config = target_config
+    config.update({
+        "task": "insert",
+        "num_inserts": num_inserts,
+        "prefill": int(prefill),
+        "primary_key": primary_key,
+        "use_multiple_tables": tables=="multiple",
+        "clean_database": clean,
+    })
+    if batch:
+        config["batch_mode"] = True
+        config["batch_size"] = batch
+    for option in extra_options:
+        k, v = option.split("=", 1)
+        config[k] = v
+    return base64.b64encode(json.dumps(config).encode("utf-8")).decode("utf-8"), config["module"]
+
+
+def _read_config(config_file):
+    with open(config_file) as f:
+        config = yaml.safe_load(f)
+    return config
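The `run_config` returned by `_prepare_run_config` is just the merged target and task configuration serialised as JSON and base64-encoded, so it can be passed to helm as a single `--set` value. The worker-side decoding is not part of this commit; the sketch below only illustrates that the encoding is a plain, reversible round trip (the example values are illustrative):

```python
import base64
import json

cfg = {
    "task": "insert", "num_inserts": 10000, "prefill": 0,
    "primary_key": "db", "use_multiple_tables": False,
    "clean_database": True, "module": "postgres",   # illustrative values only
}
encoded = base64.b64encode(json.dumps(cfg).encode("utf-8")).decode("utf-8")

# The receiving side can recover the dict with the inverse operations:
decoded = json.loads(base64.b64decode(encoded).decode("utf-8"))
assert decoded == cfg
```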

cli/query.py

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+import base64
+import json
+import click
+import yaml
+from .run import one_run
+from . import commands
+
+
+@commands.command()
+@click.option('-t', '--target', required=True, help="Name of the target")
+@click.option('-c', '--config', default="config.yaml", help="Name of the config file to use")
+@click.option('-w', '--workers', default=1, help="Number of workers to use")
+@click.option('-r', '--runs', default=3, help='Number of times each query should be executed, default=3')
+@click.option("--timeout", default=0, help="Timeout in seconds to wait for one run to complete. Increase this if you use higher number of inserts, or set to 0 to disable timeout. default=0")
+def query(target, config, workers, runs, timeout):
+    config = _read_config(config)
+    target_config = config["targets"][target]
+    namespace = config.get("namespace", "default")
+    run_config, target_module = _prepare_run_config(target_config, runs)
+    results = one_run(workers, run_config, target_module, timeout, namespace, endpoint="/report/queries")
+    max_name_len = max([len(name) for name in results["queries"].keys()])
+    spacing = " " * (max_name_len - len("Query"))
+    print(f"Query{spacing}\tMin \tMax \tAvg")
+    for name, stats in results["queries"].items():
+        spacing = " " * (max_name_len - len(name))
+        result_min = round(stats['min'], 2)
+        result_max = round(stats['max'], 2)
+        result_avg = round(stats['avg'], 2)
+        print(f"{name}{spacing}\t{result_min:>5.2f}\t{result_max:>5.2f}\t{result_avg:>5.2f}")
+
+def _prepare_run_config(target_config, runs):
+    config = target_config
+    config.update({
+        "task": "query",
+        "runs": runs,
+    })
+    return base64.b64encode(json.dumps(config).encode("utf-8")).decode("utf-8"), config["module"]
+
+
+def _read_config(config_file):
+    with open(config_file) as f:
+        config = yaml.safe_load(f)
+    return config
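For context, `query` expects the collector's `/report/queries` response to contain a `queries` mapping with `min`/`max`/`avg` per query; the query names themselves come from the database modules and are not part of this diff. A hypothetical payload (names invented, structure as consumed above):

```python
results = {
    "queries": {
        "count_events":     {"min": 0.82, "max": 1.10, "avg": 0.95},
        "events_by_device": {"min": 1.95, "max": 2.40, "avg": 2.12},
    },
    "workers": {},  # the raw per-worker reports are included as well
}
longest = max(len(name) for name in results["queries"])  # drives the column width above
```

Assuming a target named, say, `postgres` in `config.yaml`, an invocation would look like `python run.py query --target postgres --workers 4 --runs 5`.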

cli/test_run.py renamed to cli/run.py

Lines changed: 3 additions & 3 deletions
@@ -17,7 +17,7 @@ def http_request(url):
     raise Exception("http_request failed with retry")
 
 
-def one_run(num_workers, run_config, target_module, timeout, namespace):
+def one_run(num_workers, run_config, target_module, timeout, namespace, endpoint="/report/insert"):
     kube = Kubernetes()
     res = subprocess.run(["helm", "install", "-n", namespace, "dbtest", ".", "--set", f"workers={num_workers}", "--set", f"run_config={run_config}",
                           "--set", f"target_module={target_module}", "--set", f"namespace={namespace}"], cwd="deployment", stdout=subprocess.DEVNULL)
@@ -34,8 +34,8 @@ def one_run(num_workers, run_config, target_module, timeout, namespace):
     time.sleep(10)
     collector_pod_name = kube.find_pod(namespace, "app", "dbtest-collector")
     kube.patch_socket()
-    results = json.loads(http_request(f"http://{collector_pod_name}.pod.{namespace}.kubernetes:5000/report"))
+    results = json.loads(http_request(f"http://{collector_pod_name}.pod.{namespace}.kubernetes:5000{endpoint}"))
     res = subprocess.run(f"helm uninstall -n {namespace} dbtest".split(" "), stdout=subprocess.DEVNULL)
     res.check_returncode()
    kube.wait_for_pods_terminated(namespace, "app", "dbtest-worker")
-    return results["sum"]["ops_per_second"]
+    return results
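`one_run` now returns the full parsed JSON report instead of a single number, and the caller chooses the collector endpoint; the insert path keeps its old behaviour by indexing into the result itself. A rough usage sketch (the target module name and argument values are illustrative, not taken from the commit):

```python
from cli.run import one_run

run_config = "<base64-encoded JSON produced by the insert or query command>"

insert_report = one_run(4, run_config, "postgres", 0, "default")
ops = insert_report["sum"]["ops_per_second"]      # default endpoint: /report/insert

query_report = one_run(4, run_config, "postgres", 0, "default", endpoint="/report/queries")
per_query = query_report["queries"]               # per-query min/max/avg statistics
```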

run.py

Lines changed: 4 additions & 92 deletions
@@ -1,95 +1,7 @@
-import base64
-import json
-import sys
-import click
-import yaml
-from cli.test_run import one_run
-
-
-@click.command()
-@click.option('-t', '--target', required=True, help="Name of the target")
-@click.option('-c', '--config', default="config.yaml", help="Name of the config file to use")
-@click.option('-w', '--workers', default="1,4,8,12,16", help="Sets of worker counts to use, separate by comma without space, default='1,4,8,12,16'")
-@click.option('-r', '--runs', default=3, help='Number of runs per worker count, default=3')
-@click.option("--primary-key", default="db", type=click.Choice(['sql', 'db', 'client', 'uuid'], case_sensitive=False))
-@click.option("--tables", default="single", type=click.Choice(['single', 'multiple'], case_sensitive=False))
-@click.option("--num-inserts", default=10000, help="Number of inserts per worker, default=10000")
-@click.option("--prefill", default=0, help="Insert this number of events into the table before starting the test run, default=0")
-@click.option("--extra-option", multiple=True, help="Extra options for the database module")
-@click.option("--timeout", default=0, help="Timeout in seconds to wait for one run to complete. Increase this if you use higher number of inserts, or set to 0 to disable timeout. default=0")
-@click.option("--batch", default=0, help="Number of events to insert in one batch, default 0 disables batch mode")
-@click.option('--clean/--no-clean', default=True, help="Clean up the database before each run, enabled by default")
-@click.option('--steps', default=0, help="TODO")
-def main(target, config, workers, runs, primary_key, tables, num_inserts, prefill, extra_option, timeout, batch, clean, steps):
-    worker_counts = list(map(lambda el: int(el), workers.split(",")))
-    config = _read_config(config)
-    target_config = config["targets"][target]
-    namespace = config.get("namespace", "default")
-    if steps and len(worker_counts) > 1:
-        print("ERROR: If using the --steps option only one worker count can be used")
-        sys.exit(1)
-    if steps and runs > 1:
-        print("ERROR: If using the --steps option only one run is allowed")
-        sys.exit(1)
-    if steps and prefill:
-        print("ERROR: --steps and --prefill cannot be used at the same time")
-        sys.exit(1)
-
-    if steps:
-        _steps_test(target_config, worker_counts[0], namespace, primary_key, tables, num_inserts, extra_option, timeout, batch, clean, steps)
-    else:
-        _normal_test(target_config, worker_counts, namespace, runs, primary_key, tables, num_inserts, prefill, extra_option, timeout, batch, clean)
-
-
-def _normal_test(target_config, worker_counts, namespace, runs, primary_key, tables, num_inserts, prefill, extra_option, timeout, batch, clean):
-    run_config, target_module = _prepare_run_config(target_config, primary_key, tables, num_inserts, prefill, int(batch) if batch else None, clean, extra_option)
-
-    print(f"Workers\tMin\tMax\tAvg")
-    for worker_count in worker_counts:
-        run_results = [one_run(worker_count, run_config, target_module, timeout, namespace) for _ in range(runs)]
-        result_min = round(min(run_results))
-        result_max = round(max(run_results))
-        result_avg = round(int(sum(run_results)/len(run_results)))
-        print(f"{worker_count:2}\t{result_min:6}\t{result_max:6}\t{result_avg:6}")
-
-
-def _steps_test(target_config, workers, namespace, primary_key, tables, num_inserts, extra_option, timeout, batch, clean, steps):
-    run_config, target_module = _prepare_run_config(target_config, primary_key, tables, num_inserts, 0, int(batch) if batch else None, clean, extra_option)
-    run_config_continued, _ = _prepare_run_config(target_config, primary_key, tables, num_inserts, 0, int(batch) if batch else None, False, extra_option)
-    stepsize = workers*num_inserts
-    width = len(f"{stepsize*steps}")
-    print(f"Stepsize: {stepsize}")
-    print(f"Level".rjust(width)+"\tInserts/s")
-    for step in range(steps):
-        fill = f"{step*stepsize}".rjust(width)
-        inserts = int(round(one_run(workers, run_config, target_module, timeout, namespace), -1))
-        run_config = run_config_continued
-        print(f"{fill}\t{inserts:6}")
-
-
-def _prepare_run_config(target_config, primary_key, tables, num_inserts, prefill, batch, clean, extra_options):
-    config = target_config
-    config.update({
-        "num_inserts": num_inserts,
-        "prefill": int(prefill),
-        "primary_key": primary_key,
-        "use_multiple_tables": tables=="multiple",
-        "clean_database": clean,
-    })
-    if batch:
-        config["batch_mode"] = True
-        config["batch_size"] = batch
-    for option in extra_options:
-        k, v = option.split("=", 1)
-        config[k] = v
-    return base64.b64encode(json.dumps(config).encode("utf-8")).decode("utf-8"), config["module"]
-
-
-def _read_config(config_file):
-    with open(config_file) as f:
-        config = yaml.safe_load(f)
-    return config
+from cli import commands
+from cli.query import query
+from cli.insert import insert
 
 
 if __name__ == '__main__':
-    main()
+    commands()

simulator/collector.py

Lines changed: 26 additions & 3 deletions
@@ -2,6 +2,7 @@
 import time
 from flask import Flask, request, jsonify, make_response
 from modules import select_module
+from modules.config import config
 
 
 app = Flask(__name__)
@@ -37,8 +38,8 @@ def report_result():
     return "OK"
 
 
-@app.route("/report")
-def collect_results():
+@app.route("/report/insert")
+def collect_results_insert():
     report = dict()
     report["workers"] = results
     sum_ops, sum_duration = 0, 0
@@ -50,8 +51,30 @@ def collect_results():
     return jsonify(report)
 
 
+@app.route("/report/queries")
+def collect_results_queries():
+    report = dict()
+    queries = dict([(name, []) for name in list(results.values())[0]["results"].keys()])
+    report["workers"] = results
+    report["queries"] = dict()
+
+    for worker in results.values():
+        for name, values in worker["results"].items():
+            queries[name].extend(values)
+
+    for name, values in queries.items():
+        report["queries"][name] = {
+            "min": min(values),
+            "max": max(values),
+            "avg": sum(values)/len(values),
+        }
+
+    return jsonify(report)
+
 def run():
-    select_module().init()
+    module = select_module()
+    if config.get("task", "insert") == "insert":
+        module.init()
     # It looks like in some cases for yugabytedb the created table is not instantly available for all workers so wait a few seconds
     time.sleep(10)
     app.run(host='0.0.0.0', port=5000, debug=False, threaded=False)
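The new `/report/queries` handler assumes each worker posts a report whose `results` field maps a query name to the list of measured values. The worker ids and query name below are hypothetical; only the nesting mirrors what the handler iterates over, and the min/max/avg reduction shown is the same one performed per query:

```python
# Hypothetical contents of the module-level `results` dict after two workers
# have reported their query measurements.
results = {
    "worker-0": {"results": {"count_events": [1.02, 0.98, 1.05]}},
    "worker-1": {"results": {"count_events": [1.10, 0.95, 1.01]}},
}

values = [v for worker in results.values() for v in worker["results"]["count_events"]]
summary = {"min": min(values), "max": max(values), "avg": sum(values) / len(values)}
```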

simulator/modules/cassandra.py

Lines changed: 4 additions & 4 deletions
@@ -21,10 +21,10 @@ def init():
     if config["clean_database"]:
         for table_name in ["events0", "events1", "events2", "events3", "events"]:
             session.execute(f"DROP TABLE IF EXISTS {KEYSPACE}.{table_name}")
-    session.execute(f"""DROP KEYSPACE IF EXISTS {KEYSPACE}""")
-    session.execute(f"""CREATE KEYSPACE IF NOT EXISTS {KEYSPACE}
-        WITH replication = {{ 'class': 'SimpleStrategy', 'replication_factor':{config["replication_factor"]}}}
-        """)
+        session.execute(f"""DROP KEYSPACE IF EXISTS {KEYSPACE}""")
+        session.execute(f"""CREATE KEYSPACE IF NOT EXISTS {KEYSPACE}
+            WITH replication = {{ 'class': 'SimpleStrategy', 'replication_factor':{config["replication_factor"]}}}
+            """)
 
     if config["use_multiple_tables"]:
         table_names = ["events0", "events1", "events2", "events3"]
