1
1
SQLTap
2
2
======
3
3
4
- SQLTap is a document-oriented query frontend and cache for MySQL.
4
+ _This project is old and has become obsolete. It has not been verified to work with newer versions
5
+ of MySQL, and the only company I know of that uses it in production maintains
6
+ its own private fork. Additionally, Facebook published the GraphQL project, which looks like a much
7
+ nicer API and has gained significant momentum (GraphQL was released many years after SQLTap). This
8
+ means I will not maintain or work on this code anymore._
5
9
6
- You send it requests for complex documents (usually involving "joins" over multiple tables) using a
7
- HTTP API. SQLTap rewrites and pipelines these requests before executing them on the backend MySQL
8
- servers. It also caches common data partials in memcached, reducing query latency and database load. This
9
- is completely transparent to the end user and does not require explicit cache expiration since SQLTap
10
- acts as a MySQL slave and updates cached data partials when they are changed.
10
+ SQLTap is a document-oriented query frontend and cache for MySQL.
11
11
12
- SQLTap was created at DaWanda.com, one of Germany's largest ecommerce sites, where it serves hundreds
13
- of millions of requests per day. It has greatly reduced page render times and has reduced the number
14
- of SQL queries that hit the MySQL database by XX%.
12
+ Users request (nested) documents using a custom declarative query language (see
13
+ below). SQLTap takes these requests and rewrites them into many small SQL queries,
14
+ which are then executed on the backend MySQL database. Once all data has
15
+ been returned from the database, SQLTap assembles the responses into a
16
+ single JSON document and returns it to the user.
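
The assembly step described above can be illustrated with a minimal Python sketch. The table names, column names, and the shape of the resulting JSON are assumptions made for illustration only; they are not taken from SQLTap's actual response format.

```python
import json


def assemble_document(parent_row, child_rows, child_key, foreign_key):
    """Nest the child rows that belong to parent_row under child_key."""
    doc = dict(parent_row)
    doc[child_key] = [
        dict(row) for row in child_rows if row[foreign_key] == parent_row["id"]
    ]
    return doc


# Hypothetical rows, as two small SQL queries might return them:
#   SELECT id, title FROM products WHERE id = 1
#   SELECT id, product_id, url FROM images WHERE product_id = 1
product = {"id": 1, "title": "Lamp"}
images = [
    {"id": 10, "product_id": 1, "url": "a.jpg"},
    {"id": 11, "product_id": 1, "url": "b.jpg"},
]

document = assemble_document(product, images, "images", "product_id")
print(json.dumps(document, indent=2))
```

The point of the sketch is only that several flat result sets are merged into one nested document before anything is returned to the caller.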
17
+
18
+ SQLTap is a caching proxy; responses from MySQL are also stored in a local
19
+ memcache server. This allows SQLTap - after a short warmup phase - to answer most
20
+ repetitive queries without actually having to consult the database. It is important
21
+ to note that SQLTap caches _partial_ query responses, i.e. it doesn't cache the
22
+ full query response, but the individual parts from which the response was constructed.
23
+ The cached data partials are shared across similar queries, reducing the total
24
+ cache size and increasing the hit rate.
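
A rough sketch of why caching partials rather than whole responses raises the hit rate: two different queries that touch the same record share one cache entry. The `PartialCache` class, the `(table, id)` key scheme, and the in-memory dict standing in for memcache are all assumptions for illustration, not SQLTap's implementation.

```python
class PartialCache:
    """Toy cache keyed per record, with a dict standing in for memcache."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def fetch(self, table, record_id, load):
        key = (table, record_id)  # hypothetical key scheme
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = load(table, record_id)
        return self._store[key]


def load_from_db(table, record_id):
    # Stand-in for a real SQL query against the backend database.
    return {"table": table, "id": record_id}


cache = PartialCache()

# Query 1 needs user 1 and product 7; query 2 needs user 1 and product 8.
for table, record_id in [("user", 1), ("product", 7)]:
    cache.fetch(table, record_id, load_from_db)
for table, record_id in [("user", 1), ("product", 8)]:
    cache.fetch(table, record_id, load_from_db)

print(cache.hits, cache.misses)  # 1 3
```

A cache keyed on the full query string would have missed on both queries here, since the two queries are not identical.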
25
+
26
+ The query cache is completely transparent to the user - there is no need for
27
+ explicit expiration and SQLTap will never serve stale data. This is achieved by
28
+ using MySQL's row-based replication protocol to subscribe to notifications on
29
+ row changes and expiring cached data partials accordingly.
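
The expiration idea can be sketched as a callback that drops the cached partial for a changed row. This is a simplified illustration: the `on_row_change` function and the `(table, id)` keys are hypothetical, and the sketch does not implement the replication protocol itself, only what might happen once a row change notification arrives.

```python
def on_row_change(cache, table, record_id):
    """Expire the cached partial for a changed row.

    Returns True if an entry was removed, False if nothing was cached.
    """
    return cache.pop((table, record_id), None) is not None


# A dict standing in for memcache, holding two cached partials.
cache = {
    ("user", 1): {"id": 1, "name": "alice"},
    ("user", 2): {"id": 2, "name": "bob"},
}

# Simulated notification: the row with id 1 in table "user" was updated.
expired = on_row_change(cache, "user", 1)
print(expired, sorted(cache))
```

Only the entry for the changed row is removed; the next query that needs it would reload it from the database, while unrelated partials stay cached.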
15
30
16
31
SQLTap requires MySQL 5.6+ with Row Based Replication enabled.
17
32
33
+
18
34
### Table of Contents
19
35
20
- + [Rationale](#rationale)
21
36
+ [Usage](#usage)
22
37
+ [HTTP API](#http-api)
23
- + [Configuration](#configuration)
24
38
+ [Query Language](#query-language)
25
- + [Caching](#caching)
26
39
+ [Internals](#internals)
27
40
+ [Examples](#examples)
28
41
+ [License](#license)
29
42
30
43
31
- Rationale
32
- ---------
33
-
34
- A question that comes up frequently is "Why would I want to use a proxy to retrieve records
35
- from MySQL rather than accessing it directly"?
36
-
37
- SQLTap was created under the name "LoveOS Fast Fetch Service" while re-designing a substantial
38
- part of the DaWanda.com ecommerce application. The goal was to improve page render times and
39
- to obviate some of the anti-patterns that are commonly found in ORM-based web apps. These are
40
- the main reasons that led to the decision:
41
-
42
- #### Automatic Parallelization
43
-
44
- In a web application context, you often need to retrieve a collection of related
45
- database records to fulfill a http request. For example, to render a product detail page,
46
- you might need to retrieve a 'product' record and all 'image' records that belong to
47
- the product record.
48
-
49
- The naive way to do this without putting the burden on the database by using an
50
- expensive join operation is to sequentially execute multiple SQL queries. E.g. first
51
- retrieve the product record and then retrieve all the image records. This is also what
52
- some ORMs like Rails' ActiveRecord will do by default.
53
-
54
- On the other hand, retrieving the records in parallel rather than sequentially can result
55
- in a huge drop in response time, which is highly desirable for user facing applications.
56
-
57
- As an example, assume retrieving a single record takes 10ms. Then retrieving 5 records
58
- using sequential execution would take 50ms, but retrieving them in parallel would (in a
59
- perfect world) still only take 10ms.
60
-
61
- While this parallelization could be implemented explicitly in your application code, it
62
- would introduce redundant logic and unnecessary complexity; running parallel SQL queries
63
- from a single threaded web framework is not trivial, as the MySQL protocol does not allow
64
- for pipelining per se and most MySQL client implementations use blocking I/O.
65
-
66
- SQLTap executes all sql queries in parallel where possible using multiple connections to
67
- MySQL and non-blocking I/O.
68
-
69
- #### Query Caching
70
-
71
- SQLTap caches partial query responses in memcache, which speeds up some queries by
72
- multiple orders of magnitude and greatly reduces the load on the MySQL database.
73
-
74
- It doesn't cache the full query responses, but only normalized common query subtrees which
75
- means that the cached data partials are shared across similar queries. This makes the cache
76
- more space efficient (as it contains fewer redundancies) and increases the hit-rate.
77
-
78
- The query cache is completely transparent as there is no need for explicit expiration and it
79
- will never serve stale data: SQLTap uses MySQL's row based replication protocol to get
80
- notifications on record changes and refresh the cached data partials accordingly.
81
-
82
- #### Encapsulation
83
-
84
- SQLTap permits only a subset of SQL to be executed and enforces limits on maximum execution
85
- time and result set size. This is to prevent SQL queries that might seem harmless at first,
86
- but turn out to be a bottleneck as the data set grows.
87
-
88
- #### Document Oriented Query Language
89
-
90
- Some of the modern web frameworks encourage you to use an ORM for database access. This often
91
- results in bad code where requests to the sql database are scattered all over the code and
92
- sometimes even the templates. In these codebases it can get really hard to predict the runtime
93
- of a method/template and whether it will block.
94
-
95
- Take as an example a helper method that renders one entry in a navigation menu. For each entry
96
- the helper calls something like "entry.translation" which in turn issues a request to the
97
- database to retrieve the translation record for this entry. As the number of entries in the
98
- navigation grows, this leads to potentially thousands of sql queries being executed just
99
- to render a simple navigation menu.
100
-
101
- The SQLTap query language encourages you to fetch all required data with a few
102
- large, nested queries (documents). This will hopefully make applications easier to maintain and
103
- less bloated in the long term.
104
-
105
-
106
- #### Query Optimizations
107
-
108
- SQLTap also performs some trivial query optimizations (i.e. eliminating redundant queries)
109
-
110
-
111
44
Usage
112
45
-----
113
46
114
47
### Starting SQLTap
115
48
116
- ./sqltap --mysql-host localhost --mysql-port 3006 --mysql-user root --mysql-database mydb --http 8080 -c config.xml
49
+ ./sqltap \
50
+ --mysql-host localhost \
51
+ --mysql-port 3006 \
52
+ --mysql-user root \
53
+ --mysql-database mydb \
54
+ --http 8080 \
55
+ -c config.xml
117
56
118
57
119
58
HTTP API
@@ -144,9 +83,8 @@ is the same as:
144
83
/query?q=user.findOne(1){*};user.findOne(2){*};user.findOne(3){*}
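
As a small illustration of the batched form shown above, the following Python sketch joins several queries with `;` and builds the request path. The helper name `batch_query_path` is hypothetical, and the choice of which characters to leave unescaped is an assumption made so that the output matches the literal URL in this README; a real client may need stricter percent-encoding.

```python
from urllib.parse import quote


def batch_query_path(queries):
    """Join several SQLTap queries with ';' into one /query request path."""
    joined = ";".join(queries)
    # Keep the query-language punctuation readable; escape everything else.
    return "/query?q=" + quote(joined, safe="(){}*;,")


path = batch_query_path(
    ["user.findOne(%d){*}" % user_id for user_id in (1, 2, 3)]
)
print(path)
```

Characters outside the safe set, such as the spaces and quotes in a `countAllWhere` condition, would be percent-encoded by `quote`.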
145
84
146
85
147
-
148
86
Query Language
149
- --------------
87
+ ------------
150
88
151
89
##### resource.findOne(id){...}
152
90
##### relation.findOne{...}
219
157
220
158
/query?q=vote.countAllWhere("product_id = 44244778 and created_at>'2014-01-01'"){}
221
159
222
- Configuration
223
- -------------
224
-
225
- here be dragons
226
-
227
160
228
161
Internals
229
162
---------
@@ -243,26 +176,13 @@ are also thread local. This means there is very little locking in the hot path
243
176
worker owns one sql connection pool which opens a fixed max number of connections
244
177
and also contains a query queue.
245
178
246
- + Main thread also runs watchdog; kills dead workers.
247
-
248
- + The QueryParser and the HTTP and SQL protocols are implemented as simple state
249
- machines.
250
-
251
- + CTrees are only used if the CTree is a subtree of the request - a ctree is not used
252
- when the ctree is a "supertree" of the request. CTree are only matched on findOne
253
- Instructions and each CTree query must start with a findOne Instruction.
254
-
255
- + All memcache contents are gzipped
256
-
257
- ### Benchmarks
258
-
259
- ab / weighttp benchmarks here
179
+ + Cached partial query responses are gzipped and stored in memcache
260
180
261
181
262
182
Examples
263
183
--------
264
184
265
- Real-life product detail page:
185
+ Real-life product detail page query:
266
186
267
187
product.findOne(12345){
268
188
deleted_at,view_counter,category_id,category_parent_id,is_valid,id,