feat(gateway): Add GraphQL endpoint #3555
Conversation
Codecov Report

```diff
@@            Coverage Diff             @@
##           master    #3555      +/-   ##
==========================================
- Coverage   59.67%   58.30%   -1.38%
==========================================
  Files         125      126       +1
  Lines       10014    10283     +269
  Branches     2266     2340      +74
==========================================
+ Hits         5976     5995      +19
- Misses       3756     4002     +246
- Partials      282      286       +4
```

Continue to review full report at Codecov.
@lvauvillier Hey Luc! Thanks for contributing this one! It's a huge accomplishment! I'm still reviewing it. So far so good. It might be beneficial to add a couple of E2E tests with GraphQL requests to the
```ts
const json = await response.json();

if (json.error === 'Continue wait') {
```
Cube.js has a queue system that supports removing unused (orphaned) queries. `Continue wait` (long polling) is used to detect that the client still needs the result and is waiting for a response from the server. I don't think it's a blocker for this feature, but it's a limitation anyway.

cc @paveltiunov
@lvauvillier There's no easy solution to the long polling problem in GraphQL, so for now I'd suggest just returning the error or any other type of response that means it needs to be fetched again. Workarounds to this problem would be using pre-aggregations, increasing `Continue wait` timeouts, and implementing retry logic client-side. Retry logic on the backend isn't necessary and could be dropped, as it's basically handled by `continueWaitTimeout` (https://cube.dev/docs/config#queue-options). The reason long polling exists in Cube isn't that Cube itself can't wait for that long, but that reverse proxies will usually drop the connection much earlier.
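For illustration, here is a minimal client-side retry sketch around the REST load endpoint. The path, token handling, retry count, and 1-second delay are illustrative; the `json.error === 'Continue wait'` check mirrors the diff context quoted above.

```ts
// Minimal sketch of client-side retry for "Continue wait" responses.
// Endpoint path, payload, and retry budget are illustrative, not from this PR.
async function loadWithRetry(
  query: Record<string, unknown>,
  token: string,
  maxRetries = 10
): Promise<unknown> {
  const url = `/cubejs-api/v1/load?query=${encodeURIComponent(JSON.stringify(query))}`;

  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, { headers: { Authorization: token } });
    const json = await response.json();

    // Same check as in the gateway code quoted above.
    if (json.error === 'Continue wait') {
      await new Promise(resolve => setTimeout(resolve, 1000));
      continue;
    }
    return json;
  }

  throw new Error('Query is still queued after the retry budget was exhausted');
}
```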
@paveltiunov I now use `apiGateway.load()` to retrieve results. A `Continue wait` response will raise an error.
```ts
const headers = { ...originalReq.headers } as Record<string, string>;
['host', 'connexion', 'content-length'].forEach(key => delete headers[key]);

const response = await fetch(url, {
```
IMHO, it's better to use a direct call to the `load` method from `ApiGateway` instead of creating a proxy. We use a similar strategy for the WS & SQL servers:
@ovr, I agree, this was my first option, but I reverted to a fetch call because I couldn't figure out how to deal with the continue wait issue. I can see a hardcoded `continueWait: true` in the `adapterApi.executeQuery()` options. I didn't spend a lot of time understanding how the load API works and what triggers the `Continue wait` response.
@ovr, I finally used the `load()` API and raise an error if I get a `Continue wait` response.
@lvauvillier On a side note, what would a query with members from multiple cubes look like?
@paveltiunov There is no problem querying multiple cubes:

```graphql
{
  cube1 {
    measures {
      count
    }
  }
  cube2 {
    measures {
      count
    }
  }
}
```

You will get a result:

```json
{
  "cube1": [{ "measures": { "count": 10 }}],
  "cube2": [{ "measures": { "count": 20 }}]
}
```

The query will be resolved using two distinct API calls.
@lvauvillier I mean, what if we need to do a joined-cubes query?
@paveltiunov Good question. This is not possible with this design.

This design can work (and is closer to the REST API):

```graphql
{
  load(filters: ..., limit: ...) {
    measures {
      cube1 {
        count
      }
      cube2 {
        count
      }
    }
    dimensions {
      cube1 {
        country
      }
      cube2 {
        publishedAt(granularity: day)  # <-- timeDimension
      }
    }
  }
}
```

But the result shape is complex... A simpler design can also work, as we can guess the type (dimension or measure) of each cube field. (timeDimensions are already auto-detected if we set a granularity.)

```graphql
{
  load(filters: ..., limit: ...) {
    cube1 {
      count    # <-- measure
      country  # <-- dimension
    }
    cube2 {
      count                          # <-- measure
      publishedAt(granularity: day)  # <-- timeDimension
    }
  }
}
```

Other point: in this new design, filters are now global, so we need to put the cube information back into the filter member. Keeping the GraphQL naming convention (camelCase), filters will look like this:

```graphql
{
  load(filters: [{ member: "cube2.publishedAt", operator: inDateRange, values: ["2021-01-01", "2021-12-31"]}]) {
    cube1 {
      count    # <-- measure
      country  # <-- dimension
    }
    cube2 {
      count                          # <-- measure
      publishedAt(granularity: day)  # <-- timeDimension
    }
  }
}
```

I don't like this syntax, and the "dot" prevents the usage of a generated GraphQL enum. Maybe we can use here:

What do you think?
@lvauvillier Yeah. Cool! This nested cube approach seems to solve the issue. Regarding
@paveltiunov I also like the idea of having a filtering argument at both the query and member field levels. For nested filters we need to use the simpler design to get rid of the unnecessary

This also means that:

I'll give it a try this weekend.
@paveltiunov I just pushed an update and it works nicely :) The new filters work great with autocompletion and we can now join cubes. Filters are autogenerated based on the member type, with some changes from the REST API (heavily inspired by the Hasura and Prisma filtering APIs).

Example:

```graphql
{
  load(offset: 20, limit: 5, where: {users: {firstName: {notEquals: "Golden"}}}) {
    orders(granularity: {createdAt: day}, where: {OR: [{createdAt: {afterDate: "2021-01-01"}}, {status: {equals: "shipped"}}]}, orderBy: {createdAt: asc}) {
      count
      status
      createdAt
    }
    users {
      firstName
    }
  }
}
```

will generate:

```json
{
"measures": [
"Orders.count"
],
"dimensions": [
"Orders.status",
"Users.firstName"
],
"timeDimensions": [
{
"dimension": "Orders.createdAt",
"granularity": "day"
}
],
"order": {
"Orders.createdAt": "asc"
},
"limit": 5,
"offset": 20,
"filters": [
{
"member": "Users.firstName",
"operator": "notEquals",
"values": [
"Golden"
]
},
{
"or": [
{
"member": "Orders.createdAt",
"operator": "afterDate",
"values": [
"2021-01-01"
]
},
{
"member": "Orders.status",
"operator": "equals",
"values": [
"shipped"
]
}
]
}
]
}
```

New GraphQL spec:

```graphql
input DateTimeFilter {
afterDate: String
beforeDate: String
equals: String
in: [String]
inDateRange: [String]
notEquals: String
notIn: [String]
notInDateRange: [String]
set: Boolean
}
input FloatFilter {
contains: Float
equals: Float
in: [Float]
notContains: Float
notEquals: Float
notIn: [Float]
set: Boolean
}
input StringFilter {
contains: String
equals: String
in: [String]
notContains: String
notEquals: String
notIn: [String]
set: Boolean
}
enum Granularity {
day
hour
minute
month
second
week
year
}
enum OrderBy {
asc
desc
}
type Query {
load(limit: Int, offset: Int, renewQuery: Boolean, timezone: String, where: RootWhereInput): [Result!]!
}
type Result {
<cubeName>(granularity: <CubeName>GranularityInput, orderBy: <CubeName>OrderByInput, where: <CubeName>WhereInput): <CubeName>Members!
...
}
type <CubeName>Members {
<member>: String | Float | DateTime
...
}
input RootWhereInput {
AND: [RootWhereInput!]
OR: [RootWhereInput!]
<cubeName>: <CubeName>WhereInput
...
}
input <CubeName>WhereInput {
AND: [<CubeName>WhereInput!]
OR: [<CubeName>WhereInput!]
<member>: FloatFilter | DateTimeFilter | StringFilter
...
}
input <CubeName>GranularityInput {
<timeMember>: Granularity
}
input <CubeName>OrderByInput {
<member>: OrderBy
...
}
```
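To make the placeholder notation above concrete, here is a hedged instantiation for a hypothetical Orders cube (a count measure, a status dimension, and a createdAt time dimension). It assumes a DateTime scalar and reuses the filter and enum types above; it is an illustration, not output generated by the PR.

```graphql
type OrdersMembers {
  count: Float
  status: String
  createdAt: DateTime
}

input OrdersWhereInput {
  AND: [OrdersWhereInput!]
  OR: [OrdersWhereInput!]
  count: FloatFilter
  status: StringFilter
  createdAt: DateTimeFilter
}

input OrdersGranularityInput {
  createdAt: Granularity
}

input OrdersOrderByInput {
  count: OrderBy
  status: OrderBy
  createdAt: OrderBy
}

type Result {
  orders(granularity: OrdersGranularityInput, orderBy: OrdersOrderByInput, where: OrdersWhereInput): OrdersMembers!
}
```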
@lvauvillier Looks really good and it seems to play nicely overall! What do you think about putting granularity inside the time dimension field like this?
The reason for it is that, generally speaking, you can query multiple granularities of the same time dimension. It's a rare but still valid use case. Could you please also rebase so we can run the tests and be ready to merge?
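The inline example attached here was not preserved; judging from the reply below, the proposed shape was presumably along these lines (hypothetical orders cube and createdAt time dimension):

```graphql
{
  load {
    orders {
      count
      createdAt {
        day
      }
    }
  }
}
```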
@paveltiunov If we add granularity subfields to time dimensions, we need to keep the ability to use the time dimension without granularity. We can add a new subfield for this (`raw` in the example below).

With granularity:

```graphql
{
  load(offset: 20, limit: 5, where: {users: {firstName: {notEquals: "Golden"}}}) {
    orders(where: {OR: [{createdAt: {afterDate: "2021-01-01"}}, {status: {equals: "shipped"}}]}, orderBy: {createdAt: asc}) {
      count
      status
      createdAt {
        day
      }
    }
    users {
      firstName
    }
  }
}
```

Without granularity:

```graphql
{
  load(offset: 20, limit: 5, where: {users: {firstName: {notEquals: "Golden"}}}) {
    orders(where: {OR: [{createdAt: {afterDate: "2021-01-01"}}, {status: {equals: "shipped"}}]}, orderBy: {createdAt: asc}) {
      count
      status
      createdAt {
        raw
      }
    }
    users {
      firstName
    }
  }
}
```

I also thought about using a union type, but scalars are not allowed as union members, so we can't use a simple scalar value and query it like this:

```graphql
createdAt {
  ...on Granularity {
    day
  }
}
```

More info on this issue: graphql/graphql-spec#215
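For context, a hedged illustration of the limitation (hypothetical type names): the GraphQL spec requires union members to be object types, so a union mixing the raw scalar with a granularity object is invalid SDL.

```graphql
# Invalid SDL: DateTime is a scalar, and union members must be object types.
union CreatedAtValue = DateTime | CreatedAtGranularities

type CreatedAtGranularities {
  day: DateTime!
  month: DateTime!
}
```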
@lvauvillier Yeah. This is a really great point!

```graphql
{
  load(offset: 20, limit: 5, where: {users: {firstName: {notEquals: "Golden"}}}) {
    orders(where: {OR: [{createdAt: {afterDate: "2021-01-01"}}, {status: {equals: "shipped"}}]}, orderBy: {createdAt: asc}) {
      count
      status
      dimensions {
        createdAt
      }
    }
    users {
      firstName
    }
  }
}
```
@lvauvillier I guess we can go with
@paveltiunov I just pushed a new version. I finally chose the name `value`. I also added the missing `quarter` granularity.

New GraphQL spec:

```graphql
input DateTimeFilter {
afterDate: String
beforeDate: String
equals: String
in: [String]
inDateRange: [String]
notEquals: String
notIn: [String]
notInDateRange: [String]
set: Boolean
}
input FloatFilter {
contains: Float
equals: Float
in: [Float]
notContains: Float
notEquals: Float
notIn: [Float]
set: Boolean
}
input StringFilter {
contains: String
equals: String
in: [String]
notContains: String
notEquals: String
notIn: [String]
set: Boolean
}
type TimeDimension {
day: DateTime!
hour: DateTime!
minute: DateTime!
month: DateTime!
quarter: DateTime!
second: DateTime!
value: DateTime!
week: DateTime!
year: DateTime!
}
enum OrderBy {
asc
desc
}
type Query {
load(limit: Int, offset: Int, renewQuery: Boolean, timezone: String, where: RootWhereInput): [Result!]!
}
type Result {
<cubeName>(orderBy: <CubeName>OrderByInput, where: <CubeName>WhereInput): <CubeName>Members!
...
}
type <CubeName>Members {
<member>: String! | Float!
<timeMember>: TimeDimension!
...
}
input RootWhereInput {
AND: [RootWhereInput!]
OR: [RootWhereInput!]
<cubeName>: <CubeName>WhereInput
...
}
input <CubeName>WhereInput {
AND: [<CubeName>WhereInput!]
OR: [<CubeName>WhereInput!]
<member>: FloatFilter | DateTimeFilter | StringFilter
...
}
input <CubeName>OrderByInput {
<member>: OrderBy
...
}
```
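As a hedged illustration of this final shape (hypothetical Orders cube and members, not taken from the PR), granularity is now selected as a subfield of the time dimension, with `value` returning the raw timestamp:

```graphql
{
  load(limit: 10, timezone: "UTC") {
    orders(orderBy: {createdAt: asc}, where: {status: {equals: "shipped"}}) {
      count
      status
      createdAt {
        day    # truncated to day granularity
        value  # raw value, no granularity applied
      }
    }
  }
}
```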
@lvauvillier We're almost there! The small thing I noted is that you introduced a cache for the schema. If we cache it, we should respect multi-tenancy. To avoid holding up the merge of this big PR, we can consider either removing caching (I realize it won't be suitable for a production workload yet) or storing it as a part of
@paveltiunov Oh, you are right, schemas can be dynamic if a

It also raises the point of extending the GraphiQL UI to add the ability to define security contexts, like in the Playground. This can be part of a new PR.
@lvauvillier Hey Luc! Looks great! Thanks again for such a big effort!
This PR is an implementation of the GraphQL spec described in issue #3433
I think I covered all the main functionality of the REST API.
Description
A new endpoint `/cubejs-api/graphql` is added in the gateway. All security middlewares (checkAuth, queryRewrite, etc.) are still used.
In dev mode, the GraphiQL UI is available.
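For illustration, a hedged sketch of calling the new endpoint over HTTP; the port, token, and query are hypothetical, and the request shape is just the standard GraphQL-over-HTTP POST body:

```ts
// Illustrative only: endpoint path from this PR; the auth header mirrors the
// REST API, and the host, port, and query are hypothetical.
async function queryGraphQL(token: string): Promise<unknown> {
  const response = await fetch('http://localhost:4000/cubejs-api/graphql', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: token,
    },
    body: JSON.stringify({
      query: `
        {
          load(limit: 10) {
            orders {
              count
            }
          }
        }
      `,
    }),
  });

  return response.json();
}
```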
How it works
Warning

The current implementation will raise an error for a `continueWait` response. Long polling is not implemented (and not possible with this API design).

Example
An original REST query is equivalent to a GraphQL query.
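The PR's original example is not reproduced here; as a hedged sketch of the mapping (hypothetical Orders cube, following the query-generation example earlier in this thread), a REST query like:

```json
{
  "measures": ["Orders.count"],
  "dimensions": ["Orders.status"],
  "timeDimensions": [
    {
      "dimension": "Orders.createdAt",
      "granularity": "day"
    }
  ],
  "limit": 10
}
```

would correspond to a GraphQL query along these lines:

```graphql
{
  load(limit: 10) {
    orders {
      count
      status
      createdAt {
        day
      }
    }
  }
}
```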
Directives `@skip` and `@include` are supported:
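For illustration, a hedged sketch of `@include` on a field selection (hypothetical cube, members, and a client-supplied `$withStatus` variable):

```graphql
query Orders($withStatus: Boolean!) {
  load(limit: 10) {
    orders {
      count
      # status is only requested when $withStatus is true
      status @include(if: $withStatus)
    }
  }
}
```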