GraphQL query Support #3433

lvauvillier · 2021-09-15T21:06:16Z

Context

Is your feature request related to a problem? Please describe.
A previous issue was opened on this topic (#1765) but was closed due to query format design and ResultSet issues.

As we already run a working GraphQL API on top of the Cube.js REST API in production, I wanted to share the API design we chose and how we solved these issues.

I hope this will kick off official support for a GraphQL API.

Describe the solution you'd like
There are two possible solutions:

A GraphQL API on top of the existing REST API (the GraphQL API acts as a "proxy")
A standalone GraphQL API (a new graphQLApiGateway module in the Cube.js server core?)

We choose the first solution here to benefit from the existing optimizations, security, query rewriting, etc.

API design

Context

GraphQL APIs return data in the same shape as it was requested. This constraint does not allow a perfect mapping to the existing REST API and its response.

⚠️ The ResultSet object will not be usable for data blending and has to be rewritten for GraphQL responses (or a GraphQLResultSet object has to be provided). This task is not hard: if we flatten the GraphQL object, most of the current ResultSet class can be reused.
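
As an illustration of the flattening idea, here is a minimal sketch. It assumes rows shaped like the response in the example below (a `measures` and a `dimensions` object per row) and a flat key format of `<cube>.<member>`. The names `GraphQLRow`, `FlatRow` and `flattenRows` are hypothetical and not part of any existing Cube.js API, and nested values (for example time dimensions with a granularity) are not handled.

```typescript
// Hypothetical shape of one row in the proposed GraphQL response.
interface GraphQLRow {
  measures?: Record<string, unknown>;
  dimensions?: Record<string, unknown>;
}

// Flat row keyed by "<cube>.<member>", e.g. { "events.count": 145 }.
type FlatRow = Record<string, unknown>;

// Merge the measures and dimensions of every row into a single-level object.
function flattenRows(cubeName: string, rows: GraphQLRow[]): FlatRow[] {
  return rows.map((row) => {
    const flat: FlatRow = {};
    for (const group of [row.measures, row.dimensions]) {
      if (!group) continue;
      for (const member of Object.keys(group)) {
        flat[`${cubeName}.${member}`] = group[member];
      }
    }
    return flat;
  });
}
```

Rows flattened this way resemble the flat rows a ResultSet-like class usually works with, which is why most of the existing logic could plausibly be reused.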

Example

{
    events(
        filters: [
            {
                member: "publishedAt"
                operator: inDateRange
                values: ["2021-08-01", "2021-08-30"]
            }
        ]
    ) {
        measures {
            count
            pageViewsCount
        }
        dimensions {
            country
        }
    }
}

Response

{
    "events": [{
        "measures": {
            "count": 145,
            "pageViewsCount": 35
        },
        "dimensions": {
            "country": "US"
        }
    },
    {
        "measures": {
            "count": 23,
            "pageViewsCount": 9
        },
        "dimensions": {
            "country": "FR"
        }
    },
    {
        "measures": {
            "count": 45,
            "pageViewsCount": 12
        },
        "dimensions": {
            "country": "DE"
        }
    }]
}
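
To make the mapping concrete, the sketch below shows one way flat rows could be nested into the response shape above. It assumes the REST API returns rows keyed by `<Cube>.<member>` (for example `Events.count`) and that the resolver already knows which measures and dimensions were requested. `RestRow`, `pick` and `reshapeRows` are hypothetical names used only for illustration.

```typescript
// One flat row from the REST API; the "<Cube>.<member>" key format is an assumption.
type RestRow = Record<string, unknown>;

interface GraphQLRow {
  measures: Record<string, unknown>;
  dimensions: Record<string, unknown>;
}

// Copy the requested members of one cube out of a flat row, dropping the cube prefix.
function pick(row: RestRow, cubeName: string, members: string[]): Record<string, unknown> {
  const picked: Record<string, unknown> = {};
  for (const member of members) {
    const value = row[`${cubeName}.${member}`];
    picked[member] = value === undefined ? null : value;
  }
  return picked;
}

// Nest every flat row under `measures` and `dimensions`, as in the response above.
function reshapeRows(
  cubeName: string,
  measures: string[],
  dimensions: string[],
  rows: RestRow[]
): GraphQLRow[] {
  return rows.map((row) => ({
    measures: pick(row, cubeName, measures),
    dimensions: pick(row, cubeName, dimensions),
  }));
}
```

Members that were requested but are absent from a row come back as `null` here; whether that or an error is the right behaviour is a design choice the proposal leaves open.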

Spec

enum CubeFilterOperator {
  afterDate
  beforeDate
  contains
  equals
  gt
  gte
  inDateRange
  lt
  lte
  notContains
  notEquals
  notInDateRange
  notSet
  set
}

enum CubeOrder {
  asc
  desc
}

enum CubeGranularity {
  second
  minute
  hour
  day
  week
  month
  year
}

input CubeFilterInput {
  member: String!
  operator: CubeFilterOperator!
  values: [String]
  or: [CubeFilterInput!]
  and: [CubeFilterInput!]
}

type Root {
    <cubeName>(filters: [CubeFilterInput!], timezone: String, limit: Int, offset: Int, renewQuery: Boolean): [<CubeName>!]!
    ...
}

type <CubeName> {
    dimensions: <CubeName>Dimension!
    measures: <CubeName>Measure!
}

type <CubeName>Dimension {
    <dimensionName>(order: CubeOrder): String | SafeInt
    <timeDimensionName>(order: CubeOrder, granularity: CubeGranularity): DateTime
    ...
}

API Generation

Framework

We need to use a code-first GraphQL framework to automatically generate entities from the cube schema definition.

Examples of API generation from a data model (using the Nexus framework):
https://github.com/prisma/nexus-prisma
https://github.com/graphql-nexus/nexus-plugin-prisma

Implementation

Resolving steps

1 - For each <CubeName> field, a query has to be generated. If more than one cube is requested in the same GraphQL query, we can use a DataLoader to blend the generated queries at the end and make a single REST API call.
2 - At the <CubeName> level, we only have the arguments (filters, limit, offset, etc.). We need to walk through the child nodes and collect all measures, dimensions and granularities. To achieve this we can use the info argument to get the GraphQL query AST. We need to take care of GraphQL directives (@skip, @include) to build a valid query.
3 - Send the query to the existing Cube.js REST API, handle the "Continue wait" response, handle errors and reshape the result to match the GraphQL tree.
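
Step 2 can be sketched as follows. To stay self-contained, the sketch declares simplified stand-ins for the AST node types instead of importing them from graphql-js. Fragments and field arguments such as `granularity` are ignored, and `collectMembers` and `isIncluded` are hypothetical names. In a real resolver the field node and the variables would come from the `info` argument.

```typescript
// Simplified stand-ins for AST nodes; only the parts used below are declared.
type ValueNode =
  | { kind: "BooleanValue"; value: boolean }
  | { kind: "Variable"; name: { value: string } };

interface DirectiveNode {
  name: { value: string };
  arguments?: { name: { value: string }; value: ValueNode }[];
}

interface FieldNode {
  kind: "Field";
  name: { value: string };
  directives?: DirectiveNode[];
  selectionSet?: { selections: FieldNode[] };
}

type Variables = Record<string, unknown>;

// Apply @skip(if:) and @include(if:) to decide whether a field is selected.
function isIncluded(field: FieldNode, variables: Variables): boolean {
  for (const directive of field.directives ?? []) {
    const ifArg = (directive.arguments ?? []).filter((a) => a.name.value === "if")[0];
    if (!ifArg) continue;
    const value = ifArg.value;
    const condition =
      value.kind === "Variable" ? Boolean(variables[value.name.value]) : value.value;
    if (directive.name.value === "skip" && condition) return false;
    if (directive.name.value === "include" && !condition) return false;
  }
  return true;
}

// Collect the member names selected under `measures` and `dimensions`.
function collectMembers(cubeField: FieldNode, variables: Variables = {}) {
  const result: { measures: string[]; dimensions: string[] } = {
    measures: [],
    dimensions: [],
  };
  for (const group of cubeField.selectionSet?.selections ?? []) {
    if (!isIncluded(group, variables)) continue;
    const target =
      group.name.value === "measures"
        ? result.measures
        : group.name.value === "dimensions"
        ? result.dimensions
        : undefined;
    if (!target) continue;
    for (const member of group.selectionSet?.selections ?? []) {
      if (isIncluded(member, variables)) target.push(member.name.value);
    }
  }
  return result;
}
```

The collected names, together with the arguments available at the <CubeName> level, would then be turned into the REST query sent in step 3.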

Custom resolvers

With the current API, if we want to query by a dimension (e.g. by product) and this dimension is stored as an id (productId), we only get the productId in the result.

What if we want to display a nice graph with the product name? We can add the name to the cube, but what if we need more data? This can lead to issues, in particular wasted data if we use pre-aggregations.

Here we can use GraphQL capabilities to create a new Product entity and use custom resolvers.
The entity name and resolvers can be added to the cube schema definition. Only the productId will be used to resolve the entity.

Example:

{
    purchases {
        measures {
            count
        }
        dimensions {
            product {
                id
                name
            }
        }
    }
}
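
A custom resolver for the query above could look roughly like this sketch. It assumes the cube query only selects the `productId` dimension and that the products are fetched afterwards in one batched lookup. `Product`, `ProductLoader`, `PurchaseRow` and `resolveProducts` are hypothetical names, and the loader stands in for whatever service or database actually holds the product data.

```typescript
interface Product {
  id: string;
  name: string;
}

// Hypothetical batched lookup; a real one would call the product service or database.
type ProductLoader = (ids: string[]) => Promise<Product[]>;

// Row produced by the cube query: only the id of the product is selected.
interface PurchaseRow {
  count: number;
  productId: string;
}

// Resolve the `product` entity of every row with a single lookup for all distinct ids.
async function resolveProducts(rows: PurchaseRow[], loadProducts: ProductLoader) {
  const ids = rows
    .map((row) => row.productId)
    .filter((id, index, all) => all.indexOf(id) === index);
  const products = await loadProducts(ids);

  const byId: Record<string, Product> = {};
  for (const product of products) byId[product.id] = product;

  return rows.map((row) => ({
    measures: { count: row.count },
    // null when the product is unknown to the loader
    dimensions: { product: byId[row.productId] ?? null },
  }));
}
```

Because only the id travels through the cube, pre-aggregations stay small and any additional product field can be served by the loader.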


paveltiunov · 2021-09-20T05:29:21Z

@lvauvillier Hey Luc! Thanks for posting this one! I'm curious if you have any ideas on how to handle long polling over some long periods of time? I see you propose to handle it server-side. What if the load balancer has a timeout for HTTP requests?

Mentioning @MattGson @tomsej here as participants of #1765.

lvauvillier · 2021-09-20T08:59:26Z

@paveltiunov this is a good question. For my current usage we assume that queries take a reasonable time (our pre-aggregations cover 100% of possible dashboard queries).

We just handle the "Continue wait" response using promises and delays.

This is the getCubeResults function we use in our resolvers:

// createToken(), encode() and the CubeQuery type are defined elsewhere (not shown).
export async function getCubeResults(
  query: CubeQuery,
  delay = 500 // milliseconds to wait before polling again
): Promise<any> {
  const token = createToken();
  const headers = { Authorization: `Bearer ${token}` };

  const url = `${process.env.CUBEJS_API}/cubejs-api/v1/load?${encode({
    query: JSON.stringify(query)
  })}`;

  const response = await fetch(url, { method: "GET", headers });
  if (!response.ok) {
    throw new Error(`Error querying cubejs api: status ${response.status}`);
  }

  const json = await response.json();

  // The REST API answers "Continue wait" while the query is still running:
  // wait, then poll again with a slightly longer delay. There is no upper
  // bound on the number of retries.
  if (json.error === "Continue wait") {
    await new Promise((resolve) => setTimeout(resolve, delay));
    return getCubeResults(query, delay * 1.2); // increase delay
  }

  return json;
}

I don't think GraphQL is designed to handle long-running queries.
The best practice for long-running queries in GraphQL might be:
1 - register a task using a mutation and get a taskId
2 - poll task status (a subscription can be used)
3 - get results when status is "completed"
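
The three steps could be backed by something like the sketch below. It is an in-memory illustration only: `TaskRegistry` and its method names are hypothetical, and a real deployment would need shared storage, expiry of old tasks and access control.

```typescript
type TaskStatus = "pending" | "completed" | "failed";

interface Task<T> {
  status: TaskStatus;
  result?: T;
  error?: string;
}

// In-memory task store; tasks are lost when the process restarts.
class TaskRegistry<T> {
  private tasks: Record<string, Task<T>> = {};
  private nextId = 1;

  // Step 1: called from a mutation resolver; returns the taskId immediately.
  register(run: () => Promise<T>): string {
    const taskId = String(this.nextId++);
    const task: Task<T> = { status: "pending" };
    this.tasks[taskId] = task;
    // Start the work asynchronously and record how it ends.
    Promise.resolve()
      .then(run)
      .then(
        (result) => {
          task.status = "completed";
          task.result = result;
        },
        (err) => {
          task.status = "failed";
          task.error = String(err);
        }
      );
    return taskId;
  }

  // Step 2: read by a polling query (or pushed through a subscription).
  status(taskId: string): TaskStatus | undefined {
    const task = this.tasks[taskId];
    return task ? task.status : undefined;
  }

  // Step 3: the result is only returned once the task is completed.
  result(taskId: string): T | undefined {
    const task = this.tasks[taskId];
    return task && task.status === "completed" ? task.result : undefined;
  }
}
```

With this split, each HTTP request returns quickly, so a load balancer timeout no longer limits how long the underlying query may run.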

lvauvillier · 2021-10-18T20:45:12Z

A first implementation is available (see #3555). All feedback is welcome.

lvauvillier · 2021-11-02T10:09:42Z

A new spec is available in PR #3555.
I am closing this issue.

* feat(gateway): Add GraphQL proxy
* Add missing hour granularity
* Add graphql as regular dependency
* Use apiGateway.load() instead of fetch() to get results
* New api design and filter argument
* Move granularity from args to fields
* Lint
* Non null members
* Use compilerApi instance to cache graphql schema

Fixes #3433

rpaik added the enhancement New feature proposal label Sep 15, 2021

paveltiunov self-assigned this Sep 20, 2021

lvauvillier mentioned this issue Oct 18, 2021

feat(gateway): Add GraphQL endpoint #3555

Merged

lvauvillier closed this as completed Nov 2, 2021

peterklingelhofer mentioned this issue Jul 14, 2023

feat: GraphQL GROUP BY Resolver #6886

Closed

4 tasks
