-
Notifications
You must be signed in to change notification settings - Fork 265
draft: augment yaml definitions with test schema #1659
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: v2.x
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good idea to have the schema of the source collection.
@@ -1,6 +1,6 @@ | |||
# $schema: ../schema.json | |||
name: $accumulator | |||
link: 'https://www.mongodb.com/docs/manual/reference/operator/aggregation/accumulator/' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to keep quotes around strings? This style changes make the diff hard to read. Also, I use quotes by default around every string to prevent unexpected parsing rule from Yaml.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, looks like the quotes usage is inconsistent in these files - I'd be inclined to define a prettier rule and reformat them in a separate PR to make this one less noisy if that'd make sense to you. The reason for these getting removed is that the way we're adding the schema information is parsing the yaml, fetching the schema from the docs, then writing it, which results in the file getting reformatted based on the tool conventions (we use https://www.npmjs.com/package/js-yaml here).
@@ -25,7 +25,7 @@ arguments: | |||
- | |||
name: p | |||
type: | |||
- resolvesToArray # of resolvesToNumber |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was keeping this comments as a note for later, as we would like to extend type definition for generic lists and hashes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that makes sense - same as above, those got removed due to the yaml parser not preserving comments, but will try and find a way to avoid that.
pipeline: | ||
- | ||
$sample: | ||
size: 100 | ||
- | ||
$group: | ||
_id: ~ | ||
_id: null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noted that both ~
and null
can be used according to the spec.
onNull: | ||
bytes: !!binary AAAAAAAAAAAAAAAAAABAMA== |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This new notation seems less legible to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, will look at tweaking the generator to ensure it preserves the old notation.
@@ -24,99 +24,122 @@ arguments: | |||
A positive integral expression that is either a constant or depends on the _id value for $group. | |||
tests: | |||
- | |||
name: 'Null and Missing Values' | |||
link: 'https://www.mongodb.com/docs/manual/reference/operator/aggregation/firstN/#null-and-missing-values' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed a few tests that no longer seem to point to anything in the docs - e.g. "firstN:Null and Missing Values".
This example is in the "Behavior" part of the docs.
https://www.mongodb.com/docs/manual/reference/operator/aggregation/firstN/#null-and-missing-values
"type": "string", | ||
"enum": [ | ||
"array", | ||
"object", | ||
"single", | ||
"search" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, this value is not used.
"enum": [ | ||
"Array", | ||
"Binary", | ||
"Boolean", | ||
"Code", | ||
"CodeWScope", | ||
"Date", | ||
"Decimal128", | ||
"Double", | ||
"Int32", | ||
"Int64", | ||
"MaxKey", | ||
"MinKey", | ||
"Null", | ||
"ObjectId", | ||
"BSONRegExp", | ||
"String", | ||
"BSONSymbol", | ||
"Timestamp", | ||
"Undefined", | ||
"Document", | ||
"Number" | ||
], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll have to converge on the same type names for operator return types and argument types. I'm fine with changing the names in expressions.php
to be consistent with other tools.
"Binary", | ||
"Boolean", | ||
"Code", | ||
"CodeWScope", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the CodeWScope
type specific to mongosh?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the deprecated code_w_s bson type. Realistically, I don't expect any of the more advanced types to be found in the docs, but I generated the schema from the type definitions and opted not to manually edit it to make it easier to maintain if it ever changes. If we wanted to clean it up, we can definitely remove the deprecated types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@GromNaN: For additional context, PHPC conditionally encodes Code or CodeWScope based on whether the MongoDB\BSON\Javascript object was constructed with a non-null $scope
parameter (see: php_phongo_javascript_init
).
generator/config/stage/project.yaml
Outdated
@@ -11,7 +11,7 @@ arguments: | |||
name: specification | |||
type: | |||
- expression | |||
variadic: object | |||
variadic: object # XXX: This should somehow allow explicit typings for { fieldName: 1|0 } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type is "expression", because any expression is accepted to compute a value.
This is very much a POC, but I figured I'd open up a PR to get early feedback on the direction we're heading in.
For context, the devtools team wants to use the yaml definitions from the PHP driver to build an autocomplete experience for the aggregation pipelines in mongosh/compass. We've built a tool to convert the yaml to typescript definitions and it seems to be working well for the base case, but to polish it, we'd need to enhance the yaml definitions to allow us to generate more precise type definitions.
Here's a list of the high-level changes we're proposing (will be updated over time):
$comment
fields (since most json parsers are not happy about it).SimplifiedSchema
from themongodb-schema
package. This is a little bit verbose since it supports multiple types per field, even though the docs rarely have heterogeneous schemas.mongodb-schema
package to infer a probabilistic schema from the example collection.firstN:Null and Missing Values
.