-
Notifications
You must be signed in to change notification settings - Fork 43
[external API] alerts: the renamening #8169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
had to go back and un-rename some columns since apparently CRDB can't do that (sad face!)
i gotta stop forgetting expectorate tests
Clone, Debug, Queryable, Selectable, Insertable, Serialize, Deserialize, | ||
)] | ||
#[diesel(table_name = alert_subscription)] | ||
pub struct AlertRxSubscription { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for myself: this was moved from nexus/db-model/src/webhook_rx.rs
receiver: Query<params::AlertReceiverSelector>, | ||
state_filter: Query<params::AlertDeliveryStateFilter>, | ||
pagination: Query<PaginatedByTimeAndId>, | ||
) -> Result<HttpResponseOk<ResultsPage<views::WebhookDelivery>>, HttpError>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it weird that this is titled "list alert-deliveries", but it's returning "webhook delivery" objects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
namely: Should this be an "enum-of-one" if we expect there will be non-webhook alerts in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I wasn't really sure what to do about this endpoint, honestly. Since it lists deliveries by receiver ID, the type of the response entries will all be the same type of delivery based on the type of receiver.
They could be an enum, but ideally, there would be a single enum containing the list of deliveries, so the user only has to handle the potentially variable type a single time, and once you inspect the top level enum, all the entries are the same type. I thought about doing this, but it felt a bit awkward with dropshot's ResultsPage
--- though we could return an enum of multiple types of ResultsPage
s.
Alternatively, we could just make this endpoint specific to receiver types, so this one could be webhook-deliveries
and we could just add separate delivery list endpoints if we were to add other receiver types in future. That might be the better move, since then there's no enum at all..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I wasn't really sure what to do about this endpoint, honestly. Since it lists deliveries by receiver ID, the type of the response entries will all be the same type of delivery based on the type of receiver.
Why would we expect these to differ by receiver type?
They could be an enum, but ideally, there would be a single enum containing the list of deliveries, so the user only has to handle the potentially variable type a single time, and once you inspect the top level enum, all the entries are the same type. I thought about doing this, but it felt a bit awkward with dropshot's
ResultsPage
--- though we could return an enum of multiple types ofResultsPage
s.
A ResultsPage of enums would be great; an enum of ResultsPage would fuck up pagination magic.
Alternatively, we could just make this endpoint specific to receiver types, so this one could be
webhook-deliveries
and we could just add separate delivery list endpoints if we were to add other receiver types in future. That might be the better move, since then there's no enum at all..
Again, since multiple receiver types is speculative at this point, let's choose something relatively consistently and re-evaluate when we have a second receiver.
@@ -1246,7 +1249,7 @@ pub static DEMO_WEBHOOK_SECRET_CREATE: LazyLock<params::WebhookSecretCreate> = | |||
secret: "TRUSTNO1".to_string(), | |||
}); | |||
|
|||
// pub static DEMO_WEBHOOK_SUBSCRIPTION: LazyLock<shared::WebhookSubscription> = | |||
// pub static DEMO_ALERT_SUBSCRIPTION: LazyLock<shared::WebhookSubscription> = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I know this is commented out code (why is this here?) but should this be a shared::AlertScubscription
now?
Co-authored-by: Sean Klein <[email protected]>
Co-authored-by: Sean Klein <[email protected]>
Co-authored-by: Sean Klein <[email protected]>
@david-crespo re:
Yeah, I agree that this feels a bit weird. One thing I considered doing is also having a |
Oh, @smklein, one other thing: there are a couple of places where we now have tables that have a few columns with unfortunate names ("event_class" and "event_id" rather than "alert_class" and "alert_id") because CRDB doesn't support renaming columns idempotently. Do you think it's worth changing the migrations to drop those tables and create new ones with nicer names, instead of just renaming the table? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ughhhh apparently this won't work as written, since you can't "ALTER TYPE IF EXISTS":
──── STDERR: omicron-nexus::test_all integration_tests::schema::dbinit_equals_sum_of_all_up
log file: /tmp/test_all-fef1124d0eceaf7d-dbinit_equals_sum_of_all_up.3128075.0.log
note: configured to log to "/tmp/test_all-fef1124d0eceaf7d-dbinit_equals_sum_of_all_up.3128075.0.log"
thread 'integration_tests::schema::dbinit_equals_sum_of_all_up' panicked at nexus/tests/integration_tests/schema.rs:70:38:
Failed to execute update: Error { kind: Db, cause: Some(DbError { severity: "ERROR", parsed_severity: Some(Error), code: SqlState(E42601), message: "at or near \"exists\": syntax error", detail: Some("source SQL:\nALTER TYPE IF EXISTS omicron.public.webhook_event_class\n ^"), hint: Some("try \\h ALTER TYPE"), position: None, where_: None, schema: None, table: None, column: None, datatype: None, constraint: None, file: Some("lexer.go"), line: Some(271), routine: Some("Error") }) }
I guess we'll have to drop the previous enums and add new ones, that's a bummer...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
took a pass comparing what we had determined in the CLI with the changes you're proposing to the API
@@ -3660,79 +3693,48 @@ pub trait NexusExternalApi { | |||
/// queued for re-delivery. | |||
#[endpoint { | |||
method = POST, | |||
path = "/v1/webhooks/receivers/{receiver}/probe", | |||
tags = ["system/webhooks"], | |||
path = "/v1/webhook-receivers/{receiver}/probe", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you consider nesting webhook-receivers under alerts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I had initially wanted to do /v1/alert-receivers/webhooks
. however, we can't nest routes under a route which can also look up a resource by name. because there's a /v1/alert-receivers/{name-or-id}
route, we can't also have a /v1/alert-receivers/webhooks/{name-or-id}
route, since it's unclear whether /webhooks
should be treated as a receiver name or as a routable path segment.
i did also consider nesting all of this stuff under a top-level /v1/alerts
, so /v1/alerts/receivers
, /v1/alerts/webhook-receivers
, and so on. however, /v1/alerts
is also the route for looking up the actual alert resource (currently used for resend but maybe also for actually getting the payload etc in future). alerts are only looked up by UUID and never by name, so we could nest other routes under /v1/alerts
, but it felt a bit weird, and i didn't want to put the alert-lookup route under /v1/alerts/alerts
because...that's gross.
also, @david-crespo had previously told me that we try to keep the public API as "flat" as possible rather than nesting, and i think the /v1/alerts
, /v1/alert-receivers
, /v1/alert-deliveries
etc structure is closer to what we've done elsewhere? if either of you have suggestions for a better structure given all that, though, i'm all ears!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@david-crespo can you comment? It seems like "as flat as possible... but not flatter" might be the addendum.
path = "/v1/webhooks/receivers/{receiver}/probe", | ||
tags = ["system/webhooks"], | ||
path = "/v1/webhook-receivers/{receiver}/probe", | ||
tags = ["system/alerts"], | ||
}] | ||
async fn webhook_receiver_probe( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this particular to webhook receivers? In the CLI we had decided to call this subcommand oxide alert receiver probe
i.e. assuming it would not be specific to webhooks but general for all receivers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it returns a response model that is webhook-specific (it contains information about the status code of the HTTP response etc).
we could make that model be an enum with a single variant if we think it's preferable for this to be a route that applies to all receiver types. but, my thinking was that it's possible there might be some future receiver types that cannot be probed (can you synchronously probe an email?)
path = "/v1/alert-deliveries", | ||
tags = ["system/alerts"], | ||
}] | ||
async fn alert_delivery_list( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we had decided to call this alert receiver log
in the CLI. It is particular to a specific receiver, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it's particular to a receiver, which is taken as a query param. i think log
is a good name for the CLI, but in the API it felt like it should be a "list of delivery resources"...it could also be /v1/alert-receivers/{name-or-id}/deliveries
or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/v1/alert-receivers/{name-or-id}/deliveries
That's what I was thinking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, we should be able to do that. i'll note that the previous route was /v1/webhook-deliveries?receiver=...
and i think that may also have been at @david-crespo's urging IIRC.
#[endpoint { | ||
method = PUT, | ||
path = "/v1/webhooks/receivers/{receiver}", | ||
tags = ["system/webhooks"], | ||
method = POST, | ||
path = "/v1/alert-receivers/{receiver}/subscriptions", | ||
tags = ["system/alerts"], | ||
}] | ||
async fn webhook_receiver_update( | ||
async fn alert_receiver_subscription_add( | ||
rqctx: RequestContext<Self::Context>, | ||
path_params: Path<params::WebhookReceiverSelector>, | ||
params: TypedBody<params::WebhookReceiverUpdate>, | ||
) -> Result<HttpResponseUpdatedNoContent, HttpError>; | ||
path_params: Path<params::AlertReceiverSelector>, | ||
params: TypedBody<params::AlertSubscriptionCreate>, | ||
) -> Result<HttpResponseCreated<views::AlertSubscriptionCreated>, HttpError>; | ||
|
||
/// Delete webhook receiver | ||
/// Remove alert receiver subscription | ||
#[endpoint { | ||
method = DELETE, | ||
path = "/v1/webhooks/receivers/{receiver}", | ||
tags = ["system/webhooks"], | ||
path = "/v1/alert-receivers/{receiver}/subscriptions/{subscription}", | ||
tags = ["system/alerts"], | ||
}] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you made what I thought was a good suggestion in the CLI to call these oxide alert receiver subscribe
and oxide alert receiver unsubscribe
. Do you now prefer subscription add/remove? I think I prefer the former, but my only strong preference is that CLI and API match in this regard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i like the "subscribe"/"unsubscribe" naming for the CLI. per this comment from @david-crespo we prefer to use verbs like "add"/"remove" in API routes because they're consistent with other API operations, rather than descriptive verbs like "subscribe"/"unsubscribe". personally i think i would strongly prefer the more descriptive verbs in the CLI and don't have strong preferences about whether we should also use that in the API...perhaps it's more important for both to be consistent with each other than to use the more descriptive verbs, i dunno...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds fine; just wanted to make sure we're making the decision eyes open.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i definitely think there's value in using the same verbs in the CLI and the API, because then the CLI becomes a sort of teaching tool for the API: if you've done something manually, you know exactly where to look if you want to do it programmatically. but, i'm not totally sure how strongly we've weighed that against other concerns in the past. are there any cases where we've intentionally chosen to use different verbs (or nouns!) in the CLI, or are they always the same?
What about creation via
My suggestion is that we do one or the other for now i.e. "list all receivers" (which happen to just be webhooks) or "list webhook receivers" (which happen to be all receivers). In the future, I can't imagine the utility for listing JUST the webhooks, but I have been known to lack imagination!
Given that a second flavor of webhook is speculation at this point, I would suggest just pick something and re-evaluate when we have something more concrete in the future. |
Unfortunately we can't have a route like that, as receivers are looked up by name, and there's an ambiguity as to whether |
In conversation with @ahl, we have determined that the external API for webhooks added in #7277 should be changed to focus on "alerts" as the first-class user-facing concept, with "webhooks" as one delivery mechanism for alerts. This way, we can talk about alerts as an entity in the API that exist independently of webhooks that deliver alerts, and the same alert types can be shared with other alert delivery mechanisms if any are added in the future.
What we currently refer to as "webhook events" and "webhook event classes" are therefore renamed to "alerts" and "alert classes". The current concept of "webhook receivers" is generalized to an "alert receiver" resource, of which webhook receivers are (currently) the only subtype. This way, if we add other mechanisms of delivering alerts in the future (email, first-class Slack integration, etc), we can introduce new subtypes of alert receivers. I've restructured the API to have both
/v1/alert-receivers/...
and/v1/webhook-receivers/...
routes, with operations common to all alert receivers (list, view, add/remove subscriptions, delete) under thealert-receivers
route, and operations related to webhook-specific configuration (add/remove secrets, probe, deliveries) under thewebhook-receivers
route. I've also changed theAlertReceiver
view to have a "kind" enum that stores the subtype-specific configuration; currently, this will only ever be "webhook", but I thought it was worth doing this now to make future additions cause less breakage for API consumers.This is, admittedly, a somewhat large diff, but fortunately, most of it is just renaming stuff and moving it around. Reviewers can focus more or less exclusively to the changes to the external API routes and models, and maybe the database migrations. Any mistakes while renaming and moving things around have already been caught by the Rust compiler. :)