feat: primary resource caching for followup reconciliation(s) #2761
base: main
9019eb9
to
f8b6dc6
Signed-off-by: Attila Mészáros <[email protected]>
19dd31c
to
3b99f78
...ore/src/main/java/io/javaoperatorsdk/operator/api/reconciler/PrimaryUpdateAndCacheUtils.java
...operatorsdk/operator/baseapi/statuscache/primarycache/StatusPatchPrimaryCacheReconciler.java
…tor/api/reconciler/PrimaryUpdateAndCacheUtils.java Co-authored-by: Martin Stefanko <[email protected]>
...ore/src/main/java/io/javaoperatorsdk/operator/api/reconciler/PrimaryUpdateAndCacheUtils.java
…tor/api/reconciler/PrimaryUpdateAndCacheUtils.java Co-authored-by: Antonio <[email protected]>
LGTM
When using the PatchAndCacheStatusWithLock approach, we will need to explicitly use optimistic locking here, which in theory should be fine.
However, the default retry mechanism provided by JOSDK when a 409 conflict happens doesn't pick up the latest resource status, forcing us to implement a custom retry mechanism for this. Otherwise the default retry mechanism is not useful here.
Anyway, I can raise a separate issue for that default mechanism to see if it can be improved there.
Besides that, great work and thanks a lot for this PR.
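To make the concern above concrete, here is a minimal sketch of a custom retry that re-reads the latest resource after a conflict. Everything in it is a hypothetical stand-in written for illustration: `FakeServer`, `Resource`, `ConflictException` and `updateWithRetry` are not JOSDK or fabric8 types, and the in-memory server only imitates optimistic locking on a `resourceVersion`.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Hypothetical stand-ins only: none of these types come from JOSDK or the fabric8 client.
class ConflictRetrySketch {

  /** Minimal resource: one status value guarded by a resourceVersion. */
  static final class Resource {
    final long resourceVersion;
    final int statusValue;

    Resource(long resourceVersion, int statusValue) {
      this.resourceVersion = resourceVersion;
      this.statusValue = statusValue;
    }
  }

  /** Stands in for an HTTP 409 response to an update with a stale resourceVersion. */
  static final class ConflictException extends RuntimeException {}

  /** In-memory stand-in for an API server that enforces optimistic locking. */
  static final class FakeServer {
    private final AtomicReference<Resource> stored = new AtomicReference<>(new Resource(1, 0));

    Resource get() {
      return stored.get();
    }

    Resource update(Resource desired) {
      Resource current = stored.get();
      if (current.resourceVersion != desired.resourceVersion) {
        throw new ConflictException(); // caller's copy is stale
      }
      Resource next = new Resource(current.resourceVersion + 1, desired.statusValue);
      if (!stored.compareAndSet(current, next)) {
        throw new ConflictException(); // lost a race with a concurrent writer
      }
      return next;
    }
  }

  /**
   * Applies the change to the cached copy first; on conflict, re-reads the latest
   * resource from the server and re-applies the change on top of it.
   */
  static Resource updateWithRetry(
      FakeServer server, Resource cached, UnaryOperator<Resource> change, int maxAttempts) {
    Resource base = cached;
    for (int attempt = 1; ; attempt++) {
      try {
        return server.update(change.apply(base));
      } catch (ConflictException e) {
        if (attempt >= maxAttempts) {
          throw e; // give up and surface the conflict
        }
        base = server.get(); // the step a plain "re-run with the same input" retry lacks
      }
    }
  }

  public static void main(String[] args) {
    FakeServer server = new FakeServer();
    Resource stale = server.get();
    server.update(new Resource(stale.resourceVersion, 5)); // simulate another writer
    Resource result =
        updateWithRetry(server, stale, r -> new Resource(r.resourceVersion, r.statusValue + 1), 3);
    System.out.println(result.resourceVersion + " " + result.statusValue);
  }
}
```

The design point is only that the change is expressed as a function of the current resource, so it can be re-applied to whatever the server returns after a conflict.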
...ore/src/main/java/io/javaoperatorsdk/operator/api/reconciler/PrimaryUpdateAndCacheUtils.java
(P p, KubernetesClient c) -> c.resource(primary).updateStatus());
}

public static <P extends HasMetadata> P patchAndCacheStatus(
Can we add a small JavaDoc for this exposed method as well?
Still would be nice to have a JavaDoc for this public method.
public class PeriodicTriggerEventSource<P extends HasMetadata>
extends AbstractEventSource<Void, P> {

public static final int DEFAULT_PERIOD = 30;
- public static final int DEFAULT_PERIOD = 30;
+ private static final int DEFAULT_PERIOD = 30;
This is for tests, and for reusability I would stick with the public approach.
.untilAsserted(
() -> {
assertThat(reconciler.errorPresent).isFalse();
assertThat(reconciler.latestValue).isGreaterThan(10);
To guarantee that no status update is lost, we could check that all values from 1 to 10 are present in a list, in that exact order.
Otherwise, the isGreaterThan(10) assertion will eventually be true regardless of the caching.
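A minimal sketch of the stricter check suggested here, in plain Java. It assumes the reconciler would record every value it observes into a list, which is a hypothetical addition and not something shown in this PR; `OrderedValuesCheck` and `observedAllInOrder` are illustrative names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper for illustration; not part of the PR or of JOSDK.
class OrderedValuesCheck {

  /** Returns true only if the observed values start with exactly 1, 2, ..., upTo in that order. */
  static boolean observedAllInOrder(List<Integer> observed, int upTo) {
    if (observed.size() < upTo) {
      return false; // not enough reconciliations recorded yet
    }
    List<Integer> expected =
        IntStream.rangeClosed(1, upTo).boxed().collect(Collectors.toList());
    return observed.subList(0, upTo).equals(expected);
  }

  public static void main(String[] args) {
    // A skipped value (3) fails the check even though the latest value exceeds 10.
    List<Integer> withGap = Arrays.asList(1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12);
    System.out.println(observedAllInOrder(withGap, 10));
  }
}
```

Inside the existing `untilAsserted` block, an equivalent check on such a list could likely be written with AssertJ's list assertions such as `containsExactly` or `startsWith`.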
Hi, there is that error check, which should indicate whether the values were monotonic, but yeah, we can definitely improve the asserts.
.../io/javaoperatorsdk/operator/baseapi/statuscache/primarycache/StatusPatchPrimaryCacheIT.java
...ava/io/javaoperatorsdk/operator/baseapi/statuscache/withlock/StatusPatchCacheWithLockIT.java
…aseapi/statuscache/primarycache/StatusPatchPrimaryCacheIT.java Co-authored-by: Antonio <[email protected]>
…tor/api/reconciler/PrimaryUpdateAndCacheUtils.java Co-authored-by: Antonio <[email protected]>
Probably the best would be to not use the approach with the lock, but the lockless alternative.
The only drawback with the lockless alternative is that it will require us to handle the cache in the operator itself.
Yes, but that is as simple as in the integration test. We might iterate on this in the future, but in the short term I think that is best for you.
Thinking more about this, the locked version in this form is not very useful, since it fails to fulfill its purpose, which is to make sure that the resource is updated with a generated id. On the other hand we could do something that (now deprecated in client) is the
Some kind of retry would also be nice for patch without the lock. So I will spend some more time on this to improve it.
Curious here, what do you mean by
That would be awesome! Thanks
This reverts commit 84eec7b.
This reverts commit 68ca625.
Signed-off-by: Chris Laprun <[email protected]>
Basically that you configure fabric8 client to retry.
@afalhambra-hivemq @metacosm @xstefank added docs and made some improvements.
* Updates the resource and adds it to the {@link PrimaryResourceCache} provided. Optimistic
* locking is not required.
*
* @param primary resource*
- * @param primary resource*
+ * @param primary resource
* Patches the resource with JSON Merge patch and adds it to the {@link PrimaryResourceCache}
* provided. Optimistic locking is not required.
*
* @param primary resource*
- * @param primary resource*
+ * @param primary resource
* Patches the resource with JSON Patch and adds it to the {@link PrimaryResourceCache} provided.
* Optimistic locking is not required.
*
* @param primary resource*
- * @param primary resource*
+ * @param primary resource
private static final Logger log = LoggerFactory.getLogger(PrimaryUpdateAndCacheUtils.class);

/**
* Makes sure that the up-to-date primary resource will be present during the next reconciliation.
Now that the signature of the method has changed and the checkResourceVersionPresent method is removed, it would be nice to make it clear and explicit in the JavaDocs which methods require optimistic locking and which don't.
Therefore,
the framework provides facilities
to cover these use cases withing [`PrimaryUpdateAndCacheUtils`](https://github.com/operator-framework/java-operator-sdk/blob/main/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/reconciler/PrimaryUpdateAndCacheUtils.java#L16).
- to cover these use cases withing [`PrimaryUpdateAndCacheUtils`](https://github.com/operator-framework/java-operator-sdk/blob/main/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/reconciler/PrimaryUpdateAndCacheUtils.java#L16).
+ to cover these use cases with [`PrimaryUpdateAndCacheUtils`](https://github.com/operator-framework/java-operator-sdk/blob/main/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/api/reconciler/PrimaryUpdateAndCacheUtils.java#L16).
recent version of the resource. Note that it is not necessarily the version from the update response, it can be newer
since other parties can do additional updates meanwhile, but if not explicitly modified, it will contain the up-to-date
status.
"Note that it is not necessarily the version from the update response, it can be newer since other parties can do additional updates meanwhile, but if not explicitly modified, it will contain the up-to-date status."
It is not so clear what you mean here. Do you mean the updated resource version?
In other words, when to evict the resource from the cache. Typically, as show in the [integration test](https://github.com/operator-framework/java-operator-sdk/blob/main/operator-framework/src/test/java/io/javaoperatorsdk/operator/baseapi/statuscache/primarycache)
you can have a counter in status to check on that.

Since all of this happens explicitly, you cannot use it for now with managed dependent resources and workflows.
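The eviction idea in the docs snippet above (a counter in status decides when the cached copy can be dropped) might look roughly like the following sketch. It is not the `PrimaryResourceCache` from this PR: `CounterEvictingCache`, `Primary`, and the method names and signatures here are assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiPredicate;

// Hypothetical sketch; names and signatures are assumptions, not the PR's API.
class CounterEvictingCache {

  /** Minimal primary resource: a name plus a counter kept in its status. */
  static final class Primary {
    final String name;
    final int statusCounter;

    Primary(String name, int statusCounter) {
      this.name = name;
      this.statusCounter = statusCounter;
    }
  }

  private final Map<String, Primary> cache = new ConcurrentHashMap<>();
  // Decides eviction from (cached copy, copy delivered by the informer).
  private final BiPredicate<Primary, Primary> evictionPredicate;

  CounterEvictingCache(BiPredicate<Primary, Primary> evictionPredicate) {
    this.evictionPredicate = evictionPredicate;
  }

  /** Called by the reconciler right after it has updated the status. */
  void cacheResource(Primary updated) {
    cache.put(updated.name, updated);
  }

  /** Returns the cached copy while the informer lags behind, otherwise the informer's copy. */
  Primary getFreshResource(Primary fromInformer) {
    Primary cached = cache.get(fromInformer.name);
    if (cached == null) {
      return fromInformer;
    }
    if (evictionPredicate.test(cached, fromInformer)) {
      cache.remove(fromInformer.name); // informer caught up, cached copy no longer needed
      return fromInformer;
    }
    return cached;
  }

  public static void main(String[] args) {
    CounterEvictingCache cache =
        new CounterEvictingCache((cached, informer) -> informer.statusCounter >= cached.statusCounter);
    cache.cacheResource(new Primary("example", 2));
    // The informer still reports the old counter, so the cached copy is used.
    System.out.println(cache.getFreshResource(new Primary("example", 1)).statusCounter);
  }
}
```

Under these assumptions, the reconciler would call `getFreshResource` at the start of each reconciliation and `cacheResource` after each status update.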
"Since all of this happens explicitly, you cannot use it for now with managed dependent resources and workflows."
Can you please elaborate? Under what conditions can PrimaryUpdateAndCacheUtils.ssaPatchAndCacheStatus(primary, freshCopy, context, cache) not be used?
}
```

In the background `PrimaryUpdateAndCacheUtils.ssaPatchAndCacheStatusWith` puts the result of the update into an internal
- In the background `PrimaryUpdateAndCacheUtils.ssaPatchAndCacheStatusWith` puts the result of the update into an internal
+ In the background `PrimaryUpdateAndCacheUtils.ssaPatchAndCacheStatus` puts the result of the update into an internal
#### Additional remarks

As shown in the integration tests, there is no optimistic locking used when updating the
[resource](https://github.com/operator-framework/java-operator-sdk/blob/main/operator-framework/src/test/java/io/javaoperatorsdk/operator/baseapi/statuscache/internal/StatusPatchCacheReconciler.java#L41)
In this particular IT, optimistic locking is in place, or am I completely wrong here?
It provides facilities to cache the primary resource for follow-up reconciliations, thus ensuring that the reconciler always handles the up-to-date primary.
Later, we might extend this beyond status to the rest of the resource (the cache already supports that); for now, the utility methods focus on status.