Soft reboot prep #4133

Merged
merged 3 commits into from
Jun 6, 2025
cgwalters
Member

@cgwalters cgwalters commented Jun 5, 2025

Prep for #4119


kolet: Capture error messages from mkfifo

When Go's `exec.Command` is used naively like this, the child's
stderr is captured and then lost.

With soft reboot, it turned out that this `mkfifo` was what was
failing, but all we got was "Child process exited with code 1".
There are something like 5 different processes on two different
hosts that message could be referring to, and we *really* thought
it was about the code under test, not the framework setting
it up.
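
As a rough illustration of the idea (not the actual kolet change), one way to keep a child's stderr is to point `cmd.Stderr` at a buffer and fold its contents into the returned error. The helper name `runWithStderr` and the failing `mkfifo` path are hypothetical and chosen only for the demo.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"strings"
)

// runWithStderr is a hypothetical helper (not part of mantle/kolet).
// It runs a command and, on failure, appends the child's stderr to the
// returned error so the underlying message is not lost.
func runWithStderr(name string, args ...string) error {
	var stderr bytes.Buffer
	cmd := exec.Command(name, args...)
	cmd.Stderr = &stderr // collect stderr in memory so it can be reported on failure
	if err := cmd.Run(); err != nil {
		return fmt.Errorf("running %s %s: %w: %s",
			name, strings.Join(args, " "), err, strings.TrimSpace(stderr.String()))
	}
	return nil
}

func main() {
	// Assumed failure case: the parent directory does not exist, so mkfifo
	// fails and its own diagnostic becomes part of the error.
	if err := runWithStderr("mkfifo", "/nonexistent-dir/fifo"); err != nil {
		fmt.Println(err)
	}
}
```

With this shape, the error says which command failed and why, rather than only reporting an exit code.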


kola: Don't drop stderr on the floor from unit starting function

Amazingly, we had *two* places that dropped stderr; this was
the second. The default `c.SSH` captures logs from test failures,
but this was an *infra* failure, so we need to drop down to the
raw ssh tool.

(This could use a big cleanup but that's...a bigger project)


mantle: Log commands executed over ssh and return code

This definitely adds some chatter at debug level, but I think
it's really worth it.
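
A minimal sketch of this kind of logging, assuming a local child process as a stand-in for an SSH session and the standard `log/slog` package rather than whatever logger mantle actually uses. The function name `runLogged` is hypothetical.

```go
package main

import (
	"errors"
	"log/slog"
	"os"
	"os/exec"
)

// runLogged is a hypothetical wrapper: it logs the command before running it
// and logs the return code afterwards, both at debug level.
// A local process stands in for a command executed over ssh.
func runLogged(logger *slog.Logger, name string, args ...string) (int, error) {
	if logger == nil {
		logger = slog.Default()
	}
	logger.Debug("running command", "cmd", name, "args", args)

	err := exec.Command(name, args...).Run()

	code := 0
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		code = exitErr.ExitCode() // the process ran and exited non-zero
	} else if err != nil {
		code = -1 // the process could not be started at all
	}
	logger.Debug("command finished", "cmd", name, "rc", code)
	return code, err
}

func main() {
	// Enable debug output so both log lines are visible.
	logger := slog.New(slog.NewTextHandler(os.Stderr,
		&slog.HandlerOptions{Level: slog.LevelDebug}))
	_, _ = runLogged(logger, "sh", "-c", "exit 3")
}
```

Because both messages are emitted at debug level, the extra chatter only shows up when debug logging is turned on.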


cgwalters added 3 commits June 5, 2025 16:56
@jmarrero
Member

jmarrero commented Jun 6, 2025

/retest-required

@jlebon jlebon enabled auto-merge (rebase) June 6, 2025 13:44
@jlebon
Member

jlebon commented Jun 6, 2025

Prow failing on coreos/rhel-coreos-config#20.

/override ci/prow/rhcos

openshift-ci bot commented Jun 6, 2025

@jlebon: Overrode contexts on behalf of jlebon: ci/prow/rhcos


@jlebon jlebon merged commit fbca096 into coreos:main Jun 6, 2025
5 checks passed