[BUG]: Switch agent to use network-online.target instead of network.target #5139

Open
1 of 4 tasks
dp-robleady opened this issue Mar 6, 2025 · 0 comments

What happened?

As mentioned in the closed issue #4319, could you consider switching the systemd unit that starts and stops the DevOps agent to use After=network-online.target?

As the journalctl output below shows, the agent starts before the network has an IP address.

In the agent logs, the first connection to the upstream API fails right after startup because the network connection isn't up yet; the agent retries about 10 seconds later.

Changing vsts.agent.service.template to the following ensures that the first connection to dev.azure.com works.

[Unit]
Description={{Description}}
After=network-online.target
Requires=network-online.target

[Service]
ExecStart={{AgentRoot}}/runsvc.sh
User={{User}}
WorkingDirectory={{AgentRoot}}
KillMode=process
KillSignal=SIGTERM
TimeoutStopSec=5min

[Install]
WantedBy=multi-user.target

I think this also helps when the agent is configured in an Environment where it might be rebooted as part of a deployment task. Because systemd reverses the After= ordering on shutdown, this change should hopefully ensure that the agent is stopped before the network goes down, which I suspect is what causes the issues pushing logs upstream.

Versions

4.258-1
Ubuntu 24.04

Environment type (Please select at least one environment where you face this issue)

  • Self-Hosted
  • Microsoft Hosted
  • VMSS Pool
  • Container

Azure DevOps Server type

dev.azure.com (formerly visualstudio.com)

Azure DevOps Server Version (if applicable)

No response

Operating system

Ubuntu 22.04

Version control system

No response

Relevant log output

journalctl output after a reboot:

Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: Reached target Network.
Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: Reached target Host and Network Name Lookups.
Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: Starting OpenBSD Secure Shell server...
Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: Starting Permit User Sessions...
Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: Started Azure Pipelines Agent (xxxxxxxx.yyyyyyyy.gfx-test-vsn-rob-01).
Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: Finished Permit User Sessions.
Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: Starting Set console scheme...
Mar 06 07:57:20 gfx-test-vsn-rob-01 runsvc.sh[553]: .path=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin
Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: Finished Set console scheme.
Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: Created slice Slice /system/getty.
Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: Started Getty on tty1.
Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: Reached target Login Prompts.
Mar 06 07:57:20 gfx-test-vsn-rob-01 sshd[558]: Server listening on 0.0.0.0 port 22.
Mar 06 07:57:20 gfx-test-vsn-rob-01 sshd[558]: Server listening on :: port 22.
Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: Started OpenBSD Secure Shell server.
Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: Reached target Multi-User System.
Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: Reached target Graphical Interface.
Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: Starting Record Runlevel Change in UTMP...
Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: systemd-update-utmp-runlevel.service: Deactivated successfully.
Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: Finished Record Runlevel Change in UTMP.
Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: Startup finished in 5.074s (kernel) + 1.172s (userspace) = 6.247s.
Mar 06 07:57:20 gfx-test-vsn-rob-01 runsvc.sh[556]: v20.17.0
Mar 06 07:57:20 gfx-test-vsn-rob-01 runsvc.sh[560]: Starting Agent listener with startup type: service
Mar 06 07:57:20 gfx-test-vsn-rob-01 runsvc.sh[560]: Started listener process
Mar 06 07:57:20 gfx-test-vsn-rob-01 runsvc.sh[560]: Started running service
Mar 06 07:57:20 gfx-test-vsn-rob-01 kernel: Agent.Listener[567]: memfd_create() called without MFD_EXEC or MFD_NOEXEC_SEAL set
Mar 06 07:57:22 gfx-test-vsn-rob-01 kernel: igb 0000:0d:00.0 enp13s0: igb: enp13s0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Mar 06 07:57:22 gfx-test-vsn-rob-01 systemd-networkd[388]: enp13s0: Gained carrier
Mar 06 07:57:23 gfx-test-vsn-rob-01 systemd-networkd[388]: enp0s31f6: Gained carrier
Mar 06 07:57:23 gfx-test-vsn-rob-01 kernel: e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
Mar 06 07:57:24 gfx-test-vsn-rob-01 systemd-networkd[388]: enp13s0: Gained IPv6LL
Mar 06 07:57:24 gfx-test-vsn-rob-01 systemd-networkd[388]: enp0s31f6: Gained IPv6LL
Mar 06 07:57:25 gfx-test-vsn-rob-01 systemd-networkd[388]: enp13s0: DHCPv4 address 10.xx.yy.zz/24 via 10.xx.yy.254
Mar 06 07:57:27 gfx-test-vsn-rob-01 systemd-networkd[388]: enp0s31f6: DHCPv4 address 10.xx.yy.zz/24 via 10.xx.yy.254
Mar 06 07:57:33 gfx-test-vsn-rob-01 runsvc.sh[560]: Scanning for tool capabilities.
Mar 06 07:57:33 gfx-test-vsn-rob-01 runsvc.sh[560]: Connecting to the server.
Mar 06 07:57:35 gfx-test-vsn-rob-01 runsvc.sh[560]: 2025-03-06 07:57:35Z: Listening for Jobs


Agent logs:
[2025-03-06 07:57:21Z WARN VisualStudioServices] Attempt 1 of GET request to https://dev.azure.com/xxxxxxxx/_apis/connectionData failed (Socket Error: TryAgain). The operation will be retried in 10.8907567 seconds.
[2025-03-06 07:57:32Z WARN VisualStudioServices] Authentication failed with status code 401.
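
To put a number on the gap, here is a rough Python sketch that reads a few lines copied from the journalctl output above and measures how long the agent was running before the first DHCPv4 address arrived. The helper name `first_timestamp` and the assumption that all lines fall on the same day are mine, not part of the agent or systemd tooling.

```python
from datetime import datetime

# Lines copied verbatim from the journalctl output in this issue.
JOURNAL = """\
Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: Reached target Network.
Mar 06 07:57:20 gfx-test-vsn-rob-01 systemd[1]: Started Azure Pipelines Agent (xxxxxxxx.yyyyyyyy.gfx-test-vsn-rob-01).
Mar 06 07:57:25 gfx-test-vsn-rob-01 systemd-networkd[388]: enp13s0: DHCPv4 address 10.xx.yy.zz/24 via 10.xx.yy.254
Mar 06 07:57:27 gfx-test-vsn-rob-01 systemd-networkd[388]: enp0s31f6: DHCPv4 address 10.xx.yy.zz/24 via 10.xx.yy.254
"""


def first_timestamp(journal: str, needle: str) -> datetime:
    """Return the time of day of the first journal line containing `needle`.

    Hypothetical helper: it only parses the HH:MM:SS field, so it assumes
    every line belongs to the same day (true for the excerpt above).
    """
    for line in journal.splitlines():
        if needle in line:
            # Short journal format: "<Mon> <DD> <HH:MM:SS> <host> <unit>: <msg>"
            return datetime.strptime(line.split()[2], "%H:%M:%S")
    raise ValueError(f"no journal line contains {needle!r}")


agent_started = first_timestamp(JOURNAL, "Started Azure Pipelines Agent")
first_address = first_timestamp(JOURNAL, "DHCPv4 address")
gap_seconds = (first_address - agent_started).total_seconds()

print(f"Agent started {gap_seconds:.0f}s before the first DHCPv4 address")
```

On this excerpt the agent is started about 5 seconds before any interface has an IPv4 address, which matches the failed first request at 07:57:21 in the agent logs.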