Blocks probably shouldn't retry infinitely in fail state #2152

Open
sungodmoth opened this issue Mar 24, 2025 · 1 comment · May be fixed by #2157

Comments

@sungodmoth

This is a description of an already solved problem which I think indicates a (minor) design flaw.

Earlier today I had internet problems in the house. We were able to solve the problem, but several hours later I noticed something strange about my network activity (ironically, thanks to the net block on my bar): a consistent down speed of about 9Mbps despite having nothing running that should be downloading. I also noticed that the packages block was displaying "pacman -Sy exited with non zero exit status", even though pacman was working just fine when I ran it from the terminal.

Long story short, I eventually found out that the cause of the mysterious leech on my down speed was i3status-rust repeatedly spawning pacman instances, each of which downloaded 11MB of package databases before exiting with an error. The error was somewhat opaque (something about error: GPGME error: No data) but easily resolved by deleting the contents of /tmp/checkup-db-i3statusrs-{user}. I suppose the internet outage earlier must have coincidentally hit while the block was running, corrupting the checkup database by interrupting pacman at the wrong time.

That the database was corrupted is obviously no fault of i3status-rust, but I can't help but think there's something better it could have done than repeatedly run the block with seemingly no retry delay or maximum number of tries. Indeed, I have the packages block configured to run once every 10 minutes precisely because it's relatively intensive on resources, so it seems wrong that an error in the output could cause it to run more like once every 10 seconds. This was probably close to the worst-case scenario that could come from this oversight (if a CPU-intensive rather than network-intensive block were stuck in a loop like this, I'd at least notice the fans getting loud rather than going hours without noticing the resource drain), but I suspect other blocks could be similarly affected.

I'd think that the simplest way to mitigate this issue would be to respect the block interval when retrying from an error state. For instance, the protocol when a block errors could be to

  • immediately retry a handful of times, up to some arbitrary limit;
  • if this still fails, "give up" and wait until the next time the block would ordinarily run (i.e. until the block interval has elapsed) to try again.

Something like this (sketched below) would significantly lessen the impact of this issue, considering that resource-intensive blocks are likely to be set to a high interval in the first place -- though I may well be missing some other reason this wouldn't work.
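As a rough illustration of that policy (this is not the actual i3status-rust scheduler; the `next_delay` helper and the `QUICK_RETRIES`/`QUICK_DELAY` constants are hypothetical values chosen for the sketch):

```rust
use std::time::Duration;

/// Hypothetical retry policy for the proposal above: retry promptly a few
/// times after an error, then fall back to the block's normal interval.
fn next_delay(consecutive_failures: u32, interval: Duration) -> Duration {
    const QUICK_RETRIES: u32 = 3;                    // arbitrary small retry budget
    const QUICK_DELAY: Duration = Duration::from_secs(5);

    if consecutive_failures <= QUICK_RETRIES {
        QUICK_DELAY   // immediate-ish retries, up to the limit
    } else {
        interval      // "give up" until the next ordinary run
    }
}
```

With this policy, a block configured with interval = 600 would retry at most three times in quick succession and then wait out the full ten minutes, rather than re-running pacman every few seconds.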
@bim9262
Collaborator

bim9262 commented Apr 19, 2025

It may be a good idea to allow for a maximum number of retries, or to allow back-off strategies other than linear (e.g. exponential). Currently we only have `error_interval`, which would still significantly reduce the amount of wasted resources that you saw.

For example:

[[block]]
block = "pacman"
interval = 600
error_interval = 300
format = " $icon $pacman + $aur = $both updates available "
format_singular = " $icon $both update available "
format_up_to_date = " $icon system up to date "
critical_updates_regex = "(linux|linux-lts|linux-zen)"
# aur_command should output available updates to stdout (ie behave as echo -ne "update\n")
aur_command = "yay -Qua"
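A sketch of how an exponential back-off with a retry cap might compute the delay between attempts (none of this is the crate's actual code; the `error_delay` helper and the doubling base are assumptions, and `max_retries = None` meaning "retry forever" follows the semantics described in the commits below):

```rust
use std::time::Duration;

/// Hypothetical exponential back-off: start at `error_interval` and double
/// on every further failure, but never wait longer than the block's normal
/// `interval`. `max_retries = None` keeps the old unlimited-retry behavior.
fn error_delay(
    failures: u32,
    error_interval: Duration,
    interval: Duration,
    max_retries: Option<u32>,
) -> Option<Duration> {
    if matches!(max_retries, Some(max) if failures > max) {
        return None; // retry budget exhausted; stay in the error state
    }
    let exp = failures.saturating_sub(1).min(16); // clamp to avoid huge shifts
    let backoff = error_interval.saturating_mul(1u32 << exp);
    Some(backoff.min(interval))
}
```

With the example config above (interval = 600, error_interval = 300), the first failure would wait 300s, and every failure after that would be capped at the normal 600s interval.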

bim9262 added a commit to bim9262/i3status-rust that referenced this issue Apr 22, 2025
if `max_retries` is not set then no limit is used (same as previous
behavior).

Fixes greshake#2152
bim9262 linked a pull request Apr 22, 2025 that will close this issue
bim9262 added a commit to bim9262/i3status-rust that referenced this issue Apr 30, 2025

bim9262 added a commit to bim9262/i3status-rust that referenced this issue May 4, 2025