Dealing With Flakey CI Commands With a Retry Loop in Bash

One of the most frustrating things to deal with in Continuous Integration is flakey commands. Whether it’s flakey tests, or intermittent networking issues, when your build fails for issues outside of your control not only does it cause frustration, it reduces the trust in your CI process.

One strategy for dealing with this type of issue is to introduce some retry logic into your commands. This can easily be accomplished with good old bash.

For example, pretend that I have $FLAKEY_COMMAND and I want to retry it three times before finally failing my build. I could wrap the whole thing up in a bash loop like this.

counter=1
max=3
$FLAKEY_COMMAND

while [[ $? -ne 0 && $counter -lt $max ]]; do
    counter=$((counter+1))
    $FLAKEY_COMMAND
done

This script will run my command, if the exit code (the output of $?) is non zero (i.e something went wrong) and my counter is less than three, then it will retry the command. You can increase or decrease the number of attempts by adjusting the max variable.

This is not a foolproof strategy, but is one approach to handle flakey commands in your CI pipeline.

This entry was posted in programming. Bookmark the permalink.

Leave a Reply