CTO and co-founder of Signal Sciences. Author and speaker on software engineering, devops, and security.

Using curl in Automation

Learn how to optimize curl for downloading network resources in your batch scripts, provisioning systems and continuous deployment pipelines

Provisioning systems, batch scripts, and CI/CD pipelines often need to fetch an external (network) resource. While it’s best to eliminate as many external dependencies and network calls as possible, sometimes it can’t be helped, in which case the omnipresent curl is useful. However, by default, curl isn’t well optimized for automation. In particular, it:

  • shows a progress meter designed for humans. In CI/CD logs, progress meters add no value and make horrible log output.
  • doesn’t follow redirects. You almost always want to follow redirects.
  • doesn’t time out. The lack of a timeout can (and does) cause CI/CD runs to hang. I’ve seen Jenkins and Travis CI runs take hours due to a hanging download.
  • doesn’t fail (or exit non-zero) on 404s. As long as curl received whatever the server sent back, it’s a success, and the HTTP code doesn’t matter. This is probably not your definition of success.
  • doesn’t retry on transient errors. Totally fine for humans. Totally bad for CI/CD runs.

This is no surprise. Curl has been in development for over two decades and grew organically, so it has numerous, often non-obvious, flags to control its behavior.

Crucial Flags

Here are the most important flags for use in batch scripts.

Turn off the progress bar

The -s or --silent flag turns off the progress meter. Unfortunately it also silences error messages, which you probably do want. So...

Turn back on error output

The -S or --show-error flag turns error output back on. You probably want this.

Fail on 404s

The -f or --fail flag causes curl to exit with code 22 when the server returns an HTTP status of 400 or higher, rather than saving the error page as if it were the download. The previous flag, --show-error, is needed to actually see what the status code is. See below for a way of getting the status code and exiting a bit more gracefully.
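
As a minimal sketch of how a script might react to that exit code, the helper below runs curl with --fail and reports whether the failure was an HTTP error or something else. The name fetch_strict and its two arguments are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# fetch_strict URL DEST
# Hypothetical helper: downloads URL to DEST and returns curl's exit code.
fetch_strict() {
    fs_url=$1
    fs_dest=$2
    fs_rc=0
    curl --silent --show-error --fail --output "$fs_dest" "$fs_url" || fs_rc=$?
    case "$fs_rc" in
        0)  ;;
        # 22 is the exit code curl uses when --fail sees an HTTP error status
        22) echo "fetch_strict: HTTP error (status >= 400) for $fs_url" >&2 ;;
        *)  echo "fetch_strict: curl exited with code $fs_rc for $fs_url" >&2 ;;
    esac
    return "$fs_rc"
}
```

Returning curl's own exit code lets the caller decide whether an HTTP error should abort the run or just be logged.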

Follow redirects

The -L or --location flag instructs curl to follow redirects, which you almost always want. Curl gives up after 50 redirects by default; use the --max-redirs flag to set a tighter limit as protection against redirect loops from wildly misconfigured servers.

Timeout

There are many ways to implement timeouts in shell scripting, but curl has one built in. The -m or --max-time flag sets a time limit in seconds for the whole operation. Once the limit is reached, curl aborts the transfer and exits with a non-zero exit code (28).
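
Besides --max-time, curl has a separate --connect-timeout flag that bounds only the connection phase. The sketch below combines the two and checks for exit code 28, which curl uses for timeouts. The helper name fetch_bounded and the 5-second connect limit are assumptions made for illustration; the 30-second overall limit matches the value used later in this article.

```shell
#!/bin/sh
# fetch_bounded URL DEST
# Hypothetical helper: gives up quickly if the server cannot be reached,
# and caps the whole transfer at 30 seconds.
fetch_bounded() {
    fb_rc=0
    curl --silent --show-error \
        --connect-timeout 5 \
        --max-time 30 \
        --output "$2" "$1" || fb_rc=$?
    # 28 is curl's exit code for an operation that timed out
    if [ "$fb_rc" -eq 28 ]; then
        echo "fetch_bounded: timed out fetching $1" >&2
    fi
    return "$fb_rc"
}
```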

Retry

Similar to timeouts, there are many ways to retry a command on transient failure. Again, curl provides a built-in mechanism:

--retry NUM   Retry request NUM times if transient problems occur
--retry-connrefused  Retry on connection refused (use with --retry)
--retry-delay SECONDS  Wait SECONDS between retries
--retry-max-time SECONDS  Retry only within this period

A good starting point might be:

--retry 3 --retry-connrefused --retry-delay 2
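
When sizing a CI job timeout, it can help to estimate how long curl might run in the worst case. The sketch below assumes that each attempt is capped by --max-time (see the Timeout section), that --retry N allows up to N further attempts after the first, and that the pause between attempts is the fixed --retry-delay value. It ignores any server-supplied Retry-After header, so treat the result as a rough estimate. The function name is hypothetical.

```shell
#!/bin/sh
# worst_case_seconds RETRIES DELAY MAX_TIME
# Rough upper bound: (RETRIES + 1) attempts of MAX_TIME seconds each,
# plus RETRIES pauses of DELAY seconds between them.
worst_case_seconds() {
    echo $(( ($1 + 1) * $3 + $1 * $2 ))
}

# Values used in this article: --retry 3 --retry-delay 2 --max-time 30
worst_case_seconds 3 2 30
```

With these values the estimate is 126 seconds. Adding --retry-max-time limits the period during which new retries may start.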

All Together

curl --silent --show-error --fail \
  -L --max-redirs 3 \
  --retry 3 --retry-connrefused --retry-delay 2 \
  --max-time 30
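
One way to avoid repeating these flags is to wrap them in a small shell function. The sketch below is a hypothetical wrapper: the name download and its argument order are assumptions, and it includes --fail from the earlier section so that HTTP errors produce a non-zero exit code.

```shell
#!/bin/sh
# download URL DEST
# Hypothetical wrapper; the flag values mirror the ones used in this article.
download() {
    curl --silent --show-error --fail \
        --location --max-redirs 3 \
        --retry 3 --retry-connrefused --retry-delay 2 \
        --max-time 30 \
        --output "$2" "$1"
}

# Example use in a provisioning script (URL and path are placeholders):
# download "https://example.com/tool.tar.gz" /tmp/tool.tar.gz || exit 1
```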

For quick and dirty scripts, you can cheat with:

curl -sfSL

Better Fails on 404s

Curl has a way of customizing the output using the -w flag. One can use this to fail on HTTP status in a different way:

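# dest and src are placeholders for the output file and the URL.
# -w prints the HTTP status to stdout; -o sends the body to dest.
http_code=$(curl -w '%{http_code}' -s -o dest src)
if [ "$http_code" != "200" ]; then
    echo "curl received HTTP status $http_code" >&2
    exit 1
fi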
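
The same idea can be pushed a little further. If curl cannot get an HTTP response at all (for example, the connection is refused), it exits non-zero and reports the status as 000, so it is worth checking the exit code as well. The sketch below does that and accepts any 2xx status rather than only 200. The helper name check_status is hypothetical.

```shell
#!/bin/sh
# check_status URL DEST
# Hypothetical helper: succeeds only when curl itself succeeds and
# the final HTTP status is in the 2xx range.
check_status() {
    cs_rc=0
    cs_code=$(curl --silent --show-error --output "$2" \
        --write-out '%{http_code}' "$1") || cs_rc=$?
    if [ "$cs_rc" -ne 0 ]; then
        echo "check_status: curl failed with exit code $cs_rc" >&2
        return 1
    fi
    case "$cs_code" in
        2[0-9][0-9]) return 0 ;;
        *)
            echo "check_status: received HTTP status $cs_code" >&2
            return 1
            ;;
    esac
}
```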

Security

Never use the -k or --insecure flag. It turns off TLS certificate verification, a critical security check. If you think you need it, spend the time to debug and fix the underlying problem properly. This isn’t some abstract concern. Real sites have gotten popped by turning off these security checks.

Final Notes

The best solution is often to eliminate the network call from scripts, either by finding a different way or checking in a known-good version of the resource. But when that’s not possible (or when it really isn’t critical), these curl flags will handle errors more gracefully.

If you need to download multiple items, time can be saved by downloading them in parallel. See Parallelize Shell or Bash Scripts Using Xargs for details.
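
As a rough sketch of the idea, the hypothetical helper below reads one URL per line from standard input and runs up to four curl processes at once. It assumes an xargs that supports -P (the GNU and BSD versions do, but the flag is not part of POSIX), and the concurrency of 4 is an arbitrary choice.

```shell
#!/bin/sh
# download_all: reads URLs from stdin, one per line.
# Each file is saved in the current directory under its remote name.
download_all() {
    xargs -n 1 -P 4 curl -sfSL --remote-name
}

# Example (urls.txt is a placeholder file with one URL per line):
# download_all < urls.txt
```

xargs exits non-zero if any of the curl invocations fails, so the caller can still detect a failed download.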

And if you are looking to bootstrap something from an unknown OS, then take a look at shlib, a POSIX shell abstraction that wraps curl or wget depending on what’s present.
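
To illustrate the general approach (this is a minimal sketch, not shlib’s actual implementation), a script can probe for each tool with command -v and fall back accordingly. The helper name fetch_any and the choice of exit code 127 when neither tool is found are assumptions.

```shell
#!/bin/sh
# fetch_any URL DEST
# Hypothetical helper: uses curl when available, otherwise wget.
fetch_any() {
    if command -v curl >/dev/null 2>&1; then
        curl --silent --show-error --fail --location --output "$2" "$1"
    elif command -v wget >/dev/null 2>&1; then
        # wget follows redirects and exits non-zero on HTTP errors by default
        wget -q -O "$2" "$1"
    else
        echo "fetch_any: neither curl nor wget is installed" >&2
        return 127
    fi
}
```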

© 2018 Nick Galbreath