Using reverse pipes to preserve shell context

Every so often I see bash scripts that follow the pattern:

NODEID=`curl -s https://api.github.com/repos/alindobre/qemu/commits?per_page=1 | jq -r '.[] .node_id'`
SHA=`curl -s https://api.github.com/repos/alindobre/qemu/commits?per_page=1 | jq -r '.[] .sha'`

Sometimes I even see three to five variables extracted this way. Even worse, I have seen code with additional pipes added after jq that perform very simple filtering, even though jq is a potent tool on its own.

The most upsetting problem with the above code is that it runs curl twice, which means making twice as many API requests. If you have five such lines in your script, four of the five requests are unnecessary, and you’re wasting the server’s computing capacity on them.

Normally, to fix this problem, you’d make a single request and use jq to output both values in the same run:

curl -s https://api.github.com/repos/alindobre/qemu/commits?per_page=1 \
  | jq -r '.[] | .node_id, .sha'

The first instinct might be to continue with another pipe like this:

curl -s https://api.github.com/repos/alindobre/qemu/commits?per_page=1 \
  | jq -r '.[] | .node_id, .sha' \
  | { read NODEID
      read SHA
      echo NODEID=$NODEID
      echo SHA=$SHA
    }
echo NODEID=$NODEID
echo SHA=$SHA

The second pair of echo calls is outside the pipe’s subshell, in the parent shell context. By default, bash runs each command of a pipeline in its own subshell, and because a subshell can never modify its parent’s environment, those two echo calls will output empty values.
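To see the effect without touching the network, here is a minimal sketch. The fetch_commit function and the values it prints are hypothetical stand-ins for the curl and jq pipeline above, not real API output.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for "curl ... | jq ...": prints two made-up lines.
fetch_commit() {
  printf '%s\n' 'node-id-example' 'sha-example'
}

unset NODEID SHA

# Both reads run inside the subshell created for the last pipeline stage.
fetch_commit | { read -r NODEID; read -r SHA; }

# Back in the parent shell the variables were never set,
# so the fallback text is printed instead of the values.
echo "NODEID=${NODEID:-<empty>}"
echo "SHA=${SHA:-<empty>}"
```

With bash's default settings, both lines should show the fallback text rather than the example values.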

What I call a reverse pipe is a here-string (the triple <<< input redirection) fed by a command substitution. In this case, the data flows from the subshell into a command running in the parent shell, and not the other way around. Let’s see it in action:

{ read NODEID
  read SHA
  echo NODEID=$NODEID
  echo SHA=$SHA
} <<<"$(curl -s https://api.github.com/repos/alindobre/qemu/commits?per_page=1 \
        | jq -r '.[] | .node_id, .sha')"
echo NODEID=$NODEID
echo SHA=$SHA

What happens is that I took the last stage of the previous pipeline and executed it in the context of the parent shell. So now, the second pair of echo calls no longer outputs empty values, because the read commands run in the parent shell.
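The same idea can be sketched in two closely related forms, again with a hypothetical fetch_commit function standing in for the curl and jq pipeline. The first form quotes the command substitution, which keeps the newlines intact on older bash releases that word-split an unquoted here-string. The second form uses process substitution, which is not what the example above does, but it also keeps the read commands in the parent shell and streams the output instead of capturing it first.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for "curl ... | jq ...": prints two made-up lines.
fetch_commit() {
  printf '%s\n' 'node-id-example' 'sha-example'
}

# Variant 1: here-string fed by a quoted command substitution.
{ read -r NODEID
  read -r SHA
} <<< "$(fetch_commit)"

# Variant 2: process substitution; the producer runs in a subshell,
# while the reads still run in the parent shell.
{ read -r NODEID2
  read -r SHA2
} < <(fetch_commit)

# All four variables are visible here, in the parent shell.
echo "NODEID=$NODEID SHA=$SHA"
echo "NODEID2=$NODEID2 SHA2=$SHA2"
```

Both variants need bash (or another shell with these extensions); plain POSIX sh has neither here-strings nor process substitution.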

I hope this example is useful in the quest to reduce the number of pipes and subshells, especially when network calls are involved. And give jq more credit: it’s a very powerful tool on its own, capable of advanced output filtering, so you don’t need additional grep, sed, awk or tail/head calls.
