Behavior of while-read loops in different shells


Update: While this document was written around while read, it should be noted that using while line is much faster. You should probably substitute while line wherever you see while read in the discussion below if you have line installed. My Ubuntu systems seem to have it, but my Fedora/CentOS systems do not.

I frequently store multiple lines of data in a single variable; for example, the output of a grep command. Then I echo the variable and pipe it to a while-read loop, which lets me process each line stored in the variable individually. However, there is a significant pitfall with this approach, depending on whether the script is run with sh, bash, or ksh. Specifically, any change made to a variable inside the loop will not persist outside the loop in Bourne-based shells such as sh or bash. Here is a simple script to demonstrate the problem:

#!/bin/sh

# Assign multiple lines of data to a single variable
lines="This is red.
This is blue.
This is green.
This is purple."

## Show the multiple lines assigned to one variable
# printf is used because echo does not interpret \n the same way in every shell
printf '%s\n\n' "$lines"

## Process each line individually
count=1
echo "$lines" | while IFS= read -r LINE; do
        echo "Here is line ${count}:"
        echo "$LINE"
        count=$(($count + 1))
done

echo "Outside the while loop, count is $count"

If you run this script, you will notice that $count outside the while loop is still equal to 1, even though it was incremented inside the loop. This is because the pipe (|) in front of the word 'while' makes the shell run the loop in a sub-shell. The sub-shell gets its own copy of the variables, and when the loop ends the sub-shell exits and takes its changes with it. I know of two ways to get around this: 1) Use ksh instead of sh, because ksh runs the last command of a pipeline in the current shell and so does not have this problem; 2) Use bash and a slightly different syntax (process substitution) that feeds the loop without a pipe:

...
## Process each line individually
count=1
while IFS= read -r LINE; do
        echo "Here is line ${count}:"
        echo "$LINE"
        count=$(($count + 1))
done < <(echo "$lines")

echo "Outside the while loop, count is $count"

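This time, count will be equal to 5: it starts at 1 and is incremented once for each of the four lines. Note: This syntax is a bash feature, so the script should be run with bash (for example, with #!/bin/bash as the first line); plain sh is not guaranteed to support it. Also, there cannot be any spaces between the second < and the opening paren on the last line of the script, or you'll get an error.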
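
A third option, if the script has to stay with plain sh, is to feed the loop from a here-document instead of a pipe. The following is only a minimal sketch, assuming a POSIX-style sh such as dash or bash; I have not checked it against older Bourne shells, which may behave differently. It reuses the same sample data and variable names as the scripts above.

```shell
#!/bin/sh

# Same sample data as in the scripts above
lines="This is red.
This is blue.
This is green.
This is purple."

## Process each line individually
count=1
# The here-document supplies the loop's input without a pipe,
# so the loop body runs in the current shell, not in a sub-shell.
while IFS= read -r LINE; do
        echo "Here is line ${count}:"
        echo "$LINE"
        count=$((count + 1))
done <<EOF
$lines
EOF

echo "Outside the while loop, count is $count"
```

Because no pipe is involved, the change to count survives the loop, and the last line should report 5. The delimiter (EOF here) is left unquoted on purpose so that $lines is expanded inside the here-document.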

12/26/2007