Arrays
Core idea
bash supports two array flavours. Indexed arrays map non-negative integers to values — like a one-column spreadsheet. Associative arrays (bash 4.0+) map strings to values — a dictionary, a hash table, a lookup. Both are one-dimensional; both grow automatically; both are accessed with subscript syntax ${arr[key]}. Together they replace nearly every case where a less-experienced shell programmer would reach for parallel variables, awk, or a temporary file.
Why it matters
Without arrays, shell scripts accumulate scalar state in ugly ways: var1, var2, var3 declared in a loop; tab-separated strings split and re-split on every access; counts kept in one variable while names live in another. Arrays compress all of that into a single named structure with direct access by index or key. Associative arrays in particular open up tabular reporting, frequency counting, and lookup-by-key patterns that would otherwise demand external tools. They are among bash's most underused features.
Mental model
Indexed vs. associative — same syntax, different keys
The two flavours share virtually all operators. The main differences are the declaration (-A for associative), what counts as a valid subscript, and ordering: indexed arrays expand in index order, while associative arrays expand in no guaranteed order. Once you internalise the table below, everything else is composition.
| Operation | Indexed | Associative |
| ---------------------- | ------------------------ | ------------------------- |
| Declare | declare -a arr (optional) | declare -A arr (required) |
| Assign one | arr[5]=val | arr[red]=val |
| Assign many | arr=(a b c) | arr=([k1]=v1 [k2]=v2) |
| Read one | "${arr[5]}" | "${arr[red]}" |
| All values | "${arr[@]}" | "${arr[@]}" |
| All keys / indexes | "${!arr[@]}" | "${!arr[@]}" |
| Length | ${#arr[@]} | ${#arr[@]} |
| Element length | ${#arr[5]} | ${#arr[red]} |
| Append | arr+=(d e) | arr+=([k3]=v3) |
| Delete one | unset 'arr[5]' | unset 'arr[red]' |
| Delete all | unset arr | unset arr |
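As a rough illustration, the sketch below runs most of the table's operations on both flavours. The array names `fruits` and `colour_of` and their contents are made up for this example, and the associative half assumes bash 4.0 or later.

```bash
#!/usr/bin/env bash
# Hypothetical data; the names are illustrative only.

# Indexed: declaration is optional, subscripts are integers.
fruits=(apple banana cherry)
fruits+=(date)                 # append after the highest existing index
fruits[1]="blood orange"       # overwrite slot 1
unset 'fruits[0]'              # delete one element; remaining indexes do not shift

# Associative: declare -A must come before any assignment.
declare -A colour_of=([apple]=red [banana]=yellow)
colour_of[cherry]="dark red"   # assign one
colour_of+=([date]=brown)      # append a key/value pair
unset 'colour_of[banana]'      # delete one

echo "fruits:  count=${#fruits[@]} indexes=${!fruits[*]}"
echo "colours: count=${#colour_of[@]} cherry=${colour_of[cherry]}"
echo "length of fruits[1]: ${#fruits[1]}"
```

Note that after `unset 'fruits[0]'` the array starts at index 1; nothing is renumbered.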
Iterating safely
Two parameter expansions dominate array work: "${arr[@]}" for values and "${!arr[@]}" for keys. Here the ! means "give me the keys, not the values stored at those keys"; it is distinct from the indirect expansion ${!name} applied to a scalar variable. Quoting matters as much as for $@:
| Form | Behaviour |
| ----------------- | -------------------------------------------------- |
| ${arr[@]} | each element split on $IFS and glob-expanded (usually wrong) |
| "${arr[@]}" | each element a separate word (almost always right) |
| "${arr[*]}" | all elements joined into one string, separated by the first character of $IFS |
| ${!arr[@]} | unquoted list of keys (fine for integer keys) |
| "${!arr[@]}" | quoted list of keys (required for string keys) |
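One way to see the difference between these forms is to count the words each one produces. This is a minimal sketch: `count_args` is a hypothetical helper, and the file names are invented so that one of them contains a space.

```bash
#!/usr/bin/env bash
# count_args is a hypothetical helper that reports how many words it received.
count_args() { echo "$#"; }

# Made-up values; the first element contains a space.
files=("annual report.txt" "notes.txt")

unquoted=$(count_args ${files[@]})     # split on $IFS: three words
quoted=$(count_args "${files[@]}")     # one word per element: two words
joined=$(count_args "${files[*]}")     # one joined string: a single word

echo "unquoted=$unquoted quoted=$quoted joined=$joined"

# "${arr[*]}" joins on the first character of IFS, so a local IFS gives a CSV line.
csv=$(IFS=,; echo "${files[*]}")
echo "$csv"

# String keys with spaces need the quoted key expansion as well.
declare -A size_of=(["annual report.txt"]=120 ["notes.txt"]=8)
for name in "${!size_of[@]}"; do
  echo "$name -> ${size_of[$name]}"
done
```

The order in which the last loop visits the keys is not guaranteed, since associative arrays have no defined ordering.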
Why arrays are sparse
bash indexed arrays need not be contiguous: arr[100]=foo is legal, and ${#arr[@]} then reports 1, not 101. This surprises people coming from C or Python. The reason is that bash stores arrays as a sparse map internally; subscripts are keys, not memory offsets. Expanding ${!arr[@]} is the reliable way to learn which slots are actually populated. Sparseness also means "append" does not mean "write to slot ${#arr[@]}"; use arr+=(val) to let bash pick the next slot above the highest existing index.
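The behaviour described above can be checked with a short sketch. The array names `sparse` and `packed` are illustrative, and re-packing into a fresh array is just one possible way to drop the gaps.

```bash
#!/usr/bin/env bash
sparse=()                 # hypothetical array name
sparse[100]=foo
sparse[3]=bar

echo "count:   ${#sparse[@]}"      # 2 populated slots
echo "indexes: ${!sparse[*]}"      # 3 100

# Writing to slot ${#sparse[@]} would target index 2, not the end of the array.
sparse+=(baz)                      # lands at 101, one past the highest index
echo "indexes: ${!sparse[*]}"      # 3 100 101

# Walk only the populated slots.
for i in "${!sparse[@]}"; do
  echo "sparse[$i]=${sparse[$i]}"
done

# Copy the values into a new array to get contiguous indexes again.
packed=("${sparse[@]}")
echo "packed indexes: ${!packed[*]}"   # 0 1 2
```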
Practical application
- Decide indexed or associative up front. Order matters? Indexed. Lookup by name? Associative. If you'd reach for a dict in Python or a Map in JS, you want associative, and you need declare -A before any assignment.
- Initialise explicitly when needed. If your script later increments ${counts[$key]} you may want to seed it with zero first; bash treats unset elements as empty, which (( … )) interprets as 0, but explicit init keeps intent visible.
- Always use "${arr[@]}", never bare ${arr[@]}. Same rule as "$@": the quotes preserve element boundaries for values containing spaces.
- Use mapfile to load file contents. mapfile -t lines < file.txt is the canonical "read every line into an array" idiom. The -t trims the trailing newline from each entry.
- Sort by piping out and reading back. bash has no sort builtin. The pattern is mapfile -t sorted < <(printf '%s\n' "${arr[@]}" | sort), a clean round-trip that survives spaces.
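The steps above can be combined into one small sketch: load lines with mapfile, count them in an associative array, then sort the keys by piping out and reading back. The input text and the names `words`, `counts`, and `sorted_keys` are invented for this example; it assumes bash 4.0 or later for both mapfile and declare -A.

```bash
#!/usr/bin/env bash
# Hypothetical input: one word per line.
input=$'pear\napple\npear\nfig\napple\npear'

# Load the lines into an indexed array; -t strips each trailing newline.
mapfile -t words <<< "$input"

# Frequency count: an unset element evaluates to 0 inside (( )).
declare -A counts
for w in "${words[@]}"; do
  (( counts[$w] += 1 ))
done

# Associative keys come back in no particular order, so sort them externally.
mapfile -t sorted_keys < <(printf '%s\n' "${!counts[@]}" | sort)

for k in "${sorted_keys[@]}"; do
  printf '%s %d\n' "$k" "${counts[$k]}"
done
```

The round-trip through sort is line-based, so it preserves values containing spaces but would break on values that contain newlines.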
Example
A real ops task: scan a directory of log files and produce a per-owner summary — how many files each user owns, and how many bytes those files total. Associative arrays make this trivially clean compared to the awk-and-sort version:
#!/usr/bin/env bash
shopt -s nullglob               # an unmatched glob expands to nothing, so the loop is skipped

declare -A file_count           # owner → count
declare -A byte_total           # owner → bytes

for f in /var/log/*.log; do
  # stat -c (GNU coreutils syntax) gives us "<owner> <size>" in one call.
  read -r owner size < <(stat -c '%U %s' "$f")
  (( file_count[$owner] += 1 ))
  (( byte_total[$owner] += size ))
done

# Render a sorted report. Note "${!file_count[@]}" gives the keys.
# The unquoted command substitution relies on word splitting, which is
# acceptable here because user names normally contain no whitespace.
printf '%-12s %6s %12s\n' "OWNER" "FILES" "BYTES"
for owner in $(printf '%s\n' "${!file_count[@]}" | sort); do
  printf '%-12s %6d %12d\n' \
    "$owner" "${file_count[$owner]}" "${byte_total[$owner]}"
done
The whole script is the two declare -A lines, one accumulator loop, and one render loop. Without associative arrays you'd be juggling parallel arrays of names and counts, doing linear scans to check "have I seen this owner already?", and the script would balloon to twice the length and a third the clarity.