Printing
Core idea
Unix printing is older than the desktop. In the 1980s, printers were expensive, slow, and centralized; many users shared one device, so the OS had to queue jobs, schedule them, and spool them. That architecture survived, and on modern Linux it lives in the Common Unix Printing System (CUPS). When you run lpr report.txt, you do not "talk to a printer": you hand a job to CUPS, which places it in a per-printer queue, runs filters (typically Ghostscript) to convert anything that is not already in the printer's native page description language (PDL), and then streams the result to the device when its turn comes. Three pairs of small commands (lpr / lp to submit, lpq / lpstat to inspect, lprm / cancel to remove) cover the full lifecycle.
Shotts's argument: The reason printing on Linux can feel "indirect" is that it was never a direct, one-job-one-device action. It is a small distributed system — submit, spool, schedule, rasterize, deliver — and once you see the architecture, every command you reach for slots into place.
Why it matters
Pipelines end at the printer
Every text-processing pipeline in the previous topics is just one more pipe away from being a paper report. du -sh /home/* | sort -hr | pr -3 | lpr prints a three-column disk-usage listing, largest first. The cost of moving work from screen to printer is one extra command.
A "PDF printer" gives you reproducible artifacts
Many distributions offer a virtual printer (often a queue named PDF provided by the cups-pdf package) that writes each job to a PDF file, typically in a directory under your home. Hooking pipelines to that queue turns every shell session into a PDF generator with no tooling beyond the print system itself. This is useful for committing reports, archiving log summaries, or producing attachments for ticket systems.
Understanding spooling helps every networked service
The CUPS queue model — receive jobs, queue them per resource, work them off in order — is the same shape you will encounter in task queues (Celery, Sidekiq), CI runners, video transcode farms, and email spoolers. Learning it once in the printer context demystifies all of them.
Key takeaways
Mental model
Submit, spool, rasterize, deliver
The whole subsystem is a four-stage pipeline. Each stage is observable from the command line, which is what makes troubleshooting tractable.
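As a rough illustration of that observability, the sketch below runs only read-only status queries with lpstat. The function name show_print_state is made up for this example, and the output depends entirely on the local CUPS setup; on a machine without the CUPS client tools it just prints a notice.

```shell
# show_print_state: hypothetical helper; read-only, submits nothing.
show_print_state() {
    if ! command -v lpstat >/dev/null 2>&1; then
        echo "CUPS client tools (lpstat) not found"
        return 0
    fi
    lpstat -r || true   # is the scheduler running?
    lpstat -d || true   # current default destination
    lpstat -p || true   # known printers and their state (idle, printing, disabled)
    lpstat -o || true   # jobs currently queued
}

show_print_state
```

A job that shows up under lpstat -o but never leaves the queue points at scheduling or delivery rather than submission, which narrows the search.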
Why a page description language matters
A character-cell printer needed only one byte per visible glyph — a US-letter page was ~4,800 bytes. A 300-DPI laser printer needs the whole page as a bitmap — roughly 900,000 bytes. PostScript collapsed that cost: instead of sending the bitmap, you send a program that draws the page, and a small interpreter inside the printer (or in Ghostscript on the host) renders it locally.
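The two figures can be reproduced with shell arithmetic. The page geometry here is an assumption chosen to match the numbers above: 60 lines of 80 characters for the character printer, and an 8 by 10 inch printable area at one bit per dot for the laser printer.

```shell
# Character printer: one byte per character cell.
# Assumption: 60 lines per page, 80 characters per line.
char_page_bytes=$(( 60 * 80 ))

# 300-DPI bitmap at 1 bit per dot.
# Assumption: 8 x 10 inch printable area; dividing by 8 converts bits to bytes.
bitmap_page_bytes=$(( (8 * 300) * (10 * 300) / 8 ))

echo "character page: $char_page_bytes bytes"
echo "bitmap page:    $bitmap_page_bytes bytes"
# Integer division, so the ratio is rounded down.
echo "ratio:          $(( bitmap_page_bytes / char_page_bytes ))x"
```

Under these assumptions a bitmap page is nearly 190 times larger than a character page, which is the gap a page description language closes.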
Practical application
For text reports, the canonical recipe is pr plus lpr. pr handles pagination, headers, and multi-column layout; lpr ships the result. For richer output — proportional fonts, two-up layout, syntax-highlighted source code — reach for a2ps (or its cousin enscript). Both take any text and produce typeset PostScript that you can send to a PostScript printer, to the PDF queue, or save to a file with -o.
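A minimal sketch of the pr half of that recipe, previewed on the terminal instead of printed. The function name format_report and the title "Inventory" are placeholders invented for this example; the submit stage and the a2ps variant are left as comments because they need a configured queue or an installed a2ps.

```shell
# format_report: hypothetical helper that paginates stdin.
#   $1 = text for the page header (replaces the file name pr would show)
format_report() {
    pr -h "$1" -l 60
}

# Preview the paginated output on the terminal.
printf '%s\n' "alpha 10" "beta 20" "gamma 30" | format_report "Inventory"

# To print instead of previewing, append the submit stage:
#   printf '%s\n' "alpha 10" | format_report "Inventory" | lpr
# For typeset PostScript saved to a file (assumes a2ps is installed):
#   a2ps -o report.ps report.txt
```

Previewing first is cheap: the same pipeline becomes a print job by adding a single stage.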
Two common pitfalls. First, check the default printer. Running lpr file.txt with no -P sends to the system default — which on a multi-printer host may not be the one you want. lpstat -d shows the current default; lpr -P specific-printer file.txt overrides it. Second, remember that lp uses different flags: -d for destination, -n for copies, -o for printer-specific options like landscape or fitplot. Many tutorials mix lpr and lp flags freely; the two programs do the same job but accept different option syntaxes.
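To make the flag difference concrete, the sketch below builds the same request in both syntaxes and echoes the commands rather than running them, so nothing is queued by accident. The file name report.txt is a placeholder, and the queue name ops-laser is borrowed from the example in this section.

```shell
printer="ops-laser"   # queue name from this section's example
file="report.txt"     # placeholder file name

# Berkeley-style lpr: -P selects the queue, -# sets the number of copies.
lpr_cmd="lpr -P $printer -# 2 -o landscape $file"

# System V-style lp: -d selects the queue, -n sets the number of copies.
lp_cmd="lp -d $printer -n 2 -o landscape $file"

# Echo instead of executing; run `lpstat -d` first to see the default queue.
echo "$lpr_cmd"
echo "$lp_cmd"
```

Both lines request two landscape copies on the same queue; only the spelling of the destination and copy-count options differs.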
Example
Suppose the operations team wants a daily printed snapshot of disk usage for the file servers — top ten directories, dated header, two-column layout, sent to the printer named ops-laser. The full pipeline reuses the formatting techniques from the previous topic and adds one final stage:
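# Rows are kept to 33 characters so they fit pr's two-column layout
# (about 35 characters per column at the default 72-character page width).
du -sh /srv/* 2>/dev/null \
| sort -hr \
| head -10 \
| awk '{ printf "%-24s %8s\n", $2, $1 }' \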
| pr -2 -h "Disk usage — $(hostname) — $(date +%F)" -l 60 \
| lpr -P ops-laser
Read top to bottom: collect sizes, sort largest-first, keep ten, reformat columns, paginate in two columns with a custom header at sixty lines per page, then submit to the ops-laser queue. If ops-laser is offline, the job waits in the queue; lpq -P ops-laser shows its position. If the layout is wrong, swap the last stage for lpr -P PDF, inspect the resulting PDF, and re-run against the laser only when the result is satisfactory.
A second variation: schedule the same command with cron every weekday morning. The pipeline is now an unattended report — same toolkit, no human in the loop. That's the payoff of letting every Unix command speak the same stdin/stdout protocol: the printer is just one more downstream consumer.
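One way to wire this up, sketched under assumptions: the script name disk-report.sh, its install path, the 07:30 schedule, and the REPORT_PRINTER variable are all invented for illustration. Keeping the pipeline in a script instead of directly in the crontab matters because cron treats an unescaped % in a crontab command as a newline, which would break the date +%F in the header.

```shell
#!/bin/sh
# disk-report.sh: hypothetical wrapper script for the pipeline above.
# Example crontab entry (07:30, Monday to Friday; path is an assumption):
#   30 7 * * 1-5 REPORT_PRINTER=ops-laser /usr/local/bin/disk-report.sh

disk_report() {
    # $1: directory whose children are measured (defaults to /srv)
    du -sh "${1:-/srv}"/* 2>/dev/null \
        | sort -hr \
        | head -10 \
        | awk '{ printf "%-24s %8s\n", $2, $1 }' \
        | pr -2 -h "Disk usage $(hostname) $(date +%F)" -l 60
}

# Submit only when a queue name is supplied, so a manual test run prints nothing.
if [ -n "${REPORT_PRINTER:-}" ]; then
    disk_report | lpr -P "$REPORT_PRINTER"
fi
```

Running the script by hand without REPORT_PRINTER set submits nothing, which makes it safe to test before handing it to cron.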