A Gentle Introduction To vi(m)
Core idea
vi is a modal editor — keys mean different things in different modes
In nano or any GUI editor, the keyboard is monomodal: pressing a always inserts the letter a. In vi, the keyboard is modal: in normal mode, i enters insert mode at the cursor position, a enters it just after the cursor, and A enters it at the end of the line, while the same a typed in insert mode inserts the literal letter. The cost is the upfront confusion of three modes; the payback is that every key on the keyboard becomes a command in normal mode: letters, digits, punctuation, all of them. Where nano needs Ctrl- and Alt- combinations to expose commands, vi has the whole alphabet free for direct manipulation.
The three modes you need to know first: Normal (the default — every key is a command), Insert (typing inserts text — entered with i, a, o, exited with Esc), and Command-line (a : at the bottom of the screen — for :w, :q, :%s/old/new/g, and other "ex" commands inherited from the line-editor ancestor).
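For readers who think in code, the three modes can be sketched as a tiny state machine. This is a toy model in Python, not how vi is implemented: the key names, the TRANSITIONS table, and the helper functions are illustrative assumptions, and only a handful of keys are covered.

```python
# Toy model of vi's three basic modes as a state machine.
# Only a few keys are modeled; real vi has many more commands and modes.
TRANSITIONS = {
    ("normal", "i"): "insert",              # insert at the cursor
    ("normal", "a"): "insert",              # append after the cursor
    ("normal", "o"): "insert",              # open a new line below
    ("normal", ":"): "command-line",        # start an ex command
    ("insert", "<Esc>"): "normal",
    ("command-line", "<Esc>"): "normal",    # cancel the command
    ("command-line", "<Enter>"): "normal",  # run the command
}


def next_mode(mode, key):
    """Return the mode after pressing `key`; unlisted keys leave the mode unchanged."""
    return TRANSITIONS.get((mode, key), mode)


def run(keys, mode="normal"):
    """Feed a sequence of keys through the automaton and return the final mode."""
    for key in keys:
        mode = next_mode(mode, key)
    return mode


print(run(["i", "a"]))           # insert: the second key is just typed text
print(run(["i", "a", "<Esc>"]))  # normal
print(run([":", "w", "<Enter>"]))  # normal
```

The same key, a, appears twice in the first call: once as a command that changes the mode, and once as plain text that does not. That asymmetry is the whole point of a modal keyboard.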
Motions × operators is a language, not a vocabulary
The structural insight that makes vi worth learning is that commands compose. You don't memorize every editing action; you learn operators (d delete, y yank/copy, c change, > indent) and motions (w next word, $ end of line, G end of file, } next paragraph) and combine them. dw deletes a word. d$ deletes to end of line. dG deletes to end of file. y3j yanks the current line and the three lines below it. c} changes from the cursor to the end of the paragraph. Once the grammar clicks, you stop thinking in keystrokes and start thinking in intentions: "delete to the next closing brace" is d/}<Enter>, a single sentence.
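The composition idea can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not vi's implementation: the buffer is a single string, the cursor is an index into it, word_forward treats words as whitespace-separated (closer to vi's W than to w), and every function name here is hypothetical.

```python
import re


# Motions take (text, cursor index) and return a target index.
def word_forward(text, pos):
    """Rough stand-in for `w`: start of the next whitespace-separated word."""
    match = re.compile(r"\s+\S").search(text, pos)
    return match.end() - 1 if match else len(text)


def end_of_line(text, pos):
    """Rough stand-in for `$` after an operator: the end of the current line."""
    newline = text.find("\n", pos)
    return len(text) if newline == -1 else newline


# Operators take (text, start, end) and return (new_text, register_contents).
def delete(text, start, end):
    return text[:start] + text[end:], text[start:end]


def yank(text, start, end):
    return text, text[start:end]


def apply(operator, motion, text, pos):
    """Compose one operator with one motion, as in `dw` or `y$`."""
    target = motion(text, pos)
    start, end = sorted((pos, target))
    return operator(text, start, end)


text = "alpha beta gamma"
print(apply(delete, word_forward, text, 0))  # ('beta gamma', 'alpha ')
print(apply(delete, end_of_line, text, 6))   # ('alpha ', 'beta gamma')
```

Two operators and two motions already give four commands, and adding a third motion adds a command for every operator at once. That multiplication is why learning the grammar pays off faster than memorizing individual commands.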
Why it matters
vi is everywhere — even when you don't want it
POSIX specifies vi, so conformant Unix systems are expected to provide it. Nearly every Linux distribution ships it or a compatible clone. Every BSD ships it. macOS ships it. Tiny embedded BusyBox systems ship it. Even a minimal Alpine container image has it, via BusyBox. The remote server you've just been parachuted into may have no nano, no graphical editor, no familiar tools, and you still need to edit a config file. The two-minute investment to learn i, Esc, :wq is the difference between "I can fix this" and "I need to call someone with more access."
It rewards the touch typist forever
The deeper reason to invest in vi is that it is one of the most keyboard-efficient text editors ever shipped. The motions (h j k l on the home row), the operators (d y c), the counts (5dd deletes five lines), and the text objects (ci( changes inside parentheses) compose to let a fluent user manipulate code at the speed of thought. Many other editors have copied at least a few vi ideas, and many ship a vi mode (set -o vi in bash, vim mode in VS Code, IdeaVim in JetBrains). Learning vi once unlocks fast text editing in most environments you'll ever touch.
Key takeaways
Mental model
The mode automaton
Operators meet motions — the grammar of edits
Practical application
- Survive first: open, edit, save, quit. vim file.txt opens. i enters insert mode. Type. Esc returns to normal. :wq saves and quits. :q! quits without saving. That five-keystroke loop is enough to do real work; extend from there as you build comfort.
- Learn motions before you learn anything else. Spend a day using h j k l instead of the arrow keys. Add w, b, 0, $. Add gg and G. Add Ctrl-d and Ctrl-u. Motions are the alphabet; every operator is a verb that expects one.
- Learn d, y, p next. Delete a word with dw. Delete a line with dd. Yank (copy) a line with yy. Paste with p. These four compose into 80% of editing actions. The deleted/yanked content goes to vi's unnamed register; p puts it back.
- Learn /, n, N, and :%s/.../.../g. /foo jumps to the next foo. n repeats forward, N backward. :%s/foo/bar/g replaces every foo with bar across the file. Add c (:%s/foo/bar/gc) for confirm-each. These cover almost every find-and-replace scenario.
- Use text objects, not motions, when editing structured text. ci" changes inside the nearest pair of double quotes. da( deletes around (including) the nearest parentheses. cit changes inside an HTML tag. Text objects work on the structure you're inside, not where the cursor is; they are by far the most powerful vi feature for code.
- Set up your minimal ~/.vimrc. Three lines that pay back immediately:
  set number                             " line numbers
  set tabstop=4 shiftwidth=4 expandtab   " 4-space indent
  syntax on                              " syntax highlighting
  Add set incsearch hlsearch for live-incremental search and highlighted matches. Resist the urge to install plugins for the first month; learn the built-ins first.
- Practice set -o vi on the bash command line. Put set -o vi in ~/.bashrc. Now in any shell command line, press Esc and edit as if it were a buffer: bb to jump back two words, cw to change a word, dd to clear the line. The skill transfers directly from full vi.
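One detail of the substitute command is easy to miss: the trailing g decides whether every match on a line is replaced or only the first one (assuming Vim's default settings, with gdefault off). The Python sketch below models that difference. It is an illustration only: substitute is a hypothetical helper, and Python's regex syntax is not the same as Vim's.

```python
import re


def substitute(lines, pattern, replacement, every_match=True):
    """Toy version of :%s/pattern/replacement/[g] over a list of lines.

    every_match=True models the trailing g flag; False replaces only the
    first match on each line, as the command does without g.
    """
    count = 0 if every_match else 1  # re.sub treats count=0 as "no limit"
    return [re.sub(pattern, replacement, line, count=count) for line in lines]


buffer = ["foo foo", "no match", "a foo"]
print(substitute(buffer, "foo", "bar"))
# ['bar bar', 'no match', 'a bar']
print(substitute(buffer, "foo", "bar", every_match=False))
# ['bar foo', 'no match', 'a bar']
```

The % range means "every line", and g means "every match on each of those lines"; leaving either one out narrows the edit.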
Example
Refactoring a config file in three composable commands
You're editing nginx.conf and need to change every listen 80 directive to listen 8080. You also need to add a server_name line below each listen directive. In a GUI editor this is a multi-pass tedious affair. In vim it's three commands:
:%s/listen 80;/listen 8080;/g
That's the global substitute. Now position the cursor on the first listen line and run:
qa
A
<Enter>server_name _;<Esc>
j0
q
That recorded a macro in register a that does: append at line-end, newline, "server_name _;", Esc to normal, down one line, jump to start. Recording also performed the edit once, so the first listen line already has its server_name and the cursor now sits below it. To replay the macro on every remaining listen line, restrict the range to the current line through the end of the file:
:.,$g/^\s*listen/normal! @a
:g/pattern/cmd runs cmd on every line in the range that matches the pattern (the whole file if no range is given). normal! @a replays the recorded macro. The whole refactor (find every listen, change the port, add a server_name beneath) happens in a handful of short commands plus the recording. Doing the same in a graphical editor with no macros takes minutes of manual labor; with a regex find-and-replace it can take more thinking than the macro version does.
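To check the result, it helps to know what the buffer should look like afterwards. The sketch below models the intended end state in Python. It is not a replacement for the vim commands: refactor and the sample configuration are hypothetical, and the inserted line is left unindented, which is what the macro produces when autoindent is off.

```python
import re


def refactor(lines):
    """Model the end state: port changed, server_name added under each listen."""
    result = []
    for line in lines:
        # Step 1: the global substitute.
        line = line.replace("listen 80;", "listen 8080;")
        result.append(line)
        # Step 2: the macro, applied to every line matching ^\s*listen.
        if re.match(r"\s*listen", line):
            result.append("server_name _;")
    return result


# Hypothetical sample config, for illustration only.
conf = [
    "server {",
    "    listen 80;",
    "}",
]
for line in refactor(conf):
    print(line)
# server {
#     listen 8080;
# server_name _;
# }
```

Each listen line should be followed by exactly one server_name line. A doubled server_name under the first listen line means the macro ran there twice, once while recording and once on replay.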
The deepest text object: ci( on a function call
You're staring at a function call like:
result = compute_score(player.id, season.year, include_bonus=True, weights=current_weights)
The arguments are wrong and you want to replace them all with a single **kwargs. In a GUI editor you select from after ( to before ), usually with multiple click attempts to land precisely on the parentheses. In vim, with the cursor anywhere between the parentheses:
ci(
Three keystrokes. c is "change". i( is "inside the parentheses". vim deletes everything between the parens and drops you into insert mode at the empty spot. Type **kwargs, press Esc, and the line becomes compute_score(**kwargs).
That single example is why many people who learn vim are reluctant to leave it. ci( does not care where the cursor sits between the parentheses or how long the argument list is. It operates on the structure: the nearest pair of parentheses enclosing the cursor, so nested calls in the arguments are handled as long as the cursor is not inside one of them. Multiply that capability by dozens of similar text objects (ci", ci{, cit for tags, cip for paragraph) and editing becomes about expressing intent, not about positioning a cursor.
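What "the nearest enclosing pair" means can be made concrete with a short Python sketch. This is a simplified model, not Vim's actual algorithm: it works on a single line, ignores parentheses inside string literals, and the function names are hypothetical.

```python
def inner_paren_span(line, cursor):
    """Return (start, end) of the text inside the nearest pair of
    parentheses enclosing index `cursor`, or None if there is none."""
    # Walk left from the cursor to find the unmatched "(".
    depth = 0
    open_idx = None
    for i in range(cursor, -1, -1):
        ch = line[i]
        if ch == ")" and i != cursor:
            depth += 1  # a nested pair that closes before the cursor
        elif ch == "(":
            if depth == 0:
                open_idx = i
                break
            depth -= 1
    if open_idx is None:
        return None
    # Walk right from that "(" to find its matching ")".
    depth = 0
    for j in range(open_idx + 1, len(line)):
        if line[j] == "(":
            depth += 1
        elif line[j] == ")":
            if depth == 0:
                return open_idx + 1, j
            depth -= 1
    return None


def change_inside_parens(line, cursor, new_text):
    """Model of ci( followed by typing `new_text` and pressing Esc."""
    span = inner_paren_span(line, cursor)
    if span is None:
        return line  # no enclosing pair: leave the line untouched
    start, end = span
    return line[:start] + new_text + line[end:]


call = ("result = compute_score(player.id, season.year, "
        "include_bonus=True, weights=current_weights)")
print(change_inside_parens(call, call.index("season"), "**kwargs"))
# result = compute_score(**kwargs)
```

With nested calls, the cursor position decides which pair is used: in f(a, g(b), c), a cursor on b selects only b, while a cursor on c selects all of a, g(b), c.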
Related lessons
Related concepts
- Editors
- Modal Editing
- Shell