A Gentle Introduction To vi(m)

Core idea

vi is a modal editor — keys mean different things in different modes

In nano or any GUI editor, the keyboard is monomodal: pressing a always inserts the letter a. In vi, the keyboard is modal: in normal mode, i enters insert mode at the cursor, a enters insert mode just after the cursor, and A enters insert mode at the end of the line, while the same a typed in insert mode inserts the literal letter. The cost is the upfront confusion of learning the modes; the payback is that nearly every key on the keyboard becomes a command in normal mode: letters, digits, punctuation, all of them. Where nano needs Ctrl- and Alt- combinations to expose commands, vi has the whole alphabet free for direct manipulation.

The three modes you need to know first: Normal (the default — every key is a command), Insert (typing inserts text — entered with i, a, o, exited with Esc), and Command-line (a : at the bottom of the screen — for :w, :q, :%s/old/new/g, and other "ex" commands inherited from the line-editor ancestor).

Motions × operators is a language, not a vocabulary

The structural insight that makes vi worth learning is that commands compose. You don't memorize every editing action; you learn operators (d delete, y yank/copy, c change, > indent) and motions (w next word, $ end of line, G end of file, } next paragraph) and combine them. dw deletes to the start of the next word. d$ deletes to end of line. dG deletes to end of file. y3j yanks the current line and the three lines below it. c} changes from the cursor to the end of the paragraph. Once the grammar clicks, you stop thinking in keystrokes and start thinking in intentions: "delete up to the next closing brace" is d/}<Enter>, a single sentence.

Why it matters

vi is everywhere — even when you don't want it

POSIX specifies vi, so some implementation of it is present on nearly every Unix-like system. Mainstream Linux distributions ship it. The BSDs ship it. macOS ships it. Tiny embedded BusyBox systems ship a small one, and so does a minimal Alpine container image. The remote server you've just been parachuted into may have no nano, no graphical editor, no familiar tools, and you still need to edit a config file. The two-minute investment to learn i, Esc, :wq is the difference between "I can fix this" and "I need to call someone with more access."

It rewards the touch typist forever

The deeper reason to invest in vi is ergonomic: the command set is designed so that your hands rarely leave the home row. The motions (h j k l on the home row), the operators (d y c), counts (5dd deletes five lines), and text objects (ci( changes inside parentheses) compose to let a fluent user manipulate code at the speed of thought. Many other tools have borrowed vi ideas or offer a vi mode (set -o vi in bash, Vim emulation extensions for VS Code, IdeaVim for JetBrains IDEs). Learning vi once pays off in most environments you'll touch.

Mental model

The mode automaton
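
The three modes described earlier can be read as a small state machine. The following is a toy sketch in Python, not a description of how vi is implemented: the names NORMAL, INSERT, COMMAND, TRANSITIONS, next_mode, and run are hypothetical, and only the mode-switching keys mentioned in this article are modeled.

```python
# Toy model of the three modes; real vi/Vim has more modes (visual, replace, ...).
NORMAL, INSERT, COMMAND = "normal", "insert", "command-line"

# (current mode, key) -> next mode. Only mode-switching keys are listed.
TRANSITIONS = {
    (NORMAL, "i"): INSERT,
    (NORMAL, "a"): INSERT,
    (NORMAL, "o"): INSERT,
    (NORMAL, ":"): COMMAND,
    (INSERT, "<Esc>"): NORMAL,
    (COMMAND, "<Esc>"): NORMAL,
    (COMMAND, "<Enter>"): NORMAL,
}


def next_mode(mode, key):
    """Return the mode after pressing `key`; unlisted keys leave the mode unchanged."""
    return TRANSITIONS.get((mode, key), mode)


def run(keys, mode=NORMAL):
    """Feed a sequence of keys through the automaton and return the final mode."""
    for key in keys:
        mode = next_mode(mode, key)
    return mode


# "i" switches to insert mode, so the following "h" and "i" are typed as text.
print(run(["i", "h", "i", "<Esc>"]))
# ":w<Enter>" passes through command-line mode and comes back.
print(run([":", "w", "<Enter>"]))
```

The point of the table is that the same key appears with different meanings depending on the current state: i is a transition out of normal mode, but in insert mode it is absent from the table and therefore just text.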

Operators meet motions — the grammar of edits
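
One way to see why composition beats memorization is to model it. The sketch below is a deliberately simplified Python model of one line of text with a cursor. All names (motion_w, op_delete, execute, and so on) are hypothetical, the w motion here splits on whitespace only (vi's real w also stops at punctuation), and spans are treated as half-open ranges.

```python
def motion_w(text, cur):
    """Start of the next whitespace-delimited word (a simplification of vi's w)."""
    n = len(text)
    while cur < n and not text[cur].isspace():
        cur += 1
    while cur < n and text[cur].isspace():
        cur += 1
    return cur


def motion_end(text, cur):
    """End of line, like $ (returned as an exclusive bound)."""
    return len(text)


def motion_start(text, cur):
    """Start of line, like 0."""
    return 0


MOTIONS = {"w": motion_w, "$": motion_end, "0": motion_start}


def op_delete(text, lo, hi):
    """d: remove the span; return (new_text, register)."""
    return text[:lo] + text[hi:], text[lo:hi]


def op_yank(text, lo, hi):
    """y: leave the text alone; return (text, register)."""
    return text, text[lo:hi]


OPERATORS = {"d": op_delete, "y": op_yank}


def execute(command, text, cur):
    """Run a two-key command such as 'dw' on one line. Returns (text, register)."""
    op, motion = OPERATORS[command[0]], MOTIONS[command[1]]
    target = motion(text, cur)
    lo, hi = sorted((cur, target))  # motions may point backward (e.g. 0)
    return op(text, lo, hi)


line = "delete this word please"
for cmd in ("dw", "d$", "d0", "yw", "y$", "y0"):
    print(cmd, execute(cmd, line, 7))  # cursor on the "t" of "this"
```

Two operators and three motions are five definitions, yet they yield six working commands. Adding one more motion would add a command for every operator at once, which is the sense in which vi is a grammar.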

Practical application

  1. Survive first: open, edit, save, quit. vim file.txt opens. i enters insert mode. Type. Esc returns to normal. :wq saves and quits. :q! quits without saving. That five-step loop is enough to do real work; extend from there as you build comfort.

  2. Learn motions before you learn anything else. Spend a day using h j k l instead of the arrow keys. Add w, b, 0, $. Add gg and G. Add Ctrl-d and Ctrl-u. Motions are the alphabet — every operator is a verb that expects one.

  3. Learn d, y, p next. Delete a word with dw. Delete a line with dd. Yank (copy) a line with yy. Paste with p. These four cover a large share of everyday editing actions. The deleted/yanked content goes to vi's unnamed register; p puts it back.

  4. Learn /, n, N, and :%s/.../.../g. /foo jumps to the next foo. n repeats the search in the same direction, N in the opposite direction. :%s/foo/bar/g replaces every foo with bar across the file; without the g flag, only the first match on each line is replaced. Add c (:%s/foo/bar/gc) to confirm each replacement. These cover almost every find-and-replace scenario.

  5. Use text objects, not motions, when editing structured text. ci" changes inside the nearest pair of double quotes on the line. da( deletes the enclosing parentheses together with their contents. cit changes inside the enclosing HTML/XML tag pair. Text objects act on the whole structure around the cursor, so you don't have to move to its start first; they are by far the most powerful feature for editing code. They come from Vim rather than original vi, so a minimal vi may not have them.

  6. Set up your minimal ~/.vimrc. Three lines that pay back immediately:

    set number          " line numbers
    set tabstop=4 shiftwidth=4 expandtab  " 4-space indent
    syntax on           " syntax highlighting
    

    Add set incsearch hlsearch for live-incremental search and highlighted matches. Resist the urge to install plugins for the first month; learn the built-ins first.

  7. Practice set -o vi on the bash command line. Put set -o vi in ~/.bashrc. Now in any shell command line, press Esc and edit as if it were a buffer: bb to jump back two words, cw to change a word, dd to clear the line. The skill transfers directly from full vi.
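
To make step 4 concrete, here is a rough Python analogue of what the g flag changes in :%s/foo/bar/. It is only an illustration: substitute and every_match are hypothetical names, and Python's re syntax is not Vim's regex syntax, so the comparison holds for simple literal patterns like foo.

```python
import re


def substitute(lines, pattern, replacement, every_match=True):
    """Rough analogue of :%s/pattern/replacement/ with or without the g flag.

    every_match=True  -> like the g flag: replace all matches on each line.
    every_match=False -> no flag: replace only the first match on each line.
    """
    count = 0 if every_match else 1  # re.sub treats count=0 as "no limit"
    # Iterating over every line plays the role of the % range.
    return [re.sub(pattern, replacement, line, count=count) for line in lines]


buffer = ["foo and foo", "no match here", "foo"]
print(substitute(buffer, "foo", "bar"))
# ['bar and bar', 'no match here', 'bar']
print(substitute(buffer, "foo", "bar", every_match=False))
# ['bar and foo', 'no match here', 'bar']
```

The difference only shows on lines with more than one match, which is why forgetting the g flag often goes unnoticed at first.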

Example

Refactoring a config file in three composable commands

You're editing nginx.conf and need to change every listen 80 directive to listen 8080. You also need to add a server_name line below each listen directive. In a GUI editor this is a tedious, multi-pass affair. In vim it takes three steps: a substitute, a recorded macro, and a :g command. Start with the substitute:

:%s/listen 80;/listen 8080;/g

That's the global substitute. Now position the cursor on the first listen line and run:

qa
A
<Enter>server_name _;<Esc>
j0
q

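That recorded a macro in register a that does: append at the end of the line, start a new line, type "server_name _;", press Esc to return to normal mode, move down one line, and jump to the start of the line. Recording also performed the edit once, on the first listen line, so press u to undo it before replaying; otherwise that line ends up with two server_name lines. To replay the macro on every listen line: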

:g/^\s*listen/normal! @a

:g/pattern/cmd runs cmd on every line matching the pattern. normal! @a replays the recorded macro on each of those lines. The whole refactor (find every listen, change the port, add a server_name beneath) takes one substitute, one short recording, and one :g command. Doing the same by hand in an editor without macros means visiting every server block yourself; a single regex find-and-replace can also do it, but it usually takes more care to write than the macro does.
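
To check what the buffer should look like afterward, the two steps can be mirrored in a few lines of Python. This is a sketch of the intended result, not of Vim's internals: refactor and the sample configuration are hypothetical, and the added line gets no indentation, which matches Vim only when autoindent is off.

```python
import re


def refactor(conf_text):
    """Change `listen 80;` to `listen 8080;` and add a server_name line after each listen."""
    out = []
    for line in conf_text.splitlines():
        # Step 1: the substitute, applied to every line (the % range).
        line = line.replace("listen 80;", "listen 8080;")
        out.append(line)
        # Step 2: the lines :g/^\s*listen/ would select, followed by the macro's edit.
        if re.match(r"\s*listen", line):
            out.append("server_name _;")
    return "\n".join(out)


sample = """server {
    listen 80;
    root /var/www/a;
}
server {
    listen 80;
}"""

print(refactor(sample))
```

Each listen line should be followed by exactly one server_name line. If you see two after the first listen, the edit made while recording the macro was not undone before the :g replay.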

The deepest text object: ci( on a function call

You're staring at a function call like:

result = compute_score(player.id, season.year, include_bonus=True, weights=current_weights)

The arguments are wrong and you want to replace them all with a single **kwargs. In a GUI editor you select from after ( to before ), usually with multiple click attempts to land precisely on the parentheses. In vim, with the cursor anywhere inside the parentheses:

ci(

Three keystrokes. c is "change". i( is "inside the parentheses". vim deletes everything between the parens and drops you into insert mode at the empty spot. Type **kwargs, press Esc, and the line becomes result = compute_score(**kwargs).

That single example is a large part of why people who learn vim tend to stay with it. ci( does not care where the cursor sits within the parentheses or how long the argument list is. It operates on structure: the innermost pair of parentheses enclosing the cursor, so with the cursor inside a nested call in the arguments it changes that inner pair instead. Multiply that capability by dozens of similar text objects (ci", ci{, cit for tags, cip for paragraph) and editing becomes about expressing intent, not about positioning a cursor.
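
The "innermost enclosing pair" rule can be sketched in Python. This is a simplified illustration, not Vim's implementation: inner_parens and change_inside are hypothetical names, quotes and escapes are ignored, and leaving the text unchanged when the cursor is outside any parentheses is a choice made for this sketch.

```python
def inner_parens(text, cursor):
    """Return (start, end) of the text inside the innermost (...) enclosing `cursor`.

    `start` is the index just after "(" and `end` is the index of the matching ")".
    Returns None when the cursor is not inside any parentheses.
    """
    # Walk left from the cursor to find the nearest unmatched "(".
    depth = 0
    open_idx = None
    for i in range(cursor, -1, -1):
        ch = text[i]
        if ch == ")" and i != cursor:  # a ")" under the cursor belongs to our pair
            depth += 1
        elif ch == "(":
            if depth == 0:
                open_idx = i
                break
            depth -= 1
    if open_idx is None:
        return None
    # Walk right from that "(" to find its matching ")".
    depth = 0
    for j in range(open_idx + 1, len(text)):
        ch = text[j]
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                return open_idx + 1, j
            depth -= 1
    return None


def change_inside(text, cursor, new_text):
    """Replace the contents of the enclosing parentheses, roughly what ci( plus typing does."""
    span = inner_parens(text, cursor)
    if span is None:
        return text
    start, end = span
    return text[:start] + new_text + text[end:]


line = "result = compute_score(player.id, season.year, include_bonus=True, weights=current_weights)"
print(change_inside(line, line.index("season"), "**kwargs"))
# result = compute_score(**kwargs)
```

With a nested call such as f(a(b), c), a cursor on b selects only b, while a cursor on c selects a(b), c. The cursor position picks which pair is meant, not where the edit starts.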
