Indexing

Definition

An index is a precomputed auxiliary data structure that lets a system answer queries without scanning the entire underlying dataset. The trade is unambiguous: spend extra storage and pay an upkeep cost on every write so that reads become dramatically faster. A linear scan of N items costs O(N); the right index turns the same lookup into O(log N) with a balanced tree, O(1) with a hash table, or near-instant with a bitmap or inverted index.

Indexing is the engine under every fast lookup in computing. Databases, full-text search engines, filename caches like the Unix locate command, package managers, and even the filesystem's directory structures all rely on some form of index to make queries feasible at scale.

Why it matters

How it works

The simplest index is a sorted copy of a column. To find a value, binary-search the sorted list rather than scan the entire source. This is O(log N) but rebuilding the sorted list on every insert is O(N), so production systems use self-balancing trees — typically B-trees or B+ trees — that keep the data sorted with O(log N) insert and lookup. The B-tree's high branching factor keeps the tree shallow enough that a million-row table needs only three or four disk reads to find any value.

Different query shapes need different indexes. Equality lookups against a single column are best served by hash indexes, which compute a hash of the search key and jump directly to the bucket. Substring or full-text search needs an inverted index, which pre-tokenises every document and stores, for each token, the list of documents containing it. Geographic queries use spatial trees like R-trees or quad-trees. Column-oriented analytics warehouses use bitmap indexes that compress beautifully for low-cardinality columns. The universal lesson is that indexing is not one technique but a family of structures, each optimised for a particular query pattern — and the cost of choosing the wrong index can be greater than no index at all.

Where it goes next

Data Structuresshares tag: algorithms
Act As Ifshares tag: performance
Act of Courageshares tag: performance
Arraysshares tag: data-structures
Associative Arraysshares tag: data-structures
Career-Best Effortshares tag: performance
Chess Treeshares tag: algorithms
Compressionshares tag: algorithms
Computer Chessshares tag: algorithms
Computer Scienceshares tag: algorithms
Creative Incubationshares tag: performance
Crisis Responseshares tag: performance
Dataframeshares tag: data-structures
Decision Procedureshares tag: algorithms
Drawdownshares tag: performance
Effortless Actionshares tag: performance
Embedded Databaseshares tag: databases
Expertiseshares tag: performance
Feedbackshares tag: performance
Feeling as Instrumentshares tag: performance
Filesystemshares tag: data-structures
Flowshares tag: performance
Forward Orientationshares tag: performance
Inner Gameshares tag: performance
Iterationshares tag: algorithms
Latent Learningshares tag: performance
Loopsshares tag: algorithms
Machine Learningshares tag: algorithms
Mental Rehearsalshares tag: performance
Motivation Ritualshares tag: performance
Palindromeshares tag: algorithms
Positive Motivationshares tag: performance
Self-Presentationshares tag: performance
Self-Relianceshares tag: performance
Spontaneityshares tag: performance
Stackshares tag: data-structures
Success-Type Personalityshares tag: performance
Threading Modelshares tag: performance
Winning Feelingshares tag: performance
Yerkes-Dodson Lawshares tag: performance

Indexing

Definition

Why it matters

How it works

Where it goes next

Continue exploring

Tags

Indexing

Definition

Why it matters

How it works

Where it goes next

Related concepts

Continue exploring

Tags