Definition
A neural network is a computational architecture loosely inspired by the brain's connectivity. It consists of interconnected processing units (artificial neurons) organized in layers. Each unit computes a weighted sum of its inputs, adds a bias, and applies a non-linear activation function. Networks "learn" by adjusting their weights and biases to minimize a loss function on training data, typically via gradient descent, with the gradients computed by backpropagation.
Why it matters
How it works
A network is a directed graph of weighted connections between units. Each unit computes its output as f(Σᵢ wᵢxᵢ + b), where xᵢ are the unit's inputs, f is a non-linear activation (for example ReLU, sigmoid, or GELU), wᵢ are learnable weights, and b is a learnable bias. To train: present input-output pairs, compute the loss between predicted and target outputs, backpropagate to obtain the gradient of the loss with respect to every weight and bias, and move each parameter a small step against its gradient. With enough data, parameters, and compute, networks can approximate a very broad class of input-output mappings, though how well they do so in practice depends on the architecture and on the optimization.
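The training loop described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function names (init_params, forward, backward, train), the XOR toy dataset, the mean-squared-error loss, the use of sigmoid in both layers, and the hidden size, learning rate, and step count are all assumptions chosen to keep the example small.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic activation f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))


def init_params(n_in, n_hidden, n_out, seed=0):
    """Random weights and zero biases for one hidden layer (illustrative sizes)."""
    rng = np.random.default_rng(seed)
    return [
        rng.normal(0.0, 1.0, (n_in, n_hidden)),   # W1
        np.zeros(n_hidden),                       # b1
        rng.normal(0.0, 1.0, (n_hidden, n_out)),  # W2
        np.zeros(n_out),                          # b2
    ]


def forward(params, X):
    """Each layer computes f(sum_i w_i x_i + b) for all of its units at once."""
    W1, b1, W2, b2 = params
    H = sigmoid(X @ W1 + b1)   # hidden-layer outputs
    Y = sigmoid(H @ W2 + b2)   # network predictions
    return H, Y


def loss_fn(Y, T):
    """Mean squared error between predictions Y and targets T."""
    return float(np.mean((Y - T) ** 2))


def backward(params, X, T):
    """Backpropagation: gradients of the loss w.r.t. W1, b1, W2, b2."""
    W1, b1, W2, b2 = params
    H, Y = forward(params, X)
    dY = 2.0 * (Y - T) / Y.size      # d loss / d Y
    dZ2 = dY * Y * (1.0 - Y)         # through the output sigmoid
    dW2 = H.T @ dZ2
    db2 = dZ2.sum(axis=0)
    dH = dZ2 @ W2.T                  # pass the error back to the hidden layer
    dZ1 = dH * H * (1.0 - H)         # through the hidden sigmoid
    dW1 = X.T @ dZ1
    db1 = dZ1.sum(axis=0)
    return [dW1, db1, dW2, db2]


def train(X, T, n_hidden=4, lr=0.5, steps=5000, seed=0):
    """Plain gradient descent: step every parameter against its gradient."""
    params = init_params(X.shape[1], n_hidden, T.shape[1], seed)
    losses = []
    for _ in range(steps):
        _, Y = forward(params, X)
        losses.append(loss_fn(Y, T))
        grads = backward(params, X, T)
        params = [p - lr * g for p, g in zip(params, grads)]
    return params, losses


if __name__ == "__main__":
    # XOR: a small mapping that no single linear unit can represent.
    X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    T = np.array([[0.0], [1.0], [1.0], [0.0]])
    params, losses = train(X, T)
    print("initial loss:", losses[0])
    print("final loss:  ", losses[-1])
    print("predictions: ", forward(params, X)[1].ravel())
```

Real systems normally rely on automatic differentiation and mini-batches rather than hand-written gradients over the full dataset; the manual backward function is written out here only to make the flow of gradients visible. How closely the predictions match the targets depends on the random seed and the assumed hyperparameters.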