Neural networks are frequently regarded as opaque, black-box models that approximate functions without clear interpretability. However, theoretical approaches enable us to describe and visualize their inner workings more transparently. A key property that many neural networks share is piecewise linearity: the function represented by the network can be decomposed into multiple linear segments, even though the overall function is nonlinear. This characteristic arises prominently in networks that use the Rectified Linear Unit (ReLU) activation function, which is itself defined by two linear pieces meeting at zero. Given the widespread adoption of ReLU activations, examining networks built by alternating linear transformations and ReLU nonlinearities offers valuable insight into their behavior.

To illustrate, consider a simple neural network with two input neurons and one output neuron employing ReLU activation. When visualized in three dimensions, with the input variables on the x and y axes and the output on the z axis, the ReLU divides the input space into two linear regions: one where the activation is off (output zero) and one where it is on (positive linear output). Importantly, the function learned by such a network must be continuous and piecewise linear, precluding discontinuities where linear pieces fail to meet at their boundaries.

Expanding the network to eight output neurons in a single layer increases the complexity of the piecewise linear partitioning of the input space. The ReLU activations create multiple boundaries, dividing the input plane into polygons, each corresponding to a unique pattern of neuron activations (some neurons on, others off). While there could in principle be 2^8 = 256 activation patterns, geometry constrains the count: eight lines can divide the plane into at most 37 regions, the 8th central polygonal number, and in this example 32 regions are typically visible.
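The two ideas above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the talk: the weights, the `relu` and `max_regions` helper names, and the sample points are all made up for the example. It checks that a single ReLU neuron on two inputs is exactly linear on its "on" side and zero on its "off" side, and evaluates the central polygonal number formula, 1 + n + n(n-1)/2, that caps the region count at 37 for n = 8.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

# A single ReLU neuron on two inputs: f(p) = relu(w . p + b).
# Illustrative weights; the on/off boundary is the line w . p + b = 0.
w = np.array([1.0, -2.0])
b = 0.5

def neuron(p):
    return relu(w @ p + b)

# On the "on" side the neuron agrees with the linear function;
# on the "off" side it is identically zero.
p_on = np.array([3.0, 0.0])    # w . p_on + b = 3.5 > 0
p_off = np.array([-3.0, 0.0])  # w . p_off + b = -2.5 < 0
assert neuron(p_on) == w @ p_on + b
assert neuron(p_off) == 0.0

def max_regions(n):
    """Maximum regions n lines can cut the plane into
    (the central polygonal, or "lazy caterer", numbers)."""
    return 1 + n + n * (n - 1) // 2

print(max_regions(8))  # 37, far fewer than the 2**8 = 256 activation patterns
```

The geometric point is that each neuron's on/off boundary is a line, so the number of realizable activation patterns is the number of regions in a line arrangement, which grows quadratically in the neuron count rather than exponentially.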
This arrangement of lines and polygons is referred to as a polyhedral complex, representing the network's piecewise linear decomposition. Adding a second layer with eight ReLU neurons further refines the partitioning. The decision boundaries of the second layer remain linear within each region defined by the first layer but introduce "kinks" when crossing from one region to another, reflecting changes in activation patterns. Some activation patterns become infeasible depending on biases and inputs, effectively terminating certain boundaries. Computing these regions involves iterating through all possible activation patterns within each parent region and testing feasibility by solving linear inequalities derived from the network parameters.

By the third layer, the piecewise linear partitions develop intricate, curved-looking structures, despite being composed of linear segments. Visualizing these regions, colored by output magnitude, reveals how certain areas yield higher output values, appearing brighter in the graphical representation. Transitioning back to three-dimensional views offers an intuitive sense of the network's function landscape.

This exploration has so far focused on a single neural network trained to approximate the Jane Street rings pattern. However, the polyhedral complex evolves dynamically as the network's weights change. Starting from a randomly initialized, untrained network, the input space divides into only a few large polygons, representing a coarse piecewise linear approximation. As training progresses and the weights adjust, the complexity increases, resulting in many smaller polygons and more nuanced decision boundaries that better approximate the target shapes.

Ricson, who has been part of the Jane Street research desk since 2020, conducts this work alongside his interests in astrophotography and language modeling.
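The feasibility test described above can be sketched for a single ReLU layer. This is an assumption-laden illustration, not the talk's actual tooling: it uses SciPy's `linprog` as the inequality solver, random Gaussian weights, and an arbitrary bounding box. For each candidate activation pattern, each neuron contributes one linear inequality on the input, and the pattern is feasible exactly when the resulting system has a solution.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n = 8                          # neurons in one ReLU layer
W = rng.normal(size=(n, 2))    # illustrative random weights on a 2D input
b = rng.normal(size=n)

def pattern_feasible(s, box=100.0, eps=1e-6):
    """Is there an input x (within a bounding box) whose activation
    pattern is exactly s?  Each neuron gives one linear inequality:
      s_i = 1  ->  W_i.x + b_i >= eps   i.e.  -W_i.x <= b_i - eps
      s_i = 0  ->  W_i.x + b_i <= 0     i.e.   W_i.x <= -b_i
    We ask an LP solver whether the system is satisfiable."""
    A_ub, b_ub = [], []
    for i, on in enumerate(s):
        if on:
            A_ub.append(-W[i]); b_ub.append(b[i] - eps)
        else:
            A_ub.append(W[i]);  b_ub.append(-b[i])
    res = linprog(c=[0.0, 0.0], A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(-box, box)] * 2)
    return res.status == 0     # status 0 means a feasible point was found

feasible = sum(pattern_feasible(s)
               for s in itertools.product((0, 1), repeat=n))
print(feasible)  # at most 37 of the 256 patterns are geometrically realizable
```

For deeper layers, the same test runs inside each parent region by appending that region's defining inequalities to the system, which is how boundaries come to terminate mid-plane: the pattern on the far side is simply infeasible.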
His research highlights both the power and interpretability of piecewise linear neural networks, providing tools to visualize and understand how such models partition input spaces through layered ReLU activations.