This paper establishes non-constant lower bounds on the depth of ReLU networks representing certain piecewise linear functions, specifically proving that $\Omega(\log\log d)$ layers are needed to compute the maximum of $d$ numbers under assumptions related to the braid fan. It also gives a combinatorial proof that 3 layers are necessary to compute the maximum of 5 numbers, and shows limitations in generalizing the known upper bounds to maxout networks.
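For context, the standard upper-bound construction uses the identity $\max\{a,b\} = \mathrm{ReLU}(a-b) + \mathrm{ReLU}(b) - \mathrm{ReLU}(-b)$ and a binary tree of pairwise maxima, computing the maximum of $d$ numbers with roughly $\lceil\log_2 d\rceil$ ReLU layers; the paper's lower bounds ask how far this depth can be pushed down. Below is a minimal Python sketch of that standard construction (not code from the paper; the function names are illustrative).

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def max2_relu(a, b):
    """Exact max of two numbers with one hidden ReLU layer:
    max(a, b) = ReLU(a - b) + ReLU(b) - ReLU(-b)."""
    return relu(a - b) + relu(b) - relu(-b)

def max_relu_tree(values):
    """Maximum of d numbers via a binary tree of pairwise ReLU maxima,
    using about ceil(log2(d)) rounds of the max2_relu gadget."""
    layer = list(values)
    while len(layer) > 1:
        nxt = [max2_relu(layer[i], layer[i + 1])
               for i in range(0, len(layer) - 1, 2)]
        if len(layer) % 2 == 1:  # an odd leftover passes through unchanged
            nxt.append(layer[-1])
        layer = nxt
    return layer[0]

if __name__ == "__main__":
    x = np.random.randn(5)
    print(max_relu_tree(x), np.max(x))  # the two values agree
```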
The work contributes to a fundamental understanding of neural network expressivity, which can guide the design of more efficient and powerful architectures in the future.