We discuss various ways of improving the scalability and performance of optimisation algorithms, especially when special structure is present: from linear embeddings to nonlinear ones, and from deterministic to stochastic approaches. A particularly beneficial structure is that of low-rank functions, which occur commonly in applications involving overparameterized models and which can serve as an insightful proxy for the training landscape of neural networks. We discuss the presence of this structure in neural networks, together with its implications and its connections to feature learning.