Researchers have argued for decades that functional programming simplifies parallel programming, in particular by helping programmers avoid difficult concurrency bugs arising from destructive in-place updates. However, parallel functional programs have historically underperformed parallel programs written in lower-level languages. The difficulty is that functional programs have a high demand for memory, and this demand only grows with parallelism, causing traditional parallel memory management techniques to buckle under the increased pressure. Recent work has made progress on this problem by identifying a broadly applicable memory property called disentanglement. To exploit disentanglement for improved efficiency and scalability, we show how to partition memory into a tree of heaps that mirrors the dynamic nesting of parallel tasks. This design allows task-local allocations and garbage collections to proceed independently and in parallel. The result is a provably efficient parallel memory manager. These ideas have been incorporated into the MPL (“maple”) compiler for Parallel ML, which offers practical efficiency and scalability for parallel functional programs. Our empirical evaluations show that, at scale (on 72 processors), MPL outperforms modern implementations of both functional and imperative languages, including Java and Go. Additionally, we show that MPL is competitive with low-level, memory-unsafe languages such as C++ in terms of both space and time.
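
To make the correspondence between task nesting and the heap tree concrete, here is a minimal sketch in Parallel ML using MPL's `ForkJoin.par` primitive (`val par : (unit -> 'a) * (unit -> 'b) -> 'a * 'b`). The example program itself (a parallel Fibonacci) is illustrative and not drawn from the paper; only the `ForkJoin.par` interface is MPL's.

```sml
(* Sketch: each call to ForkJoin.par forks two child tasks. In MPL's
   design, every task is paired with its own heap, so the dynamic
   nesting of par calls below induces a matching tree of heaps in
   which each task allocates, and can be collected, independently. *)
fun fib n =
  if n < 2 then n
  else
    let
      val (a, b) =
        ForkJoin.par (fn () => fib (n - 1),
                      fn () => fib (n - 2))
    in
      a + b
    end

val _ = print (Int.toString (fib 30) ^ "\n")
```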