Researchers have argued for decades that functional programming can greatly simplify writing parallel programs, for example by controlling side effects and avoiding race conditions. However, parallel functional programs have historically performed poorly in comparison to their imperative counterparts. The primary reason is that functional programs allocate memory at a high rate, and that rate only grows with parallelism, causing traditional memory management techniques to buckle under the increased demand. In this talk, I identify a memory property called disentanglement and show that it emerges naturally in race-free parallel programs. To exploit disentanglement for improved efficiency, I describe how to integrate memory management with thread scheduling, ultimately enabling each processor to allocate memory and perform garbage collection (GC) independently and in parallel. This approach yields a provably efficient GC policy: for any race-free program whose sequential execution requires R* live memory, execution on P processors with GC requires O(P R*) memory in total and only O(P R*) additional work. We implemented these techniques in the MPL compiler, and our experimental results show good performance and scalability. With respect to a sequential baseline, MPL uses only 1-5x additional memory while obtaining 12-51x speedup on 70 cores. In a small sorting competition, MPL is 50% faster than Go and nearly 2x faster than Java while using comparable space.