I think that there is a different reason that an emphasis on simple code often results in faster systems. When you write simple code, you spend less time writing code. Therefore, you have more time left to invest in optimizing the very small subset of your overall system that actually matters. You didn't burn engineering resources for speed where it didn't matter.
I'd argue that is a big part of the point I am making. If you take too big of a bite, the time it takes to build it optimally blows up the way NP-hard problems do. If the bites are the right size, the time and resources each one takes stay balanced against all the other bites you make, giving a locally optimal answer under your overall resource constraints. Long story short, cutting a problem into manageable pieces is a solid strategy. I will add one thing, though: most people think they have cut things into manageable pieces, but in reality they have left them too intertwined and they aren't really independent. For divide and conquer to actually work, the pieces must have clearly defined, and very limited, communication.
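A minimal sketch of what "clearly defined, and very limited, communication" can look like in practice (the names and the parsing task are illustrative, not from the thread): two pieces whose entire contract is one small data type, so neither can reach into the other's internals.

```python
from dataclasses import dataclass

# The whole contract between the two pieces is this one small type.
@dataclass(frozen=True)
class ParseResult:
    tokens: list

def parser(text: str) -> ParseResult:
    # This piece knows nothing about what happens downstream.
    return ParseResult(tokens=text.split())

def counter(result: ParseResult) -> int:
    # This piece sees only the narrow ParseResult interface,
    # never the parser's internals.
    return len(result.tokens)

print(counter(parser("divide and conquer")))  # prints 3
```

Because the only shared surface is `ParseResult`, either piece can be rewritten or optimized independently, which is exactly what intertwined pieces forbid.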
> When you write simple code, you spend less time writing code.
I have found that many engineers write complex code faster than simple code.
You're given requirements like: "the program should do W when the user does A, X when the user does B, Y when the user does C, and Z when the user does D." And a naive programmer will happily trot off and write a pile of code for each of those cases, often with a lot of redundancy between them.
It takes more time and judgement to analyze those cases, see what they have in common, and distill a simpler underlying model for the behavior that encompasses all of the requirements.