
The default usage of perf does this. There are also a few profilers I know of that will show the functions taking the most time.

IMO, those are (generally) nowhere near as useful as a flame/icicle graph.

Not saying they are never useful; sometimes people do really dumb things in one function. However, the actual performance bottleneck often lives at least a few levels up the stack.




Which is why the defaults for perf always drive me crazy. You want to see the entire call tree with the cumulative and exclusive time spent in all the functions.
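To make the cumulative vs. exclusive distinction concrete, here is a rough sketch in Python. It assumes the samples have already been collapsed into "folded" stacks (semicolon-separated frames, root first, with a sample count); the function names and counts are made up for illustration and do not come from any real profile.

```python
from collections import Counter


def self_and_cumulative(folded):
    """Compute exclusive (self) and cumulative sample counts per function.

    `folded` maps a stack string such as "main;parse;memcpy" (root first,
    leaf last) to the number of samples observed with that exact stack.
    """
    self_time = Counter()
    cumulative = Counter()
    for stack, count in folded.items():
        frames = stack.split(";")
        # Exclusive time goes only to the leaf frame that was executing.
        self_time[frames[-1]] += count
        # Cumulative time goes to every function on the stack; set() keeps
        # recursive functions from being counted twice for one sample.
        for fn in set(frames):
            cumulative[fn] += count
    return self_time, cumulative


# Hypothetical samples, invented for this example.
samples = {
    "main;handle_request;parse;memcpy": 60,
    "main;handle_request;render;memcpy": 30,
    "main;handle_request;render": 10,
}

self_time, cumulative = self_and_cumulative(samples)
for fn, total in cumulative.most_common():
    print(f"{fn:15} cumulative={total:3} self={self_time[fn]:3}")
```

A flat "top functions" view of this made-up data would point at memcpy, while the cumulative column shows that parse, a few levels up the stack, accounts for most of those calls.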

I’m honestly curious why the defaults are the way they are. I have basically never found them to be what I want. Surely the perf people aren’t doing something completely different than I am?

I almost never find graph usage useful, TBH (and flamegraphs are worse than useless). And perf's support for stack traces is always wonky _somehow_, so it's not easy to find good defaults for the cases where I need them (I tend to switch between fp, lbr and dwarf depending on a whole lot of factors).

Tell me about it!

I think I've only been able to get good call stacks when I build everything myself with the right compilation options. This is a big contrast with what I remember from working with similar tools in MSFT environments (MS Profiler or VTune).

You can get it to work, but it's a pain.


To be honest, I don't like Linux profiling tools at all. Clearly the people working on them have a very different set of problems than I do.

I think it boils down to what Brendan Gregg likes. He must be doing a somewhat different type of work, and so he likes these defaults.


