
I don't think we're in disagreement. You have to consider big-O cost, memory footprint, exact numbers, know what performance to expect from various abstractions, etc. -- and then you need to choose between multiple alternatives. The first half of this process is absolutely skill-based, but I'm arguing that when you're trying to push performance to its limit, the second half unavoidably becomes expensive and brute-forcy.

For example: do you compress data sent over the network? What compression level do you use? Changing the data format can affect the optimal compression level, and vice versa: using a higher compression level means you can keep the underlying data simpler. For example, you can replace deserialization with zero-cost casts. But that might mean you spend more memory. Do you have that much memory? If you do, would giving that memory to the database for use as cache be better? And so on.
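To make that concrete, here's the kind of measurement I mean (a rough sketch only; Python and zlib purely as stand-ins, and the payload is made up):

    import time, zlib

    payload = b"some representative message " * 10_000  # stand-in for real data

    for level in range(1, 10):
        start = time.perf_counter()
        compressed = zlib.compress(payload, level)
        elapsed = time.perf_counter() - start
        print(f"level {level}: {len(compressed)} bytes, {elapsed * 1000:.2f} ms")

Each level shifts the size/CPU balance a little, and which one is "right" depends on everything else in the pipeline.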

The individual choices are simple, but they compound and affect the overall performance in unpredictable ways. The only way to be sure you aren't missing something obvious is to check all, or at least most, combinations.
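Concretely, "checking most combinations" ends up looking like a grid search over the tuning knobs. A sketch, where the knobs and the benchmark function are placeholders:

    import itertools

    # Hypothetical knobs; the real ones depend on your system.
    compression_levels = [0, 1, 6, 9]
    serializations     = ["json", "zero-copy"]
    db_cache_mb        = [256, 1024, 4096]

    def run_benchmark(level, fmt, cache_mb):
        # Placeholder: run the real workload here and return, say, p99 latency.
        return 0.0

    results = {
        combo: run_benchmark(*combo)
        for combo in itertools.product(compression_levels, serializations, db_cache_mb)
    }

    best = min(results, key=results.get)
    print("best combination so far:", best)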




> I don't think we're in disagreement.

Very likely we agree on many points but perhaps disagree on the outcome :)

So to expand & reply...

> do you compress data sent over the network?

It depends on the size of the data, the complexity of the data, the performance of your system, and the performance of the receiving system.

The choice you make might be correct for one set of variables, but a different set of variables might call for a different choice. If the receiving system is overloaded, decompression adds to that load. If the sending system is overloaded, compression adds to that load.
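As a back-of-the-envelope illustration (all numbers invented; measure your own), compression only pays off when the time saved on the wire exceeds the time spent compressing and decompressing:

    # Rough model with invented numbers -- measure your own.
    payload_mb        = 10
    compress_ratio    = 0.4     # compressed size / original size
    link_mb_per_s     = 100     # network throughput
    compress_mb_per_s = 200     # sender CPU throughput at its current load
    decomp_mb_per_s   = 500     # receiver CPU throughput at its current load

    uncompressed_s = payload_mb / link_mb_per_s
    compressed_s   = (payload_mb / compress_mb_per_s
                      + payload_mb * compress_ratio / link_mb_per_s
                      + payload_mb / decomp_mb_per_s)

    print("send as-is:      %.3f s" % uncompressed_s)
    print("compress + send: %.3f s" % compressed_s)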

But compression over the network is often a moot point, since most network traffic should be encrypted and compression can be used to defeat encryption (see compression-oracle attacks such as BREACH [0] and CRIME [1]).

[0]: https://en.wikipedia.org/wiki/BREACH_(security_exploit)

[1]: https://en.wikipedia.org/wiki/CRIME

> using a higher compression level means you can keep the underlying data simpler

Why? Moreover, I would never rely on data compressing well as a justification for keeping the data format simpler (but larger).

> would giving that memory to the database for use as cache be better?

As in, shared-memory? Or do you mean to implement your own database?

> The individual choices are simple, but they compound and affect the overall performance in unpredictable ways.

I disagree about the unpredictability. I've rarely found an effect to be truly unpredictable. Generally, when something seemed unpredictable, it turned out to be something I didn't understand.

> The only way to be sure you aren't missing something obvious is to check all, or at least most, combinations.

Would you check all combinations of inputs for a given continuous function? No, of course not, that would be effectively impossible.

The engineer should identify the edge cases and the corner cases and ensure that the product behaves acceptably on the systems being targeted. There's a saying: "you only care about the performance of the things you benchmark, so be careful what you benchmark". So if you don't have any performance benchmarks, you don't care about performance. If you do have performance benchmarks, run them on the systems you intend to deploy to.
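Even a crude benchmark beats none. A minimal sketch with Python's timeit (the workload is a stand-in):

    import timeit

    def workload():
        # Stand-in for the code path you actually care about.
        sum(i * i for i in range(10_000))

    # Run this on the systems you intend to deploy to, and track the
    # numbers over time so regressions actually show up.
    runs = timeit.repeat(workload, number=100, repeat=5)
    print("best of 5: %.4f s per 100 calls" % min(runs))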

Ultimately, it takes a seasoned engineer to understand a whole system, and especially to understand what a given change will cost that whole system in performance. If a seasoned engineer can do it, then AI is just around the corner to do it automatically. That's not a brute-force task. We just don't have a whole lot of seasoned engineers; we do have a ton of engineers with deep domain-specific knowledge, though, and those engineers will often brute-force some aspect they don't understand. They can usually fix it in a patch, after all.

Performance is a trade-off between various aspects of various systems. That doesn't make it a brute-force task. The trade-off decisions can be made with the right data, and the right data can be found either via brute force with benchmarks or from proper documentation about the system. The system might tell you about its performance if you ask it (e.g., the system is self-documenting), but you also have to beware that the performance can change -- and neither the benchmark (brute force) nor the documentation will tell you unless you look for it (re-run the benchmarks or re-query the documentation).
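A tiny illustration of "asking the system" (Linux-only here, and just a sketch; the 25% cache split is an arbitrary placeholder):

    import os

    # Linux-only: ask the machine what it has right now.
    page_size    = os.sysconf("SC_PAGE_SIZE")
    phys_pages   = os.sysconf("SC_PHYS_PAGES")
    total_ram_gb = page_size * phys_pages / 1024**3

    print("CPUs:", os.cpu_count())
    print("RAM:  %.1f GiB" % total_ram_gb)

    # A cache-sizing decision based on what the machine reports today.
    # The 25% split is an arbitrary placeholder; re-query (or re-benchmark)
    # because tomorrow's machine, or tomorrow's load, may differ.
    db_cache_gb = max(1, int(total_ram_gb * 0.25))
    print("give the database cache: %d GiB" % db_cache_gb)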



