Customize the compilation process with Clang: Optimization options

August 5, 2019
Serge Guelton

    When using C++, developers generally aim to keep a high level of abstraction without sacrificing performance. That's the famous motto "costless abstractions." Yet the C++ language actually doesn't give a lot of guarantees to developers in terms of performance. You can have the guarantee of copy-elision or compile-time evaluation, but key optimizations like inlining, unrolling, constant propagation or, dare I say, tail call elimination are subject to the goodwill of the standard's best friend: the compiler.

    This article focuses on the Clang compiler and the various flags it offers to customize the compilation process. I've tried to keep this from being a boring list, and it certainly is not an exhaustive one.

    This write-up is an expanded version of the talk "Merci le Compilo" given at CPPP on June 15, 2019.

    The clang version used is based on trunk, running on RHEL 7.

    Every now and then, I'll be using the SQLite Amalgamation C source as a large piece of third-party code. Let's assume that the following line has been run in your shell:

    sq=https://raw.githubusercontent.com/azadkuh/sqlite-amalgamation/master/sqlite3.c
    

    Introduction: Stating goals

    The following source code is a relatively dumb version of a program that sums up numbers read from standard input. It's most likely memory bound, but there's still some processing going on:

    #include <iostream>
    int main(int argc, char** argv) {
      long s = 0;
      while (std::cin) {
        long tmp = 0;
        std::cin >> tmp;
        s += tmp;
      }
      std::cout << s << std::endl;
      return 0;
    }
    

    This is a relatively similar—but not equivalent—program written in Python. Python uses big integers by default so it behaves differently with respect to overflow, but it's enough for our purposes.

    import sys
    print(sum(int(x) for x in sys.stdin.readlines()))
    

    Let's take a dumb approach and measure the execution time of these two programs on a relatively large input set:

    $ seq 1000000 > numbers
    $ clang++ sum.cpp -o sum
    $ time ./sum < numbers
    0.61s user 0.01s system 94% cpu 0.659 total
    
    $ time python sum.py < numbers
    0.77s user 0.04s system 99% cpu 0.818 total
    

    The native code certainly is faster, but not by much. We can't draw too many conclusions from a single run, but there's at least one sure thing: The clang user has not specified their intent, so the compiler just generated a valid binary—this is thankfully a hard constraint—and didn't try to optimize it for whatever metric its user is interested in.

    Had the user wanted to optimize for execution speed, they should have specified that intent, say, through the -O2 flag:

    $ clang++ -O2 sum.cpp -o sum
    $ time ./sum < numbers
    0.34s user 0.00s system 99% cpu 0.348 total
    

    Multi-criteria optimization

    For a wide range of codebases, there's more to it than just optimizing for speed. Sometimes you want to limit the size of the binary; sometimes you're okay with trading speed for extra security. This also depends on where you are in the development life cycle: during code editing, for example, you want a fast analysis of your code, while during bug tracking you want as much debug information as possible, and so on.

     #
     ##                           #
     ##                           ##
     ##            ##             ##
     ##            ##             ##
     ##            ##             ##
     ##    ##      ##             ##
     ##    ##      ##      #      ##
     ##    ##      ##      ##     ##
    PERF  DEBUG   EDIT    SECU   SIZE
    

    Performance

    "I want the generated binary to run fast" is a very common request to the compiler, so the following flags are among the most used ones:

    • -O0: No optimization at all.
    • -O1: O1 = (O0 + O2)/2. I scarcely use this flag.
    • -O2: Optimize as much as possible, without taking the risk of significantly increasing the binary size or degrading performance.
    • -O3: Optimize even more, trading binary size for speed, and sometimes making decisions that may negatively impact performance.
    • -O4: The same as -O3; the extra level is a myth.

    Bonus: -O3 -mllvm -polly activates polyhedral optimizations, if Clang was compiled with Polly support.
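
    To see the effect of these levels on the earlier sum.cpp yourself, a quick and admittedly unscientific comparison looks like the following (output names are arbitrary, and I'm deliberately not quoting numbers, since timings vary from machine to machine):

    $ clang++ -O2 sum.cpp -o sum_O2
    $ clang++ -O3 sum.cpp -o sum_O3
    $ time ./sum_O2 < numbers
    $ time ./sum_O3 < numbers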

    Debug

    "I want to debug my code, and I don't care about performance" is sadly a common request too :-/

    • -g: Include debug information.
    • -Og: Equivalent to -O1 in Clang; add -g yourself to get debug info. That's already a trade-off between performance and debuggability.
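
    As a minimal sketch of a typical debug build (assuming gdb is installed; any debugger will do), you could rebuild the earlier sum.cpp like this:

    $ clang++ -Og -g sum.cpp -o sum
    $ gdb ./sum
    (gdb) break main
    (gdb) run < numbers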

    For the curious ones, the following snippet verifies that debug information sections are actually generated when passing the -g flag:

    $ curl $sq | clang -xc -c -g - -o sq.o
    $ objdump -h sq.o | grep debug
      #  name            size      ...
       9 .debug_str      00012b2d  ...
      10 .debug_abbrev   0000038d  ...
      11 .debug_info     0005056c  ...
      12 .debug_ranges   00000240  ...
      13 .debug_macinfo  00000001  ...
      14 .debug_pubnames 0000c73a  ...
      15 .debug_pubtypes 00001068  ...
      19 .debug_line     00073402  ...
    

    Security

    "I want to protect my code from others—and from myself" is a concern that's growing in importance these days. There aren't a lot of flags that impact security without impacting performance, but it's worth mentioning -D_FORTIFY_SOURCE=2. This picks a different declaration for a few functions, for example:

    $ clang -xc -c -O2 - -S -emit-llvm -o - -D_FORTIFY_SOURCE=2 << EOF
    #include <stdio.h>
    void foo(char *s) {
      printf(s, s);
    }
    EOF
    define void @foo(i8*) {
      %2 = tail call i32 (i32, i8*, ...) @__printf_chk(i32 1, i8* %0, i8* %0)
      ret void
    }
    

    The macro definition enables a hardened version of printf, namely __printf_chk, which also checks the number of variadic arguments.
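
    A quick way to double-check this on an object file is to look for the fortified symbol with nm; on a glibc-based system such as RHEL, you should see an undefined reference to __printf_chk (the file name foo.o is arbitrary here):

    $ clang -xc -c -O2 -D_FORTIFY_SOURCE=2 - -o foo.o << EOF
    #include <stdio.h>
    void foo(char *s) {
      printf(s, s);
    }
    EOF
    $ nm foo.o | grep printf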

    Size

    "I want to do some kind of weight control over my binary" may be a valid requirement for some embedded systems. In that case, you can use:

    • -Os: Same as -O2 with extra code size optimization, including different parameters for transformations like inlining.
    • -Oz: Same as -Os with more size optimizations, at the price of less performance.

    Let's showcase the impact of these flags on the amalgamation object file:

    $ curl $sq|clang -xc - -O2 -c -o-|wc -c
    1488400
    $ curl $sq|clang -xc - -Os -c -o-|wc -c
    850696
    $ curl $sq|clang -xc - -Oz -c -o-|wc -c
    796976
    
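
    If you want a per-section breakdown rather than just the raw object size, the size utility from binutils (or llvm-size) works on these object files as well; the exact output format varies between implementations:

    $ curl $sq | clang -xc - -Oz -c -o sq_z.o
    $ size sq_z.o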

    Editing

    The compiler also helps to produce better code through a bunch of warning and code-editing features:

    • -Wall: (Almost) all warnings.
    • -Werror[=...]: If you believe that a warning should be an error, you can selectively promote it to one, per warning (see the example after the completion listing below).
    • -w: If you don't know what it does, you probably don't want to use it :-)
    • -Xclang -code-completion-at: An internal flag that can be used by IDEs to provide smart code completion.
    $ cat hello.cpp
    #include <iostream>
    int main(int argc, char**argv) {
      std::co
    $ clang++ -Xclang -code-completion-at=hello.cpp:3:10 -fsyntax-only hello.cpp
    COMPLETION: codecvt : codecvt<<#typename _InternT#>, <#typename _ExternT#>, <#typename _StateT#>>
    COMPLETION: codecvt_base : codecvt_base
    ...
    COMPLETION: cout : [#ostream#]cout
    

    In this case, clang outputs all identifiers starting with co available in namespace std.
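
    Coming back to -Werror[=...], promoting a single diagnostic to an error while leaving the others as plain warnings looks like this (unused-variable is just an arbitrary example of a warning name):

    $ clang++ -Wall -Werror=unused-variable sum.cpp -o sum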

    In the next article, we'll look at various compromises and tradeoffs involved in optimization, such as debug precision versus binary size, the impact of the optimization level on compilation time, and performance versus security. Stay tuned.

    Last updated: July 29, 2019
