CPU Profiling

Measuring and analysing how a program uses the CPU while it is executing

  • Understand how efficiently a program uses the CPU
  • Identify performance bottlenecks
  • Optimise the program to run faster and more efficiently

Metrics

  • CPU usage: percentage of CPU capacity being used by the program; low usage might mean the program is waiting for other resources (I/O, …)
  • Execution time: total time taken by the CPU to execute a program or a specific function within the program
  • Call graph: representation (often visual) of which functions called which during execution, usually annotated with the time spent in each
  • CPU cycles: number of clock cycles the CPU spends executing a particular section of code; a high cycle count can point to inefficiencies (complex calculations or frequent memory accesses)
  • Instruction count: number of instructions executed by the CPU; fewer instructions usually means less work, although not always a shorter run time
  • Cache misses: when the CPU cannot find the required data in its cache and must fetch it from a slower cache level or from main memory
  • Thread performance: how well the program's threads are scheduled and executed across the available CPU cores
  • Context switching: overhead when the CPU switches from executing one process or thread to another
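
The first two metrics can be approximated without a dedicated profiler. The sketch below is a minimal Python illustration, not a replacement for a real tool: it compares wall-clock time with CPU time for one call, and the ratio gives a rough CPU usage figure. The helper `measure` and the two workloads `cpu_bound` and `io_bound` are hypothetical names made up for this example.

```python
import time


def measure(func, *args):
    """Return (result, wall_seconds, cpu_seconds) for one call to func."""
    wall_start = time.perf_counter()   # wall-clock time
    cpu_start = time.process_time()    # CPU time (user + system) of this process
    result = func(*args)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return result, wall_seconds, cpu_seconds


def cpu_bound(n):
    # Hypothetical workload: pure computation, keeps the CPU busy
    return sum(i * i for i in range(n))


def io_bound(seconds):
    # Hypothetical workload: sleeping stands in for waiting on I/O
    time.sleep(seconds)
    return seconds


if __name__ == "__main__":
    workloads = (("cpu_bound", cpu_bound, 2_000_000), ("io_bound", io_bound, 0.2))
    for name, func, arg in workloads:
        _, wall, cpu = measure(func, arg)
        usage = 100 * cpu / wall if wall > 0 else 0.0
        print(f"{name}: wall={wall:.3f}s cpu={cpu:.3f}s usage={usage:.0f}%")
```

The computation should report a usage close to 100%, while the sleeping workload should report a usage near 0%, which is the "waiting for other resources" case described under CPU usage.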

Steps

  1. Determine which part of the program you want to profile: the entire application, a specific function, or a section of code
  2. Use a profiling tool to run the program while collecting performance data
  3. Examine the collected data to identify performance bottlenecks: functions that consume a lot of CPU time, have high instruction counts, or cause frequent cache misses
  4. Make changes to the code to improve performance: rewriting inefficient algorithms, optimising loops, reducing function calls, better managing memory
  5. Re-profile to confirm the change actually helped, then iterate
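
The steps above can be sketched with Python's built-in cProfile module. cProfile is not one of the tools listed in these notes; it is used here only because it keeps the example self-contained. The functions `slow_sum_of_squares`, `square`, `fast_sum_of_squares` and `profile` are hypothetical names for this illustration.

```python
import cProfile
import io
import pstats


def square(x):
    return x * x


def slow_sum_of_squares(n):
    # Step 1: the code under test. Deliberately inefficient:
    # one function call per element plus an intermediate list.
    squares = []
    for i in range(n):
        squares.append(square(i))
    return sum(squares)


def fast_sum_of_squares(n):
    # Step 4: same result without the per-element call or the list
    return sum(i * i for i in range(n))


def profile(func, *args, top=5):
    """Steps 2 and 3: run func under cProfile, return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call paths come first
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


if __name__ == "__main__":
    # Step 5: profile both versions and compare the reports
    for candidate in (slow_sum_of_squares, fast_sum_of_squares):
        result, report = profile(candidate, 200_000)
        print(candidate.__name__, "->", result)
        print(report)
```

In the report for the slow version, the calls to `square` and `append` should show up with one call per element, which is the kind of hotspot step 3 looks for.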

Tools

  • gprof: GNU profiler for Unix-like systems; the program must be compiled with profiling enabled (-pg with GCC)
  • Visual Studio Profiler: integrated into Microsoft Visual Studio
  • perf: Linux tool for profiling CPU performance
  • Intel VTune Profiler: Intel's performance analysis tool, based on hardware event sampling
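
As a rough sketch of how perf can report several of the metrics listed earlier, the Python wrapper below builds and runs a `perf stat` command. It assumes perf is installed and that the user is allowed to read the hardware counters; the helper names `perf_stat_command` and `run_perf_stat` are hypothetical, and the profiled program is simply the Python interpreter running a small loop.

```python
import shutil
import subprocess
import sys

# Standard perf event names, matching several entries in the Metrics list
EVENTS = ("cycles", "instructions", "cache-misses", "context-switches")


def perf_stat_command(program_argv, events=EVENTS):
    """Build the argv for: perf stat -e <events> -- <program ...>"""
    return ["perf", "stat", "-e", ",".join(events), "--", *program_argv]


def run_perf_stat(program_argv):
    """Run the program under perf stat.

    Returns perf's text report, or None when perf is not on PATH.
    """
    if shutil.which("perf") is None:
        return None
    completed = subprocess.run(
        perf_stat_command(program_argv),
        capture_output=True,
        text=True,
        check=False,  # perf may fail (e.g. missing permissions); keep its message
    )
    return completed.stderr  # perf stat writes its counter summary to stderr


if __name__ == "__main__":
    # Assumption: a tiny CPU-bound command stands in for the real program
    target = [sys.executable, "-c", "print(sum(range(10**6)))"]
    report = run_perf_stat(target)
    print(report if report is not None else "perf not found on PATH")
```

Counting events with `perf stat` gives totals for the whole run; attributing them to individual functions needs a sampling run instead (`perf record` followed by `perf report`).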

TODO: DETAIL USAGE OF gprof, perf and VSProfiler