Analysing and measuring the performance of a CPU while it is executing a program
- Understand how efficiently a program uses the CPU
- Identify performance bottlenecks
- Optimise the program to run faster and more efficiently
Metrics
- CPU usage: percentage of CPU capacity used by the program; low usage may mean the program is waiting on other resources (I/O, …)
- Execution time: total time taken by the CPU to execute a program or a specific function within the program
- Call graph: representation of which functions call which during execution, usually annotated with call counts and time spent per function
- CPU cycles: number of clock cycles the CPU spends executing a particular section of code; a high cycle count relative to the work done can indicate inefficiency (expensive calculations, or stalls on frequent memory accesses)
- Instruction count: number of instructions executed by the CPU; fewer instructions usually means less work, although the cycles each instruction costs also matter
- Cache misses: when the CPU cannot find the required data in its cache and must fetch it from a slower cache level or from main memory
- Thread performance: how well the program's threads are scheduled and executed across the available cores (load balance, time spent waiting on locks, idle time)
- Context switching: overhead when the CPU switches from executing one process or thread to another
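
The first two metrics can be approximated without a dedicated profiler. The sketch below is only an illustration, assuming Python's standard time module: it compares wall-clock time with CPU time for two hypothetical functions (measure, busy_sum and wait_then_sum are names made up for this example). The ratio of CPU time to wall-clock time gives a rough CPU usage figure, and time.sleep stands in for waiting on I/O.

```python
import time


def measure(func, *args):
    """Run func(*args) and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # wall-clock timer
    cpu_start = time.process_time()    # CPU time used by this process
    result = func(*args)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return result, wall_seconds, cpu_seconds


def busy_sum(n):
    """CPU-bound work: sum of squares of 0..n-1 (hypothetical workload)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def wait_then_sum(n):
    """Same work, but preceded by a wait that uses almost no CPU."""
    time.sleep(0.2)  # stands in for waiting on I/O
    return busy_sum(n)


if __name__ == "__main__":
    for func in (busy_sum, wait_then_sum):
        result, wall, cpu = measure(func, 200_000)
        usage = 100 * cpu / wall if wall > 0 else 0.0
        print(f"{func.__name__}: wall={wall:.3f}s cpu={cpu:.3f}s usage~{usage:.0f}%")
```

On a typical run, busy_sum should show usage close to 100%, while wait_then_sum should show a much lower figure because most of its wall-clock time is spent waiting. Exact numbers depend on the machine and its load.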
Steps
- Determine which part of the program you want to profile: the entire application, a specific function, or a section of code
- Use a profiling tool to run the program while collecting performance data
- Examine the collected data to identify performance bottlenecks: functions that consume a lot of CPU time, have high instruction counts, or cause frequent cache misses
- Make changes to the code to improve performance: rewriting inefficient algorithms, optimising loops, reducing function calls, managing memory better
- Re-profile to confirm that the change actually helped, then iterate
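
The steps above can be sketched as one small profile, optimise, re-profile loop. This is a minimal illustration, not a prescribed workflow: it uses Python's built-in cProfile and pstats modules (not one of the tools listed in these notes, chosen only because they ship with Python), and slow_sum_of_squares, fast_sum_of_squares and profile_call are hypothetical names invented for the example.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberately inefficient: one extra function call per element."""
    def square(x):
        return x * x

    total = 0
    for i in range(n):
        total += square(i)
    return total


def fast_sum_of_squares(n):
    """Optimised version: closed form for 0^2 + 1^2 + ... + (n-1)^2."""
    return (n - 1) * n * (2 * n - 1) // 6


def profile_call(func, *args):
    """Profile one call and return (result, text report of the top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)   # step: run while collecting data
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    n = 100_000
    # Profile the original code and examine the report for hot functions.
    slow_result, slow_report = profile_call(slow_sum_of_squares, n)
    print(slow_report)
    # After optimising, re-profile and check the result is unchanged.
    fast_result, fast_report = profile_call(fast_sum_of_squares, n)
    print(fast_report)
    assert slow_result == fast_result
```

In the first report, the inner square function should appear with one call per element, which is the hint that the call overhead dominates. The second report should show only a handful of calls. Checking that both versions return the same result guards against an optimisation that changes behaviour.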
Tools
- gprof: GNU profiler for Unix-like systems; the program must be compiled with profiling instrumentation (e.g. gcc -pg)
- Visual Studio Profiler: integrated into Microsoft Visual Studio
- perf: Linux tool for profiling CPU performance
- Intel VTune Profiler: Intel's profiler for hotspot, threading, and microarchitecture analysis
TODO: DETAIL USAGE OF gprof, perf and VSProfiler