IPPL (Independent Parallel Particle Layer)
In some applications, you may want to use profiling tools for debugging and testing. Since IPPL uses Kokkos as a backend, you can use the Kokkos Tools profiling libraries, which plug into the profiling hooks built into Kokkos.
This guide explains how to use Kokkos' profiling tools, using the MemoryEvents tool as an example.
MemoryEvents tracks a timeline of allocation and deallocation events in Kokkos Memory Spaces. For each event it records the time, pointer, size, memory space name, and allocation name. This is particularly useful when debugging, because it shows which allocations account for the memory in use.
Additionally, the tool provides a timeline of memory usage for each individual Kokkos Memory Space.
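The recorded events can be post-processed with a short script. The following is a minimal sketch, not part of the tool itself: it assumes each event is one whitespace-separated line of the form time, pointer, size, memory space, operation, name, that header lines start with '#', and that allocations are marked with the operation "Allocate". The function name summarize_mem_events is hypothetical; check the layout against a file produced by your own run before relying on it.

```python
from collections import defaultdict


def summarize_mem_events(lines):
    """Return {memory_space: total bytes allocated} from mem_events lines.

    Assumed line layout (verify against your own output):
        time  pointer  size  memory-space  operation  allocation-name
    """
    allocated = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split(None, 5)  # the allocation name may contain spaces
        if len(fields) < 6:
            continue  # not an allocation record in the assumed layout
        _time, _ptr, size, space, op, _name = fields
        if op != "Allocate":
            continue  # ignore deallocations and anything else
        try:
            nbytes = int(size)
        except ValueError:
            continue  # size column is not an integer; skip the line
        allocated[space] += nbytes
    return dict(allocated)
```

To use it on a real file, open the .mem_events file and pass the file object directly, since the function accepts any iterable of lines.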
The tool is located at: https://github.com/kokkos/kokkos-tools/tree/develop/profiling/memory-events
First, clone the Kokkos Tools repository, which contains a variety of profiling tools:
    git clone https://github.com/kokkos/kokkos-tools.git
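Navigate into the repository and build the tools using CMake. A typical out-of-source build looks like this (adjust the options to your system):
    cd kokkos-tools
    cmake -S . -B build
    cmake --build build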
Before running your application, set the KOKKOS_TOOLS_LIBS environment variable to the path of the kp_memory_events.so tool:
    export KOKKOS_TOOLS_LIBS={PATH_TO_TOOL_DIRECTORY}/kp_memory_events.so
Replace {PATH_TO_TOOL_DIRECTORY} with the actual path where the tool is located.
Execute your application normally, with the same command line you would use without the tool. The MemoryEvents tool will automatically collect data during execution.
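If you prefer to drive profiled runs from a script, the export-then-run steps can be wrapped as follows. This is a minimal sketch under stated assumptions: the helper name run_with_kokkos_tool is hypothetical, and the tool path and application command are whatever applies on your system. It sets KOKKOS_TOOLS_LIBS only for the child process, so the calling environment stays unchanged.

```python
import os
import subprocess


def run_with_kokkos_tool(command, tool_library, workdir=None):
    """Run `command` (a list of strings) with KOKKOS_TOOLS_LIBS set.

    `tool_library` is the full path to a Kokkos Tools shared library,
    for example the kp_memory_events.so built earlier.
    """
    env = dict(os.environ)                   # copy, so the parent environment is untouched
    env["KOKKOS_TOOLS_LIBS"] = tool_library  # read by Kokkos when it initializes
    # The tool writes its output files into the working directory of the run.
    return subprocess.run(command, env=env, cwd=workdir, check=False)
```

For example, run_with_kokkos_tool(["./MyApp"], "/path/to/kp_memory_events.so") runs a placeholder application MyApp under the tool and returns a CompletedProcess whose returncode can be checked.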
The MemoryEvents tool will generate the following files:
HOSTNAME-PROCESSID.mem_events: Lists memory events.
HOSTNAME-PROCESSID-MEMSPACE.memspace_usage: Provides a utilization timeline for each active memory space.
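When several processes run in the same directory, each HOSTNAME-PROCESSID pair gets its own set of files, so it can help to group them before inspecting them. The sketch below relies only on the file naming scheme listed above; the function name group_tool_outputs is hypothetical, and it assumes memory space names contain no '-' character.

```python
import os


def group_tool_outputs(directory):
    """Group MemoryEvents output files by their 'HOSTNAME-PROCESSID' prefix.

    Returns {prefix: {"events": path or None, "usage": {memspace: path}}}.
    """
    runs = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name.endswith(".mem_events"):
            key = name[: -len(".mem_events")]
            runs.setdefault(key, {"events": None, "usage": {}})["events"] = path
        elif name.endswith(".memspace_usage"):
            stem = name[: -len(".memspace_usage")]
            # Split at the LAST '-' because hostnames may themselves contain '-'.
            key, _, space = stem.rpartition("-")
            runs.setdefault(key, {"events": None, "usage": {}})["usage"][space] = path
    return runs
```

Files with any other extension are ignored, so the function can be pointed at a working directory that also contains application output.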
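On a SLURM system, profiling works the same way when the job is submitted with sbatch: export the Kokkos tool in the batch script before launching the application.
For a job submitted with sbatch -n 2 that runs the LandauDamping application:
sbatch -n 2 requests 2 tasks (MPI ranks); the number of nodes is set separately with -N.
The Kokkos tool is exported and applied to the LandauDamping application.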
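The SLURM setup described above can be sketched as a small generator for the batch script. This is an illustration under stated assumptions rather than the script used by the IPPL authors: the function name build_sbatch_script is hypothetical, the application command is passed in unchanged (any LandauDamping arguments are up to you), and site-specific directives such as partition or time limit are left out.

```python
import shlex


def build_sbatch_script(tool_library, command, ntasks=2):
    """Return the text of a SLURM batch script that profiles `command`.

    tool_library: full path to the Kokkos tool, e.g. kp_memory_events.so
    command:      application command line as a single string
    ntasks:       number of tasks (MPI ranks), equivalent to `sbatch -n`
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --ntasks={ntasks}",
        # Export the tool so that every rank launched by srun loads it.
        f"export KOKKOS_TOOLS_LIBS={shlex.quote(tool_library)}",
        f"srun {command}",
    ]
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file and submitting it with sbatch gives a job equivalent to the one described above; add further #SBATCH lines as your cluster requires.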
This guide provides the basic steps for integrating Kokkos profiling tools into your IPPL-based projects. You can adjust the commands as needed depending on your specific application and environment.
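Consider a program that allocates memory in three Kokkos memory spaces: Cuda, CudaUVM, and CudaHostPinned.
Running it with the MemoryEvents tool produces the following output files: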
HOSTNAME-PROCESSID.mem_events
HOSTNAME-PROCESSID-Cuda.memspace_usage
HOSTNAME-PROCESSID-CudaUVM.memspace_usage
HOSTNAME-PROCESSID-CudaHostPinned.memspace_usage
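"""Sum the bytes sent by all ranks, as reported in an mpiP output file (*.mpiP)."""
import os
import re
import sys


def sum_send_bytes_from_file(file_path):
    total_bytes_sent = 0.0
    # Matches per-rank "Send" rows; the column layout inferred from the
    # pattern is: Name Site Rank Count Max Mean Min Sum.
    # Rows whose rank is not a number (e.g. aggregate rows) do not match.
    pattern = re.compile(
        r'^\s*Send\s+\d+\s+(\d+)\s+\d+\s+[\d.eE+-]+\s+[\d.eE+-]+\s+[\d.eE+-]+\s+([\d.eE+-]+)\s*$'
    )
    with open(file_path, 'r') as f:
        for line in f:
            match = pattern.match(line)
            if match:
                _rank, sum_bytes = match.groups()
                total_bytes_sent += float(sum_bytes)
    return total_bytes_sent


if __name__ == '__main__':
    if len(sys.argv) != 2:
        sys.exit(f"Usage: {sys.argv[0]} <folder containing the .mpiP file>")
    folder_path = sys.argv[1]
    if os.path.isdir(folder_path):
        total_bytes = None
        for filename in os.listdir(folder_path):
            if filename.endswith('.mpiP'):
                file_path = os.path.join(folder_path, filename)
                try:
                    total_bytes = sum_send_bytes_from_file(file_path)
                except Exception as e:
                    print(f"Error processing {file_path}: {e}")
                break  # Only one .mpiP file per run is assumed
        if total_bytes is not None:
            print("Total bytes =", total_bytes)
        else:
            print("No .mpiP file could be processed.")
    else:
        print("Path given does not exist or is not a directory!")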