Workloads & Experiments
This page describes the workloads available in OCEAN and how to run experiments with them.
Note
This guide assumes you have completed Getting Started and Configuration & Setup. Your CXL server should be running and VMs should be connected.
Overview
OCEAN includes workloads across multiple computational domains:
GROMACS - Molecular dynamics for scientific computing
TIGON - Distributed database with TPC-C benchmark
OSU Benchmarks - MPI point-to-point and collective communication tests
GAPBS - Graph algorithm benchmarks
How Workloads Use CXL Memory
All workloads access CXL memory transparently through the MPI shim library:
Application
↓
MPI Shim (LD_PRELOAD)
↓
/dev/dax0.0
↓
QEMU VM
↓
CXL Server
↓
Memory Pools
The shim intercepts memory allocation calls (malloc, calloc, etc.) and redirects
them to CXL memory without requiring application code changes.
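Because the shim is injected with LD_PRELOAD and backed by a DAX device, a quick check of both paths before launching can save a failed run. The following is a minimal sketch; `cxl_preflight` is a hypothetical helper name, not part of OCEAN, and it only assumes that the DAX device appears as a character device node.

```shell
# Hypothetical helper (not shipped with OCEAN): check that the shim library
# and the DAX device look usable before a workload is launched.
# Usage: cxl_preflight /root/libmpi_cxl_shim.so /dev/dax0.0
cxl_preflight() {
    pf_shim=$1
    pf_dax=$2
    if [ ! -r "$pf_shim" ]; then
        echo "shim library not readable: $pf_shim" >&2
        return 1
    fi
    # A devdax device node is a character device.
    if [ ! -c "$pf_dax" ]; then
        echo "not a character device: $pf_dax" >&2
        return 1
    fi
    echo "ok: $pf_shim, $pf_dax"
}
```

A non-zero return status means one of the two paths needs attention before LD_PRELOAD is set.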
Common Setup
Environment Variables
Set these variables before running any workload:
export CXL_DAX_PATH="/dev/dax0.0"
export CXL_DAX_RESET=1
export CXL_SHIM_TRACE=1 # Optional: enable tracing
export CXL_SHIM_VERBOSE=1 # Optional: verbose output
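| Variable | Purpose |
|---|---|
| LD_PRELOAD | Path to shim library (set per workload) |
| CXL_DAX_PATH | CXL device path; the shim has a default (this page uses /dev/dax0.0) |
| CXL_DAX_RESET | Reset allocation counter (set to 1, as above) |
| CXL_SHIM_TRACE | Enable detailed tracing (1) |
| CXL_SHIM_VERBOSE | Enable verbose output (1) |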
Hostfile Configuration
For multi-host experiments, create a hostfile with node IPs:
# hostfile
192.168.100.10
192.168.100.11
With slot specification (optional):
192.168.100.10 slots=4
192.168.100.11 slots=4
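When choosing a rank count for `mpirun -np`, it can help to total the slots a hostfile provides. This is a rough sketch; `hostfile_slots` is a hypothetical helper, and it assumes that a host listed without `slots=N` contributes one slot, which may not match the default of your MPI implementation.

```shell
# Hypothetical helper: sum the slots declared in a hostfile.
# ASSUMPTION: a host line without "slots=N" counts as one slot.
# Usage: hostfile_slots ./hostfile
hostfile_slots() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }      # skip blank lines and comments
        {
            n = 1
            for (i = 2; i <= NF; i++)
                if ($i ~ /^slots=[0-9]+$/) { split($i, kv, "="); n = kv[2] }
            total += n
        }
        END { print total + 0 }
    ' "$1"
}
```

For example, `mpirun -np "$(hostfile_slots ./hostfile)" --hostfile ./hostfile ...` starts one rank per declared slot.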
GROMACS
What: Molecular dynamics simulation for biomolecular research.
Why: Tests CXL memory with scientific computing workloads and multi-host MPI execution.
Building
cd workloads/gromacs
./build.sh
Distribute the shim library to all nodes:
scp libmpi_cxl_shim.so root@192.168.100.10:/root
scp libmpi_cxl_shim.so root@192.168.100.11:/root
Running
cd workloads/gromacs
mpirun --allow-run-as-root \
-x CXL_SHIM_TRACE \
-x CXL_DAX_PATH \
-x LD_PRELOAD=$PWD/libmpi_cxl_shim.so \
--hostfile ./hostfile \
./gromacs-2025.3/build/bin/gmx_mpi mdrun \
-s benchMEM.tpr \
-nsteps 10000 \
-ntomp 1
Key Parameters
-s benchMEM.tpr - Simulation input file
-nsteps 10000 - Number of simulation steps
-ntomp 1 - OpenMP threads per MPI rank
Expected Output
Starting mdrun on 2 ranks
...
Performance: 123.4 ns/day, 0.194 hours/ns
...
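To compare runs, the ns/day figure can be pulled out of the mdrun output. The sketch below assumes the value is the second whitespace-separated field on the line that starts with `Performance:`, as in the sample above; `gmx_ns_per_day` is a hypothetical helper name.

```shell
# Hypothetical helper: print the ns/day value from mdrun output.
# ASSUMPTION: the value is the second field on the "Performance:" line.
# Reads the named files, or standard input when no file is given.
gmx_ns_per_day() {
    awk '
        $1 == "Performance:" { value = $2 }
        END { if (value != "") print value }
    ' "$@"
}
```

mdrun also writes this line to its log file (md.log by default), so `gmx_ns_per_day md.log` is one way to use it after a run.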
TIGON
What: Distributed database with io_uring integration and TPC-C benchmark.
Why: Tests database workloads with CXL memory and transaction processing.
Setup
cd workloads/tigon
./scripts/setup.sh HOST
./emulation/image/make_vm_img.sh
Start VMs with CXL support:
sudo ./emulation/start_vms.sh --using-old-img --cxl 0 5 2 0 1
Note
--using-old-img reuses existing VM images. --cxl parameters configure
CXL device settings (device ID, memory size, host count, etc.).
Configure VMs:
./scripts/setup.sh VMS 2
./scripts/run.sh COMPILE_SYNC 2
Running TPC-C
./scripts/run_tpcc_dax.sh TwoPLPasha 2 3 mixed 10 15 1 0 1 \
Clock OnDemand 200000000 1 WriteThrough \
None 15 5 GROUP_WAL 20000 0 0
Key Parameters
TwoPLPasha - Concurrency control protocol
2 - Number of hosts
3 - Warehouses per host
mixed - Workload type (read/write mix)
Expected Output
Throughput: 1234 txn/s
Latency (mean): 12.3 ms
Latency (p99): 45.6 ms
Abort rate: 2.1%
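If the report is printed as `Label: value` lines like the sample above, a single metric can be extracted for later comparison. This is only a sketch under that assumption: the real output of the run script may be laid out differently, and `tigon_metric` is a hypothetical helper name.

```shell
# Hypothetical helper: print the value that follows "<label>:" in a report.
# ASSUMPTION: the report uses "Label: value" lines as in the sample output.
# Usage: some_command | tigon_metric "Latency (p99)"
tigon_metric() {
    awk -F': *' -v key="$1" '$1 == key { print $2 }'
}
```

The label must match the text before the colon exactly, including any parenthesised qualifier.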
OSU Benchmarks
What: MPI communication performance benchmarks (latency, bandwidth, collectives).
Why: Tests MPI operations with CXL-backed memory and validates multi-host communication.
Installation
# Inside VM
wget https://mvapich.cse.ohio-state.edu/download/mvapich/osu-micro-benchmarks-7.4.tar.gz
tar -xzf osu-micro-benchmarks-7.4.tar.gz
cd osu-micro-benchmarks-7.4
./configure CC=mpicc
make
Running a Benchmark
export CXL_DAX_PATH="/dev/dax0.0"
export CXL_DAX_RESET=1
LD_PRELOAD=/root/libmpi_cxl_shim.so \
mpirun --allow-run-as-root \
-np 2 \
-hostfile hostfile \
-x CXL_DAX_PATH \
-x CXL_DAX_RESET \
-x LD_PRELOAD \
~/osu-micro-benchmarks-7.4/c/mpi/collective/blocking/osu_allgather
Available Benchmarks
Collective Operations:
osu_allgather, osu_allreduce, osu_alltoall, osu_barrier, osu_bcast, osu_reduce
Point-to-Point:
osu_latency, osu_bw, osu_bibw
Expected Output
# OSU MPI Allgather Test
# Size Avg Latency(us)
1 12.34
2 13.45
4 14.56
...
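The tabular OSU output is easy to convert to CSV for plotting or for diffing runs with and without the shim. The sketch below assumes the two-column layout shown above (message size, then average latency); `osu_to_csv` and the CSV column names are hypothetical choices.

```shell
# Hypothetical helper: convert two-column OSU output to CSV.
# ASSUMPTION: data rows are "<size> <avg latency>"; "#" lines are headers.
# Reads the named files, or standard input when no file is given.
osu_to_csv() {
    awk '
        BEGIN { print "size_bytes,avg_latency_us" }
        /^#/ || NF < 2 { next }                 # skip headers and short lines
        $1 ~ /^[0-9]+$/ { print $1 "," $2 }     # keep rows that start with a size
    ' "$@"
}
```

Benchmarks that print extra columns would need the field selection adjusted.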
GAPBS
What: Graph algorithm benchmarks (BFS, PageRank, betweenness centrality).
Why: Tests irregular memory access patterns and graph analytics workloads.
Building
cd workloads/gapbs
make
Running
Breadth-First Search:
LD_PRELOAD=/root/libmpi_cxl_shim.so ./bfs -g 20 -n 16
PageRank:
LD_PRELOAD=/root/libmpi_cxl_shim.so ./pr -g 20 -n 16 -i 20
Key Parameters
-g 20 - Graph scale (2^20 vertices)
-n 16 - Number of trials
-i 20 - Iterations (PageRank only)
Available Algorithms
bfs - Breadth-First Search
pr - PageRank
bc - Betweenness Centrality
cc - Connected Components
sssp - Single-Source Shortest Paths
Expected Output
Trial 1 Time: 1.234 seconds
Trial 2 Time: 1.235 seconds
Average Time: 1.234 seconds
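To recompute the mean independently, or to average over a subset of a log, the per-trial times can be summed with awk. This is a minimal sketch; `gapbs_mean_time` is a hypothetical helper that assumes each trial line starts with `Trial` and carries its time in the field right after `Time:`.

```shell
# Hypothetical helper: count trials and print the mean trial time.
# ASSUMPTION: trial lines start with "Trial" and the time follows "Time:".
# Reads the named files, or standard input when no file is given.
gapbs_mean_time() {
    awk '
        $1 == "Trial" {
            for (i = 2; i < NF; i++)
                if ($i == "Time:") { sum += $(i + 1); n++ }
        }
        END { if (n > 0) printf "%d trials, mean %.3f s\n", n, sum / n }
    ' "$@"
}
```

Lines such as `Average Time:` are ignored because they do not start with `Trial`.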
Monitoring & Debugging
During Execution
Check CXL device:
ls -l /dev/dax0.0
Monitor resources:
free -h
numactl --hardware
Server logs:
The CXL server outputs connection and memory access statistics to stdout.
Enable Detailed Logging
export SPDLOG_LEVEL=debug
export CXL_SHIM_TRACE=1
export CXL_SHIM_VERBOSE=1
Common Issues
| Issue | Solution |
|---|---|
| Workload fails to start | Verify server running, |
| MPI communication errors | Check hostfile IPs, verify VMs can ping each other |
| Memory allocation failures | Set |
| Poor performance | Check VM resources, network connectivity, topology configuration |
Debugging Commands
Verify environment:
mpirun ... env | grep CXL
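The environment check can be turned into a pass/fail test by listing which expected variables are absent. The sketch below reads `env`-style output on standard input; `missing_cxl_vars` is a hypothetical helper, and the set of variables it treats as required (CXL_DAX_PATH, CXL_DAX_RESET, LD_PRELOAD) is an assumption drawn from the setup section of this page.

```shell
# Hypothetical helper: print the expected variables that are missing from
# env-style input ("NAME=value" lines) on standard input.
# ASSUMPTION: CXL_DAX_PATH, CXL_DAX_RESET and LD_PRELOAD are the required set.
missing_cxl_vars() {
    awk -F= '
        { seen[$1] = 1 }
        END {
            n = split("CXL_DAX_PATH CXL_DAX_RESET LD_PRELOAD", required, " ")
            for (i = 1; i <= n; i++)
                if (!(required[i] in seen)) print required[i]
        }
    '
}
```

Pipe the unfiltered output into it (`mpirun ... env | missing_cxl_vars`), since `grep CXL` would hide LD_PRELOAD. Output from several ranks is merged, so a variable set on only one rank still counts as present.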
Check MPI rank mapping:
mpirun --display-map ...
Test connectivity:
ping 192.168.100.11
Next Steps
After running experiments:
Review Output - Check application performance metrics
Analyze Traces - If CXL_SHIM_TRACE=1 was set, examine memory access patterns
Compare Results - Run with/without CXL to measure benefits
Tune Configuration - Adjust topology, memory allocation, or host count
Scale Up - Add more hosts to test larger configurations
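For the with/without-CXL comparison, a ratio of the two timings is usually enough to start with. This is a small sketch; `speedup` is a hypothetical helper that takes the baseline time and the CXL time in the same unit, so a result above 1.00x means the CXL run was faster.

```shell
# Hypothetical helper: ratio of a baseline time to a CXL-backed time.
# Usage: speedup <baseline_time> <cxl_time>   (same unit for both)
speedup() {
    awk -v base="$1" -v cxl="$2" 'BEGIN {
        if (cxl + 0 <= 0) {
            print "invalid CXL time: " cxl > "/dev/stderr"
            exit 1
        }
        printf "%.2fx\n", base / cxl
    }'
}
```

For throughput metrics such as txn/s, where larger is better, swap the two arguments.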
For detailed analysis:
Server logs contain CXL fabric statistics
Application output shows workload-specific metrics
System tools (perf, numactl) provide additional insights