Artifacts in Dr. Liu's Group
We open-source and maintain all our tools. Our principle is not only to publish high-quality papers, but also to release useful tools to the community for better code optimization and to influence hardware vendors toward better performance monitoring unit (PMU) support. Most of our tools have passed artifact evaluation at conferences that offer it.
Highlights: Influence the community from the tool side.
-
The OpenMP Tools API (OMPT) has been adopted into the OpenMP 5.0 standard.
-
The patch for fast debug register address replacement has been upstreamed into the Linux kernel.
Witch: A Lightweight Profiler to Identify Program Inefficiencies.
-
[Summary] Witch is the first tool that integrates PMUs and debug registers to pinpoint software inefficiencies (e.g., useless/redundant operations, false sharing, and others). Witch has very low overhead (<3%) in both runtime and memory. Moreover, Witch is a framework that can be easily extended for other analyses.
-
[Source Code] https://github.com/WitchTools
-
[Related Papers] ASPLOS'18 (Nominated for CACM Research Highlights), PPoPP'18 (Best Paper Award)
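The idea of pairing PMU samples with debug registers can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Witch's implementation: the function name `find_dead_stores`, the trace format, and the sampling policy are hypothetical. It assumes a sampled store arms a watchpoint on its address, and that the store is reported as dead if the next access to that address is another store. The limit of four watchpoints mirrors the four hardware debug registers on x86.

```python
def find_dead_stores(trace, period=3, max_watchpoints=4):
    """Simulate sampling-based dead-store detection on an access trace.

    trace: list of (op, address) tuples, where op is "load" or "store".
    period: hypothetical sampling period (every Nth store is sampled).
    max_watchpoints: number of addresses that can be watched at once.
    Returns a list of (dead_store_index, overwriting_store_index) pairs.
    """
    watched = {}       # address -> trace index of the sampled store
    dead = []
    stores_seen = 0

    for index, (op, address) in enumerate(trace):
        if address in watched:
            # Simulated watchpoint trap: first access after the sampled store.
            sampled_index = watched.pop(address)
            if op == "store":
                # Overwritten before being read, so the sampled store was dead.
                dead.append((sampled_index, index))

        if op == "store":
            stores_seen += 1
            # Simulated PMU sample: arm a watchpoint if one is free.
            if stores_seen % period == 0 and len(watched) < max_watchpoints:
                watched[address] = index

    return dead


trace = [
    ("store", 0x10),
    ("store", 0x20),
    ("store", 0x30),  # third store: sampled and watched
    ("store", 0x30),  # overwrites 0x30 with no load in between
    ("load", 0x10),
    ("store", 0x40),
    ("store", 0x50),  # sixth store: sampled and watched
    ("load", 0x50),   # value is read, so that store was useful
    ("store", 0x50),
]
print(find_dead_stores(trace))
```

Because only sampled stores are watched, most accesses run without any monitoring cost in this model, which is the intuition behind a low-overhead design.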
CCTLib: A Fine-grained Profiler that Performs Exhaustive Program Analysis.
-
[Summary] CCTLib is a framework that supports fine-grained profiling. CCTLib monitors each instruction instance, extracts its operator and operands, and associates it with the full calling context. Moreover, if the instruction instance is a memory access, CCTLib can associate it with the data structure allocated on the heap or in the static section. As a framework, CCTLib can be easily extended to support powerful client tools, such as RedSpy, LoadSpy, DeadSpy, RVN, DataPlacer, and so on.
-
[Source Code] https://github.com/cctlib/cctlib
-
[Related Papers] ASPLOS'17 (Best Paper Candidate, ASPLOS Highlight), ISMM'16, PACT'15, CGO'14
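Associating an instruction instance with its full calling context can be pictured as a calling context tree, in which the same function reached through different call paths gets distinct nodes. The sketch below is a minimal illustration of that data structure, not CCTLib's code: the class names `CCTNode` and `ContextTracker` and the function names in the example are hypothetical.

```python
class CCTNode:
    """One node of a calling context tree: a function reached via a unique call path."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}  # callee name -> CCTNode
        self.count = 0      # instruction instances attributed to this context

    def child(self, name):
        # Reuse the node if this call path was seen before, otherwise create it.
        if name not in self.children:
            self.children[name] = CCTNode(name, parent=self)
        return self.children[name]

    def path(self):
        # Walk parent links to recover the full calling context.
        frames = []
        node = self
        while node is not None:
            frames.append(node.name)
            node = node.parent
        return " -> ".join(reversed(frames))


class ContextTracker:
    """Tracks the current calling context as functions are entered and left."""

    def __init__(self):
        self.root = CCTNode("<root>")
        self.current = self.root

    def enter(self, function):
        self.current = self.current.child(function)

    def leave(self):
        # Never move above the root.
        if self.current.parent is not None:
            self.current = self.current.parent

    def record(self):
        # Attribute one instruction instance to the current context.
        self.current.count += 1
        return self.current


tracker = ContextTracker()
tracker.enter("main")
tracker.enter("solve")
tracker.enter("memcpy")
first = tracker.record()
tracker.leave()
tracker.leave()
tracker.enter("io")
tracker.enter("memcpy")
second = tracker.record()

print(first.path())
print(second.path())
```

The two `memcpy` records land on different nodes, so a client tool built on such a tree can tell which caller is responsible for a given inefficiency.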
HPCToolkit-datacentric: A Lightweight Data-centric Profiler that Identifies Memory-related Bottlenecks.
-
[Summary] HPCToolkit-datacentric leverages hardware PMUs to analyze bottlenecks in the memory subsystem. Unlike existing tools, HPCToolkit-datacentric analyzes various memory bottlenecks with extremely low overhead (~5%). This software framework is based on the open-source project HPCToolkit (www.hpctoolkit.org), which supports MPI+threads programming models. Based on HPCToolkit-datacentric, we have developed a set of memory profilers: ScaAnalyzer to study memory scaling issues, StructSlim and ArrayTool to guide data layout optimization, SMTAnalyzer and ProfDP to guide the usage of new hardware features (SMT and NVRAM), and CCProf to distinguish conflict and capacity cache misses.
-
[Source Code] https://github.com/HPCToolkit/hpctoolkit/tree/hpctoolkit-datacentric
HPCToolkit-datacentric currently lives in a branch of HPCToolkit and will be integrated into the trunk soon.
-
[Related Papers] TPDS'18, ICS'18, CGO'18, IPDPS'17, CGO'16, HPDC'16, SC'15 (Best Paper Award), PACT'14, PPoPP'14, SC'13
OMPT: OpenMP Tools API to Support Performance Analysis.
-
[Summary] We proposed OMPT as the standard OpenMP Tools API to support performance analysis. OMPT has been accepted into the OpenMP 5.0 standard.
-
[Source Code] We implemented the first version of OMPT in gcc. Currently, OMPT is available in the icc, LLVM, and IBM compilers. We have extended HPCToolkit based on OMPT. The extension is available in the HPCToolkit trunk.
-
[Related Papers] ICS'13, IWOMP'13
CUDAAdvisor: A GPU Profiler to Identify Performance Issues in CUDA Code Bases.
-
[Summary] CUDAAdvisor is a fine-grained profiler based on LLVM. CUDAAdvisor instruments both CPU and GPU code to construct the data flow between the two. With the data flow, one can optimize data layout and GPU cache bypassing.
-
[Source Code] https://github.com/sderek/CUDAAdvisor
-
[Related Papers] CGO'18
NUMA-Caffe: A NUMA-aware Caffe Deep Learning Framework.
-
[Summary] NUMA-Caffe is a version of Caffe (a popular deep learning framework) that is highly optimized for modern NUMA architectures. NUMA-Caffe adopts several novel ideas to minimize remote accesses for better performance without accuracy loss.
-
[Source Code] https://github.com/proywm/numa-caffe
-
[Related Papers] TACO'18