Terry Jones

0000-0003-2187-9707

Publications

Linux OS Jitter Measurements at Large Node Counts using a BlueGene/L
Purple L1 Milestone Review Panel - MPI
Design and Implementation of a Scalable Membership Service for Supercomputer Resiliency-Aware Runtime
Analyzing the Interplay of Failures and Workload on a Leadership-Class Supercomputer
A uGNI-Based Asynchronous Message-driven Runtime System for Cray Supercomputers with Gemini Interconnect
Time Distribution Alternatives for the Smart Grid Workshop Report
A Clock Synchronization Strategy for Minimizing Clock Variance at Runtime in High-End Computing Environments
MVAPICH-Aptus: Scalable High-Performance Multi-Transport MPI over InfiniBand
System-Level Support for Composition of Applications
UNITY: Unified Memory and File Space
Advanced Electrical Power System Sensors Workshop Report
Quantifying Scheduling Challenges for Exascale System Software
Digital Object Identifiers For OLCF
Reducing Connection Memory Requirements of MPI for InfiniBand Clusters: A Message Coalescing Approach
Mapping Dense LU Factorization on Multicore Supercomputer Nodes
Filtering log data: Finding the Needles in the Haystack
Linux Kernel Co-Scheduling For Bulk Synchronous Parallel Applications
Time Synchronization in the Electric Power System
Optimizing Fine-grained Communication in a Biomolecular Simulation Application on Cray XK6
Accurate Fault Prediction of BlueGene/P RAS Logs Via Geometric Reduction
HPC-Colony: Services and Interfaces for Very Large Systems
HPC System Call Usage Trends
Providing Runtime Clock Synchronization With Minimal Node-to-Node Time Deviation on XT4s and XT5s
TALC: A Simple C Language Extension For Improved Performance and Code Maintainability
MPI PERUSE: An MPI Extension for Revealing Unexposed Implementation Information
Scalable infrastructure to support supercomputer resiliency-aware applications and load balancing
An Alternative Timing and Synchronization Approach for Situational Awareness and Predictive Analytics
Evaluating the effectiveness of program data features for guiding memory management
IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems
Understanding failures through the lifetime of a top-level supercomputer
An evaluation of the state of time synchronization on leadership class supercomputers
Flexible and Effective Object Tiering for Heterogeneous Memory Systems
High Performance Computing
Optimizing I/O forwarding techniques for extreme-scale event tracing
Recent Advances in Precision Clock Synchronization Protocols for Power Grid Control Systems
Online Application Guidance for Heterogeneous Memory Systems
Large-Scale Distributed Deep Learning: A Study of Mechanisms and Trade-Offs with PyTorch
Clock synchronization in high‐end computing environments: a strategy for minimizing clock variance at runtime
Linux kernel co-scheduling and bulk synchronous parallelism
Understanding Soft Error Sensitivity of Deep Learning Models and Frameworks through Checkpoint Alteration
Towards a Model to Estimate the Reliability of Large-Scale Hybrid Supercomputers
3-Dimensional root cause diagnosis via co-analysis
Analyzing a Five-Year Failure Record of a Leadership-Class Supercomputer
Portable application guidance for complex memory systems
Autonomy Loops for Monitoring, Operational Data Analytics, Feedback, and Response in HPC Operations
Enabling event tracing at leadership-class scale through I/O forwarding middleware
Performance Potential of Mixed Data Management Modes for Heterogeneous Memory Systems
The ECP SICM project: Managing complex memory hierarchies for exascale applications
Improving the Scalability of Parallel Jobs by adding Parallel Awareness to the Operating System
Impacts of Operating Systems on the Scalability of Applications
Performance of an MPI-IO implementation using third-party transfer
Sizing and Tuning GPFS
Performance of the IBM General Parallel File System
An MPI-IO Interface to HPSS
Parallelizing Monte Carlo with PMC

Co-workers & collaborators

Sameer Shende
John Mellor-Crummey
Michael A. Lang
Greg Eisenhauer
Michael Brim
Geoffroy Vallee
