In this guest feature from the HPC Advisory Council, authors Gilad Shainer, Tong Liu, Pak Lui, and Richard Graham explore the advantages of offloading MPI collective communications from the CPU to ...