Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM
Deepak Narayanan, Mohammad Shoeybi, Jared Casper, Patrick LeGresley, Mostofa Patwary, Vijay Korthikanti, Dmitri Vainbrand, Prethvi Kashinkunti, Julie Bernauer, Bryan Catanzaro, Amar Phanishayee, Matei Zaharia.
SC 2021 (Best Student Paper).
Solving Large-Scale Granular Resource Allocation Problems Efficiently with POP
Deepak Narayanan, Fiodar Kazhamiaka, Firas Abuzaid, Peter Kraft, Akshay Agrawal, Srikanth Kandula, Stephen Boyd, Matei Zaharia.
SOSP 2021.
Memory-Efficient Pipeline-Parallel DNN Training
Deepak Narayanan, Amar Phanishayee, Kaiyu Shi, Xie Chen, Matei Zaharia.
ICML 2021.
Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads
Deepak Narayanan, Keshav Santhanam, Fiodar Kazhamiaka, Amar Phanishayee, Matei Zaharia.
OSDI 2020.
Analysis and Exploitation of Dynamic Pricing in the Public Cloud for ML Training
Deepak Narayanan, Keshav Santhanam, Fiodar Kazhamiaka, Amar Phanishayee, Matei Zaharia.
DISPA 2020.
Offload Annotations: Bringing Heterogeneous Computing to Existing Libraries and Workloads
Gina Yuan, Shoumik Palkar, Deepak Narayanan, Matei Zaharia.
USENIX ATC 2020.
Willump: A Statistically-Aware End-to-end Optimizer for Machine Learning Inference
Peter Kraft, Daniel Kang, Deepak Narayanan, Shoumik Palkar, Peter Bailis, Matei Zaharia.
MLSys 2020.
MLPerf Training Benchmark
Peter Mattson, Christine Cheng, Cody Coleman, Greg Diamos, Paulius Micikevicius, David Patterson, Hanlin Tang, Gu-Yeon Wei, Peter Bailis, Victor Bittorf, David Brooks, Dehao Chen, Debojyoti Dutta, Udit Gupta, Kim Hazelwood, Andrew Hock, Xinyuan Huang, Bill Jia, Daniel Kang, David Kanter, Naveen Kumar, Jeffery Liao, Deepak Narayanan, Tayo Oguntebi, Gennady Pekhimenko, Lillian Pentecost, Vijay Janapa Reddi, Taylor Robie, Tom St. John, Carole-Jean Wu, Lingjie Xu, Cliff Young, Matei Zaharia.
MLSys 2020.
PipeDream: Generalized Pipeline Parallelism for DNN Training
Deepak Narayanan, Aaron Harlap, Amar Phanishayee, Vivek Seshadri, Nikhil R. Devanur, Gregory R. Ganger, Phillip B. Gibbons, Matei Zaharia.
SOSP 2019.
Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark
Cody Coleman*, Daniel Kang*, Deepak Narayanan*, Luigi Nardi, Tian Zhao, Jian Zhang, Peter Bailis, Kunle Olukotun, Chris Ré, Matei Zaharia.
SIGOPS Operating Systems Review, July 2019.
Accelerating Deep Learning Workloads through Efficient Multi-Model Execution
Deepak Narayanan, Keshav Santhanam, Amar Phanishayee, Matei Zaharia.
NeurIPS Systems for ML Workshop 2018.
Analysis of the Time-To-Accuracy Metric and Entries in the DAWNBench Deep Learning Benchmark
Cody Coleman*, Daniel Kang*, Deepak Narayanan*, Luigi Nardi, Tian Zhao, Jian Zhang, Peter Bailis, Kunle Olukotun, Chris Ré, Matei Zaharia.
NeurIPS Systems for ML Workshop 2018.
Evaluating End-to-End Optimization for Data Analytics Applications in Weld
Shoumik Palkar, James Thomas, Deepak Narayanan, Pratiksha Thaker, Parimarjan Negi, Rahul Palamuttam, Anil Shanbhag, Holger Pirk, Malte Schwarzkopf, Saman Amarasinghe, Samuel Madden, Matei Zaharia.
VLDB 2018.
DAWNBench: An End-to-End Deep Learning Benchmark and Competition
Cody Coleman, Deepak Narayanan, Daniel Kang, Tian Zhao, Jian Zhang, Luigi Nardi, Peter Bailis, Kunle Olukotun, Christopher Ré, Matei Zaharia.
NeurIPS Systems for ML Workshop 2017.
MacroBase: Prioritizing Attention in Fast Data
Peter Bailis, Edward Gan, Samuel Madden, Deepak Narayanan, Kexin Rong, Sahaana Suri.
SIGMOD 2017.
Weld: A Common Runtime for High Performance Data Analytics
Shoumik Palkar, James J. Thomas, Anil Shanbhag, Deepak Narayanan, Holger Pirk, Malte Schwarzkopf, Saman Amarasinghe, Matei Zaharia.
CIDR 2017.