http://spcl.inf.ethz.ch/Publications/.pdf/ivanov-sten.pdf
... Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan ...
Near-Optimal Sparse Allreduce for Distributed Deep Learning
Near-Optimal Sparse Allreduce for Distributed Deep Learning, http://spcl.inf.ethz.ch/Publications/.pdf/shigang-oktopk.pdf
https://people.inf.ethz.ch/moswald/publications/resources/Takmaz-et...
https://people.inf.ethz.ch/moswald/publications/resources/Takmaz-et-al-3DV-2021-nrd.pdf, Unsupervised Monocular Depth Reconstruction of Non-Rigid Scenes. Ayça Takmaz, Danda Pani Paudel, Thomas P...
http://spcl.inf.ethz.ch/Publications/.pdf/2023_iclr_gptq.pdf
... , James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al ...
QUIK: Towards End-to-end 4-Bit Inference on Generative Large Langua...
... ., Lin, Z., Gimelshein, N., Antiga, L., et al. Pytorch: An imperative style, high-performance deep ...
Chimera: Efficiently Training Large-Scale Neural Networks with Bidi...
Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines, http://spcl.inf.ethz.ch/Publications/.pdf/shigang-chimera.pdf
http://spcl.inf.ethz.ch/Publications/.pdf/deinsum.pdf
http://spcl.inf.ethz.ch/Publications/.pdf/deinsum.pdf, Deinsum: Practically I/O Optimal Multilinear Algebra. Alexandros Nikolaos Ziogas, Department of Computer Science, ETH Zurich, Zurich, Switzer...
http://spcl.inf.ethz.ch/Publications/.pdf/smoe.pdf
http://spcl.inf.ethz.ch/Publications/.pdf/smoe.pdf, Spatial Mixture-of-Experts. Nikoli Dryden, ETH Zürich, ndryden@ethz.ch; Torsten Hoefler, ETH Zürich, htor@inf.ethz.ch. Abstract: Many data have an under...
http://spcl.inf.ethz.ch/Publications/.pdf/wagmaSGD.pdf
http://spcl.inf.ethz.ch/Publications/.pdf/wagmaSGD.pdf, Breaking (Global) Barriers in Parallel Stochastic Optimization with Wait-Avoiding Group Averaging. Shigang Li, Department of Computer Scie...
http://spcl.inf.ethz.ch/Publications/.pdf/yunquiang-trans_pruning.pdf
... Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. Pytorch: An ...