
[PDS/HPDA Seminar] 21/1/2022 from 10:00 to 11:30 at 4A312 - Igor Albuquerque Silva (reading group) and Vitor Georgen (reading group)

During the PDS/HPDA Seminar of 21/1/2022 from 10:00 to 11:30, Igor Albuquerque Silva and Vitor Georgen will each present a reading group talk.

Video conference: https://webconf.imt.fr/frontend/fra-vcg-byn-fxd

Location: 4A312

# Reading group: Warehouse-Scale Video Acceleration: Co-design and Deployment in the Wild (ASPLOS’21)

Presented by Igor Albuquerque Silva on 21/1/2022 at 10:00. Attending this presentation is mandatory for master's students.

Paper: https://www.gwern.net/docs/cs/2021-ranganathan.pdf

Full post: https://www.inf.telecom-sudparis.eu/pds/seminars_cpt/warehouse-scale/

## Abstract
Video sharing (e.g., YouTube, Vimeo, Facebook, TikTok) accounts for the majority of internet traffic, and video processing is also foundational to several other key workloads (video conferencing, virtual/augmented reality, cloud gaming, video in Internet-of-Things devices, etc.). The importance of these workloads motivates larger video processing infrastructures and – with the slowing of Moore’s law – specialized hardware accelerators to deliver more computing at higher efficiencies. This paper describes the design and deployment, at scale, of a new accelerator targeted at warehouse-scale video transcoding. We present our hardware design including a new accelerator building block – the video coding unit (VCU) – and discuss key design trade-offs for balanced systems at data center scale and co-designing accelerators with large-scale distributed software systems. We evaluate these accelerators “in the wild” serving live data center jobs, demonstrating 20-33x improved efficiency over our prior well-tuned non-accelerated baseline. Our design also enables effective adaptation to changing bottlenecks and improved failure management, and new workload capabilities not otherwise possible with prior systems. To the best of our knowledge, this is the first work to discuss video acceleration at scale in large warehouse-scale environments.

# Reading group: Horovod: fast and easy distributed deep learning in TensorFlow (MLSys’19)

Presented by Vitor Georgen on 21/1/2022 at 10:30. Attending this presentation is mandatory for master's students.

Paper: https://arxiv.org/pdf/1802.05799.pdf

Full post: https://www.inf.telecom-sudparis.eu/pds/seminars_cpt/horovod/

## Abstract
Training modern deep learning models requires large amounts of computation, often provided by GPUs. Scaling computation from one GPU to many can enable much faster training and research progress but entails two complications. First, the training library must support inter-GPU communication. Depending on the particular methods employed, this communication may entail anywhere from negligible to significant overhead. Second, the user must modify his or her training code to take advantage of inter-GPU communication. Depending on the training library’s API, the modification required may be either significant or minimal. Existing methods for enabling multi-GPU training under the TensorFlow library entail non-negligible communication overhead and require users to heavily modify their model-building code, leading many researchers to avoid the whole mess and stick with slower single-GPU training. In this paper we introduce Horovod, an open source library that improves on both obstructions to scaling: it employs efficient inter-GPU communication via ring reduction and requires only a few lines of modification to user code, enabling faster, easier distributed training in TensorFlow. Horovod is available under the Apache 2.0 license at https://github.com/uber/horovod.
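As background for the talk, the ring reduction the abstract mentions can be sketched in plain Python. This is a minimal simulation of ring all-reduce, not Horovod's actual API (the function and variable names are illustrative), and it assumes the vector length is divisible by the number of workers:

```python
# Minimal pure-Python simulation of ring all-reduce, the communication
# pattern Horovod uses for gradient averaging. Names are illustrative,
# not Horovod's API. Assumes vector length divisible by worker count.

def ring_allreduce(worker_vectors):
    """Return each worker's buffer after a simulated ring all-reduce.

    After the reduce-scatter and all-gather phases, every worker
    holds the element-wise sum of all input vectors.
    """
    n = len(worker_vectors)
    chunk = len(worker_vectors[0]) // n  # one chunk per worker
    bufs = [list(v) for v in worker_vectors]

    # Phase 1: reduce-scatter (n - 1 steps). In step s, worker i sends
    # chunk (i - s) % n to its ring neighbour, which accumulates it.
    for s in range(n - 1):
        sends = []
        for i in range(n):
            c = (i - s) % n
            sends.append((c, bufs[i][c * chunk:(c + 1) * chunk]))
        for i, (c, data) in enumerate(sends):
            dst = (i + 1) % n
            for k in range(chunk):
                bufs[dst][c * chunk + k] += data[k]

    # Phase 2: all-gather (n - 1 steps). Each fully reduced chunk now
    # circulates around the ring so every worker gets a complete copy.
    for s in range(n - 1):
        sends = []
        for i in range(n):
            c = (i + 1 - s) % n
            sends.append((c, bufs[i][c * chunk:(c + 1) * chunk]))
        for i, (c, data) in enumerate(sends):
            dst = (i + 1) % n
            bufs[dst][c * chunk:(c + 1) * chunk] = data
    return bufs
```

For example, with three workers holding [1, 2, 3], [4, 5, 6] and [7, 8, 9], every worker ends up with [12, 15, 18] after 2(n - 1) = 4 steps. Because each worker only ever exchanges one chunk with its ring neighbour per step, the per-worker bandwidth stays constant as the cluster grows, which is what makes the pattern attractive for multi-GPU training.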