[PDS/HPDA Seminar] 20/1/2023 from 10:00 to 12:00 at 4A312 – Etienne Devaux (reading group), Hatem Mnaouer (reading group), Aleksandar Maksimovic (reading group) and Mohamed Iyed El Baouab (reading group)

During the PDS/HPDA Seminar of 20/1/2023 from 10:00 to 12:00, Etienne Devaux, Hatem Mnaouer, Aleksandar Maksimovic and Mohamed Iyed El Baouab will each present a reading group talk.

Visio: https://webconf.imt.fr/frontend/fra-vcg-byn-fxd

Location: 4A312

# Reading group: rFaaS: RDMA-Enabled FaaS Platform for Serverless High-Performance Computing (IPDPS’23)

Presented by Etienne Devaux on 20/1/2023 at 10:00. Attending this presentation is mandatory for the master students.

Paper: http://www.unixer.de/publications/img/2021_copik_rfaas_preprint.pdf

Full post: https://www.inf.telecom-sudparis.eu/pds/seminars_cpt/reading-group-22/

## Abstract
The rigid MPI programming model and batch scheduling dominate high-performance computing. While clouds brought new levels of elasticity into the world of computing, supercomputers still suffer from low resource utilization rates. To enhance supercomputing clusters with the benefits of serverless computing, a modern cloud programming paradigm for pay-as-you-go execution of stateless functions, we present rFaaS, the first RDMA-aware Function-as-a-Service (FaaS) platform. With hot invocations and decentralized function placement, we overcome the major performance limitations of FaaS systems and provide low-latency remote invocations in multi-tenant environments. We evaluate the new serverless system through a series of microbenchmarks and show that remote functions execute with negligible performance overheads. We demonstrate how serverless computing can bring elastic resource management into MPI-based high-performance applications. Overall, our results show that MPI applications can benefit from modern cloud programming paradigms to guarantee high performance at lower resource costs.

# Reading group: Interactive Molecular Dynamics: Scaling up to Large Systems (ICCS’13)

Presented by Hatem Mnaouer on 20/1/2023 at 10:30. Attending this presentation is mandatory for the master students.

Paper: https://reader.elsevier.com/reader/sd/pii/S1877050913003086?token=F32781BA24C5411C919E4FE98C9717F4755B8FAD6CFC52E444D9CBF0FC2E97A6348C843B8BED167BFF8F9E89C183FB7D&originRegion=eu-west-1&originCreation=20230118094643

Full post: https://www.inf.telecom-sudparis.eu/pds/seminars_cpt/reading-group-23/

## Abstract
Combining molecular dynamics simulations with user interaction would have various applications in both education and research. By enabling interactivity, the scientist will be able to visualize the experiment in real time and drive the simulation to a desired state more easily. However, interacting with systems of interesting size requires significant computing resources due to the complexity of the simulation. In this paper, we propose an approach to combine a classical parallel molecular dynamics simulator, Gromacs, with a 3D virtual reality environment, allowing the user to steer the simulation through external forces applied with a haptic device to a selection of atoms. We specifically focused on minimizing the intrusion in the simulator code, on efficient parallel data extraction and filtering to transfer only the necessary data to the visualization environment, and on a controlled asynchronism between various components to improve interactivity. We managed to steer molecular systems of 1.7 M atoms at about 25 Hz using 384 CPU cores. This framework allowed us to study a concrete scientific problem by testing one hypothesis of the transport of an iron complex from the exterior of the bacteria to the periplasmic space through the FepA membrane protein.

# Reading group: Corey: An Operating System for Many Cores (OSDI’08)

Presented by Aleksandar Maksimovic on 20/1/2023 at 11:00. Attending this presentation is mandatory for the master students.

Paper: https://www.usenix.org/legacy/event/osdi08/tech/full_papers/boyd-wickizer/boyd_wickizer.pdf

Full post: https://www.inf.telecom-sudparis.eu/pds/seminars_cpt/reading-group-25/

## Abstract
Multiprocessor application performance can be limited by the operating system when the application uses the operating system frequently and the operating system services use data structures shared and modified by multiple processing cores. If the application does not need the sharing, then the operating system will become an unnecessary bottleneck to the application’s performance. This paper argues that applications should control sharing: the kernel should arrange each data structure so that only a single processor need update it, unless directed otherwise by the application. Guided by this design principle, this paper proposes three operating system abstractions (address ranges, kernel cores, and shares) that allow applications to control inter-core sharing and to take advantage of the likely abundance of cores by dedicating cores to specific operating system functions. Measurements of microbenchmarks on the Corey prototype operating system, which embodies the new abstractions, show how control over sharing can improve performance. Application benchmarks, using MapReduce and a Web server, show that the improvements can be significant for overall performance: MapReduce on Corey performs 25% faster than on Linux when using 16 cores. Hardware event counters confirm that these improvements are due to avoiding operations that are expensive on multicore machines.

# Reading group: Photons: Lambdas on a diet (SOCC’20)

Presented by Mohamed Iyed El Baouab on 20/1/2023 at 11:30. Attending this presentation is mandatory for the master students.

Paper: https://rodrigo-bruno.github.io/papers/vdukic-socc20.pdf

Full post: https://www.inf.telecom-sudparis.eu/pds/seminars_cpt/reading-group-28/

## Abstract
Serverless computing allows users to create short, stateless functions and invoke hundreds of them concurrently to tackle massively parallel workloads. We observe that even though most of the footprint of a serverless function is fixed across its invocations (language runtime, libraries, and other application state), today’s serverless platforms do not exploit this redundancy. Such an inefficiency has cascading negative impacts: longer startup times, lower throughput, higher latency, and higher cost. To mitigate these problems, we have built Photons, a framework leveraging workload parallelism to co-locate multiple instances of the same function within the same runtime. Concurrent invocations can then share the runtime and application state transparently, without compromising execution safety. Photons reduce a function’s memory consumption by 25% to 98% per invocation, with no performance degradation compared to today’s serverless platforms. We also show that our approach can reduce the overall memory utilization by 30%, and the total number of cold starts by 52%.