
[PDS/HPDA Seminar] 11/2/2022 from 10:00 to 11:30 at 4A312 - Jana Ismail (reading group) and Ewa Turska (reading group)

During the PDS/HPDA Seminar of 11/2/2022, from 10:00 to 11:30, Jana Ismail and Ewa Turska will each present a reading group talk.

Videoconference : https://webconf.imt.fr/frontend/fra-vcg-byn-fxd

Location : 4A312

# Reading group : Rabia : Simplifying State-Machine Replication Through Randomization (SOSP’21)

Presented by Jana Ismail on 11/2/2022 at 10:00. Attending this presentation is mandatory for the master students.

Paper : https://dl.acm.org/doi/10.1145/3477132.3483582

Full post : https://www.inf.telecom-sudparis.eu/pds/seminars_cpt/rabia/

Abstract
We introduce Rabia, a simple and high performance framework for implementing state-machine replication (SMR) within a datacenter. The main innovation of Rabia is in using randomization to simplify the design. Rabia provides the following two features: (i) It does not need any fail-over protocol and supports trivial auxiliary protocols like log compaction, snapshotting, and reconfiguration, components that are often considered the most challenging when developing SMR systems; and (ii) It provides high performance, up to 1.5x higher throughput than the closest competitor (i.e., EPaxos) in a favorable setup (same availability zone with three replicas) and is comparable with a larger number of replicas or when deployed in multiple availability zones.

# Reading group : VHT : Vertical Hoeffding Tree (IEEE Big Data’16)

Presented by Ewa Turska on 11/2/2022 at 10:30. Attending this presentation is mandatory for the master students.

Paper : https://arxiv.org/pdf/1607.08325.pdf

Full post : https://www.inf.telecom-sudparis.eu/pds/seminars_cpt/vht-vertical-hoeffding-tree-ieee-big-data16/

Abstract
IoT Big Data requires new machine learning methods able to scale to large size of data arriving at high speed. Decision trees are popular machine learning models since they are very effective, yet easy to interpret and visualize. In the literature, we can find distributed algorithms for learning decision trees, and also streaming algorithms, but not algorithms that combine both features. In this paper we present the Vertical Hoeffding Tree (VHT), the first distributed streaming algorithm for learning decision trees. It features a novel way of distributing decision trees via vertical parallelism. The algorithm is implemented on top of Apache SAMOA, a platform for mining distributed data streams, and thus able to run on real-world clusters. We run several experiments to study the accuracy and throughput performance of our new VHT algorithm, as well as its ability to scale while keeping its superior performance with respect to non-distributed decision trees.
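
To make the idea of vertical parallelism concrete, here is a minimal single-process sketch, not the paper's implementation and not Apache SAMOA code. It assumes categorical attributes and information gain as the split criterion. The names `AttributeWorker`, `should_split`, and `hoeffding_bound` are hypothetical. Each worker keeps statistics for its own slice of the attributes only, reports its two best local candidates, and a coordinator applies the standard Hoeffding bound, epsilon = sqrt(R^2 ln(1/delta) / (2n)), to decide whether the leading attribute is reliably better than the runner-up.

```python
import math
from collections import defaultdict


def hoeffding_bound(value_range, delta, n):
    """Standard Hoeffding bound: sqrt(R^2 * ln(1/delta) / (2n))."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


class AttributeWorker:
    """Hypothetical worker holding statistics for one vertical slice of attributes."""

    def __init__(self, attribute_ids):
        self.attribute_ids = list(attribute_ids)
        # stats[attribute][attribute value][class label] -> count
        self.stats = {
            a: defaultdict(lambda: defaultdict(int)) for a in self.attribute_ids
        }

    def update(self, instance, label):
        """Update counts for the attributes this worker owns, ignoring the rest."""
        for a in self.attribute_ids:
            self.stats[a][instance[a]][label] += 1

    def local_best(self, class_counts):
        """Return up to two (information gain, attribute) pairs, best first."""
        n = sum(class_counts.values())
        base = entropy(list(class_counts.values()))
        gains = []
        for a in self.attribute_ids:
            remainder = 0.0
            for label_counts in self.stats[a].values():
                m = sum(label_counts.values())
                remainder += m / n * entropy(list(label_counts.values()))
            gains.append((base - remainder, a))
        return sorted(gains, reverse=True)[:2]


def should_split(workers, class_counts, delta=1e-7):
    """Coordinator step: merge local candidates and apply the Hoeffding test.

    Assumes the workers own at least two attributes in total.
    Returns (split?, index of the best attribute).
    """
    n = sum(class_counts.values())
    candidates = sorted(
        (c for w in workers for c in w.local_best(class_counts)), reverse=True
    )
    best, second = candidates[0], candidates[1]
    # Information gain lies in [0, log2(number of classes)].
    value_range = math.log2(max(len(class_counts), 2))
    eps = hoeffding_bound(value_range, delta, n)
    return best[0] - second[0] > eps, best[1]


if __name__ == "__main__":
    # Toy stream: attribute 0 equals the label, the other attributes carry no signal.
    workers = [AttributeWorker([0, 1]), AttributeWorker([2, 3])]
    class_counts = defaultdict(int)
    for i in range(2000):
        label = i % 2
        instance = [label, (i // 2) % 2, (i // 4) % 2, 0]
        for w in workers:
            w.update(instance, label)
        class_counts[label] += 1
    print(should_split(workers, class_counts))
```

In this sketch the coordinator never sees per-attribute counts, only each worker's top candidates, which is the property that lets the attribute statistics be spread across machines. A real distributed deployment would also have to handle message passing and instances that arrive while a split decision is pending, which this example leaves out.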