{"id":1079,"date":"2018-09-18T15:55:00","date_gmt":"2018-09-18T13:55:00","guid":{"rendered":"https:\/\/samovar2022.int-evry.fr\/index.php\/2018\/09\/18\/optimisation-dynamique-des-ressources-des-reseaux-cellulaires-base-sur-les-techniques-danalyses-de-donnees-et-des-techniques-dapprentissage-automatiques\/"},"modified":"2020-09-04T18:45:45","modified_gmt":"2020-09-04T16:45:45","slug":"optimisation-dynamique-des-ressources-des-reseaux-cellulaires-base-sur-les-techniques-danalyses-de-donnees-et-des-techniques-dapprentissage-automatiques","status":"publish","type":"post","link":"https:\/\/samovar.telecom-sudparis.eu\/index.php\/2018\/09\/18\/optimisation-dynamique-des-ressources-des-reseaux-cellulaires-base-sur-les-techniques-danalyses-de-donnees-et-des-techniques-dapprentissage-automatiques\/","title":{"rendered":"Optimisation dynamique des ressources des r\u00e9seaux cellulaires bas\u00e9 sur les techniques d&rsquo;analyses de donn\u00e9es et des techniques d&rsquo;apprentissage automatiques"},"content":{"rendered":"<p>AVIS DE SOUTENANCE de Monsieur Seif Eddine HAMMAMI<br \/>\nAutoris\u00e9 \u00e0 pr\u00e9senter ses travaux en vue de l\u2019obtention du Doctorat de T\u00e9l\u00e9com SudParis avec l&rsquo;Universit\u00e9 Paris 6 en : Informatique &#038; R\u00e9seaux<br \/>\n\u00abOptimisation dynamique des ressources des r\u00e9seaux cellulaires bas\u00e9 sur les techniques d&rsquo;analyses de donn\u00e9es et des techniques d&rsquo;apprentissage automatiques\u00bb<br \/>\n<strong><br \/>\nle 20 septembre 2018 \u00e0 14:00 &#8211; Salle Amphi 34-Batiment 862<br \/>\nAdresse : CEA Saclay Nano-INNOV, 8 Avenue de la Vauve, 91120 Palaiseau <\/strong> <\/p>\n<p><strong>Membres du jury :<\/strong><\/p>\n<p>Directeur de th\u00e8se : Hossam AFIFI &#8211; Professeur HDR<\/p>\n<p><strong>Rapporteurs <\/strong> :<\/p>\n<table>\n<tbody>\n<tr class='row_even'>\n<td>Hac\u00e8ne FOUCHAL <\/td>\n<td> Professeur &#8211; Universit\u00e9 de Reims-Champagne-Ardenne<\/td>\n<\/tr>\n<tr 
class='row_odd'>\n<td>Mathieu BOUET <\/td>\n<td> Directeur d&rsquo;\u00e9tudes &#8211; Thal\u00e8s &#8211; France<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><strong>Examinateurs :<\/strong><\/p>\n<table>\n<tbody>\n<tr class='row_even'>\n<td>Houda LABIOD <\/td>\n<td>Professeure &#8211; T\u00e9l\u00e9com ParisTech<\/td>\n<\/tr>\n<tr class='row_odd'>\n<td>Yvon GOURHANT <\/td>\n<td> Ing\u00e9nieur de recherche &#8211; Orange Labs<\/td>\n<\/tr>\n<tr class='row_even'>\n<td>Hassine MOUNGLA <\/td>\n<td>Ma\u00eetre de conf\u00e9rences &#8211; Universit\u00e9 Paris Descartes<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><strong><br \/>\nR\u00e9sum\u00e9 :<\/strong><br \/>\nLes traces r\u00e9elles des r\u00e9seaux cellulaire est la cl\u00e9 de voute de ma th\u00e8se de doctorat. En effet, je propose dans cette th\u00e8se des nouvelles approches dans l\u2019\u00e9tude et l\u2019analyse des probl\u00e9matiques des r\u00e9seaux de t\u00e9l\u00e9communications en utilisant ces traces r\u00e9elles contrairement aux approches classiques bas\u00e9es sur des jeux de donn\u00e9es simul\u00e9s ou g\u00e9n\u00e9r\u00e9es par des processus al\u00e9atoires. Ces traces cellulaires sont pr\u00e9sentes sous la forme de jeux de donn\u00e9es de CDR (Call Detail Records ou statistiques d\u2019appels) repr\u00e9sent\u00e9s par des information horodat\u00e9es sur chaque interaction de l\u2019abonn\u00e9 avec l\u2019infrastructure des r\u00e9seaux mobile, quelques soient les appels re\u00e7us\/\u00e9mis, des SMS ou des sessions d\u2019internet. 
Vu leur richesse et le fait qu\u2019ils refl\u00e8tent des cas d\u2019usages r\u00e9els, les informations massives qui peuvent \u00eatre extraites et analys\u00e9es de ces jeux donn\u00e9s, ont \u00e9t\u00e9 exploit\u00e9s intensivement dans mes travaux de th\u00e8se pour d\u00e9velopper de nouveaux algorithmes qui ont pour but de changer litt\u00e9ralement les m\u00e9canismes de gestion et d\u2019optimisation dans le cadre de l\u2019usage des ressources r\u00e9seaux. Outre les informations temporelles, les CDRs contiennent aussi les informations g\u00e9ographiques qui projettent l\u2019emplacement instantan\u00e9 de l\u2019abonn\u00e9 durant ses interactions. En combinant les \u00e9chelles temporelles et g\u00e9ographiques, nous pouvons d\u00e9duire les dynamicit\u00e9s spatio-temporelle de l\u2019usage r\u00e9seaux de chaque abonn\u00e9e ainsi que les mod\u00e8les dynamiques de l\u2019utilisation de la bande passante sur les stations de bases.<br \/>\nLes jeux de donn\u00e9es des CDR sont g\u00e9n\u00e9ralement des donn\u00e9es brutes et qui n\u00e9cessitent des outils avanc\u00e9s d\u2019analyse de donn\u00e9es et d\u2019intelligence artificielle afin d\u2019extraire les informations les plus importantes. Dans ce contexte, on propose dans cette th\u00e8se une \u00e9tude structur\u00e9e pour analyser des traces r\u00e9elles de CDRs r\u00e9els comme les traces du \u00ab D4D challenge \u00bb contenant les donn\u00e9es du r\u00e9seau cellulaire d\u2019Orange S\u00e9n\u00e9gal et les traces du \u00ab Big Data challenge \u00bb fournis par l\u2019op\u00e9rateur Telecom Italia. Notre m\u00e9thode consiste, en premier lieu, \u00e0 regrouper intelligemment les s\u00e9ries temporelles journali\u00e8res de charge sur les stations de bases dans des classes pertinentes. 
Nous proposons pour \u00e7a d\u2019utiliser un algorithme modifi\u00e9 de K-means bas\u00e9 sur la distance DTW (Dynamic Time Warping) qui a \u00e9t\u00e9 montr\u00e9 plus performante que la distance euclidienne classique. Cet algorithme, nous a permis, de classer les s\u00e9ries temporelles de charge pour chaque station de base dans trois classes principales. Une premi\u00e8re classe pour les profils de \u00ab Pic de charge matinale \u00bb, une classe pour les profils de \u00ab Charge constante \u00bb et une derni\u00e8re classe pour les \u00ab Pic de charge nocturne \u00bb. Cette premi\u00e8re classification, nous permet de proposer notre algorithme de classification automatique et massive des profiles journali\u00e8res des stations de bases bas\u00e9 sur la machine d\u2019apprentissage SVM (Support Vector Machine). Cette classification automatique est importante pour les op\u00e9rateurs de r\u00e9seaux et peut leur servir \u00e0 adapter l\u2019allocation de ressource radio selon ces profiles.<br \/>\nAfin de garantir la continuit\u00e9 du service pour les abonn\u00e9es, il est important d\u2019estimer avec pr\u00e9cision la dynamicit\u00e9 de la bande passante sa migration instantan\u00e9e entre les diff\u00e9rents endroits dans le futur. Ceci revient \u00e0 \u00e9tudier les d\u00e9placements des abonn\u00e9es, qui refl\u00e8tent aussi un potentiel d\u00e9placement de demande de bande passante, entre les zones classifi\u00e9es pr\u00e9c\u00e9demment. On propose pour cet objectif, une nouvelle forme de matrice \u00ab Origine-Destination \u00bb bas\u00e9e sur les r\u00e9sultats de classification, qui nous permet d\u2019estimer les futurs<br \/>\ntaux de d\u00e9placement de la demande de bande passante entre les classes de zones. 
En d\u2019autres termes, elle projette la mobilit\u00e9 de bande passante durant la journ\u00e9e.<br \/>\nLe deuxi\u00e8me chapitre de cette th\u00e8se r\u00e9pond \u00e0 une question importante : Est-t-il possible d\u2019exploiter les traces de CDRs pour impl\u00e9menter des algorithmes capables de pr\u00e9dire avec pr\u00e9cision les futurs taux de charge sur chaque station de base ? Dans la continuit\u00e9 du premier chapitre, nous abordons cette probl\u00e9matique en proposant une \u00e9tude pour les caract\u00e9ristiques des s\u00e9ries temporelles de charge journali\u00e8re et en impl\u00e9mentant un mod\u00e8le de pr\u00e9diction bas\u00e9 sur l\u2019algorithme d\u2019apprentissage SVR (Support Vector Regression). Nous fournissons une comparaison des performances avec d\u2019autres algorithmes de pr\u00e9dictions connus qui montrent l\u2019efficacit\u00e9 de notre mod\u00e8le.<br \/>\nNous int\u00e9grons par la suite les mod\u00e8les que nous avons propos\u00e9 dans un outil flexible qui permet l\u2019optimisation dynamique des ressource r\u00e9seaux bas\u00e9 sur les traces r\u00e9elles. Nous \u00e9valuons notre solution en l\u2019appliquant sur une architecture bas\u00e9e sur un r\u00e9seau sans fil mesh propos\u00e9 dans le projet national LCI4D. l\u2019optimisation de ce r\u00e9seau est faite par un algorithme qui exploite les r\u00e9sultats des modules d\u2019analyse de donn\u00e9es. Une deuxi\u00e8me \u00e9valuation pour notre outil est propos\u00e9e et qui consiste \u00e0 l\u2019appliquer sur une topologie dynamique bas\u00e9 sur des cellules-drones (des drones embarquant des femto-cells). 
Nous proposons pour \u00e7a un algorithme d\u2019apprentissage renforc\u00e9 multi-agent qui exploite aussi les r\u00e9sultats des modules d\u2019analyse de donn\u00e9es pour optimiser dynamiquement et en temps r\u00e9el cette topologie.<br \/>\nDans la continuit\u00e9 du contexte d\u2019analyse des traces r\u00e9elles de CDRs, nous proposons dans un dernier chapitre, un deuxi\u00e8me outil qui sera capable de d\u00e9tecter proactivement les anomalies dans les r\u00e9seaux cellulaire qui peuvent se produire suite \u00e0 un pic de consommation brusque ou une chute due \u00e0 des probl\u00e8mes techniques. Cet outil est bas\u00e9 sur les algorithmes OCSVM (One-class SVM) et SVR qui permettent de distinguer en temps r\u00e9el les profile de charge anormale. L\u2019outil est test\u00e9 en utilisant les traces du \u00ab D4D challenge \u00bb et \u00ab Big challenge\u00bb et en le comparant \u00e0 d\u2019autre technique de d\u00e9tection d\u2019anomalies et les r\u00e9sultats montrent qu\u2019il est plus efficace. Nous validons aussi le mod\u00e8le pour analyser l\u2019impact des donn\u00e9es prolif\u00e9rantes issues des nouvelles applications comme celle de l\u2019e-sant\u00e9. Notre mod\u00e8le est capable de d\u00e9tecter les anomalies due \u00e0 l\u2019injection de ces nouvelles sources de donn\u00e9es et qui impactent \u00e9videment l\u2019usage normal des r\u00e9seaux cellulaire.<\/p>\n<p><strong>Abstract:<\/strong><br \/>\nMobile phone datasets are the keystone of my PhD, in which I propose new approaches to the study of networking problems using these real, dynamic data rather than conventional approaches based on simulations and random inputs. Most of these datasets consist of Call Detail Record (CDR) metadata, i.e. a time-stamped record of all interactions between the subscribers of a mobile operator and the network infrastructure during a given period. 
Given their large size and the fact that they are real-world datasets, the information extracted from them has been used intensively in my work to develop new algorithms that aim to transform infrastructure management mechanisms and optimize resource usage. In addition to temporal information, CDR metadata also contains geographic information about where subscribers use the network. Combining the temporal and geographical information helps to infer the spatio-temporal dynamics of subscribers\u2019 use of network resources as well as the dynamic load patterns of the base stations throughout the day.<br \/>\nThe issue with CDR metadata is that it is provided in a raw format, and the most relevant information is hidden within large-scale datasets. Extracting the relevant knowledge requires advanced tools, such as data mining techniques and machine learning algorithms. In this context, this thesis provides a data mining study of real-world CDR datasets, namely the D4D challenge dataset provided by Orange Senegal and the Big Data challenge dataset provided by Telecom Italia. Our analysis method consists in clustering the base stations\u2019 daily load time series into relevant classes. For this, we use a modified k-means clustering algorithm based on the dynamic time warping (DTW) distance, which was shown to perform better than the classical Euclidean distance. This clustering divides the base station load time series, extracted from the D4D challenge dataset, into three relevant classes. Each class corresponds to a specific base station load profile: a \u201cday-peak load\u201d profile, a \u201cconstant load\u201d profile and a \u201cnight-peak load\u201d profile. This first analysis phase makes it possible to tag each base station with its corresponding profile class. The profiled data are then used to train an automatic classifier based on a support vector machine (SVM). 
The classification algorithm allowed us to automatically infer the daily class of each base station time series contained in the large-scale dataset. This information is important for network operators, who can use it to design dynamic radio resource allocation algorithms that follow the instantaneous load fluctuations.<br \/>\nTo enhance the continuity of network services, it is important to estimate with high confidence how the bandwidth demand on a base station at a given time is redistributed among all the base stations in the following instants. We therefore exploit the classification of base station profiles to analyze the mobility of network bandwidth between areas. For this purpose, we use a novel form of the \u201corigin-destination\u201d (OD) matrix based on the classification. This classified OD matrix provides aggregate information about the mobility of the load. In other words, it projects the mobility of the bandwidth between areas.<br \/>\nThe second chapter of this thesis responds to the following question: is it possible to use CDR datasets to implement an algorithm able to predict the future network load with high accuracy? Continuing from the first chapter, we address this issue by analyzing the characteristics of the base station load time series and proposing a prediction model based on support vector regression (SVR). Our solution is compared to other prediction techniques, and the results show the high efficiency of the SVR-based prediction model.<br \/>\nWe combine the network classification, bandwidth mobility and load prediction algorithms into a global framework that provides dynamic network resource allocation techniques based on real data analysis. 
We evaluate the framework in the third chapter, where we optimize the planning of a wireless mesh network proposed in the LCI4D project. In this chapter, we propose a mixed-integer linear programming (MILP) algorithm that provides dynamic and fault-tolerant planning for a wireless mesh network, taking as input the cell load time series produced by the machine learning tools presented previously.<br \/>\nWe also validate our data analysis framework with an innovative network architecture based on drone-cells. We propose a dynamic solution for drone-cell networks that exploits real traces of demand profiles, output by the framework, and adapts the deployment of drone-cells to these demands in real time. In this part, we optimize the deployment using the machine learning paradigm instead of classical linear programming models. Our solution is based on a multi-agent reinforcement learning (MARL) approach.<br \/>\nBuilding on the CDR dataset analysis and the load prediction, we propose a second framework that proactively detects anomalous network load patterns that may occur during mass events or because of technical issues in the network. Our anomaly detection framework is based on the One-class SVM (OCSVM) and SVR algorithms. It is tested and validated with the D4D challenge and Telecom Italia CDR datasets. Comparison results show that our model outperforms other techniques. 
We use our framework to analyze the impact of the proliferous e-health data generated by the medical smart-phone applications.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>AVIS DE SOUTENANCE de Monsieur Seif Eddine HAMMAMI Autoris\u00e9 \u00e0 pr\u00e9senter ses travaux en vue de l\u2019obtention du Doctorat de T\u00e9l\u00e9com SudParis avec l&rsquo;Universit\u00e9 Paris 6 en : Informatique &#038; R\u00e9seaux \u00abOptimisation dynamique des ressources des r\u00e9seaux cellulaires bas\u00e9 sur les techniques d&rsquo;analyses de donn\u00e9es et des techniques d&rsquo;apprentissage automatiques\u00bb le 20 septembre 2018 \u00e0 [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":1078,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ocean_post_layout":"","ocean_both_sidebars_style":"","ocean_both_sidebars_content_width":0,"ocean_both_sidebars_sidebars_width":0,"ocean_sidebar":"","ocean_second_sidebar":"","ocean_disable_margins":"enable","ocean_add_body_class":"","ocean_shortcode_before_top_bar":"","ocean_shortcode_after_top_bar":"","ocean_shortcode_before_header":"","ocean_shortcode_after_header":"","ocean_has_shortcode":"","ocean_shortcode_after_title":"","ocean_shortcode_before_footer_widgets":"","ocean_shortcode_after_footer_widgets":"","ocean_shortcode_before_footer_bottom":"","ocean_shortcode_after_footer_bottom":"","ocean_display_top_bar":"default","ocean_display_header":"default","ocean_header_style":"","ocean_center_header_left_menu":"","ocean_custom_header_template":"","ocean_custom_logo":0,"ocean_custom_retina_logo":0,"ocean_custom_logo_max_width":0,"ocean_custom_logo_tablet_max_width":0,"ocean_custom_logo_mobile_max_width":0,"ocean_custom_logo_max_height":0,"ocean_custom_logo_tablet_max_height":0,"ocean_custom_logo_mobile_max_height":0,"ocean_header_custom_menu":"","ocean_menu_typo_font_family":"","ocean_menu_typo_font_subset":"","ocean_menu_typo_font_size":0,"ocean_menu_typo_fo
nt_size_tablet":0,"ocean_menu_typo_font_size_mobile":0,"ocean_menu_typo_font_size_unit":"px","ocean_menu_typo_font_weight":"","ocean_menu_typo_font_weight_tablet":"","ocean_menu_typo_font_weight_mobile":"","ocean_menu_typo_transform":"","ocean_menu_typo_transform_tablet":"","ocean_menu_typo_transform_mobile":"","ocean_menu_typo_line_height":0,"ocean_menu_typo_line_height_tablet":0,"ocean_menu_typo_line_height_mobile":0,"ocean_menu_typo_line_height_unit":"","ocean_menu_typo_spacing":0,"ocean_menu_typo_spacing_tablet":0,"ocean_menu_typo_spacing_mobile":0,"ocean_menu_typo_spacing_unit":"","ocean_menu_link_color":"","ocean_menu_link_color_hover":"","ocean_menu_link_color_active":"","ocean_menu_link_background":"","ocean_menu_link_hover_background":"","ocean_menu_link_active_background":"","ocean_menu_social_links_bg":"","ocean_menu_social_hover_links_bg":"","ocean_menu_social_links_color":"","ocean_menu_social_hover_links_color":"","ocean_disable_title":"default","ocean_disable_heading":"default","ocean_post_title":"","ocean_post_subheading":"","ocean_post_title_style":"","ocean_post_title_background_color":"","ocean_post_title_background":0,"ocean_post_title_bg_image_position":"","ocean_post_title_bg_image_attachment":"","ocean_post_title_bg_image_repeat":"","ocean_post_title_bg_image_size":"","ocean_post_title_height":0,"ocean_post_title_bg_overlay":0.5,"ocean_post_title_bg_overlay_color":"","ocean_disable_breadcrumbs":"default","ocean_breadcrumbs_color":"","ocean_breadcrumbs_separator_color":"","ocean_breadcrumbs_links_color":"","ocean_breadcrumbs_links_hover_color":"","ocean_display_footer_widgets":"default","ocean_display_footer_bottom":"default","ocean_custom_footer_template":"","ocean_post_oembed":"","ocean_post_self_hosted_media":"","ocean_post_video_embed":"","ocean_link_format":"","ocean_link_format_target":"self","ocean_quote_format":"","ocean_quote_format_link":"post","ocean_gallery_link_images":"on","ocean_gallery_id":[],"footnotes":""},"categories":[314],"
tags":[],"class_list":["post-1079","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-theses-2018-fr","entry","has-media"],"_links":{"self":[{"href":"https:\/\/samovar.telecom-sudparis.eu\/index.php\/wp-json\/wp\/v2\/posts\/1079","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/samovar.telecom-sudparis.eu\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/samovar.telecom-sudparis.eu\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/samovar.telecom-sudparis.eu\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/samovar.telecom-sudparis.eu\/index.php\/wp-json\/wp\/v2\/comments?post=1079"}],"version-history":[{"count":1,"href":"https:\/\/samovar.telecom-sudparis.eu\/index.php\/wp-json\/wp\/v2\/posts\/1079\/revisions"}],"predecessor-version":[{"id":1514,"href":"https:\/\/samovar.telecom-sudparis.eu\/index.php\/wp-json\/wp\/v2\/posts\/1079\/revisions\/1514"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/samovar.telecom-sudparis.eu\/index.php\/wp-json\/wp\/v2\/media\/1078"}],"wp:attachment":[{"href":"https:\/\/samovar.telecom-sudparis.eu\/index.php\/wp-json\/wp\/v2\/media?parent=1079"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/samovar.telecom-sudparis.eu\/index.php\/wp-json\/wp\/v2\/categories?post=1079"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/samovar.telecom-sudparis.eu\/index.php\/wp-json\/wp\/v2\/tags?post=1079"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}