class: center, middle, inverse, title-slide

# Dealing with the change of administrative divisions over time with
### Kim Antunez
antuki
antuki13

### uRos 2021 - Online event 24 November 2021

---

.pull-left[
## Who am I?
]
.pull-right[
<center><img src="img/avatar_antuki_v2_small.png" height="150px"></center>
]

- Kim Antunez
  * *Now*: civil servant at the French National Institute of Statistics (Insee)
  * *Before*: student at ENSAE
  * *Earlier still*: worked at the French Observatory of Territories (as a USER of spatial datasets)

- **R packages** for dealing with changes in administrative divisions over time, in the case of French territories
  * [`COGugaison`](https://antuki.github.io/COGugaison/)
  * [`CARTElette`](https://antuki.github.io/CARTElette/)

---

## Administrative divisions...

.left-column[
<img src="img/millefeuille2.png" height="60px">
<img src="img/millefeuille1.png" height="400px">
]
.right-column[
<center>The French territorial « mille-feuille »</center><br>
<center><img src="img/carte_millefeuille_v2.png"></center>
]

---

## ... change over time

<center><img src="img/map_fusions.png" height="500px"></center>

---

## ... change over time

The most annoying municipality in France!

---

## COGugaison

* **COG** = **O**fficial **G**eographical **C**ode (**OGC**)

An R package for manipulating French **spatial databases** produced at **different dates**.

<center><img src="img/COGugaison.png" width="100%"></center>

---

.pull-left[
### See modifications over the years
]
.pull-right[
<center><img src="img/trajectoires.png" height="100px"></center>
]

<svg viewBox="0 0 576 512" style="position:relative;display:inline-block;top:.1em;fill:red;height:1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M569.517 440.013C587.975 472.007 564.806 512 527.94 512H48.054c-36.937 0-59.999-40.055-41.577-71.987L246.423 23.985c18.467-32.009 64.72-31.951 83.154 0l239.94 416.028zM288 354c-25.405 0-46 20.595-46 46s20.595 46 46 46 46-20.595 46-46-20.595-46-46-46zm-43.673-165.346l7.418 136c.347 6.364 5.609 11.346 11.982 11.346h48.546c6.373 0 11.635-4.982 11.982-11.346l7.418-136c.375-6.874-5.098-12.654-11.982-12.654h-63.383c-6.884 0-12.356 5.78-11.981 12.654z"></path></svg> All functions and parameters are translated from French for this presentation.

```r
municipality_evolution_graph(code = "76108", year = 2014) # uses {visNetwork}
```

<center><img src="img/trajectoire_commune.png" height="100px"></center>

```r
evol <- municipalities_evolutions(begin_date = "01-01-2011", end_date = "01-01-2014")
```

```r
grep("(76095)|(76108)", evol$mergers, value = TRUE)
```

```
## 2012-01-01: Bois-Guillaume-Bihorel (76108) is a merger of Bihorel (76095), Bois-Guillaume (76108).
```

```r
grep("(76095)|(76108)", evol$divisions, value = TRUE)
```

```
## 2014-01-01: Bois-Guillaume (76108) divided into Bois-Guillaume (76108), Bihorel
## (76095).
```

---

.pull-left[
### Guess the year of a database (1)
]
.pull-right[
<center><img src="img/tableau_year.png" height="100px"></center>
]

```r
head(db, 2)
```

```
##    code      men    women
## 1 01001 389.6600 384.5949
## 2 01002 147.9835 117.3663
```

```r
OGC_guess(db$code) # guesses the year of the Official Geographical Code (OGC)
```

```
## [1] "2019"
```

```r
# builds a vector of municipality codes
codes <- c(db$code[-1], "75101", NA, "ZZZZZ", "98756")
merge_OGC(codes = codes, OGC = 2019)$not_in_db # OGC codes missing from the db
```

```
## 01001 97601 97602 97603 97604 97605 97606 97607 97608 97609 97610 97611 97612 97613 97614 97615 97616 97617
```

```r
merge_OGC(codes = codes, OGC = 2019)$not_in_OGC # codes not found in the OGC
```

```
## 75101 NA ZZZZZ 98756
```

---

.pull-left[
### Guess the year of a database (2)
]
.pull-right[
<center><img src="img/tableau_year.png" height="100px"></center>
]

*Author: Constance Lecomte (Observatoire des Territoires)*

```r
diagnostic1 <- diag_OGC(db)
```

```
[1] "# Synthesis"
[1] "# OGC 2019"
[1] "# ------------------------------"
[1] "# Detailed diagnostic"
[1] "# The database contains 34953 municipality codes."
```

---

.pull-left[
### Guess the year of a database (3)
]
.pull-right[
<center><img src="img/tableau_year.png" height="100px"></center>
]

```r
db_modified <- db %>%
  add_row(code = c("75101", NA, "01091", "98756", "ZZZZZ"))
diagnostic2 <- diag_OGC(db_modified, hypothesis_OGC = 2019)
```

```
[1] "# unidentified OGC"
[1] "# ------------------------------"
[1] "# Detailed diagnostic"
[1] "# The database contains 34953 municipality codes."
|                         | Number of obs.|
|:------------------------|--------------:|
|2019                     |          34952|
|2018                     |              1|
|municipality districts   |              1|
|unknown codes            |              1|
|missing codes            |              1|
|overseas municipalities  |              1|
|unique codes             |          34957|
```

```r
diagnostic2[which(diagnostic2$diag_ogc != 2019), ]
```

```
##        code              diag_ogc men women
## 34954 75101 municipality district  NA    NA
## 34955  <NA>          missing code  NA    NA
## 34956 01091                  2018  NA    NA
## 34957 98756 overseas municipality  NA    NA
## 34958 ZZZZZ          unknown code  NA    NA
```

---

.pull-left[
### Guess the year of a database (4)
]
.pull-right[
<center><img src="img/tableau_year.png" height="100px"></center>
]

An [online shiny interface](https://observatoire-des-territoires.shinyapps.io/diaCOG/) for `diag_OGC`:

<center><img src="img/diaCOG.png" width="100%"></center>

---

.pull-left[
### Change the year of a db (1)
]
.pull-right[
<center><img src="img/tableau_year2.png" height="100px"></center>
]

* **quantitative variable** [numeric]<br>
  <svg viewBox="0 0 512 512" style="position:relative;display:inline-block;top:.1em;fill:#562457;height:1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M512 199.652c0 23.625-20.65 43.826-44.8 43.826h-99.851c16.34 17.048 18.346 49.766-6.299 70.944 14.288 22.829 2.147 53.017-16.45 62.315C353.574 425.878 322.654 448 272 448c-2.746 0-13.276-.203-16-.195-61.971.168-76.894-31.065-123.731-38.315C120.596 407.683 112 397.599 112 385.786V214.261l.002-.001c.011-18.366 10.607-35.889 28.464-43.845 28.886-12.994 95.413-49.038 107.534-77.323 7.797-18.194 21.384-29.084 40-29.092 34.222-.014 57.752 35.098 44.119 66.908-3.583 8.359-8.312 16.67-14.153 24.918H467.2c23.45 0 44.8 20.543 44.8 43.826zM96 200v192c0 13.255-10.745 24-24 24H24c-13.255 0-24-10.745-24-24V200c0-13.255 10.745-24 24-24h48c13.255 0 24 10.745 24 24zM68 368c0-11.046-8.954-20-20-20s-20 8.954-20 20 8.954 20 20 20 20-8.954 20-20z"></path></svg> See `change_OGC_numeric`
  + *mergers* <svg viewBox="0 0 192 512" style="position:relative;display:inline-block;top:.1em;fill:#88398A;height:1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M0 384.662V127.338c0-17.818 21.543-26.741 34.142-14.142l128.662 128.662c7.81 7.81 7.81 20.474 0 28.284L34.142 398.804C21.543 411.404 0 402.48 0 384.662z"></path></svg> easy: sum the lines
  + *divisions* <svg viewBox="0 0 192 512" style="position:relative;display:inline-block;top:.1em;fill:#88398A;height:1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M0 384.662V127.338c0-17.818 21.543-26.741 34.142-14.142l128.662 128.662c7.81 7.81 7.81 20.474 0 28.284L34.142 398.804C21.543 411.404 0 402.48 0 384.662z"></path></svg> divide lines proportionally to population

```r
nrow(db)
```

```
## [1] 34953
```

```r
db_2021 <- db %>%
  # changes the year of a numeric variable (from 2019 to 2021)
  change_OGC_numeric(2019:2021)
```

```r
str(db_2021)
```

```
## 'data.frame':    34948 obs. of  3 variables:
##  $ code : chr  "01001" "01002" "01004" "01005" ...
##  $ men  : num  389.7 148 6818.6 850.4 55.5 ...
##  $ women: num  384.6 117.4 7227.6 813.3 50.5 ...
```

---

.pull-left[
### Change the year of a db (2)
]
.pull-right[
<center><img src="img/tableau_year2.png" height="100px"></center>
]

* **qualitative variable** [character]<br>
  <svg viewBox="0 0 512 512" style="position:relative;display:inline-block;top:.1em;fill:#562457;height:1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M512 199.652c0 23.625-20.65 43.826-44.8 43.826h-99.851c16.34 17.048 18.346 49.766-6.299 70.944 14.288 22.829 2.147 53.017-16.45 62.315C353.574 425.878 322.654 448 272 448c-2.746 0-13.276-.203-16-.195-61.971.168-76.894-31.065-123.731-38.315C120.596 407.683 112 397.599 112 385.786V214.261l.002-.001c.011-18.366 10.607-35.889 28.464-43.845 28.886-12.994 95.413-49.038 107.534-77.323 7.797-18.194 21.384-29.084 40-29.092 34.222-.014 57.752 35.098 44.119 66.908-3.583 8.359-8.312 16.67-14.153 24.918H467.2c23.45 0 44.8 20.543 44.8 43.826zM96 200v192c0 13.255-10.745 24-24 24H24c-13.255 0-24-10.745-24-24V200c0-13.255 10.745-24 24-24h48c13.255 0 24 10.745 24 24zM68 368c0-11.046-8.954-20-20-20s-20 8.954-20 20 8.954 20 20 20 20-8.954 20-20z"></path></svg> See `change_OGC_typo`
  + *divisions* <svg viewBox="0 0 192 512" style="position:relative;display:inline-block;top:.1em;fill:#88398A;height:1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M0 384.662V127.338c0-17.818 21.543-26.741 34.142-14.142l128.662 128.662c7.81 7.81 7.81 20.474 0 28.284L34.142 398.804C21.543 411.404 0 402.48 0 384.662z"></path></svg> easy: copy the lines
  + *mergers* <svg viewBox="0 0 192 512" style="position:relative;display:inline-block;top:.1em;fill:#88398A;height:1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M0 384.662V127.338c0-17.818 21.543-26.741 34.142-14.142l128.662 128.662c7.81 7.81 7.81 20.474 0 28.284L34.142 398.804C21.543 411.404 0 402.48 0 384.662z"></path></svg> several possible rules: assign the class that holds the largest share of the population, define an absorbing or absorbed class, etc.

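
The merger and division rules described on these two slides can be sketched in base R on toy data. This is only an illustration under stated assumptions, not how `COGugaison` is implemented: the codes (`A1`, `B1`, `C1`...), the correspondence table `mergers` and every object name below are made up for the example.

```r
# Toy 2019 table (hypothetical codes and values): population and a qualitative class
db_2019 <- data.frame(
  code     = c("A1", "A2", "A3", "B1"),
  pop      = c(100, 300, 50, 200),
  category = c("yes", "no", "yes", "no"),
  stringsAsFactors = FALSE
)

# Hypothetical correspondence table: A1, A2 and A3 merge into A2, B1 is unchanged
mergers <- data.frame(
  code_2019 = c("A1", "A2", "A3", "B1"),
  code_2021 = c("A2", "A2", "A2", "B1"),
  stringsAsFactors = FALSE
)

x <- merge(db_2019, mergers, by.x = "code", by.y = "code_2019")

# Numeric variable, merger: sum the rows that share the same 2021 code
pop_2021 <- aggregate(pop ~ code_2021, data = x, FUN = sum)

# Qualitative variable, merger, "max pop" rule:
# keep, for each 2021 code, the class that holds the most population
class_pop <- aggregate(pop ~ code_2021 + category, data = x, FUN = sum)
class_pop <- class_pop[order(class_pop$code_2021, -class_pop$pop), ]
class_2021 <- class_pop[!duplicated(class_pop$code_2021), c("code_2021", "category")]

db_2021_sketch <- merge(pop_2021, class_2021, by = "code_2021")

# Numeric variable, division: split a value proportionally to the
# population of the new units (hypothetical units C1 and C2)
value_before <- 90
pop_after    <- c(C1 = 200, C2 = 100)
value_after  <- value_before * pop_after / sum(pop_after)
```

In this toy case two of the three merged municipalities are in class "yes", but the merged unit is assigned "no" because that class holds more inhabitants (300 against 150). Choosing a different rule, such as an absorbing class, would give a different answer.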
Example:

.pull-left[
<center><img src="img/com53_2019.png" width="100%"></center>
]
.pull-right[
```r
db53 <- db %>%
  filter(code %in% c("53120", "53239", "53249", "53274")) %>%
  mutate(women_in_majority = ifelse(women > men, "yes", "no"))

db53_2021_maxpop <- db53 %>%
  change_OGC_typo(2019:2021, typos = "women_in_majority",
                  method = "method_max_pop")
```
]

---

.pull-left[
### Change the year of a db (3)
]
.pull-right[
<center><img src="img/tableau_year2.png" height="100px"></center>
]

.pull-left[
<center><img src="img/com53_2021_1.png" width="100%"></center>
<center><img src="img/com53_2021_3.png" width="100%"></center>
]
.pull-right[
<center><img src="img/com53_2021_4.png" width="100%"></center>
<center><img src="img/com53_2021_2.png" width="100%"></center>
]

---

.pull-left[
### Aggregate a database
]
.pull-right[
<center><img src="img/agreger.png" height="100px"></center>
]

```r
db_dep <- db_2021 %>%
  # aggregates the database (municipalities -> départements)
  aggregate_OGC(OGC = 2021, administrative_division = "DEP")
```

```r
str(db_dep)
```

```
## 'data.frame':    100 obs. of  4 variables:
##  $ DEP   : chr  "01" "02" "03" "04" ...
##  $ LIBGEO: chr  "Ain" "Aisne" "Allier" "Alpes-de-Haute-Provence" ...
##  $ men   : num  317820 260811 160815 80043 68993 ...
##  $ women : num  325644 273593 176879 84241 72343 ...
```

---

## CARTElette

A repository containing the **geographical layers** that match the administrative divisions of the French territories (mainland France and overseas) as of 1 January of each year, together with R functions to load them.

<center><img src="img/CARTElette.png" width="100%"></center>

---

.pull-left[
### Load a map layer (1)
]
.pull-right[
<center><img src="img/charger.png" height="100px"></center>
]

```r
library(CARTElette)
DEP_sf <- load_map(OGC = 2021, administrative_division = "DEP")
```

```r
DEP_sf <- left_join(DEP_sf, db_dep, by = "DEP") %>%
  mutate(prop = 100 * women / (men + women)) # share of women, in %
```

<center><img src="img/draw_prop.png" height="400px"></center>

---

.pull-left[
### Load a map layer (2)
]
.pull-right[
<center><img src="img/charger.png" height="100px"></center>
]

A [shiny app](https://data.nozav.org/app/prenoms/) using `CARTElette` map layers:

<center><img src="img/prenoms.png" width="80%"></center>

---

.pull-left[
### Move overseas territories
]
.pull-right[
<center><img src="img/charger.png" height="100px"></center>
]

Function `position_overseas()`<br>

<center><img src="img/dom.gif" width="100%"></center>

---

## Future improvements?

* **Create new functions**
  + add your own geographical levels
  + add your own distribution keys
* **Expand to other countries**
  + in Europe: Nomenclature of Territorial Units for Statistics, NUTS (see `rOpenGov/giscoR`)
  + identify common functionalities vs. local (French!) specificities
* **Reach non-R users**
  + Shiny apps
  + API

---

class: center, middle

<center><img src="img/avatar.png" height="300px"></center>

<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css">

<a href="http://twitter.com/antuki13" class="social"><i class="fa fa-twitter fa-2x" title="My Twitter"></i>@antuki13</a>
<a href="http://github.com/antuki" class="social"><i class="fa fa-github fa-2x" title="My GitHub"></i>antuki</a>
<a href="http://antuki.github.io" class="social"><i class="fa fa-bold fa-2x" title="My blog"></i>antuki.github.io</a>

**Packages on GitHub: [antuki/COGugaison](https://github.com/antuki/COGugaison) and [antuki/CARTElette](https://github.com/antuki/CARTElette).**

Slides created with the R package [**xaringan**](https://github.com/yihui/xaringan) and the [R-Ladies theme](https://alison.rbind.io/post/r-ladies-slides/), along with [remark.js](https://remarkjs.com), [knitr](http://yihui.name/knitr), and [R Markdown](https://rmarkdown.rstudio.com).