---
title: "Exploratory Analysis of Electric Vehicle Charger Placement in California"
author: "Nick Hepler"
date: "`r Sys.Date()`"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

# {.tabset .tabset-fade .tabset-pills}

## Summary

### Purpose

This repository contains an exploratory statistical analysis of electric vehicle (EV) charger placement across California. The findings will provide insights to support the growth of EV infrastructure and enhance accessibility for electric vehicle users in the state. This analysis will be used to

## Data Sources

This analysis used data from the following sources:

### Alternative Fuels Data Center

The alternative fueling station location data were obtained from the [Alternative Fuels Data Center](https://afdc.energy.gov/stations). The data set contains available, public electric charging stations with Level 2 or DC Fast charger types.

### 2020 Census Tabulation Block TIGER/Line Shapefiles

The urban-rural classification was obtained from the [2020 Census Tabulation Block TIGER/Line Shapefiles](https://www2.census.gov/geo/tiger/TIGER2023/TABBLOCK20/tl_2023_06_tabblock20.zip) for California, which contain urban and rural classification information.

### QGIS Spatial Join

A spatial join was completed in QGIS to add the Census urban/rural indicator (UR20) to the alternative fueling station locations. The joined data were then exported to a CSV file, which is used in this analysis.

## Data Processing

### Raw Data

```{r}
library(readr)

# Read the joined station data exported from QGIS.
raw <- read_csv("data/alt_fuel_stations (Sep 25 2024)_joined.csv", show_col_types = FALSE)
```

The raw data contains ```r nrow(raw)``` observations and ```r ncol(raw)``` variables.

```{r}
problems(raw)
```

*readr* returned several warnings indicating potential parsing problems with the data.

```{r}
spec(raw)
```

Examination of the column specifications for the data frame indicates that the type assumed by *readr* for these columns was incorrect.
However, these columns are not used in the analysis, so the warnings can be ignored.

### Data Transformation

```{r, warning = FALSE, message = FALSE}
library(tidyverse)

# Keep only the columns needed for the analysis.
dataset <- raw %>%
  select(
    `Station Name`,
    `City`,
    `EV Level2 EVSE Num`,
    `EV DC Fast Count`,
    `Latitude`,
    `Longitude`,
    `EV Connector Types`,
    `EV Workplace Charging`,
    `UR20`
  )
```

As the first transformation step, columns (variables) that are unnecessary for the analysis were removed. The dataset now contains ```r nrow(dataset)``` observations and ```r ncol(dataset)``` variables.

```{r}
# Check whether any station is missing UR20 and count how many.
has_null_ur20 <- dataset %>%
  summarise(any_null = any(is.na(UR20)),
            n_null = sum(is.na(UR20))) %>%
  print()
```

This check determines whether any station failed the spatial join, leaving `UR20` missing. The result indicates that 1 station has a missing value. This record is removed in the next step.

```{r}
dataset <- dataset %>%
  # Drop the station that failed the spatial join.
  filter(!is.na(UR20)) %>%
  # Replace the Census codes with readable labels.
  mutate(UR20 = recode(UR20, "R" = "Rural", "U" = "Urban")) %>%
  # Sum Level 2 and DC Fast counts per station; missing counts are treated as zero.
  mutate(`Total Chargers` = rowSums(select(., `EV Level2 EVSE Num`, `EV DC Fast Count`), na.rm = TRUE))
```

To make the dataset more relevant and manageable, the `UR20` field was recoded to use more meaningful labels ("Rural" and "Urban"). Additionally, the record with the missing `UR20` value identified above was removed, and a `Total Chargers` column was added as the sum of the Level 2 and DC Fast charger counts for each station. The dataset now contains ```r nrow(dataset)``` observations and ```r ncol(dataset)``` variables.

## Exploratory

```{r, message=FALSE, fig.align = "center"}
library(ggplot2)
library(scales)

ggplot(dataset, aes(y = factor(UR20))) +
  geom_bar(aes(x = after_stat(count))) +
  geom_text(stat = 'count',
            aes(label = comma(after_stat(count))), # Use comma for thousands separators.
            position = position_stack(vjust = 0.5),
            color = "white") +
  labs(title = "Distribution of Alternative Fueling Stations by Urban-Rural Classification",
       x = "Count",
       y = "Classification") +
  theme_minimal()
```

This figure shows the number of alternative fueling station locations in each urban-rural classification.

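The station counts in the figure can also be expressed as shares of the total. The chunk below is a minimal sketch of that calculation, not part of the original analysis: `share_by_class()` is a hypothetical helper, and the output column names `Stations` and `Share` are chosen here for illustration only.

```{r}
library(dplyr)

# Hypothetical helper (an assumption, not from the original analysis):
# counts the rows in each class of a grouping column and adds each
# class's share of the overall total.
share_by_class <- function(data, class_col) {
  data %>%
    count({{ class_col }}, name = "Stations") %>%
    mutate(Share = Stations / sum(Stations))
}
```

Calling `share_by_class(dataset, UR20)` on the data frame built above would return one row per classification, with the shares summing to 1.
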
```{r}
# Total chargers in each urban-rural class.
charger_by_UR20_summary <- dataset %>%
  group_by(UR20) %>%
  summarise(`Total Chargers` = sum(`Total Chargers`))
```

A new data frame is created to summarize the total number of chargers by urban-rural classification.

```{r, fig.align = "center"}
ggplot(charger_by_UR20_summary, aes(x = UR20, y = `Total Chargers`)) +
  geom_bar(stat = "identity", fill = "skyblue", color = "black") +
  geom_text(aes(label = comma(`Total Chargers`)), # Use comma for thousands separators.
            position = position_stack(vjust = 0.5),
            color = "white") +
  labs(title = "Total Chargers by Urban-Rural Classification",
       x = "Classification",
       y = "Total Chargers") +
  theme_minimal()
```

This figure shows the total number of electric vehicle chargers by urban-rural classification.

```{r}
# Station and charger totals per city, largest charger totals first.
charger_by_city_summary <- dataset %>%
  group_by(City) %>%
  summarise(
    `Total Stations` = n(),
    `Total Level 2 Chargers` = sum(`EV Level2 EVSE Num`, na.rm = TRUE),
    `Total DC Fast Chargers` = sum(`EV DC Fast Count`, na.rm = TRUE),
    `Total Chargers` = sum(`Total Chargers`)) %>%
  arrange(desc(`Total Chargers`))
```

A new data frame is created to summarize, for each city, the number of stations, the number of chargers of each type, and the total number of chargers, sorted in descending order of total chargers.

```{r, message=FALSE}
charger_by_city_summary %>%
  slice_head(n = 5) %>%
  print()
```

This table lists the five cities with the most total chargers.
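
The city totals above depend on how missing charger counts are handled, since a station may have no Level 2 count or no DC Fast count recorded. The chunk below is a small sketch of the row-sum approach from the Data Transformation step. The rows in `toy_stations` are made up for illustration and are not taken from the AFDC data.

```{r}
library(dplyr)

# Illustrative rows only (an assumption, not AFDC data): a station with both
# charger types, one with only Level 2, one with only DC Fast, and one with
# neither count recorded.
toy_stations <- tibble::tibble(
  `EV Level2 EVSE Num` = c(4, 2, NA, NA),
  `EV DC Fast Count`   = c(2, NA, 8, NA)
)

# Same row-sum approach as in the Data Transformation step: with
# na.rm = TRUE, a missing count contributes zero to the station total.
toy_totals <- toy_stations %>%
  mutate(`Total Chargers` = rowSums(select(., `EV Level2 EVSE Num`, `EV DC Fast Count`), na.rm = TRUE))

toy_totals$`Total Chargers`
```

With `na.rm = TRUE`, the station with neither count recorded receives a total of 0 rather than `NA`, so it still counts toward `Total Stations` for its city but adds nothing to the charger totals.
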