---
title: "Exploratory Analysis of Electric Vehicle Charger Placement in California"
author: "Nick Hepler"
date: "2024-09-25"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Purpose

This repository contains an exploratory statistical analysis of electric vehicle (EV) charger placement across California. The findings are intended to support the growth of EV infrastructure and improve accessibility for EV users in the state.

## Data Sources

### Alternative Fuels Data Center

The alternative fueling station locations were obtained from the [Alternative Fuels Data Center](https://afdc.energy.gov/stations). The dataset contains available, public electric vehicle charging stations with Level 2 or DC Fast charger types.

### 2020 Census Tabulation Block TIGER/Line Shapefiles

The urban-rural classification was obtained from the [2020 Census Tabulation Block TIGER/Line Shapefiles](https://www2.census.gov/geo/tiger/TIGER2023/TABBLOCK20/tl_2023_06_tabblock20.zip) for California, which contain urban and rural classification information.

## Data Processing

### Raw Data

```{r, warning= FALSE, message = FALSE}
library(readr)

# Station locations with the Census UR20 indicator already joined on
raw <- read_csv("data/alt_fuel_stations (Sep 25 2024)_joined.csv")
```

A spatial join was completed to add the Census urban/rural indicator (UR20) to the alternative fueling station locations. The joined file is the data used in the analysis. The raw data contains `r nrow(raw)` observations and `r ncol(raw)` variables.

### Data Transformation

```{r warning= FALSE, message = FALSE}
library(tidyverse)

# Keep only the columns needed for the analysis
dataset <- raw %>%
  select(
    `Station Name`,
    `City`,
    `EV Level2 EVSE Num`,
    `EV DC Fast Count`,
    `Latitude`,
    `Longitude`,
    `EV Connector Types`,
    `EV Workplace Charging`,
    `UR20`
  )
```

As the first transformation step, columns (variables) that are unnecessary for the analysis were removed. The dataset now contains `r nrow(dataset)` observations and `r ncol(dataset)` variables.
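The source does not show how the spatial join described under Raw Data was performed. As a rough sketch only, one way to reproduce that kind of join in R is with the `sf` package, which is an assumption here and is not used elsewhere in this analysis. The toy `stations` and `blocks` objects and the `square()` helper below are hypothetical stand-ins for the AFDC points and the TIGER/Line block polygons. A station that falls in no block receives `NA` for UR20, which is the condition checked next.

```r
library(sf)

# Hypothetical stand-in for the station table (coordinates are made up)
stations <- data.frame(
  station   = c("A", "B", "C"),
  Longitude = c(0.5, 1.5, 5.0),
  Latitude  = c(0.5, 0.5, 5.0)
)

# Hypothetical helper: a 1 x 1 degree square with its lower-left corner at (xmin, ymin)
square <- function(xmin, ymin) {
  st_polygon(list(rbind(
    c(xmin,     ymin),
    c(xmin + 1, ymin),
    c(xmin + 1, ymin + 1),
    c(xmin,     ymin + 1),
    c(xmin,     ymin)
  )))
}

# Hypothetical stand-in for the block polygons, carrying the UR20 indicator
blocks <- st_sf(
  UR20     = c("U", "R"),
  geometry = st_sfc(square(0, 0), square(1, 0), crs = 4269)
)

# Turn the station table into point geometries in the same CRS as the blocks
stations_sf <- st_as_sf(
  stations,
  coords = c("Longitude", "Latitude"),
  crs    = 4269,
  remove = FALSE
)

# Left join: every station is kept; stations inside no polygon get UR20 = NA
joined <- st_join(stations_sf, blocks["UR20"], join = st_within)

st_drop_geometry(joined)
```

In this toy case station C lies outside both squares, so its UR20 is `NA`, analogous to a station that fails the join against the real block layer.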
```{r}
# Check whether any station is missing UR20, and count how many
has_null_ur20 <- dataset %>%
  summarise(
    any_null = any(is.na(UR20)),
    n_null = sum(is.na(UR20))
  )
has_null_ur20
```

This check determines whether any station failed the spatial join. It indicates that 1 station has a null UR20 value. That record is removed in the next step.

```{r}
dataset <- dataset %>%
  filter(!is.na(UR20)) %>%                                   # drop the record that failed the join
  mutate(UR20 = recode(UR20, "R" = "Rural", "U" = "Urban"))  # use readable labels
```

To make the dataset easier to interpret, the UR20 field was recoded with more meaningful labels ("Rural" and "Urban"). Additionally, the record with the NA value found previously was removed. The dataset now contains `r nrow(dataset)` observations and `r ncol(dataset)` variables.

## Exploratory

```{r, message = FALSE}
library(ggplot2)
library(scales)

ggplot(dataset, aes(y = factor(UR20))) +
  geom_bar(aes(x = after_stat(count))) +
  # Label each bar with its count, using commas as thousands separators
  geom_text(
    stat = 'count',
    aes(label = comma(after_stat(count))),
    position = position_stack(vjust = 0.5),
    color = "white"  # white labels for contrast against the bars
  ) +
  # Counts are on the x-axis and the classification is on the y-axis
  labs(title = "Urban-Rural Station Histogram",
       x = "Count",
       y = "Classification") +
  theme_minimal()
```
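As a numeric companion to the bar chart, the same counts can be tabulated alongside each class's share of stations and its charger port totals. The sketch below runs on a small hypothetical data frame (`toy`) rather than the real `dataset`, so the values are illustrative and are not results. It also assumes that a blank port count means zero ports of that type, which the source does not state.

```r
library(dplyr)

# Hypothetical stand-in for `dataset`; values are illustrative only
toy <- tibble::tibble(
  UR20                 = c("Urban", "Urban", "Urban", "Rural"),
  `EV Level2 EVSE Num` = c(4, 2, NA, 2),
  `EV DC Fast Count`   = c(NA, 1, 8, NA)
)

class_summary <- toy %>%
  group_by(UR20) %>%
  summarise(
    stations      = n(),
    # Assumption: a missing count is treated as zero ports of that type
    level2_ports  = sum(`EV Level2 EVSE Num`, na.rm = TRUE),
    dc_fast_ports = sum(`EV DC Fast Count`, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  # Share of all stations that falls in each class
  mutate(share = stations / sum(stations))

class_summary
```

Replacing `toy` with `dataset` would give the corresponding table for the real data, since the column names match those selected during the transformation step.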