Updates to report. Add bar chart for total chargers, add table for chargers by city.
Parent: 55be0f38a4
Commit: 31b8d4b166
analysis.Rmd (89 lines changed)
--- a/analysis.Rmd
+++ b/analysis.Rmd
@@ -1,35 +1,51 @@
 ---
 title: "Exploratory Analysis of Electric Vehicle Charger Placement in California"
 author: "Nick Hepler"
-date: "2024-09-25"
+date: "`r Sys.Date()`"
 output: html_document
 ---
 
 ```{r setup, include=FALSE}
 knitr::opts_chunk$set(echo = TRUE)
 ```
+# {.tabset .tabset-fade .tabset-pills}
 
-## Purpose
+## Summary
 
-This repository contains an exploratory statistical analysis of electric vehicle (EV) charger placement across California. The findings will provide insights to support the growth of EV infrastructure and enhance accessibility for electric vehicle users in the state.
+### Purpose
+This repository contains an exploratory statistical analysis of electric vehicle (EV) charger placement across California. The findings will provide insights to support the growth of EV infrastructure and enhance accessibility for electric vehicle users in the state. This analysis will be used to
 
 ## Data Sources
 
+This analysis used data from the following sources:
+
 ### Alternative Fuels Data Center
 The alternative fueling station locations was obtained through the [Alternative Fuels Data Center](https://afdc.energy.gov/stations). It contains available, public electric fuel stations with Level 2 or DC Fast charger types.
 
 ### 2020 Census Tabulation Block TIGER/Line Shapefiles
 The urban-rural classification was obtained through the [2020 Census Tabulation Block TIGER/Line Shapefiles](https://www2.census.gov/geo/tiger/TIGER2023/TABBLOCK20/tl_2023_06_tabblock20.zip) for California which contain urban and rural classification information.
 
+### QGIS Spatial Join
+A spatial join was completed to add the Census urban/rural indicator (UR20) to the alternative fueling station locations. This was the data was exported to a CSV file and is used in this analysis.
+
 ## Data Processing
 
 ### Raw Data
-```{r, warning= FALSE, message = FALSE}
+```{r}
 library(readr)
-raw <- read_csv("data/alt_fuel_stations (Sep 25 2024)_joined.csv")
+raw <- read_csv("data/alt_fuel_stations (Sep 25 2024)_joined.csv", show_col_types = FALSE)
 ```
+The raw data contains ```r nrow(raw)``` observations and ```r ncol(raw)``` variables.
 
-A spatial join was completed to add the Census urban/rural indicator (UR20) to the alternative fueling station locations.This was the data used in the analysis. The raw data contains ```r nrow(raw)``` observations and ```r ncol(raw)``` variables.
+```{r}
+problems(raw)
+```
+*readr* returned several warnings indicating potential parsing problem with the data.
+
+```{r}
+spec(raw)
+```
+Examination of the column specifications for the data frame indicates the type assumed by *readr* for these columns was incorrect. Further, the columns will not be used in the analysis and can be ignored.
 
 ### Data Transformation
 ```{r warning= FALSE, message = FALSE}
@@ -52,7 +68,8 @@ Starting with data transformation, columns (observations) that are unnecessary f
 
 ```{r}
 has_null_ur20 <- dataset %>%
-  summarise(any_null = any(is.na(UR20)))
+  summarise(any_null = any(is.na(UR20))) %>%
+  print()
 ```
 
 Determine if any value failed the spatial join. This indicates 1 station has a null value. This value will be removed.
@@ -60,22 +77,66 @@ Determine if any value failed the spatial join. This indicates 1 station has a n
 ```{r}
 dataset <- dataset %>%
   filter(!is.na(UR20)) %>%
-  mutate(UR20 = recode(UR20, "R" = "Rural", "U" = "Urban"))
+  mutate(UR20 = recode(UR20, "R" = "Rural", "U" = "Urban")) %>%
+  mutate('Total Chargers' = rowSums(select(., `EV Level2 EVSE Num`, `EV DC Fast Count`), na.rm = TRUE))
 ```
 
 To make the dataset more relevant and manageable, the UR20 field was coded to use a more meaningful factor name. Additionally,the record with value of NA discovered previously is removed. The dataset now contains ```r nrow(dataset)``` observations and ```r ncol(dataset)``` variables.
 
 ## Exploratory
-```{r, message = FALSE}
+```{r, fig.align = "center"}
 library(ggplot2)
 library(scales)
 ggplot(dataset, aes(y = factor(UR20))) +
   geom_bar(aes(x = after_stat(count))) +
-  geom_text(stat = 'count', aes(label = comma(after_stat(count))), # Use comma for thousands separators
+  geom_text(stat = 'count', aes(label = comma(after_stat(count))), # Use comma for thousands separators.
             position = position_stack(vjust = 0.5),
-            color = "white") + # Set label color to white
-  labs(title = "Urban-Rural Station Histogram",
-       x = "Classification",
-       y = "Count") +
+            color = "white") +
+  labs(title = "Distribution of Alternative Fueling Stations by Urban-Rural Classification",
+       x = "Count",
+       y = "Classification") +
   theme_minimal()
 ```
+
+This figure shows the distribution of alternative fueling station locations by urban-rural classification.
+
+```{r}
+charger_by_UR20_summary <- dataset %>%
+  group_by(UR20) %>%
+  summarise(`Total Chargers` = sum(`Total Chargers`))
+```
+
+A new data frame is created to summarize the Total Chargers based on urban-rural classification.
+
+```{r, fig.align = "center"}
+ggplot(charger_by_UR20_summary, aes(x = UR20, y = `Total Chargers`)) +
+  geom_bar(stat = "identity", fill = "skyblue", color = "black") +
+  geom_text(aes(label = comma(`Total Chargers`)), # Use comma for thousands separators.
+            position = position_stack(vjust = 0.5),
+            color = "white") +
+  labs(title = "Total Chargers by Urban-Rural Classification",
+       x = "Classification",
+       y = "Total Chargers") +
+  theme_minimal()
+```
+
+This figure shows the total number of electric vehicle chargers based on urban-rural classification.
+
+```{r}
+charger_by_city_summary <- dataset %>%
+  group_by(City) %>%
+  summarise(
+    `Total Stations` = n(),
+    `Total Chargers` = sum(`Total Chargers`)) %>%
+  arrange(desc(`Total Chargers`))
+```
+
+A new data frame is created to summarize the Total Chargers based on City in descending order of total chargers.
+
+```{r, message=FALSE}
+charger_by_city_summary %>%
+  slice_head(n = 5) %>%
+  print()
+```
+
+This is table provides the top 5 cities with the most total chargers.
scratch.R (33 lines changed)
--- a/scratch.R
+++ b/scratch.R
@@ -1,6 +1,9 @@
 library(readr)
 raw <- read_csv("data/alt_fuel_stations (Sep 25 2024)_joined.csv")
 
+problems(raw) %>%
+  print(n=Inf)
+
 library(tidyverse)
 dataset <- raw %>%
   select(
@@ -15,12 +18,19 @@ dataset <- raw %>%
     `UR20`
   )
 
+problems(dataset) %>%
+  print(n=Inf)
+
 has_null_ur20 <- dataset %>%
   summarise(any_null = any(is.na(UR20)))
 
 dataset <- dataset %>%
   filter(!is.na(UR20)) %>%
-  mutate(UR20 = recode(UR20, "R" = "Rural", "U" = "Urban"))
+  mutate(UR20 = recode(UR20, "R" = "Rural", "U" = "Urban")) %>%
+  mutate('Total Chargers' = rowSums(select(., `EV Level2 EVSE Num`, `EV DC Fast Count`), na.rm = TRUE))
 
+filtered_df <- dataset %>%
+  filter(!is.na(`EV Level2 EVSE Num`) & !is.na(`EV DC Fast Count`))
+
 library(ggplot2)
 library(scales)
@@ -34,12 +44,17 @@ ggplot(dataset, aes(y = factor(UR20))) +
        y = "Count") +
   theme_minimal()
 
-ggplot(dataset, aes(y = factor(UR20))) +
-  geom_bar(aes(x = after_stat(count))) +
-  geom_text(stat = 'count', aes(label = comma(after_stat(count))), # Use comma for thousands separators
-            position = position_stack(vjust = 0.5),
-            color = "white") + # Set label color to white
-  labs(title = "Urban-Rural Charger Histogram",
-       x = "Classification",
-       y = "Count") +
+
+
+charger_by_UR20_summary <- dataset %>%
+  group_by(UR20) %>%
+  summarise(`Total Chargers` = sum(`Total Chargers`))
+
+ggplot(charger_by_UR20_summary, aes(x = UR20, y = `Total Chargers`)) +
+  geom_bar(stat = "identity", fill = "skyblue", color = "black") +
+  geom_text(aes(label = comma(`Total Chargers`)),
+            vjust = -0.5, size = 5) + # Label position
+  labs(title = "Total Chargers by UR20",
+       x = "UR20",
+       y = "Total Chargers") +
   theme_minimal()
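
Note on the `Total Chargers` column added in this commit: it is built with `rowSums(..., na.rm = TRUE)`. The following is a minimal base R sketch of how that call behaves, not the project's code. The `stations` data frame and its values are made up for illustration; only the two column names are taken from the diff. With `na.rm = TRUE`, a missing count is treated as zero, so a station with both counts missing gets a total of 0 rather than `NA`.

```r
# Hypothetical toy data; column names follow the diff, values are invented.
stations <- data.frame(
  `EV Level2 EVSE Num` = c(4, NA, 2, NA),
  `EV DC Fast Count`   = c(NA, 6, 1, NA),
  check.names = FALSE  # keep the spaces in the column names
)

# Sum the two charger counts per row, ignoring missing values.
stations$`Total Chargers` <- rowSums(
  stations[, c("EV Level2 EVSE Num", "EV DC Fast Count")],
  na.rm = TRUE
)

print(stations$`Total Chargers`)
# 4 6 3 0  (the last row has no recorded counts and sums to 0)
```

If a zero total is not the desired result for stations with no recorded counts, those rows would need to be filtered or flagged separately before summarising.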