Add map, tree species tables.

Nick Heppler 2025-02-18 18:52:41 -05:00
parent cfd4ec0113
commit b8377e3213


@ -8,10 +8,13 @@ author:
date: "`r format(Sys.Date(), '%B, %d, %Y')`" date: "`r format(Sys.Date(), '%B, %d, %Y')`"
keywords: "keyword1, keyword2" keywords: "keyword1, keyword2"
output: output:
html_document html_document:
css: custom.css
code_folding: hide
--- ---
```{r setup, include=FALSE} ```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
# Load necessary libraries # Load necessary libraries
library(tidyverse) library(tidyverse)
library(lubridate) library(lubridate)
@ -103,7 +106,7 @@ The histogram presented below visualizes the number of survey submissions based
This chart helps identify any trends in survey participation, such as whether submissions are more frequent at the beginning or end of the week. This could be valuable for understanding user behavior and improving survey timing or outreach strategies. This chart helps identify any trends in survey participation, such as whether submissions are more frequent at the beginning or end of the week. This could be valuable for understanding user behavior and improving survey timing or outreach strategies.
```{r submission-histogram-survey-submissions-day-of-week, echo=FALSE, message=FALSE, fig.height=6, fig.width=8} ```{r submission-histogram-survey-submissions-day-of-week, echo=TRUE, message=FALSE, fig.height=6, fig.width=8}
library(dplyr) library(dplyr)
library(ggplot2) library(ggplot2)
@ -119,7 +122,7 @@ survey_data %>%
theme(axis.text.x = element_text(angle = 45, hjust = 1)) # Angle labels for better readability theme(axis.text.x = element_text(angle = 45, hjust = 1)) # Angle labels for better readability
``` ```
```{r func-plot_submission_trends, echo=FALSE} ```{r func-plot_submission_trends, echo=TRUE}
# Load necessary libraries # Load necessary libraries
library(tidyverse) library(tidyverse)
@ -176,7 +179,7 @@ The plot below visualizes the survey submission trends for the past 30 days. It
The data used for this plot is filtered to include only submissions made in the last 30 days, with the submission count for each date represented by both the line and the points on the graph. A smoothed trend line (dashed) has been added to help visualize the overall submission pattern over this period. The data used for this plot is filtered to include only submissions made in the last 30 days, with the submission count for each date represented by both the line and the points on the graph. A smoothed trend line (dashed) has been added to help visualize the overall submission pattern over this period.
```{r plot-submission-trends-30d, echo=FALSE, message=FALSE, fig.height=6, fig.width=8} ```{r plot-submission-trends-30d, echo=TRUE, message=FALSE, fig.height=6, fig.width=8}
plot_submission_trends(survey_data, days_ago = 30) plot_submission_trends(survey_data, days_ago = 30)
``` ```
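The body of `plot_submission_trends()` is not shown in this hunk, so the following is only a minimal sketch of the filtering step the text describes: keep submissions from the last `days_ago` days and count them per date. The column name `submitted_on`, the toy data, and the fixed reference date are assumptions made for illustration, not the report's actual fields.

```r
library(dplyr)

# Fixed reference date so the example is reproducible (the report would use Sys.Date()).
reference_date <- as.Date("2025-02-18")
days_ago <- 30

# Hypothetical toy data: one row per survey submission.
toy_submissions <- data.frame(
  submitted_on = reference_date - c(1, 1, 5, 40, 95)
)

# Keep only submissions inside the window, then count submissions per date.
daily_counts <- toy_submissions %>%
  filter(submitted_on >= reference_date - days_ago) %>%
  count(submitted_on, name = "submissions")

print(daily_counts)
```

The resulting per-date counts are what a line-and-point plot with a smoothed trend line would be drawn from.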
@ -185,7 +188,7 @@ The plot below visualizes the survey submission trends for the past 90 days. It
The data used for this plot is filtered to include only submissions made in the last 90 days, with the submission count for each date represented by both the line and the points on the graph. A smoothed trend line (dashed) has been added to help visualize the overall submission pattern over this period. The data used for this plot is filtered to include only submissions made in the last 90 days, with the submission count for each date represented by both the line and the points on the graph. A smoothed trend line (dashed) has been added to help visualize the overall submission pattern over this period.
```{r plot-submission-trends-90d, echo=FALSE, message=FALSE} ```{r plot-submission-trends-90d, echo=TRUE, message=FALSE}
plot_submission_trends(survey_data, days_ago = 90) plot_submission_trends(survey_data, days_ago = 90)
``` ```
@ -194,7 +197,7 @@ The table below summarizes the response rates for optional key top-level questio
The "Total Number of Species Planted" question has special handling—only responses greater than 0 are considered valid, whereas for other questions, any non-NA value counts as a response. The "Total Number of Species Planted" question has special handling—only responses greater than 0 are considered valid, whereas for other questions, any non-NA value counts as a response.
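As a rough illustration of the rule described above, the sketch below computes response rates on a small made-up data frame. The helper `response_rate()` and the toy values are hypothetical and are not taken from the report; the sketch also assumes that empty strings are not treated as missing, which the real chunk may handle differently.

```r
# Hypothetical toy data; the real survey_data has many more columns.
toy_survey <- data.frame(
  `Funding Source` = c("Grant", NA, "Private", "Grant"),
  `Total Number of Species Planted` = c(3, 0, NA, 2),
  check.names = FALSE
)

# A response counts if it is non-NA; the species count must also be greater than 0.
response_rate <- function(data, field) {
  values <- data[[field]]
  answered <- if (field == "Total Number of Species Planted") {
    !is.na(values) & values > 0
  } else {
    !is.na(values)
  }
  mean(answered) * 100  # Percentage of rows with a valid response
}

rates <- sapply(names(toy_survey), response_rate, data = toy_survey)
print(rates)
```

Here the `0` in the species column is counted as a non-response, while any non-NA funding source counts as a response.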
```{r optonal-top-level-question-response-rate-table, echo=FALSE, message=FALSE, fig.height=6, fig.width=8} ```{r optonal-top-level-question-response-rate-table, echo=TRUE, message=FALSE, fig.height=6, fig.width=8}
# List of fields to check for response rates, with special handling for 'Total Number of Species Planted' # List of fields to check for response rates, with special handling for 'Total Number of Species Planted'
fields <- c("Planter Contact Email", "Funding Source", "Land Ownership", fields <- c("Planter Contact Email", "Funding Source", "Land Ownership",
"Tree Size Planted", "Source of Trees", "Total Number of Species Planted") "Tree Size Planted", "Source of Trees", "Total Number of Species Planted")
@ -249,7 +252,7 @@ The following section contains an analysis of tree planting by participant type.
### Submissions ### Submissions
The following plot shows the distribution of survey submissions based on participant type. This breakdown highlights the contributions of each participant group to the tree planting initiative. The following plot shows the distribution of survey submissions based on participant type. This breakdown highlights the contributions of each participant group to the tree planting initiative.
```{r participant-type-surveys, echo=FALSE, message=FALSE, fig.height=6, fig.width=8} ```{r participant-type-surveys, echo=TRUE, message=FALSE, fig.height=6, fig.width=8}
ggplot(survey_data, aes(x = `Who Planted The Tree(s)?`)) + ggplot(survey_data, aes(x = `Who Planted The Tree(s)?`)) +
geom_bar(fill = "#233f28", color = "#7e9084") + geom_bar(fill = "#233f28", color = "#7e9084") +
geom_text(stat = "count", aes(label = scales::comma(after_stat(count))), geom_text(stat = "count", aes(label = scales::comma(after_stat(count))),
@ -283,7 +286,7 @@ ggplot(survey_data, aes(x = `Who Planted The Tree(s)?`)) +
### Trees Planted ### Trees Planted
This plot visualizes the total number of trees planted by each participant type, helping to evaluate the overall impact of different groups in the tree planting program. This plot visualizes the total number of trees planted by each participant type, helping to evaluate the overall impact of different groups in the tree planting program.
```{r participant-type-planted, echo=FALSE, message=FALSE, fig.height=6, fig.width=8} ```{r participant-type-planted, echo=TRUE, message=FALSE, fig.height=6, fig.width=8}
summary_data <- survey_data %>% summary_data <- survey_data %>%
group_by(`Who Planted The Tree(s)?`) %>% group_by(`Who Planted The Tree(s)?`) %>%
@ -320,7 +323,7 @@ ggplot(summary_data, aes(x = `Who Planted The Tree(s)?`, y = total_trees)) +
The following table provides a breakdown of the total number of trees planted by participant type. It shows both the total number of trees planted by each group and their proportional contribution to the overall planting efforts. This information helps assess which participant types have contributed the most to the tree planting program. The following table provides a breakdown of the total number of trees planted by participant type. It shows both the total number of trees planted by each group and their proportional contribution to the overall planting efforts. This information helps assess which participant types have contributed the most to the tree planting program.
```{r participant-type-table, echo=FALSE, message=FALSE, fig.height=6, fig.width=8} ```{r participant-type-table, echo=TRUE, message=FALSE, fig.height=6, fig.width=8}
# Summarize the data to calculate the total number of trees planted by participant type # Summarize the data to calculate the total number of trees planted by participant type
summary_data <- survey_data %>% summary_data <- survey_data %>%
group_by(`Who Planted The Tree(s)?`) %>% group_by(`Who Planted The Tree(s)?`) %>%
@ -368,7 +371,7 @@ summary_data_formatted %>%
## Location Analysis{.tabset} ## Location Analysis{.tabset}
```{r func-create_summary_table, echo=FALSE} ```{r func-create_summary_table, echo=TRUE}
create_summary_table <- function(data, field) { create_summary_table <- function(data, field) {
# Summarize the data based on the field provided # Summarize the data based on the field provided
summary_data <- data %>% summary_data <- data %>%
@ -410,11 +413,129 @@ create_summary_table <- function(data, field) {
``` ```
### By Region ### By Region
```{r create-summary-table-region, echo=FALSE, message=FALSE} ```{r create-summary-table-region, echo=TRUE, message=FALSE}
create_summary_table(survey_data, "Region") create_summary_table(survey_data, "Region")
``` ```
### By County ### By County
```{r create-summary-table-county, echo=FALSE, message=FALSE} This map displays the **total number of trees planted** across each county in **New York State**. The counties are color-coded, with darker shades representing areas where more trees have been planted. This allows users to quickly see which counties have had the most extensive tree planting efforts.
create_summary_table(survey_data, "County")
``` - **What to look for**:
- **Dark colors**: Indicate counties with a higher number of trees planted.
- **Lighter colors**: Represent counties with fewer trees planted.
The map provides a visual overview of tree planting distribution across New York, making it easier to identify areas with the highest impact or need for further action.
```{r create-county-choropleth-map, echo=TRUE, message=FALSE}
library(tigris) # For geographic data
library(sf) # For handling spatial data
library(dplyr) # For data manipulation
library(ggplot2) # For plotting
library(viridis) # For a color palette in the map
# Download New York State counties shapefile
ny_counties <- counties(state = "NY", cb = TRUE, progress_bar = FALSE) %>% st_as_sf() # progress_bar = FALSE keeps download output out of the knitted report
survey_data_aggregated <- survey_data %>%
group_by(County) %>%
summarise(total_trees = sum(`Number of Trees Planted`, na.rm = TRUE))
ny_counties_merged <- ny_counties %>%
left_join(survey_data_aggregated, by = c("NAME" = "County"))
# Get the system date and format it
current_date <- format(Sys.Date(), "%B %d, %Y") # Format as "Month Day, Year"
ggplot(data = ny_counties_merged) +
geom_sf(aes(fill = total_trees), color = "white") +
scale_fill_viridis_c(option = "plasma", direction = -1) + # Reverse the palette so darker shades mean more trees, matching the description above
theme_minimal() +
labs(title = "Number of Trees Planted by County in New York",
fill = "Total Trees Planted") +
theme(axis.text = element_blank(), axis.title = element_blank()) +
annotate("text", x = -77.25, y = 45.25, label = paste("Date:", current_date), size = 4, hjust = 1, color = "black")
```
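The join above matches the shapefile's `NAME` column to the survey's `County` column by exact string, so any spelling difference silently leaves a county without data. A minimal sketch of a pre-plot check is shown below; the two small data frames are hypothetical stand-ins for `ny_counties` and `survey_data_aggregated`, and the mismatched spelling is invented for illustration.

```r
library(dplyr)

# Hypothetical stand-ins for the shapefile attributes and the aggregated survey data.
shape_names <- data.frame(NAME = c("Albany", "St. Lawrence", "Erie"))
survey_totals <- data.frame(
  County = c("Albany", "Saint Lawrence", "Erie"),
  total_trees = c(120, 45, 300)
)

# Survey counties with no matching shapefile name never reach the map.
unmatched <- anti_join(survey_totals, shape_names, by = c("County" = "NAME"))
print(unmatched)

# Shapefile counties without survey data get NA and are drawn in the NA colour.
merged <- left_join(shape_names, survey_totals, by = c("NAME" = "County"))
print(merged)
```

Running a check like this before plotting makes it easier to tell a county with no plantings apart from a county whose name simply failed to match.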
## Tree Analysis {.tabset}
```{r func-create_species_summary_table, echo=TRUE}
create_species_summary_table <- function(data, field, field_label = NULL) {
# Replace empty strings and NA values with "Not Provided" before summarization
data <- data %>%
mutate(
!!sym(field) := ifelse(!!sym(field) == "" | is.na(!!sym(field)), "Not Provided", !!sym(field)) # Replace empty strings and NAs
)
# Clean up the species names: replace underscores with spaces and convert to title case
data <- data %>%
mutate(
!!sym(field) := gsub("_", " ", !!sym(field)), # Replace underscores with spaces
!!sym(field) := tools::toTitleCase(!!sym(field)) # Convert to title case
)
# Summarize the data based on the field (e.g., Generic.Species.of.Tree)
summary_data <- data %>%
group_by(!!sym(field)) %>%
summarise(
submissions = n(), # Count of surveys for each species (or category)
.groups = "drop" # To prevent issues with group structure
) %>%
mutate(
submissions_percentage = submissions / sum(submissions) * 100 # Proportion of surveys for each category
)
# Format the table for display
summary_data_formatted <- summary_data %>%
mutate(
submissions = scales::comma(submissions), # Format the submission counts with commas
submissions_percentage = paste0(round(submissions_percentage, 1), "%") # Round percentage and append '%'
)
# Determine the label for the field
label <- if (is.null(field_label)) field else field_label # Fall back to the raw field name when no label is given
# Create and style the table
summary_data_formatted %>%
knitr::kable(col.names = c(label, "Number of Surveys", "Proportion of Surveys (%)"),
caption = paste("Summary of Surveys by", label),
align = c("l", "c", "c")) %>% # Align the columns (left for the field, center for others)
kableExtra::kable_styling(
full_width = FALSE,
position = "center",
bootstrap_options = c("striped", "hover"),
font_size = 14
) %>%
kableExtra::column_spec(1, width = "20em", bold = TRUE) %>% # First column (Species) bold and wider
kableExtra::column_spec(2, width = "12em") %>% # Number of Surveys column
kableExtra::column_spec(3, width = "12em") %>% # Proportion column
kableExtra::add_footnote("The proportions represent the percentage of surveys for each species relative to the total surveys.")
}
```
### By Genus
The following table shows a breakdown of survey submissions by **Genus**. For each genus, the table provides:
1. **Number of Surveys**: The total number of surveys where this genus was reported.
2. **Proportion of Surveys (%)**: The percentage of total surveys that reported this genus, relative to the entire dataset.
3. **"Not Provided" Category**: Any surveys that did not specify a genus are grouped under the "Not Provided" category.
These figures show which genera are most commonly reported, how prevalent each genus is, and the proportion of surveys where no genus was specified.
```{r create-summary-table-genus, echo=TRUE, message=FALSE}
create_species_summary_table(species_data, "Generic.Species.of.Tree", "Tree Genus")
```
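To make the cleaning and counting steps inside `create_species_summary_table()` easier to follow, here is a minimal sketch of the same logic on made-up data, without the table styling. The toy values and the intermediate column name `genus` are assumptions for illustration; only the `Generic.Species.of.Tree` column name comes from the report.

```r
library(dplyr)

# Hypothetical toy input; two rows deliberately have no genus.
toy_species <- data.frame(
  Generic.Species.of.Tree = c("acer", "quercus", "", NA, "acer"),
  stringsAsFactors = FALSE
)

genus_summary <- toy_species %>%
  mutate(
    # Empty strings and NA values are grouped under "Not Provided"
    genus = ifelse(is.na(Generic.Species.of.Tree) | Generic.Species.of.Tree == "",
                   "Not Provided", Generic.Species.of.Tree),
    # Underscores become spaces, then names are converted to title case
    genus = tools::toTitleCase(gsub("_", " ", genus))
  ) %>%
  count(genus, name = "submissions") %>%
  mutate(submissions_percentage = round(submissions / sum(submissions) * 100, 1))

print(genus_summary)
```

Because the missing values are recoded before counting, they contribute to the denominator, so each percentage is relative to all rows rather than only to rows with a genus.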
### By Species
The following table shows a breakdown of survey submissions by **Species**. For each species, the table provides:
1. **Number of Surveys**: The total number of surveys where this species was reported.
2. **Proportion of Surveys (%)**: The percentage of total surveys that reported this species, relative to the entire dataset.
3. **"Not Provided" Category**: Any surveys that did not specify a species are grouped under the "Not Provided" category.
These figures show which species are most commonly reported, how prevalent each species is, and the proportion of surveys where no species was specified.
```{r create-summary-table-species, echo=TRUE, message=FALSE}
create_species_summary_table(species_data, "Precise.Species.of.Tree", "Tree Species")
```