Compare commits
No commits in common. "ad1da91aff7c405b7861f4df4e74d90da70b2784" and "a7b8cea9701d2c2cdc603ffbeb3f034416e37e38" have entirely different histories.
12
custom.css
@@ -1,6 +1,6 @@
 /* Base styles */
 body {
-  background-color: #ffffff; /* Recreation & Environment Tertiary Color */
+  background-color: #f2f2f2; /* Tertiary color */
   color: #233f28; /* Primary color */
   font-family: 'Arial', sans-serif;
   font-size: 16px;
@@ -9,22 +9,22 @@ body {

 /* Header Styles */
 h1, h2, h3, h4, h5, h6 {
-  color: #233f28; /* Recreation & Environment Primary Color */
+  color: #233f28; /* Primary color for all headers */
 }

 /* Links */
 a {
-  color: #004dd1; /* Web-Specific New York State Color Palette for links */
+  color: #7e9084; /* Secondary color for links */
 }

 a:hover {
-  color: #face00; /* Recreation & Environment Accent Color */
+  color: #face00; /* Accent color for links on hover */
 }

 /* Code Chunks */
 code, pre {
-  background-color: #f2f2f2; /* Gray95 for code background */
-  color: #233f28; /* Recreation & Environment Primary Color for code text */
+  background-color: #233f28; /* Primary color for code background */
+  color: #d9e1dd; /* Tertiary color for code text */
   padding: 4px;
   border-radius: 4px;
 }
598
report.Rmd
@@ -1,600 +1,3 @@
|
|||||||
<<<<<<< HEAD
|
|
||||||
---
|
|
||||||
title: "25 Million Trees Initiative Survey Report"
|
|
||||||
author:
|
|
||||||
- name: Nicholas Hepler <nicholas.hepler@its.ny.gov>
|
|
||||||
affiliation: Office of Information Technology Services
|
|
||||||
- name: Annabel Gregg <annabel.gregg@dec.ny.gov>
|
|
||||||
affiliation: Department of Environmental Conservation
|
|
||||||
date: "`r format(Sys.Date(), '%B %d, %Y')`"
|
|
||||||
keywords: "keyword1, keyword2"
|
|
||||||
output:
|
|
||||||
html_document:
|
|
||||||
toc: true
|
|
||||||
toc_depth: 1
|
|
||||||
toc_float: true
|
|
||||||
number_sections: false
|
|
||||||
css: custom.css
|
|
||||||
code_folding: hide
|
|
||||||
---
|
|
||||||
|
|
||||||
```{r setup, include=FALSE}
|
|
||||||
knitr::opts_chunk$set(
|
|
||||||
echo = TRUE,
|
|
||||||
message = FALSE,
|
|
||||||
warning = FALSE
|
|
||||||
)
|
|
||||||
# Load necessary libraries
|
|
||||||
library(tidyverse)
|
|
||||||
library(lubridate)
|
|
||||||
library(ggplot2)
|
|
||||||
|
|
||||||
# Read the CSV files into a dataframe
|
|
||||||
survey_path <- "data/_25_Million_Trees_Initiative_Survey_0.csv"
|
|
||||||
survey_data <- read_csv(survey_path)
|
|
||||||
|
|
||||||
species_path <- "data/species_planted_4.csv"
|
|
||||||
species_data <- read_csv(species_path)
|
|
||||||
|
|
||||||
# Convert the CreationDate field to a proper datetime object (if applicable)
|
|
||||||
survey_data <- survey_data %>%
|
|
||||||
mutate(CreationDate = mdy_hms(CreationDate))
|
|
||||||
|
|
||||||
# Count the records to be excluded (Exclude Result == 1)
|
|
||||||
excluded_count <- survey_data %>%
|
|
||||||
filter(`Exclude Result` == 1) %>%
|
|
||||||
nrow()
|
|
||||||
|
|
||||||
# Count the records that are used (Exclude Result == 0)
|
|
||||||
used_count <- survey_data %>%
|
|
||||||
filter(`Exclude Result` == 0) %>%
|
|
||||||
nrow()
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
subtitle: "`r format(min(survey_data$CreationDate, na.rm = TRUE), "%B %d, %Y")` to `r format(max(survey_data$CreationDate, na.rm = TRUE), "%B %d, %Y")`."
|
|
||||||
---
|
|
||||||
|
|
||||||
# Report Overview {.tabset}
|
|
||||||
[Back to Top](#)
|
|
||||||
|
|
||||||
## Background
|
|
||||||
|
|
||||||
The **25 Million Trees Initiative** is a bold commitment launched by **Governor Kathy Hochul** during the 2024 State of the State Address, aiming to plant 25 million trees by 2033 in New York State. This initiative recognizes the critical importance of trees and forests for climate mitigation, enhancing community health, and supporting biodiversity. The New York State Department of Environmental Conservation (DEC) is at the forefront of tracking the progress of this ambitious goal.
|
|
||||||
|
|
||||||
As part of this effort, DEC has launched the **Tree Tracker**, a tool for the public to record the trees they plant. These submissions contribute valuable data on the number, type, and locations of trees being planted across the state, helping to build a comprehensive, real-time dashboard of tree planting activities.
|
|
||||||
|
|
||||||
This report compiles the survey data collected via the Tree Tracker and provides detailed insights into the information submitted by New Yorkers. It aims to support DEC staff and executives in understanding the progress of the initiative and identifying areas for improvement in outreach and engagement.
|
|
||||||
|
|
||||||
## Purpose & Objectives
|
|
||||||
|
|
||||||
This report serves to present an overview of the data collected through the 25 Million Trees Initiative, offering insights into submission patterns, geographic distribution, and trends in tree planting activities. The report aims to:
|
|
||||||
|
|
||||||
- Summarize the overall progress of the initiative.
|
|
||||||
- Provide detailed data analysis on the submitted tree planting information.
|
|
||||||
- Identify areas where more outreach or support may be needed.
|
|
||||||
|
|
||||||
As more individuals contribute their data to the Tree Tracker, the initiative's success will be better understood, and DEC can better align resources to further promote this critical program.
|
|
||||||
|
|
||||||
## Survey Period & Exclusions
|
|
||||||
|
|
||||||
The report covers the survey period from **`r format(min(survey_data$CreationDate, na.rm = TRUE), "%B %d, %Y")`** to **`r format(max(survey_data$CreationDate, na.rm = TRUE), "%B %d, %Y")`**, including a total of **`r nrow(survey_data)`** records. Out of these, **`r used_count`** records were deemed valid and included in the analysis.
|
|
||||||
|
|
||||||
Exclusions were applied to **`r excluded_count`** records, which were removed for reasons such as:
|
|
||||||
|
|
||||||
- **Double Count**: Some submissions were identified as duplicates and excluded to prevent data redundancy.
|
|
||||||
- **Test Data**: Entries that were intended solely for testing purposes were excluded, as they do not represent actual survey data.
|
|
||||||
|
|
||||||
These excluded records are marked with a value of **1** in the `Exclude Result` field. The remaining **`r used_count`** records, marked with a **0**, represent legitimate data points that were included in the analysis.
|
|
||||||
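As a rough illustration of the exclusion rule described above, the following sketch splits a small made-up data frame on the `Exclude Result` flag. The toy data and the names `toy_survey`, `valid_records`, and `excluded_records` are assumptions for illustration only; just the `Exclude Result` and `Number of Trees Planted` column names come from the survey export.

```r
# Toy stand-in for the survey export (all values are invented for illustration)
toy_survey <- data.frame(
  ObjectID = 1:5,
  `Exclude Result` = c(0, 1, 0, 0, 1),
  `Number of Trees Planted` = c(10, 500, 3, 25, 1),
  check.names = FALSE  # keep the spaces in the column names
)

# Records flagged 1 (duplicates, test data) are set aside; records flagged 0 are kept
valid_records <- toy_survey[toy_survey$`Exclude Result` == 0, ]
excluded_records <- toy_survey[toy_survey$`Exclude Result` == 1, ]

nrow(valid_records)                           # records kept for analysis
nrow(excluded_records)                        # records set aside
sum(valid_records$`Number of Trees Planted`)  # tree total over valid records only
```

A filter of this kind could be applied to `survey_data` before summarizing if excluded submissions should be kept out of the charts and tables.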
|
|
||||||
## Validation & Data Consistency
|
|
||||||
|
|
||||||
To ensure data integrity, several validation steps are applied to survey submissions:
|
|
||||||
|
|
||||||
- **Required Fields**:
|
|
||||||
- **Who Planted the Tree(s)?**: Describes the participant's role in the tree planting effort.
|
|
||||||
- **Number of Trees**: The number of trees planted during the planting period.
|
|
||||||
- **Start Date of Planting**: The date when planting began.
|
|
||||||
- **End Date of Planting**: The date when planting was completed.
|
|
||||||
- **Location**: Geographic coordinates (latitude and longitude).
|
|
||||||
|
|
||||||
- **Response Validation**:
|
|
||||||
- **Geographic Validation**: Once geographic coordinates are entered, they are checked against official civil boundaries to provide an accurate nominal locality, county, and region. In rare cases, this check may fail because of a service dependency, but such records are corrected before inclusion in the analysis.
|
|
||||||
- **Date Validation and Logic**: Users cannot enter planting dates prior to the start date of the initiative. The system enforces this restriction, and any records with such dates are not allowed to be submitted. Additionally, users cannot enter a planting end date that occurs before the planting start date.
|
|
||||||
- **Optional Questions**: Even optional questions undergo validation to ensure the entered data meets the expected format or logic, providing further consistency and accuracy.
|
|
||||||
- **Email Format**: The email addresses entered in the survey are validated to ensure they follow the correct format.
|
|
||||||
|
|
||||||
Together, these validation checks help ensure the integrity and consistency of the data, allowing for meaningful analysis of tree planting surveys.
|
|
||||||
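The date and email rules listed above are enforced by the survey form itself. The sketch below only approximates them in R, to make the logic concrete. The helper `validate_submission()`, the placeholder `initiative_start` date, and the loose email pattern are all assumptions for illustration and do not reflect how the survey tool actually implements its checks.

```r
# Hypothetical helper: returns a character vector of problems found in one submission.
# initiative_start is an assumed placeholder, not the official start date of the initiative.
validate_submission <- function(start_date, end_date, email = NA,
                                initiative_start = as.Date("2024-01-01")) {
  problems <- character(0)
  if (start_date < initiative_start) {
    problems <- c(problems, "planting start date precedes the initiative start date")
  }
  if (end_date < start_date) {
    problems <- c(problems, "planting end date precedes the planting start date")
  }
  # Email is optional, so it is only checked when provided; the pattern is deliberately loose
  if (!is.na(email) &&
      !grepl("^[^@[:space:]]+@[^@[:space:]]+\\.[^@[:space:]]+$", email)) {
    problems <- c(problems, "email address is not in a valid format")
  }
  problems
}

ok  <- validate_submission(as.Date("2024-05-01"), as.Date("2024-05-03"), "planter@example.org")
bad <- validate_submission(as.Date("2024-05-03"), as.Date("2024-05-01"), "not-an-email")

length(ok)  # no problems reported
bad         # reports the date-order problem and the email-format problem
```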
|
|
||||||
# Submission Analysis {.tabset}
|
|
||||||
[Back to Top](#)
|
|
||||||
|
|
||||||
## Day of Week
|
|
||||||
The bar chart below shows the number of survey submissions by day of the week. Each bar represents the number of submissions received on that day, with the x-axis showing the days (Monday through Sunday) and the y-axis showing the submission count.
|
|
||||||
|
|
||||||
This chart helps identify any trends in survey participation, such as whether submissions are more frequent at the beginning or end of the week. This could be valuable for understanding user behavior and improving survey timing or outreach strategies.
|
|
||||||
|
|
||||||
```{r submission-histogram-survey-submissions-day-of-week, echo=TRUE, message=FALSE, fig.height=6, fig.width=8}
|
|
||||||
# Color palette
|
|
||||||
color_palette <- c(
|
|
||||||
primary = "#233f28",
|
|
||||||
secondary = "#7e9084",
|
|
||||||
tertiary = "#d9e1dd",
|
|
||||||
accent = "#face00"
|
|
||||||
)
|
|
||||||
|
|
||||||
library(dplyr)
|
|
||||||
library(ggplot2)
|
|
||||||
|
|
||||||
# Count submissions by day of the week and plot them as a bar chart
|
|
||||||
survey_data %>%
|
|
||||||
mutate(DayOfWeek = factor(weekdays(CreationDate),
|
|
||||||
levels = c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"))) %>% # Set order of days
|
|
||||||
ggplot(aes(x = DayOfWeek)) +
|
|
||||||
geom_bar(stat = "count", fill = color_palette["primary"], color = "black") + # Use primary color
|
|
||||||
geom_text(aes(label = after_stat(count)), stat = "count", vjust = -0.25, size = 5, color = color_palette["accent"]) + # Use accent color for text
|
|
||||||
xlab("Day of the Week") +
|
|
||||||
ylab("Number of Submissions") +
|
|
||||||
ggtitle("Submissions by Day of the Week") + # Add title
|
|
||||||
theme_minimal() + # Use a cleaner theme
|
|
||||||
theme(axis.text.x = element_text(angle = 45, hjust = 1, size = 12), # Adjust text size for better readability
|
|
||||||
plot.title = element_text(size = 16, hjust = 0.5)) # Center title and adjust size
|
|
||||||
|
|
||||||
```
|
|
||||||
|
|
||||||
```{r func-plot_submission_trends, echo=TRUE}
|
|
||||||
# Load necessary libraries
|
|
||||||
library(tidyverse)
|
|
||||||
|
|
||||||
# Custom color palette
|
|
||||||
color_palette <- c(
|
|
||||||
"#233f28", # primary
|
|
||||||
"#7e9084", # secondary
|
|
||||||
"#d9e1dd", # tertiary
|
|
||||||
"#face00" # accent
|
|
||||||
)
|
|
||||||
|
|
||||||
survey_data$CreationDate <- as.Date(survey_data$CreationDate)
|
|
||||||
|
|
||||||
# Define the function to plot survey submission trends
|
|
||||||
plot_submission_trends <- function(data, days_ago = 30) {
|
|
||||||
|
|
||||||
# Calculate the start date (days_ago days before today)
|
|
||||||
start_date <- Sys.Date() - days_ago
|
|
||||||
|
|
||||||
# Filter the data based on the calculated start date (up to today)
|
|
||||||
submission_trends <- data %>%
|
|
||||||
filter(CreationDate >= start_date) %>%
|
|
||||||
group_by(CreationDate) %>%
|
|
||||||
summarize(submissions = n())
|
|
||||||
|
|
||||||
# Create the plot
|
|
||||||
ggplot(submission_trends, aes(x = CreationDate, y = submissions)) +
|
|
||||||
geom_line(color = color_palette[1], linewidth = 1) + # Line color from palette
|
|
||||||
geom_point(color = color_palette[1], size = 3, shape = 16) + # Points for visibility
|
|
||||||
labs(
|
|
||||||
title = "Survey Submission Trends by Date",
|
|
||||||
subtitle = paste("Tracking submissions for the last", days_ago, "days"),
|
|
||||||
x = "Submission Date",
|
|
||||||
y = "Number of Submissions"
|
|
||||||
) +
|
|
||||||
theme_minimal() +
|
|
||||||
theme(
|
|
||||||
plot.title = element_text(hjust = 0.5, face = "bold", size = 16),
|
|
||||||
plot.subtitle = element_text(hjust = 0.5, size = 12, color = "grey40"),
|
|
||||||
axis.title.x = element_text(color = "black", size = 12),
|
|
||||||
axis.title.y = element_text(color = "black", size = 12),
|
|
||||||
axis.text = element_text(color = "black", size = 10),
|
|
||||||
panel.grid.major = element_line(color = "grey90"),
|
|
||||||
panel.grid.minor = element_blank(),
|
|
||||||
axis.text.x = element_text(angle = 45, hjust = 1) # Rotate x-axis labels
|
|
||||||
) +
|
|
||||||
# Add a smoothed trend line (loess)
|
|
||||||
geom_smooth(method = "loess", color = color_palette[4], linewidth = 1, linetype = "dashed")
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
## 30 Day Trend
|
|
||||||
The plot below visualizes the survey submission trends for the past 30 days. It shows the number of submissions made each day, highlighting variations over the last month. This type of plot is helpful for understanding trends in user activity, such as identifying peak submission days, periods of low activity, or gradual changes over time.
|
|
||||||
|
|
||||||
The data used for this plot is filtered to include only submissions made in the last 30 days, with the submission count for each date represented by both the line and the points on the graph. A smoothed trend line (dashed) has been added to help visualize the overall submission pattern over this period.
|
|
||||||
|
|
||||||
```{r plot-submission-trends-30d, echo=TRUE, message=FALSE, fig.height=6, fig.width=8}
|
|
||||||
plot_submission_trends(survey_data, days_ago = 30)
|
|
||||||
```
|
|
||||||
|
|
||||||
## 90 Day Trend
|
|
||||||
The plot below visualizes the survey submission trends for the past 90 days. It shows the number of submissions made each day, highlighting variations over the last three months. This type of plot is helpful for understanding trends in user activity, such as identifying peak submission days, periods of low activity, or gradual changes over time.
|
|
||||||
|
|
||||||
The data used for this plot is filtered to include only submissions made in the last 90 days, with the submission count for each date represented by both the line and the points on the graph. A smoothed trend line (dashed) has been added to help visualize the overall submission pattern over this period.
|
|
||||||
|
|
||||||
```{r plot-submission-trends-90d, echo=TRUE, message=FALSE, fig.height=6, fig.width=8}
|
|
||||||
plot_submission_trends(survey_data, days_ago = 90)
|
|
||||||
```
|
|
||||||
|
|
||||||
## Optional Question Response Rates
|
|
||||||
The table below summarizes the response rates for the key optional top-level questions in the survey. These questions are asked of all participants, and some trigger follow-up questions depending on the response. The response rate is the percentage of participants who answered each question.
|
|
||||||
|
|
||||||
The "Total Number of Species Planted" question has special handling—only responses greater than 0 are considered valid, whereas for other questions, any non-NA value counts as a response.
|
|
||||||
|
|
||||||
```{r optonal-top-level-question-response-rate-table, echo=TRUE, message=FALSE, fig.height=6, fig.width=8}
|
|
||||||
# List of fields to check for response rates, with special handling for 'Total Number of Species Planted'
|
|
||||||
fields <- c("Planter Contact Email", "Funding Source", "Land Ownership",
|
|
||||||
"Tree Size Planted", "Source of Trees", "Total Number of Species Planted")
|
|
||||||
|
|
||||||
# Calculate the response rate for each field
|
|
||||||
response_rates <- sapply(fields, function(field) {
|
|
||||||
if (field == "Total Number of Species Planted") {
|
|
||||||
# For "Total Number of Species Planted", consider answered if value is greater than 0
|
|
||||||
sum(survey_data[[field]] > 0, na.rm = TRUE) / nrow(survey_data) * 100
|
|
||||||
} else {
|
|
||||||
# For other fields, check for non-NA values
|
|
||||||
sum(!is.na(survey_data[[field]])) / nrow(survey_data) * 100
|
|
||||||
}
|
|
||||||
})
|
|
||||||
|
|
||||||
# Round the response rates to 2 decimal places
|
|
||||||
response_rates_rounded <- round(response_rates, 2)
|
|
||||||
|
|
||||||
# Sort the response rates in descending order (highest to lowest)
|
|
||||||
sorted_response_rates <- sort(response_rates_rounded, decreasing = TRUE)
|
|
||||||
|
|
||||||
# Create a clean data frame with the field names and their response rates
|
|
||||||
response_rate_table <- data.frame(
|
|
||||||
"Field" = names(sorted_response_rates),
|
|
||||||
"Response Rate (%)" = sorted_response_rates,
|
|
||||||
stringsAsFactors = FALSE # Ensure the "Field" column is treated as character, not factor
|
|
||||||
)
|
|
||||||
|
|
||||||
# Remove the row names (the extra column that appears as a result of conversion)
|
|
||||||
rownames(response_rate_table) <- NULL
|
|
||||||
|
|
||||||
# Fix column names to ensure proper headers
|
|
||||||
colnames(response_rate_table) <- c("Field", "Response Rate (%)")
|
|
||||||
|
|
||||||
# Display the table using kable for better formatting
|
|
||||||
library(knitr)
|
|
||||||
kable(response_rate_table, caption = "Response Rates for Key Survey Questions", align = "l")
|
|
||||||
```
|
|
||||||
|
|
||||||
The following provides additional context for each survey question/field, detailing what the percentage represents.
|
|
||||||
|
|
||||||
- **Planter Contact Email**: The percentage of respondents who provided their email address.
|
|
||||||
- **Funding Source**: The percentage of respondents who identified their funding source.
|
|
||||||
- **Land Ownership**: The percentage of respondents who indicated their land ownership status.
|
|
||||||
- **Tree Size Planted**: The percentage of respondents who specified the size of trees they planted.
|
|
||||||
- **Source of Trees**: The percentage of respondents who reported the source of the trees they planted.
|
|
||||||
- **Total Number of Species Planted**: The percentage of respondents who reported at least one species for the tree(s) they planted.
|
|
||||||
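To make the two counting rules concrete, here is a small worked example on invented data. The data frame `toy_responses` and its values are assumptions for illustration; only the column names and the "greater than 0" rule come from the report.

```r
# Invented responses from four participants (NA = question left blank)
toy_responses <- data.frame(
  `Funding Source` = c("grant", NA, "private", NA),
  `Total Number of Species Planted` = c(2, 0, NA, 0),
  check.names = FALSE  # keep the spaces in the column names
)

n <- nrow(toy_responses)

# Ordinary optional question: any non-NA value counts as a response
funding_rate <- sum(!is.na(toy_responses$`Funding Source`)) / n * 100

# Species count: only values greater than 0 count as a response,
# so the two zeros are treated the same as a blank answer
species_rate <- sum(toy_responses$`Total Number of Species Planted` > 0, na.rm = TRUE) / n * 100

funding_rate  # 2 of 4 answered: 50
species_rate  # 1 of 4 reported a species: 25
```

Under the plain non-NA rule the species question would score 75% here, which is why it is handled separately.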
|
|
||||||
# Participant Analysis {.tabset}
|
|
||||||
[Back to Top](#)
|
|
||||||
|
|
||||||
The following section contains an analysis of tree planting by participant type.
|
|
||||||
|
|
||||||
## Submissions
|
|
||||||
The following plot shows the distribution of survey submissions based on participant type. This breakdown highlights the contributions of each participant group to the tree planting initiative.
|
|
||||||
|
|
||||||
```{r participant-type-surveys, echo=TRUE, message=FALSE, fig.height=6, fig.width=8}
|
|
||||||
ggplot(survey_data, aes(x = `Who Planted The Tree(s)?`)) +
|
|
||||||
geom_bar(fill = "#233f28", color = "#7e9084") +
|
|
||||||
geom_text(stat = "count", aes(label = scales::comma(after_stat(count))),
|
|
||||||
position = position_stack(vjust = 0.5), # Places text in the middle of the bars
|
|
||||||
color = "#face00", size = 5) + # Use accent color for text labels
|
|
||||||
labs(
|
|
||||||
title = "Distribution of Tree Planting Submissions by Participant Type",
|
|
||||||
x = "Participant Type",
|
|
||||||
y = "Number of Submissions"
|
|
||||||
) +
|
|
||||||
scale_x_discrete(labels = c(
|
|
||||||
"agency" = "State Agency",
|
|
||||||
"community" = "Community Organization",
|
|
||||||
"landowner" = "Private Landowner",
|
|
||||||
"municipality" = "Municipal Government",
|
|
||||||
"professional" = "Paid Professional"
|
|
||||||
)) +
|
|
||||||
theme_minimal(base_size = 14) +
|
|
||||||
theme(
|
|
||||||
plot.title = element_text(size = 16, face = "bold", color = "#233f28"),
|
|
||||||
axis.title = element_text(size = 12, color = "#233f28"),
|
|
||||||
axis.text = element_text(size = 10, color = "#233f28"),
|
|
||||||
plot.margin = margin(10, 10, 10, 10),
|
|
||||||
panel.grid.major = element_line(color = "#d9e1dd", linewidth = 0.3),
|
|
||||||
panel.background = element_rect(fill = "#d9e1dd"),
|
|
||||||
axis.text.x = element_text(angle = 45, hjust = 1)
|
|
||||||
)
|
|
||||||
|
|
||||||
```
|
|
||||||
|
|
||||||
## Trees Planted
|
|
||||||
This plot visualizes the total number of trees planted by each participant type, helping to evaluate the overall impact of different groups in the tree planting program.
|
|
||||||
|
|
||||||
```{r participant-type-planted, echo=TRUE, message=FALSE, fig.height=6, fig.width=8}
|
|
||||||
|
|
||||||
summary_data <- survey_data %>%
|
|
||||||
group_by(`Who Planted The Tree(s)?`) %>%
|
|
||||||
summarise(total_trees = sum(`Number of Trees Planted`, na.rm = TRUE))
|
|
||||||
|
|
||||||
ggplot(summary_data, aes(x = `Who Planted The Tree(s)?`, y = total_trees)) +
|
|
||||||
geom_bar(stat = "identity", fill = "#233f28", color = "#7e9084") +
|
|
||||||
geom_text(aes(label = scales::comma(total_trees)),
|
|
||||||
position = position_stack(vjust = 0.5), # Places text in the middle of the bars
|
|
||||||
color = "#face00", size = 5) + # Accent color for text labels
|
|
||||||
labs(
|
|
||||||
title = "Contribution of Each Participant Type to Total Trees Planted",
|
|
||||||
x = "Participant Type",
|
|
||||||
y = "Total Number of Trees Planted"
|
|
||||||
) +
|
|
||||||
scale_x_discrete(labels = c(
|
|
||||||
"agency" = "State Agency",
|
|
||||||
"community" = "Community Organization",
|
|
||||||
"landowner" = "Private Landowner",
|
|
||||||
"municipality" = "Municipal Government",
|
|
||||||
"professional" = "Paid Professional"
|
|
||||||
)) +
|
|
||||||
theme_minimal(base_size = 14) + # Adjusted base font size for clarity
|
|
||||||
theme(
|
|
||||||
plot.title = element_text(size = 16, face = "bold", color = "#233f28"),
|
|
||||||
axis.title = element_text(size = 12, color = "#233f28"),
|
|
||||||
axis.text = element_text(size = 10, color = "#233f28"),
|
|
||||||
plot.margin = margin(10, 10, 10, 10),
|
|
||||||
panel.grid.major = element_line(color = "#d9e1dd", linewidth = 0.3),
|
|
||||||
panel.background = element_rect(fill = "#d9e1dd"),
|
|
||||||
axis.text.x = element_text(angle = 45, hjust = 1)
|
|
||||||
)
|
|
||||||
```
|
|
||||||
|
|
||||||
The following table provides a breakdown of the total number of trees planted by participant type. It shows both the total number of trees planted by each group and their proportional contribution to the overall planting efforts. This information helps assess which participant types have contributed the most to the tree planting program.
|
|
||||||
|
|
||||||
```{r participant-type-table, echo=TRUE, message=FALSE, fig.height=6, fig.width=8}
|
|
||||||
# Summarize the data to calculate the total number of trees planted by participant type
|
|
||||||
summary_data <- survey_data %>%
|
|
||||||
group_by(`Who Planted The Tree(s)?`) %>%
|
|
||||||
summarise(total_trees = sum(`Number of Trees Planted`, na.rm = TRUE))
|
|
||||||
# Replace the participant type values with more readable labels
|
|
||||||
summary_data <- summary_data %>%
|
|
||||||
mutate(
|
|
||||||
`Who Planted The Tree(s)?` = recode(`Who Planted The Tree(s)?`,
|
|
||||||
"agency" = "State Agency",
|
|
||||||
"community" = "Community Organization",
|
|
||||||
"landowner" = "Private Landowner",
|
|
||||||
"municipality" = "Municipal Government",
|
|
||||||
"professional" = "Paid Professional")
|
|
||||||
)
|
|
||||||
|
|
||||||
# Add percentage column
|
|
||||||
summary_data <- summary_data %>%
|
|
||||||
mutate(percentage = total_trees / sum(total_trees) * 100)
|
|
||||||
|
|
||||||
# Format the table to display the number of trees and percentage
|
|
||||||
summary_data_formatted <- summary_data %>%
|
|
||||||
mutate(
|
|
||||||
total_trees = scales::comma(total_trees), # Add commas to the total number of trees
|
|
||||||
percentage = paste0(round(percentage, 1), "%") # Round percentage and append '%'
|
|
||||||
)
|
|
||||||
|
|
||||||
summary_data_formatted %>%
|
|
||||||
knitr::kable(col.names = c("Participant Type", "Total Trees Planted", "Percentage of Total Trees"),
|
|
||||||
caption = "Breakdown of Total Trees Planted by Participant Type and Their Contribution to the Overall Tree Planting Effort",
|
|
||||||
align = c("l", "c", "c")) %>% # Align Participant Type left, and others center
|
|
||||||
kableExtra::kable_styling(
|
|
||||||
full_width = FALSE,
|
|
||||||
position = "center",
|
|
||||||
bootstrap_options = c("striped", "hover"),
|
|
||||||
font_size = 14,
|
|
||||||
fixed_thead = TRUE
|
|
||||||
) %>%
|
|
||||||
kableExtra::column_spec(1, width = "20em", bold = TRUE) %>% # Participant Type column bold and wider
|
|
||||||
kableExtra::column_spec(2, width = "12em", color = "black") %>% # Total Trees column
|
|
||||||
kableExtra::column_spec(3, width = "12em", color = "black") %>% # Percentage column
|
|
||||||
kableExtra::add_footnote("Total number of trees and percentage represent each participant's contribution to the overall tree planting effort.")
|
|
||||||
|
|
||||||
```
|
|
||||||
|
|
||||||
## User Activity
|
|
||||||
```{r named-user-table}
|
|
||||||
library(tidyverse)
|
|
||||||
|
|
||||||
user_activity <- survey_data %>%
|
|
||||||
mutate(Creator = ifelse(is.na(Creator), "Public User", Creator)) %>%
|
|
||||||
group_by(Creator) %>%
|
|
||||||
summarise(
|
|
||||||
record_count = n(),
|
|
||||||
total_trees_planted = sum(`Number of Trees Planted`, na.rm = TRUE)
|
|
||||||
)
|
|
||||||
knitr::kable(user_activity, caption = "Survey Submissions by Named User", align = "l")
|
|
||||||
```
|
|
||||||
|
|
||||||
## Participant Activity
|
|
||||||
```{r participant-email-table}
|
|
||||||
library(tidyverse)
|
|
||||||
|
|
||||||
user_activity_email <- survey_data %>%
|
|
||||||
mutate(`Planter Contact Email` = ifelse(is.na(`Planter Contact Email`), "Not Provided", `Planter Contact Email`)) %>% # Label missing emails in the grouping column itself
|
|
||||||
group_by(`Planter Contact Email`) %>%
|
|
||||||
summarise(
|
|
||||||
record_count = n(),
|
|
||||||
total_trees_planted = sum(`Number of Trees Planted`, na.rm = TRUE)
|
|
||||||
)
|
|
||||||
knitr::kable(user_activity_email, caption = "Survey Submissions by E-mail", align = "l")
|
|
||||||
```
|
|
||||||
|
|
||||||
# Location Analysis {.tabset}
|
|
||||||
[Back to Top](#)
|
|
||||||
|
|
||||||
```{r func-create_summary_table, echo=TRUE}
|
|
||||||
create_summary_table <- function(data, field) {
|
|
||||||
# Summarize the data based on the field provided
|
|
||||||
summary_data <- data %>%
|
|
||||||
group_by(!!sym(field)) %>% # Dynamically use the provided field name
|
|
||||||
summarise(
|
|
||||||
submissions = n(), # Count of submissions
|
|
||||||
total_trees = sum(`Number of Trees Planted`, na.rm = TRUE) # Sum of trees planted
|
|
||||||
) %>%
|
|
||||||
mutate(
|
|
||||||
submissions_percentage = submissions / sum(submissions) * 100, # Proportion of submissions
|
|
||||||
trees_percentage = total_trees / sum(total_trees) * 100 # Proportion of trees planted
|
|
||||||
)
|
|
||||||
|
|
||||||
# Format the table to display commas for the totals and round percentages
|
|
||||||
summary_data_formatted <- summary_data %>%
|
|
||||||
mutate(
|
|
||||||
submissions = scales::comma(submissions),
|
|
||||||
total_trees = scales::comma(total_trees),
|
|
||||||
submissions_percentage = paste0(round(submissions_percentage, 1), "%"),
|
|
||||||
trees_percentage = paste0(round(trees_percentage, 1), "%")
|
|
||||||
)
|
|
||||||
|
|
||||||
# Create and style the table
|
|
||||||
summary_data_formatted %>%
|
|
||||||
knitr::kable(col.names = c(field, "Number of Submissions", "Number of Trees Planted", "Proportion of Submissions (%)", "Proportion of Trees Planted (%)"),
|
|
||||||
caption = paste("Summary of Submissions and Trees Planted by", field),
|
|
||||||
align = c("l", "c", "c", "c", "c")) %>%
|
|
||||||
kableExtra::kable_styling(
|
|
||||||
full_width = FALSE,
|
|
||||||
position = "center",
|
|
||||||
bootstrap_options = c("striped", "hover"),
|
|
||||||
font_size = 14
|
|
||||||
) %>%
|
|
||||||
kableExtra::column_spec(1, width = "20em", bold = TRUE) %>% # First column bold and wider
|
|
||||||
kableExtra::column_spec(2, width = "12em") %>% # Number of Submissions column
|
|
||||||
kableExtra::column_spec(3, width = "12em") %>% # Number of Trees Planted column
|
|
||||||
kableExtra::add_footnote("The proportions represent the percentage of submissions and trees planted for each category relative to the overall dataset.")
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
## By Region
|
|
||||||
```{r create-summary-table-region, echo=TRUE, message=FALSE}
|
|
||||||
create_summary_table(survey_data, "Region")
|
|
||||||
```
|
|
||||||
|
|
||||||
## By County
|
|
||||||
This map displays the **total number of trees planted** across each county in **New York State**. The counties are color-coded, with darker shades representing areas where more trees have been planted. This allows users to quickly see which counties have had the most extensive tree planting efforts.
|
|
||||||
|
|
||||||
- **What to look for**:
|
|
||||||
- **Dark colors**: Indicate counties with a higher number of trees planted.
|
|
||||||
- **Lighter colors**: Represent counties with fewer trees planted.
|
|
||||||
|
|
||||||
The map provides a visual overview of tree planting distribution across New York, making it easier to identify areas with the highest impact or need for further action.
|
|
||||||
|
|
||||||
```{r create-county-choropleth-map, echo=TRUE, message=FALSE, fig.height=6, fig.width=8}
|
|
||||||
library(tigris) # For geographic data
|
|
||||||
library(sf) # For handling spatial data
|
|
||||||
library(dplyr) # For data manipulation
|
|
||||||
library(ggplot2) # For plotting
|
|
||||||
library(viridis) # For a color palette in the map
|
|
||||||
|
|
||||||
# Download New York State counties shapefile
|
|
||||||
ny_counties <- counties(state = "NY", cb = TRUE, progress_bar = FALSE) # tigris returns an sf object, so no conversion is needed
|
|
||||||
|
|
||||||
survey_data_aggregated <- survey_data %>%
|
|
||||||
group_by(County) %>%
|
|
||||||
summarise(total_trees = sum(`Number of Trees Planted`, na.rm = TRUE))
|
|
||||||
|
|
||||||
ny_counties_merged <- ny_counties %>%
|
|
||||||
left_join(survey_data_aggregated, by = c("NAME" = "County"))
|
|
||||||
|
|
||||||
# Get the system date and format it
|
|
||||||
current_date <- format(Sys.Date(), "%B %d, %Y") # Format as "Month Day, Year"
|
|
||||||
|
|
||||||
ggplot(data = ny_counties_merged) +
|
|
||||||
geom_sf(aes(fill = total_trees), color = "white") +
|
|
||||||
scale_fill_viridis_c(option = "plasma", direction = -1) + # Reverse the plasma scale so darker shades mean more trees
|
|
||||||
theme_minimal() +
|
|
||||||
labs(title = "Number of Trees Planted by County in New York",
|
|
||||||
fill = "Total Trees Planted") +
|
|
||||||
theme(axis.text = element_blank(), axis.title = element_blank()) +
|
|
||||||
annotate("text", x = -77.25, y = 45.25, label = paste("Date:", current_date), size = 4, hjust = 1, color = "black")
|
|
||||||
|
|
||||||
```
|
|
||||||
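The map depends on a left join between the county shapes and the per-county totals, matched on county name. The sketch below shows the same join logic on invented data using base R `merge()` rather than `left_join()`. The data frames `toy_counties` and `toy_totals` and their values are assumptions for illustration.

```r
# Invented county list standing in for the shapefile attribute table
toy_counties <- data.frame(NAME = c("Albany", "Erie", "Essex"))

# Invented per-county totals; Essex has no submissions
toy_totals <- data.frame(County = c("Albany", "Erie"), total_trees = c(120, 45))

# Left join: keep every county and attach a total wherever the names match exactly
toy_merged <- merge(toy_counties, toy_totals,
                    by.x = "NAME", by.y = "County", all.x = TRUE)

toy_merged
toy_merged$total_trees[toy_merged$NAME == "Essex"]  # NA: no submissions recorded
```

Counties without a matching total come out as `NA` and are drawn in the fill scale's missing-value color (grey by default in ggplot2) rather than as zero. A county name that is spelled differently in the survey data would also fail to match and show up the same way.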
|
|
||||||
```{r create-summary-table-county, echo=TRUE, message=FALSE}
|
|
||||||
create_summary_table(survey_data, "County")
|
|
||||||
```
|
|
||||||
|
|
||||||
# Tree Analysis {.tabset}
|
|
||||||
[Back to Top](#)
|
|
||||||
```{r func-create_species_summary_table, echo=TRUE}
|
|
||||||
create_species_summary_table <- function(data, field, field_label = NULL) {
|
|
||||||
# Replace empty strings and NA values with "Not Provided" before summarization
|
|
||||||
data <- data %>%
|
|
||||||
mutate(
|
|
||||||
!!sym(field) := ifelse(!!sym(field) == "" | is.na(!!sym(field)), "Not Provided", !!sym(field)) # Replace empty strings and NAs
|
|
||||||
)
|
|
||||||
|
|
||||||
# Clean up the species names: replace underscores with spaces and convert to title case
|
|
||||||
data <- data %>%
|
|
||||||
mutate(
|
|
||||||
!!sym(field) := gsub("_", " ", !!sym(field)), # Replace underscores with spaces
|
|
||||||
!!sym(field) := tools::toTitleCase(!!sym(field)) # Convert to title case
|
|
||||||
)
|
|
||||||
|
|
||||||
# Summarize the data based on the field (e.g., "Generic Species of Tree")
|
|
||||||
summary_data <- data %>%
|
|
||||||
group_by(!!sym(field)) %>%
|
|
||||||
summarise(
|
|
||||||
submissions = n(), # Count of surveys for each species (or category)
|
|
||||||
.groups = "drop" # To prevent issues with group structure
|
|
||||||
) %>%
|
|
||||||
mutate(
|
|
||||||
submissions_percentage = submissions / sum(submissions) * 100 # Proportion of surveys for each category
|
|
||||||
)
|
|
||||||
|
|
||||||
# Format the table for display
|
|
||||||
summary_data_formatted <- summary_data %>%
|
|
||||||
mutate(
|
|
||||||
submissions = scales::comma(submissions), # Format the submission counts with commas
|
|
||||||
submissions_percentage = paste0(round(submissions_percentage, 1), "%") # Round percentage and append '%'
|
|
||||||
)
|
|
||||||
|
|
||||||
# Determine the label for the field
|
|
||||||
label <- if (is.null(field_label)) field else field_label
|
|
||||||
|
|
||||||
# Create and style the table
|
|
||||||
summary_data_formatted %>%
|
|
||||||
knitr::kable(col.names = c(label, "Number of Surveys", "Proportion of Surveys (%)"),
|
|
||||||
caption = paste("Summary of Surveys by", label),
|
|
||||||
align = c("l", "c", "c")) %>% # Align the columns (left for the field, center for others)
|
|
||||||
kableExtra::kable_styling(
|
|
||||||
full_width = FALSE,
|
|
||||||
position = "center",
|
|
||||||
bootstrap_options = c("striped", "hover"),
|
|
||||||
font_size = 14
|
|
||||||
) %>%
|
|
||||||
kableExtra::column_spec(1, width = "20em", bold = TRUE) %>% # First column (Species) bold and wider
|
|
||||||
kableExtra::column_spec(2, width = "12em") %>% # Number of Surveys column
|
|
||||||
kableExtra::column_spec(3, width = "12em") %>% # Proportion column
|
|
||||||
kableExtra::add_footnote("The proportions represent the percentage of surveys for each category relative to the total surveys.")
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
## By Genus
|
|
||||||
|
|
||||||
The following table shows a breakdown of survey submissions by **Genus**. For each genus, the table provides:
|
|
||||||
|
|
||||||
1. **Number of Surveys**: The total number of surveys where this genus was reported.
|
|
||||||
2. **Proportion of Surveys (%)**: The percentage of total surveys that reported this genus, relative to the entire dataset.
|
|
||||||
3. **"Not Provided" Category**: Any surveys that did not specify a genus are grouped under the "Not Provided" category.
|
|
||||||
|
|
||||||
These figures show which genera are most commonly reported, how prevalent each genus is, and the proportion of surveys where no genus was specified.
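
The counting logic behind these three columns can be illustrated on a toy example. This is only a sketch in base R, not the report's own function: the data frame `toy` and its `genus` column are hypothetical stand-ins for the survey data, and the real table is produced by `create_species_summary_table()`.

```r
# Hypothetical toy data: one row per survey.
# NA and "" both mean that no genus was given.
toy <- data.frame(
  genus = c("acer", "quercus", "acer", NA, "", "acer"),
  stringsAsFactors = FALSE
)

# Group missing and empty values under "Not Provided".
toy$genus[is.na(toy$genus) | toy$genus == ""] <- "Not Provided"

# Count surveys per category, then express each count
# as a percentage of all surveys.
counts <- table(toy$genus)
summary_tbl <- data.frame(
  genus = names(counts),
  submissions = as.integer(counts),
  submissions_percentage = round(as.integer(counts) / sum(counts) * 100, 1)
)

print(summary_tbl)
```

Because the "Not Provided" rows stay in the denominator, the percentages always sum to roughly 100 across all categories, including the surveys with no genus.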
|
|
||||||
|
|
||||||
```{r create-summary-table-genus, echo=TRUE, message=FALSE}
|
|
||||||
create_species_summary_table(species_data, "Generic Species of Tree", "Tree Genus")
|
|
||||||
```
|
|
||||||
|
|
||||||
## By Species
|
|
||||||
|
|
||||||
The following table shows a breakdown of survey submissions by **Species**. For each species, the table provides:
|
|
||||||
|
|
||||||
1. **Number of Surveys**: The total number of surveys where this species was reported.
|
|
||||||
2. **Proportion of Surveys (%)**: The percentage of total surveys that reported this species, relative to the entire dataset.
|
|
||||||
3. **"Not Provided" Category**: Any surveys that did not specify a species are grouped under the "Not Provided" category.
|
|
||||||
|
|
||||||
These figures show which species are most commonly reported, how prevalent each species is, and the proportion of surveys where no species was specified.
|
|
||||||
|
|
||||||
```{r create-summary-table-species, echo=TRUE, message=FALSE}
|
|
||||||
create_species_summary_table(species_data, "Precise Species of Tree", "Tree Species")
|
|
||||||
```
|
|
||||||
=======
|
|
||||||
---
|
---
|
||||||
title: "25 Million Trees Initiative Survey Report"
|
title: "25 Million Trees Initiative Survey Report"
|
||||||
author:
|
author:
|
||||||
@ -1188,4 +591,3 @@ These figures provide an understanding of which species are most commonly report
|
|||||||
```{r create-summary-table-species, echo=TRUE, message=FALSE}
|
```{r create-summary-table-species, echo=TRUE, message=FALSE}
|
||||||
create_species_summary_table(species_data, "Precise Species of Tree", "Tree Species")
|
create_species_summary_table(species_data, "Precise Species of Tree", "Tree Species")
|
||||||
```
|
```
|
||||||
>>>>>>> a7b8cea9701d2c2cdc603ffbeb3f034416e37e38
|
|
||||||
|
|||||||