Update report.
parent
0a1d3a92a6
commit
25ba31bc3d
134
report.Rmd
@ -9,12 +9,20 @@ date: "`r format(Sys.Date(), '%B, %d, %Y')`"
|
||||
keywords: "keyword1, keyword2"
|
||||
output:
|
||||
html_document:
|
||||
toc: true
|
||||
toc_depth: 1
|
||||
toc_float: true
|
||||
number_sections: false
|
||||
css: custom.css
|
||||
code_folding: hide
|
||||
---
|
||||
|
||||
```{r setup, include=FALSE}
|
||||
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)
|
||||
knitr::opts_chunk$set(
|
||||
echo = TRUE,
|
||||
message = FALSE,
|
||||
warning = FALSE
|
||||
)
|
||||
# Load necessary libraries
|
||||
library(tidyverse)
|
||||
library(lubridate)
|
||||
@ -25,7 +33,7 @@ survey_path <- "data/_25_Million_Trees_Initiative_Survey_0.csv"
|
||||
survey_data <- read_csv(survey_path)
|
||||
|
||||
species_path <- "data/species_planted_4.csv"
|
||||
species_data <- read.csv(species_path)
|
||||
species_data <- read_csv(species_path)
|
||||
|
||||
# Convert the CreationDate field to a proper datetime object (if applicable)
|
||||
survey_data <- survey_data %>%
|
||||
@ -43,14 +51,13 @@ used_count <- survey_data %>%
|
||||
```
|
||||
|
||||
---
|
||||
abstract: "This report was generated on: **`r format(Sys.Date(), '%B, %d, %Y')`**. For the period beginning: **`r format(min(survey_data$CreationDate, na.rm = TRUE), "%B %d, %Y")`** and ending: **`r format(max(survey_data$CreationDate, na.rm = TRUE), "%B %d, %Y")`**. **`r used_count`** records were used in this analysis."
|
||||
subtitle: "`r format(min(survey_data$CreationDate, na.rm = TRUE), "%B %d, %Y")` to `r format(max(survey_data$CreationDate, na.rm = TRUE), "%B %d, %Y")`."
|
||||
---
|
||||
|
||||
# {.tabset .tabset-fade .tabset-pills}
|
||||
# Report Overview {.tabset}
|
||||
[Back to Top](#)
|
||||
|
||||
## Report Overview
|
||||
|
||||
### Background
|
||||
## Background
|
||||
|
||||
The **25 Million Trees Initiative** is a bold commitment launched by **Governor Kathy Hochul** during the 2024 State of the State Address, aiming to plant 25 million trees by 2033 in New York State. This initiative recognizes the critical importance of trees and forests for climate mitigation, enhancing community health, and supporting biodiversity. The New York State Department of Environmental Conservation (DEC) is at the forefront of tracking the progress of this ambitious goal.
|
||||
|
||||
@ -58,7 +65,7 @@ As part of this effort, DEC has launched the **Tree Tracker**, a tool for the pu
|
||||
|
||||
This report compiles the survey data collected via the Tree Tracker and provides detailed insights into the information submitted by New Yorkers. It aims to support DEC staff and executives in understanding the progress of the initiative and identifying areas for improvement in outreach and engagement.
|
||||
|
||||
### Purpose & Objectives
|
||||
## Purpose & Objectives
|
||||
|
||||
This report serves to present an overview of the data collected through the 25 Million Trees Initiative, offering insights into submission patterns, geographic distribution, and trends in tree planting activities. The report aims to:
|
||||
|
||||
@ -68,7 +75,7 @@ This report serves to present an overview of the data collected through the 25 M
|
||||
|
||||
As more individuals contribute their data to the Tree Tracker, the initiative's success will be better understood, and DEC can better align resources to further promote this critical program.
|
||||
|
||||
### Survey Period and Data Exclusions
|
||||
## Survey Period & Exclusions
|
||||
|
||||
The report covers the survey period from **`r format(min(survey_data$CreationDate, na.rm = TRUE), "%B %d, %Y")`** to **`r format(max(survey_data$CreationDate, na.rm = TRUE), "%B %d, %Y")`**, including a total of **`r nrow(survey_data)`** records. Out of these, **`r used_count`** records were deemed valid and included in the analysis.
|
||||
|
||||
@ -79,7 +86,7 @@ Exclusions were applied to **`r excluded_count`** records, which were removed du
|
||||
|
||||
These excluded records are marked with a value of **1** in the `Exclude Result` field. The remaining **`r used_count`** records, marked with a **0**, represent legitimate data points that were included in the analysis.
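
The split described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: `survey_demo` is a hypothetical stand-in for `survey_data`, and the report's own definitions of `used_count` and `excluded_count` are not shown in this diff, so they may be computed differently.

```r
# Hypothetical stand-in for survey_data; only the `Exclude Result` flag matters here.
survey_demo <- data.frame(
  record_id = 1:6,
  `Exclude Result` = c(0, 1, 0, 0, 1, 0),
  check.names = FALSE  # keep the space in the column name
)

# Records flagged 1 are excluded; records flagged 0 are used in the analysis.
excluded_count <- sum(survey_demo[["Exclude Result"]] == 1, na.rm = TRUE)
used_count <- sum(survey_demo[["Exclude Result"]] == 0, na.rm = TRUE)

# Keep only the records marked 0 (%in% treats a missing flag as "not 0").
analysis_data <- survey_demo[survey_demo[["Exclude Result"]] %in% 0, ]
```

Using `%in%` rather than `==` for the filter means a record with a missing flag is dropped instead of producing an `NA` row.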
|
||||
|
||||
### Survey Validation Process and Data Consistency
|
||||
## Validation & Data Consistency
|
||||
|
||||
To ensure data integrity, several validation steps are applied to survey submissions:
|
||||
|
||||
@ -98,10 +105,10 @@ To ensure data integrity, several validation steps are applied to survey submiss
|
||||
|
||||
Applying these validation checks ensures the integrity and consistency of the data, allowing for meaningful analysis of tree planting surveys.
|
||||
|
||||
# Submission Analysis {.tabset}
|
||||
[Back to Top](#)
|
||||
|
||||
## Submission Analysis {.tabset}
|
||||
|
||||
### Submissions by Day of Week
|
||||
## Day of Week
|
||||
The histogram presented below visualizes the number of survey submissions based on the day of the week. Each bar represents the frequency of submissions for a particular day, with the x-axis displaying the days (Monday through Sunday) and the y-axis showing the number of submissions for each corresponding day.
|
||||
|
||||
This chart helps identify any trends in survey participation, such as whether submissions are more frequent at the beginning or end of the week. This could be valuable for understanding user behavior and improving survey timing or outreach strategies.
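
As a rough sketch of the counting behind such a chart, the snippet below tallies submissions per weekday in base R. The dates are made-up sample values, and the report's actual plotting code is not shown in this diff, so treat the names here as assumptions.

```r
# Made-up submission dates standing in for survey_data$CreationDate.
creation_dates <- as.Date(c("2025-01-06", "2025-01-06", "2025-01-08", "2025-01-11"))

# as.POSIXlt()$wday is 0 for Sunday through 6 for Saturday, independent of locale.
wday_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
                "Thursday", "Friday", "Saturday")

# Order the factor Monday through Sunday so the x-axis reads like a work week.
day_levels <- c(wday_names[2:7], wday_names[1])
day_of_week <- factor(
  wday_names[as.POSIXlt(creation_dates)$wday + 1],
  levels = day_levels
)

# Days with no submissions still appear, with a count of zero.
submissions_by_day <- table(day_of_week)
```

Fixing the factor levels up front keeps empty days on the axis, which makes quiet days visible rather than silently dropped.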
|
||||
@ -139,7 +146,7 @@ survey_data %>%
|
||||
library(tidyverse)
|
||||
|
||||
# Custom color palette
|
||||
custom_palette <- c(
|
||||
color_palette <- c(
|
||||
"#233f28", # primary
|
||||
"#7e9084", # secondary
|
||||
"#d9e1dd", # tertiary
|
||||
@ -162,8 +169,8 @@ plot_submission_trends <- function(data, days_ago = 30) {
|
||||
|
||||
# Create the plot
|
||||
ggplot(submission_trends, aes(x = CreationDate, y = submissions)) +
|
||||
geom_line(color = custom_palette[1], linewidth = 1) + # Line color from palette
|
||||
geom_point(color = custom_palette[1], size = 3, shape = 16) + # Points for visibility
|
||||
geom_line(color = color_palette[1], linewidth = 1) + # Line color from palette
|
||||
geom_point(color = color_palette[1], size = 3, shape = 16) + # Points for visibility
|
||||
labs(
|
||||
title = "Survey Submission Trends by Date",
|
||||
subtitle = paste("Tracking submissions for the last", days_ago, "days"),
|
||||
@ -182,11 +189,11 @@ plot_submission_trends <- function(data, days_ago = 30) {
|
||||
axis.text.x = element_text(angle = 45, hjust = 1) # Rotate x-axis labels
|
||||
) +
|
||||
# Add a smoothed trend line (loess)
|
||||
geom_smooth(method = "loess", color = custom_palette[4], linewidth = 1, linetype = "dashed")
|
||||
geom_smooth(method = "loess", color = color_palette[4], linewidth = 1, linetype = "dashed")
|
||||
}
|
||||
```
|
||||
|
||||
### 30 Day Trend
|
||||
## 30 Day Trend
|
||||
The plot below visualizes the survey submission trends for the past 30 days. It shows the number of submissions made each day, highlighting variations over the last month. This type of plot is helpful for understanding trends in user activity, such as identifying peak submission days, periods of low activity, or gradual changes over time.
|
||||
|
||||
The data used for this plot is filtered to include only submissions made in the last 30 days, with the submission count for each date represented by both the line and the points on the graph. A smoothed trend line (dashed) has been added to help visualize the overall submission pattern over this period.
|
||||
@ -195,16 +202,16 @@ The data used for this plot is filtered to include only submissions made in the
|
||||
plot_submission_trends(survey_data, days_ago = 30)
|
||||
```
|
||||
|
||||
### 90 Day Trend
|
||||
## 90 Day Trend
|
||||
The plot below visualizes the survey submission trends for the past 90 days. It shows the number of submissions made each day, highlighting variations over the last three months. This type of plot is helpful for understanding trends in user activity, such as identifying peak submission days, periods of low activity, or gradual changes over time.
|
||||
|
||||
The data used for this plot is filtered to include only submissions made in the last 90 days, with the submission count for each date represented by both the line and the points on the graph. A smoothed trend line (dashed) has been added to help visualize the overall submission pattern over this period.
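
The filtering and counting step described above might look roughly like the base R sketch below. `count_recent_submissions` and its `today` argument are hypothetical names introduced for illustration; the body of the report's `plot_submission_trends()` is elided in this diff and may differ.

```r
# Hypothetical helper: count submissions per calendar day within a trailing window.
count_recent_submissions <- function(dates, days_ago = 30, today = Sys.Date()) {
  dates <- as.Date(dates)
  # Keep dates from (today - days_ago) through today, dropping missing values.
  recent <- dates[!is.na(dates) & dates >= today - days_ago & dates <= today]
  counts <- table(recent)
  data.frame(
    CreationDate = as.Date(names(counts)),
    submissions = as.integer(counts)
  )
}

# Made-up sample dates, with a fixed "today" so the result is reproducible.
sample_dates <- c("2025-03-30", "2025-03-30", "2025-03-01", "2025-01-15")
trend_30 <- count_recent_submissions(sample_dates, 30, today = as.Date("2025-03-31"))
trend_90 <- count_recent_submissions(sample_dates, 90, today = as.Date("2025-03-31"))
```

Passing `today` explicitly is a design choice for testing; in a knitted report the default `Sys.Date()` would give a rolling window.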
|
||||
|
||||
```{r plot-submission-trends-90d, echo=TRUE, message=FALSE}
|
||||
```{r plot-submission-trends-90d, echo=TRUE, message=FALSE, fig.height=6, fig.width=8}
|
||||
plot_submission_trends(survey_data, days_ago = 90)
|
||||
```
|
||||
|
||||
### Response Rates to Top-Level Optional Questions
|
||||
## Optional Question Response Rates
|
||||
The table below summarizes the response rates for key optional top-level questions in the survey. These questions are asked of all participants, and some trigger additional follow-up questions depending on the response. The response rate is the percentage of participants who provided an answer for each question.
|
||||
|
||||
The "Total Number of Species Planted" question has special handling: only responses greater than 0 are considered valid, whereas for other questions any non-NA value counts as a response.
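
The two counting rules can be illustrated with a small base R helper. `response_rate` and the sample vectors are hypothetical, written only to show the distinction between the rules; the report's own calculation is not visible in this diff.

```r
# Hypothetical helper: percentage of respondents who answered a question.
# positive_only = TRUE applies the stricter rule used for species counts.
response_rate <- function(values, positive_only = FALSE) {
  answered <- !is.na(values)
  if (positive_only) {
    # A zero count is treated the same as no answer.
    answered <- answered & values > 0
  }
  round(100 * mean(answered), 1)
}

# Made-up answers from four respondents.
source_of_trees <- c("Nursery", NA, "Nonprofit", "Nursery")
species_planted <- c(2, 0, NA, 1)

rate_source <- response_rate(source_of_trees)
rate_species <- response_rate(species_planted, positive_only = TRUE)
rate_species_loose <- response_rate(species_planted)
```

In this sample the species question scores 75% under the general rule but 50% under the stricter one, because the zero is no longer counted as a response.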
|
||||
@ -258,38 +265,12 @@ The following provides additional context for each survey question/field, detail
|
||||
- **Source of Trees**: The percentage of respondents who reported the source of the trees they planted.
|
||||
- **Total Number of Species Planted**: The percentage of respondents who provided the species of tree(s) they planted.
|
||||
|
||||
### User Data
|
||||
```{r named-user-table}
|
||||
library(tidyverse)
|
||||
# Participant Analysis {.tabset}
|
||||
[Back to Top](#)
|
||||
|
||||
user_activity <- survey_data %>%
|
||||
mutate(Creator = ifelse(is.na(Creator), "Public User", Creator)) %>%
|
||||
group_by(Creator) %>%
|
||||
summarise(
|
||||
record_count = n(),
|
||||
total_trees_planted = sum(`Number of Trees Planted`, na.rm = TRUE)
|
||||
)
|
||||
knitr::kable(user_activity, caption = "Survey Submissions by Named User", align = "l")
|
||||
```
|
||||
|
||||
### User Data
|
||||
```{r participant-email-table}
|
||||
library(tidyverse)
|
||||
|
||||
user_activity_email <- survey_data %>%
|
||||
mutate(`Planter Contact Email` = ifelse(is.na(`Planter Contact Email`), "Not Provided", `Planter Contact Email`)) %>% # label missing e-mails in the grouping column itself
|
||||
group_by(`Planter Contact Email`) %>%
|
||||
summarise(
|
||||
record_count = n(),
|
||||
total_trees_planted = sum(`Number of Trees Planted`, na.rm = TRUE)
|
||||
)
|
||||
knitr::kable(user_activity_email, caption = "Survey Submissions by E-mail", align = "l")
|
||||
```
|
||||
|
||||
## Participant Analysis {.tabset}
|
||||
The following section contains an analysis of tree planting by participant type.
|
||||
|
||||
### Submissions
|
||||
## Submissions
|
||||
The following plot shows the distribution of survey submissions based on participant type. This breakdown highlights the contributions of each participant group to the tree planting initiative.
|
||||
|
||||
```{r participant-type-surveys, echo=TRUE, message=FALSE, fig.height=6, fig.width=8}
|
||||
@ -323,7 +304,7 @@ ggplot(survey_data, aes(x = `Who Planted The Tree(s)?`)) +
|
||||
|
||||
```
|
||||
|
||||
### Trees Planted
|
||||
## Trees Planted
|
||||
This plot visualizes the total number of trees planted by each participant type, helping to evaluate the overall impact of different groups in the tree planting program.
|
||||
|
||||
```{r participant-type-planted, echo=TRUE, message=FALSE, fig.height=6, fig.width=8}
|
||||
@ -408,8 +389,36 @@ summary_data_formatted %>%
|
||||
|
||||
```
|
||||
|
||||
## User Activity
|
||||
```{r named-user-table}
|
||||
library(tidyverse)
|
||||
|
||||
## Location Analysis{.tabset}
|
||||
user_activity <- survey_data %>%
|
||||
mutate(Creator = ifelse(is.na(Creator), "Public User", Creator)) %>%
|
||||
group_by(Creator) %>%
|
||||
summarise(
|
||||
record_count = n(),
|
||||
total_trees_planted = sum(`Number of Trees Planted`, na.rm = TRUE)
|
||||
)
|
||||
knitr::kable(user_activity, caption = "Survey Submissions by Named User", align = "l")
|
||||
```
|
||||
|
||||
## Participant Activity
|
||||
```{r participant-email-table}
|
||||
library(tidyverse)
|
||||
|
||||
user_activity_email <- survey_data %>%
|
||||
mutate(`Planter Contact Email` = ifelse(is.na(`Planter Contact Email`), "Not Provided", `Planter Contact Email`)) %>% # label missing e-mails in the grouping column itself
|
||||
group_by(`Planter Contact Email`) %>%
|
||||
summarise(
|
||||
record_count = n(),
|
||||
total_trees_planted = sum(`Number of Trees Planted`, na.rm = TRUE)
|
||||
)
|
||||
knitr::kable(user_activity_email, caption = "Survey Submissions by E-mail", align = "l")
|
||||
```
|
||||
|
||||
# Location Analysis{.tabset}
|
||||
[Back to Top](#)
|
||||
|
||||
```{r func-create_summary_table, echo=TRUE}
|
||||
create_summary_table <- function(data, field) {
|
||||
@ -452,12 +461,12 @@ create_summary_table <- function(data, field) {
|
||||
}
|
||||
```
|
||||
|
||||
### By Region
|
||||
## By Region
|
||||
```{r create-summary-table-region, echo=TRUE, message=FALSE}
|
||||
create_summary_table(survey_data, "Region")
|
||||
```
|
||||
|
||||
### By County
|
||||
## By County
|
||||
This map displays the **total number of trees planted** across each county in **New York State**. The counties are color-coded, with darker shades representing areas where more trees have been planted. This allows users to quickly see which counties have had the most extensive tree planting efforts.
|
||||
|
||||
- **What to look for**:
|
||||
@ -466,7 +475,7 @@ This map displays the **total number of trees planted** across each county in **
|
||||
|
||||
The map provides a visual overview of tree planting distribution across New York, making it easier to identify areas with the highest impact or need for further action.
|
||||
|
||||
```{r create-county-choropleth-map, echo=TRUE, message=FALSE}
|
||||
```{r create-county-choropleth-map, echo=TRUE, message=FALSE, fig.height=6, fig.width=8}
|
||||
library(tigris) # For geographic data
|
||||
library(sf) # For handling spatial data
|
||||
library(dplyr) # For data manipulation
|
||||
@ -497,7 +506,12 @@ ggplot(data = ny_counties_merged) +
|
||||
|
||||
```
|
||||
|
||||
## Tree Analysis {.tabset}
|
||||
```{r create-summary-table-county, echo=TRUE, message=FALSE}
|
||||
create_summary_table(survey_data, "County")
|
||||
```
|
||||
|
||||
# Tree Analysis {.tabset}
|
||||
[Back to Top](#)
|
||||
```{r func-create_species_summary_table, echo=TRUE}
|
||||
create_species_summary_table <- function(data, field, field_label = NULL) {
|
||||
# Replace empty strings and NA values with "Not Provided" before summarization
|
||||
@ -552,7 +566,7 @@ create_species_summary_table <- function(data, field, field_label = NULL) {
|
||||
}
|
||||
```
|
||||
|
||||
### By Genus
|
||||
## By Genus
|
||||
|
||||
The following table shows a breakdown of survey submissions by **Genus**. For each genus, the table provides:
|
||||
|
||||
@ -563,10 +577,10 @@ The following table shows a breakdown of survey submissions by **Genus**. For ea
|
||||
These figures show which genera are most commonly reported, how prevalent each genus is, and the proportion of surveys where no genus was specified.
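
A minimal base R sketch of this kind of summary follows. `summarise_field` is a hypothetical name and the sample values are invented; the body of the report's `create_species_summary_table()` is elided in this diff, so this only mirrors the behavior described in the text and in its "Not Provided" comment.

```r
# Hypothetical helper: count and percentage per value, with blanks and NAs
# collapsed into a single "Not Provided" row.
summarise_field <- function(values) {
  values <- as.character(values)
  values[is.na(values) | trimws(values) == ""] <- "Not Provided"
  counts <- sort(table(values), decreasing = TRUE)
  data.frame(
    value = names(counts),
    count = as.integer(counts),
    percent = round(100 * as.integer(counts) / length(values), 1)
  )
}

# Made-up genus entries, including one empty string and one NA.
genus_summary <- summarise_field(c("Acer", "Quercus", "Acer", "", NA, "Acer"))
```

Relabeling missing entries before counting keeps them in the denominator, so the percentages describe all submitted rows rather than only the complete ones.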
|
||||
|
||||
```{r create-summary-table-genus, echo=TRUE, message=FALSE}
|
||||
create_species_summary_table(species_data, "Generic.Species.of.Tree", "Tree Genus")
|
||||
create_species_summary_table(species_data, "Generic Species of Tree", "Tree Genus")
|
||||
```
|
||||
|
||||
### By Species
|
||||
## By Species
|
||||
|
||||
The following table shows a breakdown of survey submissions by **Species**. For each species, the table provides:
|
||||
|
||||
@ -577,5 +591,5 @@ The following table shows a breakdown of survey submissions by **Species**. For
|
||||
These figures show which species are most commonly reported, how prevalent each species is, and the proportion of surveys where no species was specified.
|
||||
|
||||
```{r create-summary-table-species, echo=TRUE, message=FALSE}
|
||||
create_species_summary_table(species_data, "Precise.Species.of.Tree", "Tree Species")
|
||||
create_species_summary_table(species_data, "Precise Species of Tree", "Tree Species")
|
||||
```
|
||||
|
||||