Initial commit.
parent a48323ce01
commit b25e1ba2c3
30
join.R
Normal file
@@ -0,0 +1,30 @@
# Load necessary libraries
library(tidyverse)

# Load the CSV files into R
survey_data <- read_csv("data/_25_Million_Trees_Initiative_Survey_0.csv")
location_points <- read_csv("data/location_points_1.csv")
location_polygons <- read_csv("data/location_polygons_2.csv")
participant_organizations <- read_csv("data/participant_organizations_3.csv")
species_planted <- read_csv("data/species_planted_4.csv")
vendors <- read_csv("data/vendors_5.csv")

# View the structure of each data frame to check the relevant columns for joining
glimpse(survey_data)
glimpse(location_points)
glimpse(location_polygons)
glimpse(participant_organizations)
glimpse(species_planted)
glimpse(vendors)

# Join the data based on the ParentGlobalID, ensuring all rows from survey_data are retained
combined_data <- survey_data %>%
  left_join(location_points, by = c("GlobalID" = "ParentGlobalID")) %>%
  left_join(location_polygons, by = c("GlobalID" = "ParentGlobalID")) %>%
  left_join(participant_organizations, by = c("GlobalID" = "ParentGlobalID")) %>%
  left_join(species_planted, by = c("GlobalID" = "ParentGlobalID")) %>%
  left_join(vendors, by = c("GlobalID" = "ParentGlobalID"))

# View the combined data to ensure everything is merged correctly
glimpse(combined_data)

378
report.Rmd
Normal file
@@ -0,0 +1,378 @@
---
title: "25 Million Trees Initiative Survey Report"
author:
  - name: Nicholas Hepler <nicholas.hepler@its.ny.gov>
    affiliation: Office of Information Technology Services
  - name: Annabel Gregg <annabel.gregg@dec.ny.gov>
    affiliation: Department of Environmental Conservation
date: "`r format(Sys.Date(), '%B %d, %Y')`"
output: html_document
---

```{r setup, include=FALSE}
# Load necessary libraries
library(tidyverse)
library(lubridate)
library(ggplot2)

# Read the CSV file into a dataframe
file_path <- "data/_25_Million_Trees_Initiative_Survey_0.csv"
survey_data <- read_csv(file_path)

# Convert the CreationDate field to a proper datetime object (if applicable)
survey_data <- survey_data %>%
  mutate(CreationDate = mdy_hms(CreationDate))

# Count the records to be excluded (Exclude Result == 1)
excluded_count <- survey_data %>%
  filter(`Exclude Result` == 1) %>%
  nrow()

# Count the records that are used (Exclude Result == 0)
used_count <- survey_data %>%
  filter(`Exclude Result` == 0) %>%
  nrow()

survey_data <- survey_data %>%
  filter(`Exclude Result` == 0)
```

# {.tabset .tabset-fade .tabset-pills}

## Summary

### Background

The **25 Million Trees Initiative** is a bold commitment launched by **Governor Kathy Hochul** during the 2024 State of the State Address, aiming to plant 25 million trees by 2033 in New York State. This initiative recognizes the critical importance of trees and forests for climate mitigation, enhancing community health, and supporting biodiversity. The New York State Department of Environmental Conservation (DEC) is at the forefront of tracking the progress of this ambitious goal.

As part of this effort, DEC has launched the **Tree Tracker**, a tool for the public to record the trees they plant. These submissions contribute valuable data on the number, type, and locations of trees being planted across the state, helping to build a comprehensive, real-time dashboard of tree planting activities.

This report compiles the survey data collected via the Tree Tracker and provides detailed insights into the information submitted by New Yorkers. It aims to support DEC staff and executives in understanding the progress of the initiative and identifying areas for improvement in outreach and engagement.

### Purpose

This report presents an overview of the data collected through the 25 Million Trees Initiative, offering insights into submission patterns, geographic distribution, and trends in tree planting activities. The report aims to:

- Summarize the overall progress of the initiative.
- Provide detailed data analysis on the submitted tree planting information.
- Identify areas where more outreach or support may be needed.

As more individuals contribute their data to the Tree Tracker, the initiative's success will be better understood, and DEC can better align resources to further promote this critical program.

## Submission Overview

This section contains information about surveys submitted to the Tree Tracker tool.

### Survey Period and Exclusions

This report covers the period from **`r format(min(survey_data$CreationDate, na.rm = TRUE), "%B %d, %Y")`** to **`r format(max(survey_data$CreationDate, na.rm = TRUE), "%B %d, %Y")`**, with a total of **`r used_count + excluded_count`** records submitted. Of these, **`r used_count`** records were included in the analysis.

- **Exclusions**: **`r excluded_count`** records have been excluded from the analysis. The primary reasons for exclusion include:
    - **Double Count**: Some records were excluded to prevent duplication of data (e.g., surveys or submissions that were entered multiple times).
    - **Test Data**: Some submissions were excluded as they were entered for testing purposes and do not represent actual survey submissions.

    These records are identified by the `Exclude Result` field, where a value of **1** indicates the record was excluded due to one of these reasons.

- **Included Records**: **`r used_count`** records have been included in the report. These are valid survey submissions, marked with a value of **0** in the `Exclude Result` field, indicating they are legitimate data points.
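
As a minimal sketch of how the included and excluded counts relate to the `Exclude Result` flag, the following uses a small made-up data frame. The `toy_survey` object and its values are hypothetical and are not Tree Tracker data. It also counts rows where the flag is missing, because such rows match neither `== 0` nor `== 1` and would be dropped by the filter without appearing in either count.

```r
library(dplyr)

# Hypothetical stand-in for the survey export; only the flag column matters here
toy_survey <- tibble(
  GlobalID = c("a", "b", "c", "d", "e"),
  `Exclude Result` = c(0, 1, 0, NA, 0)
)

# One row per flag value, including NA
status_counts <- toy_survey %>%
  count(`Exclude Result`, name = "records")

included  <- sum(toy_survey$`Exclude Result` == 0, na.rm = TRUE)
excluded  <- sum(toy_survey$`Exclude Result` == 1, na.rm = TRUE)
unflagged <- sum(is.na(toy_survey$`Exclude Result`))

status_counts
```

If `unflagged` is ever greater than zero, the included and excluded counts will not add up to the number of rows in the export, which is worth checking before the report is circulated.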

### Survey Validation and Data Consistency

To ensure data integrity, several validation steps are applied to survey submissions:

- **Required Fields**:
    - **Number of Trees**: The number of trees planted is a required field, and users cannot submit the survey without providing this information.
    - **Geographic Data**: Geographic coordinates (latitude and longitude) are also required, and users must provide this data when submitting their survey.

- **Geographic Validation**: Once geographic coordinates are entered, they are checked against official civil boundaries to ensure the accuracy of locality, county, and region data. In rare cases, this check may fail due to discrepancies in coordinates, but such records are corrected before inclusion in the analysis.

- **Data Correction for Missing Information**: In cases where certain critical fields (such as geographic location or number of trees planted) are missing due to system issues, records are corrected prior to report generation. This ensures that only complete and accurate records are included in the analysis.

- **Date Logic**:
    - **Program Start Date**: Users cannot enter planting dates prior to the official Program Start Date. The system enforces this restriction, so records with such dates cannot be submitted.

- **Format and Consistency Checks**:
    - **Email Format**: The email addresses entered in the survey are validated to ensure they follow the correct format.
    - **Optional Questions**: Even optional questions undergo validation to ensure the entered data meets the expected format or logic, providing further consistency and accuracy.

These validation checks ensure the integrity and consistency of the data, allowing for meaningful analysis of tree planting surveys.
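
The checks above are enforced by the survey tool itself. As an illustration only, similar checks could be re-run on an export in R along the following lines. This is a sketch under stated assumptions: the column names, the email pattern, and the `program_start` date are all hypothetical and do not reflect the Tree Tracker's actual rules.

```r
# Hypothetical submissions; column names and values are illustrative only
submissions <- data.frame(
  trees = c(12, NA, 40, 5),
  email = c("a@example.org", "not-an-email", NA, "b@example.org"),
  planting_date = as.Date(c("2024-05-01", "2024-06-15", "2023-11-30", "2024-07-04")),
  stringsAsFactors = FALSE
)

# Assumed value for illustration, not the official Program Start Date
program_start <- as.Date("2024-01-01")

# Deliberately loose pattern: something@something.tld, no spaces
email_pattern <- "^[^@[:space:]]+@[^@[:space:]]+\\.[^@[:space:]]+$"

checks <- data.frame(
  # Required field: a positive tree count must be present
  has_tree_count = !is.na(submissions$trees) & submissions$trees > 0,
  # Optional field: a missing email passes, a malformed one fails
  email_ok = is.na(submissions$email) | grepl(email_pattern, submissions$email),
  # Date logic: planting date may not precede the program start
  date_ok = submissions$planting_date >= program_start
)
checks$all_ok <- checks$has_tree_count & checks$email_ok & checks$date_ok

checks
```

Keeping one logical column per rule makes it easy to see which rule a given record fails, rather than only whether it fails.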

### Submission Trend

The following visualization shows the number of submissions per day over the survey period, with a dashed LOESS trend line to highlight any notable patterns.

```{r submission-trend, echo=FALSE, message=FALSE, fig.height=6, fig.width=8}

library(ggplot2)
library(dplyr)

survey_data$CreationDate <- as.Date(survey_data$CreationDate)

# Summarize the data to calculate the total number of submissions by CreationDate
summary_data <- survey_data %>%
  group_by(CreationDate) %>%
  summarise(total_submissions = n())

ggplot(summary_data, aes(x = CreationDate, y = total_submissions)) +
  geom_line(color = "#233f28", linewidth = 1) +
  geom_point(color = "#7e9084", size = 3) +
  geom_smooth(method = "loess", color = "#face00", linewidth = 1, linetype = "dashed") +
  labs(
    title = "Total Number of Submissions by Date",
    x = "Submission Date",
    y = "Total Number of Submissions"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(size = 16, face = "bold", color = "#233f28"),
    axis.title = element_text(size = 12, color = "#233f28"),
    axis.text = element_text(size = 10, color = "#233f28"),
    plot.margin = margin(10, 10, 10, 10),
    panel.grid.major = element_line(color = "#d9e1dd", linewidth = 0.3),
    panel.background = element_rect(fill = "#d9e1dd"),
    axis.text.x = element_text(angle = 45, hjust = 1)
  ) +
  scale_x_date(date_labels = "%b %Y", date_breaks = "1 month")
```
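
The chart above only has points for dates with at least one submission, so the line runs straight across any quiet days. One optional variation, sketched here on made-up dates, fills those gaps with explicit zeros using `tidyr::complete()`. The `toy_dates` data are hypothetical, and this step is not part of the report's chunk.

```r
library(dplyr)
library(tidyr)

# Hypothetical submission dates; 2024-03-03 has no submissions
toy_dates <- tibble(
  CreationDate = as.Date(c("2024-03-01", "2024-03-01", "2024-03-02", "2024-03-04"))
)

daily_counts <- toy_dates %>%
  # Same aggregation as group_by() + summarise(n())
  count(CreationDate, name = "total_submissions") %>%
  # Add one row per calendar day in the observed range, with 0 where none existed
  complete(
    CreationDate = seq(min(CreationDate), max(CreationDate), by = "day"),
    fill = list(total_submissions = 0L)
  )

daily_counts
```

Whether zero-filled days are appropriate depends on the message of the chart: they make lulls visible, but they also pull the smoothed trend line downward.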

### Response Rates

The following output provides the response rates for a set of optional fields in the survey dataset. The response rate is the percentage of included submissions with a non-missing answer; for species, a submission counts as answered when `Total Number of Species Planted` is greater than 0.

- **Planter Contact Email**: The percentage of respondents who provided their email address.
- **Funding Source**: The percentage of respondents who identified their funding source.
- **Land Ownership**: The percentage of respondents who indicated their land ownership status.
- **Tree Size Planted**: The percentage of respondents who specified the size of trees they planted.
- **Source of Trees**: The percentage of respondents who reported the source of the trees they planted.
- **Species Planted**: The percentage of respondents who provided the species of tree(s) they planted.

The rates are sorted from highest to lowest, making it easy to see which fields respondents tend to complete and which could benefit from clarification or further encouragement to respond.

```{r response-rate, echo=FALSE, message=FALSE}
# List of fields to check for response rates, with special handling for 'Total Number of Species Planted'
fields <- c("Planter Contact Email", "Funding Source", "Land Ownership",
            "Tree Size Planted", "Source of Trees", "Total Number of Species Planted")

# Calculate the response rate for each field
response_rates <- sapply(fields, function(field) {
  if (field == "Total Number of Species Planted") {
    # For "Total Number of Species Planted", consider answered if value is greater than 0
    sum(survey_data[[field]] > 0, na.rm = TRUE) / nrow(survey_data) * 100
  } else {
    # For other fields, check for non-NA values
    sum(!is.na(survey_data[[field]])) / nrow(survey_data) * 100
  }
})

# Round the response rates to 2 decimal places
response_rates_rounded <- round(response_rates, 2)

# Sort the response rates in descending order (highest to lowest)
sorted_response_rates <- sort(response_rates_rounded, decreasing = TRUE)

# Print the sorted, rounded response rates
sorted_response_rates

```
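
To make the calculation concrete, here is a small sketch of the same two rules (non-missing for most fields, greater than 0 for the species count) applied to made-up data and returned as a tidy table rather than a named vector. The `toy_survey` values are hypothetical, and the use of `across()` with `pivot_longer()` is one possible formulation, not the report's own code.

```r
library(dplyr)
library(tidyr)

# Hypothetical responses for three of the optional fields
toy_survey <- tibble(
  `Funding Source` = c("grant", NA, "private", NA),
  `Land Ownership` = c("public", "private", NA, "public"),
  `Total Number of Species Planted` = c(2, 0, 1, 0)
)

response_table <- toy_survey %>%
  summarise(
    # Answered means the value is not missing
    across(c(`Funding Source`, `Land Ownership`), ~ sum(!is.na(.x)) / n() * 100),
    # Answered means at least one species was reported
    across(`Total Number of Species Planted`, ~ sum(.x > 0, na.rm = TRUE) / n() * 100)
  ) %>%
  pivot_longer(everything(), names_to = "field", values_to = "response_rate") %>%
  arrange(desc(response_rate))

response_table
```

A data frame in this shape can be passed straight to `knitr::kable()`, which matches how the other tables in this report are rendered.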

## Participant Type Overview

This section provides an overview of the different types of participants involved in tree planting surveys. The data collected includes submissions from various categories of participants, including state agencies, community organizations, private landowners, and municipal governments. By understanding the distribution of these participant types and the scope of their contributions, we can gain insights into the reach and diversity of the program. The following visualizations highlight the number of surveys and total trees planted by each participant type.

### Participant Type: Number of Submissions
The first visualization shows the distribution of the number of tree planting surveys based on the participant type. This breakdown helps highlight which groups are contributing most to the tree planting initiative.

```{r participant-type-surveys, echo=FALSE, message=FALSE}
library(ggplot2)
library(dplyr)

ggplot(survey_data, aes(x = `Who Planted The Tree(s)?`)) +
  geom_bar(fill = "#233f28", color = "#7e9084") +
  geom_text(stat = "count", aes(label = scales::comma(after_stat(count))),
            position = position_stack(vjust = 0.5), # Places text in the middle of the bars
            color = "#ffffff", size = 5) + # Adjust label size
  labs(
    title = "Number of Tree Planting Submissions by Participant Type",
    x = "Participant Type",
    y = "Number of Submissions"
  ) +
  scale_x_discrete(labels = c(
    "agency" = "State Agency",
    "community" = "Community Organization",
    "landowner" = "Private Landowner",
    "municipality" = "Municipal Government"
  )) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(size = 16, face = "bold", color = "#233f28"),
    axis.title = element_text(size = 12, color = "#233f28"),
    axis.text = element_text(size = 10, color = "#233f28"),
    plot.margin = margin(10, 10, 10, 10),
    panel.grid.major = element_line(color = "#d9e1dd", linewidth = 0.3),
    panel.background = element_rect(fill = "#d9e1dd"),
    axis.text.x = element_text(angle = 45, hjust = 1)
  )

```

### Participant Type: Total Trees Planted
This second plot provides a breakdown of the total number of trees planted by participant type. This visualization helps to assess the contribution of each participant group to the overall impact of the tree planting program.

```{r participant-type-planted, echo=FALSE, message=FALSE}
library(ggplot2)
library(dplyr)

# Total trees planted per participant type
summary_data <- survey_data %>%
  group_by(`Who Planted The Tree(s)?`) %>%
  summarise(total_trees = sum(`Number of Trees Planted`, na.rm = TRUE))

ggplot(summary_data, aes(x = `Who Planted The Tree(s)?`, y = total_trees)) +
  geom_bar(stat = "identity", fill = "#233f28", color = "#7e9084") +
  geom_text(aes(label = scales::comma(total_trees)),
            position = position_stack(vjust = 0.5), # Places text in the middle of the bars
            color = "#ffffff", size = 5) + # White text labels
  labs(
    title = "Total Number of Trees Planted by Participant Type",
    x = "Participant Type",
    y = "Total Number of Trees Planted"
  ) +
  scale_x_discrete(labels = c(
    "agency" = "State Agency",
    "community" = "Community Organization",
    "landowner" = "Private Landowner",
    "municipality" = "Municipal Government"
  )) +
  theme_minimal(base_size = 14) + # Adjusted base font size for clarity
  theme(
    plot.title = element_text(size = 16, face = "bold", color = "#233f28"),
    axis.title = element_text(size = 12, color = "#233f28"),
    axis.text = element_text(size = 10, color = "#233f28"),
    plot.margin = margin(10, 10, 10, 10),
    panel.grid.major = element_line(color = "#d9e1dd", linewidth = 0.3),
    panel.background = element_rect(fill = "#d9e1dd"),
    axis.text.x = element_text(angle = 45, hjust = 1)
  )
```

```{r participant-type-table, echo=FALSE, message=FALSE}
# Summarize the data to calculate the total number of trees planted by participant type
summary_data <- survey_data %>%
  group_by(`Who Planted The Tree(s)?`) %>%
  summarise(total_trees = sum(`Number of Trees Planted`, na.rm = TRUE))
# Replace the participant type values with more readable labels
summary_data <- summary_data %>%
  mutate(
    `Who Planted The Tree(s)?` = recode(`Who Planted The Tree(s)?`,
                                        "agency" = "State Agency",
                                        "community" = "Community Organization",
                                        "landowner" = "Private Landowner",
                                        "municipality" = "Municipal Government")
  )

# Add percentage column
summary_data <- summary_data %>%
  mutate(percentage = total_trees / sum(total_trees) * 100)

# Format the table to display the number of trees and percentage
summary_data_formatted <- summary_data %>%
  mutate(
    total_trees = scales::comma(total_trees), # Add commas to the total number of trees
    percentage = paste0(round(percentage, 1), "%") # Round percentage and append '%'
  )

# Print the table
summary_data_formatted %>%
  knitr::kable(col.names = c("Participant Type", "Total Trees Planted", "Percentage of Total Trees"),
               caption = "Total Number of Trees Planted by Participant Type and their Proportional Contribution") %>%
  kableExtra::kable_styling(full_width = FALSE, position = "center", bootstrap_options = c("striped", "hover"))
```
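
The proportional contribution in the table above is each group's total divided by the grand total. A minimal sketch on made-up numbers follows; `toy_plantings` and `planter_labels` are hypothetical names, and the named-vector lookup is shown only as one alternative to `recode()`. The percentage is kept numeric so that the table can still be sorted before any formatting is applied.

```r
library(dplyr)

# Hypothetical plantings: participant type code and trees planted
toy_plantings <- tibble(
  planter = c("agency", "community", "landowner", "agency"),
  trees = c(600, 250, 50, 100)
)

# Readable labels keyed by the raw codes
planter_labels <- c(
  agency = "State Agency",
  community = "Community Organization",
  landowner = "Private Landowner"
)

share_by_planter <- toy_plantings %>%
  group_by(planter) %>%
  summarise(total_trees = sum(trees, na.rm = TRUE), .groups = "drop") %>%
  mutate(
    planter = unname(planter_labels[planter]),
    # Share of the grand total, still numeric at this point
    percentage = round(total_trees / sum(total_trees) * 100, 1)
  ) %>%
  arrange(desc(total_trees))

share_by_planter
```

Formatting with `scales::comma()` and `paste0(..., "%")` turns the columns into character strings, so it is safest to apply it as the very last step.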

## Region Overview
This section provides an overview of regional involvement in, and responses to, the tree planting survey.

In the table below, we aggregate plantings by Region. The results are provided in descending order of Total Trees Planted.
```{r region-summary, echo=FALSE, warning=FALSE, message=FALSE}
# Summarize the data by Region
region_summary_data <- survey_data %>%
  group_by(Region) %>%
  summarise(
    total_records = n(), # Count the number of records in each region
    total_trees_planted = sum(`Number of Trees Planted`, na.rm = TRUE), # Sum of trees planted in each region
    mean_trees_planted = mean(`Number of Trees Planted`, na.rm = TRUE), # Mean number of trees planted
    median_trees_planted = median(`Number of Trees Planted`, na.rm = TRUE) # Median number of trees planted
  ) %>%
  arrange(desc(total_trees_planted)) # Sort by total trees planted in descending order

# Format the table to display the total number of records and trees planted
region_summary_data_formatted <- region_summary_data %>%
  mutate(
    total_trees_planted = scales::comma(total_trees_planted), # Add commas to the total number of trees
    total_records = scales::comma(total_records), # Add commas to the total number of records
    mean_trees_planted = round(mean_trees_planted, 1), # Round mean for readability
    median_trees_planted = round(median_trees_planted, 1) # Round median for readability
  )

# Print the summary table
region_summary_data_formatted %>%
  knitr::kable(col.names = c("Region", "Total Submissions", "Total Trees Planted", "Mean", "Median"),
               caption = "Total Records, Trees Planted, Mean, and Median by Region (Sorted by Trees Planted)") %>%
  kableExtra::kable_styling(full_width = FALSE, position = "center", bootstrap_options = c("striped", "hover"))

```
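
The table reports both the mean and the median because a single very large planting can dominate a region's mean. The sketch below shows the same grouped summary on made-up data; the region names and tree counts are hypothetical.

```r
library(dplyr)

# Hypothetical plantings; one submission in "North" is far larger than the rest
toy_plantings <- tibble(
  Region = c("North", "North", "North", "South", "South"),
  trees = c(10, 20, 3000, 40, 60)
)

region_summary <- toy_plantings %>%
  group_by(Region) %>%
  summarise(
    total_records = n(),
    total_trees_planted = sum(trees, na.rm = TRUE),
    mean_trees_planted = mean(trees, na.rm = TRUE),
    median_trees_planted = median(trees, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  arrange(desc(total_trees_planted))

region_summary
```

In this toy data the "North" mean is 1,010 trees while its median is 20, which is the kind of gap that signals a few large submissions rather than uniformly large ones.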

## County Overview
This section provides an overview of county involvement in, and responses to, the tree planting survey.

In the table below, we aggregate plantings by County. The results are provided in descending order of Total Trees Planted.
```{r county-summary, echo=FALSE, warning=FALSE, message=FALSE}
# Summarize the data by County
county_summary_data <- survey_data %>%
  group_by(County) %>%
  summarise(
    total_records = n(), # Count the number of records in each county
    total_trees_planted = sum(`Number of Trees Planted`, na.rm = TRUE), # Sum of trees planted in each county
    mean_trees_planted = mean(`Number of Trees Planted`, na.rm = TRUE), # Mean number of trees planted
    median_trees_planted = median(`Number of Trees Planted`, na.rm = TRUE) # Median number of trees planted
  ) %>%
  arrange(desc(total_trees_planted)) # Sort by total trees planted in descending order

# Format the table to display the total number of records and trees planted
county_summary_data_formatted <- county_summary_data %>%
  mutate(
    total_trees_planted = scales::comma(total_trees_planted), # Add commas to the total number of trees
    total_records = scales::comma(total_records), # Add commas to the total number of records
    mean_trees_planted = round(mean_trees_planted, 1), # Round mean for readability
    median_trees_planted = round(median_trees_planted, 1) # Round median for readability
  )

# Print the summary table
county_summary_data_formatted %>%
  knitr::kable(col.names = c("County", "Total Submissions", "Total Trees Planted", "Mean", "Median"),
               caption = "Total Records, Trees Planted, Mean, and Median by County (Sorted by Trees Planted)") %>%
  kableExtra::kable_styling(full_width = FALSE, position = "center", bootstrap_options = c("striped", "hover"))

```

## Tree Count
This section presents summary statistics for the number of trees planted per submission across all included surveys.

```{r summary-stats, echo=FALSE, warning=FALSE, message=FALSE}
# Calculate summary statistics
summary_stats <- summary(survey_data$`Number of Trees Planted`, na.rm = TRUE)
```

Below is a summary of the `Number of Trees Planted` across participants:

| Statistic   | Value       |
|-------------|-------------|
| Min         | `r summary_stats["Min."]` |
| 1st Qu.     | `r summary_stats["1st Qu."]` |
| Median      | `r summary_stats["Median"]` |
| Mean        | `r round(summary_stats["Mean"], 1)` |
| 3rd Qu.     | `r summary_stats["3rd Qu."]` |
| Max         | `r summary_stats["Max."]` |

The summary statistics for the number of trees planted provide insight into the distribution of trees planted by all participants in the tree planting surveys. While the median value gives us a sense of the "typical" number of trees planted, the mean might be skewed by a few participants planting a very large number of trees.
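
For reference, the short sketch below shows how the values of `summary()` can be looked up by name on a made-up vector (the `trees` values are hypothetical). The element names for the minimum and maximum carry a trailing period, `"Min."` and `"Max."`, so indexing with `"Min"` or `"Max"` does not find them.

```r
# Hypothetical tree counts, including one very large planting and one missing value
trees <- c(5, 10, 12, 20, 4000, NA)

stats <- summary(trees)

# Element names as returned by summary(); note the trailing periods
stat_names <- names(stats)

min_trees    <- stats[["Min."]]
median_trees <- stats[["Median"]]
mean_trees   <- stats[["Mean"]]
max_trees    <- stats[["Max."]]
missing      <- stats[["NA's"]]

stat_names
```

Here the median is 12 while the mean is 809.4, which illustrates how one large value pulls the mean well above the typical submission.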