Tree Planting Data Validator 🌳

A robust Python script to validate Excel-based tree planting records.

This tool ensures that tree planting data submissions meet quality standards before being imported into databases or mapping systems. It performs comprehensive checks on participant types, planting dates, geographic coordinates, and tree sources.


Features

  • Flexible column matching: Automatically detects common column name variations
  • Data cleaning: Removes blank rows and normalizes input
  • Comprehensive validation:
    • Required fields check
    • Participant type validation
    • Date range and format validation (no future dates, no pre-2024 planting)
    • Geographic boundary validation (Northeast US focus)
    • Tree source categorization
  • Clear, human-readable error reports grouped by Excel row
  • Command-line interface with optional CSV error export
  • Debug mode for troubleshooting

📋 Validation Rules

| Field | Requirements |
|---|---|
| Participant Type | Must be one of: landowner, community, professional, municipality, agency |
| Plant Start | Required, valid date, on or after Jan 1, 2024, not in the future |
| Plant End | Required, valid date, not in the future |
| Latitude | Required, numeric, between 40.4774 and 45.01585 |
| Longitude | Required, numeric, between -79.5 and -71.1 |
| Tree Source | Required, one of: local, regional, national, plant_sale, self, other |
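
To make the rules above concrete, here is a rough sketch of how a single record might be checked. It is an illustration under stated assumptions, not the script's implementation: the function names are hypothetical, range bounds are treated as inclusive, choices are compared case-insensitively, and text dates are assumed to be in `YYYY-MM-DD` form. Only the messages "Future date" and "Out of range" come from the example output in this README; the others are made up.

```python
from datetime import date, datetime

PARTICIPANT_TYPES = {"landowner", "community", "professional", "municipality", "agency"}
TREE_SOURCES = {"local", "regional", "national", "plant_sale", "self", "other"}
EARLIEST_START = date(2024, 1, 1)
LAT_RANGE = (40.4774, 45.01585)  # assumed inclusive
LON_RANGE = (-79.5, -71.1)       # assumed inclusive


def _is_blank(value):
    return value is None or str(value).strip() == ""


def check_choice(value, allowed):
    """Return an error message, or None if the value is one of the allowed choices."""
    if _is_blank(value):
        return "Missing"
    return None if str(value).strip().lower() in allowed else "Invalid value"


def check_date(value, today, earliest=None):
    """Accept date/datetime objects or YYYY-MM-DD text (the text format is an assumption)."""
    if _is_blank(value):
        return "Missing"
    if isinstance(value, datetime):
        parsed = value.date()
    elif isinstance(value, date):
        parsed = value
    else:
        try:
            parsed = datetime.strptime(str(value).strip(), "%Y-%m-%d").date()
        except ValueError:
            return "Invalid date"
    if parsed > today:
        return "Future date"
    if earliest is not None and parsed < earliest:
        return "Before 2024-01-01"
    return None


def check_number(value, low, high):
    """Return an error message, or None if the value is numeric and within [low, high]."""
    if _is_blank(value):
        return "Missing"
    try:
        number = float(value)
    except (TypeError, ValueError):
        return "Not numeric"
    return None if low <= number <= high else "Out of range"


def validate_record(record, today=None):
    """Return {field: message} for every rule the record violates."""
    today = today or date.today()
    checks = {
        "Participant Type": check_choice(record.get("Participant Type"), PARTICIPANT_TYPES),
        "Plant Start": check_date(record.get("Plant Start"), today, EARLIEST_START),
        "Plant End": check_date(record.get("Plant End"), today),
        "Latitude": check_number(record.get("Latitude"), *LAT_RANGE),
        "Longitude": check_number(record.get("Longitude"), *LON_RANGE),
        "Tree Source": check_choice(record.get("Tree Source"), TREE_SOURCES),
    }
    return {field: message for field, message in checks.items() if message}
```

Passing `today` explicitly makes the "not in the future" rule easy to test; when it is omitted, the current date is used.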

🚀 Installation

  1. Clone the repository:

    git clone <your-repo-url>
    cd tree-planting-data-validator
    
    
  2. (Optional but recommended) Create a virtual environment:

    python -m venv venv
    source venv/bin/activate    # Linux/Mac
    # venv\Scripts\activate    # Windows
    
  3. Install dependencies:

    pip install pandas openpyxl
    

📖 Usage

Basic usage

python validate_tree_planting_data.py "path/to/your/data.xlsx"

With options

python validate_tree_planting_data.py "data.xlsx" \
  --header-row 2 \
  --debug \
  --output errors_report.csv

Command Line Arguments

| Argument | Description | Default |
|---|---|---|
| excel_file_path | Path to the Excel file (required) | - |
| --header-row | Row number containing column headers | 1 |
| --debug | Enable debug output | False |
| --output | Save validation errors to a CSV file | None |
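
For readers extending the CLI, the arguments above map naturally onto `argparse`. The following is a sketch of what such a parser could look like, assuming the script uses `argparse`; `build_parser` is a hypothetical name and the help strings are taken from the table.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented arguments (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Validate Excel-based tree planting records."
    )
    parser.add_argument("excel_file_path", help="Path to the Excel file")
    parser.add_argument(
        "--header-row",
        type=int,
        default=1,
        help="Row number containing column headers",
    )
    parser.add_argument("--debug", action="store_true", help="Enable debug output")
    parser.add_argument(
        "--output",
        default=None,
        help="Save validation errors to a CSV file",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.excel_file_path, args.header_row, args.debug, args.output)
```

`argparse` converts `--header-row` to the attribute `header_row`, and `action="store_true"` gives `--debug` its documented default of `False`.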

Example Output

Success:

==========================================================================================
✅ VALIDATION SUCCESSFUL
Processed 245 records
==========================================================================================

With Errors:

❌ VALIDATION REPORT
Total records processed : 245
Total errors found      : 12

📍 Excel Row 47
-----------------------------------------------------------------
   • Plant Start..................... Future date
   • Longitude....................... Out of range
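
A report in the style shown above could be produced by grouping errors by Excel row before printing. The sketch below is one possible approach, not the script's actual code: `excel_row_number` and `format_report` are hypothetical names, the column widths are chosen to resemble the example, and the row arithmetic assumes data begins on the row directly below the header and that no rows were dropped before numbering.

```python
from collections import defaultdict


def excel_row_number(record_index, header_row=1):
    """Excel row of a 0-based record, assuming data starts right below the header."""
    return header_row + 1 + record_index


def format_report(errors, total_records):
    """Format (excel_row, field, message) tuples as a report grouped by row."""
    errors = list(errors)
    if not errors:
        rule = "=" * 90
        return "\n".join(
            [rule, "✅ VALIDATION SUCCESSFUL", f"Processed {total_records} records", rule]
        )

    # Group messages under their row so each row is printed once.
    by_row = defaultdict(list)
    for row, field, message in errors:
        by_row[row].append((field, message))

    lines = [
        "❌ VALIDATION REPORT",
        f"Total records processed : {total_records}",
        f"Total errors found      : {len(errors)}",
    ]
    for row in sorted(by_row):
        lines += ["", f"📍 Excel Row {row}", "-" * 65]
        for field, message in by_row[row]:
            # Pad the field name with dots so messages line up in a column.
            lines.append(f"   • {field.ljust(32, '.')} {message}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [(47, "Plant Start", "Future date"), (47, "Longitude", "Out of range")]
    print(format_report(sample, total_records=245))
```

Reporting the spreadsheet row rather than the record index lets a reviewer jump straight to the offending line in Excel.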

File Requirements

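  • Excel file (.xlsx or .xls); note that pandas reads legacy .xls files through the xlrd package, which the install step above does not include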
  • Header row with recognizable column names (case-insensitive)
  • At minimum: Participant Type, Plant Start, Plant End, Latitude, Longitude, Tree Source

Contributing

Contributions are welcome! Feel free to submit issues or pull requests for:

  • Additional validation rules
  • Support for more file formats
  • Enhanced reporting features

License

This project is open source. Feel free to use and modify as needed.


Made for tree planting initiatives 🌱