753-Data-Sync/README.md
Nick Heppler cad8d64c23 feat: Add --reload flag to data synchronization script for loading JSON data
- Introduced a new command-line argument `--reload` to allow loading data from a specified JSON file.
- Updated `parse_arguments` to include the `--reload` flag.
- Modified `main` function to truncate the feature layer and load data from the JSON file when the flag is used.
- Enhanced documentation to reflect the new functionality and usage examples.
2025-05-22 11:25:41 -04:00

![753 Data Sync logo](https://git.nickhepler.cloud/nick/753-Data-Sync/raw/branch/master/logo.png)
# 753 Data Sync
*A Python-based data ingestion tool for syncing enforcement data from a public API to ArcGIS Online.*
![Gitea Release](https://img.shields.io/gitea/v/release/nick/753-Data-Sync?gitea_url=https%3A%2F%2Fgit.nickhepler.cloud%2F&style=for-the-badge&logo=Python)
![Enhancements](https://img.shields.io/gitea/issues/open/nick/753-Data-Sync?gitea_url=https%3A%2F%2Fgit.nickhepler.cloud%2F&labels=enhancement&style=for-the-badge&logo=Gitea&label=Enhancements)
![Defects](https://img.shields.io/gitea/issues/open/nick/753-Data-Sync?gitea_url=https%3A%2F%2Fgit.nickhepler.cloud%2F&labels=bug&style=for-the-badge&logo=Gitea&label=Defects)
---
## 🚀 Overview
This script fetches enforcement data from an external API, truncates a specified ArcGIS feature layer, and adds the fetched records to that layer as features. It also logs each run, saves the fetched data to JSON files, and optionally purges old files. With the `--reload` flag, it can instead load data from a JSON file without making API calls.
---
## 📦 Requirements
- Python 3.6 or higher
- Required packages in `requirements.txt`
- `.env` file with your configuration
- ArcGIS Online credentials
---
## 🔧 Installation
```bash
pip install -r requirements.txt
```
Or install packages individually:
```bash
pip install requests python-dotenv
```
---
## ⚙️ Configuration
Create a `.env` file in the root of your project:
```env
API_URL=https://example.com/api
AGOL_USER=your_username
AGOL_PASSWORD=your_password
HOSTNAME=www.arcgis.com
INSTANCE=your_instance
FS=your_feature_service
LAYER=0
LOG_LEVEL=DEBUG
PURGE_DAYS=5
```
### Required Variables
| Variable | Description |
|----------------|--------------------------------------------|
| `API_URL` | The API endpoint to fetch data from |
| `AGOL_USER` | ArcGIS Online username |
| `AGOL_PASSWORD`| ArcGIS Online password |
| `HOSTNAME` | ArcGIS host (e.g., `www.arcgis.com`) |
| `INSTANCE` | ArcGIS REST instance path |
| `FS` | Feature service name |
| `LAYER` | Feature layer ID or name |
### Optional Variables
| Variable | Description |
|----------------|--------------------------------------------|
| `LOG_LEVEL` | Log level (`DEBUG`, `INFO`, etc.) |
| `PURGE_DAYS` | Number of days to retain log and JSON files |
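
As a rough illustration of how the required and optional variables above might be validated at startup, here is a minimal sketch. The function name `load_config` and the fallback values for `LOG_LEVEL` and `PURGE_DAYS` are assumptions made for this example; the script's actual defaults are not documented here. It reads from `os.environ`, so it works whether the values come from a `.env` loader or from the shell.

```python
import os

REQUIRED = ("API_URL", "AGOL_USER", "AGOL_PASSWORD", "HOSTNAME", "INSTANCE", "FS", "LAYER")
# Assumed fallbacks for this sketch; the script's real defaults may differ.
OPTIONAL_DEFAULTS = {"LOG_LEVEL": "INFO", "PURGE_DAYS": "5"}


def load_config(env=None):
    """Return a config dict, raising ValueError if a required variable is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing required variables: " + ", ".join(missing))
    config = {name: env[name] for name in REQUIRED}
    for name, default in OPTIONAL_DEFAULTS.items():
        config[name] = env.get(name) or default
    # PURGE_DAYS is used as a number of days, so convert it once here.
    config["PURGE_DAYS"] = int(config["PURGE_DAYS"])
    return config
```

Failing fast with the full list of missing names tends to be easier to debug than a later authentication or request error.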
---
## 🧪 Script Usage
```bash
python 753DataSync.py --results_per_page 100
```
### CLI Arguments
| Argument | Description |
|----------------------|---------------------------------------------|
| `--results_per_page` | Optional. Number of results per API call (default: `100`) |
| `--test` | Optional. If set, only fetch the first page of results. |
| `--reload` | Optional. Load data from a specified JSON file instead of fetching from the API. |
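
The arguments in the table could be declared with `argparse` roughly as follows. This is a sketch, not the script's actual `parse_arguments`: the help strings, the `JSON_FILE` metavar, and the assumption that `--reload` takes a file path as its value are illustrative.

```python
import argparse


def parse_arguments(argv=None):
    """Parse the three documented flags; argv=None falls back to sys.argv."""
    parser = argparse.ArgumentParser(
        description="Sync enforcement data from an API to an ArcGIS feature layer."
    )
    parser.add_argument(
        "--results_per_page", type=int, default=100,
        help="Number of results per API call (default: 100)",
    )
    parser.add_argument(
        "--test", action="store_true",
        help="Only fetch the first page of results",
    )
    parser.add_argument(
        "--reload", metavar="JSON_FILE", default=None,
        help="Load data from this JSON file instead of fetching from the API",
    )
    return parser.parse_args(argv)
```

Accepting an explicit `argv` list keeps the function easy to exercise without touching the real command line.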
---
## 📋 Functionality
1. **🔁 Truncate Layer** — Clears existing ArcGIS features.
2. **🌐 Fetch Data** — Retrieves paginated data from the API.
3. **💾 Save Data** — Writes each page to a time-stamped JSON file.
4. **📦 Aggregate Data** — Combines all pages into one file.
5. **📤 Add Features** — Sends data to ArcGIS feature layer.
6. **🧹 File Cleanup** — Deletes `.json`/`.log` files older than `PURGE_DAYS`.
7. **📑 Dynamic Logs** — Logs saved to `753DataSync_YYYY-MM-DD.log`.
8. **🧪 Test Mode** — Use the `--test` flag to fetch only the first page of results for testing purposes.
9. **🔄 Reload Data** — Use the `--reload` flag to truncate the feature layer and load data from a specified JSON file.
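
Steps 2 through 4, plus test mode, can be sketched as a single loop. The example below is a simplified illustration under stated assumptions: `fetch_page` is a hypothetical callable standing in for the real API request, pagination is assumed to end at the first empty page, and the file names follow the pattern shown in the example output rather than the script's actual code.

```python
import json
from datetime import datetime
from pathlib import Path


def sync_pages(fetch_page, results_per_page=100, test=False, data_dir="data"):
    """Fetch pages until an empty one, saving each page and one aggregate file.

    fetch_page(page, results_per_page) is a hypothetical callable returning a
    list of records for one page; the real script would call the API here.
    """
    out_dir = Path(data_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    all_records = []
    page = 1
    while True:
        records = fetch_page(page, results_per_page)
        if not records:  # assumed stop condition: an empty page
            break
        stamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
        page_file = out_dir / f"enforcement_page_{page}_results_{results_per_page}_{stamp}.json"
        page_file.write_text(json.dumps(records), encoding="utf-8")
        all_records.extend(records)
        if test:  # --test: stop after the first page
            break
        page += 1
    stamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    aggregate_file = out_dir / f"aggregated_enforcement_results_{stamp}.json"
    aggregate_file.write_text(json.dumps(all_records), encoding="utf-8")
    return all_records, aggregate_file
```

Passing the fetcher in as a parameter keeps the loop independent of any HTTP library, so it can be tried with a stub before pointing it at the real endpoint.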
---
## 📁 Example Output
```text
📁 data/
├── enforcement_page_1_results_100_2025-03-26_14-30-45.json
├── enforcement_page_2_results_100_2025-03-26_14-31-10.json
└── aggregated_enforcement_results_2025-03-26_14-31-15.json
📄 753DataSync_2025-03-26.log
```
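
Files like these accumulate with every run, which is what the cleanup step addresses. A minimal sketch of deleting `.json` and `.log` files older than `PURGE_DAYS` might look like this. It assumes file age is judged by modification time; the script may instead use the date embedded in the file name. The function name `purge_old_files` is hypothetical.

```python
import time
from pathlib import Path


def purge_old_files(directory, purge_days, now=None, suffixes=(".json", ".log")):
    """Delete matching files whose modification time is older than purge_days.

    Returns the sorted names of the deleted files.
    """
    now = time.time() if now is None else now
    cutoff = now - purge_days * 86400  # seconds in a day
    deleted = []
    for path in Path(directory).iterdir():
        if path.is_file() and path.suffix in suffixes and path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path.name)
    return sorted(deleted)
```

Restricting deletion to known suffixes avoids removing unrelated files that happen to live in the same directory.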
---
## 📝 Example Log
```text
2025-03-26 14:30:45 - INFO - Attempting to truncate layer...
2025-03-26 14:30:51 - INFO - Fetching page 1 from API...
2025-03-26 14:30:55 - INFO - Saved data to data/enforcement_page_1_results_100_...
2025-03-26 14:30:57 - INFO - Aggregated data saved.
2025-03-26 14:31:00 - INFO - Features added successfully.
2025-03-26 14:31:01 - INFO - Deleted old log: 753DataSync_2025-03-19.log
```
---
## 🛠 Troubleshooting
- Set `LOG_LEVEL=DEBUG` in `.env` for detailed logs.
- Ensure `.env` has no syntax errors.
- Make sure your ArcGIS layer has permission for truncation and writes.
- Check for internet/API access and expired ArcGIS tokens.
- Logs are written to both console and daily log files.
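
For reference, writing the same messages to both the console and a daily log file can be done with two handlers on one logger. The sketch below uses the standard `logging` module and mirrors the line format shown in the example log; the logger name, the function name `configure_logging`, and the fallback to `INFO` for unknown level names are assumptions.

```python
import logging
import sys
from datetime import date
from pathlib import Path


def configure_logging(level_name="INFO", log_dir="."):
    """Attach a console handler and a daily file handler to one logger."""
    log_file = Path(log_dir) / f"753DataSync_{date.today():%Y-%m-%d}.log"
    level = getattr(logging, level_name.upper(), None)
    if not isinstance(level, int):
        level = logging.INFO  # assumed fallback for unrecognized names
    formatter = logging.Formatter(
        "%(asctime)s - %(levelname)s - %(message)s", datefmt="%Y-%m-%d %H:%M:%S"
    )
    logger = logging.getLogger("753DataSync")
    logger.setLevel(level)
    logger.handlers.clear()  # avoid duplicate lines if called twice
    handlers = (
        logging.StreamHandler(sys.stdout),
        logging.FileHandler(log_file, encoding="utf-8"),
    )
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger, log_file
```

With `LOG_LEVEL=DEBUG`, passing `"DEBUG"` as `level_name` would make both handlers emit debug messages.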
---
## 🧪 Testing
Currently, the script is tested manually. Automated testing may be added under a `/tests` folder in the future.
---
## 📖 Usage Examples
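```bash
# Run with default page size
python 753DataSync.py
# Run with custom page size
python 753DataSync.py --results_per_page 50
# Fetch only the first page of results (test mode)
python 753DataSync.py --test
# Truncate the layer and reload from a JSON file (placeholder path)
python 753DataSync.py --reload path/to/file.json
```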
---
## 💬 Support
Found a bug or want to request a feature?
[Open an issue](https://git.nickhepler.cloud/nick/753-Data-Sync/issues) or contact [@nick](https://git.nickhepler.cloud/nick) directly.
---
## 📜 License
This project is licensed under the [GNU General Public License v3.0](LICENSE).
> 💡 *You are free to use, modify, and share this project, provided that any modified versions you distribute are released under the same license.*