![753 Data Sync logo](https://git.nickhepler.cloud/nick/753-Data-Sync/raw/branch/master/logo.png)
# 753 Data Sync
*A Python-based data ingestion tool for syncing enforcement data from a public API to ArcGIS Online.*
![Gitea Release](https://img.shields.io/gitea/v/release/nick/753-Data-Sync?gitea_url=https%3A%2F%2Fgit.nickhepler.cloud%2F&style=for-the-badge&logo=Python)
![Enhancements](https://img.shields.io/gitea/issues/open/nick/753-Data-Sync?gitea_url=https%3A%2F%2Fgit.nickhepler.cloud%2F&labels=enhancement&style=for-the-badge&logo=Gitea&label=Enhancements)
![Defects](https://img.shields.io/gitea/issues/open/nick/753-Data-Sync?gitea_url=https%3A%2F%2Fgit.nickhepler.cloud%2F&labels=bug&style=for-the-badge&logo=Gitea&label=Defects)
---
## 🚀 Overview
This script fetches enforcement data from an external API, truncates a specified ArcGIS Online feature layer, and adds the fetched records to that layer as features. It also logs each run, saves the fetched data to JSON files, and can purge old files. It can also reload data from a saved JSON file without calling the API.
---
## 📦 Requirements
- Python 3.6 or higher (if using the Python script)
- Required packages in `requirements.txt`
- `.env` file with your configuration
- ArcGIS Online credentials
---
## ⚙️ Installation
### Python Script
```bash
pip install -r requirements.txt
```
Or install packages individually:
```bash
pip install requests python-dotenv
```
### Windows Executable
A Windows executable is available for users who prefer not to run the script directly. You can download it from the [releases page](https://git.nickhepler.cloud/nick/753-Data-Sync/releases). It is built with PyInstaller and runs without a separate installation of Python or any dependencies.
---
## ⚙️ Configuration
Create a `.env` file in the root of your project:
```env
API_URL=https://example.com/api
AGOL_USER=your_username
AGOL_PASSWORD=your_password
HOSTNAME=www.arcgis.com
INSTANCE=your_instance
FS=your_feature_service
LAYER=0
LOG_LEVEL=DEBUG
PURGE_DAYS=5
```
### Required Variables
| Variable | Description |
|----------------|--------------------------------------------|
| `API_URL` | The API endpoint to fetch data from |
| `AGOL_USER` | ArcGIS Online username |
| `AGOL_PASSWORD`| ArcGIS Online password |
| `HOSTNAME` | ArcGIS host (e.g., `www.arcgis.com`) |
| `INSTANCE` | ArcGIS REST instance path |
| `FS` | Feature service name |
| `LAYER` | Feature layer ID or name |
### Optional Variables
| Variable | Description |
|----------------|--------------------------------------------|
| `LOG_LEVEL` | Log level (`DEBUG`, `INFO`, etc.) |
| `PURGE_DAYS` | Number of days to retain log and JSON files |
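
The tables above can be turned into a small validation step that fails early when a required variable is missing. The following is a minimal sketch, not the script's actual loader: `load_config` is a hypothetical helper, and the fallback values for `LOG_LEVEL` and `PURGE_DAYS` are assumptions, since this README does not state the real defaults.

```python
import os

# Required and optional names are taken from the tables above.
REQUIRED = ("API_URL", "AGOL_USER", "AGOL_PASSWORD", "HOSTNAME", "INSTANCE", "FS", "LAYER")
# Assumed fallbacks; the real script may use different defaults.
OPTIONAL_DEFAULTS = {"LOG_LEVEL": "INFO", "PURGE_DAYS": "5"}


def load_config(env=None):
    """Return a settings dict, raising ValueError if a required variable is missing."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing required variables: " + ", ".join(missing))
    config = {name: env[name] for name in REQUIRED}
    for name, default in OPTIONAL_DEFAULTS.items():
        config[name] = env.get(name) or default
    # PURGE_DAYS is used as a number of days, so convert it once here.
    config["PURGE_DAYS"] = int(config["PURGE_DAYS"])
    return config
```

With `python-dotenv` installed, calling `load_dotenv()` before `load_config()` would populate `os.environ` from the `.env` file.
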
---
## 🧪 Script Usage
### Python Script
```bash
python 753DataSync.py --results_per_page 100
```
### Windows Executable
Double-click the executable to run it with the default settings, or run it from the command line:
```bash
753DataSync.exe --results_per_page 100
```
### CLI Arguments
| Argument | Description |
|----------------------|---------------------------------------------|
| `--results_per_page` | Optional. Number of results per API call (default: `100`) |
| `--test` | Optional. If set, only fetch the first page of results. |
| `--reload` | Optional. Load data from a specified JSON file instead of fetching from the API. |
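
For readers who want to see how these three flags fit together, here is a minimal `argparse` sketch. The flag names and the default of `100` come from the table above; `build_parser`, the help strings, and the `JSON_FILE` placeholder are illustrative assumptions and may not match the script's own parser.

```python
import argparse


def build_parser():
    """Build a parser for the three documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(description="Sync enforcement data to ArcGIS Online.")
    parser.add_argument(
        "--results_per_page",
        type=int,
        default=100,
        help="Number of results per API call (default: 100).",
    )
    parser.add_argument(
        "--test",
        action="store_true",
        help="Fetch only the first page of results.",
    )
    parser.add_argument(
        "--reload",
        metavar="JSON_FILE",
        default=None,
        help="Load data from this JSON file instead of calling the API.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```
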
---
## 📋 Functionality
1. **🔁 Truncate Layer** — Clears existing ArcGIS features.
2. **🌐 Fetch Data** — Retrieves paginated data from the API.
3. **💾 Save Data** — Writes each page to a time-stamped JSON file.
4. **📦 Aggregate Data** — Combines all pages into one file.
5. **📤 Add Features** — Sends data to ArcGIS feature layer.
6. **🧹 File Cleanup** — Deletes `.json`/`.log` files older than `PURGE_DAYS`.
7. **📑 Dynamic Logs** — Logs saved to `753DataSync_YYYY-MM-DD.log`.
8. **🧪 Test Mode** — Use the `--test` flag to fetch only the first page of results for testing purposes.
9. **🔄 Reload Data** — Use the `--reload` flag to truncate the feature layer and load data from a specified JSON file.
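
The fetch, save, and aggregate steps above can be sketched as a single loop. This is a hedged illustration rather than the script's implementation: `sync_pages` and the `fetch_page` callable are hypothetical, and the loop assumes the API returns an empty list once the pages are exhausted. The file names follow the pattern shown under Example Output.

```python
import json
from datetime import datetime
from pathlib import Path


def sync_pages(fetch_page, results_per_page=100, test=False, data_dir="data"):
    """Fetch pages until one comes back empty, saving each page and an aggregate file.

    fetch_page(page, results_per_page) is a caller-supplied function (hypothetical)
    that returns a list of records for that page.
    """
    out = Path(data_dir)
    out.mkdir(parents=True, exist_ok=True)
    all_records = []
    page = 1
    while True:
        records = fetch_page(page, results_per_page)
        if not records:  # assumed end-of-data signal
            break
        stamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
        page_file = out / f"enforcement_page_{page}_results_{results_per_page}_{stamp}.json"
        page_file.write_text(json.dumps(records), encoding="utf-8")
        all_records.extend(records)
        if test:  # test mode stops after the first page
            break
        page += 1
    stamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    aggregate_file = out / f"aggregated_enforcement_results_{stamp}.json"
    aggregate_file.write_text(json.dumps(all_records), encoding="utf-8")
    return all_records, aggregate_file
```

Passing the fetcher in as an argument keeps the loop free of network code, so it can be exercised with canned pages before pointing it at the real API.
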
---
## 📁 Example Output
```text
📁 data/
├── enforcement_page_1_results_100_2025-03-26_14-30-45.json
├── enforcement_page_2_results_100_2025-03-26_14-31-10.json
└── aggregated_enforcement_results_2025-03-26_14-31-15.json
📄 753DataSync_2025-03-26.log
```
---
## 📝 Example Log
```text
2025-03-26 14:30:45 - INFO - Attempting to truncate layer...
2025-03-26 14:30:51 - INFO - Fetching page 1 from API...
2025-03-26 14:30:55 - INFO - Saved data to data/enforcement_page_1_results_100_...
2025-03-26 14:30:57 - INFO - Aggregated data saved.
2025-03-26 14:31:00 - INFO - Features added successfully.
2025-03-26 14:31:01 - INFO - Deleted old log: 753DataSync_2025-03-19.log
```
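
The last log line above comes from the cleanup step. A minimal sketch of such a purge is shown below. It assumes file age is judged by modification time, which this README does not confirm (the script could equally parse the date from the file name), and `purge_old_files` is a hypothetical helper.

```python
import time
from pathlib import Path


def purge_old_files(directory, purge_days, now=None):
    """Delete .json and .log files in `directory` older than `purge_days` days.

    Age is measured from the file's modification time (an assumption).
    Returns the sorted names of the deleted files.
    """
    now = time.time() if now is None else now
    cutoff = now - purge_days * 86400  # seconds per day
    deleted = []
    for path in Path(directory).iterdir():
        if not path.is_file() or path.suffix not in {".json", ".log"}:
            continue
        if path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path.name)
    return sorted(deleted)
```
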
---
## 🛠 Troubleshooting
- Set `LOG_LEVEL=DEBUG` in `.env` for detailed logs.
- Ensure `.env` has no syntax errors.
- Make sure your ArcGIS account has permission to truncate and write to the layer.
- Check for internet/API access and expired ArcGIS tokens.
- Logs are written to both console and daily log files.
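
To illustrate the last point, the sketch below sets up one logger that writes to both the console and a daily file named like `753DataSync_YYYY-MM-DD.log`, using the line format seen in the Example Log. `configure_logging` and the logger name are assumptions made for illustration; the script's own setup may differ.

```python
import logging
from datetime import date
from pathlib import Path


def configure_logging(level="INFO", log_dir="."):
    """Attach a console handler and a daily file handler to one logger."""
    log_path = Path(log_dir) / f"753DataSync_{date.today().isoformat()}.log"
    logger = logging.getLogger("753DataSync")  # logger name is an assumption
    # Fall back to INFO if LOG_LEVEL holds an unknown name.
    logger.setLevel(getattr(logging, level.upper(), logging.INFO))
    logger.handlers.clear()  # avoid duplicate lines if called twice
    formatter = logging.Formatter(
        "%(asctime)s - %(levelname)s - %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
    )
    for handler in (logging.StreamHandler(), logging.FileHandler(log_path, encoding="utf-8")):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger, log_path
```

Calling `configure_logging("DEBUG")` mirrors setting `LOG_LEVEL=DEBUG` in `.env`.
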
---
## 🧪 Testing
Currently, the script is tested manually. Automated testing may be added under a `/tests` folder in the future.
---
## 📖 Usage Examples
```bash
# Run with default page size
python 753DataSync.py
# Run with custom page size
python 753DataSync.py --results_per_page 50
# Run the Windows executable with default page size
753DataSync.exe
# Run the Windows executable with custom page size
753DataSync.exe --results_per_page 50
```
---
## 💬 Support
Found a bug or want to request a feature?
[Open an issue](https://git.nickhepler.cloud/nick/753-Data-Sync/issues) or contact [@nick](https://git.nickhepler.cloud/nick) directly.
---
## 📜 License
This project is licensed under the [GNU General Public License v3.0](LICENSE).
> 💡 *You are free to use, modify, and share this project as long as you preserve the same license in your changes.*