Improve README file.
parent 6f412348fe
commit ed8647df43
190 README.md
@@ -1,126 +1,170 @@

# 753 Data Sync

*A Python-based data ingestion tool for syncing enforcement data from a public API to ArcGIS Online.*

This script fetches enforcement data from an external API, truncates a specified feature layer in ArcGIS, and adds the fetched data as features to the layer. The script performs the following tasks:

---

- **Truncate** the specified layer in ArcGIS to clear any previous features before adding new ones.
- **Fetch** data from an API in paginated form.
- **Save** data from each API response to individual JSON files.
- **Aggregate** all data from all pages into one JSON file.
- **Add** the aggregated data as features to an ArcGIS feature service.

## 🚀 Overview

## Requirements
This script fetches enforcement data from an external API, truncates a specified feature layer in ArcGIS, and adds the fetched data as features to the layer. It also logs the operation, saves data to JSON files, and optionally purges old files.

---

## 📦 Requirements

- Python 3.6 or higher
- Required Python packages (see `requirements.txt`)
- ArcGIS Online credentials (username and password)
- `.env` file for configuration (see below for details)
- Required packages in `requirements.txt`
- `.env` file with your configuration
- ArcGIS Online credentials

## Install Dependencies

---

To install the required dependencies, use the following command:

## 🔧 Installation

```bash
pip install -r requirements.txt
```

Alternatively, you can install the necessary packages individually:
Or install packages individually:

```bash
pip install requests
pip install python-dotenv
pip install requests python-dotenv
```

## Configuration

---

Before running the script, you need to configure some environment variables. Create a `.env` file in the root of your project with the following details:

## ⚙️ Configuration

Create a `.env` file in the root of your project:

```env
API_URL=your_api_url
AGOL_USER=your_arcgis_online_username
AGOL_PASSWORD=your_arcgis_online_password
HOSTNAME=your_arcgis_host
INSTANCE=your_arcgis_instance
API_URL=https://example.com/api
AGOL_USER=your_username
AGOL_PASSWORD=your_password
HOSTNAME=www.arcgis.com
INSTANCE=your_instance
FS=your_feature_service
LAYER=your_layer_id
LOG_LEVEL=your_log_level # e.g., DEBUG, INFO, WARNING, ERROR, CRITICAL
LAYER=0
LOG_LEVEL=DEBUG
PURGE_DAYS=5
```

### Environment Variables:
### Required Variables

- **API_URL**: The URL of the API you are fetching data from.
- **AGOL_USER**: Your ArcGIS Online username.
- **AGOL_PASSWORD**: Your ArcGIS Online password.
- **HOSTNAME**: The hostname of your ArcGIS Online instance (e.g., `www.arcgis.com`).
- **INSTANCE**: The instance name of your ArcGIS Online service.
- **FS**: The name of the feature service you are working with.
- **LAYER**: The ID or name of the layer to truncate and add features to.
- **LOG_LEVEL**: The desired logging level (e.g., `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`).

| Variable        | Description                          |
|-----------------|--------------------------------------|
| `API_URL`       | The API endpoint to fetch data from  |
| `AGOL_USER`     | ArcGIS Online username               |
| `AGOL_PASSWORD` | ArcGIS Online password               |
| `HOSTNAME`      | ArcGIS host (e.g., `www.arcgis.com`) |
| `INSTANCE`      | ArcGIS REST instance path            |
| `FS`            | Feature service name                 |
| `LAYER`         | Feature layer ID or name             |

## Script Usage
### Optional Variables

You can run the script with the following command:

| Variable | Description |
|----------------|--------------------------------------------|
| `LOG_LEVEL` | Log level (`DEBUG`, `INFO`, etc.) |
| `PURGE_DAYS` | Number of days to retain logs and JSONs |
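
The variables above could be validated at startup along these lines. This is a minimal sketch, not the script's actual code: `load_config` is a hypothetical helper, it takes any mapping (for example `os.environ` after `python-dotenv`'s `load_dotenv()` has run), and the defaults for `LOG_LEVEL` and `PURGE_DAYS` are assumptions.

```python
import os

# Names taken from the "Required Variables" table.
REQUIRED = ("API_URL", "AGOL_USER", "AGOL_PASSWORD", "HOSTNAME", "INSTANCE", "FS", "LAYER")


def load_config(env=None):
    """Check required settings and fill in assumed defaults for optional ones."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("Missing required variables: " + ", ".join(missing))
    config = {name: env[name] for name in REQUIRED}
    config["LOG_LEVEL"] = env.get("LOG_LEVEL", "INFO").upper()  # assumed default
    config["PURGE_DAYS"] = int(env.get("PURGE_DAYS", "5"))  # assumed default
    return config
```

Failing fast on a missing variable keeps configuration mistakes from surfacing later as an ArcGIS or API error.
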

---

## 🧪 Script Usage

```bash
python 753DataSync.py --results_per_page <number_of_results_per_page>
python 753DataSync.py --results_per_page 100
```

### Arguments:
### CLI Arguments

- `--results_per_page` (optional): The number of results to fetch per page (default: 100).

| Argument | Description |
|----------------------|---------------------------------------------|
| `--results_per_page` | Optional. Number of results per API call (default: `100`) |
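
For illustration, an argument like this is typically declared with the standard library's `argparse`. The sketch below assumes a hypothetical `parse_args` helper and help text; only the flag name and its default of `100` come from the README.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; argv=None falls back to sys.argv[1:]."""
    parser = argparse.ArgumentParser(
        description="Sync enforcement data from an API to ArcGIS Online."
    )
    parser.add_argument(
        "--results_per_page",
        type=int,
        default=100,
        help="Number of results per API call (default: 100)",
    )
    return parser.parse_args(argv)
```
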

## Functionality

---

### 1. **Truncate Layer**:
Before fetching and adding any new data, the script will call the `truncate` function to clear out any existing features from the specified layer. This ensures that the feature layer is empty and ready for the new data.

## 📋 Functionality

### 2. **Fetch Data**:
The script will then fetch data from the specified API in pages. Each page is fetched sequentially until all data is retrieved.

1. **🔁 Truncate Layer** — Clears existing ArcGIS features.
2. **🌐 Fetch Data** — Retrieves paginated data from the API.
3. **💾 Save Data** — Writes each page to a time-stamped JSON file.
4. **📦 Aggregate Data** — Combines all pages into one file.
5. **📤 Add Features** — Sends data to ArcGIS feature layer.
6. **🧹 File Cleanup** — Deletes `.json`/`.log` files older than `PURGE_DAYS`.
7. **📑 Dynamic Logs** — Logs saved to `753DataSync_YYYY-MM-DD.log`.
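
The file cleanup step can be sketched as follows. This is an illustration under stated assumptions, not the script's implementation: `purge_old_files` is a hypothetical helper, and file age is judged by modification time, which the README does not specify.

```python
import time
from pathlib import Path


def purge_old_files(directory, purge_days, suffixes=(".json", ".log"), now=None):
    """Delete matching files whose modification time is older than purge_days."""
    now = time.time() if now is None else now
    cutoff = now - purge_days * 86400  # seconds in a day
    deleted = []
    for path in Path(directory).iterdir():
        if path.is_file() and path.suffix in suffixes and path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path.name)
    return sorted(deleted)
```
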

### 3. **Save Data**:
Data from each page will be saved to an individual JSON file, with the filename including the page number and timestamp. The aggregated data (all pages combined) is saved to a separate file.

---

### 4. **Add Features**:
After all the data has been fetched and saved, the script will send the aggregated data as features to the specified ArcGIS feature layer.
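
The fetch, save, and aggregate steps can be sketched as a single loop. Several things here are assumptions: `sync_pages` and the `fetch_page(page, results_per_page)` callable are hypothetical stand-ins for the real HTTP request, and an empty page is taken to mean the end of the data. The file names follow the pattern shown in the example output.

```python
import json
from datetime import datetime
from pathlib import Path


def sync_pages(fetch_page, results_per_page=100, out_dir="data"):
    """Fetch pages until one comes back empty, saving each page and the total."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    all_records = []
    page = 1
    while True:
        records = fetch_page(page, results_per_page)
        if not records:  # assumed stop condition: empty page
            break
        stamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
        page_file = out / f"enforcement_page_{page}_results_{results_per_page}_{stamp}.json"
        page_file.write_text(json.dumps(records), encoding="utf-8")
        all_records.extend(records)
        page += 1
    stamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    agg_file = out / f"aggregated_enforcement_results_{stamp}.json"
    agg_file.write_text(json.dumps(all_records), encoding="utf-8")
    return all_records, agg_file
```

Passing the fetcher in as a callable keeps the loop testable without network access; in the real script that callable would wrap the request to `API_URL`.
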

## 📁 Example Output

## Example Output
```bash
📁 data/
├── enforcement_page_1_results_100_2025-03-26_14-30-45.json
├── enforcement_page_2_results_100_2025-03-26_14-31-10.json
└── aggregated_enforcement_results_2025-03-26_14-31-15.json

- Individual page files are saved in the `data/` directory with filenames like `enforcement_page_1_results_100_2025-03-26_14-30-45.json`.
- The aggregated file is saved as `aggregated_enforcement_results_2025-03-26_14-30-45.json`.
📄 753DataSync_2025-03-26.log
```

Logs will also be generated in the `753DataSync.log` file and printed to the console.

---

## Example Output (Log)
## 📝 Example Log

```text
2025-03-26 14:30:45 - INFO - Attempting to truncate layer on https://www.arcgis.com/...
2025-03-26 14:30:50 - INFO - Successfully truncated layer: https://www.arcgis.com/...
2025-03-26 14:30:51 - INFO - Making request to: https://api.example.com/1/100
2025-03-26 14:30:55 - INFO - Data saved to data/enforcement_page_1_results_100_2025-03-26_14-30-45.json
2025-03-26 14:30:56 - INFO - No more data to fetch, stopping pagination.
2025-03-26 14:30:57 - INFO - Data saved to data/aggregated_enforcement_results_2025-03-26_14-30-45.json
2025-03-26 14:30:45 - INFO - Attempting to truncate layer...
2025-03-26 14:30:51 - INFO - Fetching page 1 from API...
2025-03-26 14:30:55 - INFO - Saved data to data/enforcement_page_1_results_100_...
2025-03-26 14:30:57 - INFO - Aggregated data saved.
2025-03-26 14:31:00 - INFO - Features added successfully.
2025-03-26 14:31:01 - INFO - Deleted old log: 753DataSync_2025-03-19.log
```

## Error Handling

---

The script handles errors gracefully, including:

## 🛠 Troubleshooting

- If an error occurs while fetching data, the script will log the error and stop execution.
- If the `truncate` or `add_features` operations fail, the script will log the error and stop execution.
- The script handles HTTP errors and network-related errors gracefully, ensuring that any issues are logged with detailed information.
- Set `LOG_LEVEL=DEBUG` in `.env` for detailed logs.
- Ensure `.env` has no syntax errors.
- Make sure your ArcGIS layer has permission for truncation and writes.
- Check for internet/API access and expired ArcGIS tokens.
- Logs are written to both console and daily log files.
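
Logging to both the console and a daily file named `753DataSync_YYYY-MM-DD.log` could be set up roughly like this. The sketch assumes a hypothetical `setup_logging` helper and logger name; the message format mirrors the example log, but the script's real configuration may differ.

```python
import logging
from datetime import date
from pathlib import Path


def setup_logging(level="INFO", log_dir=".", today=None):
    """Attach a console handler and a dated file handler to one logger."""
    today = today or date.today()
    log_path = Path(log_dir) / f"753DataSync_{today:%Y-%m-%d}.log"
    logger = logging.getLogger("753DataSync")  # logger name is an assumption
    logger.setLevel(getattr(logging, level.upper(), logging.INFO))
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    formatter = logging.Formatter(
        "%(asctime)s - %(levelname)s - %(message)s", datefmt="%Y-%m-%d %H:%M:%S"
    )
    for handler in (logging.StreamHandler(), logging.FileHandler(log_path, encoding="utf-8")):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger, log_path
```
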

## Troubleshooting

---

- If the script unexpectedly stops, check the logs (`753DataSync.log`) for detailed error information.
- Ensure the `.env` file is correctly configured with valid credentials and API URL.
- Confirm that your ArcGIS layer has the correct permissions to allow truncation and feature addition.
- If you encounter network issues, make sure your system has proper internet access and that the API endpoint is available.
- For debugging, ensure that you have set the `LOG_LEVEL` to `DEBUG` in your `.env` file for detailed logs.

## 🧪 Testing

## License
Currently, the script is tested manually. Automated testing may be added under a `/tests` folder in the future.

This project is licensed under the [GNU General Public License v3.0](LICENSE), which allows you to freely use, modify, and distribute the code, provided that you include the same license in derivative works.

---

## 📖 Usage Examples

```bash
# Run with default page size
python 753DataSync.py

# Run with custom page size
python 753DataSync.py --results_per_page 50
```

---

## 💬 Support

Found a bug or want to request a feature?
[Open an issue](https://git.nickhepler.cloud/nick/753-Data-Sync/issues) or contact [@nick](https://git.nickhepler.cloud/nick) directly.

---

## 📜 License

This project is licensed under the [GNU General Public License v3.0](LICENSE).

> 💡 *You are free to use, modify, and share this project as long as you preserve the same license in your changes.*