Compare commits
No commits in common. "ab6a2f0ec88af9de1cfd8df31ad2dbe182a01f43" and "6bbf29493c89fd6ac72ba486f96ef43f0f1eb5a8" have entirely different histories.
ab6a2f0ec8...6bbf29493c
67 README.md
@@ -1,14 +1,12 @@

# 753 Data Sync

This script fetches enforcement data from an external API, truncates a specified feature layer in ArcGIS, and adds the fetched data as features to the layer. The script performs the following tasks:

- **Truncate** the specified layer in ArcGIS to clear any previous features before adding new ones.
- **Fetch** data from an API in paginated form.
- **Save** data from each API response to individual JSON files.
- **Aggregate** all data from all pages into one JSON file.
- **Add** the aggregated data as features to an ArcGIS feature service.
1. **Truncate the specified layer** in ArcGIS to clear any previous features before adding new ones.
2. **Fetch data** from an API in paginated form.
3. **Save data** from each API response to individual JSON files.
4. **Aggregate all data** from all pages into one JSON file.
5. **Add the aggregated data** as features to an ArcGIS feature service.

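The fetch and aggregate steps above hinge on a page-by-page loop. A minimal sketch, assuming a hypothetical `fetch_page(page_number, results_per_page)` callable that stands in for the real HTTP request, and using the stop rule from `app.py` (a page shorter than `results_per_page` is the last one):

```python
from typing import Callable, List


def fetch_all_pages(fetch_page: Callable[[int, int], List[dict]],
                    results_per_page: int = 100) -> List[dict]:
    """Collect records from every page into one list.

    `fetch_page` is a hypothetical stand-in for the real API call; it takes
    (page_number, results_per_page) and returns a list of records.
    """
    all_data: List[dict] = []
    page_number = 1
    while True:
        page = fetch_page(page_number, results_per_page)
        all_data.extend(page)
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < results_per_page:
            break
        page_number += 1
    return all_data


# Illustrative in-memory "API" holding 5 records, served 2 per page.
records = [{"id": i} for i in range(5)]


def fake_fetch(page_number: int, results_per_page: int) -> List[dict]:
    start = (page_number - 1) * results_per_page
    return records[start:start + results_per_page]


aggregated = fetch_all_pages(fake_fetch, results_per_page=2)
```

With this stop rule, a total that is an exact multiple of the page size costs one extra request, which returns an empty page.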
## Requirements

@@ -17,24 +15,17 @@ This script fetches enforcement data from an external API, truncates a specified
- ArcGIS Online credentials (username and password)
- `.env` file for configuration (see below for details)

## Install Dependencies
### Install dependencies

To install the required dependencies, use the following command:
You can install the required dependencies using `pip`:

```bash
pip install -r requirements.txt
```

Alternatively, you can install the necessary packages individually:

```bash
pip install requests
pip install python-dotenv
```

## Configuration

Before running the script, you need to configure some environment variables. Create a `.env` file in the root of your project with the following details:
Before running the script, you'll need to configure some environment variables. Create a `.env` file with the following details:

```env
API_URL=your_api_url
@@ -44,10 +35,9 @@ HOSTNAME=your_arcgis_host
INSTANCE=your_arcgis_instance
FS=your_feature_service
LAYER=your_layer_id
LOG_LEVEL=your_log_level # e.g., DEBUG, INFO, WARNING, ERROR, CRITICAL
```

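Because each of these settings feeds a URL or a login request, it can be worth checking them once at startup. A minimal sketch, assuming the variable names this script reads (`API_URL`, `AGOL_USER`, `AGOL_PASSWORD`, `HOSTNAME`, `INSTANCE`, `FS`, `LAYER`); `load_settings` and `REQUIRED` are hypothetical names, not part of `app.py`:

```python
import os
from typing import Dict, Mapping

# Variables the script cannot run without (LOG_LEVEL is optional).
REQUIRED = ("API_URL", "AGOL_USER", "AGOL_PASSWORD",
            "HOSTNAME", "INSTANCE", "FS", "LAYER")


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or raise if any is missing or empty."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED}


# Example with an explicit mapping instead of the real environment.
example_env = {name: f"value_for_{name.lower()}" for name in REQUIRED}
settings = load_settings(example_env)
```

Passing the mapping as a parameter keeps the check easy to exercise without touching the real environment.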
### Environment Variables:
### Variables

- **API_URL**: The URL of the API you are fetching data from.
- **AGOL_USER**: Your ArcGIS Online username.
@@ -56,7 +46,6 @@ LOG_LEVEL=your_log_level # e.g., DEBUG, INFO, WARNING, ERROR, CRITICAL
- **INSTANCE**: The instance name of your ArcGIS Online service.
- **FS**: The name of the feature service you are working with.
- **LAYER**: The ID or name of the layer to truncate and add features to.
- **LOG_LEVEL**: The desired logging level (e.g., `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`).

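`LOG_LEVEL` arrives as a plain string, so it has to be mapped onto one of the `logging` constants. One compact way to do that is sketched below, assuming unknown values should fall back to `INFO` (which is what the `if`/`elif` chain in `app.py` does); `resolve_log_level` is a hypothetical helper name:

```python
import logging


def resolve_log_level(name: str, default: int = logging.INFO) -> int:
    """Translate a level name such as 'debug' into a logging constant."""
    level = getattr(logging, name.strip().upper(), None)
    # getattr can also return non-level attributes (strings, functions),
    # so only integer values are accepted.
    return level if isinstance(level, int) else default


logger = logging.getLogger("753DataSync.example")
logger.setLevel(resolve_log_level("debug"))
```

The fallback matters because calling `getattr(logging, name)` without a default raises `AttributeError` for a misspelled level.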
## Script Usage

@@ -66,34 +55,36 @@ You can run the script with the following command:
python 753DataSync.py --results_per_page <number_of_results_per_page>
```

### Arguments:
### Arguments

- `--results_per_page` (optional): The number of results to fetch per page (default: 100).
- **--results_per_page** (optional): The number of results to fetch per page (default: 100).

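For reference, an optional integer flag with a default of 100 is usually declared along these lines with `argparse`. This is a sketch only: the body of the script's own `parse_arguments` is not shown in this diff, and `parse_results_per_page` is a hypothetical name.

```python
import argparse
from typing import List, Optional


def parse_results_per_page(argv: Optional[List[str]] = None) -> int:
    """Parse --results_per_page, defaulting to 100 when it is omitted."""
    parser = argparse.ArgumentParser(description="753 Data Sync (example)")
    parser.add_argument(
        "--results_per_page",
        type=int,
        default=100,
        help="Number of results to fetch per page (default: 100)",
    )
    # argv=None makes argparse read sys.argv[1:], as a real script would.
    return parser.parse_args(argv).results_per_page


default_value = parse_results_per_page([])
custom_value = parse_results_per_page(["--results_per_page", "250"])
```

Declaring `type=int` lets `argparse` reject non-numeric input before any request is made.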
## Functionality

### 1. **Truncate Layer**:
Before fetching and adding any new data, the script will call the `truncate` function to clear out any existing features from the specified layer. This ensures that the feature layer is empty and ready for the new data.
1. **Truncate Layer**: Before fetching and adding any new data, the script will call the `truncate` function to clear out any existing features from the specified layer. This ensures that the feature layer is empty and ready for the new data.

### 2. **Fetch Data**:
The script will then fetch data from the specified API in pages. Each page is fetched sequentially until all data is retrieved.
2. **Fetch Data**: The script will then fetch data from the specified API in pages. Each page is fetched sequentially until all data is retrieved.

### 3. **Save Data**:
Data from each page will be saved to an individual JSON file, with the filename including the page number and timestamp. The aggregated data (all pages combined) is saved to a separate file.
3. **Save Data**: Data from each page will be saved to an individual JSON file, with the filename including the page number and timestamp. The aggregated data (all pages combined) is saved to a separate file.

### 4. **Add Features**:
After all the data has been fetched and saved, the script will send the aggregated data as features to the specified ArcGIS feature layer.
4. **Add Features**: After all the data has been fetched and saved, the script will send the aggregated data as features to the specified ArcGIS feature layer.

## Example Output
### Example Output

- Individual page files are saved in the `data/` directory with filenames like `enforcement_page_1_results_100_2025-03-26_14-30-45.json`.
- The aggregated file is saved as `aggregated_enforcement_results_2025-03-26_14-30-45.json`.

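File names in this shape can be produced with `datetime.strftime`. A small sketch: the helper names are hypothetical, while the format string `%Y-%m-%d_%H-%M-%S` and the `data/` prefix are the ones `app.py` uses.

```python
from datetime import datetime

STAMP_FORMAT = "%Y-%m-%d_%H-%M-%S"


def page_filename(page_number: int, results_per_page: int, when: datetime) -> str:
    """Build the per-page file name for a given timestamp."""
    stamp = when.strftime(STAMP_FORMAT)
    return f"data/enforcement_page_{page_number}_results_{results_per_page}_{stamp}.json"


def aggregated_filename(when: datetime) -> str:
    """Build the aggregated file name for a given timestamp."""
    stamp = when.strftime(STAMP_FORMAT)
    return f"data/aggregated_enforcement_results_{stamp}.json"


# A fixed timestamp keeps the example reproducible.
when = datetime(2025, 3, 26, 14, 30, 45)
first_page = page_filename(1, 100, when)
combined = aggregated_filename(when)
```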
Logs will also be generated in the `753DataSync.log` file and printed to the console.

## Error Handling

- If an error occurs while fetching data, the script will log the error and stop execution.
- If the `truncate` or `add_features` operations fail, the script will log the error and stop execution.
- The script handles HTTP errors and network-related errors gracefully.

## Example Output (Log)

```text
```
2025-03-26 14:30:45 - INFO - Attempting to truncate layer on https://www.arcgis.com/...
2025-03-26 14:30:50 - INFO - Successfully truncated layer: https://www.arcgis.com/...
2025-03-26 14:30:51 - INFO - Making request to: https://api.example.com/1/100
@@ -103,14 +94,6 @@ Logs will also be generated in the `753DataSync.log` file and printed to the con
2025-03-26 14:31:00 - INFO - Features added successfully.
```

## Error Handling

The script handles errors gracefully, including:

- If an error occurs while fetching data, the script will log the error and stop execution.
- If the `truncate` or `add_features` operations fail, the script will log the error and stop execution.
- The script handles HTTP errors and network-related errors gracefully, ensuring that any issues are logged with detailed information.

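The "log the error and stop execution" behaviour described above can be expressed as a small wrapper. A minimal sketch, assuming a hypothetical `run_step` helper; `app.py` itself inlines `try`/`except` blocks in each function rather than using a wrapper like this:

```python
import logging
import sys
from typing import Any, Callable

logger = logging.getLogger("753DataSync.example")


def run_step(description: str, step: Callable[[], Any]) -> Any:
    """Run one pipeline step; on any failure, log it and exit with status 1."""
    try:
        return step()
    except Exception as err:
        # logger.exception records the message plus the full traceback.
        logger.exception("%s failed: %s", description, err)
        sys.exit(1)


result = run_step("addition", lambda: 1 + 1)
```

Catching `Exception` rather than using a bare `except` leaves `KeyboardInterrupt` and `SystemExit` untouched, so Ctrl+C still stops the run normally.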
## Troubleshooting

- If the script stops unexpectedly, check the logs (`753DataSync.log`) for detailed error information.
@@ -119,4 +102,4 @@ The script handles errors gracefully, including:

## License

This project is licensed under the GNU General Public License v3.0 or later - see the [LICENSE](LICENSE) file for details.
This project is licensed under the **GNU General Public License v3.0** or later - see the [LICENSE](LICENSE) file for details.
238 app.py
@@ -8,37 +8,20 @@ import argparse
import urllib.parse
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv("753DataSync.env")

# Configuration
BASE_URL = "{}/{}/{}"
log_level = os.getenv('LOG_LEVEL', 'INFO').upper()

# Setup logging
logger = logging.getLogger()

# Set the log level for the logger
if log_level == 'DEBUG':
    logger.setLevel(logging.DEBUG)
elif log_level == 'INFO':
    logger.setLevel(logging.INFO)
elif log_level == 'WARNING':
    logger.setLevel(logging.WARNING)
elif log_level == 'ERROR':
    logger.setLevel(logging.ERROR)
elif log_level == 'CRITICAL':
    logger.setLevel(logging.CRITICAL)
else:
    logger.setLevel(logging.INFO)
logger.setLevel(logging.INFO)

# File handler
file_handler = logging.FileHandler('753DataSync.log')
file_handler.setLevel(getattr(logging, log_level))
file_handler.setLevel(logging.INFO)

# Stream handler (console output)
stream_handler = logging.StreamHandler(sys.stdout)
stream_handler.setLevel(getattr(logging, log_level))
stream_handler.setLevel(logging.INFO)

# Log format
formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
@@ -54,30 +37,23 @@ def fetch_data(api_url, page_number, results_per_page):
    url = BASE_URL.format(api_url, page_number, results_per_page)

    try:
        logger.info(f"Making request to: {url} with page_number={page_number} and results_per_page={results_per_page}")
        logger.info(f"Making request to: {url}")
        response = requests.get(url)

        # Check for HTTP errors
        response.raise_for_status()

        # Success log
        logger.info(f"Successfully fetched data from {url}. Status code: {response.status_code}.")

        # Debug log with additional response details
        logger.debug(f"GET request to {url} completed with status code {response.status_code}. "
                     f"Response time: {response.elapsed.total_seconds()} seconds.")

        # Return JSON data
        return response.json()

    except requests.exceptions.HTTPError as http_err:
        logger.error(f"HTTP error occurred while fetching data from {url}: {http_err}")
        logger.error(f"HTTP error occurred: {http_err}")
        sys.exit(1)
    except requests.exceptions.RequestException as req_err:
        logger.error(f"Request error occurred while fetching data from {url}: {req_err}")
        logger.error(f"Request error occurred: {req_err}")
        sys.exit(1)
    except Exception as err:
        logger.exception(f"An unexpected error occurred while fetching data from {url}: {err}")
        logger.error(f"An unexpected error occurred: {err}")
        sys.exit(1)

def save_json(data, filename):
@@ -86,22 +62,15 @@ def save_json(data, filename):
        # Ensure directory exists
        if not os.path.exists('data'):
            os.makedirs('data')
            logger.info(f"Directory 'data' created.")


        # Save data to file
        with open(filename, 'w', encoding='utf-8') as f:
            json.dump(data, f, ensure_ascii=False, indent=4)

        logger.info(f"Data successfully saved to {filename}")
        logger.info(f"Data saved to {filename}")

    except OSError as e:
        logger.error(f"OS error occurred while saving JSON data to {filename}: {e}")
        sys.exit(1)
    except IOError as e:
        logger.error(f"I/O error occurred while saving JSON data to {filename}: {e}")
        sys.exit(1)
    except Exception as e:
        logger.error(f"Unexpected error occurred while saving JSON data to {filename}: {e}")
        logger.error(f"Error saving JSON data: {e}")
        sys.exit(1)

def parse_arguments():
@@ -127,36 +96,14 @@ def generate_token(username, password, url="https://www.arcgis.com/sharing/rest/
        'expiration': '120'
    }
    headers = {}

    try:
        logger.info(f"Generating token for username '{username}' using URL: {url}")
        response = requests.post(url, headers=headers, data=payload)

        # Log the request status and response time
        logger.debug(f"POST request to {url} completed with status code {response.status_code}. "
                     f"Response time: {response.elapsed.total_seconds()} seconds.")

        response.raise_for_status() # Raise an error for bad status codes

        # Extract token from the response
        token = response.json().get('token')

        if token:
            logger.info("Token generated successfully.")
        else:
            logger.error("Token not found in the response.")
            sys.exit(1)

        token = response.json()['token']
        logger.info("Token generated successfully.")
        return token

    except requests.exceptions.RequestException as e:
        logger.error(f"Error generating token for username '{username}': {e}")
        sys.exit(1)
    except KeyError as e:
        logger.error(f"Error extracting token from the response: Missing key {e}")
        sys.exit(1)
    except Exception as e:
        logger.exception(f"Unexpected error generating token for username '{username}': {e}")
        logger.error(f"Error generating token: {e}")
        sys.exit(1)

def truncate(token, hostname, instance, fs, layer, secure=True):
@@ -166,17 +113,10 @@ def truncate(token, hostname, instance, fs, layer, secure=True):
    url = f"{protocol}{hostname}/{instance}/arcgis/rest/admin/services/{fs}/FeatureServer/{layer}/truncate?token={token}&async=true&f=json"

    try:
        # Attempt the POST request
        logging.info(f"Attempting to truncate layer {layer} on {hostname}...")

        # Debug logging for the URL being used
        logging.debug(f"Truncate URL: {url}")

        response = requests.post(url, timeout=30)

        # Log response time
        logging.debug(f"POST request to {url} completed with status code {response.status_code}. "
                      f"Response time: {response.elapsed.total_seconds()} seconds.")

        # Check for HTTP errors
        response.raise_for_status() # Raise an exception for HTTP errors (4xx, 5xx)

@@ -184,30 +124,28 @@ def truncate(token, hostname, instance, fs, layer, secure=True):
        if response.status_code == 200:
            result = response.json()
            if 'error' in result:
                logging.error(f"Error truncating layer {layer}: {result['error']}")
                logging.error(f"Error truncating layer: {result['error']}")
                return None
            logging.info(f"Successfully truncated layer: {protocol}{hostname}/{instance}/arcgis/rest/admin/services/{fs}/FeatureServer/{layer}.")
            return result
        else:
            logging.error(f"Unexpected response for layer {layer}: {response.status_code} - {response.text}")
            logging.error(f"Unexpected response: {response.status_code} - {response.text}")
            return None

    except requests.exceptions.Timeout as e:
        logging.error(f"Request timed out while truncating layer {layer}: {e}")
        return None
    except requests.exceptions.RequestException as e:
        logging.error(f"Request failed while truncating layer {layer}: {e}")
        # Catch network-related errors, timeouts, etc.
        logging.error(f"Request failed: {e}")
        return None
    except Exception as e:
        logging.error(f"An unexpected error occurred while truncating layer {layer}: {e}")
        # Catch any other unexpected errors
        logging.error(f"An unexpected error occurred: {e}")
        return None

def add_features(token, hostname, instance, fs, layer, aggregated_data, secure=True):
    """Add features to a feature service."""
    protocol = 'https://' if secure else 'http://'
    url = f"{protocol}{hostname}/{instance}/arcgis/rest/services/{fs}/FeatureServer/{layer}/addFeatures?token={token}&rollbackOnFailure=true&f=json"

    logger.info(f"Attempting to add features to {protocol}{hostname}/{instance}/arcgis/rest/services/{fs}/FeatureServer/{layer}...")
    logger.info(f"Attempting to add features on {protocol}{hostname}/{instance}/arcgis/rest/services/{fs}/FeatureServer/{layer}...")

    # Prepare features data as the payload
    features_json = json.dumps(aggregated_data) # Convert aggregated data to JSON string
@@ -221,119 +159,73 @@ def add_features(token, hostname, instance, fs, layer, aggregated_data, secure=T
    }

    try:
        # Log request details (but avoid logging sensitive data)
        logger.debug(f"Request URL: {url}")
        logger.debug(f"Payload size: {len(features_json)} characters")

        response = requests.post(url, headers=headers, data=payload, timeout=180)

        # Log the response time and status code
        logger.debug(f"POST request to {url} completed with status code {response.status_code}. "
                     f"Response time: {response.elapsed.total_seconds()} seconds.")

        response.raise_for_status() # Raise an error for bad status codes

        logger.info("Features added successfully.")

        # Log any successful response details
        if response.status_code == 200:
            logger.debug(f"Response JSON size: {len(response.text)} characters.")

        return response.json()

    except requests.exceptions.Timeout as e:
        logger.error(f"Request timed out while adding features: {e}")
        return {'error': 'Request timed out'}

    except requests.exceptions.RequestException as e:
        logger.error(f"Request error occurred while adding features: {e}")
        logger.error(f"Request error: {e}")
        return {'error': str(e)}

    except json.JSONDecodeError as e:
        logger.error(f"Error decoding JSON response while adding features: {e}")
        logger.error(f"Error decoding JSON response: {e}")
        return {'error': 'Invalid JSON response'}

    except Exception as e:
        logger.error(f"An unexpected error occurred while adding features: {e}")
        return {'error': str(e)}

def main():
    """Main entry point for the script."""
    try:
        logger.info("Starting script execution.")
    # Parse command-line arguments
    results_per_page = parse_arguments()

        # Parse command-line arguments
        results_per_page = parse_arguments()
        logger.info(f"Parsed arguments: results_per_page={results_per_page}")
    load_dotenv("753DataSync.env")
    api_url = os.getenv('API_URL')

        # Load environment variables
        logger.info("Loading environment variables.")
        load_dotenv("753DataSync.env")
        api_url = os.getenv('API_URL')
        if not api_url:
            logger.error("API_URL environment variable not found.")
            return
    # Generate the token
    username = os.getenv('AGOL_USER')
    password = os.getenv('AGOL_PASSWORD')
    token = generate_token(username, password)

        # Generate the token
        username = os.getenv('AGOL_USER')
        password = os.getenv('AGOL_PASSWORD')
        if not username or not password:
            logger.error("Missing AGOL_USER or AGOL_PASSWORD in environment variables.")
            return
        token = generate_token(username, password)
    # Set ArcGIS host details
    hostname = os.getenv('HOSTNAME')
    instance = os.getenv('INSTANCE')
    fs = os.getenv('FS')
    layer = os.getenv('LAYER')

        # Set ArcGIS host details
        hostname = os.getenv('HOSTNAME')
        instance = os.getenv('INSTANCE')
        fs = os.getenv('FS')
        layer = os.getenv('LAYER')
    # Truncate the layer before adding new features
    truncate(token, hostname, instance, fs, layer)

        # Truncate the layer before adding new features
        truncate(token, hostname, instance, fs, layer)
    all_data = []
    page_number = 1

        all_data = []
        page_number = 1
    while True:
        # Fetch data from the API
        data = fetch_data(api_url, page_number, results_per_page)

        while True:
            try:
                # Fetch data from the API
                data = fetch_data(api_url, page_number, results_per_page)
        # Append features data to the aggregated list
        all_data.extend(data) # Data is now a list of features

                # Append features data to the aggregated list
                all_data.extend(data)
        # Generate filename with timestamp for the individual page
        timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
        page_filename = f"data/enforcement_page_{page_number}_results_{results_per_page}_{timestamp}.json"

        # Save individual page data
        save_json(data, page_filename)

                timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
                page_filename = f"data/enforcement_page_{page_number}_results_{results_per_page}_{timestamp}.json"

                # Save individual page data
                if log_level == 'DEBUG':
                    save_json(data, page_filename)
        # Check if the number of records is less than the results_per_page, indicating last page
        if len(data) < results_per_page:
            logger.info("No more data to fetch, stopping pagination.")
            break

                # Check if the number of records is less than the results_per_page, indicating last page
                if len(data) < results_per_page:
                    logger.info("No more data to fetch, stopping pagination.")
                    break
        page_number += 1

                page_number += 1
            except Exception as e:
                logger.error(f"Error fetching or saving data for page {page_number}: {e}", exc_info=True)
                break
    # Prepare aggregated data
    aggregated_data = all_data # Just use the collected features directly

        # Prepare aggregated data
        aggregated_data = all_data # Just use the collected features directly
    # Save aggregated data to a single JSON file
    aggregated_filename = f"data/aggregated_enforcement_results_{timestamp}.json"
    save_json(aggregated_data, aggregated_filename)

        # Save aggregated data to a single JSON file
        aggregated_filename = f"data/aggregated_enforcement_results_{timestamp}.json"
        logger.info(f"Saving aggregated data to {aggregated_filename}.")
        save_json(aggregated_data, aggregated_filename)

        # Add the features to the feature layer
        response = add_features(token, hostname, instance, fs, layer, aggregated_data)
    except Exception as e:
        logger.error(f"An unexpected error occurred: {e}", exc_info=True)
        return
    finally:
        logger.info("Script execution completed.")
    # Add the features to the feature layer
    response = add_features(token, hostname, instance, fs, layer, aggregated_data)
    logger.info(f"Add features response: {json.dumps(response, indent=2)}")

if __name__ == "__main__":
    main()
    main()