Enhancement: Configurable Log and Data Retention #5

Closed
opened 2025-04-14 10:29:58 -04:00 by nick · 1 comment
Owner

Description

Enhance the 753 Data Sync script to dynamically name the log file based on the current date and purge old log/data files based on an environment variable (PURGE_DAYS). This feature will improve log file management and ensure old files are deleted automatically.


Proposed Behavior

  1. Dynamically Name Log Files: The log file should be named based on the current date in the format 753DataSync_YYYY-MM-DD.log. This will prevent overwriting of log files and allow better organization of logs over time.

  2. Purge Old Files: Upon script startup, the script will check the environment variable PURGE_DAYS and delete any logs or data files older than the specified number of days. This will ensure that outdated logs and data do not accumulate in the data folder or current directory.


New Environment Variables

| Variable Name | Description |
|---------------|-------------|
| PURGE_DAYS | Optional. Number of days to keep files. Files older than this are deleted from both the data folder (.json files) and the current directory (.log files). Defaults to 30 days when unset. |
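
A minimal sketch of how the script might read PURGE_DAYS and apply the 30-day default. The helper name get_purge_days is hypothetical, and falling back to the default on invalid or non-positive values is an assumption of this sketch, not behavior stated in the issue.

```python
import os

DEFAULT_PURGE_DAYS = 30  # default stated in this issue


def get_purge_days(environ=os.environ):
    """Return PURGE_DAYS as a positive int, or the 30-day default.

    Falling back on invalid or non-positive input is an assumption of this sketch.
    """
    raw = environ.get("PURGE_DAYS", "").strip()
    if not raw:
        return DEFAULT_PURGE_DAYS
    try:
        days = int(raw)
    except ValueError:
        return DEFAULT_PURGE_DAYS
    return days if days > 0 else DEFAULT_PURGE_DAYS
```

Passing the environment as a parameter keeps the helper easy to test without modifying the real process environment.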

Behavioral Details

  • Log File Naming: The log file will be named in the format 753DataSync_YYYY-MM-DD.log, where YYYY-MM-DD represents the current date.
  • File Purging: When the script starts, it reads the PURGE_DAYS environment variable (30 days if unset) and deletes .json files in the data folder and .log files in the current directory that are older than that many days.

Implementation Notes

  • Log File Naming: The log file name is constructed using datetime.now().strftime("%Y-%m-%d").
  • File Purging: The purge process checks all files in the data folder (for .json files) and in the current directory (for .log files). Files older than the specified PURGE_DAYS will be deleted.
  • Safety: To avoid deleting critical files by mistake, the purge should only touch .log files in the current directory and .json files in the data folder, and it should log every deletion.
  • Default PURGE_DAYS: If the environment variable is not set, the default is 30 days.
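
One possible safeguard against deleting unrelated files is to purge only logs whose names match the 753DataSync_YYYY-MM-DD.log format. The sketch below is an assumption layered on top of this issue: the snippet in the issue matches any file ending in .log, and the helper name is_purgeable_log is hypothetical.

```python
import re

# Matches only names of the form 753DataSync_YYYY-MM-DD.log
LOG_NAME_PATTERN = re.compile(r"753DataSync_\d{4}-\d{2}-\d{2}\.log")


def is_purgeable_log(filename):
    """Return True only for log files named in the script's dated format."""
    return LOG_NAME_PATTERN.fullmatch(filename) is not None
```

With a check like this, a log written by another tool in the same directory would be left alone even if it is older than PURGE_DAYS.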

Benefits

  • Improved Log Management: Date-stamped log files keep each day's log separate, so an earlier log is never overwritten.
  • File Cleanup: Outdated logs and data files are removed automatically after a configurable number of days, so they do not accumulate on disk.
  • Non-intrusive: Neither feature needs new configuration. Log naming is automatic, and purging uses the 30-day default unless PURGE_DAYS is set.

Example Usage

  1. To dynamically name the log file based on the current date, simply ensure that the log_filename is set as shown in the implementation:

    from datetime import datetime

    current_date = datetime.now().strftime("%Y-%m-%d")
    log_filename = f"753DataSync_{current_date}.log"
    
  2. To purge old files, set the PURGE_DAYS environment variable to the desired number of days (e.g., 5 days). The script will automatically delete files older than this value:

    export PURGE_DAYS=5
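
For the log-naming step in item 1, the dated file name could be wired into Python's logging module roughly as follows. This is a sketch under assumptions: the helper names, the logger name "753DataSync", and the log format are illustrative and not taken from the script.

```python
import logging
from datetime import datetime


def build_log_filename(now=None):
    """Return the dated log name, e.g. 753DataSync_2025-04-14.log."""
    now = now or datetime.now()
    return f"753DataSync_{now.strftime('%Y-%m-%d')}.log"


def configure_logging(log_filename):
    """Attach a file handler that writes to the dated log file."""
    logger = logging.getLogger("753DataSync")  # logger name is an assumption
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_filename)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

At startup the script would call configure_logging(build_log_filename()) once, before running the purge, so that deletions are recorded in the current day's log.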
    

Implementation Example

Here’s a code snippet for the purge functionality:

    import logging
    import os
    from datetime import datetime, timedelta

    logger = logging.getLogger(__name__)  # assumed; use the script's existing logger


    def purge_old_files(purge_days):
        """Purge old .log files from the current directory and .json files from 'data'."""
        data_folder = 'data'
        log_folder = '.'  # Log files are in the current directory

        # Files last modified before this point in time are deleted
        purge_threshold = datetime.now() - timedelta(days=purge_days)

        # Delete old log files
        for filename in os.listdir(log_folder):
            if filename.endswith(".log"):
                file_path = os.path.join(log_folder, filename)
                file_modified_time = datetime.fromtimestamp(os.path.getmtime(file_path))
                if file_modified_time < purge_threshold:
                    logger.info(f"Deleting old log file: {file_path}")
                    os.remove(file_path)

        # A missing data folder skips only the data purge, not the log purge above
        if not os.path.exists(data_folder):
            logger.warning(f"The '{data_folder}' folder does not exist.")
            return

        # Delete old data files
        for filename in os.listdir(data_folder):
            if filename.endswith(".json"):
                file_path = os.path.join(data_folder, filename)
                file_modified_time = datetime.fromtimestamp(os.path.getmtime(file_path))
                if file_modified_time < purge_threshold:
                    logger.info(f"Deleting old data file: {file_path}")
                    os.remove(file_path)
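
The purge logic above can be exercised safely in a temporary directory by backdating a file's modification time with os.utime. The following is a sketch only: purge_by_extension and demo are hypothetical helpers that mirror the modification-time comparison from the snippet, not code from the script.

```python
import os
import tempfile
import time
from datetime import datetime, timedelta


def purge_by_extension(folder, extension, purge_days):
    """Delete files in folder with the given extension older than purge_days.

    Returns the sorted names of the deleted files.
    """
    threshold = datetime.now() - timedelta(days=purge_days)
    deleted = []
    for filename in os.listdir(folder):
        file_path = os.path.join(folder, filename)
        if not (filename.endswith(extension) and os.path.isfile(file_path)):
            continue
        if datetime.fromtimestamp(os.path.getmtime(file_path)) < threshold:
            os.remove(file_path)
            deleted.append(filename)
    return sorted(deleted)


def demo():
    """Create one old and one fresh file in a temp folder, then purge."""
    with tempfile.TemporaryDirectory() as folder:
        old_path = os.path.join(folder, "old.json")
        new_path = os.path.join(folder, "new.json")
        for path in (old_path, new_path):
            with open(path, "w") as handle:
                handle.write("{}")
        # Backdate the old file's modification time by 10 days
        ten_days_ago = time.time() - 10 * 24 * 60 * 60
        os.utime(old_path, (ten_days_ago, ten_days_ago))
        deleted = purge_by_extension(folder, ".json", purge_days=5)
        remaining = sorted(os.listdir(folder))
    return deleted, remaining
```

Calling demo() with a 5-day threshold removes only the backdated file and leaves the fresh one in place, which is a quick way to check the threshold comparison without touching real logs or data.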
nick added the enhancement label 2025-04-14 10:29:58 -04:00
nick self-assigned this 2025-04-15 09:52:21 -04:00
nick added reference feature/logfile-purge-support 2025-04-15 18:17:46 -04:00
Author
Owner

This has been [merged](https://git.nickhepler.cloud/nick/753-Data-Sync/pulls/8) into master.
nick closed this issue 2025-04-15 18:19:55 -04:00
Reference: nick/753-Data-Sync#5