qa.manage

Manages file manipulations.

Module Contents

Functions

check_file_exists(filename)

Check if a file with the given name exists. If it does not exist, the function ends the current Python session with an informative error message.

run_all_replicates(→ None)

Loops through all replicates and runs a specific function on each.

find_stalled()

Identifies frozen TeraChem jobs.

check_esp_failed()

Find jobs where the ESP calculations failed or terminated.

check_folder(→ None)

Checks if a folder exists and creates it if it doesn't.

check_file(→ None)

Checks if a file exists and copies it if it doesn't.

copy_script(→ None)

Copies a script from the Scripts folder in the package.

collect_esp_components(→ None)

Loops over replicates and single points and collects metal-centered ESPs.

replicate_interval_submit(replicate, first_job, ...)

Submits a specific range of sub jobs within a replicate folder.

qa.manage.check_file_exists(filename)

Check if a file with the given name exists. If it does not exist, the function ends the current Python session with an informative error message.

Parameters:

filename (str) – The name of the file to check for existence.

Raises:

FileNotFoundError – If the specified file does not exist.

Examples

>>> check_file_exists("example.txt")
>>> check_file_exists("non_existent_file.txt")
FileNotFoundError: File 'non_existent_file.txt' does not exist. Please provide a valid file name.
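
The behavior described above can be sketched as follows. This is a minimal illustration rather than the package's source: the name `check_file_exists_sketch` is hypothetical, and the sketch assumes the function raises the exception listed under Raises with the message shown in the example.

```python
import os


def check_file_exists_sketch(filename: str) -> None:
    """Hypothetical stand-in for qa.manage.check_file_exists."""
    # Assumption: only regular files count as "existing".
    if not os.path.isfile(filename):
        raise FileNotFoundError(
            f"File '{filename}' does not exist. Please provide a valid file name."
        )
```

If the exception is not caught by the caller, the interpreter stops with this message, which matches the "ends the current Python session" wording.
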
qa.manage.run_all_replicates(function) → None

Loops through all replicates and runs a specific function on each.

Often a function needs to be run on all replicates. Given a common file structure, this will take care of all of them at once.

Notes

The directories in the first level should contain replicates, and the sub directories should contain restarts. Additional random directories will lead to errors.
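
The looping pattern described here might look roughly like the sketch below. It is not the package's implementation: the name `run_all_replicates_sketch` is hypothetical, and it assumes that every first-level directory of the current working directory is a replicate and that `function` takes no arguments and acts on the current directory.

```python
import os
from typing import Callable


def run_all_replicates_sketch(function: Callable[[], None]) -> None:
    """Hypothetical sketch: run `function` inside each first-level directory."""
    root = os.getcwd()
    # Assumption: every first-level directory is a replicate.
    replicates = sorted(
        entry for entry in os.listdir(root)
        if os.path.isdir(os.path.join(root, entry))
    )
    for replicate in replicates:
        os.chdir(os.path.join(root, replicate))
        try:
            function()
        finally:
            # Always return to the starting directory, even if `function` fails.
            os.chdir(root)
```

Because the sketch treats every directory as a replicate, a stray directory would be visited too, which illustrates why the note warns about additional directories.
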

qa.manage.find_stalled()

Identifies frozen TeraChem jobs.

When running TeraChem on SuperCloud, jobs will occasionally freeze on the COSMO step. These instances can be difficult to identify. This function checks all jobs and finds those stuck on that specific problematic step.

qa.manage.check_esp_failed()

Find jobs where the ESP calculations failed or terminated.

The ESP jobs are expensive and can take a long time. Occasionally they hang and need to be restarted. This function checks all QM calculations for ESP jobs that are unfinished. Run it from the folder that contains all the replicates.

Notes

You may need to update ignore if you have additional directories.
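
To illustrate the role of an ignore list, here is a minimal sketch of filtering replicate directories. The helper name `list_replicate_dirs` is hypothetical, and the sketch assumes `ignore` is a collection of directory names. The documentation above does not state its actual type or default contents.

```python
import os
from typing import Iterable, List


def list_replicate_dirs(root: str, ignore: Iterable[str]) -> List[str]:
    """Hypothetical helper: first-level directories of `root` not named in `ignore`."""
    skip = set(ignore)
    return sorted(
        entry for entry in os.listdir(root)
        if os.path.isdir(os.path.join(root, entry)) and entry not in skip
    )
```

For example, a folder holding replicates `1` and `2` plus an `Analysis` directory would yield only the replicates when called with `ignore=["Analysis"]`.
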

qa.manage.check_folder(dir_name) → None

Checks if a folder exists and creates it if it doesn’t.

Parameters:

dir_name (str) – The name of the folder you are checking for
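
A check-then-create pattern like the one described can be sketched with the standard library. The name `check_folder_sketch` is hypothetical, and this is not necessarily how the package implements it.

```python
import os


def check_folder_sketch(dir_name: str) -> None:
    """Hypothetical sketch: create `dir_name` only if it is missing."""
    if not os.path.isdir(dir_name):
        os.makedirs(dir_name)
```

Calling it a second time on the same path is a no-op, so it is safe to use at the start of any analysis step.
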

qa.manage.check_file(file_name, location) → None

Checks if a file exists and copies it if it doesn’t.

Parameters:

file_name (str) – The name of the file you are checking for
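
The `location` parameter is not documented above. The sketch below assumes it is the path of the source file to copy from. That is an assumption for illustration only, and `check_file_sketch` is a hypothetical name.

```python
import os
import shutil


def check_file_sketch(file_name: str, location: str) -> None:
    """Hypothetical sketch: copy `location` to `file_name` if `file_name` is missing."""
    # Assumption: `location` is the full path of the source file.
    if not os.path.isfile(file_name):
        shutil.copyfile(location, file_name)
```

An existing file is left untouched, so local edits are never overwritten by the copy.
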

qa.manage.copy_script(script_name) → None

Copies a script from the Scripts folder in the package.

To perform a number of different analyses, you will require one of the saved scripts. This function will copy a requested script to your current location.

Parameters:

script_name (str) – The name of the requested script.

qa.manage.collect_esp_components(components, first_job: int, last_job: int, step: int) → None

Loops over replicates and single points and collects metal-centered ESPs.

The main purpose is to navigate the file structure and collect the data. The computing of the ESP is done in the calculate_esp() function.

Parameters:
  • first_job (int) – The name of the first directory and first job e.g., 0

  • last_job (int) – The name of the last directory and last job e.g., 39900

  • step (int) – The step size between each single point.
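
As an illustration of how `first_job`, `last_job`, and `step` could map to single-point directory names, consider the sketch below. It assumes `last_job` is inclusive, since it is described as the name of the last directory. The helper `job_names` and the step size of 100 are hypothetical.

```python
from typing import List


def job_names(first_job: int, last_job: int, step: int) -> List[int]:
    """Hypothetical helper: directory names from first_job to last_job, inclusive."""
    return list(range(first_job, last_job + 1, step))


# Assumed step of 100 between single points, for illustration only.
jobs = job_names(0, 39900, 100)
print(len(jobs), jobs[0], jobs[-1])
```
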

qa.manage.replicate_interval_submit(replicate: int, first_job: int, last_job: int, step: int, function)

Submits a specific range of sub jobs within a replicate folder.

I wrote this function for time-consuming analysis programs. It takes a long time to compute the Hirshfeld and other charge schemes. This function computes the charge schemes for a subset of the single points from a specific replicate. Divide and conquer. Written to run on MIT SuperCloud.

Parameters:
  • replicate (int) – The number of the replicate that you want to analyze

  • first_job (int) – The name of the first directory and first job e.g., 0

  • last_job (int) – The name of the last directory and last job e.g., 39900

  • step (int) – The step size between each single point.
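
The divide-and-conquer idea can be sketched by splitting one replicate's single points into intervals, each of which could be passed as the `first_job` and `last_job` of a separate submission. The helper `split_into_intervals`, its `jobs_per_chunk` argument, and the step size of 100 are hypothetical. The sketch also assumes `last_job` is inclusive.

```python
from typing import List, Tuple


def split_into_intervals(
    first_job: int, last_job: int, step: int, jobs_per_chunk: int
) -> List[Tuple[int, int]]:
    """Hypothetical helper: (first_job, last_job) pairs covering the full range."""
    jobs = list(range(first_job, last_job + 1, step))
    chunks = [
        jobs[i:i + jobs_per_chunk] for i in range(0, len(jobs), jobs_per_chunk)
    ]
    return [(chunk[0], chunk[-1]) for chunk in chunks]


intervals = split_into_intervals(0, 39900, 100, jobs_per_chunk=100)
print(intervals)
# [(0, 9900), (10000, 19900), (20000, 29900), (30000, 39900)]
```

Each pair could then be used in its own call, for example with `replicate=1`, so that four submissions cover the replicate in parallel.
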