`qa.process`¶

Process the raw TeraChem output for future analysis.

Module Contents¶

Functions¶

`get_pdb`(→ str)	Searches all directories recursively for a PDB file.
`get_xyz`(→ str)	Searches all directories for an XYZ file.
`get_atom_count`(→ int)	Finds an xyz file and gets the number of atoms.
`combine_xyzs`(→ None)	Combine an arbitrary number of xyz files.
`get_protein_sequence`(→ List[str])	Gets the full amino acid sequence of your protein.
`get_charge_file`(→ str)	Searches all directories for a charge xls file.
`combine_sp_xyz`()	Combines single point xyz's for all replicates.
`combine_restarts_old`(→ None)	Collects all charges or coordinates into single xls and xyz files.
`combine_restarts_new`(→ None)	Collects all charges or coordinates into single xls and xyz files.
`combine_replicates`(→ None)	Collects charges or coordinates into an xls and xyz file across replicates.
`summed_residue_charge`(charge_data, template)	Sums the charges for all atoms by residue.
`get_residue_identifiers`(→ List[str])	Gets the residue identifiers such as Ala1 or Cys24.
`xyz2pdb`(→ None)	Converts an xyz file into a pdb file.
`xyz2pdb_traj`(→ None)	Converts an xyz trajectory file into a pdb trajectory file.
`xyz2pdb_ensemble`(→ None)	Converts an xyz trajectory file into a pdb ensemble.
`clean_incomplete_xyz`(→ None)	For removing incomplete frames during troubleshooting.
`check_valid_resname`(→ Tuple[str, int])	Checks if a valid resname has been identified.
`get_res_atom_indices`(→ List[int])	For a residue get the atom indices of all atoms in the residue.
`clean_qm_jobs`(→ None)	Cleans all QM jobs and checks for completion.
`combine_qm_charges`(→ None)	Combines the charge_mull.xls files generated by TeraChem single points.
`combine_qm_replicates`(→ None)	Combine the all_charges.xls files for replicates into a master charge file.
`string_to_list`(→ List[List[int]])	Converts a list of numerical strings to a list of lists of numbers.
`simple_xyz_combine`()	Takes all xyz molecular structure files in the current directory

qa.process.get_pdb() → str¶

Searches all directories recursively for a PDB file.

If more than one PDB file is found, the first one is used. If no PDB file is found, the user is prompted for a PDB file path.

Returns:: pdb_file – The path of a PDB file within the current directory (recursive).
Return type:: str

Notes

Currently it uses the name to distinguish single structures, ensembles, and trajectories. In the future, this function should check the contents to confirm.
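The search-then-prompt behavior described above can be sketched as follows. This is a minimal illustration, not the implementation in qa.process: the names `find_first_pdb` and `get_pdb_sketch` are hypothetical, and sorting the matches so that "first" is deterministic is an assumption, since the source does not say how matches are ordered.

```python
from pathlib import Path
from typing import Optional


def find_first_pdb(root: str = ".") -> Optional[str]:
    """Return the first .pdb path under root, or None if there is none."""
    # rglob visits root and every subdirectory below it.
    # Sorting is an assumption made here so the result is reproducible.
    matches = sorted(Path(root).rglob("*.pdb"))
    return str(matches[0]) if matches else None


def get_pdb_sketch(root: str = ".") -> str:
    """Hypothetical stand-in for get_pdb: search first, then prompt."""
    found = find_first_pdb(root)
    if found is None:
        # Fall back to asking the user, as the description above states.
        found = input("No PDB file found. Enter the path to a PDB file: ")
    return found
```

Keeping the search separate from the prompt makes the search step usable in scripts that must not block on user input.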

qa.process.get_xyz() → str¶

Searches all directories for an XYZ file.

If more than one XYZ file is found, the first one is used. If no XYZ file is found, the user is prompted for the XYZ file path.

Returns:: xyz_name – The path of an XYZ file within the current directory.
Return type:: str

qa.process.get_atom_count() → int¶

Finds an xyz file and gets the number of atoms.

Returns:: atom_count – The number of atoms in the identified xyz file.
Return type:: int
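In the standard XYZ format the first line of a frame holds the atom count, so the count can be read without parsing any coordinates. The sketch below assumes that layout; `count_atoms` is a hypothetical helper that takes an explicit path, whereas get_atom_count locates the file itself.

```python
def count_atoms(xyz_path: str) -> int:
    """Read the atom count from the first line of an xyz file."""
    with open(xyz_path) as handle:
        # The first token on the first line is the number of atoms.
        return int(handle.readline().split()[0])
```

For a trajectory file this reads only the first frame, which is sufficient when every frame has the same number of atoms.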

qa.process.combine_xyzs() → None¶

Combine an arbitrary number of xyz files.

When generating the input for the QM calculations, you may have created a directory of single xyz structures. This function recombines them into a single xyz trajectory.
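Because a multi-frame xyz trajectory is just single-frame xyz files written one after another, the recombination can be sketched as plain concatenation. The function name, its arguments, and the choice to let the caller supply the file order are assumptions made for this illustration. The source does not state how combine_xyzs orders or names its files.

```python
from pathlib import Path
from typing import Iterable


def combine_xyz_files(xyz_paths: Iterable[str], out_path: str) -> int:
    """Concatenate single-frame xyz files into one file; return the frame count."""
    frames = 0
    with open(out_path, "w") as out:
        for path in xyz_paths:
            text = Path(path).read_text()
            # Make sure each frame ends with a newline so frames do not fuse.
            out.write(text if text.endswith("\n") else text + "\n")
            frames += 1
    return frames
```

The caller controls the frame order, for example by passing a list of paths sorted by frame number.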

qa.process.get_protein_sequence(pdb_path) → List[str]¶

Gets the full amino acid sequence of your protein.

See also

qa.plot.heatmap

qa.process.get_charge_file() → str¶

Searches all directories for a charge xls file.

If more than one .xls file is found it will use the first one. If no .xls file was found it will prompt the user for the .xls file path. This is the standard charge output for TeraChem.

Returns:: charge_file – The path of a charge .xls file within the current directory.
Return type:: str

Notes

Starts in the directory containing all the directories

qa.process.combine_sp_xyz()¶

Combines single point xyz’s for all replicates.

Each QM single point has its own geometry file. This function combines all of those xyz files. This is preferable to using the other geometry files because it ensures that the geometries are identical.

Returns:: replicate_info – List of tuples with replicate number and frame count for the replicates.
Return type:: List[tuple]

qa.process.combine_restarts_old(atom_count, all_charges: str = 'all_charges.xls', all_coors: str = 'all_coors.xyz') → None¶

Collects all charges or coordinates into single xls and xyz files.

Likely the first executed function after generating the raw AIMD data. Trajectories were likely generated over multiple runs. This function combines all coordinate and charge data for each run.

Parameters:

all_charges (str) – The name of the file containing all charges in xls format.
all_coors (str) – The name of the file containing the coordinates in xyz format.
atom_count (int) – The number of atoms in the structure

Notes

Run from the directory that contains the run fragments.
