qa.analyze

Functions for analyzing an AIMD trajectory.

Module Contents

Functions

charge_matrix(→ None)

Generates mutual information and cross-correlation matrices.

get_joint_qres(res_x, res_y, axes_range)

Calculates the joint net partial charge sum on each residue, q(RES).

cpptraj_covars(→ None)

Calculate the covariance using CPPTraj.

charge_matrix_analysis(→ None)

Analyze the charge coupling across all amino acids.

calculate_charge_schemes()

Calculate a variety of charge schemes with Multiwfn.

calculate_esp(component_atoms, scheme)

Calculate the electrostatic potential (ESP) of a molecular component.

compute_rmsd(→ float)

Computes the Root Mean Square Deviation (RMSD) between two matrices of atoms with their xyz coordinates.

get_rmsd(ref_atoms, traj_atoms)

Gets RMSD for specific atoms for different analogs.

td_coupling(res_x, res_y, replicate_dir)

Calculates and plots the partial charge for two residues over time.

pairwise_distances_csv(pdb_traj_path, output_file)

Calculate pairwise distances between residue centers of mass and save the result to a CSV file.

parse_components(components)

Parse the components input, handling range inputs (e.g., '253-424').

centroid_distance(components)

Calculate the distance between two structural components.

combine_qm_charges_replicates()

qa.analyze.charge_matrix(pdbfile) → None

Generates mutual information and cross-correlation matrices.

Parameters:

pdbfile (str) – The name of the PDB file

qa.analyze.get_joint_qres(res_x, res_y, axes_range)

Calculates the joint net partial charge sum on each residue, q(RES).

Parameters:
  • res_x (str) – The amino acid to be represented on the x-axis

  • res_y (str) – The amino acid to be represented on the y-axis

Returns:

joint_df – A dataframe with the charge information for both correlated residues.

Return type:

pd.DataFrame

Notes

Big picture flow of function.
  1. Read charges.xls in as a pd dataframe

  2. Get atom indices for requested atoms from get_res_atom_indices()

  3. Extract and sum residue columns

  4. Save as a csv

  5. Return as a pandas dataframe

  6. Use joint_plot() to plot the results
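The extract-and-sum step above can be sketched with pandas. This is only an illustration under stated assumptions, not the implementation in qa.analyze: the helper name sum_residue_charges, the residue labels, and the residue-to-atom mapping are hypothetical, and a small in-memory table stands in for charges.xls.

```python
import pandas as pd


def sum_residue_charges(charges, res_atoms):
    """Sum per-atom charge columns into one column per residue.

    Hypothetical helper: `charges` has one row per frame and one column
    per atom index; `res_atoms` maps a residue label to its atom columns.
    """
    joint = pd.DataFrame(index=charges.index)
    for res_name, atom_cols in res_atoms.items():
        # Net partial charge of the residue in every frame
        joint[res_name] = charges[atom_cols].sum(axis=1)
    return joint


# Toy per-frame charges (stand-in for charges.xls); columns are atom indices.
charges = pd.DataFrame({
    0: [0.10, 0.12],
    1: [-0.30, -0.28],
    2: [0.25, 0.20],
    3: [-0.05, -0.10],
})

# Hypothetical residue -> atom index mapping
res_atoms = {"Ala1": [0, 1], "His2": [2, 3]}

joint_df = sum_residue_charges(charges, res_atoms)

# With no path given, to_csv returns the CSV text instead of writing a file.
csv_text = joint_df.to_csv(index=False)
```

Each row of joint_df is one frame, so the two columns can be plotted directly against each other.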

qa.analyze.cpptraj_covars(delete, recompute=False) → None

Calculate the covariance using CPPTraj.

The covariance measures how the motions of different amino acids are geometrically correlated, for example through fluctuations in side chain position or conformational changes.

Parameters:
  • delete (List[List[int]]) – A list containing two lists. The first is the residues to delete. The second is the columns of the matrix to delete.

  • recompute (bool) – Recompute the calculation even if results are already present.

qa.analyze.charge_matrix_analysis(delete, recompute=False) → None

Analyze the charge coupling across all amino acids.

Performs the analysis to generate matrix files and automates plotting of the results.

Parameters:
  • delete (List[List[int]]) – A list containing two lists. The first is the residues to delete. The second is the columns of the matrix to delete.

  • recompute (bool) – Recompute the calculation even if results are already present.

qa.analyze.calculate_charge_schemes()

Calculate a variety of charge schemes with Multiwfn.

Compute metal-centered Hirshfeld, Voronoi, Mulliken, and ADCH charges. Uses the Multiwfn package to compute the charge partitions.

Parameters:

molden (str) – The name of the molden file to be processed with Multiwfn

qa.analyze.calculate_esp(component_atoms, scheme)

Calculate the electrostatic potential (ESP) of a molecular component.

Takes the output from a Multiwfn charge calculation and calculates the ESP. Run it from the folder that contains all replicates. It will generate a single csv file with all the charges for your residue, with one component/column specified in the input residue dictionary.

Parameters:

component_atoms (List[int]) – A list of the atoms in a given component

qa.analyze.compute_rmsd(matrix_A: numpy.ndarray, matrix_B: numpy.ndarray) → float

Computes the Root Mean Square Deviation (RMSD) between two matrices of atoms with their xyz coordinates.

Parameters:
  • matrix_A (np.ndarray) – A matrix of size N x 3, where N is the number of atoms and the columns are the x, y, and z coordinates.

  • matrix_B (np.ndarray) – A matrix of size N x 3, where N is the number of atoms and the columns are the x, y, and z coordinates.

Returns:

rmsd – The computed RMSD between the two matrices.

Return type:

float
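For reference, an RMSD between two N x 3 coordinate matrices can be sketched as the square root of the mean squared per-atom displacement. The function name rmsd_sketch is hypothetical, and the sketch assumes the two structures are already in the same frame (no alignment or superposition is performed); whether compute_rmsd aligns its inputs is not stated here.

```python
import numpy as np


def rmsd_sketch(matrix_a, matrix_b):
    """RMSD between two N x 3 coordinate arrays, without any alignment."""
    a = np.asarray(matrix_a, dtype=float)
    b = np.asarray(matrix_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("Both matrices must have the same N x 3 shape")
    # Squared displacement of each atom, summed over x, y, z
    sq_disp = np.sum((a - b) ** 2, axis=1)
    # Mean over atoms, then square root
    return float(np.sqrt(sq_disp.mean()))


# Two atoms, each shifted by 1 along z
ref = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
moved = [[0.0, 0.0, 1.0], [1.0, 0.0, 1.0]]
value = rmsd_sketch(ref, moved)
```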

qa.analyze.get_rmsd(ref_atoms: List[int], traj_atoms: List[List[int]])

Gets RMSD for specific atoms for different analogs.

Loops over analog directories and computes a series of RMSDs between a reference set of atoms and a set of atoms from an xyz trajectory. The atom indices for the reference structure are given by ref_atoms, and the atom indices for the frames of the xyz trajectory by traj_atoms. The coordinates of these atoms are used to build matrices A and B, and the RMSD is then computed with compute_rmsd.

Parameters:
  • ref_atoms (List[int]) – A list of the atom indices corresponding to the reference xyz structure

  • traj_atoms (List[int]) – A list of the atom indices corresponding to the current xyz frame, which we will compare the reference atoms to

Returns:

rmsd_list – List of lists where each inner list represents an analog and contains the RMSD for each frame.

Return type:

List[List[float]]

Notes

Run from the folder with multiple analogs, e.g., MC6, MC6*, MC6*a

qa.analyze.td_coupling(res_x, res_y, replicate_dir)

Calculates and plots the partial charge for two residues over time.

This function will calculate the charge for each frame for two residues, and plot them. This will help show if the charges are inversely correlated.

Parameters:
  • res_x (str) – The first amino acid to be plotted against time.

  • res_y (str) – The second amino acid to be plotted against time.

  • replicate_dir (str) – The name of the replicate that we will calculate the time-dependent coupling for

Returns:

joint_df – A dataframe with the charge information for both correlated residues.

Return type:

pd.DataFrame
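One way to check whether two per-frame charge series are inversely correlated is the Pearson correlation coefficient. The sketch below uses made-up charge values and is not part of td_coupling itself; a coefficient close to -1 indicates that one residue gains charge as the other loses it.

```python
import numpy as np

# Made-up per-frame net charges for two residues (illustrative values only)
q_x = np.array([0.10, 0.15, 0.20, 0.25])
q_y = np.array([-0.10, -0.14, -0.21, -0.26])

# Off-diagonal entry of the 2 x 2 correlation matrix is the Pearson r
pearson_r = float(np.corrcoef(q_x, q_y)[0, 1])
```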

qa.analyze.pairwise_distances_csv(pdb_traj_path, output_file)

Calculate pairwise distances between residue centers of mass and save the result to a CSV file.

Parameters:
  • pdb_traj_path (str) – The file path of the PDB trajectory file.

  • output_file (str) – The name of the output CSV file.
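The pairwise-distance step can be sketched with NumPy broadcasting. This sketch assumes the residue centers of mass have already been computed; the coordinates below are invented, the helper name pairwise_distance_matrix is hypothetical, and the CSV is written to an in-memory buffer rather than to output_file.

```python
import csv
import io

import numpy as np


def pairwise_distance_matrix(centers):
    """Return the R x R matrix of Euclidean distances between R points."""
    centers = np.asarray(centers, dtype=float)
    # Broadcasting gives an R x R x 3 array of coordinate differences
    diff = centers[:, None, :] - centers[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Invented centers of mass for three residues
centers = [[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [0.0, 0.0, 12.0]]
dist = pairwise_distance_matrix(centers)

# Write the matrix as CSV rows to an in-memory buffer
buffer = io.StringIO()
csv.writer(buffer).writerows(dist.tolist())
csv_text = buffer.getvalue()
```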

qa.analyze.parse_components(components)

Parse the components input, handling range inputs (e.g., ‘253-424’).

Parameters:

components (list of lists) – A list of components, each component being a list of strings representing atom numbers or ranges

Returns:

parsed_components – A list of components, each component being a list of integers representing atom numbers

Return type:

list of lists
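A minimal sketch of this kind of range parsing is shown below. It is not the library's code: the name parse_components_sketch is hypothetical, and it assumes that ranges are inclusive on both ends and that a single string may hold several comma-separated entries (as in '1-86,104-252').

```python
def parse_components_sketch(components):
    """Expand strings such as '253-424' into explicit lists of integers."""
    parsed_components = []
    for component in components:
        atoms = []
        for item in component:
            # Assumption: one string may hold several comma-separated entries
            for part in str(item).split(","):
                part = part.strip()
                if "-" in part:
                    start, end = (int(x) for x in part.split("-"))
                    # Assumption: the range includes both end points
                    atoms.extend(range(start, end + 1))
                else:
                    atoms.append(int(part))
        parsed_components.append(atoms)
    return parsed_components


parsed = parse_components_sketch([["487"], ["253-255"]])
mixed = parse_components_sketch([["1-2,5"]])
```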

qa.analyze.centroid_distance(components)

Calculate the distance between two structural components.

The purpose of this function is to do more than calculate a simple distance. It takes in two sets of amino acids, each of which can be a single amino acid or an arbitrary number of amino acids. It then calculates the centroid of each of the two components and the distance between the centroids.

Parameters:

components (List of lists) – A list of two lists corresponding to two components with their atoms e.g., [[487],[253-424]]

Notes

Examples of interesting components for mimochromes:

  • all: 1-487

  • lower: 1-252

  • upper: 253-424

  • lower-his: 1-86,104-252

  • heme: 425-486

  • his: 87-103
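The centroid-to-centroid distance can be sketched as follows. This is an illustration under assumptions, not the function's actual code: centroid_distance_sketch and its coords argument are hypothetical, atom numbers are assumed to be 1-based, and the centroid is taken as the unweighted mean of the atom positions.

```python
import numpy as np


def centroid_distance_sketch(coords, component_a, component_b):
    """Distance between the centroids of two groups of atoms.

    `coords` is an N x 3 array of positions; the components are lists of
    atom numbers, assumed here to start at 1.
    """
    coords = np.asarray(coords, dtype=float)
    # Convert 1-based atom numbers to 0-based array indices
    idx_a = np.asarray(component_a) - 1
    idx_b = np.asarray(component_b) - 1
    # Unweighted geometric center of each component
    centroid_a = coords[idx_a].mean(axis=0)
    centroid_b = coords[idx_b].mean(axis=0)
    return float(np.linalg.norm(centroid_a - centroid_b))


# Three atoms: the first two form one component, the third is the other
coords = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [1.0, 0.0, 4.0]]
distance = centroid_distance_sketch(coords, [1, 2], [3])
```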

qa.analyze.combine_qm_charges_replicates()