qa.predict

Layer-wise relevance propagation of MD predictions.

Module Contents

Functions

shuffle_data(charges_mat, labels_mat)

Shuffles the time series data, if desired.

create_combined_csv(→ pandas.DataFrame)

Generate a pd.DataFrame of all features.

data_processing(df, labels[, n_frames])

Scales the data for the ML workflows.

run_ml(data_norm, labels[, models, recompute])

ML analysis workflow.

Attributes

mutations

qa.predict.shuffle_data(charges_mat, labels_mat)

Shuffles the time series data, if desired.
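As a rough illustration of what shuffling row-aligned time series data can look like, the sketch below applies a single random permutation to both the feature rows and the label rows, so each frame keeps its own label. The helper name `shuffle_rows_together` and its `seed` argument are hypothetical and are not taken from `qa.predict`; the actual behavior of `shuffle_data` may differ.

```python
import numpy as np


def shuffle_rows_together(charges_mat, labels_mat, seed=None):
    """Shuffle two row-aligned arrays with the same permutation (hypothetical helper)."""
    charges_mat = np.asarray(charges_mat)
    labels_mat = np.asarray(labels_mat)
    if charges_mat.shape[0] != labels_mat.shape[0]:
        raise ValueError("charges_mat and labels_mat must have the same number of rows")
    rng = np.random.default_rng(seed)
    # One permutation, applied to both arrays, keeps frames and labels paired.
    order = rng.permutation(charges_mat.shape[0])
    return charges_mat[order], labels_mat[order]


# Toy data: 6 frames, 2 charge features, 3 one-hot classes.
charges = np.arange(12, dtype=float).reshape(6, 2)
labels = np.eye(3)[[0, 0, 1, 1, 2, 2]]
shuffled_charges, shuffled_labels = shuffle_rows_together(charges, labels, seed=0)
```

Passing a fixed `seed` makes the shuffle reproducible, which is usually helpful when comparing models on the same split.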

qa.predict.create_combined_csv(charge_files: List[str], templates: List[str], mutations: List[int]) → pandas.DataFrame

Generate a pd.DataFrame of all features.

Returns:

  • charges_df (pd.DataFrame) – The original charge data as a pandas dataframe.

  • labels_df (pd.DataFrame) – One-hot-encoded labels for each frame.
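To make "one-hot-encoded labels for each frame" concrete, here is a minimal sketch that builds such a table with `pandas.get_dummies`. It assumes each frame is labeled by its mutation index and that every mutation contributes the same number of frames; `frames_per_mutation` is a hypothetical value, and the real column names produced by `create_combined_csv` may differ.

```python
import pandas as pd

mutations = [2, 19, 22]   # same values as the module attribute
frames_per_mutation = 3   # assumed for illustration only

# One label per frame: 2, 2, 2, 19, 19, 19, 22, 22, 22
frame_labels = pd.Series(
    [m for m in mutations for _ in range(frames_per_mutation)],
    name="mutation",
)

# One column per mutation; each row has exactly one 1.
labels_df = pd.get_dummies(frame_labels, dtype=int)
```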

qa.predict.data_processing(df, labels, n_frames=1)

Scales the data for the ML workflows.

Parameters:
  • df (pd.DataFrame) – The original data as a pandas DataFrame.

  • labels (pd.DataFrame) – One-hot-encoded labels for each frame.

  • n_frames (int) – Step for filtering the data (e.g. 1 = every frame, 2 = every other frame).

Returns:

df_norm – The data scaled by column.

Return type:

pd.DataFrame
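The frame-step filtering and column-wise scaling described above can be sketched as follows. This is an illustration under assumptions, not the `qa.predict` implementation: the scaling method is assumed to be z-score standardization, the helper name `scale_by_column` is hypothetical, and returning the filtered labels alongside the scaled data is an added convenience.

```python
import pandas as pd


def scale_by_column(df, labels, n_frames=1):
    """Keep every n_frames-th row, then z-score each column (assumed scaling)."""
    # n_frames=1 keeps every frame, n_frames=2 keeps every other frame, etc.
    df_step = df.iloc[::n_frames]
    labels_step = labels.iloc[::n_frames]
    # Population standard deviation; constant columns get std 1 to avoid dividing by zero.
    std = df_step.std(ddof=0).replace(0, 1.0)
    df_norm = (df_step - df_step.mean()) / std
    return df_norm, labels_step


df = pd.DataFrame({"q1": [0.1, 0.2, 0.3, 0.4], "q2": [1.0, 3.0, 5.0, 7.0]})
labels = pd.DataFrame({"2": [1, 1, 0, 0], "19": [0, 0, 1, 1]})
df_norm, labels_step = scale_by_column(df, labels, n_frames=2)
```

Filtering the labels with the same step keeps them row-aligned with the scaled data, which matters for any downstream model fit.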

qa.predict.run_ml(data_norm, labels, models=['RF', 'MLP'], recompute=False)

ML analysis workflow.

Parameters:
  • data_norm (numpy matrix) – The data scaled by column.

  • labels (pd.DataFrame) – One-hot-encoded labels for each frame.

qa.predict.mutations = [2, 19, 22]