streamline.runners.stats_runner module

class streamline.runners.stats_runner.StatsRunner(output_path, experiment_name, algorithms=None, exclude=('XCS', 'eLCS'), class_label='Class', instance_label=None, scoring_metric='balanced_accuracy', top_features=40, sig_cutoff=0.05, metric_weight='balanced_accuracy', scale_data=True, exclude_plots=None, show_plots=False, run_cluster=False, queue='defq', reserved_memory=4)

Bases: object

Runner class that collates statistics for all models.

Parameters:
  • output_path – path to output directory

  • experiment_name – name of experiment (no spaces)

  • algorithms – list of ML model names (str) to run, default=None

  • scoring_metric – default='balanced_accuracy'

  • sig_cutoff – significance cutoff, default=0.05

  • scale_data – default=True

  • exclude_plots – default=None

  • metric_weight – ML model metric used as the weight in composite feature importance (FI) plots; only balanced_accuracy and roc_auc are supported. Setting it to the same value as primary_metric is recommended where possible, default='balanced_accuracy'

  • top_features – number of top features to illustrate in figures, default=40

  • show_plots – flag to show plots, default=False

get_cluster_params(full_path, len_cv)
run(run_parallel=False)
save_metadata()
submit_lsf_cluster_job(dataset_path, len_cv)
submit_slurm_cluster_job(dataset_path, len_cv)
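
As a minimal sketch of how the constructor arguments fit together, the snippet below collects the defaults from the signature above and checks the two constraints the parameter list states: experiment_name contains no spaces, and metric_weight is either balanced_accuracy or roc_auc. The helper build_stats_runner_kwargs, the two constants, the path "./output", and the name "demo_experiment" are hypothetical and exist only for illustration; they are not part of streamline. The sketch also assumes the output directory has already been populated by earlier pipeline phases, which this page does not confirm.

```python
# Defaults copied from the StatsRunner signature documented above.
STATS_RUNNER_DEFAULTS = {
    "algorithms": None,
    "exclude": ("XCS", "eLCS"),
    "class_label": "Class",
    "instance_label": None,
    "scoring_metric": "balanced_accuracy",
    "top_features": 40,
    "sig_cutoff": 0.05,
    "metric_weight": "balanced_accuracy",
    "scale_data": True,
    "exclude_plots": None,
    "show_plots": False,
    "run_cluster": False,
    "queue": "defq",
    "reserved_memory": 4,
}

# The parameter list states that metric_weight supports only these values.
ALLOWED_METRIC_WEIGHTS = ("balanced_accuracy", "roc_auc")


def build_stats_runner_kwargs(output_path, experiment_name, **overrides):
    """Hypothetical helper (not part of streamline): merge overrides into the
    documented defaults and check the constraints from the parameter list."""
    unknown = set(overrides) - set(STATS_RUNNER_DEFAULTS)
    if unknown:
        raise TypeError(f"unexpected arguments: {sorted(unknown)}")
    if " " in experiment_name:
        raise ValueError("experiment_name must not contain spaces")
    kwargs = {**STATS_RUNNER_DEFAULTS, **overrides}
    if kwargs["metric_weight"] not in ALLOWED_METRIC_WEIGHTS:
        raise ValueError(
            f"metric_weight must be one of {ALLOWED_METRIC_WEIGHTS}"
        )
    return {
        "output_path": output_path,
        "experiment_name": experiment_name,
        **kwargs,
    }


if __name__ == "__main__":
    # Placeholder path and experiment name; replace with real values.
    kwargs = build_stats_runner_kwargs(
        "./output", "demo_experiment", metric_weight="roc_auc", top_features=20
    )
    try:
        from streamline.runners.stats_runner import StatsRunner
    except ImportError:
        print("streamline is not installed; arguments would be:", kwargs)
    else:
        runner = StatsRunner(**kwargs)
        runner.run(run_parallel=False)  # signature default, per run() above
```

Validating the arguments before constructing the runner surfaces a bad experiment name or an unsupported metric_weight early, rather than partway through a run.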