streamline.runners.stats_runner module

class streamline.runners.stats_runner.StatsRunner(output_path, experiment_name, algorithms=None, exclude=('XCS', 'eLCS'), class_label='Class', instance_label=None, scoring_metric='balanced_accuracy', top_features=40, sig_cutoff=0.05, metric_weight='balanced_accuracy', scale_data=True, exclude_plots=None, show_plots=False, run_cluster=False, queue='defq', reserved_memory=4)

Bases: object

Runner class that collates statistics for all the models

Parameters:

output_path – path to output directory
experiment_name – name of experiment (no spaces)
algorithms – list of names (str) of the ML models to run, default=None
scoring_metric – default='balanced_accuracy'
sig_cutoff – significance cutoff, default=0.05
scale_data – default=True
exclude_plots – default=None
metric_weight – ML model metric used as the weight in composite feature importance (FI) plots (only balanced_accuracy and roc_auc are supported). Setting it to the same value as primary_metric is recommended where possible, default='balanced_accuracy'
top_features – number of top features to illustrate in figures, default=40
show_plots – flag to show plots, default=False
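
The parameters above can be combined roughly as follows. This is a minimal sketch, not the library's recommended workflow: the helper names build_stats_runner_kwargs and run_stats, the path "./output", and the experiment name "demo_experiment" are hypothetical, and it assumes the earlier pipeline phases have already written their results under output_path. The helper only checks the two constraints stated in the parameter list (no spaces in experiment_name, metric_weight limited to balanced_accuracy or roc_auc).

```python
# Sketch only: the helper names below are hypothetical, not part of STREAMLINE.
ALLOWED_METRIC_WEIGHTS = ("balanced_accuracy", "roc_auc")


def build_stats_runner_kwargs(output_path, experiment_name,
                              metric_weight="balanced_accuracy",
                              top_features=40, sig_cutoff=0.05,
                              show_plots=False):
    """Check the documented constraints and return keyword arguments."""
    if " " in experiment_name:
        raise ValueError("experiment_name must not contain spaces")
    if metric_weight not in ALLOWED_METRIC_WEIGHTS:
        raise ValueError(
            "metric_weight must be one of %s" % (ALLOWED_METRIC_WEIGHTS,))
    return {
        "output_path": output_path,
        "experiment_name": experiment_name,
        "metric_weight": metric_weight,
        "top_features": top_features,
        "sig_cutoff": sig_cutoff,
        "show_plots": show_plots,
    }


def run_stats(kwargs, run_parallel=False):
    """Construct a StatsRunner and run it (requires STREAMLINE installed)."""
    # Imported lazily so the validation helper works without STREAMLINE.
    from streamline.runners.stats_runner import StatsRunner
    runner = StatsRunner(**kwargs)
    runner.run(run_parallel=run_parallel)
    return runner


# Hypothetical output directory and experiment name.
kwargs = build_stats_runner_kwargs("./output", "demo_experiment",
                                   metric_weight="roc_auc")
# run_stats(kwargs)  # assumes earlier phases already populated ./output
```

Keeping the validation separate from the StatsRunner call lets a misconfigured metric_weight or experiment_name fail before any statistics are collated.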

get_cluster_params(full_path, len_cv)

run(run_parallel=False)

save_metadata()

submit_lsf_cluster_job(dataset_path, len_cv)

submit_slurm_cluster_job(dataset_path, len_cv)