streamline.runners.replicate_runner module
- class streamline.runners.replicate_runner.ReplicationRunner(rep_data_path, dataset_for_rep, output_path, experiment_name, class_label=None, instance_label=None, match_label=None, algorithms=None, load_algo=True, exclude=('XCS', 'eLCS'), exclude_plots=None, run_cluster=False, queue='defq', reserved_memory=4, show_plots=False)
Bases: object
Phase 9 of STREAMLINE (optional). This ‘Main’ script manages the Phase 9 run parameters and submits jobs to run either locally (serially) or on a cluster (in parallel).
- Parameters:
rep_data_path – path to directory containing replication or hold-out testing datasets (must have at least all features with same labels as in original training dataset)
dataset_for_rep – path to target original training dataset
output_path – path to output directory
experiment_name – name of experiment (no spaces)
match_label – applies only if the original training data included a column of matched instance IDs, default=None
exclude_plots – analyses to exclude from outputs, default=None. Possible options:
export_feature_correlations – run and export feature correlation analysis (yields correlation heatmap), default=True
plot_roc – plot ROC curves individually for each algorithm, including all CV results and averages, default=True
plot_prc – plot PRC curves individually for each algorithm, including all CV results and averages, default=True
plot_metric_boxplots – plot box plot summaries comparing algorithms for each metric, default=True
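The parameters above can be assembled and sanity-checked before constructing the runner. The following is a minimal sketch, not part of STREAMLINE: the helper build_replication_kwargs, the KNOWN_OPTIONS set, and the example paths are hypothetical. The option strings are taken from the names listed in this section, and the exact strings the library accepts are an assumption.

```python
# Option names as listed in this section; the exact strings accepted by
# ReplicationRunner are an assumption.
KNOWN_OPTIONS = {
    "export_feature_correlations",
    "plot_roc",
    "plot_prc",
    "plot_metric_boxplots",
}


def build_replication_kwargs(rep_data_path, dataset_for_rep, output_path,
                             experiment_name, exclude_plots=None, **extra):
    """Hypothetical helper: validate and collect ReplicationRunner arguments."""
    # The parameter description requires an experiment name without spaces.
    if " " in experiment_name:
        raise ValueError("experiment_name must not contain spaces")

    # Reject option names that are not documented for exclude_plots.
    requested = list(exclude_plots or [])
    unknown = sorted(set(requested) - KNOWN_OPTIONS)
    if unknown:
        raise ValueError(f"unknown exclude_plots options: {unknown}")

    kwargs = {
        "rep_data_path": rep_data_path,
        "dataset_for_rep": dataset_for_rep,
        "output_path": output_path,
        "experiment_name": experiment_name,
        # Fall back to the signature default (None) when nothing is excluded.
        "exclude_plots": requested or None,
    }
    # Any other keyword from the signature (run_cluster, queue, ...) passes through.
    kwargs.update(extra)
    return kwargs


kwargs = build_replication_kwargs(
    rep_data_path="./data/replication",    # placeholder path
    dataset_for_rep="./data/original.csv",  # placeholder path
    output_path="./output",                 # placeholder path
    experiment_name="demo_experiment",
    exclude_plots=["plot_prc"],
    run_cluster=False,
)

# With STREAMLINE installed, the arguments could then be passed on:
# from streamline.runners.replicate_runner import ReplicationRunner
# runner = ReplicationRunner(**kwargs)
```

Validating up front keeps a misspelled option or an experiment name containing spaces from surfacing only after a job has been submitted.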