scikit-FIBERS

Feature Inclusion Bin Evolver for Risk Stratification (FIBERS) is an evolutionary algorithm that automatically bins features to stratify risk in right-censored survival data. In particular, it was designed for features that correspond to mismatches between donors and recipients in transplantation. This repository focuses on a scikit-learn compatible implementation and the ongoing improvement and expansion of the original FIBERS algorithm. Further development of the FIBERS algorithm will take place via this repository. The schematic below outlines how this algorithm works.
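To make the idea of a feature bin concrete, here is a minimal sketch in plain Python. It is not the scikit-FIBERS API. It assumes that a bin is a set of feature names, that an instance's bin value (its burden) is the sum of its mismatch counts over those features, and that instances whose burden exceeds a threshold are placed in the high-risk group. The feature names, toy rows, and helper functions are all hypothetical.

```python
# Hypothetical toy data: one dict per donor-recipient pair,
# one mismatch count per (made-up) feature name.
rows = [
    {"P1": 0, "P2": 0, "P3": 1, "P4": 0},
    {"P1": 1, "P2": 0, "P3": 0, "P4": 0},
    {"P1": 1, "P2": 2, "P3": 0, "P4": 1},
    {"P1": 0, "P2": 0, "P3": 0, "P4": 2},
]


def bin_burden(row, bin_features):
    """Sum the mismatch counts of the features included in the bin."""
    return sum(row[name] for name in bin_features)


def stratify(rows, bin_features, threshold=0):
    """Split row indices into (low_risk, high_risk) groups.

    Assumption: an instance is high risk when its burden is strictly
    greater than the threshold.
    """
    low, high = [], []
    for index, row in enumerate(rows):
        target = high if bin_burden(row, bin_features) > threshold else low
        target.append(index)
    return low, high


candidate_bin = ["P1", "P2"]  # a hypothetical evolved bin
burdens = [bin_burden(row, candidate_bin) for row in rows]
low_risk, high_risk = stratify(rows, candidate_bin, threshold=0)
```

In an evolutionary search, many candidate bins like `candidate_bin` would be scored by how well the resulting two groups separate in survival, and the better bins would be kept and varied.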

Installation

Install scikit-fibers with the following command:

pip install scikit-fibers

How to Use:

An Example Notebook is provided with sample code showing which functions are available in scikit-FIBERS and how to use them, using a built-in survival data simulator. The notebook is currently set up to be run from a downloaded copy of this repository; however, you can also install scikit-fibers with pip (above) and use it that way.
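For readers who want a feel for the kind of data involved before opening the notebook, the sketch below generates a small right-censored survival dataset with binary mismatch features. It is a generic illustration and does not reproduce the built-in scikit-FIBERS simulator or its interface. The function name, parameters, rates, and mean survival times are all assumptions chosen for this example.

```python
import random


def simulate_survival_data(n_instances=200, n_features=10, predictive=(0, 1, 2),
                           mismatch_rate=0.2, censoring_rate=0.3, seed=42):
    """Return a list of dicts with binary features, a duration, and an event flag.

    Assumption: an instance is high risk if any feature in `predictive`
    is a mismatch (an 'OR' rule); high-risk instances get shorter event times.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_instances):
        features = [int(rng.random() < mismatch_rate) for _ in range(n_features)]
        high_risk = any(features[i] for i in predictive)
        mean_time = 5.0 if high_risk else 20.0
        event_time = rng.expovariate(1.0 / mean_time)
        if rng.random() < censoring_rate:
            # Right-censored: follow-up ends before the event is observed.
            duration, event = event_time * rng.random(), 0
        else:
            duration, event = event_time, 1
        data.append({"features": features, "duration": duration,
                     "event": event, "high_risk": high_risk})
    return data


data = simulate_survival_data()
```

Each record carries the two outcome columns a survival analysis needs: the observed duration and whether the event was actually observed (1) or censored (0).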

Read About and Cite FIBERS and scikit-FIBERS

FIBERS was originally based on the RARE algorithm, an evolutionary algorithm for rare variant binning.

Dasariraju, S. and Urbanowicz, R.J., 2021, July. RARE: evolutionary feature engineering for rare-variant bin discovery. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (pp. 1335-1343).

The first implementation of FIBERS <https://github.com/UrbsLab/FIBERS> was developed within its own GitHub repository, and was applied to an investigation of graft failure in kidney transplantation.

Dasariraju, S., Gragert, L., Wager, G.L., McCullough, K., Brown, N.K., Kamoun, M. and Urbanowicz, R.J., 2023. HLA amino acid Mismatch-Based risk stratification of kidney allograft failure using a novel Machine learning algorithm. Journal of Biomedical Informatics, 142, p.104374.

The first publication detailing scikit-FIBERS (release 0.9.3) was applied and evaluated on simulated right-censored survival data with amino acid mismatch features.

Urbanowicz, R., Bandhey, H., Kamoun, M., Fogarty, N. and Hsieh, Y.A., 2023, July. Scikit-FIBERS: An 'OR'-Rule Discovery Evolutionary Algorithm for Risk Stratification in Right-Censored Survival Analyses. In Proceedings of the Companion Conference on Genetic and Evolutionary Computation (pp. 1846-1854).

FIBERS was then extended with a prototype adaptive burden thresholding approach that allows bins to simultaneously identify the best bin threshold to apply.

Bandhey, H., Sadek, S., Kamoun, M. and Urbanowicz, R., 2024, March. Evolutionary Feature-Binning with Adaptive Burden Thresholding for Biomedical Risk Stratification. In International Conference on the Applications of Evolutionary Computation (Part of EvoStar) (pp. 225-239). Cham: Springer Nature Switzerland.
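The idea behind choosing a bin threshold can be illustrated with a small, self-contained sketch. It is not the published adaptive method or the scikit-FIBERS implementation: it simply scans every candidate threshold for one fixed bin and keeps the one with the largest two-group log-rank statistic, which is assumed here as the scoring function. The toy burdens, durations, and event flags are made up, and both function names are hypothetical.

```python
def log_rank_statistic(durations, events, in_high):
    """Two-group log-rank chi-square statistic (standard formula)."""
    event_times = sorted({t for t, e in zip(durations, events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = [i for i, d in enumerate(durations) if d >= t]
        n = len(at_risk)
        n_high = sum(1 for i in at_risk if in_high[i])
        died = [i for i in at_risk if durations[i] == t and events[i]]
        d = len(died)
        d_high = sum(1 for i in died if in_high[i])
        observed_minus_expected += d_high - d * n_high / n
        if n > 1:
            variance += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    return observed_minus_expected ** 2 / variance if variance > 0 else 0.0


def best_threshold(burdens, durations, events):
    """Scan candidate thresholds; high risk means burden > threshold."""
    best, best_score = None, -1.0
    for threshold in sorted(set(burdens)):
        in_high = [b > threshold for b in burdens]
        if all(in_high) or not any(in_high):
            continue  # skip splits that leave one group empty
        score = log_rank_statistic(durations, events, in_high)
        if score > best_score:
            best, best_score = threshold, score
    return best, best_score


# Hypothetical bin burdens and outcomes: risk rises once burden reaches 2.
burdens = [0, 0, 1, 1, 2, 2, 3, 3]
durations = [10, 9, 8, 7, 2, 3, 1, 4]
events = [0, 1, 1, 1, 1, 1, 1, 1]  # first instance is right-censored

threshold, score = best_threshold(burdens, durations, events)
```

On this toy data the scan selects a threshold of 1, so instances with a burden of 2 or more form the high-risk group. An exhaustive scan is used only for clarity; an adaptive scheme would instead adjust the threshold while the bin itself is still evolving.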

Most recently, FIBERS 2.0 was released as a completely redesigned, refactored, and expanded implementation. Expansions include (1) a merge operator, (2) a variable mutation rate, (3) improved adaptive burden thresholding, (4) a bin diversity pressure mechanism, (5) a fitness option based on deviance residuals to estimate covariate adjustments throughout algorithm training, and (6) a bin population cleanup option. The corresponding paper has been submitted and is currently under review.

Urbanowicz, R., Bandhey, H., McCullough, K., Chang, A., Gragert, L., Brown, N., Kamoun, M., 2024, April. FIBERS 2.0: Evolutionary Feature Binning For Biomedical Risk Stratification in Right-Censored Survival Analyses With Covariates.

Documentation for FIBERS Class:

Extensive code documentation about the scikit-FIBERS API can be found here.

Contact

Please email Ryan.Urbanowicz@cshs.org and Harsh.Bandhey@cshs.org for any inquiries related to scikit-FIBERS.