Board 21: Improving Prediction Accuracy of RBPBind by Including Variable Footprint

Student Scientist: Niladri Deb ’25
Research Mentors: Ralf Bundschuh (Department of Physics, The Ohio State University)

Proteins are essential to cells, and many of them bind to ribonucleic acid (RNA) molecules. We know how proteins bind to RNA, but to predict how strongly they bind we use software called RBPBind. The software previously supported only a limited set of proteins. We modified it to accept a much broader range of proteins.


Proteins and ribonucleic acids (RNA) are crucial for the everyday tasks of cells. Proteins and RNA often bind together, and it is important to know how strongly a given protein binds a given RNA. Existing software, RBPBind, predicts binding affinities for some proteins and a given RNA sequence. However, this software assumes that the protein's footprint, the number of nucleotides that the protein molecule binds at a time, has a fixed size. Proteins with a variable footprint size are therefore not supported. We asked whether adding variable footprint sizes to RBPBind would improve its prediction accuracy. Doing so would make the software applicable to proteins with either fixed or variable footprint sizes, which could help us understand the biological significance of these additional proteins. We modified the software and generated the site occupancy of each nucleotide for different PUM2 footprints, using the penalties for each nucleotide at each position, expressed as changes in Gibbs free energy, published for PUM2 in “A Quantitative and Predictive Model for RNA Binding by Human Pumilio Proteins”. We then tested our model by comparing its predicted changes in Gibbs free energy with the measured changes reported in the literature (Jarmoskaite et al., Mol. Cell, 966, 2019). As we incorporated additional, variable footprints, the predicted binding affinities moved closer to the measured values. We therefore conclude that adding multiple footprint sizes to RBPBind increases the prediction accuracy of the software.