Project Title: A recursive formulation for a rank sum statistic used to detect genomic copy number variation.
Student: Nam Tran Hoang ’15
Mentor: Dr. Craig Jackson
Copy number variation (CNV) results from duplications and deletions of genomic DNA. Because CNVs have been found to correlate with a number of genetic diseases, detecting and characterizing CNV is a major goal of genetic research. Recently, a rank-based method has been developed to analyze raw CNV data.
This method ranks a DNA sample against multiple controls across multiple DNA sections. The overall CNV of the sample is then determined by a statistical comparison of the sample's Rank-Sum against the discrete null distribution. The accuracy of the method therefore depends, to a large degree, on an accurate representation of the null distribution. So far, the exact null distribution has only been approximated using the continuous Irwin-Hall distribution.
This study rigorously proves several recursive formulations for the weights of the random Rank-Sum statistic. Unexpectedly, these recursive formulae yield a generalized form of the binomial coefficients. The descriptive statistics of the exact null distribution are also derived.
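As a rough illustration of how such a recursion can work, the sketch below counts the ways a rank sum can arise. It is not the study's own formulation. It assumes, for illustration only, that under the null hypothesis each of n sections contributes an independent rank drawn uniformly from 0 to m-1; the names rank_sum_weights and mean_and_variance are hypothetical. Under that assumption the weights for n sections follow from those for n-1 sections by adding one more rank, and with m = 2 they reduce to the ordinary binomial coefficients.

```python
from fractions import Fraction


def rank_sum_weights(n, m):
    """Count the ways n ranks, each in 0..m-1, add up to each total s.

    Assumption (not from the source): ranks are independent and uniform.
    Returns a list w where w[s] is the number of rank tuples summing to s.
    """
    weights = [1]  # n = 0: only the empty sum, which totals 0
    for _ in range(n):
        nxt = [0] * (len(weights) + m - 1)
        # Recursive step: every way to reach s with one fewer rank
        # extends to s + j for each possible new rank j.
        for s, w in enumerate(weights):
            for j in range(m):
                nxt[s + j] += w
        weights = nxt
    return weights


def mean_and_variance(n, m):
    """Exact mean and variance of the rank sum under the same assumption."""
    weights = rank_sum_weights(n, m)
    total = m ** n  # number of equally likely rank tuples
    mean = Fraction(sum(s * w for s, w in enumerate(weights)), total)
    second = Fraction(sum(s * s * w for s, w in enumerate(weights)), total)
    return mean, second - mean ** 2


print(rank_sum_weights(3, 2))   # [1, 3, 3, 1], the binomial coefficients
print(rank_sum_weights(2, 3))   # [1, 2, 3, 2, 1]
print(mean_and_variance(4, 5))  # (Fraction(8, 1), Fraction(8, 1))
```

Dividing each weight by m ** n gives the exact discrete probabilities under this assumed model, and the computed moments agree with the closed forms n(m-1)/2 for the mean and n(m^2-1)/12 for the variance.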
The Irwin-Hall approximation is compared with the exact null distribution and is shown to underestimate the standard deviation and overestimate the kurtosis. Data simulations show that the Irwin-Hall approximation also increases the likelihood of type I error (false positive) and overstates the test power. Hence, the use of these recursive formulae improves the ability of this rank-based method to detect CNV.