Andrew Zhang and Kalyan Palepu named Siemens semifinalists

Andrew Zhang III and Kalyan Palepu II were named semifinalists this fall in the Siemens Competition, the nation’s premier research competition in math, science, and technology for high school students. Under the mentorship of Professor Gil Alterovitz of Harvard Medical School, Andrew and Kalyan created a compression system for genome data to make it more accessible for daily clinical use. The system takes advantage of recent advances in machine learning, which can discover features in massive data sets without the computer being told explicitly how to find them. Andrew explains:

“Our system consists of two parts: one for finding features in genome data and using those patterns to compress the data, and the other for reconstructing the data. More specifically, each side is an artificial neural network, which simulates a biological neural network (a brain). It does this through a network of interconnected nodes, which each perform some mathematical function. A program runs the system repeatedly, each time adjusting parameters to tune its accuracy. Fortunately, Google’s machine learning framework, TensorFlow, hides all the technical details, and provides an easy interface to use. We designed and implemented our system with Python (a computer language) and TensorFlow.”
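The two-part design Andrew describes, a network that compresses and a network that reconstructs, both tuned by running the system repeatedly, can be illustrated with a toy example. The sketch below is not the students' system: it uses plain Python instead of TensorFlow so that it runs on its own, and the one-hot encoding of DNA bases, the two-number code, the learning rate, and all function names are assumptions chosen for illustration. It squeezes four numbers into two, so its reconstruction is only approximate, unlike the lossless result reported for the real system.

```python
import random

BASES = "ACGT"

def one_hot(base):
    """Encode one DNA base as a 4-element vector, e.g. 'C' -> [0, 1, 0, 0]."""
    return [1.0 if b == base else 0.0 for b in BASES]

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def train_autoencoder(samples, code_size=2, steps=500, lr=0.1, seed=0):
    """Train a tiny linear encoder/decoder pair with gradient descent.

    Hypothetical helper for illustration: 'enc' maps an input to a shorter
    code, 'dec' maps the code back to the input's original length.
    """
    rng = random.Random(seed)
    n, m = len(samples[0]), len(samples)
    enc = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(code_size)]
    dec = [[rng.uniform(-0.5, 0.5) for _ in range(code_size)] for _ in range(n)]
    losses = []
    for _ in range(steps):
        g_enc = [[0.0] * n for _ in range(code_size)]
        g_dec = [[0.0] * code_size for _ in range(n)]
        loss = 0.0
        for x in samples:
            code = matvec(enc, x)      # compress
            out = matvec(dec, code)    # reconstruct
            err = [o - t for o, t in zip(out, x)]
            loss += sum(e * e for e in err)
            # Gradients of the squared reconstruction error.
            for i in range(n):
                for k in range(code_size):
                    g_dec[i][k] += 2 * err[i] * code[k]
            for k in range(code_size):
                d_code = sum(2 * err[i] * dec[i][k] for i in range(n))
                for j in range(n):
                    g_enc[k][j] += d_code * x[j]
        losses.append(loss / m)
        # Adjust the parameters a little to reduce the error on the next pass.
        for k in range(code_size):
            for j in range(n):
                enc[k][j] -= lr * g_enc[k][j] / m
        for i in range(n):
            for k in range(code_size):
                dec[i][k] -= lr * g_dec[i][k] / m
    return enc, dec, losses

samples = [one_hot(b) for b in BASES]
enc, dec, losses = train_autoencoder(samples)
print("average error before training:", losses[0])
print("average error after training: ", losses[-1])
```

Each pass through the loop plays the role of one "run" in Andrew's description: the encoder and decoder process the data, the reconstruction error is measured, and every parameter is nudged in the direction that lowers it. A framework such as TensorFlow computes these gradients automatically, which is the technical detail the quote says it hides.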

Andrew and Kalyan’s system achieved a 136x reduction in genome data file size (from 500MB down to 23MB) and recovers the original data without loss. It compresses an entire genome in three hours and decompresses any section in a fraction of a second, all on a laptop.

In addition to being named Siemens semifinalists, Andrew and Kalyan won second place in the 2017 Chinese-American Biomedical Association (CABA) research competition.