Joel Bailey ’21 – This summer, I completed a computer science research internship at Wabash College. I worked with Dr. McCartin-Lim virtually, and we met for about an hour each day. We typically discussed what we had accomplished the previous day and then planned our next steps. After each meeting, I worked independently for the rest of the day, usually either reading relevant research papers or working with the code provided by the authors of those papers. The research project is in its early phases, so we are currently focusing on learning from previous research.

The objective of our research is to apply machine learning techniques to proving mathematical theorems. Writing a proof means starting from hypotheses that are assumed to be true and reasoning step by step to a conclusion. A simple example would be “Assume that Matt gave Dana an apple. In conclusion, Dana has an apple.” Programs that automatically prove theorems already exist, but our goal is to use machine learning for that task. A machine learning model identifies patterns in large amounts of data, and those patterns can then be tested on new examples. For example, a model that receives a large dataset of cat images may learn to look for whiskers or cat eyes. Afterward, when the model is asked whether an image contains a cat, it will look for those patterns and report its conclusion. Machine learning has been successful with problems that previously could be solved only with human intuition, not with technology.
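
To make the apple example concrete, here is a minimal sketch of how it might be written for a proof assistant. It uses Lean 4, chosen only for illustration and not necessarily a tool used in this project. The proposition names and the `rule` hypothesis (that giving Dana an apple implies Dana has one) are assumptions added for the sketch. The everyday version of the argument leaves that rule unstated.

```lean
-- Hypothetical illustration; all names here are assumptions, not part of the research code.
-- From a hypothesis `h` and a rule linking it to the conclusion, we derive the conclusion.
theorem dana_has_apple (MattGaveDanaApple DanaHasApple : Prop)
    (rule : MattGaveDanaApple → DanaHasApple)  -- assumed: being given an apple means having one
    (h : MattGaveDanaApple) :                  -- hypothesis: Matt gave Dana an apple
    DanaHasApple :=
  rule h  -- apply the rule to the hypothesis
```

An automated theorem prover has to find a step like `rule h` on its own, and for real theorems there may be many hypotheses and rules to choose from at each step.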

There are many different models available, and my primary duty was to find the one that worked best for our research. That involved a lot of reading: I had to analyze the models and results from previous research and identify each paper’s key points. The papers were usually not easy to read, and I often had to reread passages to understand the technical details behind the researchers’ models. Sometimes researchers publicly released the code they used, which gave me a hands-on way to explore their models. Even if the researchers’ experiments were not relevant to proving theorems, I could still reuse their model on a problem more relevant to proofs. Running the code was often resource-intensive, so a big challenge was keeping my laptop from overheating.

I learned from this experience that it takes a lot of work to lay the foundation for any research project. Before this internship, I had already noticed that research papers in any field often include many references to previous documents and data. Now, I appreciate how much work goes into reading and compiling all of those references. Based on my experience, it seems researchers may have to read two or three times as many papers as they end up referencing, because many of those papers do not directly contribute to their research.