Core Skills – Data Science/Machine Learning
Industry Experience
I've had the privilege of helping numerous companies, from start-ups to industry titans, with data science and machine learning projects.
As the culmination of my Master's program, I paired up with industry leaders to help them recommend products using machine learning. I used state-of-the-art natural language processing, advanced models and embeddings like BERT and Word2Vec, and the intuitions I honed throughout my graduate coursework to advise them on how to use the data.
A pillar of my internship project at Qualcomm was using recurrent neural networks to detect vulnerabilities (CWEs) in assembly code. I discussed a range of possible options for architectures, including LSTMs and BLSTMs, and read relevant literature on the topic of machine learning in security.
With incomplete data on noise exposure, I was challenged to approximate harm levels from very little data. This meant creating novel algorithms, tailored to my specific situation, to extrapolate from the data I had. This wouldn't have been possible without a deep understanding of the tools at my disposal, where their pitfalls were, and how to work around them to arrive where we did.
Academic Experience
My academic experience with machine learning started during my junior year at Carnegie Mellon. By the end of my senior year, I had completed a Master's in Data Analytics, concentrating on the latest techniques in machine learning and their application to the sciences. Below are the highlights of my coursework in this field; the full set can be found on my resume.
In a course taught by Professor Olexandr Isayev, we learned techniques such as Principal Component Analysis, k-means clustering, k-means++, agglomerative and divisive clustering, DBSCAN and other density-based clustering techniques, Bayesian models, linear and logistic regression, generalized linear models, support vector machines, decision trees, ensembles, and much more.
In another course taught by Professor Isayev, I explored the whole landscape of machine learning techniques, starting by writing my own neural networks from scratch. I then learned about feedforward networks, convolutional neural networks and the complexities that come with them, recurrent neural networks including LSTMs and BLSTMs, many sequence-to-sequence models including transformers, and other topics such as graph neural networks, autoencoders, regularization, and even GANs.
Computer Vision took me on a tour of a burgeoning field and the difficulties that come with it. Throughout the course, I got a taste of many different techniques. We began with image processing, including filters, texture identification, feature detectors and descriptors such as FAST and BRIEF, and Fourier transforms. Next, we delved into geometry and motion, including camera models and the planar homographies between them, as well as techniques for image alignment and for tracking motion and optical flow with algorithms such as Lucas-Kanade. During the latter half of the semester, we explored deep learning techniques, in particular convolutional neural networks and GANs, for recognition and generation problems.
Offered by Carnegie Mellon's world-renowned computational biology department, Neural Computation merged deep learning concepts with the latest neuroscience research. We began with the connection between Hebbian learning and long-term potentiation, then moved through topics like PCA and deep belief networks with the relevant biological background. As we moved into topics closer to deep learning, we discussed associative memory models, such as Boltzmann machines and Hopfield networks. Through various neuroimaging techniques, we saw similarities between the way many animals interpret visual signals and the latest research in convolutional neural networks. Finally, we dug into topics such as backpropagation and reinforcement learning, and their analogs in many central nervous systems.
As part of my Master's program at Carnegie Mellon, I used the Pittsburgh Supercomputing Center's new behemoth, Bridges-2, to learn techniques in so-called Big Data. I employed techniques specific to clustering large datasets and reducing their dimensionality, and, importantly, became familiar with the widely used MPI standard, which is essential to efficiently parallelizing computation in data analysis.