Deep Learning: CNNs, Transformers & BERT Fine-Tuning

Implementing core deep learning architectures from scratch in PyTorch, then fine-tuning pretrained models for real classification tasks.

Python
PyTorch
HuggingFace
torchaudio
NumPy

Overview

This project, completed for UC Berkeley's CS 189 (Introduction to Machine Learning), covers the modern deep learning stack from first principles to practical fine-tuning. It spans five sub-projects: a convolutional network and a ResNet built from scratch for image classification, a full Transformer implemented from its mathematical building blocks, a BERT-style model fine-tuned to classify DNA sequences, and a ConvNeXt model fine-tuned to classify urban sounds from audio spectrograms.
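To give a flavor of the "mathematical building blocks" behind the Transformer, here is a minimal sketch of scaled dot-product attention in PyTorch. It is an illustration written for this page, not the coursework code; the function name, tensor shapes, and the causal mask in the usage lines are assumptions chosen for the example.

```python
import math

import torch


def scaled_dot_product_attention(q, k, v, mask=None):
    """Compute softmax(Q K^T / sqrt(d_k)) V for tensors shaped (batch, seq, d_k)."""
    d_k = q.size(-1)
    # Similarity of every query with every key, scaled to keep the softmax well-behaved.
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (batch, seq_q, seq_k)
    if mask is not None:
        # Positions where the boolean mask is False are blocked from attention.
        scores = scores.masked_fill(~mask, float("-inf"))
    weights = torch.softmax(scores, dim=-1)  # each row sums to 1
    return weights @ v, weights


# Hypothetical usage: self-attention over 5 tokens with a causal (lower-triangular) mask.
torch.manual_seed(0)
x = torch.randn(2, 5, 16)
causal = torch.tril(torch.ones(5, 5, dtype=torch.bool))
out, weights = scaled_dot_product_attention(x, x, x, mask=causal)
```

With the causal mask, the first token can only attend to itself, so the first row of `weights` puts all of its mass on position 0. A full Transformer layer would wrap this in multiple heads with learned projections, which the sketch leaves out.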

Architecture / Approach

Results / What I learned

Both fine-tuning tasks concluded with class-wide Kaggle competitions on held-out test sets. The DNABERT classifier placed 32nd of 449 (top 8%) on the leaderboard for the 3-class DNA species task, a noisy genomics benchmark where random guessing yields about 33% and my model scored 47%. The ConvNeXt sound classifier reached 94% test accuracy on UrbanSound8K, placing 76th of 401 (top 19%). The biggest takeaway was seeing how the same handful of primitives (attention, convolutions, residual connections, and a pretrained backbone with a task-specific head) power everything from image classification to genomics to audio.
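Two of those primitives, a residual connection and a backbone with a task-specific head, can be sketched in a few lines of PyTorch. This is a hedged illustration rather than the submitted models: `ResidualBlock`, `BackboneWithHead`, the tiny randomly initialized stand-in backbone, the input size, and the choice to freeze the backbone are all assumptions made for the example. In a real fine-tuning setup the backbone would be a pretrained network such as ConvNeXt.

```python
import torch
from torch import nn


class ResidualBlock(nn.Module):
    """Two 3x3 convolutions with a skip connection: y = relu(x + F(x))."""

    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        # The input is added back to the block's output before the final activation.
        return torch.relu(x + self.body(x))


class BackboneWithHead(nn.Module):
    """Feature extractor followed by a new linear classification head."""

    def __init__(self, backbone, feature_dim, num_classes, freeze_backbone=True):
        super().__init__()
        self.backbone = backbone
        if freeze_backbone:
            # Assumption for this sketch: only the new head receives gradients.
            for p in self.backbone.parameters():
                p.requires_grad = False
        self.head = nn.Linear(feature_dim, num_classes)

    def forward(self, x):
        return self.head(self.backbone(x))


# Stand-in backbone (randomly initialized here, NOT pretrained) over 1-channel spectrograms.
backbone = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),
    ResidualBlock(8),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
model = BackboneWithHead(backbone, feature_dim=8, num_classes=10)  # UrbanSound8K has 10 classes
logits = model(torch.randn(4, 1, 64, 64))  # hypothetical batch of 4 spectrograms
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
```

Freezing the backbone is only one option; fine-tuning all layers with a smaller learning rate is a common alternative, and this page does not state which was used.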

Source code is not published because this was graded university coursework; this page describes the approach and results instead.