Research Interests & Active Work
Notes on what I'm working on, what's caught my eye lately, and where I want this to go.
Active work
- Compute-in-memory transformer accelerator. A self-directed FPGA capstone that I'm building in seven milestones. The first is an 8×8 INT8 output-stationary systolic MAC array. It verifies bit-exact against a NumPy reference and closes timing at 100 MHz on Xilinx Artix-7. The next milestones extend that into a full transformer datapath with attention and FFN blocks.
- SSCS PICO Chipathon. I'm in the current cohort and plan to submit a digital compute-in-memory design. The top submissions get the chance to be published as open-source designs by the IEEE Solid-State Circuits Society, which is what I'm aiming for.
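To illustrate the kind of bit-exact check mentioned for the systolic array milestone, here is a minimal sketch. It is not the project's actual testbench: the function names are hypothetical, INT32 accumulation is an assumption, and the cycle model only captures the output-stationary idea (each PE keeps its own accumulator while skewed operands stream past), not any RTL detail.

```python
import numpy as np

N = 8  # array dimension (8x8 PEs), taken from the milestone description


def reference_matmul(a, b):
    """Golden model: INT8 operands, INT32 accumulation (assumed width)."""
    return a.astype(np.int32) @ b.astype(np.int32)


def systolic_output_stationary(a, b):
    """Cycle-by-cycle model of an output-stationary array.

    PE (i, j) owns accumulator acc[i, j]. With skewed inputs, the k-th
    operand pair reaches PE (i, j) at cycle i + j + k.
    """
    rows, depth = a.shape
    cols = b.shape[1]
    acc = np.zeros((rows, cols), dtype=np.int32)
    # Last operand pair arrives at cycle (rows-1) + (cols-1) + (depth-1).
    for cycle in range(depth + rows + cols - 2):
        for i in range(rows):
            for j in range(cols):
                k = cycle - i - j
                if 0 <= k < depth:
                    acc[i, j] += np.int32(a[i, k]) * np.int32(b[k, j])
    return acc


rng = np.random.default_rng(0)
a = rng.integers(-128, 128, size=(N, N), dtype=np.int8)
b = rng.integers(-128, 128, size=(N, N), dtype=np.int8)

bit_exact = np.array_equal(systolic_output_stationary(a, b),
                           reference_matmul(a, b))
print("bit-exact:", bit_exact)
```

In a real flow the second function would be replaced by values read back from the simulated design, with the same `np.array_equal` comparison against the golden model.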
Conferences
- IEEE ISSCC 2026, San Francisco. I went in person and spent most of the time in the AI-accelerator and compute-in-memory sessions. I also got to talk with engineers from Qualcomm, IBM, Samsung, Synopsys, and GSMC. The thing that stuck with me was how much of the published work is bottlenecked by analog periphery and on-chip memory rather than by the compute fabric itself.
Heading toward
After undergrad I'd like to go into research on ML-accelerator microarchitecture. It sits between what an algorithm wants to do and what silicon will actually let you build, and I find that tension the most interesting part of the field. Some of the specific questions I'm chasing:
- Quantization-aware datapaths and sparsity-aware dispatch. What's the architectural cost of supporting both dense and sparse workloads on the same fabric?
- Compute-in-memory beyond the demo. The compute fabric itself is the easier part; what really limits realistic workloads is the periphery (drivers, ADCs, accumulators).
- Memory hierarchies for accelerators. Bank-conflict avoidance, and the scratchpad-versus-coherent-cache trade-off for systolic and dataflow engines.
Reading
- Spear, C. & Tumbush, G. SystemVerilog for Verification. Springer, 3rd ed., 2012.
Most of my verification flow is cocotb in Python, so this book is filling in the SystemVerilog side. The SVA, randomization, and coverage features don't translate cleanly into a Python harness, and I want to be fluent in both.