Eric Zhang Projects

Open Source

I view building software in the open as a mode of creative exploration. It lets me quickly act on inspiration, delve into new topics, and make tools that improve people's lives.

You'll see that I particularly like programming languages, distributed systems, machine learning, computer graphics, music, and art.

If you find something interesting, let me know!

sshx November 2023

Rust
Svelte
Systems

A secure web-based, collaborative terminal.

sshx takes your terminal and shares it on the web. It works on screens of all sizes. Create new windows, resize, drag and drop, and see other people’s cursors as they type, with live output. Everything is end-to-end encrypted over a low-latency global server mesh.

This was the synthesis of two years of planning, development and experimentation. The result is a unique software artifact. It combines work as a solo engineer, crafting a complex, performant streaming system, and as a designer of deeply collaborative software.

Links: GitHub, Website

sshx preview image

Rushlight June 2023

TypeScript
Systems
Redis

Collaborative code editors that run on your own infrastructure.

Rushlight is my take on the essence of a persistent, collaborative text editor based on operational transformation, as a library you would embed in a larger app.

It supports multiple real-time documents, with live cursors, based on CodeMirror 6. The backend is stateless, and you bring your own transport — even a boring HTTP handler works. There aren’t any proprietary gadgets or opaque configurations. It’s just a distributed data structure in its purest pragmatic form.

Links: GitHub, Demo

Rushlight preview image

Dispict November 2022

Python
Svelte
Art
Vision

Creative aesthetics tool to discover art museums.

Design a growing artistic exhibit of your own making, with semantic search powered by OpenAI CLIP. Bring your own labels and context.

Creativity and curiosity require associative thinking. What if computers could connect us with masterpieces made by artists of the past? Dispict is a new way to experience the Harvard Art Museums’ online collection (containing over 200,000 digitized works) through language-image machine learning models, helping you curate your own personalized galleries.

Links: GitHub, Website

Dispict preview image

Inline SQL October 2022

Python
Programming Languages
Databases

Weaving relational queries into the fabric of Python.

The inline-sql library provides a simple embedded language for writing SQL in Python programs. Operations run directly within the calling process, and they execute on Pandas dataframes. Local variables are automatically interpolated as parameters.

This is implemented as a small wrapper around DuckDB, so it is extremely fast. From my own experimental perspective, the intent of the project was to see whether we can usefully and ergonomically embed declarative, relational query languages within general-purpose media.

Links: GitHub, PyPI

Inline SQL preview image

classes.wtf August 2022

Go
TypeScript
Svelte
Systems

A course catalog with extremely fast full-text search.

Harvard has many course search websites, but none of them are good. This project is an attempt to take the problem more seriously: write high-performance software and set great defaults so that people can get better, more useful suggestions, 100x faster.

Classes.wtf is a custom, distributed course search engine that focuses on speed and quality of results. The goal is for the entire {request, computation, response, and render} pipeline to take under 30 milliseconds. I built it in a weekend out of personal annoyance and launched a week later; it’s now pretty popular among students at my school.

Links: GitHub, Website

classes.wtf preview image

Redis Rope July 2022

Zig
Rust
Systems
Algorithms

Fast native data type for manipulating large strings in Redis.

redis-rope is a fast and versatile rope data type for large strings in Redis, distributed as a native module. It’s primarily written in Zig and tested with Rust.

The ropes in this module are backed by splay trees, which are a self-adjusting data structure that has logarithmic amortized worst-case performance, while recently-accessed indices are also quick to access in subsequent operations. Each splay tree node stores between 64 and 127 bytes of data.

More than just being theoretically interesting, this module also tries to be somewhat practical by caring about safety, correctness, and speed. It approaches the performance of ordinary strings for simple operations and is hundreds of times faster for complex operations.

Links: GitHub, Releases

Redis Rope preview image

Bore – Localhost Tunnels April 2022

Rust
Systems
CLI

A modern, simple TCP tunnel in 400 lines of Rust.

bore is a minimal CLI tool that exposes local ports to a remote server, bypassing standard NAT connection firewalls. That’s all it does: no more, and no less.

With a single binary, you can expose any number of local TCP ports to the public internet at a specified remote server address. There is a public instance of the tunneling server running at bore.pub, available for anyone to use, similar to ngrok or localtunnel. It’s also very easy to host your own server with a single bore server command in the same binary executable, with optional authentication.

Links: GitHub, Crates.io, Documentation

Bore – Localhost Tunnels preview image

Percival November 2021

Rust
Svelte
Datalog
Visualization

Web-based, reactive Datalog notebooks.

Percival is a declarative data query and visualization language. It provides a reactive, web-based notebook environment for exploring complex datasets, producing interactive graphics, and sharing results.

Percival combines the flexibility of Datalog as a query language for relational data with the beauty of exploratory visualization grammars. These declarative components interact through a reactive dataflow system. Because Percival uses web technologies (including Web Workers for multithreaded, sandboxed execution), fully-interactive notebooks can be shared with anyone on the Internet, making data analyses more tangible to others.

Links: GitHub, Website

Percival preview image

Composing Studio September 2021

TypeScript
Rust
Music
React

Collaborative music composition for everyone.

Composing Studio is a free and easy-to-use online music notation editor that lets anyone collaborate in real time on short musical pieces. To use it, just go to composing.studio in your browser, create a new session, and share the link with other musicians! There’s no setup or installation required. You’ll be able to typeset musical notation while seeing each other’s work in real time (just like Google Docs), with instant sheet music rendering and live audio playback.

This project originated at HackMIT 2021 (we won a grand prize), where I pitched the idea and formed a team of four programmer-musicians. Although we hadn’t previously known each other, we all shared the same goal of exploring collaborative music with the global community.

Links: GitHub, Website

Composing Studio preview image

µKanren-rs September 2021

Rust
Programming Languages
Systems

A featherweight relational programming language.

This is a Rust implementation of µKanren, a very minimal language in the miniKanren family. See the original Scheme implementation here for reference. Similar to how functional programming focuses on functions as the core unit of abstraction, relational programming aims to represent common mathematical concepts as logical relations.

The library was originally implemented as an exercise for a graduate programming languages design seminar at Harvard, but I made the code available to the open source community as a Rust crate.

Links: GitHub, Crates.io, Documentation

µKanren-rs preview image

Rustpad June 2021

Rust
Systems
TypeScript
React

A self-hosted online collaborative code editor.

Rustpad is an efficient and minimal collaborative text editor based on the operational transformation algorithm. It lets users collaborate in real time while writing code in their browser. Rustpad is completely self-hosted and fits in a tiny Docker image, no database required.

Architecturally, client-side code communicates via WebSocket with a central server that stores in-memory data structures. This makes the editor very fast, allows us to avoid provisioning a database, and makes testing much easier. It demonstrates the power of distributed systems and concurrent network programming, designing with consideration for the entire stack.

Links: GitHub, Website

Rustpad preview image

Pencil Sketch Rendering May 2021

OpenGL
Graphics
Geometry
Algorithms

Geometry processing for real-time pencil sketching.

This research project explores the task of drawing 3D objects, either triangle meshes or implicitly represented as signed distance fields (SDFs), in the art style of a pencil sketch. We review, implement, and extend existing methods with geometry processing techniques.

Our primary contribution is a new scale-invariant algorithm for estimating surface curvatures of an SDF within the graphics pipeline, which was a previously unexplored topic. This algorithm has the advantage of enabling real-time rendering of dynamic geometries at arbitrary scales (modeled by implicit functions), without the noise sensitivity of mesh-based methods.

Links: GitHub, Website, Paper

Pencil Sketch Rendering preview image

Graphics Workshop April 2021

OpenGL
Graphics
Shaders

Learn computer graphics by writing GPU shaders!

This workshop contains a selection of self-guided projects designed to teach the basics of computer graphics. Each project introduces an important graphics technique that is extensively used in real-world applications. The code is designed to run in real time on modern GPUs, without requiring any extra software. I wrote this when frustrated with the lack of proper documentation and simple code examples for GPU shaders.

The topics include fragment shaders (GLSL), procedural texture generation, rasterization, lighting calculations, and real-time ray tracing. I hosted a live version of this workshop at Harvard, and it has been used by thousands of self-learners after gaining popularity in the /r/gamedev community.

Links: GitHub, Deployment

Graphics Workshop preview image

Path Tracer December 2020

Rust
Graphics
Rendering

A physically-based path tracer in Rust.

This a physically based, CPU-only rendering engine written in Rust. It implements a Monte Carlo path tracing algorithm for global illumination. There’s a lot of features, including k-d tree mesh acceleration, physical material properties (microfacet BSDF with multiple importance sampling), HDRI environment maps, OBJ/MTL/STL files, depth of field, and particle physics simulation.

It’s also parallelized with rayon and available as a library on crates.io. The entire source code, including code for the example images and more, is very short (~3K SLOC). We’re still looking to extend it with bidirectional path tracing and other features.

This won top project out of 100 students in MIT’s computer graphics class (6.837, Fall 2020).

Links: GitHub, Crates.io, Documentation

Path Tracer preview image

Langevin Dynamics for Composition November 2020

Python
Machine Learning
Music
Transformers

Generative modeling of Bach chorales by gradient estimation.

This research project introduces a new generative model for music composition, based on annealed Langevin dynamics and a noise-conditional score matching algorithm using transformers. Unlike implicit models such as GANs, this learns an explicit distribution of the input data.

We study if Langevin dynamics and score matching can combine the controllability of Markov chain Monte Carlo (MCMC) methods with the global view and fast convergence of stochastic gradient descent, to produce high-quality, structured musical compositions.

Our contribution is to look in the direction of designing generative deep learning models for music that strongly avoid local minima, while retaining controllability.

Links: GitHub, Paper

Langevin Dynamics for Composition preview image

Crepe – Logic Programming in Rust September 2020

Rust
Programming Languages
Datalog

Fast, compiled Datalog for static analysis, with Rust integration.

Crepe is a library that allows you to write declarative logic programs in Rust, with a Datalog-like syntax. It provides a procedural macro that generates efficient, safe code and interoperates seamlessly with Rust programs.

Inspired by the power of Souffle and Formulog, this project is unique in enabling safe integration of arbitrary functions from the host language (Rust) within compiled Horn clauses. It also includes many features of modern Datalog implementations: stratified negation, semi-naive evaluation, and automatic index generation.

Links: GitHub, Crates.io, Documentation

Crepe – Logic Programming in Rust preview image

Fast Semantic Segmentation August 2020

Python
Machine Learning
Vision

State-of-the-art, real-time semantic segmentation with MobileNetV3.

While working at Nvidia, I open sourced PyTorch code and pretrained weights for real-time semantic segmentation of street images. The goal was to make it easy for anyone to use cutting-edge machine learning algorithms for semantic segmentation tasks, especially in embedded applications.

This was a time-consuming effort aided by researchers from ADLR. Although the MobileNetV3 paper reported 72.6% mIoU accuracy, previous public implementations only scored around 40-55% mIoU. After many adjustments to model architecture, loss functions (RMI), and hyperparameters, I was able to train models reaching 72.3% mIoU, while running inference at up to 37.3 FPS on a modern GPU.

I currently maintain this code as a package on PyPI, along with scripts for inference and exporting models to different formats for deployment.

Links: GitHub, Colab

Fast Semantic Segmentation preview image

Set with Friends January 2020

JavaScript
Game
React
Firebase

An online, real-time multiplayer card game.

Set with Friends is a web implementation of a real-time pattern matching card game called Set. Originally, when I was at SPARC 2019, I wondered how I could bridge the 3000 mile gap between some of the friends I had made after we parted ways. Set with Friends lets you play Set online with minimal overhead (no login necessary); it’s as simple as sharing a custom link and having fun!

During the design process, I learned about Firebase, so working on this app introduced me to serverless architectures and provided a good exercise in creating live interfaces. We completely overhauled the design in May and released version 2.0 in June. At its peak, the website had over 40,000 monthly active users for half a year, and as of September 2022 there have been 5,000,000 games played.

Links: GitHub, Website

Set with Friends preview image

Competitive Programming Workspace September 2019

JavaScript
CP
React
Express

An online, cloud-synchronized workspace for competitive programmers.

Top competitive programmers have dedicated setups on their local machines that let them quickly creating new programs from templates, generate input files, and run code on suites of test data. But what if you’re working on a different computer, and you don’t have all of these tools installed?

Workspace is a side-by-side problem viewer and code editor that allows you to run code online. It supports advanced features like autocompletion and automatically parses of test cases for each problem, saving time in a programming competition. This was inspired by CS Academy‘s interface.

The web server lets you scrape problems from online judges at the click of a button. All code is automatically saved and synchronized with a MongoDB instance in the cloud, so it persists across sessions and browsers. See the website below to try it out.

Links: GitHub, Website

Competitive Programming Workspace preview image

Polygon Triangulation with Holes May 2019

C++
Algorithms
Emscripten
Graphics

Fast computational geometry algorithms in WebAssembly.

This was an experiment to implement the ear clipping algorithm from computational geometry for polygon triangulation. Although easy to describe, triangulation is a surprisingly difficult problem, as there are many corner cases involving concave polygons, holes, and nested polygons. Therefore, we need to use some clever techniques to make the problem tractable.

Note that there is a randomized O(N log* N) algorithm for this problem (Seidel 1991). However, for most practical applications, ear clipping is fast enough and produces more robust results, while having lower constant-factor overhead due to the use of simpler data structures.

I implemented the algorithm in C++ from scratch, then compiled it to WebAssembly using Emscripten to produce code that could be run by the browser.

Links: GitHub, Website

Polygon Triangulation with Holes preview image

Procedural Harmony February 2019

Python
Music
Algorithms
Flask

Dynamic programming applied to four-part harmony.

This is a Python program that takes as input a string containing keys and roman numerals representing a chord progression, and voices the resulting chords according to common practice voice leading conventions. This can be used to try out chord progressions, compose a short chorale, and solve music theory exercises.

Parsing and interpreting the roman numerals was done with Music21. This project required satisfying a technical set of voice leading constraints, which I framed as a combinatorial optimization problem. Although the set of possible voicings is exponential in size, you can use dynamic programming on intermediate states for a relatively fast algorithm. See the website below for an interactive version deployed with Flask.

The code and idea behind this project were used as the basis for an abstract at ISMIR 2020, written by computational music theory researchers at McGill University.

Links: GitHub, Website

Procedural Harmony preview image

Algorithm and Data Structure Library January 2019

C++
CP
Algorithms
Angular

A competitive programmer's library of algorithms and data structures in C++.

I maintain an open-source library of about thirty-five C++ templates for algorithms and data structures. These are fairly advanced tricks targeted toward top competitive programmers, including fast implementations of FFT, Aho-Corasick string search, and Dinic’s blocking flow algorithm, as well as data structures like link-cut trees and centroid decomposition.

This is accessible on a static website, which I developed using Angular. The website also includes a searchable collection of more than 600 code samples from other open-source libraries.

Links: GitHub, Website

Algorithm and Data Structure Library preview image

Canvas Hashlife December 2018

JavaScript
Algorithms
Graphics
Vue

An ultra-fast simulation of Conway's Game of Life in the browser.

Ever wanted to simulate a 2100 × 2100 grid of cells over 250 generations? Well, now you can do that in just milliseconds without any specialized software, right in your browser!

How is it possible to simulate such a massive pattern without running out of CPU or memory? The secret is Bill Gosper’s Hashlife algorithm, which combines quadtrees and memoization to compress space and time. In the image, you can see a frontend web application computing the 498,913,509,376th generation of a Turing machine pattern in just under a millisecond.

Links: GitHub, Website

Canvas Hashlife preview image

Char-RNN Keras October 2018

Python
Machine Learning
Music

Character-level language models with recurrent neural networks in Keras.

Inspired by Andrej Karpathy’s blog post, this is an implementation of a multi-layer long short-term memory (LSTM) network in Keras, for learning and sampling from character-level patterns in text. This was trained to generate writing in the style of historical authors, like Shakespeare, Hugo, and Carroll.

A really interesting application was training this model on a text-based corpus of folk music, from which it could compose fairly convincing new music samples. See the music-gen demo below.

Links: GitHub, Demo

Char-RNN Keras preview image

Handwriting Generator August 2018

Python
Machine Learning
Graphics

Variational autoencoder that learns latent features in handwriting.

This is a neural network that encodes 28×28 images of handwritten characters in a 6-dimensional latent space using deep learning. It consists of two parts: an encoder block that puts the input through several convolutional and downsampling layers to extract latent variables, and a decoder block that takes latent variables and a label to reconstruct the original image, minimizing mean squared error. During the training process, we add some Gaussian noise to the latent space, which is the so-called “reparameterization trick” of a variational autoencoder.

I implemented the model in Keras and trained it on a dataset of over 400,000 handwritten characters. You can see the results in the web demo below, which stitches the outputs together onto a canvas. Without any prior knowledge, the model is able to isolate six of the most important characteristics in human handwriting.

Links: GitHub, Demo

Handwriting Generator preview image

Julia Fractal Renderers April 2018

C++
Java
Graphics
TypeScript
OpenGL

Fast, interactive fractal renderers in C++, Java, and WebGL.

Originally, this project started out as a multithreaded Julia fractal renderer in C++, which I used to create high-quality static images and animations. However, I also wanted to interactively explore the fractals by zooming and panning, so I created a Java Swing app with similar multithreaded performance.

The speed of these implementations was heavily CPU-bound though, so I finally implemented an OpenGL shader that could render Julia fractals with blazing-fast speed, fast enough to be explored interactively with almost no lag. I rendered using WebGL and connected it to mobile-friendly controls with TypeScript. You can play with this fractal explorer in the website linked below.

Links: GitHub, GitHub (Java), GitHub (WebGL), Website

Julia Fractal Renderers preview image
Julia Fractal Renderers subimage
Julia Fractal Renderers subimage
Julia Fractal Renderers subimage

Universal Gravity Simulator May 2016

JavaScript
Graphics
Physics

Control hundreds of tiny planets with gravity.

This was one of my first fun programming projects, a visual experiment simulating gravity in a system with hundreds of interacting objects. Calculations are done using simple physics formulas, with a bit of damping to make them more robust. The user can click to add an invisible mass that attracts all objects to the cursor. Everything is rendered in real time on a <canvas> element, and it’s quite satisfying to move the colorful balls around!

Links: GitHub, Website

Universal Gravity Simulator preview image