Overview

GANNs with friends is an educational distributed deep learning system designed to teach students about:

Distributed machine learning systems
Generative Adversarial Networks (GANs)
Database-coordinated computing
Collaborative training at scale

What makes this project unique?

Database-coordinated distributed training

Unlike traditional distributed systems that require complex networking setup, this project uses PostgreSQL as a coordination layer. This means:

No port forwarding or VPN setup required
Students can participate from anywhere with internet access
Simple to set up and manage
Fault-tolerant by design

Multiple participation paths

We support four different ways to participate, removing barriers to entry:

Google Colab - Zero installation, free GPU
Dev container - Full development environment
Native Python - Direct local installation
Conda - For conda users

Educational focus

This project is designed from the ground up for teaching:

Clear separation between coordinator and worker roles
Well-documented codebase with extensive comments
Multiple difficulty levels (simple worker to full coordinator)
Real-time visualization of training progress

Key concepts

Distributed data parallel training

Each worker:

Receives a unique batch of images
Computes gradients independently
Uploads results to the shared database
Continues with the next available work unit

The coordinator:

Creates work units (batches of image indices)
Waits for N workers to complete their work
Aggregates the gradients
Updates the model weights
Publishes new weights for the next iteration

GAN architecture

The project trains a DCGAN (Deep Convolutional GAN) to generate celebrity faces:

Generator: Transforms 100D random noise into 64x64 RGB images
Discriminator: Learns to distinguish real from generated images
Adversarial training: Generator and discriminator compete, driving improvement

Database as message queue

PostgreSQL provides:

Atomic work unit claiming with FOR UPDATE SKIP LOCKED
BLOB storage for model weights and gradients
Worker heartbeat tracking
Training state persistence

Project outcomes

After completing this project, students understand:

How modern distributed training systems work
The challenges of coordinating multiple workers
GAN architecture and training dynamics
Database transactions and concurrency
Practical considerations in distributed ML

Next steps

Background - Deeper technical background on GANs and distributed training
Installation - Set up your environment
Quick start - Get running in 5 minutes