Solving Sudoku is a classic constraint satisfaction problem usually handled by backtracking or constraint propagation. I wanted to see if a neural network could learn the underlying rules of the game without being explicitly programmed with them, so I built Sudoku Solver.
This project serves as a practical exploration of graph-based architectures, applying concepts from my Machine Perception & Learning (MPL) studies to non-Euclidean data structures.
It is a Graph Neural Network (GNN) that treats the Sudoku grid as a graph where each cell is a node and constraints (row, column, box) are edges. The model learns to predict the correct value for each empty cell through node classification.
The Core Concept
The project frames Sudoku as a node classification task on a graph with 81 nodes. Each node is connected to 20 other nodes (8 in the same row, 8 in the same column, and 4 in the same 3x3 box).
The GNN uses message passing to propagate information between cells. Here is a conceptual look at the message passing step:
import torch
import torch.nn as nn
from torch_geometric.nn import GCNConv
class SudokuGNN(nn.Module):
def __init__(self, num_features, hidden_dim):
super(SudokuGNN, self).__init__()
self.conv1 = GCNConv(num_features, hidden_dim)
self.conv2 = GCNConv(hidden_dim, hidden_dim)
self.classifier = nn.Linear(hidden_dim, 10) # 0-9 values
def forward(self, x, edge_index):
x = self.conv1(x, edge_index).relu()
x = self.conv2(x, edge_index).relu()
return self.classifier(x)The Deep Dive: How it works
The solver moves beyond simple heuristics by learning the spatial relationships and constraints directly from data.
1. Graph Representation
The Sudoku board is represented as a fixed graph structure. Each of the 81 cells is a node, and edges represent the “conflicts” defined by Sudoku rules. This allows the network to “see” which cells affect each other simultaneously.
2. Node Classification
Instead of predicting the entire board at once, the network performs multi-class classification for each node. Given the initial clues as input features, the GNN refines its internal representations through multiple layers of message passing until it can confidently assign a value to every cell.
3. Training & Dataset
The model is trained on a large dataset of Sudoku puzzles and their solutions. It learns to minimize the cross-entropy loss between its predictions and the ground truth, effectively “learning” the rules of Sudoku through exposure to millions of solved examples.
Project Structure
The project is organized for both training and inference:
data/: Utilities for generating and loading Sudoku datasets.model/: The GNN architecture and message passing logic.train.py: The training loop, including loss functions and optimizers.solve.py: A CLI tool to solve any Sudoku puzzle using the trained model.
Usage
Solve a puzzle from a string representation:
python solve.py --puzzle "53..7....6..195....98....6.8...6...34..8.3..17...2...6.6....28....419..5....8..79"Getting Started
Sudoku Solver is built with Python and PyTorch Geometric.
git clone https://github.com/yuazi/sudoku_solver
cd sudoku_solver
pip install -r requirements.txt