
Teaching a Computer to Assemble Jigsaw Puzzles
Hello everyone!
For a while now I've been meaning to write up the story behind my AI jigsaw puzzle solver. It's a project I started because I thought it would be easy — "just match the colors on the edges, right?" — and ended up being one of the most frustrating and rewarding computer vision experiments I've done.
The problem: you give the computer a photo of jigsaw pieces laid out on a table. It needs to figure out the full picture. That's it. Simple to explain, painful to implement.
Step one was segmenting each piece from the background. I started with basic thresholding and contour detection in OpenCV, which works okay if your background is a single flat color and the pieces have decent contrast. Once you lose either of those assumptions, the pipeline falls apart. I ended up adding a bunch of pre-processing — grayscale conversion, Gaussian blur, adaptive thresholding — to get a clean binary mask before running findContours.
After I had each piece isolated, I needed to characterize its shape. Jigsaw pieces have four "sides" and each side is either a tab (sticking out), a blank (sticking in), or a flat edge. Classifying which is which turned out to be the hardest part. I wrote a small routine that walks around the contour, finds the four corners (the sharpest turns in the polygon), and samples the midpoint of each edge. If the midpoint sits outside the bounding box of the two corner points, it's a tab. Inside, it's a blank. Straight across, it's flat.
Step two was matching sides. For two pieces to fit, their edge curves need to be near-mirrors of each other. I tried a couple of approaches:
- Resample each edge into a fixed number of points, then compute a pairwise distance between the two sequences.
- Compare color gradients along each edge — the piece of the image on one side should match the piece on the other.
Approach 1 gave me a solid geometric match; approach 2 disambiguated when multiple candidates had similar shapes. The combined score worked better than either alone.
Step three was assembly. Once I had a scoring function for "how well does edge A fit edge B", I treated the full puzzle as a graph problem — find the arrangement that minimizes total edge-match cost. I started with a greedy approach (place corner pieces first, then edges, then work inward) and it was good enough for most of my test puzzles.
Where it broke down:
- Puzzles with repeating patterns (lots of sky, lots of grass) — the color gradient signal isn't enough.
- Puzzles where the cut is aggressive and the tabs are big and weird — my side classifier would sometimes grab the wrong corners.
- Pieces that were flipped over — I hadn't thought about upside-down pieces at all.
I don't think this project is done. I'd love to retrain the side matching with a small CNN and probably swap the greedy assembler for a proper energy-minimization solver. For now though, it assembles most of the puzzles I throw at it, and I'm calling that a win.
Catch you in the next one!