Docker Inside Docker: The Movie
I generally don't post debugging or troubleshooting stories. Bugs have a short shelf life, no matter how cool they are :<
I'll make an exception for this story.
Once upon a time, we had a database. A database instance, running in a Docker container.
We want to visualise its contents, and we find a nice database visualiser on GitHub. When we build it locally, the visualiser pops up at http://localhost:3000 and it looks amazing. Awesome, let's use it! We put it in a Docker container of its own.
Now when we start up our containers, the visualiser pops up and... it's blank.
???
The...what? Where's my sandwich?
Map of the Territory
This was one of those problems where a good mental picture makes all the difference. When the picture in my head didn't match the actual situation, I tried many things without much progress.
It took me a while to get to the revelation that we were running Docker inside Docker. As a sane, ordinary person who was faced with the blank page, I tried all sorts of things:
- replacing https with http, which actually was needed in the fix, just not the cause of the problem
- opening the dev console in my web browser and seeing a CORS error, going down a rabbit hole to fix the CORS error
- trying various spellings
- trying relevant-sounding answers from the Internet
But when I arrived at the right mental picture, of Docker inside Docker, the fix was more natural.
No Docker
Suppose you're developing solely in your local environment. You have a database visualiser listening on port 3000, and a local database listening on port 5430.
Pop quiz!
Question: From your machine, how can you access the visualiser?
Answer: localhost:3000
Question: From your machine, how can you connect to the database?
Answer: localhost:5430
Cool! Moving forward.
With Docker
Now you put your visualiser in a container, and your database in a separate container.
We have the idea of an "inside" and "outside." All containers are "inside"; Docker assigns each container a unique IP address and makes it part of this internal network. "Outside" is outside the internal network, in the external network.
When you curl localhost:3000 from inside the visualiser container, you can access your visualiser just fine. But when you curl localhost:3000 from outside the container, say your local machine, it doesn't reach your visualiser. Since the container and your local machine have different IP addresses, localhost:3000 refers to a different endpoint in each scenario.
We have a Docker host to mediate data transfer
- between two devices that are "inside" the same network: Docker acts as a layer 2 network switch, with each container connected to one of the switch ports.
- between the "inside" and "outside": Docker acts as a NAT device, which operates at layers 3 (IP) and 4 (TCP and UDP), the layers in charge of addressing packets and carrying them between different networks
Since we're outsiders, for our request to localhost:3000 to reach port 3000 of the visualiser container, we want the Docker host to forward packets from port 3000 of the external network to port 3000 of the visualiser container.
In Docker, we can do this with port publishing. We map port 3000 of our external network to port 3000 of the visualiser container, and port 5430 of our external network to 5430 of the database container.
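Here is a minimal sketch of what "forwarding" means, written as a tiny userland relay in Python. It is an illustration only: Docker on Linux typically publishes ports with NAT rules in the kernel rather than anything like this script, and the names pipe, start_forwarder, and demo are made up for the example. The demo uses ephemeral ports on the loopback interface instead of 3000 so that it doesn't collide with anything already running.

```python
import asyncio


async def pipe(reader, writer):
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while data := await reader.read(4096):
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


async def start_forwarder(listen_port, target_host, target_port):
    """Accept connections on listen_port and relay them to target_host:target_port."""

    async def handle(client_reader, client_writer):
        target_reader, target_writer = await asyncio.open_connection(
            target_host, target_port
        )
        # Relay both directions at once: client -> target and target -> client.
        await asyncio.gather(
            pipe(client_reader, target_writer),
            pipe(target_reader, client_writer),
        )

    return await asyncio.start_server(handle, "127.0.0.1", listen_port)


async def demo():
    """Stand-in 'visualiser' behind a forwarder; returns the reply the client sees."""

    async def fake_visualiser(reader, writer):
        data = await reader.read(4096)
        writer.write(b"visualiser says: " + data)
        await writer.drain()
        writer.close()

    # Port 0 asks the OS for any free port.
    backend = await asyncio.start_server(fake_visualiser, "127.0.0.1", 0)
    backend_port = backend.sockets[0].getsockname()[1]

    forwarder = await start_forwarder(0, "127.0.0.1", backend_port)
    published_port = forwarder.sockets[0].getsockname()[1]

    # The client only knows the published port, never the backend port.
    reader, writer = await asyncio.open_connection("127.0.0.1", published_port)
    writer.write(b"hello")
    await writer.drain()
    reply = await reader.read()  # read until the relay closes the connection
    writer.close()

    forwarder.close()
    backend.close()
    return reply


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The client talks to one port and the answer comes from a server listening on another. Publishing a port gives the same effect, with the Docker host sitting in the middle.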
Pop quiz!
Question: From your machine, how can you access the visualiser?
Answer: localhost:3000
Question: From your machine, how can you connect to the database?
Answer: localhost:5430
Question: Okay, now suppose the database container doesn't publish port 5430 to 5430. Suppose it publishes 5430 to a different port on the external network, say port 8080. From your machine, how can you connect to the database?
Answer: localhost:8080
Question: Okay, continuing off of the previous question, suppose the database container still publishes port 5430 to 8080. From the visualiser container, how can it connect to the database?
Answer: name-of-database-container:5430
Cool! Moving forward.
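The quiz answers above can be checked against a toy model. This is a sketch, not how Docker is implemented: the dictionaries and the functions from_host and from_container are invented for illustration, and reaching a container by its name assumes the containers share a user-defined network (which is what Docker Compose sets up by default).

```python
# Ports each container listens on, inside the Docker network.
# Names and ports mirror the quiz; the structures are illustrative only.
CONTAINERS = {
    "visualiser": {3000},
    "database": {5430},
}

# Published ports: host port -> (container name, container port).
# The database's 5430 is published to host port 8080, as in the third question.
PUBLISHED = {
    3000: ("visualiser", 3000),
    8080: ("database", 5430),
}


def from_host(port):
    """Resolve localhost:<port> as seen from the machine outside Docker."""
    return PUBLISHED.get(port)  # None means nothing is published on that port


def from_container(source, host, port):
    """Resolve <host>:<port> as seen from inside the container named source."""
    # Inside a container, localhost means the container itself.
    target = source if host == "localhost" else host
    if port in CONTAINERS.get(target, set()):
        return (target, port)
    return None


print(from_host(8080))                                   # ('database', 5430)
print(from_host(5430))                                   # None
print(from_container("visualiser", "database", 5430))    # ('database', 5430)
print(from_container("visualiser", "localhost", 5430))   # None
```

The published port only matters to outsiders. From inside the network, the visualiser uses the port the database actually listens on.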
Docker Inside Docker 😨
This was where I was stumped. See, I understood everything described above and I double-checked that I did the port mappings correctly. If that were the end of the story, we could open http://localhost:3000 and see our visualiser.
But a strange thing was happening where, when I started up the containers, the window that opened for the visualiser wasn't http://localhost:3000. It was https://space.xxx.xxx.xxx.com:8457/proxy/3000, and this was a blank page.
So why, then, couldn't we see our visualiser?
The revelation:
We're actually running Docker inside Docker.
Docker inside Docker, pebbles in pebbles. This delightful image was taken at the British Museum of Natural History.
At the company I was at, there's a nice internal page for developers where we can create development spaces. We click a button and it creates a development space for us. It also gives us an editor for our development space that looks like VSCode but isn't actually VSCode.
The development space it creates for us is actually a Docker container. The other development spaces, they're also in Docker containers of their own. We're a Docker container among Docker containers.
If I wanted to be poetic, I could say that I saw the Pebbles within pebbles display at the British Museum of Natural History and had a Sherlock Holmes moment. As E. M. Forster once said, "'The king died and then the queen died' is a story. 'The king died, and then the queen died of grief' is a plot."
Sorry, E. M. Forster! The actual way I arrived at the realization that we were running Docker inside Docker was not terribly eventful. Just know that the troubleshooting journey became much more eventful after this.
To recap,
- We create a development space, which is built as a Docker container behind the scenes.
- In this development space, we install Docker.
- We start up our Docker containers.
- ???
- Blank page!
So what we have is something like this.
Pretend that the big box we discussed under the "With Docker" section is still the same. It still has the same containers as before, and the ports are mapped the same as before.
It's just that this box is now a Docker container, fittingly renamed "dev space container" in the diagram. Like how our dev space has an inner Docker host to manage its containers, an outer Docker host is needed to manage our dev space container and its fellow containers.
How does our dev space expose its ports to outer Docker? The fussy part is that we don't control the port mappings of outer Docker, as that's managed by the team that is in charge of supplying us with our development spaces.
All we know is that there exists some mapping from port 3000 of our development space to some port ??? of the external network.
TL;DR: We want to figure out what this mysterious port ??? is. Our database visualiser publishes port 3000 to port 3000 of what it thinks is the external network, but is actually port 3000 of our development space. Then our development space publishes port 3000 to port ??? of the external network, so we can reach our visualiser from port ???.
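The two layers of publishing can be sketched as two lookup tables applied one after the other. This is a toy model under stated assumptions: the table names, the "dev-space" label, and the resolve function are made up for illustration, and 3516 is the value the fix eventually turns up for ???, used here only so the example has something concrete to look up.

```python
# Each layer maps a port on its outside to (next hop, port) on its inside.
OUTER_DOCKER = {3516: ("dev-space", 3000)}   # managed by the platform team
INNER_DOCKER = {3000: ("visualiser", 3000)}  # managed by us


def resolve(port, layers):
    """Follow a port from the external network inwards, one layer at a time."""
    hop = None
    for layer in layers:
        if port not in layer:
            return None  # nothing is published on this port at this layer
        hop, port = layer[port]
    return (hop, port)


print(resolve(3516, [OUTER_DOCKER, INNER_DOCKER]))  # ('visualiser', 3000)
print(resolve(3000, [OUTER_DOCKER, INNER_DOCKER]))  # None
```

Port 3000 only means something once you are already inside the dev space. From the external network, the only way in is whatever port the outer layer has published.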
The fix: We obtain the mysterious port number 3516 by finding a way to list the dev space containers and look for the port that's forwarded to 3000. Now we go to the blank page, replace https://space.xxx.xxx.xxx:8457/proxy/3000 with http://space.xxx.xxx.xxx:3516, and it works! The end.
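The exact command used to list the dev space containers isn't recorded here. Assuming the listing looked like the PORTS column of docker ps, where each entry reads host-address:host-port->container-port/protocol, a small helper can pick out the host port. The function name host_ports_for and the sample string are hypothetical, and the second mapping in the sample is invented purely to have something to skip over.

```python
import re

# Matches the ":3516->3000/tcp" part of entries such as "0.0.0.0:3516->3000/tcp".
# Port ranges (for example "3000-3005/tcp") are not handled in this sketch.
PORT_ENTRY = re.compile(r":(\d+)->(\d+)/(\w+)")


def host_ports_for(ports_column, container_port):
    """Return the sorted host ports that are forwarded to container_port."""
    found = set()
    for host_port, inner_port, _protocol in PORT_ENTRY.findall(ports_column):
        if int(inner_port) == container_port:
            found.add(int(host_port))
    return sorted(found)


# Hypothetical PORTS value for the dev space container.
sample = "0.0.0.0:3516->3000/tcp, :::3516->3000/tcp, 0.0.0.0:9000->8080/tcp"
print(host_ports_for(sample, 3000))  # [3516]
```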
Moral of the Story
Aside from learning a bit about how Docker networking works, plus whatever I picked up from the other articles I read in an attempt to solve this problem, I had a few meta-learnings:
When you know, you know. If you have mixed feelings, it's a no. The interesting thing is, before I knew we were running Docker inside Docker, all the fixes I tried were "I'm not sure whether this'll work but maybe, maybe it sounds right, I don't know, let's see."
When I knew we were running Docker inside Docker, it was a, "I'm fairly confident this is the fix, I'll be surprised if it doesn't work" moment.
Something to keep in mind as I approach problems in the future: if I don't have a good enough picture of what's going on, I should take some time to understand it, enough that I can draw it out and explain what's going on. If it makes sense, it'll feel as natural as "I'm fairly confident this is the fix."
A lot of "I'm not sure but maybe's" end up being a timesink, even if they can work out. As I try each "I'm not sure but maybe" style fix, I should ask myself what picture I have in my head and what parts of the picture I'm trying to confirm or debunk.
Networking in the books vs networking in the streets. I feel like networking courses and textbooks are like "Where does your food come from?" documentaries, in that they teach me how to appreciate a sandwich. They tell me about the cargo ships that bring my sandwich ingredients to shore and what happens when a cargo ship sinks and all the backup plans the cargo ship companies have. They tell me how they translate addresses as they move through foreign lands, how they pick their routes, and the mysterious incantations they chant in between.
By the time I'm eating my sandwich, I'm like, "Ah, yes, I'm glad the cargo ship made it safely across the Atlantic so that I can eat my sandwich." I really, really appreciate my sandwich.
I'm grateful that, when I send a request to Bob, I don't have to think about what happens when a packet gets dropped or how to make sure Bob gets my packets in the right order, because that's been taken care of several layers down by TCP, with its sequence numbers, acknowledgements, and retransmissions. I'm also grateful that I don't have to think about the intricacies of routing, address translation, name resolution, or any of that stuff.
But what do I do when I order a sandwich and it's not at the sandwich shop? It's a problem of a very different nature from the ones in the cargo ship documentaries, and my shuffling through the sandwich shop isn't becoming a documentary anytime soon.
Fortunately, it's also explainable enough that it doesn't need to be a documentary.