The robot knew it was lost. It just didn't know how lost.
Before anyone solved this problem, robots navigated with a kind of blind confidence. They tracked their own movement, calculated where they must be, and proceeded accordingly. The math was correct. The result was often a robot standing in the wrong room, facing the wrong direction, completely certain of its position.
The problem wasn't the sensors. It wasn't the motors. It was something more fundamental. Seeing your surroundings and knowing where you are inside them turn out to be two different things. And each one requires the other. For years that looked like a dead end. The solution, when it came, was called Simultaneous Localization and Mapping — SLAM. And it didn't resolve the contradiction so much as decide to live inside it.
GPS, and why it misses the point
Consumer GPS is accurate to a few meters. Fine for navigating to the right street, not fine for a robot that needs to know exactly where it is in a corridor. And it fails completely indoors, which is where most useful robots actually operate.
But that's almost beside the point. Even if GPS worked indoors, it would still only tell you where you are on Earth. It has no idea what's around you. Not the shelf to your left, not the step at the end of the hall. A robot needs to know its surroundings in detail to do anything useful. No amount of GPS precision changes that.
So robots need to build their own picture of the world, from their own sensors, in real time. Which raises an immediate problem.
The problem with building a map
Before SLAM, robots navigated by tracking their own movement, a method called dead reckoning, borrowed from old seafaring navigation. Wheel encoders count rotations, motion sensors measure every turn, and together they produce a running position estimate called odometry. It works for a while. Small errors add up. A wheel slips on a smooth floor, a sensor reading is a fraction off, and each small mistake nudges the position estimate further from reality. After enough distance the robot is confident it's somewhere it isn't.
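The way those small errors compound can be sketched in a few lines. This is a toy simulation, not any real odometry stack: the step length and noise level are illustrative, and the only point is that an unobserved heading error turns confident arithmetic into a growing gap between belief and reality.

```python
import math
import random

def dead_reckon(steps, step_len=0.1, heading_noise=0.01, seed=0):
    """Integrate motion the way wheel odometry does, and return the
    gap between the believed and the actual final position.

    Each step the robot believes it moved straight ahead by step_len,
    but a small unmodeled heading error (wheel slip, sensor bias)
    corrupts the true motion. All noise values are illustrative.
    """
    rng = random.Random(seed)
    believed_x = 0.0                  # odometry's running estimate
    true_x = true_y = 0.0             # where the robot actually is
    true_heading = 0.0
    for _ in range(steps):
        believed_x += step_len                         # assumed straight line
        true_heading += rng.gauss(0.0, heading_noise)  # drift, never observed
        true_x += step_len * math.cos(true_heading)
        true_y += step_len * math.sin(true_heading)
    # Believed position is (believed_x, 0); measure the distance to truth.
    return math.hypot(believed_x - true_x, true_y)

# Run the same robot ten times farther and the gap grows with it.
short_run = dead_reckon(1_000)    # 100 m of travel
long_run = dead_reckon(10_000)    # 1 km of travel
```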
The fix seems straightforward. Use landmarks. If the robot's odometry says a wall is 2.3 meters away but its sensor says 2.1, it corrects itself. What the sensor sees beats what the math predicted. But this only works with a pre-built map. The robot needs to already know where that wall is supposed to be.
To navigate accurately you need a map. To build an accurate map you need to know where you are while building it. SLAM doesn't break that circle. It works inside it, which is either obvious or profound depending on how long you've been staring at the problem.
Odometry tracks every movement to estimate position. It needs no external reference and works immediately, but errors accumulate with every meter traveled. A reading slightly off here, a wheel slip there, and the gap between where the robot thinks it is and where it actually is grows continuously. That gap is what SLAM is designed to close.
"You need a map to know where you are. You need to know where you are to build the map. SLAM's answer was to stop waiting for certainty and just get less wrong, continuously."
How it actually works
SLAM starts uncertain and gets less uncertain over time. The loop runs many times per second. Predict where the robot probably is based on its last position and movement. Observe the environment and pick out stable landmarks: wall corners, door frames, anything recognizable that can be found again later. Then update both the position estimate and the map based on the difference between what was expected and what was actually seen.
Run this thousands of times and the map sharpens, the position stabilizes, and each one pulls the other toward accuracy. It is a clever design: two problems that seemed to block each other, solved by running them together. Once you understand it, it feels almost obvious. Building a system that actually holds up in a real building is a lot harder than it sounds.
Predict from movement, observe landmarks, compare to map, update both. Repeat. The map and position estimate improve together. Neither gets significantly better without the other.
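That predict/observe/update loop can be sketched in one dimension with a single landmark. This is a minimal Kalman-style illustration, not a production SLAM system: the state is just [robot position, landmark position], the measurement is the range between them, and every noise value is an assumption chosen for the demo.

```python
# A minimal one-dimensional sketch of the predict/observe/update loop.
# State is [robot position, landmark position]; the measurement is the
# range z = landmark - robot. All noise values are illustrative.

def predict(state, cov, motion, motion_var):
    """Move the robot forward; only the robot's uncertainty grows."""
    x, lm = state
    (pxx, pxl), (plx, pll) = cov
    return [x + motion, lm], [[pxx + motion_var, pxl], [plx, pll]]

def update(state, cov, measured_range, meas_var):
    """Fold one range measurement into both the pose and the map."""
    x, lm = state
    (pxx, pxl), (plx, pll) = cov
    # Innovation: what the sensor saw minus what the map predicted.
    innovation = measured_range - (lm - x)
    # Measurement model H = [-1, 1], so S = H P Ht + R expands to:
    s = pxx - pxl - plx + pll + meas_var
    # Kalman gain K = P Ht / S, one entry per state component.
    kx = (pxl - pxx) / s
    kl = (pll - plx) / s
    new_state = [x + kx * innovation, lm + kl * innovation]
    # Covariance update P = (I - K H) P, written out by hand.
    new_cov = [
        [(1 + kx) * pxx - kx * plx, (1 + kx) * pxl - kx * pll],
        [kl * pxx + (1 - kl) * plx, kl * pxl + (1 - kl) * pll],
    ]
    return new_state, new_cov

# Robot starts at 0; the landmark is truly at 5.0 but the map's first
# guess is off. Repeated passes tighten pose and map together.
state, cov = [0.0, 4.0], [[0.01, 0.0], [0.0, 4.0]]
true_x, true_lm = 0.0, 5.0
for _ in range(10):
    true_x += 1.0
    state, cov = predict(state, cov, 1.0, 0.1)   # odometry says +1.0
    state, cov = update(state, cov, true_lm - true_x, 0.25)
```

Notice what converges: not the absolute positions, which nothing in the measurements can pin down, but the geometry between robot and landmark, along with a shrinking landmark uncertainty. That is the map and the position estimate improving together.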
The sensors doing the observing matter a lot. Lidar fires rapid laser pulses and measures how long each takes to return, building a precise 3D skeleton of the space from millions of distance measurements per second. Expensive, and it struggles in rain and dust, but it produces some of the cleanest geometric data available. Cameras are cheaper and carry more visual detail, but need far more processing. Camera-based SLAM tracks visual features across frames and uses the slight shift in how they appear as the robot moves to work out depth. Many deployments combine both. Even so, change the lighting or add a few reflective surfaces, and a system that worked perfectly yesterday starts making mistakes.
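The lidar arithmetic itself is simple: distance comes from the round-trip time of a light pulse, halved because the pulse travels out and back. A minimal sketch, with the timing value chosen purely for illustration:

```python
SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def lidar_range(round_trip_seconds):
    """The pulse travels to the surface and back, so halve the path."""
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0

# A return after roughly 66.7 nanoseconds puts the surface about 10 m away.
distance = lidar_range(66.7e-9)
```

The nanosecond scale is the point: resolving centimeters requires timing electronics that can distinguish tens of picoseconds, which is part of why lidar costs what it does.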
The part that makes it work over distance
Here is where SLAM gets interesting.
Even a well-tuned system drifts over a long run. Small errors stack up and the map warps slowly, not enough to notice in a short session, but over hundreds of meters it adds up. The mechanism that deals with this is called loop closure, and it is one of the cleverer ideas in the field.
When a robot returns to somewhere it has been before, its sensors recognize the match. It then compares what it sees now with what it mapped on the first visit and calculates a correction that spreads back through the entire trajectory, redistributing and reducing accumulated error across the whole map at once. A single moment of recognition reaches back through the robot's entire history and adjusts everything.
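The shape of that correction can be sketched crudely. Real systems solve a pose-graph optimization; the linear smear below is a deliberate simplification, with made-up coordinates, that only shows how one moment of recognition reaches back through the whole trajectory.

```python
def close_loop(trajectory):
    """Smear a loop-closure correction back along the whole path.

    trajectory is a list of (x, y) pose estimates. The robot has just
    recognized that its final pose is physically the same place as its
    first one, so any gap between them is accumulated drift. Spread
    the correction linearly: early poses move little, late poses move
    most. (Real systems solve a pose-graph optimization; this linear
    smear is a simplification to show the shape of the idea.)
    """
    drift_x = trajectory[-1][0] - trajectory[0][0]
    drift_y = trajectory[-1][1] - trajectory[0][1]
    last = len(trajectory) - 1
    return [
        (x - (i / last) * drift_x, y - (i / last) * drift_y)
        for i, (x, y) in enumerate(trajectory)
    ]

# A square loop whose estimate drifted: the robot "returns" to a point
# that is not quite where it started. After closure, start and end agree.
drifted = [(0.0, 0.0), (5.0, 0.1), (5.2, 5.1), (0.3, 5.2), (0.4, 0.3)]
corrected = close_loop(drifted)
```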
If every corridor looks the same, the robot has the same problem you would. It has no way to know which one it's in. Debugging this in real buildings is difficult because the failure often just looks like slow random drift, hard to spot, hard to trace back to the cause. Getting loop closure right in repetitive environments remains one of the open research problems in the field, which says something about how hard it actually is.
Modern SLAM runs on ordinary hardware in real time. Self-driving cars use it to map city streets at speed. Warehouse robots use it to track position to within a few centimeters across buildings the size of football fields. It runs on the robot vacuum in your living room. It also still catches engineers off guard in ways nobody expected.
Moving environments remain hard. When people walk through a space a robot is mapping, the robot has to decide in real time what belongs in the map and what doesn't. And SLAM only answers one question anyway. Where am I, and what's around me. What the robot should actually do with that, where to go, how to plan a route, what to do when something's in the way, is a separate problem entirely. SLAM lays the foundation. Everything else gets built on top of it.