The difference is in the number of channels (signals) used. Mono uses one, stereo uses more than one.

Monaural sound uses a single channel. It can be reproduced through several speakers, but every speaker still reproduces the same copy of the signal. Stereophonic sound uses more than one channel (typically two). In the most common stereo setup, one channel feeds one speaker and the second channel feeds a second speaker. This is used to create directionality, perspective, and a sense of space. Here is an example using a two-speaker setup.

mono

stereo
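To make the difference concrete, here is a minimal sketch in Python. It assumes a signal is just a list of sample values, and the function names `mono_feeds` and `stereo_feeds` are made up for illustration; they are not part of any audio library.

```python
def mono_feeds(signal, n_speakers):
    """Mono: every speaker receives the same copy of the one channel."""
    return [list(signal) for _ in range(n_speakers)]


def stereo_feeds(left, right):
    """Two-channel stereo: each channel feeds its own speaker."""
    if len(left) != len(right):
        raise ValueError("both channels must have the same number of samples")
    return [list(left), list(right)]


# Mono through two speakers: both feeds are identical.
mono = mono_feeds([0.0, 0.5, -0.5], n_speakers=2)

# Stereo: the two feeds are allowed to differ.
stereo = stereo_feeds([0.0, 0.5, -0.5], [0.0, 0.1, -0.9])
```

Adding more speakers to the mono case only adds more identical copies, while the stereo case carries two signals that can differ from each other.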

More technically, true stereo means sound recording and reproduction that encodes the relative positions of the objects and events recorded.

In a common two-channel stereo setup (left and right), one channel is sent to the left speaker and the other to the right speaker. By controlling which channel a signal goes to, you control the apparent position of the sound. You'll hear sounds coming from different directions depending on which speaker receives the signal, or in what proportion (send just a little more to the right speaker and the sound is positioned just a little to the right). Sounds sent in equal proportion to both speakers will appear to come from the center.
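The idea of sending a signal to each speaker "in proportion" can be sketched as a small panning function. This is only an illustration: the names `pan_linear` and `pan_constant_power` and the position scale from -1.0 (hard left) to +1.0 (hard right) are assumptions of this sketch. The constant-power variant is a commonly used alternative to plain linear proportions and is not something described above.

```python
import math


def _check(position):
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must be between -1.0 and 1.0")


def pan_linear(sample, position):
    """Split one sample between left and right in linear proportion.

    position: -1.0 = hard left, 0.0 = center, +1.0 = hard right.
    Returns (left, right).
    """
    _check(position)
    right_gain = (position + 1.0) / 2.0
    left_gain = 1.0 - right_gain
    return sample * left_gain, sample * right_gain


def pan_constant_power(sample, position):
    """Same idea, but the gains follow cos/sin so that
    left_gain**2 + right_gain**2 stays equal to 1 at every position."""
    _check(position)
    angle = (position + 1.0) * math.pi / 4.0  # 0 (left) .. pi/2 (right)
    return sample * math.cos(angle), sample * math.sin(angle)


# Equal proportions on both speakers: the sound appears centered.
center = pan_linear(1.0, 0.0)

# A little more to the right speaker: positioned slightly to the right.
slightly_right = pan_linear(1.0, 0.2)
```

With the linear version, a centered sound gets half of the signal in each channel; the constant-power version instead keeps the summed power of the two channels the same wherever the sound is placed.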

In other words, stereo opens the possibility of playing with sound localization.