What did you just see? Simply put, I made a retro 8-bit video game console, called The Box, and a 2D platformer game for it. Here is the feature list:
- Based on ATmega328P running at 16 MHz (same as Arduino Uno).
- The game has a display resolution of 104x80 with 256 colors.
- Video mode is tile based and supports up to 3 sprites per scan line.
- Sprites are multiplexed so there can be an unlimited number of sprites vertically on the screen.
- 4 audio channels with triangle, pulse, sawtooth and noise waveforms.
- Chiptune music playroutine and sound effects.
- NES controller support.
Read on to learn more!
Source code on GitHub
Top-left: 128x80 untiled titlescreen, Top-right: game screen, 104x80 tiled with sprites
Bottom-left: Final console hardware, Bottom-right: Prototype based on Arduino Uno
The Box hardware
The most interesting part of the hardware is video signal generation. Here's the basic idea, inspired by Uzebox (which uses a more powerful MCU running at almost double the clock rate, btw). The MCU outputs 8-bit colors in R3G3B2 format every sixth clock cycle and the bits are turned into analog voltages using a resistor DAC. The R, G and B analog signals are fed to the AD725, which is an RGB to NTSC/PAL encoder. The AD725 outputs a composite video signal. The AD725 requires a 14.31818 MHz clock signal for NTSC color modulation, so I have a DIP14 packaged crystal oscillator on board for this. The hardware design of the video stage is mostly based on the reference design in the AD725 datasheet. The AD725 is a surface mount part so I bought a SOIC28-DIP adapter for it and modded it to a SOIC16-DIP adapter to take less space on the PCB.
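To make the R3G3B2 format concrete, here is a minimal Python sketch that packs a 24-bit color into a single byte. The bit order (red in the top three bits, blue in the bottom two) and the function names are assumptions for illustration only; the real pin assignment is whatever the schematic says.

```python
def pack_r3g3b2(r, g, b):
    """Pack 8-bit-per-channel RGB into one byte (assumed layout RRRGGGBB)."""
    return ((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6)

def unpack_r3g3b2(c):
    """Split a packed byte into 3-bit red, 3-bit green and 2-bit blue levels."""
    return (c >> 5) & 0x07, (c >> 2) & 0x07, c & 0x03

white = pack_r3g3b2(255, 255, 255)   # every bit set
red = pack_r3g3b2(255, 0, 0)         # only the three red bits set
```

With 8 red levels, 8 green levels and 4 blue levels, this gives the 8 x 8 x 4 = 256 colors from the feature list.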
The image quality is actually very good, the best we can get with composite video I would say. The picture is rock solid; only slight jittering can be seen between highly saturated colors, a problem inherent to composite video. If you are going to build the console on a breadboard, expect lower visual quality; only by building this on a PCB can you get a rock solid picture. The breadboard version is actually not that bad, but after seeing the quality of the PCB version you can't go back :)
Schematic (click to enlarge)
Parts list
Apart from standard value resistors and capacitors you need the following parts:
- ATmega328P (easily found anywhere)
- 16 MHz crystal
- AD725 (I ordered 5 from China)
- 14.31818 MHz crystal oscillator in DIP8 or DIP14 package (RS Components has the DIP14 version)
- SOIC28-DIP adapter for AD725 (I got it from Sparkfun)
- The 10uF filtering caps on the power supply lines should be tantalum (recommended by the AD725 datasheet)
- 3.18k, 1.58k and 806 ohm resistors (1% tolerance) for DAC
- NES controller
- NES controller socket (these can be bought online, e.g. from www.parallax.com)
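The three DAC resistor values are close to a 4:2:1 ratio, which is what a binary weighted resistor DAC needs. The Python sketch below computes the relative output of an idealized 3-bit channel from those values. It assumes the 806 ohm resistor sits on the most significant bit and ignores loading by the AD725 input, so treat it as a back-of-the-envelope check rather than a model of the actual circuit.

```python
# Resistor values from the parts list, in ohms, ordered from least to
# most significant bit (this ordering is an assumption).
RESISTORS = [3180.0, 1580.0, 806.0]

def dac_level(code):
    """Relative output (0.0 to 1.0) of an ideal 3-bit weighted-resistor DAC.

    Each set bit contributes a current proportional to 1/R.
    """
    total = sum(1.0 / r for r in RESISTORS)
    on = sum(1.0 / r for bit, r in enumerate(RESISTORS) if code & (1 << bit))
    return on / total

levels = [dac_level(code) for code in range(8)]
```

Under these assumptions each of the eight levels lands within about half a percent of full scale from the ideal evenly spaced steps.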
Tiled graphics mode with sprites
The game screen is made of 13x10 tiles, each 8x8 pixels. A tile, therefore, consumes 64 bytes of program memory. I have a tile buffer of 13x10 pointers that point into tile graphics in program memory. On each scanline, I fetch the tile pointer from RAM, pull 8 pixels from the tile and output the pixels exactly every 6 cycles. With a pixel width of 6 cycles and doubled scanlines, the pixels are approximately square on screen. Pulling a pixel from program memory takes 3 cycles and outputting a pixel takes 1 cycle, so there are only 2 cycles per pixel remaining to fetch the tile addresses. With careful ordering of instructions and unrolling the loop it can be done. Overall, it was fairly easy to get the basic tiling setup working.
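The tile lookup described above can be sketched in Python as follows. This only models the data flow, not the cycle-exact assembly: tile indices stand in for the program memory pointers, and the two test tiles and the checkerboard map are made up for the example.

```python
TILE_W, TILE_H = 8, 8
COLS, ROWS = 13, 10
CYCLES_PER_PIXEL = 6

def render_scanline(tile_map, tiles, y):
    """Return the 104 pixels of pixel row y.

    tile_map is a ROWS x COLS grid of indices into tiles (standing in for
    the pointers into program memory); each tile is 64 bytes, row-major.
    """
    tile_row, row_in_tile = divmod(y, TILE_H)
    line = []
    for col in range(COLS):
        tile = tiles[tile_map[tile_row][col]]
        start = row_in_tile * TILE_W
        line.extend(tile[start:start + TILE_W])
    return line

# Two hypothetical tiles: solid color 0x00 and solid color 0xE0.
tiles = [bytes([0x00] * 64), bytes([0xE0] * 64)]
tile_map = [[(r + c) % 2 for c in range(COLS)] for r in range(ROWS)]
line = render_scanline(tile_map, tiles, 0)

# Cycle budget per pixel: 3 (read from flash) + 1 (output) leaves 2 spare.
spare_cycles = CYCLES_PER_PIXEL - 3 - 1
# Cycles spent on the visible part of one scanline.
active_cycles = COLS * TILE_W * CYCLES_PER_PIXEL
```

The visible part of a scanline therefore takes 13 x 8 x 6 = 624 cycles, and the 2 spare cycles per pixel add up to 16 cycles per tile for fetching the next tile pointer.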
However, things got much more complicated because I also wanted sprites on top of the tiles. The ATmega328P running at 16 MHz is not fast enough to draw the tiles and mask sprites onto them within the time of a single scanline. It took me a while to figure out how to do the sprites. Then it hit me. Because I have doubled the scanlines, I actually have two scanlines of time to process a single row of 104 pixels. To pull this off I had to use double buffering, so that while I was computing the next row of pixels, I was pulling in the previously computed row and still outputting a pixel every 6th cycle. So on even scanlines, I do as many tiles as possible (which turned out to be 9 tiles) and write the pixels to a scanline buffer WHILE reading the pixels of the previously computed row and outputting them to the screen. On odd scanlines, I do the remaining 4 tiles, write the resulting pixels to memory and mask sprites on top of the tiles, again while pulling pixels from the previously computed row and outputting them to the screen. A buffer holds a single row of 104 pixels. After two scanlines the buffers are swapped. Doing everything while outputting a pixel every 6 cycles meant that every cycle had to be counted.
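The two-scanline schedule can be modeled in a few lines of Python. This is a simplified sketch of the buffering order only: the callbacks, the dummy colors and the one-pixel "sprite" are invented for the example, and none of the cycle counting is represented.

```python
COLS, TILE_W = 13, 8
TILES_ON_EVEN = 9            # tiles rendered during the even scanline
WIDTH = COLS * TILE_W        # 104 pixels

def run_frame(rows, draw_tiles, draw_sprites):
    """Simulate the two-scanline pipeline; returns the scanlines sent to screen.

    draw_tiles(buf, row, first_col, last_col) and draw_sprites(buf, row)
    are hypothetical callbacks that write into the back buffer.
    """
    front = [0] * WIDTH      # row currently being displayed
    back = [0] * WIDTH       # row currently being computed
    shown = []
    for row in range(rows):
        # Even scanline: first 9 tiles, while the front buffer is displayed.
        draw_tiles(back, row, 0, TILES_ON_EVEN)
        shown.append(list(front))
        # Odd scanline: remaining 4 tiles plus sprites, front displayed again.
        draw_tiles(back, row, TILES_ON_EVEN, COLS)
        draw_sprites(back, row)
        shown.append(list(front))
        front, back = back, front   # swap after two scanlines
    return shown

def draw_tiles(buf, row, first_col, last_col):
    for x in range(first_col * TILE_W, last_col * TILE_W):
        buf[x] = row + 1            # dummy tile color: row number + 1

def draw_sprites(buf, row):
    buf[0] = 255                    # dummy one-pixel sprite at x = 0

shown = run_frame(3, draw_tiles, draw_sprites)
```

In this model every pixel row reaches the screen one row (two scanlines) after it was computed, and each row is shown twice, which is where the doubled scanlines come from.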
There are actually three separate "threads" running in the code and the threads are manually interleaved. This was very painful to code but eventually I managed to do it. The result is a video mode where I can have 3 sprites on each scanline. The game actually has more sprites because I can reuse the hardware sprites vertically on the screen by using multiplexing. I have a buffer in RAM which stores the sprite locations and image pointers for three sprites on each scanline. Multiplexing the sprites is as simple as writing the sprite data to the buffer in the correct place.
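A per-scanline sprite buffer like the one described above might look like this in Python. The layout of an entry, the 8-row sprite height and the policy of dropping a sprite on rows that are already full are all assumptions made for the sketch.

```python
ROWS = 80                 # pixel rows on screen
SLOTS = 3                 # sprites per scanline
SPRITE_H = 8              # assumed sprite height in pixel rows

def new_table():
    """One list of up to SLOTS (x, image, image_row) entries per pixel row."""
    return [[] for _ in range(ROWS)]

def place_sprite(table, x, y, image):
    """Write one sprite into the per-scanline table.

    Returns the number of rows on which the sprite was dropped because all
    SLOTS were already taken.
    """
    dropped = 0
    for image_row in range(SPRITE_H):
        row = y + image_row
        if not 0 <= row < ROWS:
            continue                      # clipped off the top or bottom
        if len(table[row]) < SLOTS:
            table[row].append((x, image, image_row))
        else:
            dropped += 1
    return dropped

table = new_table()
# Five sprites stacked vertically never compete for the same scanline.
for i in range(5):
    place_sprite(table, 10, i * 16, "coin")
# A fourth sprite on rows that already hold three is dropped on those rows.
for x in (0, 20, 40):
    place_sprite(table, x, 72, "bat")
dropped = place_sprite(table, 60, 72, "bat")
```

The limit only applies per scanline, so any number of sprites can be on screen as long as no more than three overlap the same row.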
Multichannel music and sound effects
I also wanted to have multichannel music in the spirit of C64's SID and Rob Hubbard (the best chiptune musician ever, just listen to the music of Commando, International Karate or Monty on the Run if you don't believe me). Unfortunately there is not enough time left on the scanlines to do any sound synthesis. So sound had to be generated in the vertical blank period when the MCU is not busy doing the tiles and sprites. There are at most 263 scanlines on an NTSC screen, so I fill a buffer of 263 bytes of 8-bit audio samples during the vblank. The video generation code reads one sample per scanline and sends it out of the chip using pulse width modulation (PWM). Since we are constantly sending out samples while generating new samples, the sound needs to be double buffered. Otherwise clicks and pops can be heard.
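The double buffering of audio samples can be sketched in Python like this. The class and method names are hypothetical, the mid-level value 128 for silence is an assumption, and the test signal is made up; only the 263-sample buffer size comes from the text above.

```python
LINES_PER_FRAME = 263

class AudioDoubleBuffer:
    def __init__(self):
        # Assumed: 128 is the silent mid level of the 8-bit PWM output.
        self.play = bytearray([128] * LINES_PER_FRAME)
        self.fill = bytearray([128] * LINES_PER_FRAME)

    def next_sample(self, line):
        """Called once per scanline by the video code; the value goes to PWM."""
        return self.play[line]

    def vblank(self, synth):
        """Fill the idle buffer, then swap so the new samples play next frame."""
        for i in range(LINES_PER_FRAME):
            self.fill[i] = synth(i) & 0xFF
        self.play, self.fill = self.fill, self.play

audio = AudioDoubleBuffer()
audio.vblank(lambda i: 128 + (i % 32))   # hypothetical test signal
frame = [audio.next_sample(line) for line in range(LINES_PER_FRAME)]
```

One sample per scanline means the sample rate equals the line rate, which is roughly 15.7 kHz if the console runs at the standard NTSC line frequency.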
The audio system supports 4 channels, with triangle, pulse with varying pulse width, sawtooth and noise waveforms. Volume is controlled using ADSR envelopes. Oscillators and mixing are coded in assembler. The music playroutine is pretty much a standard four channel tracker with support for pulse width animation, volume slides, arpeggios, vibrato and portamento effects. Music data is compressed in memory so that each track row uses only 1 byte. The catchy tune was composed by Antti Tiihonen aka jpeeba using a custom textmode tracker I wrote just for this project.
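For readers unfamiliar with these waveforms, here is a generic Python sketch of the four oscillator types driven by a 16-bit phase accumulator, plus a simple mixer. This is not the console's assembler code: the accumulator width, the LFSR used for noise and the averaging mixer are assumptions chosen for clarity, and the volumes stand in for whatever the ADSR envelopes would produce.

```python
def sawtooth(phase):
    """phase is a 16-bit accumulator; output ramps 0..255 once per cycle."""
    return phase >> 8

def pulse(phase, width):
    """width is a 16-bit duty threshold: output is high while phase < width."""
    return 255 if phase < width else 0

def triangle(phase):
    """Rises for the first half of the cycle, falls for the second half."""
    ramp = (phase >> 7) & 0xFF          # 0..255 over each half cycle
    return ramp if phase < 0x8000 else 255 - ramp

def noise_step(lfsr):
    """Advance a 16-bit Galois LFSR (feedback mask 0xB400) by one step."""
    lsb = lfsr & 1
    lfsr >>= 1
    if lsb:
        lfsr ^= 0xB400
    return lfsr

def mix(samples, volumes):
    """Scale each channel by its 0..255 volume and average the channels."""
    total = sum(s * v for s, v in zip(samples, volumes))
    return total // (255 * len(samples))

# Advance a phase accumulator by a hypothetical frequency step.
phase = (0x4000 + 0x0100) & 0xFFFF
sample = mix(
    [triangle(phase), pulse(phase, 0x8000), sawtooth(phase), noise_step(0xACE1) & 0xFF],
    [255, 128, 64, 0],
)
```

Pulse width animation, as mentioned above, amounts to changing the `width` argument over time.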
Other tidbits
Some of the tiles are animated on the screen: gold pieces, hearts and the princess are technically background tiles. I only swap their tile pointers every few frames. To make this really fast I scan only a single row of tiles per frame, so that the whole screen is updated every 10 frames.
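The row-per-frame update can be sketched in Python as follows. The tile indices and the `next_tile` lookup are hypothetical, and the sketch swaps tiles every time a row is visited rather than every few frames, to keep it short.

```python
COLS, ROWS = 13, 10

def animate_row(tile_map, frame, next_tile):
    """Advance animated tiles in one tile row, chosen by the frame counter.

    next_tile maps a tile index to the index of its next animation frame
    (a hypothetical lookup; tiles not in it are static).
    """
    row = tile_map[frame % ROWS]
    for col in range(COLS):
        row[col] = next_tile.get(row[col], row[col])

# Hypothetical tile indices: 0 = empty, 1 and 2 = two frames of a gold piece.
next_tile = {1: 2, 2: 1}
tile_map = [[1 if c == 3 else 0 for c in range(COLS)] for _ in range(ROWS)]
for frame in range(ROWS):
    animate_row(tile_map, frame, next_tile)
```

Each frame touches only 13 tile entries instead of 130, at the cost of the rows changing on different frames.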
There are actually three different video modes in the game: the main game mode with tiles and sprites (13x10 tiles), untiled titlescreen mode with 128x80 resolution and intro text mode with 14x10 tiles with no sprites. I did not need sprites for the titlescreen and there was space left in program memory so I could afford a slightly bigger resolution for the titlescreen. I couldn't fit the intro text beautifully into only 13x10 tiles, so I had to do a custom graphics mode with one more tile horizontally for the intro ;-)
In the end, there are only a few bytes of RAM and about 200 bytes of program memory left. I know that by optimizing and with better compression techniques (and removing one or two of the extra video modes) I could fit even more into memory, but luckily the game does not really need more stuff.
Thanks for reading!
Etched circuit board (excuse the hand drawn lines)
Media coverage:
Legend of Grimrock Co-Creator Builds 8-Bit Game On DIY Console
8-Bit Video Game is Best of Retro Gaming on a Shoestring Budget
8-bit gaming with Atmel’s ATmega328P
The True 8-Bit Video Game Toorum’s Quest II And The Console Made To Play It