Do you see that grumpy old engineer in the corner of the lab who delights in sending new graduates down to stores to get a bag of holes? That could have been me, but when Personnel departments became Orwellian Human Resource departments producing dictates banning you from taking leave during Team Building Days, I decided to leave employment and start my own company. Freedom? Well, of a sort – you do still need to produce things that might sell. But you do get to spend a few days exploring some dimly lit avenue of electronics without having marketing asking you to produce yet another PowerPoint presentation for management that tries to justify their inspired idea that breaks a fundamental law of physics. This article is about one such diversion.

I enjoy reading the articles of those eccentric engineers emulating past microprocessors such as the 68000 or even a Cray. My own favourite processor was the Motorola 6809 and I used it as a workhorse in my early days as an engineer, using an old Dragon home computer as a development tool. I was thinking of designing it into an FPGA – just for fun – when a more pressing need took hold.

I used FPGAs a lot in my designs and usually needed a processor for some housekeeping or control purpose. Although FPGA-based processors were available, such as Altera’s Nios, they did use quite a lot of resources and often required external code memory, encumbered as they were by C compilers and even operating systems. I decided to design my own minimal microprocessor for those simple tasks such as I2C control of peripherals, or simple control interfaces such as digital encoders and character LCD displays. I loosely based the design on my favourite, the 6809, but soon started stripping it bare, finally managing to squeeze a workable processor into less than 400 logic elements (LEs – basically a 4-input look-up table followed by a programmable flip-flop – the building blocks of FPGAs), allowing it to fit in the smallest of FPGAs, or allowing multiple instantiations in the same FPGA.

The block diagram of the processor, which I call the PT13, is shown in Figure 1.

Figure 1  The PT13 homebrew microprocessor block diagram 

I decided to strip out the 16-bit registers from the 6809, although I kept a limited indexing operation using Accumulator B because it is handy for referencing tables. This saves at least 32 LEs.

I also stripped out the two 16-bit stack pointer registers. To keep a minimal footprint for the processor, I decided to forgo interrupts. This means events have to be polled, which is a little wasteful of power, but as the code is relatively simple, and given the savings it allows – for example, not needing to keep a copy of all the registers – it was a worthwhile compromise. To implement subroutine branches, three registers form a small last-in, first-out stack to hold the program counter contents, allowing a maximum of three nested routines. When programming, you need to be aware of this restriction, as there are no assembler warnings if you exceed this number.
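As a rough illustration of the three-register return-address store described above, here is a minimal sketch. It is not taken from the PT13 source: the module name, port names, and the push/pop strobes are all hypothetical, and the 14-bit width simply matches the ROM_A[13:0] port. In particular, the behaviour on a fourth nested call (the oldest address is silently lost) is an assumption chosen to show why the nesting limit matters.

```verilog
// Hypothetical three-deep return-address stack (illustrative only).
module return_stack3 (
    input  wire        clk,
    input  wire        reset_n,
    input  wire        push,         // one-clock strobe on branch to subroutine
    input  wire        pop,          // one-clock strobe on return
    input  wire [13:0] pc_next,      // address of the instruction after the call
    output wire [13:0] return_addr   // most recently pushed address
);
    reg [13:0] level0, level1, level2;

    assign return_addr = level0;

    always @(posedge clk or negedge reset_n) begin
        if (!reset_n) begin
            level0 <= 14'd0;
            level1 <= 14'd0;
            level2 <= 14'd0;
        end else if (push) begin
            // Assumed overflow behaviour: a fourth nested call
            // discards the oldest return address without warning.
            level2 <= level1;
            level1 <= level0;
            level0 <= pc_next;
        end else if (pop) begin
            level0 <= level1;
            level1 <= level2;
            level2 <= 14'd0;
        end
    end
endmodule
```

Three 14-bit registers and a little shifting logic cost far fewer logic elements than a stack pointer plus the RAM interface needed to support it.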

Each instruction cycle takes a fixed 16 clocks, so at 27 MHz (the clock I usually use for the PT13) each instruction takes about 600 ns. This may seem to be an excessive amount of time – and it is – but it did make for a simple internal state machine. There are four elements to each instruction cycle: Fetch, Decode, Execute, and Write-Back. The reason I use more than one cycle for each element is the structure of the Altera FPGA memories, which must be registered for the address (or is the data? it is one of them, or possibly both). It was simpler to just allow sufficient time for the memory data to be read or written.
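The timing works out as 16 clocks × (1 / 27 MHz) ≈ 16 × 37.04 ns ≈ 593 ns per instruction. One simple way to build such a fixed-length cycle is a free-running 4-bit counter whose top two bits name the phase. The sketch below assumes the 16 clocks are split evenly, four per phase; the article does not say how the PT13 actually divides them, and the module and signal names are hypothetical.

```verilog
// Hypothetical sequencer: 16 clocks per instruction, four per phase.
// The even split is an assumption made for illustration.
module cycle_sequencer (
    input  wire       clk,
    input  wire       reset_n,
    output wire [1:0] phase,      // 0 = Fetch, 1 = Decode, 2 = Execute, 3 = Write-Back
    output wire       phase_end   // high on the last clock of each phase
);
    reg [3:0] count;

    always @(posedge clk or negedge reset_n) begin
        if (!reset_n)
            count <= 4'd0;
        else
            count <= count + 4'd1;   // wraps naturally from 15 back to 0
    end

    assign phase     = count[3:2];
    assign phase_end = (count[1:0] == 2'b11);
endmodule
```

Giving each phase several clocks leaves slack for a registered memory to present its data before the result is used.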

At startup, the internal state machines are reset and the program counter (PC) is cleared. Then it starts addressing the program memory at address $0000 (the Fetch cycle). Program memory is internal to the FPGA – typical code size might be just 1 kB, which can be implemented in two Altera memory blocks (each block is 512×9 bits). Altera has two memory file formats, HEX and MIF. The assembler can create either of these, although I use MIF files. If you instantiate a memory block configured as single port ROM and point the initial file to the assembled binary file, then each time you modify the processor code, all you have to do is compile the FPGA and the new code will automatically be included in the compile.

The data at the first PC location is decoded by the Op-code decode block which in turn controls the sequencing of the control unit. There are about 50 instructions, and they are grouped into Arithmetic, Logic, Branch, Load & Store, and a token NOP (no operation) instruction. There is no special handling of illegal instructions. Arithmetic instructions comprise ADD and SUBTRACT on either accumulator, A or B. A condition-code register detects carry and zero conditions for both accumulators.
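To make the carry and zero flags concrete, here is a minimal sketch of an 8-bit add/subtract unit. It is not the PT13 implementation: the module and port names are hypothetical, and treating carry as "borrow" on a subtract follows the 6809 convention, which is an assumption about the PT13.

```verilog
// Hypothetical 8-bit add/subtract with carry and zero flags.
module add_sub_flags (
    input  wire [7:0] acc,        // accumulator A or B
    input  wire [7:0] operand,
    input  wire       subtract,   // 0 = ADD, 1 = SUBTRACT
    output wire [7:0] result,
    output wire       carry,
    output wire       zero
);
    // Extend to 9 bits so the carry (or borrow) appears in bit 8.
    wire [8:0] sum = subtract ? ({1'b0, acc} - {1'b0, operand})
                              : ({1'b0, acc} + {1'b0, operand});

    assign result = sum[7:0];
    assign carry  = sum[8];              // carry out on add, borrow on subtract (assumed)
    assign zero   = (sum[7:0] == 8'd0);
endmodule
```

With these two flags, the four conditional branches (carry clear, carry set, zero, non-zero) need only a two-input test in the control unit.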

Logic instructions are AND, OR, EXOR, logic and arithmetic shifts, and rotate (through carry) left and right. Conditional branches (carry clear, carry set, zero, or non-zero), branch always, and branch to subroutine (and return) make up the branching instructions. In each case the absolute branch address follows the instruction, and is calculated by the assembler. Load and store instructions include immediate instructions to write to the accumulators, and load and store instructions for external data memory.

Figure 2 shows how to connect the PT13 to external ROM and RAM memory and also to input/output (I/O) devices. RAM and I/O occupy the same address space. RAM is only required for variable storage as the stack is register-based. Addresses are formed of two parts: the upper 4 bits are formed by a data page register which is written using a load immediate instruction. The lower byte is either immediate data following the instruction, or the contents of accumulator B, allowing indexed operations.
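The two-part address can be pictured as a simple concatenation. The sketch below is illustrative only: the module, the load_page strobe, and the indexed select are hypothetical names, and only the 4-bit page plus 8-bit low byte arrangement comes from the description above.

```verilog
// Hypothetical formation of the 12-bit data address.
module data_address (
    input  wire        clk,
    input  wire        reset_n,
    input  wire        load_page,   // strobe from the load-immediate page instruction
    input  wire [3:0]  page_value,  // immediate value for the page register
    input  wire        indexed,     // 1 = low byte from accumulator B, 0 = immediate byte
    input  wire [7:0]  immediate,
    input  wire [7:0]  acc_b,
    output wire [11:0] ram_a
);
    reg [3:0] page;

    always @(posedge clk or negedge reset_n) begin
        if (!reset_n)
            page <= 4'd0;
        else if (load_page)
            page <= page_value;
    end

    // Upper 4 bits: page register. Lower 8 bits: immediate data or accumulator B.
    assign ram_a = {page, indexed ? acc_b : immediate};
endmodule
```

Because accumulator B supplies only the low byte, a table accessed by indexing is limited to 256 entries within the current page.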

Figure 2  External memory & I/O connections

This is how to instantiate the PT13 in Verilog:

PT13 PT13(
    .Clock(XTAL_27M),
    .RESETn(RESETn),
    .Program_memory_data(ROM_data[7:0]),
    .RAM_data_out(read_data[7:0]),
    .ROM_A(rom_a[13:0]),
    .RAM_A(ram_a[11:0]),
    .RAM_data_in(ram_data_in[7:0]),
    .read_write(read_write)
);

Program memory is effectively a single port ROM that is initialised using the assembled binary file (which by default is called as13text.mif). The address to the ROM is ROM_A[13:0] and the read data is ROM_data[7:0]. The clock can be anything you like but should be fixed and stable – I usually have a 27 MHz clock available so I use that. This clock should then be used for anything else connected to the PT13 such as memory or I/O. RESETn is an asynchronous active low reset. RAM is addressed by RAM_A[11:0]. There are separate ports for the read and write data: the data to be written to RAM is RAM_data_in[7:0] and the data read from the RAM is RAM_data_out[7:0]. The read_write signal indicates if the RAM is to be written (low) or read (high).
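As a rough sketch of how a variable-storage RAM might hang off these ports, consider the module below. Several things here are assumptions rather than facts from the PT13 design files: the module name, the 256-byte size, placing the RAM on data page 0, and the idea that the address and write data are stable whenever read_write is low.

```verilog
// Hypothetical 256-byte variable store on data page 0.
module variable_ram (
    input  wire        clk,           // same clock as the PT13
    input  wire [11:0] ram_a,
    input  wire [7:0]  ram_data_in,   // data written by the PT13
    input  wire        read_write,    // low = write, high = read
    output reg  [7:0]  RAM_memory     // read data, to be multiplexed into RAM_data_out
);
    reg [7:0] mem [0:255];

    // Assumed decode: the 4-bit data page selects the device.
    wire ram_select = (ram_a[11:8] == 4'h0);

    always @(posedge clk) begin
        if (ram_select && !read_write)
            mem[ram_a[7:0]] <= ram_data_in;
        RAM_memory <= mem[ram_a[7:0]];   // registered read
    end
endmodule
```

A synchronous array written like this is the usual pattern that synthesis tools map onto an embedded memory block rather than onto flip-flops.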

I usually use one FPGA RAM block for variable storage, even if I am only using a couple dozen locations, as it saves on the logic resources required if I were to implement the memory as registers, whereas I usually have memory left over in the FPGA design. I/O devices use a decoded memory address. To read back more than one address, a multiplexer is necessary. For example:

//  Multiplex read data into PT13
//  (read_data must be declared as reg [7:0]; this assumes the 4-bit
//  data page, ram_a[11:8], selects the device)
always @* begin
    case (ram_a[11:8])
        4'h0:    read_data = RAM_memory;
        4'h9:    read_data = {6'd0, SDA_HDMI, SCL_HDMI};
        default: read_data = 8'd0;
    endcase
end

Below is an example of an I/O write-only register, which I use for bit-twiddled I2C control. I2C_control_en is decoded from RAM_A[11:0] (data page 9 as it happens, which is read back as shown above).

// HDMI I2C control
// (I2C_control_latch must be declared as reg [1:0])
always @(posedge XTAL_27M or negedge RESETn) begin
    if (!RESETn) begin
        I2C_control_latch <= 2'd0;
    end else if (I2C_control_en) begin
        I2C_control_latch <= ram_data_in[1:0];
    end else begin
        I2C_control_latch <= I2C_control_latch;
    end
end

// Open-drain outputs: drive low when the latch bit is 0, otherwise release the line
assign SDA_HDMI = ~I2C_control_latch[1] ? 1'b0 : 1'bZ;
assign SCL_HDMI = ~I2C_control_latch[0] ? 1'b0 : 1'bZ;


The original code for the PT13 was written in Altera’s Hardware Description Language (AHDL), which still produces the most compact code, but I have now rewritten PT13 in Verilog. The last compile I did used 391 LEs. I am sure it could be optimised further, but what I have served its purpose and time was better spent on the work that earned money.

Although PT13 software can be written using any editor, I use one that allows you to add a highlighter file that highlights the op-codes, which improves clarity. The PT13 assembler (as13.exe) was written by a friend of mine who is fluent in the dark arts of software. It runs in a DOS box and, as mentioned, can directly create an Altera MIF file of the correct size. It is convenient to keep the assembler executable in the same directory as the code. In that case, for example, to assemble code called PT13_code.asm, you enter:

as13 -s rom=1024 PT13_code.asm

which will produce a 1 kB file called as13text.mif. The assembler will also report errors in the code.

Over the years, I have used the PT13 in many projects: for user interface, such as control of an LCD character display, digital encoder, and switches; for housekeeping duties, such as what to do in the event of signal loss; and for simple communications, such as RS-232 or I2C control of a peripheral. The PT13 is very useful as part of a complete design as it allows simple modification of some real-time routines. For example, a video AGC routine can be altered by reprogramming the PT13 if some input conditions are encountered that were not considered in the original design. This may not matter in FPGA applications, but for an ASIC design, if a small reprogrammable ROM (internal or external to the ASIC) can be provided, then it is possible to fix the algorithm without having to re-spin the whole design.

The PT13 can be an educational tool — a sort of deconstructed Raspberry Pi, where you can dive down into the actual processor and modify it if you wish. I designed a small board with a few peripherals, such as an LCD, buttons, digital encoder, IR receiver, and a couple of Digilent Pmod ports which allow access to a wide range of peripherals that match the capabilities of the PT13.

I used the now ancient Altera EP2C8 Cyclone II FPGA which has the advantage of no exposed pad, so the board can be hand assembled with a fine tipped soldering iron, a steady hand, and, in my case, a bench microscope. The board was designed using Express PCB’s free-to-download software, so PCBs can be manufactured at low cost (the EP2C8 device is only supported by older versions of the Quartus software).

Designing the PT13 was a lot of fun, and very different from my day job, which is video processing. OK, there are a zillion microprocessors out there running at 600 MHz and able to make you tea whilst reading you a bedtime story, and all for a couple of dollars. But designing the PT13 was a throwback to the days when you made your own audio amplifiers or AM radio. Probably dozens of projects have “PT13 Inside”, trundling away, reading and twiddling bits. I keep meaning to go back and optimise the code further, but I never seem able to find the time. Or maybe I’ll code up the Inmos Transputer next. Just for fun of course!



All of the files and information relating to the PT13 are available for download.

This includes the AHDL and Verilog versions of the PT13, links to the editor and assembler, a user manual, code examples, and the PCB design files.


 —Daniel Ogilvie runs a video processing company in Scotland, and has lived and engineered in several continents.