Project Overview

Lukas' Microprocessor (LMP) is a minimal educational CPU simulator. It models a simple fetch-decode-execute cycle on a small instruction set. LMP includes a built-in Fibonacci demo to showcase register operations, branching, and memory access.

Goals

  • Demonstrate core CPU concepts: instruction formats, control unit, datapath
  • Provide a hands-on platform for learning assembly programming
  • Offer a foundation for extending a CPU with additional instructions and logic

When to Use

  • Teaching or studying basic CPU architecture
  • Prototyping simple instruction-set extensions
  • Exploring low-level programming and bitwise operations

Key Features

  • Instruction set:
    • Data movement: LOAD, STORE, MOV
    • Arithmetic/logic: ADD, SUB (AND, OR, LSL, LSR pending)
    • Control flow: JMP, JZ (jump if zero), JNZ (jump if not zero)
    • HALT to stop execution
  • Built-in Fibonacci demo in demos/fibonacci.bin
  • Assembly format with binary encodings
  • Architecture diagrams in docs/architecture-0.png and docs/architecture-1.png

Quickstart

CLI Usage

# Clone and build
git clone https://github.com/e3ntity/lmp.git
cd lmp
cargo build --release

# Run the Fibonacci demo
./target/release/lmp demos/fibonacci.bin

Library Integration

use lmp::{Loader, Simulator};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load preassembled binary
    let program = Loader::from_file("demos/fibonacci.bin")?;
    
    // Initialize simulator and load program
    let mut sim = Simulator::new();
    sim.load(&program);
    
    // Execute until HALT
    sim.run();
    
    // Output result in register R0
    println!("Fibonacci result: {}", sim.reg(0));
    Ok(())
}

This overview helps you get started with LMP’s CPU simulation, explore its built-in demo, and extend the instruction set or datapath.

Getting Started

This guide walks you through cloning the e3ntity/lmp repository, building the simulator, running the built-in Fibonacci sample, and verifying its output.

Prerequisites

  • Git
  • GNU Make
  • GCC (or compatible C compiler)

1. Clone the Repository

Execute:

git clone https://github.com/e3ntity/lmp.git
cd lmp

2. Build the Simulator

Run:

make build

This command:

  • Compiles all C source files in src/
  • Outputs the executable mlp into the build/ directory
  • Enables debugging flags (-g)

3. Run the Fibonacci Sample

Invoke the simulator:

make run

Under the hood, this runs:

./build/mlp

The program loads a Fibonacci routine into simulated memory, steps through fetch–decode–execute cycles, and prints results.

4. Verify the Output

On successful execution, you should see the first 10 Fibonacci numbers (indices 0–9):

Fibonacci sequence:
0
1
1
2
3
5
8
13
21
34

If the output matches, the simulator correctly executed the sample. Proceed to explore other demos or extend the instruction set.

LMP Architecture & Instruction Set

Lukas’ Microprocessor (LMP) implements a simple von Neumann architecture with 16-bit instructions, a unified 16-bit address space, eight 8-bit general-purpose registers, a status register for flags, and a small control unit that drives its datapath. Below is a detailed breakdown of its components, control signals, memory map, and instruction encodings.

Register File & Flags

  • Eight 8-bit general-purpose registers R0–R7
  • Status Register (SR) holds:
    • Z (Zero) flag: set if ALU result == 0
    • N (Negative) flag: set if MSB of ALU result == 1
    • C (Carry) flag: set on unsigned overflow
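
To make the flag definitions above concrete, here is a minimal sketch of an 8-bit addition that derives Z, N, and C. The bit positions (SR_Z, SR_N, SR_C) and the function name alu_add8 are assumptions made for this example; the actual status-register layout is not given here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions for the status register (assumption). */
#define SR_Z 0x01u
#define SR_N 0x02u
#define SR_C 0x04u

/* Add two 8-bit values, store the resulting flags in *sr, return the sum. */
uint8_t alu_add8(uint8_t a, uint8_t b, uint8_t *sr)
{
    assert(sr != NULL);

    uint16_t wide  = (uint16_t)((uint16_t)a + (uint16_t)b); /* keeps the ninth bit */
    uint8_t  res   = (uint8_t)wide;                         /* truncated 8-bit result */
    uint8_t  flags = 0;

    if (res == 0)     flags |= SR_Z;  /* Z: result is zero */
    if (res & 0x80u)  flags |= SR_N;  /* N: most significant bit is 1 */
    if (wide > 0xFFu) flags |= SR_C;  /* C: unsigned overflow out of bit 7 */

    *sr = flags;
    return res;
}
```

Computing the sum in a wider type is one simple way to observe the carry; a hardware adder would expose it directly as the carry-out.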

Memory Map

  • 16-bit address bus (0x0000–0xFFFF)
  • 0x0000–0x7FFF: Program ROM
  • 0x8000–0xBFFF: Data RAM
  • 0xC000–0xCFFF: Memory‐mapped I/O
  • 0xD000–0xFFFF: Reserved
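
The memory map above can be expressed as a small address classifier. This is only a sketch: the enum mem_region_t and the function mem_region are hypothetical names introduced for illustration.

```c
#include <stdint.h>

/* Regions of the 16-bit address space, as listed in the memory map. */
typedef enum {
    REGION_PROGRAM_ROM,  /* 0x0000-0x7FFF */
    REGION_DATA_RAM,     /* 0x8000-0xBFFF */
    REGION_MMIO,         /* 0xC000-0xCFFF */
    REGION_RESERVED      /* 0xD000-0xFFFF */
} mem_region_t;

/* Map an address to its region by comparing against each upper bound. */
mem_region_t mem_region(uint16_t addr)
{
    if (addr <= 0x7FFFu) return REGION_PROGRAM_ROM;
    if (addr <= 0xBFFFu) return REGION_DATA_RAM;
    if (addr <= 0xCFFFu) return REGION_MMIO;
    return REGION_RESERVED;
}
```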

Control Signals

Control signals decode each fetched instruction into micro‐operations in the datapath. See the block diagram for visual reference:

![Control Unit & Datapath](docs/architecture-1.png)

Key signals (from src/control_unit.h):

  • regWrite: write register file
  • memRead / memWrite: access data memory
  • aluSrc: select second ALU operand (register vs immediate)
  • branch: conditionally update PC
  • jump: unconditional PC ← address
  • aluOp[1:0]: ALU function code (00=ADD, 01=SUB, 10=AND, 11=OR)

Control‐signal decoder (simplified):

#include "control_unit.h"

// Extract fields and generate control signals
ControlSignals decodeControlSignals(uint16_t instr) {
    ControlSignals s = {0};
    uint8_t opcode = (instr >> 12) & 0xF;
    switch (opcode) {
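      case 0x1: // ADD
      case 0x2: // SUB
      case 0x3: // CMP
      case 0x4: // MOV
        s.regWrite = (opcode != 0x3);  // CMP only updates flags, no write-back
        s.aluSrc   = false;
        // SUB and CMP both subtract; ADD and MOV use the adder path
        s.aluOp    = (opcode == 0x2 || opcode == 0x3) ? 0x1 : 0x0;
        break;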
      case 0x5: // LD
        s.memRead  = true;
        s.aluSrc   = true;
        s.regWrite = true;
        break;
      case 0x6: // ST
        s.memWrite = true;
        s.aluSrc   = true;
        break;
      case 0x7: // B
        s.branch   = true;
        break;
      case 0x8: // JMP
        s.jump     = true;
        break;
      default:
        break;
    }
    return s;
}

Instruction Formats

Refer to the assembly formats diagram for bit-level layouts:

![Instruction Formats](docs/architecture-0.png)

LMP uses three 16-bit formats:

R-Type (register)
bits 15–12 opcode | 11–9 rd | 8–6 rs | 5–3 rt | 2–0 —
I-Type (immediate)
bits 15–12 opcode | 11–9 rd | 8–0 imm9 (signed)
J-Type (jump)
bits 15–12 opcode | 11–0 address12
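
The three layouts above translate directly into shift-and-mask encoders. The helper names encode_r, encode_i, and encode_j are hypothetical and are not part of the LMP sources; the sketch only packs the fields at the bit positions listed above.

```c
#include <assert.h>
#include <stdint.h>

/* R-Type: opcode[15:12] | rd[11:9] | rs[8:6] | rt[5:3] | unused[2:0] */
uint16_t encode_r(uint8_t opcode, uint8_t rd, uint8_t rs, uint8_t rt)
{
    assert(opcode <= 0xF && rd <= 7 && rs <= 7 && rt <= 7);
    return (uint16_t)((opcode << 12) | (rd << 9) | (rs << 6) | (rt << 3));
}

/* I-Type: opcode[15:12] | rd[11:9] | imm9[8:0], two's complement */
uint16_t encode_i(uint8_t opcode, uint8_t rd, int16_t imm)
{
    assert(opcode <= 0xF && rd <= 7 && imm >= -256 && imm <= 255);
    return (uint16_t)((opcode << 12) | (rd << 9) | ((uint16_t)imm & 0x1FFu));
}

/* J-Type: opcode[15:12] | address12[11:0] */
uint16_t encode_j(uint8_t opcode, uint16_t address)
{
    assert(opcode <= 0xF && address <= 0x0FFFu);
    return (uint16_t)((opcode << 12) | address);
}
```

With the opcodes used in this document, encode_r(0x1, 0, 1, 2) yields 0x1050, encode_i(0x5, 3, -4) yields 0x57FC, and encode_j(0x8, 0x010) yields 0x8010.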

Instruction Set & Encodings

| Opcode (hex) | Mnemonic | Format | Description |
|---|---|---|---|
| 0x1 | ADD rd, rs, rt | R | rd ← rs + rt |
| 0x2 | SUB rd, rs, rt | R | rd ← rs – rt |
| 0x3 | CMP rs, rt | R | Z←(rs==rt), N←MSB(rs–rt) |
| 0x4 | MOV rd, rs | R | rd ← rs |
| 0x5 | LD rd, [rs + imm] | I | rd ← Mem[rs + sign_ext(imm9)] |
| 0x6 | ST rs, [rd + imm] | I | Mem[rd + sign_ext(imm9)] ← rs |
| 0x7 | B cond, offset | I | if(cond) PC ← PC + offset |
| 0x8 | JMP address | J | PC ← address12 |
| 0x9–0xC | AND, OR, LSL, LSR (unimpl.) | R | Reserved for future bitwise ops |

Note: AND, OR, LSL, LSR opcodes are defined in the format but not implemented in the current microcode.
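
As a quick way to inspect encoded programs, the opcode table can be turned into a lookup from instruction word to mnemonic. This is a sketch, not part of LMP: lmp_mnemonic is a hypothetical helper, and because the exact assignment of 0x9–0xC to individual mnemonics is not spelled out above, those opcodes are simply reported as reserved.

```c
#include <stdint.h>

/* Return the mnemonic for the opcode in bits 15..12 of an instruction. */
const char *lmp_mnemonic(uint16_t instr)
{
    /* Indexed by opcode; unlisted entries are NULL. */
    static const char *const names[16] = {
        [0x1] = "ADD", [0x2] = "SUB", [0x3] = "CMP", [0x4] = "MOV",
        [0x5] = "LD",  [0x6] = "ST",  [0x7] = "B",   [0x8] = "JMP",
    };
    uint8_t opcode = (uint8_t)((instr >> 12) & 0xF);

    if (names[opcode] != 0)
        return names[opcode];
    if (opcode >= 0x9 && opcode <= 0xC)
        return "reserved";   /* future bitwise ops, not implemented */
    return "undefined";      /* opcode not listed in the table */
}
```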

Example: Encoding & Execution Flow

    ; R0 ← R1 + R2
    ADD R0, R1, R2        ; opcode=0x1, rd=0, rs=1, rt=2
    ; LD R3, [R0 + -4]
    LD  R3, [R0, #-4]      ; opcode=0x5, rd=3, rs=0, imm9=0x1FC
    ; if Z flag set jump ahead
    BZ  +8                ; opcode=0x7, cond=Z, offset=+8
    JMP 0x010             ; opcode=0x8, address=0x010

Execution in C driver:

uint16_t instr = fetchInstruction(pc);
ControlSignals ctl = decodeControlSignals(instr);
uint8_t A = regFile[ (instr>>6)&0x7 ];
uint8_t B = ctl.aluSrc
             ? signExtend(instr & 0x1FF)
             : regFile[ (instr>>3)&0x7 ];
uint16_t result = alu(A, B, ctl.aluOp);
// memory access and write-back
uint8_t data = 0;  // value loaded from memory, used only when memRead is set
if (ctl.memRead)   data = dataMemory[result];
if (ctl.memWrite)  dataMemory[result] = regFile[(instr>>9)&0x7];
if (ctl.regWrite)  regFile[(instr>>9)&0x7] =
                    (ctl.memRead ? data : (uint8_t)result);
// branching
if (ctl.branch && checkCondition(instr, SR)) 
    pc += signExtend(instr & 0x1FF);
if (ctl.jump) pc = instr & 0x0FFF;
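
The driver above relies on a signExtend helper that is not shown. A minimal sketch of 9-bit sign extension follows; the name sign_extend9 and its signature are assumptions, and the real helper may differ. Note that the driver stores the extended value in a uint8_t operand, so only the low 8 bits survive there.

```c
#include <stdint.h>

/* Sign-extend the low 9 bits of `field` to a signed 16-bit value. */
int16_t sign_extend9(uint16_t field)
{
    uint16_t imm = field & 0x1FFu;  /* keep bits 8..0 only */

    /* Flip the sign bit, then subtract its weight:
     * 0x000..0x0FF map to 0..255, 0x100..0x1FF map to -256..-1. */
    return (int16_t)((int)(imm ^ 0x100u) - 0x100);
}
```

For the LD example above, sign_extend9(0x1FC) evaluates to -4, matching the #-4 offset in the assembly.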

This completes the deep dive into LMP’s CPU architecture and instruction set.

Codebase Tour

The LMP core comprises modules for arithmetic, control decoding, memory management, and instruction execution. This tour highlights where each behavior lives and how execution flows through the system.

ALU Resolve (alu_resolve)

Performs 8-bit arithmetic operations and updates Zero/Negative flags in the register file.

Signature

int8_t alu_resolve(uint8_t alu_ctrl, int8_t a, int8_t b);

Parameters

  • alu_ctrl:
    • ALU_CTRL_NOP (0x00): no operation, returns 0, flags unaffected
    • ALU_CTRL_ADD (0x01): compute a + b
    • ALU_CTRL_SUB (0x02): compute a - b
  • a, b: signed 8-bit operands

Return

8-bit result (or 0 for NOP/invalid). Invalid codes log via DPRINT (showing current reg_ic) and return 0 without changing flags.

Side Effects

Updates bits in reg_file[REG_FLAGS]:

  • Zero (REG_FLAGS_Z): set if result == 0, otherwise cleared
  • Negative (REG_FLAGS_N): set if result < 0, otherwise cleared

Excerpt

int8_t alu_resolve(uint8_t alu_ctrl, int8_t a, int8_t b)
{
    int8_t res;
    switch (alu_ctrl) {
      case ALU_CTRL_NOP: return 0;
      case ALU_CTRL_ADD: res = a + b; break;
      case ALU_CTRL_SUB: res = a - b; break;
      default:
        DPRINT("Error: Invalid alu_ctrl %d at %02x\n", alu_ctrl, reg_ic);
        return 0;
    }
    if (res == 0)
        reg_file[REG_FLAGS] |= REG_FLAGS_Z;
    else
        reg_file[REG_FLAGS] &= ~REG_FLAGS_Z;  // clear the Zero flag
    if (res < 0)
        reg_file[REG_FLAGS] |= REG_FLAGS_N;
    else
        reg_file[REG_FLAGS] &= ~REG_FLAGS_N;  // clear the Negative flag
    return res;
}

Usage

#include "alu.h"
#include "register_file.h"

// Clear flags
reg_file[REG_FLAGS] &= ~(REG_FLAGS_Z | REG_FLAGS_N);

// Perform subtraction
int8_t result = alu_resolve(ALU_CTRL_SUB, reg_file[0], reg_file[1]);
bool isZero     = (reg_file[REG_FLAGS] & REG_FLAGS_Z) != 0;
bool isNegative = (reg_file[REG_FLAGS] & REG_FLAGS_N) != 0;

Instruction Decoding (cu_decode)

Translates a 16-bit instruction into control signals that drive the datapath.

Control Signal Structure

// control_unit.h
struct cu_signal {
    uint8_t jmp            :1;
    uint8_t mem_write      :1;
    uint8_t data_wd_select :1;
    uint8_t reg_wd_select  :2; // 0=MOV,1=LDR,2=ALU
    uint8_t imm            :1;
    uint8_t reg_write      :1;
    uint8_t zero_dest      :1;
    uint8_t zero_src       :1;
    uint8_t alu_ctrl;
};

Prototype

// control_unit.h
struct cu_signal cu_decode(uint16_t instr);

Operation

  1. Extract fields:
    uint8_t op  = (instr & 0x1E00) >> 9;
    uint8_t imm = (instr & 0x0100) >> 8;
    
  2. Zero the output struct cu_signal.
  3. Switch on op to set relevant bits:
    • JMP (1): jmp=1
    • MOV (3): imm, reg_wd_select=0, reg_write=1
    • LDR (4): data_wd_select=imm, zero_src=1, reg_wd_select=1, reg_write=1
    • STR (5): data_wd_select=imm, zero_dest=1, mem_write=1
    • ADD (8)/SUB (9): imm, alu_ctrl=ADD/SUB, reg_wd_select=2, reg_write=1
    • Unimplemented ops return all-zero.
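
The steps above can be sketched as a self-contained decoder. This is an illustration under stated assumptions, not the repository's implementation: the function is named cu_decode_sketch to keep it distinct from the real cu_decode, the ALU_CTRL_* values are copied from the ALU section, and a bare "imm" in the list is read as "copy the instruction's imm bit into the signal".

```c
#include <stdint.h>

#define ALU_CTRL_NOP 0x00
#define ALU_CTRL_ADD 0x01
#define ALU_CTRL_SUB 0x02

struct cu_signal {
    uint8_t jmp            :1;
    uint8_t mem_write      :1;
    uint8_t data_wd_select :1;
    uint8_t reg_wd_select  :2; /* 0=MOV, 1=LDR, 2=ALU */
    uint8_t imm            :1;
    uint8_t reg_write      :1;
    uint8_t zero_dest      :1;
    uint8_t zero_src       :1;
    uint8_t alu_ctrl;
};

struct cu_signal cu_decode_sketch(uint16_t instr)
{
    struct cu_signal s = {0};                         /* step 2: all signals off */
    uint8_t op  = (uint8_t)((instr & 0x1E00) >> 9);   /* step 1: bits 12..9 */
    uint8_t imm = (uint8_t)((instr & 0x0100) >> 8);   /* step 1: bit 8 */

    switch (op) {                                     /* step 3 */
    case 1: /* JMP */
        s.jmp = 1;
        break;
    case 3: /* MOV */
        s.imm = imm; s.reg_wd_select = 0; s.reg_write = 1;
        break;
    case 4: /* LDR */
        s.data_wd_select = imm; s.zero_src = 1;
        s.reg_wd_select = 1; s.reg_write = 1;
        break;
    case 5: /* STR */
        s.data_wd_select = imm; s.zero_dest = 1; s.mem_write = 1;
        break;
    case 8: /* ADD */
    case 9: /* SUB */
        s.imm = imm;
        s.alu_ctrl = (op == 8) ? ALU_CTRL_ADD : ALU_CTRL_SUB;
        s.reg_wd_select = 2; s.reg_write = 1;
        break;
    default: /* unimplemented ops: leave everything zero */
        break;
    }
    return s;
}
```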

Example

#include "control_unit.h"
#include "alu.h"

// Build ADD immediate: op=8 (0x1000), imm=1 (0x0100)
uint16_t instr = 0x1000 | 0x0100; // 0x1100
struct cu_signal sig = cu_decode(instr);
// sig.imm == 1
// sig.alu_ctrl == ALU_CTRL_ADD
// sig.reg_wd_select == 2
// sig.reg_write == 1

Memory Reset (mem_reset)

Clears both instruction and data memories to zero.

Signature

void mem_reset(void);

Details

  • uint8_t mem_instr[MEM_INSTR_SIZE] (256 bytes)
  • uint8_t mem_data[MEM_DATA_SIZE] (256 bytes)
  • Sets all entries to 0.

Excerpt

#include "memory.h"

extern uint8_t mem_instr[];
extern uint8_t mem_data[];

void mem_reset(void)
{
    for (uint32_t i = 0; i < MEM_INSTR_SIZE; i++)
        mem_instr[i] = 0;
    for (uint32_t i = 0; i < MEM_DATA_SIZE; i++)
        mem_data[i] = 0;
}

Usage

int main(void) {
    mem_reset();
    load_program_image("firmware.bin");
    run_cpu();
    return 0;
}

Register File Reset (rf_reset)

Resets the instruction counter and all registers to zero.

Prototype

// src/register_file.h
void rf_reset(void);

Behavior

  • Sets reg_ic = 0
  • Zeroes all REG_COUNT entries in reg_file[].

Excerpt

void rf_reset(void)
{
    reg_ic = 0;
    for (uint32_t i = 0; i < REG_COUNT; i++)
        reg_file[i] = 0;
}

Usage

#include "register_file.h"

int main(void) {
    rf_reset();                // clear IC and registers
    while (1) {
        // fetch-decode-execute loop
    }
    return 0;
}

Instruction Execution Cycle (handle)

Processes a single 16-bit instruction: condition check, decode, execute, and commit.

Steps

  1. Conditional check
    if (!cond_verify(instr)) {
        reg_ic += 2;
        return;
    }
    
  2. Decode control signals
    struct cu_signal cu = cu_decode(instr);
    
  3. Next PC
    uint8_t next_ic = cu.jmp ? (instr & 0xFF) : (reg_ic + 2);
    
  4. Register addressing
    uint8_t rf_a1 = cu.zero_dest ? 0 : ((instr >> 4) & 0x0F);
    uint8_t rf_a2 = cu.zero_src  ? 0 : (instr & 0x0F);
    uint8_t rf_rd1 = reg_file[rf_a1];
    uint8_t rf_rd2 = reg_file[rf_a2];
    
  5. Immediate override
    if (cu.imm)
      rf_rd2 = instr & 0x0F;
    
  6. ALU operation
    uint8_t alu_res = (uint8_t)alu_resolve(
        cu.alu_ctrl, (int8_t)rf_rd1, (int8_t)rf_rd2);
    
  7. Memory access
    uint8_t dm_a  = cu.data_wd_select ? (instr & 0xFF) : rf_rd2;
    uint8_t dm_rd = mem_data[dm_a];
    
  8. Write-back selection
    uint8_t rf_wd = 0;  // stays 0 for the unused selector value 3
    switch (cu.reg_wd_select) {
      case 0: rf_wd = rf_rd2;    break;
      case 1: rf_wd = dm_rd;     break;
      case 2: rf_wd = alu_res;   break;
    }
    
  9. Commit stores
    if (cu.mem_write)
      mem_data[dm_a] = rf_rd2;
    if (cu.reg_write)
      reg_file[rf_a1] = rf_wd;
    
  10. Advance PC
    reg_ic = next_ic;
    

Usage

Call handle() each cycle after fetching:

uint16_t instr = *(uint16_t*)(mem_instr + reg_ic);
handle(instr);
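
The pointer cast in the fetch above reads two bytes in host byte order. A possible alternative that gives the same value without depending on alignment or pointer-aliasing rules is to copy the bytes with memcpy. The helper name fetch16 and its bounds check are assumptions added for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 16-bit instruction at byte offset `ic`, in host byte order. */
uint16_t fetch16(const uint8_t *mem, size_t size, uint8_t ic)
{
    uint16_t instr;

    /* Refuse to read past the end of instruction memory. */
    assert(mem != NULL && (size_t)ic + sizeof instr <= size);

    memcpy(&instr, mem + ic, sizeof instr);
    return instr;
}
```

A call such as fetch16(mem_instr, MEM_INSTR_SIZE, reg_ic) would then take the place of the cast.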

Extending & Contributing

This section explains how to add new instructions (AND, OR, LSL, LSR, etc.), implement their behavior in the ALU, update tests and demos, and follow repository coding conventions and pull-request checklist.

1. Adding New Opcodes

  1. Open src/control_unit.h
    • Locate the enum alu_ctrl_e.
    • Add your new opcode, e.g.:
    typedef enum {
      ALU_ADD = 0,
      ALU_SUB = 1,
      ALU_AND = 2,   // new opcode
      ALU_OR  = 3    // another example
    } alu_ctrl_e;
    
  2. Open src/control_unit.c
    • In cu_decode(uint16_t instr, control_signals_t *cs), map the instruction bits to your new opcode:
    // assume bits 12-15 define ALU opcodes
    switch ((instr >> 12) & 0xF) {
      case 0x0: cs->alu_ctrl = ALU_ADD; break;
      case 0x1: cs->alu_ctrl = ALU_SUB; break;
      case 0x2: // AND instruction encoding
        cs->alu_ctrl    = ALU_AND;
        cs->reg_write   = true;
        break;
      case 0x3: // OR instruction
        cs->alu_ctrl    = ALU_OR;
        cs->reg_write   = true;
        break;
      // … other cases …
    }
    

2. Implementing Behavior in the ALU

  1. Open src/alu.h
    • Ensure your new ALU_AND, ALU_OR codes match the enum in control_unit.h.
  2. Open src/alu.c
    • Extend the alu_resolve switch:
    #include "alu.h"
    #include "lmp.h"      // for debug macros
    
    uint16_t alu_resolve(uint16_t a, uint16_t b, alu_ctrl_e ctrl, regfile_t *rf) {
      uint16_t res = 0;
      switch (ctrl) {
        case ALU_ADD: res = a + b; break;
        case ALU_SUB: res = a - b; break;
        case ALU_AND: res = a & b; break;   // implement AND
        case ALU_OR:  res = a | b; break;   // implement OR
        default:      res = 0;              // NOP or undefined
      }
      // update flags
      rf->zero = (res == 0);
      rf->neg  = (res & 0x8000) != 0;
      DBG("ALU: %04X op %d %04X = %04X (Z=%d, N=%d)\n",
          a, ctrl, b, res, rf->zero, rf->neg);
      return res;
    }
    
    • Use DBG(...) (from lmp.h) for runtime tracing without printf overhead.
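
The same pattern extends to the pending shift instructions. The sketch below shows one way LSL and LSR could behave on 16-bit operands; alu_lsl and alu_lsr are hypothetical helpers, and returning 0 for shift amounts of 16 or more is a design assumption rather than documented LMP behavior.

```c
#include <stdint.h>

/* Logical shift left; amounts of 16 or more shift every bit out. */
uint16_t alu_lsl(uint16_t a, uint16_t amount)
{
    if (amount >= 16)
        return 0;
    /* Widen first so the shift happens on an unsigned 32-bit value. */
    return (uint16_t)((uint32_t)a << amount);
}

/* Logical shift right; zero-fills from the left. */
uint16_t alu_lsr(uint16_t a, uint16_t amount)
{
    if (amount >= 16)
        return 0;
    return (uint16_t)((uint32_t)a >> amount);
}
```

Checking the amount before shifting keeps the result well defined for any input, since C leaves shifts by the operand width or more undefined.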

3. Updating Tests and Demo Programs

  1. Add or update a unit test in tests/test_alu.c:
    #include "alu.h"
    #include "regfile.h"
    #include <assert.h>
    
    void test_and_operation(void) {
      regfile_t rf = {0};
      uint16_t r = alu_resolve(0xF0F0, 0x0FF0, ALU_AND, &rf);
      assert(r == 0x00F0);
      assert(!rf.zero);
    }
    
    int main(void) {
      test_and_operation();
      return 0;
    }
    
  2. Extend demo programs under demos/ or examples/:
    • Encode your new instruction in assembly.
    • Show expected register values.
    • Use DBG() to print intermediate states.

4. Coding Conventions & Pull-Request Checklist

  • Indentation: 4 spaces, no tabs.
  • Braces: K&R style (opening brace on same line).
  • Naming:
    • Functions, variables: lower_snake_case
    • Types: UpperCamelCase or *_t suffix
    • Macros, enums: UPPER_SNAKE_CASE
  • Headers: include guards or #pragma once; document public APIs with brief comments.
  • Tests: cover edge cases; run make test until all pass.
  • Formatting: run make format (uses clang-format).

Pull-Request Checklist:

  • Code compiles without warnings (-Wall -Werror).
  • All new functionality has tests and documentation updates.
  • You ran make format and committed formatting changes.
  • Debug code removed or guarded by DBG() macro.
  • Descriptive commit messages referencing issue numbers.
  • CI passes on your branch.