ayzg/cand-lang-proto Documentation - Complete Guide & API Reference

Project Overview

Cand implements the C& Practical Programming Language as a prototype compiler suite in C++. It serves as a testbed for language design, compiler architecture and tooling. This repository houses multiple sub-projects, documentation and test suites to help experimenters explore, extend and contribute to the language.

Goals

Define and iterate on C& language syntax and semantics
Prototype compiler backends and frontends in Visual Studio
Validate features through automated tests and playground code
Document language constructs for users and contributors

Major Sub-Projects

Candi_Compiler
The primary compiler prototype. Included in Candi_Compiler.sln, it builds Debug/Release for x86 and x64. Use Visual Studio or MSBuild to compile and explore the core parsing, semantic analysis and code generation modules.
cand_official_compiler
A standalone CLI compiler implementing stable language features. Produces object code or bytecode for supported targets.
tests
Unit and integration tests cover lexer, parser, type checker and codegen. Run via CTest or your preferred test runner to validate compiler changes.
scrapbook
A sandbox project for rapid prototyping of new language features, experiments and performance benchmarks without affecting the main compiler.

Quick Build & Run

# Clone and enter repo
git clone https://github.com/ayzg/cand-lang-proto.git
cd cand-lang-proto

# Build all projects (requires Visual Studio/MSBuild)
msbuild Candi_Compiler.sln /p:Configuration=Release /p:Platform=x64

# Run official compiler on a sample file
bin/candc -i samples/hello.cand -o hello.bin

# Execute tests
cd tests
ctest -C Release

Who Should Use This Repo

Experimenters exploring new programming‐language features
Language enthusiasts studying compiler internals
Prospective contributors aiming to shape C& design and implementation

Getting Started

This guide takes you from a clean checkout to a successful build of the compiler and its tests, then runs a five-minute “first compilation” on a sample .candi script.

Prerequisites

Windows 10 or later
Visual Studio 2022 (v143 toolset) with “Desktop development with C++”
Git
(Optional) MSBuild CLI (msbuild.exe)

1. Clone the Repository

git clone https://github.com/ayzg/cand-lang-proto.git
cd cand-lang-proto

2. Restore Packages

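Visual Studio restores NuGet packages (Google Test v1.8.1.7) automatically on build. To restore from the command line (if the projects reference packages through packages.config, also pass /p:RestorePackagesConfig=true, which needs MSBuild 16.5 or later):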

msbuild Candi_Compiler.sln /t:Restore

3. Build the Solution

In Visual Studio

Open Candi_Compiler.sln
Select Debug | x64 (or Release | x64)
Build → Build Solution (Ctrl + Shift + B)

On the Command Line

msbuild Candi_Compiler.sln /p:Configuration=Debug /p:Platform=x64 /m

Binaries land in each project’s bin\$(Configuration)\$(Platform) folder.

4. Run the Test Suite

In Visual Studio

Test → Run All Tests (Test Explorer)

On the Command Line

vstest.console.exe cand_official_compiler_tests\bin\Debug\x64\cand_official_compiler_tests.dll

You should see all Google Test cases passing.

5. First Compilation: “Hello, Cand!”

Create a minimal script hello.candi:

fn main() {
  println("Hello, Cand!");
}

Invoke the official compiler executable:

cd cand_official_compiler\bin\Debug\x64
.\cand_official_compiler.exe --input ../../../hello.candi

Verify output on stdout:

Program AST:
ROOT
  FN_DECL (main)
    PARAM_LIST
    BLOCK
      CALL_EXPR (println)
        LITERAL ("Hello, Cand!")

This confirms lexing, parsing, and AST generation work end-to-end.

You’re now set to explore the compiler internals, add new language features, or integrate code generation.

Cand Language Guide

A beginner’s tour of Cand syntax and semantics. Every concept links to compiler constants in Candi_Compiler for quick cross-reference.

Keywords

Reserved words for declarations, control flow and modifiers.
Each keyword maps to a string constant in cand_constants.hpp and a token kind in cand_syntax.hpp.

• def
  • Constant: cand_constants::KW_DEF
  • Token: syntax::tk::def_
  • Declares functions and variables
• if / else
  • Constants: cand_constants::KW_IF, cand_constants::KW_ELSE
  • Tokens: syntax::tk::if_, syntax::tk::else_
• while
  • Constant: cand_constants::KW_WHILE
  • Token: syntax::tk::while_
• for
  • Constant: cand_constants::KW_FOR
  • Token: syntax::tk::for_
• return
  • Constant: cand_constants::KW_RETURN
  • Token: syntax::tk::return_
• break, continue
  • Constants: cand_constants::KW_BREAK, cand_constants::KW_CONTINUE
  • Tokens: syntax::tk::break_, syntax::tk::continue_

Example:

def max(a: int, b: int) -> int {
    if a > b {
        return a
    } else {
        return b
    }
}

Basic Types

Built-in and meta types. Represented in AST as astnode_enum::atype_.
• int (cand_constants::TYPE_INT)
• float (cand_constants::TYPE_FLOAT)
• bool (cand_constants::TYPE_BOOL)
• string (cand_constants::TYPE_STRING)
• auto (cand_constants::TYPE_AUTO) – type inference

Example:

def greet(name: string) {
    print("Hello, " + name)
}

Literals

Token kinds in cand_syntax.hpp:
• Number: syntax::tk::number_literal_
• String: syntax::tk::string_literal_
• Char: syntax::tk::char_literal_
• Boolean: syntax::tk::bool_literal_

Example:

def config() {
    port: int = 8080           # number_literal_
    pi: float = 3.1415         # number_literal_
    flag: bool = true          # bool_literal_
    msg: string = "Cand ✓"     # string_literal_
    letter: char = 'C'         # char_literal_
}

Operators

Defined in cand_constants.hpp (e.g. OP_PLUS, OP_LT_EQ) and tokenized as syntax::tk.

Arithmetic
• + (OP_PLUS, syntax::tk::plus_)
• - (OP_MINUS, syntax::tk::minus_)
• *, /, %

Comparison
• ==, != (OP_EQ, OP_NE)
• <, <=, >, >=

Logical
• &&, ||, !

Assignment & Compound
• = (OP_ASSIGN)
• +=, -=, *=

Example:

def update(x: int) {
    x += 5       # uses OP_PLUS_ASSIGN
    if (x % 2) == 0 && x > 10 {
        print("even and large")
    }
}

Control Structures

Block-structured constructs using braces {}.

If / Else

if condition {
    // true-branch
} else {
    // false-branch
}

While

while counter < 10 {
    counter += 1
}

For

for i in 0..5 {
    print(i)
}

Break / Continue
• Use break to exit loops (KW_BREAK)
• Use continue to skip to next iteration (KW_CONTINUE)

Directives

Preprocessor-style commands recognized by the lexer via DIRECTIVE_… constants in cand_constants.hpp.

• #include (cand_constants::DIRECTIVE_INCLUDE)
• #define (cand_constants::DIRECTIVE_DEFINE)
• #pragma (cand_constants::DIRECTIVE_PRAGMA)

Example:

#include "std.cand"    # loads standard library
#pragma optimize(on)  # enables optimizations

Macros

Simple text substitution via #define. The lexer recognizes the directive, and the parser records macro definitions in the AST.

Example:

#define MAX(a, b) ((a) > (b) ? (a) : (b))

def test() {
    x: int = MAX(3, 7)   # expands to ((3) > (7) ? (3) : (7))
    print(x)             # outputs 7
}

Cross-Reference

cand_constants.hpp holds all string literals for keywords, operators, delimiters and directives.
cand_syntax.hpp defines token kinds (syntax::tk) and AST node enums (astnode_enum).
Use syntax::get_keyword_kind(str) and syntax::get_node_priority() for parser logic.

This guide maps Cand’s surface syntax to compiler constants and AST types, enabling quick lookups and reliable tooling.
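
The mapping from keyword spelling to token kind can be sketched as a small lookup table. This is a minimal, self-contained illustration only: TokenKind and keyword_kind are hypothetical stand-ins, not the actual syntax::tk enum or the real signature of syntax::get_keyword_kind.

```cpp
#include <array>
#include <optional>
#include <string_view>
#include <utility>

// Hypothetical token kinds; the real enum lives in cand_syntax.hpp.
enum class TokenKind { def_, if_, else_, while_, for_, return_, break_, continue_ };

// Hypothetical lookup: returns the keyword's token kind, or nullopt for
// anything that is not a reserved word (for example an identifier).
inline std::optional<TokenKind> keyword_kind(std::string_view text) {
    static constexpr std::array<std::pair<std::string_view, TokenKind>, 8> table{{
        {"def", TokenKind::def_},     {"if", TokenKind::if_},
        {"else", TokenKind::else_},   {"while", TokenKind::while_},
        {"for", TokenKind::for_},     {"return", TokenKind::return_},
        {"break", TokenKind::break_}, {"continue", TokenKind::continue_},
    }};
    for (const auto& [spelling, kind] : table) {
        if (spelling == text) return kind;
    }
    return std::nullopt;
}
```

A lexer built this way would first scan an identifier-shaped word, then call keyword_kind on it and fall back to an identifier token when the result is empty.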

Compiler Architecture

This section details the key stages of the Candi compiler and the primary data structures each stage manipulates. Contributors will gain insight into token navigation, expression parsing, AST construction, macro processing, compile‐time environments, and error reporting.

Token Cursor (`tk_cursor`): Navigating and Inspecting Token Streams

Purpose
Provide a lightweight, iterator‐like API over a vector of tokens (tk_vector) that simplifies lookahead, peeking, advancing, and token‐property queries in your parser.

Core Functionality

Wraps a pair of const-iterators (begin, end) and a current position it_.
Returns an EOF token for out‐of‐bounds via get().
Exposes literal, type, line, column, precedence and associativity.
Converts the current token to an AST statement node with to_statement().

Key Methods

const tk& get() const
tk_enum type() const / bool type_is(tk_enum t) const
sl_u8string lit() const
sl_size line() const, sl_size column() const
tk_cursor& advance(int n = 1) / tk_cursor next(int n = 1) const
const tk& peek(int n = 0) const
tk_cursor jump_to(tk_vector_cit new_cursor) / advance_to(...)
int priority(), syntax::e_assoc associativity()
astnode to_statement()
bool is_keyword_type() const

Practical Example: Parsing a Binary Expression

// tokens: [number_literal("3"), addition_("+"), number_literal("4"), eof_]
tk_cursor cur(tokens.cbegin(), tokens.cend());
if (cur.type_is(tk_enum::number_literal_)) {
    auto lhs = cur.get().literal();  // "3"
    cur.advance();                   // now at '+'
    if (cur.type_is(tk_enum::addition_)) {
        // Query operator metadata while the cursor is still on '+'
        int prec = cur.priority();
        auto assoc = cur.associativity();
        cur.advance();               // now at second number
        auto rhs = cur.get().literal();  // "4"
        // build AST node...
    }
}

Lookahead & Matching

// Pattern: identifier '=' identifier
if (cur.peek(0).type_is(tk_enum::alnumus_) &&
    cur.peek(1).type_is(tk_enum::simple_assignment_) &&
    cur.peek(2).type_is(tk_enum::alnumus_)) {
    // assignment statement
}

Advanced Usage: Backtracking

auto saved = cur;       // copy
cur.advance(3);
if (!cur.type_is(tk_enum::semicolon_)) {
    cur = saved;        // rewind
}
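
To make the cursor behaviour described in this section concrete, here is a minimal standalone sketch of the same idea: an iterator pair plus an EOF sentinel for out-of-bounds reads. Kind, Token and Cursor are hypothetical simplifications and do not reproduce the real tk, tk_enum or tk_cursor definitions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified token; the real tk carries more metadata.
enum class Kind { number, plus, semicolon, eof };
struct Token {
    Kind kind = Kind::eof;
    std::string literal;
};

// Minimal cursor over a token vector. Reads past the end yield a shared
// EOF sentinel instead of dereferencing an invalid iterator.
class Cursor {
public:
    using Iter = std::vector<Token>::const_iterator;
    Cursor(Iter begin, Iter end) : it_(begin), end_(end) {}

    const Token& get() const { return peek(0); }

    // Look n tokens ahead without moving the cursor.
    const Token& peek(std::size_t n) const {
        static const Token eof{};  // default-constructed token is EOF
        const auto remaining = static_cast<std::size_t>(end_ - it_);
        return n < remaining ? *(it_ + static_cast<std::ptrdiff_t>(n)) : eof;
    }

    bool type_is(Kind k) const { return get().kind == k; }

    // Move forward by n tokens, clamping at the end of the stream.
    Cursor& advance(std::size_t n = 1) {
        const auto remaining = static_cast<std::size_t>(end_ - it_);
        it_ += static_cast<std::ptrdiff_t>(n < remaining ? n : remaining);
        return *this;
    }

private:
    Iter it_;
    Iter end_;
};
```

Because the cursor only holds two iterators, copying it is cheap, which is what makes the save-and-rewind backtracking pattern practical.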

Expression Parsing with Parenthesizer

Purpose
Transform a flat token stream into a fully‐parenthesized AST using a two‐phase API:

Parenthesizer rewrites tokens, injecting parentheses for precedence/associativity.
Recursive descent (parse_expression_impl) consumes the parenthesized list to build the AST.

Basic Usage

#include "parenthesizer.hpp"
#include "cand_syntax.hpp"

// tokenize input
std::vector<tk> tokens = tokenize("1 + 2 * 3");

// single‐call parse_expression hides both phases
auto root = caoco::parse_expression(tokens.cbegin(), tokens.cend());
// root now represents (1 + (2 * 3))

Direct Parenthesizer Invocation

auto p = caoco::parenthesizer(tokens.cbegin(), tokens.cend());
std::vector<tk> parenTokens = p.parenthesize_expression();
// ["(", "1", "+", "(", "2", "*", "3", ")", ")"]

Tips

Mismatched scopes trigger exceptions from tk_scope::find_paren.
Multi‐character operators carry correct priority/assoc metadata.
Function calls appear as postfix operators and wrap their arguments.
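
The effect of full parenthesization can be illustrated with a short standalone sketch. Note that this uses plain precedence climbing over string tokens, which is a swapped-in technique chosen for brevity, not the token-rewriting algorithm of the actual Parenthesizer; the function names and the precedence values are assumptions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Binding strength of the binary operators this sketch understands;
// 0 means "not an operator".
inline int precedence(const std::string& op) {
    if (op == "+" || op == "-") return 1;
    if (op == "*" || op == "/") return 2;
    return 0;
}

// Precedence climbing over operands and left-associative binary operators.
// Consumes tokens starting at pos and returns the parenthesized text.
// Throws std::out_of_range if an operator has no right operand.
inline std::string parenthesize(const std::vector<std::string>& tokens,
                                std::size_t& pos, int min_prec = 1) {
    std::string lhs = tokens.at(pos++);  // operand
    while (pos < tokens.size()) {
        const std::string op = tokens[pos];
        const int prec = precedence(op);
        if (prec < min_prec) break;  // also stops on non-operators (prec 0)
        ++pos;
        // Left associativity: the right side may only absorb tighter operators.
        std::string rhs = parenthesize(tokens, pos, prec + 1);
        lhs = "(" + lhs + " " + op + " " + rhs + ")";
    }
    return lhs;
}
```

Calling parenthesize on the tokens 1 + 2 * 3 with pos starting at 0 returns "(1 + (2 * 3))", and 8 - 3 - 2 returns "((8 - 3) - 2)".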

AST Node Construction Patterns

Purpose
Show how to instantiate caoco::astnode for leaf nodes, literals, and composite nodes.

Leaf node (no literal or children)

caoco::astnode eof_node;                      // type = eof_
caoco::astnode invalid_node(astnode_enum::invalid_);

Literal‐only node

caoco::astnode str_lit(astnode_enum::string_literal_, u8"\"hello world\"");
auto beg = tokens.cbegin() + startIdx;
auto end = tokens.cbegin() + endIdx;
caoco::astnode num_lit(astnode_enum::number_literal_, beg, end);

Node with explicit children vector

caoco::astnode lhs(astnode_enum::alnumus_, u8"a");
caoco::astnode rhs(astnode_enum::alnumus_, u8"b");
sl::sl_vector<caoco::astnode> operands{lhs, rhs};
caoco::astnode add_expr(astnode_enum::addition_, u8"+", operands);

Variadic‐children constructor

caoco::astnode stmt1(astnode_enum::statement_, u8"doSomething();");
caoco::astnode stmt2(astnode_enum::statement_, u8"return;");
caoco::astnode block(astnode_enum::functional_block_, u8"{…}", stmt1, stmt2);

Combining token ranges and variadic children

auto parenBeg = tokens.cbegin() + i;
auto parenEnd = tokens.cbegin() + j;
caoco::astnode mul(astnode_enum::multiplication_, u8"*",
                   caoco::astnode(astnode_enum::alnumus_, u8"x"),
                   caoco::astnode(astnode_enum::number_literal_, u8"2"));
caoco::astnode paren_expr(astnode_enum::LRScope, parenBeg, parenEnd, mul);

Guidance

Use token‐range constructors to reflect exact input text.
Assemble subtrees with vector or variadic constructors.
Choose specific e_type to aid downstream analyses.
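
The three constructor shapes above (literal-only, explicit children vector, variadic children) can be sketched in a standalone type. Node, NodeType and to_text are hypothetical names for illustration; the real caoco::astnode also supports token-range constructors, which this sketch omits.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical node kinds standing in for astnode_enum.
enum class NodeType { identifier, number_literal, addition, multiplication };

struct Node {
    NodeType type;
    std::string literal;
    std::vector<Node> children;

    // Leaf or literal-only node.
    Node(NodeType t, std::string lit) : type(t), literal(std::move(lit)) {}

    // Node with an explicit children vector.
    Node(NodeType t, std::string lit, std::vector<Node> kids)
        : type(t), literal(std::move(lit)), children(std::move(kids)) {}

    // Variadic form: one or more child nodes, stored in argument order.
    template <typename... Rest>
    Node(NodeType t, std::string lit, Node first, Rest... rest)
        : type(t), literal(std::move(lit)) {
        children.reserve(1 + sizeof...(rest));
        children.push_back(std::move(first));
        (children.push_back(std::move(rest)), ...);
    }
};

// Renders a subtree as an s-expression, e.g. "(+ a (* x 2))".
inline std::string to_text(const Node& n) {
    if (n.children.empty()) return n.literal;
    std::string out = "(" + n.literal;
    for (const Node& child : n.children) out += " " + to_text(child);
    return out + ")";
}
```

The variadic constructor keeps call sites compact for fixed-arity operators, while the vector overload suits cases where the number of children is only known at run time, such as statement blocks.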

Macro Expansion (`macro_expander.hpp`)

Purpose
Process #macro … #endmacro directives in a token stream, record macro bodies, and inline expansions before parsing.

Core API

sl_tuple<tk_vector, bool, sl_string> macro_expand(
    tk_vector code,
    sl_string source_file
);

Returns:

std::get<0>: expanded tokens
std::get<1>: success flag
std::get<2>: error message if failure

Workflow

Scan with tk_cursor.
On #macro IDENT, collect until #endmacro.
Store body in std::map<sl_u8string, tk_vector>.
Inline bodies on matching identifiers.
Report redefinitions or syntax errors.

Example

const char* src = R"(
#macro PI
3.14159
#endmacro
double area = PI * r * r;
)";
auto token_stream = tokenizer(src, src + strlen(src))();
auto tokens = token_stream.extract();
auto [expanded, ok, err] = caoco::macro_expand(tokens, "circle.cal");
if (!ok) throw std::runtime_error(err);
// expanded now holds tokens for "double area = 3.14159 * r * r;"

Guidance

Expand macros after include resolution, before grammar parsing.
Enforce one definition per name.
Macro bodies support any tokens but no parameters.
On error, log the returned message immediately.
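
The workflow above can be sketched over plain string tokens. This is an illustrative approximation, not the code in macro_expander.hpp: expand_macros and the Tokens alias are hypothetical, the error strings are invented for the sketch, and macro bodies are inlined as-is without nested expansion.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

using Tokens = std::vector<std::string>;

// Returns {expanded tokens, success flag, error message}.
inline std::tuple<Tokens, bool, std::string> expand_macros(const Tokens& code) {
    std::map<std::string, Tokens> macros;
    Tokens out;
    for (std::size_t i = 0; i < code.size(); ++i) {
        if (code[i] == "#macro") {
            if (i + 1 >= code.size()) return {out, false, "#macro without a name"};
            const std::string name = code[i + 1];
            // Enforce one definition per name.
            if (macros.count(name) != 0) return {out, false, "macro redefined: " + name};
            Tokens body;
            i += 2;
            while (i < code.size() && code[i] != "#endmacro") body.push_back(code[i++]);
            if (i == code.size()) return {out, false, "missing #endmacro for " + name};
            macros.emplace(name, std::move(body));  // i now sits on #endmacro
        } else if (auto found = macros.find(code[i]); found != macros.end()) {
            // Inline the recorded body in place of the identifier.
            out.insert(out.end(), found->second.begin(), found->second.end());
        } else {
            out.push_back(code[i]);
        }
    }
    return {out, true, ""};
}
```

Returning the success flag and message alongside the tokens lets the caller decide whether to log, throw, or continue, which matches the tuple-style result described for macro_expand.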

rtenv: Managing Variable Scopes and Lookups

Purpose
Offer a hierarchical environment (rtenv) for compile‐time constant evaluation, with nested scopes, shadowing, and parent fallback.

Essential Methods

add_subenv(name)
create_variable(name, value)
resolve_variable(name) const
get_variable(name)
set_variable(name, value)
delete_variable(name)

Example

#include "constant_evaluator.hpp"
using namespace caoco;

// Root environment
rtenv global_env("global");
auto x_res = global_env.create_variable("x", RTValue{RTValue::NUMBER, 42});

// Nested function scope
rtenv& func_env = global_env.add_subenv("myFunction");
auto y_res = func_env.create_variable("y", RTValue{RTValue::NUMBER, 100});

// Resolve and shadow
auto rx = func_env.resolve_variable("x");       // parent
auto x2 = func_env.create_variable("x", RTValue{RTValue::NUMBER, 7});
auto rx2 = func_env.resolve_variable("x");      // shadowed

// Update and delete
auto set_res = func_env.set_variable("x", RTValue{RTValue::NUMBER, 99});
bool delY = func_env.delete_variable("y");

Guidance

Check .valid() before accessing .value().
create_variable fails on redeclaration in the same scope.
Shadow parents by declaring the same name in child.
delete_variable removes only from its declaring scope.
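
The scoping rules listed above (parent fallback, shadowing, failure on same-scope redeclaration) can be sketched with a small standalone class. Env is a hypothetical simplification: it stores plain double values and returns bool or std::optional, whereas the real rtenv works with RTValue and result objects that expose .valid() and .value().

```cpp
#include <map>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

class Env {
public:
    explicit Env(std::string name, const Env* parent = nullptr)
        : name_(std::move(name)), parent_(parent) {}

    const std::string& name() const { return name_; }

    // Child scopes are owned by their parent and keep stable addresses.
    Env& add_subenv(const std::string& name) {
        children_.push_back(std::make_unique<Env>(name, this));
        return *children_.back();
    }

    // Fails (returns false) on redeclaration in this same scope.
    bool create_variable(const std::string& name, double value) {
        return vars_.emplace(name, value).second;
    }

    // Looks in this scope first, then walks up through the parents.
    std::optional<double> resolve_variable(const std::string& name) const {
        for (const Env* env = this; env != nullptr; env = env->parent_) {
            if (auto it = env->vars_.find(name); it != env->vars_.end()) return it->second;
        }
        return std::nullopt;
    }

    // Removes the variable only if this scope declared it.
    bool delete_variable(const std::string& name) { return vars_.erase(name) > 0; }

private:
    std::string name_;
    const Env* parent_;
    std::map<std::string, double> vars_;
    std::vector<std::unique_ptr<Env>> children_;
};
```

In this sketch, deleting a shadowing variable from the child scope makes the parent's binding visible again on the next lookup.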

Error Message Generators in Candi Compiler

Purpose
Provide reusable lambdas that produce consistent error messages during tokenization and parsing. Namespaces: caoco::ca_error (generic) and compiler_error (project‐specific).

Tokenization Error Helpers (namespace caoco::ca_error::tokenizer)

invalid_char(sl_size line, sl_size col, char8_t c, sl_string error="")
lexer_syntax_error(...)
programmer_logic_error(...)

Parser Error Helpers (namespace caoco::ca_error::parser)

programmer_logic_error(astnode_enum type, tk_vector_cit loc, sl_string msg="")
operation_missing_operand(astnode_enum type, tk_vector_cit loc, sl_string msg="")
invalid_expression(tk_vector_cit loc, sl_string msg="")

Example

if (!isValidChar(c)) {
    auto msg = caoco::ca_error::tokenizer::invalid_char(line, col, c, "allowed: alphanumerics or '_'");
    throw std::runtime_error(msg);
}

if (current.type_is(tk_enum::addition_) && nextIsNotExpression()) {
    auto err = compiler_error::parser::operation_missing_operand(
        astnode_enum::addition_, token_it, "right operand expected"
    );
    throw std::logic_error(err);
}

Guidelines

Throw or log messages immediately upon detection.
Use programmer_logic_error for internal invariants and the others for user errors.
Include error_location->literal_str() for clarity.
using namespace compiler_error::parser; can simplify calls.
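
A message generator of the kind described here can be sketched as a namespace-scope lambda. The namespace sketch_error, the message wording and the use of plain char and std::string are assumptions made for this illustration; the real helpers take char8_t and sl_string and may format their output differently.

```cpp
#include <cstddef>
#include <string>

namespace sketch_error {

// Hypothetical generator: formats a tokenizer error with its source position.
// The optional detail string is appended after a colon when present.
inline const auto invalid_char =
    [](std::size_t line, std::size_t col, char c, const std::string& detail = "") {
        std::string msg = "[tokenizer] invalid character '";
        msg += c;
        msg += "' at line " + std::to_string(line) + ", column " + std::to_string(col);
        if (!detail.empty()) msg += ": " + detail;
        return msg;
    };

}  // namespace sketch_error
```

Keeping the formatting in one place means every call site reports line and column in the same order, so messages stay easy to search for in logs and test output.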

Extending the Compiler & Language

This section guides you through hacking on the Cand compiler: applying coding standards, using file templates, and a worked example of adding a new operator from tokenizer through constant evaluation.

Coding Standards & File Templates

All new headers and sources must follow the templates in
cand_official_scrapbook/code_file_templates.h and
cand_official_scrapbook/code_file_templates_cand.h.
Adhere to naming rules in cand_official_scrapbook/standard_naming_rules.h:
• No leading underscores in public identifiers
• No double underscores anywhere
• Avoid std:: namespace collisions
Fill in metadata (project, author, license), section markers, and Doxygen-style comments per template.

Example Header Skeleton

//===- PowExpr.h - AST for '^' operator ---------------------*- C++ -*-===//
//
// Part of the Cand language prototype.
// Licensed under the Apache License v2.0 with LLVM Exceptions.
// See LICENSE.txt for details.
//
//===----------------------------------------------------------------------===//
///
/// \file
/// Declares PowExpr, the AST node for the power operator (^).
//===----------------------------------------------------------------------===//

#ifndef CAND_AST_POWEXPR_H
#define CAND_AST_POWEXPR_H

#include "cand/AST/Expr.h"
#include "llvm/ADT/APSInt.h"

namespace cand {
namespace ast {

/// Represents “lhs ^ rhs” in the AST.
class PowExpr : public BinaryExpr {
public:
  /// lhs ^ rhs
  PowExpr(Expr *lhs, Expr *rhs);

  /// Constant-fold “a ^ b” when both operands are integers.
  llvm::APSInt constantFold() const override;
};

} // namespace ast
} // namespace cand

#endif // CAND_AST_POWEXPR_H

Template Usage

Copy code_file_templates_cand.h to cand/AST/PowExpr.h.
Replace placeholders: <FILENAME>, <DESCRIPTION>, <NAMESPACE>.
Add corresponding PowExpr.cpp using the source‐file template.
Run clang-format to enforce style.

Worked Example: Adding “^” (Power) Operator

1. Tokenizer

In cand/Lexer/Lexer.cpp, map ‘^’ to a new token:

--- a/cand/Lexer/Lexer.cpp
+++ b/cand/Lexer/Lexer.cpp
@@ Token Lexer::nextToken() {
   char C = peek();
   switch (C) {
+  case '^':
+    advance();
+    return makeToken(Token::Caret, "^");
   // existing cases...
   }

Add to cand/Lexer/Token.h:

enum class Token : unsigned {
  // ...
  Caret,    // '^'
  // ...
};

2. Parser

In cand/Parser/Parser.cpp, assign precedence and parse it:

// Precedence table
static const std::map<Token, int> BinaryPrecedence = {
  // ... existing operators ...
  {Token::Caret, 7}, // higher than * / %
};

// In Parser::parsePrimaryExpr() entry:
// ensure parseUnaryExpr → parsePowExpr → parseBinOpRHS chain

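// Note: this loop makes '^' left-associative, so a ^ b ^ c parses as
// (a ^ b) ^ c. Recurse on the right-hand side instead if right
// associativity is wanted.
Expr *Parser::parsePowExpr() {
  Expr *LHS = parseUnaryExpr();
  while (curToken() == Token::Caret) {
    consume(Token::Caret);
    Expr *RHS = parseUnaryExpr();
    LHS = new ast::PowExpr(LHS, RHS);
  }
  return LHS;
}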

// In parseExpression():
// replace call to parseBinOpRHS(parseUnaryExpr(), 0)
// with parseBinOpRHS(parsePowExpr(), 0)

3. AST & Constant Evaluation

Implement the AST node and fold constants:

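// cand/AST/PowExpr.cpp
#include "cand/AST/PowExpr.h"
#include <cstdint>

using namespace cand::ast;

PowExpr::PowExpr(Expr *L, Expr *R)
  : BinaryExpr(OpKind::Pow, L, R) {}

llvm::APSInt PowExpr::constantFold() const {
  // Assumes both operands fold to non-negative integers that fit in 64 bits;
  // this signature has no way to report a non-constant operand.
  uint64_t base = left()->constantFold().getLimitedValue();
  uint64_t exp = right()->constantFold().getLimitedValue();

  // Integer exponentiation by squaring. std::pow computes in double and
  // loses precision above 2^53; overflow here wraps modulo 2^64.
  uint64_t result = 1;
  while (exp != 0) {
    if (exp & 1)
      result *= base;
    base *= base;
    exp >>= 1;
  }
  return llvm::APSInt(llvm::APInt(64, result), /*isUnsigned=*/false);
}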

Register OpKind::Pow in your operator-kind enum (cand/AST/Expr.h) and extend any visitor or IR‐lowering logic accordingly.
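
Constant folding of a power operator has to decide what happens when the result does not fit the target width. The following standalone sketch shows one option, assuming unsigned 64-bit operands: report overflow instead of wrapping. checked_pow is a hypothetical helper and is not part of the compiler or of LLVM.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Computes base^exp for non-negative operands with exponentiation by
// squaring. Returns nullopt when the result would not fit in 64 bits.
inline std::optional<std::uint64_t> checked_pow(std::uint64_t base, std::uint64_t exp) {
    constexpr auto limit = std::numeric_limits<std::uint64_t>::max();
    std::uint64_t result = 1;
    while (exp != 0) {
        if (exp & 1u) {
            // result * base would exceed the limit
            if (base != 0 && result > limit / base) return std::nullopt;
            result *= base;
        }
        exp >>= 1;
        if (exp != 0) {
            // base is squared only if a higher exponent bit still needs it
            if (base != 0 && base > limit / base) return std::nullopt;
            base *= base;
        }
    }
    return result;
}
```

A folder built on this helper could leave the expression unfolded, or emit a diagnostic, whenever the result is empty.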

4. Testing

Create a unit test in tests/power.cand:

const x = 2 ^ 3;
assert(x == 8);

Run ./cand tests/power.cand and confirm that the assertion passes.

Practical Tips

Always update Token.h, Lexer.cpp, Parser.cpp, AST enums, and constant evaluator together.
Follow include-guard conventions: PROJECT_SUBPATH_FILENAME_H.
Document any new grammar rules in /docs/lang.md.
Use clang-format and clang-tidy to enforce standards before commit.

Testing & Validation

cand-lang-proto provides two complementary test suites:

Google-Test unit tests covering tokenizer, parser, AST, preprocessor and runtime.
Scripted integration tests using .candi source files to validate end-to-end compiler behavior.

Google-Test Suites

Candi Compiler Unit Tests

Location

Core tests: Candi_Compiler/test.cpp
Support utilities: Candi_Compiler/unit_test_util.hpp
Official VS project: cand_official_compiler_tests/cand_official_compiler_tests.vcxproj

Building & Running via CMake

mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Debug ..
cmake --build . --target all
ctest --output-on-failure

Running the Test Binary Directly

# After build, assume binary is CandiTests
./CandiTests                     # run all tests
./CandiTests --gtest_filter=Tokenizer*   # run only tokenizer tests

Running in Visual Studio

Open cand_official_compiler_tests.sln.
Build Solution (Debug/Release).
Open Test Explorer → run or filter tests by name.

Core Test Utilities

unit_test_util.hpp streamlines writing parser/tokenizer tests:

test_single_token(input, type, literal)
print_ast(astnode)
compare_ast(actual, expected)
test_parsing_function(name, functor, it_begin, it_end)
test_and_compare_parsing_function_from_u8(name, functor, expected_ast, source_u8)

Example: Adding a Parser Unit Test

#include "unit_test_util.hpp"

TEST(Parser, ParseSimpleFunction) {
  // Build expected AST for: int foo() { return 42; }
  caoco::astnode expected("FunctionDecl", "foo");
  expected.add_child(caoco::astnode("ParamList",""));
  expected.add_child(caoco::astnode("Body","")
    .add_child(caoco::astnode("ReturnStmt","")
      .add_child(caoco::astnode("NumberLiteral","42"))
    )
  );

  bool ok = test_and_compare_parsing_function_from_u8(
    u8"SimpleFunction",
    caoco::parse_function_definition,
    expected,
    u8"int foo() { return 42; }"
  );
  ASSERT_TRUE(ok);
}

Scripted Integration Tests (.candi)

Place end-to-end test cases under tests/integration/:

tests/integration/*.candi — source files
tests/integration/*.out — expected stdout

Basic runner (bash):

#!/usr/bin/env bash
COMPILER=./candi        # path to built compiler
for src in tests/integration/*.candi; do
  exe=${src%.candi}.exe
  out=${src%.candi}.out
  echo "=== Testing $(basename "$src") ==="
  $COMPILER -o "$exe" "$src"     # compile
  if [ $? -ne 0 ]; then
    echo "Compilation failed"; exit 1
  fi
  "$exe" > actual.out
  diff -u "$out" actual.out || exit 1
done
echo "All integration tests passed."

Running all integration tests:

chmod +x scripts/run_integration_tests.sh
./scripts/run_integration_tests.sh

Adding New Tests

Unit Test (Google-Test)
- In Candi_Compiler/test.cpp or new .cpp under that folder.
- Include unit_test_util.hpp for parser/tokenizer helpers.
- Register your test with TEST(SuiteName, CaseName).
- Rebuild and run via CTest or VS Test Explorer.
VS Project Test
- Drop new headers/implementation under cand_official_compiler_tests/ut/....
- Add the file to cand_official_compiler_tests.vcxproj.
- Include <gtest/gtest.h> and write TEST(...) cases.
- Build and run via Visual Studio.
Integration Test (.candi)
- Add foo.candi to tests/integration/.
- Create foo.out with expected stdout.
- Verify compilation and runtime behavior via the integration runner.
- Commit both .candi and .out files.

With these suites and helpers in place, you can confidently validate new features, catch regressions early, and extend Candi’s language support.

Contributing & Development Workflow

This project is archived but welcomes pull requests, issue reports, and design discussions. Use the following guidelines to contribute effectively.

Project Governance

Repository status: Archived. No active roadmap or release schedule.
Pull requests: Welcome. Maintainers review on a best-effort basis.
Issue tracker: All historic issues transferred; file new issues for bugs or proposals.

Branching Model

Adopt a simple Git Flow-inspired workflow:

main
Stable snapshots. Merges only from develop or hotfix branches.
develop
Integration branch for features. Contributors target this branch.
feature/XYZ
Short-lived feature branches off develop. Name using kebab-case, e.g., feature/new-parser.
hotfix/XYZ
Branch off main to address critical bugs. After merge, tag a new release and merge back into develop.

Workflow example:

# Start a new feature
git checkout develop
git pull origin develop
git checkout -b feature/add-expression-evaluator

# Work, then open PR to develop
git push -u origin feature/add-expression-evaluator

Code Style & Naming Conventions

Follow the rules in standard_naming_rules.h to avoid reserved-identifier conflicts:

Identifier Rules

Do not start any identifier with _ or __.
Do not reuse names from the std namespace or add declarations to it.
Use PascalCase for types, enums, and classes.
Use camelCase for functions, methods, and variables.
Constants use ALL_CAPS with underscores.

Valid Examples

// Types and enums
class TokenStream { /* ... */ };
enum ParseMode { ParseModeStrict, ParseModeLenient };

// Functions and variables
void tokenizeInput(const std::string& source);
int parseExpression(int precedenceLevel);

// Constants
static constexpr int MAX_TOKENS = 1024;

Invalid Examples

// Reserved or non-compliant
class _InternalParser;         // Leading underscore
int __compute;                 // Double underscore
using namespace std;           // Pulls every std name into scope (collision risk)
enum std::ErrorCode { Err };   // Defining in std namespace
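
The two underscore rules can be checked mechanically. The helper below is a hypothetical sketch for illustration, not a tool shipped with the repository, and it covers only the underscore rules, not the casing conventions.

```cpp
#include <string_view>

// True when the name breaks either underscore rule from the identifier list:
// a leading underscore, or a double underscore anywhere in the name.
inline bool violates_underscore_rules(std::string_view name) {
    if (!name.empty() && name.front() == '_') return true;
    return name.find("__") != std::string_view::npos;
}
```

For example, _InternalParser and my__var are rejected, while MAX_TOKENS and tokenizeInput pass.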

Design Proposals & Discussions

Use GitHub’s issue tracker for high-level proposals or design discussions:

Search existing issues to avoid duplication.
Create a new issue with the “proposal” label and a clear title.
Describe motivation, API sketches, and potential impact.

For ad-hoc chat, use the “Discussions” tab (if enabled), or tag maintainers in your issue.
