Skip to content

Aditya-Experiments/syntax-analysis-detection-system

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

🔍 SyntaxScope — Syntax Analysis Detection System

A comprehensive, interactive syntax analysis detection system demonstrating compiler design principles

JavaScript HTML5 CSS3 Compiler Design


📋 Table of Contents


🎯 Overview

SyntaxScope is an interactive web-based project that demonstrates the design and implementation of a syntax analysis detection system — a core component of compiler design. It takes user-provided arithmetic expressions or code snippets and analyzes them to verify whether they follow the rules of a defined context-free grammar (CFG).

What is Syntax Analysis?

Syntax analysis (parsing) is the second phase of a compiler. It takes a stream of tokens from the lexical analyzer and determines whether the token sequence can be generated by the grammar of the source language. The parser constructs a parse tree that represents the hierarchical syntactic structure of the program.

Source Code → [Lexer] → Tokens → [Parser] → Parse Tree → [Semantic Analyzer] → AST → [Code Gen] → Output
                                    ↑
                            You are here!

Lexical Analysis vs. Syntax Analysis

Aspect Lexical Analysis Syntax Analysis
Phase First phase Second phase
Input Character stream Token stream
Output Tokens Parse tree / AST
Grammar Regular expressions Context-Free Grammar
Errors Invalid characters Structural errors
Tool Finite Automaton Pushdown Automaton

✨ Features

  • 🔄 Three Parser Implementations: Recursive Descent, LL(1), and LR(0)
  • 📝 Interactive Tokenizer: Real-time lexical analysis with token visualization
  • 🌳 Parse Tree Visualization: Canvas-based tree rendering with color-coded nodes
  • Syntax Error Detection: Meaningful error messages with position indicators
  • 📊 Step-by-Step Tracing: Watch the parsing process unfold step by step
  • 🧪 Automated Testing: 15+ built-in test cases (valid and invalid)
  • 📱 Responsive Design: Works on desktop, tablet, and mobile
  • 🎨 Premium Dark Theme: Modern glassmorphism UI with animations

🏗️ Architecture

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│   Input      │────▶│   Lexer      │────▶│   Parser     │
│  Expression  │     │  (Tokenizer) │     │  (RD/LL1/LR) │
└──────────────┘     └──────────────┘     └──────┬───────┘
                                                  │
                    ┌──────────────┐     ┌────────▼───────┐
                    │  Tree        │◀────│   AST          │
                    │  Renderer    │     │   (Parse Tree) │
                    └──────────────┘     └────────────────┘

📐 Grammar Definition

Context-Free Grammar (CFG)

A formal grammar G = (V, Σ, P, S) where:

  • V (Variables): {Expr, Term, Factor, Primary}
  • Σ (Terminals): {NUMBER, ID, +, -, *, /, ^, (, )}
  • S (Start symbol): Expr

Production Rules

1.  Expr    → Expr + Term          (Addition)
2.  Expr    → Expr − Term          (Subtraction)
3.  Expr    → Term                 (Base case)
4.  Term    → Term * Factor        (Multiplication)
5.  Term    → Term / Factor        (Division)
6.  Term    → Factor               (Base case)
7.  Factor  → Primary ^ Factor     (Exponentiation, right-assoc)
8.  Factor  → Primary              (Base case)
9.  Primary → NUMBER               (Number literal)
10. Primary → IDENTIFIER           (Variable)
11. Primary → ( Expr )             (Grouping)
12. Primary → − Primary            (Unary negation)

Operator Precedence (low → high)

Level Operators Associativity
1 +, - Left
2 *, / Left
3 ^ Right
4 Unary - Right (prefix)

🔧 Parser Implementations

1. Recursive Descent Parser (Top-Down)

A hand-written parser where each non-terminal in the grammar has a corresponding function. The parser processes input from left to right and constructs a leftmost derivation.

function parseExpr() {
    let node = parseTerm();
    while (token === '+' || token === '-') {
        let op = token;
        advance();
        node = { type: 'BinaryExpr', op, left: node, right: parseTerm() };
    }
    return node;
}

Pros: Easy to implement, excellent error messages, mirrors grammar structure
Cons: Cannot handle left-recursive grammars directly

2. LL(1) Parser (Table-Driven)

Uses a parsing table built from FIRST and FOLLOW sets. Reads input Left-to-right, produces Leftmost derivation, uses 1 token of lookahead.

Pros: No recursion overhead, predictable O(n) performance
Cons: Requires grammar transformation (left-recursion elimination)

3. LR(0) Parser (Shift-Reduce)

A bottom-up parser that shifts tokens onto a stack and reduces them using grammar rules. Uses operator precedence to resolve shift/reduce conflicts.

Pros: Handles wider class of grammars, very powerful
Cons: Complex to implement, harder error messages


❌ Error Detection

The system provides detailed, meaningful error messages:

Input: 3 + * 5
       3 + * 5
           ^ error here
ParseError: Unexpected token STAR('*') — expected NUMBER, IDENTIFIER, '(', or unary '-'

Input: (3 + 5
       (3 + 5
             ^ error here
ParseError: Expected RPAREN but found EOF('EOF')
  • Position tracking: Each token stores its character position for precise error pointing
  • Contextual messages: Errors indicate what was expected vs. what was found
  • Visual indicators: Source code with caret pointing to the error location

📁 Project Structure

syntax-analysis-detection-system/
├── index.html                      # Main HTML page
├── styles.css                      # Complete CSS design system
├── js/
│   ├── lexer.js                    # Tokenizer — converts input to tokens
│   ├── recursive-descent-parser.js # Recursive Descent parser implementation
│   ├── ll1-parser.js               # LL(1) table-driven parser
│   ├── lr-parser.js                # LR(0) shift-reduce parser
│   ├── tree-renderer.js            # Canvas-based parse tree visualizer
│   └── app.js                      # Main application controller
├── README.md                       # Project documentation
└── LICENSE                         # MIT License

🚀 Getting Started

Prerequisites

  • A modern web browser (Chrome, Firefox, Safari, Edge)
  • No build tools or dependencies required!

Running Locally

  1. Clone the repository:

    git clone https://github.com/YOUR_USERNAME/syntax-analysis-detection-system.git
    cd syntax-analysis-detection-system
  2. Open in browser:

    # Option 1: Simply open the file
    open index.html
    
    # Option 2: Use a local server
    npx serve .
    
    # Option 3: Python server
    python3 -m http.server 8080
  3. Try it out:

    • Enter an expression like (3 + 5) * 2
    • Select a parser type
    • Click "Parse" to see tokens, steps, and the parse tree

🧪 Test Cases

Valid Inputs

# Input Description
1 3 + 5 Simple addition
2 3 + 5 * 2 Precedence: multiply before add
3 (3 + 5) * 2 Parenthesized expression
4 2 ^ 3 ^ 2 Right-associative exponentiation
5 -x + 3 Unary minus with identifier
6 a * b + c / d Multiple operators
7 ((1 + 2)) Nested parentheses
8 3.14 * r ^ 2 Decimal with exponent
9 -(-x) Double unary negation
10 42 Single number

Invalid Inputs

# Input Expected Error
11 3 + * 5 Missing operand between operators
12 (3 + 5 Unmatched left parenthesis
13 3 + 5) Unmatched right parenthesis
14 3 + Trailing operator
15 (empty) Empty input

🌐 Applications

Syntax analysis is used extensively across computing:

  • 🛠️ Compilers & Interpreters — GCC, Clang, V8 Engine
  • 📝 Code Editors & IDEs — VS Code, IntelliJ, Tree-sitter
  • 🔎 Linters & Formatters — ESLint, Prettier, Black
  • 🗄️ Database Query Engines — PostgreSQL, MySQL, SQLite
  • 🌐 Web Browsers — HTML/CSS/JS parsing (Blink, Gecko, WebKit)
  • 🤖 Natural Language Processing — spaCy, NLTK, Stanford NLP

🛠️ Technologies Used

Technology Purpose
JavaScript (ES6+) Parser logic, lexer, application controller
HTML5 Canvas Parse tree visualization
CSS3 Dark theme, animations, glassmorphism, responsive design
Inter Font UI typography
JetBrains Mono Code/grammar typography

📚 Key Learnings

  1. Grammar design matters — A well-structured grammar directly enables clean parser implementation
  2. Top-down vs Bottom-up — Each approach has tradeoffs; recursive descent is intuitive, LR is powerful
  3. Error recovery is crucial — Real-world parsers need robust error recovery
  4. Visualization aids understanding — Parse trees make abstract structure tangible

🚀 Possible Improvements

  • Extend grammar to support assignment, conditionals, loops
  • Implement error recovery strategies (panic mode, phrase-level)
  • Add semantic analysis phase (type checking)
  • Support LALR(1) and SLR(1) parsing algorithms
  • Generate intermediate code from the AST
  • Integrate with Lex/Yacc or ANTLR for comparison

📄 License

This project is licensed under the MIT License — see the LICENSE file for details.


Built with ❤️ for Compiler Design

About

A comprehensive Syntax Analysis Detection System demonstrating parsing techniques (Recursive Descent, LL(1), LR(0)), parse tree visualization, and error detection — built for Compiler Design

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors