Skip to content

ToastCoder/HRBuddy

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

42 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

HRBuddy

HR Buddy is an application which is inspired by HR Chatbot portals. Uses combination of Llama (LLM) + Nomic (Embedding) models. Uses Retrieval Augumented Generation for identify the context strictly from the HR Policies PDF.

flowchart TD
    UI["Streamlit UI"]
    SESS["Session State<br/>(history, session_id)"]
    MDB[("MongoDB<br/>Auth & Chat History")]

    subgraph RAG [RAG Engine]
        direction LR
        subgraph Hybrid [Hybrid Retrieval]
            SEM["Semantic Search<br/>ChromaDB + OllamaEmbeddings<br/>(MMR, fetch_k=18)"]
            BM25["Keyword Search<br/>BM25 (heading-enriched<br/>chunks, top_k=6)"]
            FUSION["Weighted Fusion<br/>(semantic=0.7, bm25=0.3)"]
            SEM --> FUSION
            BM25 --> FUSION
        end
        CTX["Context Assembly<br/>(top_k=6 documents)"]
        LLM["Ollama Llama 3.2<br/>3B params<br/>(temp=0.1, ctx=4096)"]
        Hybrid --> CTX --> LLM
    end

    UI --> SESS
    SESS --> MDB
    UI -->|"user input + history"| RAG
    LLM -->|"response stream"| UI
Loading

Prerequisites

Before running the application, ensure your system has the following:

  • Docker & Docker Compose installed.
  • Hardware: Minimum 8GB RAM (16GB+ recommended) to run the Llama 3.2 model smoothly.
  • OS: Linux or macOS (Windows users should use WSL2).

Getting Started

If you are on MacOS / Linux, simply make the shell script executable

chmod +x run.sh

Then, just run the shell script.

./run.sh

Note: If you have any other shell instead of bash, open the first line of run.sh and replace the first line with the shell of your choice.

This script will handle all the setup of Ollama Package and the model and as well as builds the docker container.

Tech Stack

  • Frontend: Streamlit
  • AI/LLM: Ollama (Llama 3.2 3B)
  • Embeddings: Nomic Embed Text
  • Vector Store: ChromaDB
  • Retrieval: Hybrid search (semantic + BM25 keyword)
  • Security: Prompt injection defense (system message isolation, input sanitization)
  • Database: MongoDB (for user authentication and chat history)
  • Orchestration: Langchain & Docker

Search Configuration

The app uses hybrid search combining semantic (vector) and keyword (BM25) retrieval:

Parameter Default Description
enabled true Toggle hybrid search on/off
semantic_weight 0.7 Weight for semantic (embedding) similarity scores
bm25_weight 0.3 Weight for BM25 keyword matching scores
top_k 6 Number of final documents returned to the LLM
fetch_k 18 Documents fetched per retriever before fusion

Disable hybrid search ("enabled": false) to fall back to semantic-only retrieval.

Security — Prompt Injection Defense

The RAG engine implements defense-in-depth against prompt injection via three layers:

1. System Message Isolation Instructions and retrieved context are sent as a role: system message (authoritative in Llama's chat template), separate from the user message. This prevents user input from overriding behavior rules.

2. Input Sanitization Known injection patterns are stripped from all user-facing text before it reaches the LLM:

  • Bracket-based overrides ([SYSTEM UPDATE], [OVERRIDE], [TASK])
  • "Ignore previous instructions" variants
  • Mode-switching attacks ("you are now in developer mode")
  • Prompt extraction attempts ("output your system prompt")
  • Session ID is sanitized to alphanumeric characters only

3. Structured History Conversation history is reconstructed as real user / assistant message pairs rather than flat text, preventing re-injection of successful attacks across conversation turns.

Customizing the HR Policy

By default, the application uses the provided 2016 HR Manual. To use your own data:

  1. Delete the existing PDF in the rag_source/ directory.
  2. Place your company's HR policy PDF into rag_source/.
  3. Update the PDF_PATH variable in main.py if the filename changes.
  4. Restart the containers to trigger a fresh vector embedding.

Troubleshooting

Ollama Connection Refused inside Docker: If the Streamlit app cannot reach Ollama, you need to configure Ollama to listen to the Docker bridge network.

  1. Run sudo systemctl edit ollama.service
  2. Add the following under the [Service] block: Environment="OLLAMA_HOST=0.0.0.0"
  3. Save, then run sudo systemctl daemon-reload and sudo systemctl restart ollama.

Author

ToastCoder * GitHub: @ToastCoder

About

HR Buddy is an application which is inspired by HR Chatbot portals. Uses combination of Llama (LLM) + Nomic (Embedding) models. Uses Retrieval Augumented Generation for identify the context strictly from the HR Policies PDF.

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors