HRBuddy

HR Buddy is an application which is inspired by HR Chatbot portals. Uses combination of Llama (LLM) + Nomic (Embedding) models. Uses Retrieval Augumented Generation for identify the context strictly from the HR Policies PDF.

flowchart TD
    UI["Streamlit UI"]
    SESS["Session State<br/>(history, session_id)"]
    MDB[("MongoDB<br/>Auth & Chat History")]

    subgraph RAG [RAG Engine]
        direction LR
        subgraph Hybrid [Hybrid Retrieval]
            SEM["Semantic Search<br/>ChromaDB + OllamaEmbeddings<br/>(MMR, fetch_k=18)"]
            BM25["Keyword Search<br/>BM25 (heading-enriched<br/>chunks, top_k=6)"]
            FUSION["Weighted Fusion<br/>(semantic=0.7, bm25=0.3)"]
            SEM --> FUSION
            BM25 --> FUSION
        end
        CTX["Context Assembly<br/>(top_k=6 documents)"]
        LLM["Ollama Llama 3.2<br/>3B params<br/>(temp=0.1, ctx=4096)"]
        Hybrid --> CTX --> LLM
    end

    UI --> SESS
    SESS --> MDB
    UI -->|"user input + history"| RAG
    LLM -->|"response stream"| UI

Prerequisites

Before running the application, ensure your system has the following:

Docker & Docker Compose installed.
Hardware: Minimum 8GB RAM (16GB+ recommended) to run the Llama 3.2 model smoothly.
OS: Linux or macOS (Windows users should use WSL2).

Getting Started

If you are on MacOS / Linux, simply make the shell script executable

chmod +x run.sh

Then, just run the shell script.

./run.sh

Note: If you have any other shell instead of bash, open the first line of run.sh and replace the first line with the shell of your choice.

This script will handle all the setup of Ollama Package and the model and as well as builds the docker container.

Tech Stack

Frontend: Streamlit
AI/LLM: Ollama (Llama 3.2 3B)
Embeddings: Nomic Embed Text
Vector Store: ChromaDB
Retrieval: Hybrid search (semantic + BM25 keyword)
Security: Prompt injection defense (system message isolation, input sanitization)
Database: MongoDB (for user authentication and chat history)
Orchestration: Langchain & Docker

Search Configuration

The app uses hybrid search combining semantic (vector) and keyword (BM25) retrieval:

Parameter	Default	Description
`enabled`	`true`	Toggle hybrid search on/off
`semantic_weight`	`0.7`	Weight for semantic (embedding) similarity scores
`bm25_weight`	`0.3`	Weight for BM25 keyword matching scores
`top_k`	`6`	Number of final documents returned to the LLM
`fetch_k`	`18`	Documents fetched per retriever before fusion

Disable hybrid search ("enabled": false) to fall back to semantic-only retrieval.

Security — Prompt Injection Defense

The RAG engine implements defense-in-depth against prompt injection via three layers:

1. System Message Isolation Instructions and retrieved context are sent as a role: system message (authoritative in Llama's chat template), separate from the user message. This prevents user input from overriding behavior rules.

2. Input Sanitization Known injection patterns are stripped from all user-facing text before it reaches the LLM:

Bracket-based overrides ([SYSTEM UPDATE], [OVERRIDE], [TASK])
"Ignore previous instructions" variants
Mode-switching attacks ("you are now in developer mode")
Prompt extraction attempts ("output your system prompt")
Session ID is sanitized to alphanumeric characters only

3. Structured History Conversation history is reconstructed as real user / assistant message pairs rather than flat text, preventing re-injection of successful attacks across conversation turns.

Customizing the HR Policy

By default, the application uses the provided 2016 HR Manual. To use your own data:

Delete the existing PDF in the rag_source/ directory.
Place your company's HR policy PDF into rag_source/.
Update the PDF_PATH variable in main.py if the filename changes.
Restart the containers to trigger a fresh vector embedding.

Troubleshooting

Ollama Connection Refused inside Docker: If the Streamlit app cannot reach Ollama, you need to configure Ollama to listen to the Docker bridge network.

Run sudo systemctl edit ollama.service
Add the following under the [Service] block: Environment="OLLAMA_HOST=0.0.0.0"
Save, then run sudo systemctl daemon-reload and sudo systemctl restart ollama.

Author

ToastCoder * GitHub: @ToastCoder

Name		Name	Last commit message	Last commit date
Latest commit History 42 Commits
config		config
core		core
rag_source		rag_source
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
app.py		app.py
docker-compose.yml		docker-compose.yml
requirements.txt		requirements.txt
run.sh		run.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

HRBuddy

Prerequisites

Getting Started

Tech Stack

Search Configuration

Security — Prompt Injection Defense

Customizing the HR Policy

Troubleshooting

Author

About

Uh oh!

Releases 5

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

HRBuddy

Prerequisites

Getting Started

Tech Stack

Search Configuration

Security — Prompt Injection Defense

Customizing the HR Policy

Troubleshooting

Author

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 5

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages