

Example code demonstrating local LLM inference with various backends and libraries.

Overview

This repository contains chatbot demos and hands-on activities for learning prompt engineering and local LLM deployment.

Demos

Chatbots (demos/chatbots/)

  • HuggingFace chatbot: Direct model loading with Transformers — no inference server needed

  • Ollama chatbot: Terminal chatbot using LangChain + a local Ollama server

  • llama.cpp chatbot: Terminal chatbot that talks to a local llama.cpp server through its OpenAI-compatible API

  • Gradio chatbot: Web UI with switchable Ollama / llama.cpp backends and customizable system prompt
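
The terminal chatbots above all follow the same shape: keep a running message history and send it to a local inference endpoint on each turn. A minimal stdlib-only sketch of that pattern against an OpenAI-compatible llama.cpp server (the URL, system prompt, and temperature here are assumptions, not taken from the demos):

```python
import json
import urllib.request

# Assumed default address of a local llama.cpp server started with `llama-server`
API_URL = "http://localhost:8080/v1/chat/completions"

def build_request(history, system_prompt="You are a helpful assistant."):
    """Assemble an OpenAI-style chat payload from alternating (role, text) turns."""
    messages = [{"role": "system", "content": system_prompt}]
    messages += [{"role": role, "content": text} for role, text in history]
    return {"messages": messages, "temperature": 0.7}

def send(payload):
    """POST the payload to the server and return the assistant's reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Usage (requires a running server):
#   history = [("user", "Hello!")]
#   reply = send(build_request(history))
#   history.append(("assistant", reply))
```

Because the payload format is the standard OpenAI chat schema, the same request builder works unchanged against other OpenAI-compatible backends.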

LangChain patterns (demos/langchain_patterns/)

  • LangChain demo: Prompt templates, output parsers, LCEL chains, and few-shot learning

  • ReAct agent: LangChain agent with custom tools and multi-step reasoning (two versions: one using the LangChain agent framework, one implemented manually)
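
The "manual" ReAct version boils down to parsing the model's `Action: tool[input]` lines and feeding tool results back as observations. A sketch of that core loop step, with a hypothetical tool registry (the tool names and action syntax here are illustrative, not the demo's exact format):

```python
import re

# Hypothetical tool registry for a manual ReAct loop
TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
    "upper": str.upper,
}

ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*)\]")

def parse_action(model_output):
    """Extract (tool_name, tool_input) from a 'Action: tool[input]' line,
    or return None if the model produced no action (i.e. a final answer)."""
    match = ACTION_RE.search(model_output)
    if match is None:
        return None
    return match.group(1), match.group(2)

def run_step(model_output):
    """Execute one ReAct step: parse the action and return the tool's observation."""
    parsed = parse_action(model_output)
    if parsed is None:
        return None  # final answer; the loop should stop here
    tool, arg = parsed
    return TOOLS[tool](arg)
```

The framework version delegates exactly this parse-dispatch-observe cycle to LangChain's agent executor.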

RAG system (demos/rag_system/)

  • RAG demo: Ingest Wikipedia articles into a pgvector knowledge base and query them with a grounded LLM
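
At query time, the retrieval step of a RAG system ranks stored chunks by embedding similarity to the question. A toy sketch with hand-made 2-dimensional vectors (real embeddings and the pgvector store are stand-ins here; in SQL, pgvector's `<=>` cosine-distance operator with `ORDER BY ... LIMIT k` does the equivalent ranking):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, store, k=2):
    """Return the k chunk texts whose embeddings are most similar to the query.
    `store` is a list of (chunk_text, embedding) pairs."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

The retrieved chunks are then pasted into the prompt so the LLM answers grounded in the ingested articles rather than from memory alone.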


Get started

See the Quickstart guide for installation and setup, then explore the Demos to learn about different inference approaches.