
🤖 Building a Universal AI Chatbot: 100+ Free Models in One Interface

Have you ever wondered what it would be like to have access to over 100 AI models from different providers, all in one place, without spending a dime? That's exactly what I set out to build with the Universal AI Chatbot project!

🎯 The Problem

The AI landscape is fragmented. You have amazing models from OpenAI, Google, Meta, Mistral, and many others, but they all require:

  • Different API endpoints
  • Different authentication methods
  • Different code implementations
  • Often, different pricing structures

As a developer and AI enthusiast, I wanted a single, unified interface where I could experiment with any model I wanted, compare their responses, and switch between providers seamlessly.

💡 The Solution

I built a Multi-Backend AI Chatbot that connects to 5 major AI providers and gives you access to 100+ free models through a beautiful, modern Gradio interface. The best part? Most models are completely free to use!

🔗 View Project on GitHub

✨ Key Features

🔌 Five AI Providers in One Place

The chatbot integrates with:

  1. Ollama - Run powerful models locally on your machine

    • 100% free and private
    • No API keys required
    • 14 models including Llama 3.3, Mistral, CodeLlama, DeepSeek R1
  2. OpenRouter - Access to 26 validated free models

    • Models with the :free suffix
    • Includes Llama 3.3, Gemini 2.0 Flash, DeepSeek R1
    • Easy API integration
  3. GitHub Models - 21 cutting-edge models including GPT-5 series

    • OpenAI's latest: GPT-5 nano/mini, o3-mini, o4-mini
    • Meta's Llama 3.3 70B and Llama 3.2 90B Vision
    • Microsoft's Phi-4 series
    • Free for prototyping
  4. Groq - 11 models with ultra-fast inference

    • Llama 4 Maverick and Scout
    • OpenAI GPT-OSS 120B
    • Lightning-fast response times
    • 14,400 free requests per day
  5. Google Gemini - 13 Google AI models

    • Gemini 3 Preview, Gemini 2.5 Pro/Flash
    • Gemma 3 series
    • 1,500 free requests per day

🎨 Beautiful, Modern Interface

Built with Gradio 5.0+, the interface features:

  • Custom theme with a clean, professional design
  • Real-time streaming responses for instant feedback
  • Live API status indicators to show which providers are available
  • Dynamic model dropdown that updates based on selected provider
  • Advanced controls for fine-tuning responses:
    • System prompt customization
    • Temperature control (0.0 - 2.0)
    • Max tokens setting
    • Preset prompts (Code Expert, Writer, Analyst, Teacher)
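The advanced controls above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's actual code; the names `PRESET_PROMPTS`, `clamp_temperature`, and `build_system_prompt` are hypothetical:

```python
# Hypothetical sketch of the advanced controls: preset system prompts
# plus clamping for the temperature slider's 0.0 - 2.0 range.
PRESET_PROMPTS = {
    "Code Expert": "You are an expert software engineer. Answer with concise, correct code.",
    "Writer": "You are a skilled writer. Produce clear, engaging prose.",
    "Analyst": "You are a data analyst. Reason step by step and state your assumptions.",
    "Teacher": "You are a patient teacher. Explain concepts with simple examples.",
}

def clamp_temperature(value: float) -> float:
    """Keep the temperature inside the UI's allowed range."""
    return max(0.0, min(2.0, value))

def build_system_prompt(preset: str, custom: str = "") -> str:
    """Use the custom prompt if given, otherwise fall back to a preset."""
    return custom or PRESET_PROMPTS.get(preset, "You are a helpful assistant.")
```

The UI then only needs to pass the chosen preset name and slider value through these helpers before each API call.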

🛠️ Model Validation Tool

One of the unique features is the automated model validation system (validate_models.py):

  • Tests real-time availability of all models
  • Detects authentication errors, rate limits, and server issues
  • Generates detailed JSON reports with response times
  • Helps you know which models are currently working
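In spirit, the validation loop works like the simplified sketch below. This is not the actual `validate_models.py`; the `probe` callable stands in for a real API request so the logic can be shown without network access:

```python
import json
import time

def validate_models(models, probe):
    """Probe each model and collect a JSON-serializable report.

    `probe` is any callable that sends a minimal request to one model
    and raises an exception on failure (a stand-in for a real API call).
    """
    report = {}
    for model in models:
        start = time.perf_counter()
        try:
            probe(model)
            status, error = "ok", None
        except Exception as exc:  # auth errors, rate limits, server issues
            status, error = "failed", str(exc)
        report[model] = {
            "status": status,
            "error": error,
            "response_time_s": round(time.perf_counter() - start, 3),
        }
    return report

# Demo with a fake probe that rejects one model:
def fake_probe(model):
    if model == "bad-model":
        raise RuntimeError("401 Unauthorized")

report = validate_models(["llama3.3", "bad-model"], fake_probe)
print(json.dumps(report, indent=2))
```

Swapping `fake_probe` for a function that sends a one-token completion request to each provider gives you the real availability check.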

🚀 Advanced Capabilities

  • Streaming responses - Watch the AI generate text in real-time
  • Conversation history - Maintains context throughout your chat
  • Safety checks - Handles empty responses and API errors gracefully
  • Export functionality - Save your conversations to Markdown
  • Stop generation - Interrupt long responses when needed
  • Error handling - Provides helpful suggestions when things go wrong
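The Markdown export, for instance, can be as small as the sketch below (`export_markdown` is an illustrative name, not necessarily the project's):

```python
def export_markdown(history, title="Chat Export"):
    """Render a list of (user, assistant) turns as a Markdown transcript."""
    lines = [f"# {title}", ""]
    for user_msg, bot_msg in history:
        lines.append(f"**You:** {user_msg}")
        lines.append("")
        lines.append(f"**Assistant:** {bot_msg}")
        lines.append("")
    return "\n".join(lines)

doc = export_markdown([("Hi!", "Hello! How can I help?")])
```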

🏗️ Technical Architecture

Technology Stack

| Package | Version | Purpose |
| --- | --- | --- |
| gradio | >= 5.0 | Modern web UI framework for building interactive interfaces |
| openai | >= 1.0 | OpenAI-compatible API client for unified provider access |
| python-dotenv | >= 1.0 | Environment variable management for secure API key storage |
| google-generativeai | >= 0.3 | Official Google Gemini SDK for Google AI models |

How It Works

  1. Unified API Interface: The chatbot uses OpenAI's API format as a standard, which most providers now support. This means less code and easier maintenance.

  2. Dynamic Provider Switching: When you select a provider, the app automatically:

    • Updates the available models
    • Checks API connectivity
    • Configures the appropriate endpoints
  3. Streaming Architecture: Instead of waiting for complete responses, the chatbot uses Server-Sent Events (SSE) to stream tokens as they're generated:

# Simplified streaming logic: accumulate tokens and yield the growing response
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        token = chunk.choices[0].delta.content
        full_response += token
        yield full_response
  4. Environment Configuration: API keys are stored securely in a .env file (never committed to Git), making it easy to manage credentials:
OPENROUTER_API_KEY=sk-or-v1-your-key-here
GITHUB_TOKEN=ghp_your-token-here
GROQ_API_KEY=gsk_your-key-here
GOOGLE_API_KEY=AIzaSy-your-key-here
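Tying the provider switching and the environment keys together, the backend selection can be sketched as a registry mapping each provider to its endpoint and key variable. The structure and names below are illustrative, not the project's actual code, and the base URLs reflect each provider's OpenAI-compatible endpoint at the time of writing; check the providers' docs before relying on them:

```python
import os

# Hypothetical provider registry. base_url values are each provider's
# OpenAI-compatible endpoint as documented at time of writing.
PROVIDERS = {
    "ollama":     {"base_url": "http://localhost:11434/v1",             "key_env": None},
    "openrouter": {"base_url": "https://openrouter.ai/api/v1",          "key_env": "OPENROUTER_API_KEY"},
    "github":     {"base_url": "https://models.inference.ai.azure.com", "key_env": "GITHUB_TOKEN"},
    "groq":       {"base_url": "https://api.groq.com/openai/v1",        "key_env": "GROQ_API_KEY"},
}

def provider_credentials(name):
    """Return (base_url, api_key) for a provider, or raise if the key is missing."""
    cfg = PROVIDERS[name]
    if cfg["key_env"] is None:            # local provider: no key needed
        return cfg["base_url"], "ollama"  # placeholder key; Ollama ignores it
    key = os.environ.get(cfg["key_env"])
    if not key:
        raise RuntimeError(f"Set {cfg['key_env']} in your .env file")
    return cfg["base_url"], key
```

With a registry like this, switching providers in the UI reduces to one dictionary lookup before constructing the OpenAI-compatible client.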

Key Technical Improvements

  1. GPT-5/o-series Compatibility (December 21, 2025)

    • Added max_completion_tokens parameter support
    • Fixed compatibility issues with advanced OpenAI models on GitHub
  2. Empty Streaming Check

    • Prevents "list index out of range" errors
    • Handles cases where API returns empty choices
  3. Enhanced Error Messages

    • Provides actionable feedback
    • Suggests solutions for common issues
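The first fix above comes down to choosing the right token-limit parameter per model: newer OpenAI reasoning models reject the legacy `max_tokens` in favor of `max_completion_tokens`. A hedged sketch (the prefix list is my assumption based on current model naming and will need updating as names change):

```python
def completion_kwargs(model: str, limit: int) -> dict:
    """Pick the token-limit parameter the target model accepts.

    GPT-5 and o-series reasoning models require `max_completion_tokens`;
    older chat models still use `max_tokens`. The prefix list below is
    an assumption -- adjust it as providers rename models.
    """
    reasoning_prefixes = ("gpt-5", "o1", "o3", "o4")
    if model.lower().startswith(reasoning_prefixes):
        return {"model": model, "max_completion_tokens": limit}
    return {"model": model, "max_tokens": limit}
```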

📊 Performance & Capabilities

Model Coverage (100+ Total)

| Provider | Models | Speed | Free Tier |
| --- | --- | --- | --- |
| Ollama | 14 | Fast (local) | Unlimited |
| OpenRouter | 26 | Medium | Rate-limited |
| GitHub Models | 21 | Medium | 10-15 req/min |
| Groq | 11 | Ultra-fast | 14,400/day |
| Gemini | 13 | Fast | 1,500/day |

Use Cases

  1. Code Development

    • Use CodeLlama or GPT-4o for code generation
    • Get instant syntax help and debugging
  2. Content Creation

    • Leverage Gemini or Llama for creative writing
    • Generate blog posts, stories, and marketing copy
  3. Data Analysis

    • Use o3-mini or DeepSeek for analytical tasks
    • Process and interpret complex datasets
  4. Learning & Education

    • Phi-4 reasoning models for step-by-step explanations
    • Compare different models' teaching approaches
  5. Research & Experimentation

    • Test how different models handle the same prompt
    • Compare response quality and speed

🎓 What I Learned

Building this project taught me several valuable lessons:

1. API Standardization is Powerful

OpenAI's API format has become the de facto standard. Most providers now offer OpenAI-compatible endpoints, which made multi-backend support much easier than expected.

2. Streaming Improves UX Dramatically

Real-time response streaming transforms the user experience. Instead of waiting 10-30 seconds for a complete response, users see progress immediately.

3. Error Handling is Critical

With multiple providers and network requests, things will go wrong. Robust error handling with clear messages saves hours of debugging frustration.

4. Model Availability is Fluid

AI providers constantly add, remove, and rename models. Building a validation tool was essential to keep track of what actually works.

5. Local Models are Underrated

Ollama proves that you don't always need cloud APIs. Local models offer:

  • Zero cost
  • Complete privacy
  • No rate limits
  • No internet dependency
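To illustrate how simple local inference is, a chat request to Ollama is just an HTTP POST to its `/api/chat` endpoint. The sketch below only builds the JSON body; actually sending it requires `ollama serve` running on the default port 11434:

```python
import json

def ollama_chat_payload(model: str, prompt: str, stream: bool = False) -> str:
    """Build the JSON body for Ollama's /api/chat endpoint.

    POST this to http://localhost:11434/api/chat against a running server.
    """
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    })

body = ollama_chat_payload("llama3.2:1b", "Hello!")
```

Because Ollama also exposes an OpenAI-compatible `/v1` endpoint, the same client code used for the cloud providers works locally by pointing the base URL at `http://localhost:11434/v1`.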

🚀 Getting Started

Want to try it yourself? Here's how:

1. Clone the Repository

git clone https://github.com/M-F-Tushar/Multi-Backend-Chatbot-with-Gradio.git
cd Multi-Backend-Chatbot-with-Gradio

2. Install Dependencies

pip install gradio openai python-dotenv google-generativeai

3. Set Up API Keys

Create a .env file:

cp .env.example .env

Add your API keys (only for providers you want to use):

OPENROUTER_API_KEY=your-key-here
GITHUB_TOKEN=your-token-here
GROQ_API_KEY=your-key-here
GOOGLE_API_KEY=your-key-here

Note: All these API keys are free to obtain! Check the README for detailed instructions.

4. Run the Chatbot

Open Chatbot.ipynb in Jupyter Notebook and run all cells. The interface will launch at http://127.0.0.1:7860.

For Ollama (local models):

# Install Ollama from https://ollama.ai
ollama pull llama3.2:1b
ollama serve

🔮 Future Enhancements

I'm constantly improving this project. Here's what's on the roadmap:

  • Vision Model Support - Upload images and use multimodal models
  • Voice Input/Output - Speak to the chatbot and hear responses
  • Chat History Persistence - Save conversations to a database
  • Multi-User Support - Authentication and user-specific histories
  • Usage Statistics - Track API costs and usage across providers
  • Docker Deployment - Containerized deployment for cloud platforms
  • Additional Providers - Anthropic Claude, Mistral API, Cohere
  • Custom Fine-Tuning - Support for custom-trained models

🤝 Contributing

This is an open-source project, and I welcome contributions! Whether you want to:

  • Fix bugs
  • Add new features
  • Improve documentation
  • Add more AI providers
  • Optimize performance

Check out the Contributing Guide and submit a pull request!

📈 Impact & Reception

Since launching on GitHub, the project has:

  • ⭐ Gained 4 stars (and growing!)
  • 🔄 Been forked by developers worldwide
  • 💬 Sparked discussions about AI accessibility
  • 🎓 Helped students learn about LLM integration

💭 Final Thoughts

Building this Universal AI Chatbot has been an incredible journey. It started as a simple idea - "What if I could talk to any AI model from one interface?" - and evolved into a comprehensive tool that democratizes access to cutting-edge AI.

The best part? **Everything is free and open-source.** You can use it, modify it, learn from it, and build upon it.

Why This Matters

  1. Democratization of AI: Not everyone can afford enterprise API credits. This project proves you can access world-class AI for free.

  2. Comparison Shopping: Before committing to a paid API, you can test multiple providers and see which works best for your use case.

  3. Learning Platform: Students and developers can experiment with different models to understand their strengths and weaknesses.

  4. Privacy Options: With Ollama support, you can run everything locally without sending data to external servers.

📣 Let's Connect!

If you found this project interesting or useful:

  • Star the repository on GitHub
  • 🐛 Report bugs or suggest features via Issues
  • 💬 Share your experience in the Discussions section
  • 🔄 Fork and contribute - all contributions welcome!

📋 Technical Specifications Summary

Language: Python 3.8+
Framework: Gradio 5.0+
Architecture: Multi-provider API integration
Models: 100+ across 5 providers
License: MIT (Open Source)
Platform: Jupyter Notebook / VS Code
Deployment: Local, Cloud-ready


This project represents my belief that powerful AI tools should be accessible to everyone, regardless of budget or technical expertise. Happy chatting! 🚀


Tags: #AI #MachineLearning #Chatbot #Python #Gradio #OpenSource #LLM #NLP #Ollama #Gemini #OpenRouter #Groq #GitHub #GPT5 #DeepSeek #Llama

Last Updated: January 2, 2026
Project Status: Active Development
Version: 1.0