DegreeGuru

An AI chatbot for expert answers on university degrees, built with Vercel AI SDK, Langchain, Upstash Vector, and OpenAI.

Introduction

DegreeGuru is an open-source project demonstrating how to build a Retrieval-Augmented Generation (RAG) AI chatbot using the Vercel AI SDK, Langchain, Upstash Vector, and OpenAI. It answers questions over custom data that it has crawled and indexed; the demo uses university degree information as its example dataset.

Key features include:

  • Built-in Crawler: Scrapes specified websites, automatically making data available for the AI.
  • Real-time Performance: Retrieves relevant context quickly from Upstash Vector and streams answers to the user as they are generated.
  • API Protection: Incorporates rate limiting using Upstash Redis to prevent API abuse.
  • Domain Agnostic: Easily adaptable to any custom dataset by modifying the crawler.yaml configuration.
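
To make the API Protection feature above concrete, here is a minimal sketch of a fixed-window rate limiter. It is not the project's code: DegreeGuru keeps its counters in Upstash Redis, whereas this example uses an in-memory dictionary, and both the fixed-window algorithm and the names FixedWindowRateLimiter and allow are assumptions chosen for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per `window_seconds` for each client key.

    Hypothetical stand-in: counters live in a local dict instead of Redis.
    """

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        # client key -> (window index, requests seen in that window)
        self._counters: Dict[str, Tuple[int, int]] = {}

    def allow(self, key: str) -> bool:
        """Return True if the request may proceed, False if it should be rejected."""
        window = int(self.clock() // self.window_seconds)
        stored_window, count = self._counters.get(key, (window, 0))
        if stored_window != window:
            count = 0  # a new window has started, so the counter resets
        if count >= self.limit:
            self._counters[key] = (window, count)
            return False
        self._counters[key] = (window, count + 1)
        return True
```

A caller would typically key the limiter on the client's IP address or session ID and return an HTTP 429 response whenever allow returns False. A shared store such as Redis matters in production because serverless instances do not share memory.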

The technical stack comprises:

  • Crawler: Developed with Scrapy (Python) for efficient web data extraction.
  • Chatbot Application: Built on Next.js, providing a modern and responsive user interface.
  • Vector Database: Utilizes Upstash Vector for storing and querying vector embeddings of the scraped data.
  • LLM Orchestration: Employs Langchain.js to manage interactions with large language models.
  • Generative AI: Powered by OpenAI's gpt-3.5-turbo-1106 for generating expert responses.
  • Embeddings: Uses OpenAI's text-embedding-ada-002 for creating vector representations of text.
  • Streaming: Uses the Vercel AI SDK to stream chatbot responses as text is generated.
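
The retrieval step that ties the vector database, embeddings, and LLM together can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the hand-written three-dimensional vectors stand in for real OpenAI embeddings, a plain dictionary stands in for Upstash Vector, cosine similarity is an assumed metric, and the sample passages and the helper names top_k and build_prompt are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Vector, index: Dict[str, Vector], k: int) -> List[Tuple[str, float]]:
    """Return the k passages most similar to the query, best match first."""
    scored = [(text, cosine_similarity(query, vec)) for text, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def build_prompt(question: str, passages: List[str]) -> str:
    """Place the retrieved passages ahead of the question for the language model."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Made-up passages with made-up vectors (illustration only, not crawled data).
index = {
    "The BSc in Computer Science takes three years.": [0.9, 0.1, 0.0],
    "Tuition fees are due in September.": [0.0, 0.2, 0.9],
    "The MSc in Data Science requires a thesis.": [0.7, 0.6, 0.1],
}
query_vector = [1.0, 0.2, 0.0]  # pretend embedding of the user's question
hits = top_k(query_vector, index, k=2)
prompt = build_prompt("How long is the CS degree?", [text for text, _ in hits])
```

In the real application the query vector would come from the embedding model, the similarity search would run inside the vector database, and the assembled prompt would be sent to the chat model.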

The project includes a quickstart guide for local development. It covers environment setup (an Upstash Vector index, an Upstash Redis database, and an OpenAI API key), Python library installation, and crawler configuration via crawler.yaml and settings.py. A Docker Compose option is also available for simplified deployment.

Users can customize the chatbot's behavior, including the streaming mode and the AGENT_SYSTEM_TEMPLATE prompt, to tailor it to specific use cases.

Known limitations: the UpstashVectorStore integration in Langchain is still a work in progress, message history may not behave correctly in non-streaming mode, and sources are difficult to display explicitly while streaming.
