
RAG+AI Workflow+Agent: How to Choose an LLM Framework, a Comprehensive Comparison of MaxKB, Dify, FastGPT, RagFlow, Anything-LLM, and More!


1. MaxKB

MaxKB (Max Knowledge Base) is an open-source knowledge base Q&A system built on large language models (LLMs), designed to be an organization's "strongest brain". It helps organizations manage knowledge efficiently and provides intelligent Q&A capabilities. Imagine a virtual assistant that can answer questions about your company's internal knowledge, whether policies, procedures, or technical documents: how to access the company intranet, how to submit visual design requirements, and so on. MaxKB provides fast, accurate answers to questions like these.

Official Website:/

1.1 Introduction

  1. Out-of-the-box: supports direct document upload and automatic crawling of online documents, with automatic text splitting and vectorization; RAG (Retrieval-Augmented Generation) delivers a good intelligent Q&A experience;

  2. Seamless embedding: supports zero-code, rapid embedding into third-party business systems, so existing systems quickly gain intelligent Q&A capabilities and user satisfaction improves;

  3. Flexible orchestration: a powerful built-in workflow engine supports orchestrating AI workflows to meet the needs of complex business scenarios;

  4. Model neutrality: supports connecting to all kinds of large language models, including locally hosted private models (Llama 3 / Qwen 2, etc.), domestic public models (Tongyi Qianwen / Zhipu AI / Baidu Qianfan / Kimi / DeepSeek, etc.), and overseas public models (OpenAI / Azure OpenAI / Gemini, etc.).

1.2 Technical framework and rationale

  • Technology stack
    • Front-end: Vue.js, LogicFlow
    • Back-end: Python / Django
    • LLM framework: LangChain
    • Vector database: PostgreSQL / pgvector (see the retrieval sketch below)
    • Large models: Ollama, Azure OpenAI, OpenAI, Tongyi Qianwen, Kimi, Baidu Qianfan, iFlytek Spark, Gemini, DeepSeek, and more.
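Because retrieval ultimately happens in pgvector, it helps to see what that layer looks like. Below is a minimal sketch of a top-k similarity query, assuming a hypothetical `paragraph` table with `content` and `embedding` columns; the table and column names are illustrative, not MaxKB's actual schema.

```python
# Minimal pgvector retrieval sketch. Table/column names are illustrative,
# not MaxKB's actual schema.
import psycopg2
from pgvector.psycopg2 import register_vector

conn = psycopg2.connect("dbname=maxkb user=postgres")
register_vector(conn)  # register the `vector` type with psycopg2

def top_k_chunks(query_embedding, k=5):
    """Return the k chunks closest to the query by cosine distance (<=>)."""
    with conn.cursor() as cur:
        cur.execute(
            "SELECT content, embedding <=> %s AS distance "
            "FROM paragraph ORDER BY distance LIMIT %s",
            (query_embedding, k),  # query_embedding: numpy array
        )
        return cur.fetchall()
```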

2. Dify

Dify is an open-source Large Language Model (LLM) application development platform. It combines the concepts of Backend as a Service and LLMOps so that developers can rapidly build production-grade generative AI applications; even non-technical people can participate in defining and operating the data behind AI applications.

Because Dify has built in the key technology stack needed for LLM applications, including support for hundreds of models, an intuitive prompt-orchestration interface, a high-quality RAG engine, a robust Agent framework, flexible process orchestration, and an easy-to-use set of interfaces and APIs, it saves developers a lot of time reinventing the wheel and lets them focus on innovation and business needs.

  • Official website:/zh

  • github:/langgenius/dify?tab=readme-ov-file

2.1 Introduction

Dify is an open source LLM application development platform. Its intuitive interface combines AI workflows, RAG pipelines, Agents, model management, observability features, and more to take you from prototype to production quickly. Below is a list of its core features:

  1. Workflow: build and test powerful AI workflows on a canvas, utilizing all of the following features and more.

  2. Comprehensive model support: seamless integration with hundreds of proprietary and open-source LLMs from dozens of inference providers and self-hosted solutions, covering GPT, Mistral, Llama 3, and any model compatible with the OpenAI API.

  3. Prompt IDE: An intuitive interface for producing prompts, comparing model performance, and adding other features (such as text-to-speech) to chat-based applications.

  4. RAG Pipeline: Extensive RAG functionality covering everything from document ingestion to retrieval, with out-of-the-box support for extracting text from PDF, PPT and other common document formats.

  5. Agent: you can define an Agent based on LLM function calling or ReAct, and add pre-built or custom tools to it. Dify provides more than 50 built-in tools for AI Agents, such as Google Search, DALL·E, Stable Diffusion, and WolframAlpha.

  6. LLMOps: monitor and analyze application logs and performance over time. You can continuously improve prompts, datasets, and models based on production data and annotations.

  7. Backend as a Service: all of Dify's features come with corresponding APIs, so you can easily integrate Dify into your own business logic; a minimal call is sketched below.
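As a concrete example of that last point, here is a minimal sketch of calling a Dify application over HTTP. The endpoint and payload follow Dify's published chat-messages API; the base URL and app key are placeholders to fill in from your own deployment.

```python
# Minimal sketch of using a Dify app through its Backend-as-a-Service API.
# Base URL and API key are placeholders from your own Dify console.
import requests

DIFY_BASE = "https://api.dify.ai/v1"   # or your self-hosted instance
API_KEY = "app-xxxxxxxx"               # per-app key issued by Dify

def ask(query: str, user_id: str = "demo-user") -> str:
    resp = requests.post(
        f"{DIFY_BASE}/chat-messages",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "inputs": {},
            "query": query,
            "response_mode": "blocking",  # use "streaming" for SSE output
            "user": user_id,
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["answer"]

print(ask("How do I submit a visual design request?"))
```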

2.2 System framework

Workflows reduce system complexity by decomposing complex tasks into smaller steps (nodes), reducing dependence on prompt-engineering tricks and raw model reasoning ability, improving the performance of LLM applications on complex tasks, and enhancing the system's interpretability, stability, and fault tolerance.

Dify workflows are of two types:

  • Chatflow: for conversational scenarios, including customer service, semantic search, and other conversational applications that require multi-step logic in building responses.

  • Workflow: geared towards automation and batch processing scenarios for applications such as high-quality translation, data analytics, content generation, email automation, and more.

To cope with the complexity of recognizing user intent in natural-language input, Chatflow provides question-understanding nodes. Compared with Workflow, Chatflow adds support for chatbot features such as conversation history (Memory), annotated replies, and Answer nodes.

To handle complex business logic in automation and batch-processing scenarios, Workflow provides rich logic nodes such as code nodes, IF/ELSE nodes, template transformation, and iteration nodes (illustrated conceptually below). In addition, timed and event-triggered execution will also be provided, making it easy to build automated processes.
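To make those node types concrete, the sketch below expresses them as plain Python functions. This is a conceptual analogy only, not Dify's actual node format: the order-processing scenario and all names are invented for illustration.

```python
# Conceptual illustration (plain Python, not Dify's node format) of the
# logic-node types a Workflow composes: code, IF/ELSE, template, iteration.
def code_node(order: dict) -> float:            # "code node": arbitrary logic
    return sum(i["qty"] * i["price"] for i in order["items"])

def branch_node(total: float) -> str:           # "IF/ELSE node": routing
    return "manual_review" if total > 1000 else "auto_approve"

def template_node(order_id: str, route: str) -> str:  # "template transformation"
    return f"Order {order_id} routed to {route}."

def iterate(orders: list[dict]) -> list[str]:   # "iteration node": map over items
    results = []
    for order in orders:
        total = code_node(order)
        results.append(template_node(order["id"], branch_node(total)))
    return results

print(iterate([{"id": "A1", "items": [{"qty": 2, "price": 600.0}]}]))
# -> ['Order A1 routed to manual_review.']
```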

  • Common Cases

    • Customer Service: By integrating LLM into your customer service system, you can automate answers to frequently asked questions and reduce the workload of your support team. LLM understands the context and intent of customer queries and generates helpful and accurate answers in real time.

    • Content Generation: Whether you need to create blog posts, product descriptions, or marketing materials, LLM can help you by generating high-quality content. Simply provide an outline or topic and LLM will utilize its extensive knowledge base to produce engaging, informative and well-structured content.

    • Task Automation: Can be integrated with various task management systems such as Trello, Slack, Lark, to automate project and task management. By using natural language processing, LLM understands and interprets user input, creates tasks, updates status and assigns priorities without manual intervention.

    • Data Analysis and Reporting: can be used to analyze large data sets and generate reports or summaries. By providing relevant information to LLM, it can identify trends, patterns, and insights, turning raw data into actionable intelligence. This is particularly valuable for organizations looking to make data-driven decisions.

    • Email Automation Processing: LLM can be used to draft emails, social media updates and other forms of communication. By providing a brief outline or key points, LLM can generate a well-structured, coherent and contextually relevant message. This saves a lot of time and ensures that your responses are clear and professional.

3. FastGPT

FastGPT is a powerful platform focused on knowledge base training and automated workflow orchestration. It provides an easy-to-use visual interface, supports automated data preprocessing, and offers workflow orchestration based on Flow modules. FastGPT supports creating RAG systems, provides automated workflows, and other features that make it simple to build and use a RAG system without writing complex code.

  • Official:/

  • github:/labring/FastGPT

3.1 FastGPT capability

  1. Dedicated AI customer service: train by importing documents or existing question-and-answer pairs, so the AI model can answer questions based on your documents in an interactive dialog.
    • Multi-library reuse, mixing
    • Chunk record modification and deletion
    • Source file storage
    • Supports manual input, direct segmentation, QA split importing
    • Supports txt, md, html, pdf, docx, pptx, csv, xlsx (more formats can be added via a PR to the file loader)
    • Support url reading, CSV batch importing
    • Hybrid search and re-ranking
  2. Easy-to-use visual interface: FastGPT adopts an intuitive visual interface design and provides rich, practical functions for various application scenarios. Creating and training an AI customer-service bot takes only simple, easy-to-understand steps.
  3. Automatic data preprocessing: FastGPT provides manual input, direct segmentation, LLM auto-processing, and CSV import paths. "Direct segmentation" supports PDF, Word, Markdown, and CSV documents as context; FastGPT automatically preprocesses, vectorizes, and QA-segments the text data, saving manual training time and improving performance.
  4. Workflow orchestration : Workflow orchestration based on the Flow module can help you design more complex question and answer processes. For example, querying databases, checking inventory, scheduling lab appointments, etc.
    • Provides a simple mode, with no orchestration work required
    • Workflow orchestration
    • Tool Call
    • Plug-ins - Workflow Encapsulation Capabilities
    • Code sandbox
  5. Powerful API integration: FastGPT's external API is aligned with the official OpenAI interface, so existing GPT applications can connect to it directly, and it integrates easily into platforms such as WeChat, WeChat Official Accounts, and Feishu; see the sketch below.
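Because the API is OpenAI-aligned, the stock OpenAI Python SDK can talk to a FastGPT app once its base URL is swapped. A minimal sketch, assuming a self-hosted instance; the host, path, and key below are placeholders.

```python
# FastGPT exposes an OpenAI-compatible API, so the official SDK works with
# base_url pointed at the FastGPT instance. Host and key are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://your-fastgpt-host/api/v1",  # placeholder FastGPT endpoint
    api_key="fastgpt-xxxxxxxx",                   # placeholder app key
)

resp = client.chat.completions.create(
    model="",  # FastGPT routes by app key, so this field may be ignored
    messages=[{"role": "user", "content": "What is our return policy?"}],
)
print(resp.choices[0].message.content)
```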

4. RAGFlow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine built on deep document understanding. It provides a streamlined RAG workflow for organizations and individuals of all sizes, combining with large language models (LLMs) to deliver reliable question answering, backed by well-founded citations, over a wide range of complex data formats.

Official website:/

Github:/infiniflow/ragflow/blob/main

4.1 Functionality

  • "Quality in, quality out"

    • Based on deep document understanding, it is able to extract insights from unstructured data in various complex formats.
    • Finds the "needle in a haystack" quickly, even over effectively unlimited context (tokens).
  • Template-based text slicing

    • Not just smart, but also controllable and interpretable.
    • Multiple text templates to choose from
  • Grounded answers with reduced hallucinations

    • Visualization of the text slicing process with support for manual adjustment.
    • Grounded: answers provide snapshots of the key references and support traceability.
  • Compatible with various heterogeneous data sources

    • Supports rich file types, including Word documents, PPT, Excel spreadsheets, txt files, images, PDFs, scanned copies, structured data, web pages, and more.
  • Full hassle-free, automated RAG workflow

    • Fully optimized RAG workflows support ecosystems ranging from personal applications to mega-enterprises.
    • Both the large language model (LLM) and the embedding (vector) model are configurable.
    • Based on multi-way recall with fused re-ranking (a common fusion method is sketched after this list).
    • Provides easy-to-use APIs for easy integration into various enterprise systems.
  • Recently Updated Features

    • 2024-07-23 Support for parsing audio files.
    • 2024-07-21 Support for more big model vendors (LocalAI/OpenRouter/StepFun/Nvidia).
    • 2024-07-18 Support for operators in Graph: Wikipedia, PubMed, Baidu, and DuckDuckGo.
    • 2024-07-08 Support for Agentic RAG: Graph-based workflow.
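On "multi-way recall with fused re-ranking": the article does not spell out RAGFlow's exact fusion formula, but a common way to merge several ranked recall lists (for example, vector recall plus keyword recall) is reciprocal rank fusion (RRF), sketched below purely as an illustration of the idea.

```python
# Illustrative reciprocal rank fusion (RRF) over multiple recall lists.
# This is a common fusion technique, not RAGFlow's documented formula.
from collections import defaultdict

def rrf_fuse(recall_lists, k=60):
    """Fuse ranked lists of doc ids; higher fused score ranks earlier."""
    scores = defaultdict(float)
    for ranked in recall_lists:
        for rank, doc_id in enumerate(ranked):
            scores[doc_id] += 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc3", "doc1", "doc7"]    # from embedding similarity
keyword_hits = ["doc1", "doc9", "doc3"]   # from full-text search
print(rrf_fuse([vector_hits, keyword_hits]))
# -> ['doc1', 'doc3', 'doc9', 'doc7']  (docs found both ways rise to the top)
```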

4.2 System architecture

5. AnythingLLM

AnythingLLM is a full-stack application that lets you build a private ChatGPT from off-the-shelf commercial large language models or popular open-source large language models, combined with a vector-database solution, without the usual constraints: you can run it locally or host it remotely, and chat intelligently with any document you provide.

AnythingLLM organizes your documents into objects called workspaces. A workspace functions like a conversation thread, with document containerization added on top. Workspaces can share documents, but their contents do not interfere with or pollute each other, so each workspace keeps a clean context.

Official:/

github:/Mintplex-Labs/anything-llm

  • Some features of AnythingLLM

    • Multi-user instance support and rights management
    • Agents inside the workspace (browse the web, run code, etc.)
    • Embeddable chat windows customized for your website
    • Support for multiple document types (PDF, TXT, DOCX, etc.)
    • Manage documents in vector databases through a simple user interface
    • Two dialog modes: chat and query. Chat mode keeps a record of previous conversations. The query mode is for simple questions and answers about your documents.
    • The content of the referenced document will be provided in the chat
    • 100% cloud deployment ready.
    • Deploy your own LLM model.
    • Manage very large documents efficiently and cheaply: a huge document or transcript is embedded only once, saving up to 90% compared with other document chatbot solutions.
    • Full developer API for custom integrations!
  • Supported LLMs, Embedded Models and Vector Databases

    • LLMs: any open-source-compatible model, OpenAI, Azure OpenAI, Anthropic Claude 2, LM Studio, and LocalAI.
    • Embedding models: the AnythingLLM native embedder, OpenAI, Azure OpenAI, LM Studio, and LocalAI.
    • Vector databases: LanceDB (default), Pinecone, Chroma, Weaviate and QDrant.
  • Technology Overview

    • The whole project is designed as a monorepo consisting of three main parts: the collector, the front-end, and the server.
    • collector: Python tool for quickly converting online resources or local documents into LLM-ready formats.
    • frontend: ViteJS + React frontend for creating and managing all content available to LLM.
    • server: NodeJS + Express server that handles all vector database management and LLM interactions.

6. More LLM framework recommendations

More frameworks, covering LLM frameworks, RAG frameworks, and Agent frameworks, are recommended in the following articles:

  • How LLMOps is reshaping the O&M landscape for the new AI-native paradigm, with recommendations of mainstream LLMOps tools
  • Selected domestic large-model Agent application cases, with recommendations of mainstream open-source Agent frameworks

6.1 DB-GPT: Defining the Next Generation of Database Interaction with Privatized LLM Technology

DB-GPT is an open source AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents.

Its purpose is to build infrastructure for the large-model domain, making it easier and more convenient to build large-model applications around databases through technical capabilities such as multi-model management (SMMF), Text2SQL effect optimization, a RAG framework and its optimizations, Multi-Agents framework collaboration, and AWEL (agentic workflow orchestration).

In the Data 3.0 era, enterprises and developers can build their own proprietary applications with less code, based on models and databases.

  • github:/eosphoros-ai/DB-GPT

6.1.1 Architecture

  • The core capabilities consist of the following main components:

    • RAG (Retrieval-Augmented Generation): RAG is currently the most widely deployed practice and the area in most urgent demand. DB-GPT has implemented a RAG-based framework, and users can build knowledge applications on top of DB-GPT's RAG capabilities.

    • GBI: Generative BI is one of the core capabilities of the DB-GPT project, providing the basic technical foundation for building enterprise report analysis and business insights.

    • Fine-tuning framework: model fine-tuning is indispensable for any enterprise deploying in vertical, specialized domains. DB-GPT provides a complete fine-tuning framework that connects seamlessly with the DB-GPT project; in recent fine-tuning runs, accuracy on the Spider benchmark reached 82.5%.

    • Data-Driven Multi-Agents Framework: DB-GPT provides a data-driven, self-evolving Multi-Agents framework with the goal of continuous data-based decision making and execution.

    • Data Factory: the Data Factory mainly cleans and processes trustworthy knowledge and data for the large-model era.

    • Data sources: connect to various data sources to feed production business data seamlessly into DB-GPT's core capabilities.

6.1.2 RAG production deployment architecture

6.2 Langchain-Chatchat

  • github:/chatchat-space/Langchain-Chatchat

The project supports the mainstream open-source LLMs, embedding models, and vector databases on the market, enabling fully offline private deployment with open-source models. It also supports OpenAI GPT API calls and will continue to expand access to various models and model APIs.

The principle is as follows: load the file -> read the text -> split the text -> vectorize the text -> vectorize the question -> match the top-k text vectors most similar to the question vector -> add the matched text, together with the question, to the prompt as context -> submit it to the LLM to generate the answer. The pipeline is condensed into the sketch below.
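Here is that pipeline condensed into a runnable sketch. `toy_embed` is a hashing stand-in for a real embedding model, and the final LLM call is left as a comment; only the data flow mirrors the steps above, and all names are invented for illustration.

```python
# The load -> split -> vectorize -> top-k match -> prompt pipeline above,
# condensed. toy_embed() is a stand-in for a real embedding model.
import numpy as np

def toy_embed(texts, dim=256):
    vecs = np.zeros((len(texts), dim))
    for i, t in enumerate(texts):
        for w in t.lower().split():
            vecs[i, hash(w) % dim] += 1.0        # hashed bag-of-words
    return vecs / np.maximum(np.linalg.norm(vecs, axis=1, keepdims=True), 1e-9)

def split_text(text, size=200):                  # text segmentation
    return [text[i:i + size] for i in range(0, len(text), size)]

def build_prompt(document, question, k=3):
    chunks = split_text(document)
    sims = toy_embed(chunks) @ toy_embed([question])[0]    # both vectorized
    top = [chunks[i] for i in np.argsort(sims)[::-1][:k]]  # top-k most similar
    context = "\n".join(top)
    return f"Answer using the context.\nContext:\n{context}\nQuestion: {question}"

doc = "To reset your password, open Settings and choose Security. ..."
prompt = build_prompt(doc, "How do I reset my password?", k=1)
# `prompt` would now be submitted to the configured LLM to generate the answer
print(prompt)
```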

7. Summary (selection recommendations)

When choosing an AI application development platform, it is important to understand the features, community support, and ease of deployment of different platforms.

7.0 Strengths and weaknesses

Advantages and Disadvantages of MaxKB/Dify

  • Advantages

    • Flexible large-model access: offers access to a variety of large models and supports multiple API interfaces, so developers can flexibly select and switch models on demand, which is particularly important for applications that require high-performance models.

    • Powerful Chat Features: The Chat feature not only supports multi-round conversations, but also enhances the user experience through intelligent recommendations and contextual understanding for scenarios that require complex interactions.

    • Rich knowledge base support: Built-in knowledge base management system that supports the import and export of multiple data formats, making it easy for users to manage and utilize knowledge resources.

    • Efficient workflow design: the workflow is simple and intuitive, and supports drag-and-drop operation, so even non-technical people can get started quickly, greatly lowering the barrier to use.

    • Prompt IDE: The Prompt IDE tool provided allows developers to debug and optimize prompts more intuitively, improving development efficiency.

  • Disadvantages

    • Learning curve: although the interface is designed to be user-friendly, beginners still need some time to become familiar with its workflow and functions.

    • Community support: compared with some mature development platforms, community activity and available resources still need to improve, which may slow developers down when solving problems.

    • Degree of customization: While Dify offers a wealth of functionality, further development and tweaking may be required for certain highly customized requirements.

Advantages and Disadvantages of FastGPT/RagFlow

  • Advantages

    • Agent capabilities: the agents are powerful enough to automate complex tasks, reducing the need for human intervention; well suited to scenarios with a large volume of tasks to automate.

    • LLMOps Support: LLMOps support is provided, making it easier for developers to train, optimize, and deploy models, which is critical for continuous iteration and optimization of AI models.

    • Backend as a Service: provides Backend-as-a-Service functionality, simplifying the back-end development process so developers can focus more on front-end and business-logic development.

    • Powerful RAG engine: The RAG engine can efficiently process and retrieve large amounts of data for application scenarios that require fast response and high throughput.

  • Disadvantages

    • Functional complexity: FastGPT's functions are relatively complex; beginners may need more time to master its usage and techniques.

    • Deployment difficulty: Compared to some lightweight development platforms, the deployment process of FastGPT can be more complex and requires some technical background and experience.

    • User interface: although FastGPT is powerful, its user interface may not be as intuitive and friendly as some competitors', which may affect the user experience.

7.1 Selection of platforms based on needs

Choosing the right platform starts with defining your needs. Dify and FastGPT each have their own characteristics and suit different application scenarios.

  • MaxKB/Dify: Suitable for developers who need to build and deploy AI applications quickly, it provides a rich set of pre-built templates and integration tools that enable developers to get started quickly, and is especially suitable for beginners and teams that need to validate ideas quickly.

  • FastGPT/RagFlow: Suitable for enterprise-level users who need highly customized and complex workflows, providing a powerful RAG engine and Workflow orchestration that can handle complex business logic and data processing needs.

  • The following factors should be considered when selecting a platform:

    • Project Size: If it's a small project or startup team, the rapid deployment and ease of use of MaxKB/Dify may be more appropriate. If it's a large enterprise-level project, the power and customization of FastGPT/RagFlow is more appropriate.

    • Technology stack: Consider the team's existing technology stack and members' technical backgrounds. The technical implementation is different, and choosing a platform that matches the team's technology stack can reduce learning costs and development difficulty.

    • Functional requirements: Define the core functions required by the project, such as large model access, Chat function, knowledge base, etc. Dify and FastGPT have their own advantages in these functions, so choose according to the specific needs.

7.2 Community and support

Community support and resource availability are also critical to platform selection.

  • MaxKB/Dify: has an active community that provides a wealth of documentation, tutorials and sample code. Community members often share tips and solutions, so you can get quick help for any problems you encounter.

  • FastGPT/RagFlow: The community is relatively small, but offers a professional technical support team. For enterprise-level users, FastGPT provides customized technical support and consulting services to ensure the smooth running of the project.

  • The following factors should be considered when selecting a platform:

    • Community activity: an active community means more resources and faster problem solving, which benefits developers who need answers quickly.

    • Technical support: for enterprise-level users, professional technical support is crucial; choose the platform whose support offering matches your requirements.

7.3 Deployment and ease of use

Ease of deployment and use directly impacts development efficiency and cost.

  • MaxKB/Dify: provides an easy-to-use interface and one-click deployment features, enabling developers to quickly deploy applications to the cloud or locally. The documentation is detailed and suitable for beginners to get started quickly.

  • FastGPT/RagFlow: Relatively complex to deploy and requires some technical background and configuration. Provides powerful customization capabilities, suitable for users with high requirements for performance and functionality.

  • The following factors should be considered when selecting a platform:

    • Deployment Difficulty: MaxKB/Dify's deployment process is simple and suitable for developers who need to deploy quickly. FastGPT/RagFlow's deployment is relatively complex, but offers more configuration options.

    • Ease of use: MaxKB/Dify's user interface is user-friendly and easy to operate. FastGPT/RagFlow's user interface is relatively complex, but offers more features and customization options.
