LLM Application Practice: Kotaemon-1. Introduction and Deployment Practice for Documentation Q&A System

1.contexts

roots or stems of plantsqiang~The last couple of weeks have been focused on a very hot open source document quiz systemKotaemonnotice from8moon28In just two short weeks, from Sunday to the present.githubRapid growth of starred labels10K, so the plan is to dig deeper into the principles and mysteries.

This article focuses onKotaemonThe profile information, which covers the main features, is different from the traditional documentationRAGThe difference between the two, how they are deployed, and a demonstration of their effectiveness。

Please stay tuned for the follow-up content, which will be disassembled and read for the core modules.~

2.synopsis

KotaemonanRAG UIpage, mainly forDocQAend-users and build your ownRAG pipelinedeveloper。

2.1end user

1)Provides an easy-to-use interface based on theRAGQ&A's are simple and minimizedUIinterfaces

2)Supporting manyLLM APIprovider (company)(as ifOpenAI, Cohere)and locally deployedLLM(as ifollamacap (a poem)llama-cpp-python)

2.2developers

1)Provides a way to buildRAGDocumentation Q&Apipelineframeworks

2)utilizationGradioDevelopment, based on the providedUICan be customized and viewedRAG pipelineoperating procedure

3.Main features

1)Hosting your ownQA Web UI: Support multi-user login, support file private or public, support share with others assistance

2)managerialLLMcap (a poem)Embeddingmould: Support for localLLMand popularAPIprovider (company)

3)hybridRAG pipeline: Supports hybrid searching of full textbooks and vectors, andre-rankingto guarantee the quality of searches

4)Support for multimodal documents: support for documents containing images and tables.NMultiple documents for Q&A, support for multimodal document parsing

5)Document preview with advanced citations: The system provides specific citation information by default, safeguardingLLMAccuracy of Answers. Directly within the browser'sPDFThe viewer looks at the citations and highlights them.

6)Supporting complex reasoning methods: using question decomposition to answer complex/Multi-hop problem. Supports the use ofReAct、ReWooand othersAgentdraw inferences

7)Adjustable settingsUI: Adjustment of parameter information for retrieval and generation processes, including retrieval process and reasoning process settings

4.With traditional documentsRAGexclusionary rule

(1)existweb uiThe interface is straightforwardPDFPreview and pass theLLMThe reference callback method highlights related sentences, greatly aiding in scrutinizing theLLMcontext

(2)Support for complex reasoning methods. The goal is to support more complex reasoning methods based onAgentof reasoning frameworks such as problem splitting and multi-hop problems,React、ReWooet al. (and other authors)

(3)Support for changes in page configurationpromptYou can also adjust the search and reasoning modules you want to use.

(4)Expandability is good, based ongradioDeveloped to be easily added or removedUIComponent CustomizationRAG pipeline

5.Deployment

5.1end user

- available atgithub(used form a nominal expression)releasepage to download the latestand unzip it.

- go intoscripts, depending on the system installed, such aswindowsSystem double-clickrun_windows.bat，linuxsystemsbash run_linux.sh

- After installation, the program asks to startketem(used form a nominal expression)UIAnswer"Continue."

- If launched, it will automatically open in your browser with a default account name and password ofadmin/admin

5.2developers

5.2.1utilizationDockerInstallation Recommendations

# (of a computer) run

docker run -e GRADIO_SERVER_NAME=0.0.0.0 -e GRADIO_SERVER_PORT=7860 -p 7860:7860 -it --rm /cinnamon/kotaemon:latest

# interviewsuiAddress:http://localhost:7860/

5.2.2Source Code Installation

# Creating a Virtual Environment

conda create -n kotaemon python=3.10

conda activate kotaemon

# Download source code

git clone /Cinnamon/kotaemon

cd kotaemon

# Installation of dependencies

pip install -e "libs/kotaemon[all]"

pip install -e "libs/ktem"

# Updating the environment variables file.enve.g.API key

# (selectable) If you want to show it inside the browserPDFYou can downloadPDF_JS viewerUnzip tolibs/ktem/ktem/assets/prebuiltcatalogs

# openswebservice and use theadmin/adminlog in

python

5.2.3Application customization

Application data is saved by default in the./ktem_app_datafile, if you want to migrate to a new machine, just copy that folder.

Customizable for advanced users or special purposes.envcap (a poem)file

(1)set up

# Setting up the Document Storage Engine(The engine supports full text search)

KH_DOCSTORE=(Elasticsearch | LanceDB | SimpleFileDocumentStore)

# Setting up the vector storage engine(Support vector search)

KH_VECTORSTORE=(ChromaDB | LanceDB | InMemory)

# Enable or disable multimodalQA

KH_REASONINGS_USE_MULTIMODAL=True

# Add new reasoningpipelineor modify an existing

KH_REASONINGS = [

"",

]

)

(2).envset up

This file provides another way to set up models and credentials.

# Can be setOpenAIlinkage

OPENAI_API_BASE=/v1

OPENAI_API_KEY=<your OpenAI API key here>

OPENAI_CHAT_MODEL=gpt-3.5-turbo

OPENAI_EMBEDDINGS_MODEL=text-embedding-ada-002

5.2.4Setting up the localLLMup toEmbeddingmould

(1)testimonialsOllama OpenAICompatible Services

#mountingollamaand start the program, see/ollama/ollama

#pull model

ollama pull llama3.1:8b

ollama pull nomic-embed-text

(2)existResourcespageLLMscap (a poem)Embeddingseparate settingLLMcap (a poem)Embedding

api_key: ollama

base_url: http://localhost:11434/v1/

model: llama3.1:8b (for llm) | nomic-embed-text (for embedding)

(3)Use the local model forRAG

1)localizeLLMcap (a poem)EmbeddingThe model is set todefault

2)commander-in-chief (military)File Collectionhit the nail on the headEmbeddingSet as local model(for example: ollama

3)existRetrieval Settingpage, select the local model as theLLMCorrelation Score Modeling. If your machine can't handle a large number of simultaneousLLMParallel request, can be unchecked”Use LLM relevant scoring”

4)Now it's time to upload files for the document quiz experience.

6.Experience

seeing it a thousand times is better than seeing it once with one's hands~

roots or stems of plantsqiang~Deployed using source code installation, usingopenai(used form a nominal expression)LLMmouldgpt-4o-minicap (a poem)Embeddingmouldtext-embedding-3-small(How to use the free version ofopenaicarry outapiExperience, can be contacted by private message~). Secondly, the use ofMindSearchof the paper for test validation.

6.1Building Document Indexing Information

6.1.1Upload a document

6.1.2utilizationsimpleinference strategy

simpleThe inference strategy corresponds to thehit the nail on the headFullQAPipeline. Issues: “what are the components of MindSearch?”，The effect is as follows：

6.1.3utilizationcomplexinference strategy

complexThe inference strategy corresponds to thehit the nail on the headFullDecomposeQAPipeline, i.e., the complex problem is broken down into simple sub-problems. Problems: “Please describe the performance of MindSearch on both open-source and closed-source datasets.?”

6.1.4utilizationreactinference strategy

Reactis anRAG Agenttechnique that takes the user's problem and designs a plan and iteratively loops through its execution to satisfy a specific end conditioning.ReactCalls can be made in conjunction with tools, such as search engines,WikiEncyclopedias, etc. Question:”Tell me somethong about 'University of Science and Technology of China'”

6.1.5utilizationReWooinference strategy

ReWooIt's also aRAG Agenttechniques, stage one develops a step-by-step plan, stage two addresses each of the sub-steps, tools can also be used to help with the reasoning process, and upon completion of all stages, theReWooAnswers will be summarized. Question:”Tell me somethong about 'University of Science and Technology of China' and 'shanghai ai Laboratory '”

6.2construct (sth abstract)GraphRAGIndexing Information

KotaemonIntegrates Microsoft's previously open-sourcedGraphRAGframework, which consists of steps such as graph and index construction and graph retrieval. Problems: “the author’s of this paper”

7.summarize

One word is enough.~

This article focuses on the open source document Q&A systemKotaemonThe introduction of the document, including the main functional features, with the traditional documentRAGThe differences, deployment tutorials, and effects experience.

at the present timeKotaemonThe Chinese language support is not friendly, but since it can be used through theollamadeployLLMcap (a poem)Embeddingmodel, so supporting Chinese language is also relatively easy to develop and integrate.

A follow-up series will do a detailed source code dimensional analysis of the retrieval and inference modules in the framework, so stay tuned!~

If you have questions about the deployment process or are unsuccessful, or would like to get free access to theopenaiThe guest, can communicate by private message.

If you have any questions or want to cooperate with the customer, you can communicate by private message.

8.consultation

(1) Kotaemonrepository: /Cinnamon/kotaemon