Preamble
Training a large model from scratch is a high-investment, low-return undertaking, and in practice it is done by a handful of giant companies; what most of us do is fine-tuning, building Agents, and the like on top of an existing large model. There is no doubt about the capability of large models, but they can fall short on real-time problems or on proprietary-domain problems. We therefore need tools that empower the large model and give it something to grip, so that the model and the real world line up at the right granularity and we get better use out of it.
1. Constructing the Agent
Here is a brief description of the Agent's structure. It follows the ReAct pattern: a system_prompt tells the large model which tools it can call and in what format it should respond. Each time the user asks a question that requires a tool, the large model is called twice: the first call parses the user's question and selects the tool and its arguments, and the second call integrates the tool's result with the user's question into the final answer. The skeleton below implements this ReAct structure.
class Agent:
    def __init__(self, path: str = '') -> None:
        pass

    def build_system_input(self):
        # Build the system prompt described above
        pass

    def parse_latest_plugin_call(self, text):
        # Parse the first model response into the selected tool and its arguments
        pass

    def call_plugin(self, plugin_name, plugin_args):
        # Call the selected tool
        pass

    def text_completion(self, text, history=[]):
        # Integrate the two model calls into the final answer
        pass
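To make the two-call flow concrete, here is a minimal sketch of how the skeleton above might be filled in. It is only illustrative, not the original implementation: the Action / Action Input output format, the single get_weather stub tool, and the model.chat(..., meta_instruction=...) call (which assumes InternLM2's remote-code chat helper) are all assumptions.

import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

class Agent:
    def __init__(self, path: str = '') -> None:
        # Load the chat model; InternLM2 provides a chat() helper via trust_remote_code
        self.tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
        self.model = AutoModelForCausalLM.from_pretrained(
            path, trust_remote_code=True, torch_dtype=torch.float16).cuda().eval()
        self.system_prompt = self.build_system_input()

    def build_system_input(self):
        # Tell the model which tools exist and how to ask for them
        return ('You have access to the tool get_weather (parameters: {"city": str}).\n'
                'When you need it, answer exactly in the form:\n'
                'Action: get_weather\nAction Input: {"city": "..."}')

    def parse_latest_plugin_call(self, text):
        # Pull the tool name and JSON arguments out of the first model response;
        # assumes the arguments are the last thing in the response
        i, j = text.rfind('Action:'), text.rfind('Action Input:')
        if i == -1 or j == -1:
            return '', {}
        name = text[i + len('Action:'):j].strip()
        args = json.loads(text[j + len('Action Input:'):].strip())
        return name, args

    def call_plugin(self, plugin_name, plugin_args):
        # Dispatch to the selected tool; the tool here is a stub for illustration
        if plugin_name == 'get_weather':
            return f"Sunny, 25 degrees in {plugin_args.get('city', 'unknown')}"
        return 'unknown tool'

    def text_completion(self, text, history=[]):
        # First call: the model decides whether (and how) to use a tool
        response, history = self.model.chat(
            self.tokenizer, text, history, meta_instruction=self.system_prompt)
        name, args = self.parse_latest_plugin_call(response)
        if name:
            # Second call: feed the tool's result back and ask for the final answer
            observation = self.call_plugin(name, args)
            response, history = self.model.chat(
                self.tokenizer, f"Observation: {observation}\nPlease answer: {text}",
                history, meta_instruction=self.system_prompt)
        return response, history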
2. Call example
The InternLM2-chat-7B model was used for Q&A testing:
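As a rough idea of how such a test can be invoked, a minimal call based on the hedged sketch above might look like this (the model path and question are placeholders):

# Hypothetical invocation of the Agent sketch; path and question are placeholders
agent = Agent('/path/to/internlm2-chat-7b')
response, history = agent.text_completion('What is the weather like in Beijing today?')
print(response)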
3. Summary
The Agent was tested on a 4060 Ti with 16 GB of VRAM, consuming about 15 GB of video memory. Overall, the InternLM2 (书生·浦语) 7B model handles basic Q&A tasks without much trouble, but its grasp of context when answering questions is still lacking; it is not especially smart.