On October 8, 2024, Spring AI released another update. Although the current version (1.0.0-M3) is still unstable, I will keep following these developments and digging into the technical underpinnings from a practical, application-level perspective. Spring AI is still a niche project: it has not yet gone through the multi-year maintenance and stabilization process of the open source community, which stands in stark contrast to the much more mature Spring Framework. Even the latest snapshot builds (1.0.0-SNAPSHOT) are hard to find in the usual Maven repositories; you still need to pull them from Spring's JFrog repository.
Okay, let's stop beating around the bush and get straight to the point. Version 1.0.0-M3 brought a number of important updates, and I'm going to go through these features in detail. Today's focus is a deep dive into the concept of Advisors, since it has many similarities with techniques we already use in our day-to-day work, which should make the material easier to absorb. I believe that by the end of this walkthrough you'll have a better grasp of one of Spring AI's core features. Now, let's get started!
What Are Spring AI Advisors?
The core function of Spring AI Advisor is to intercept and potentially modify the components of the chat request and response streams in an AI application. The AroundAdvisor is a key player in this system, allowing developers to dynamically transform or utilize information during these interactions.
The main advantages of using Advisor include:
- Encapsulation of repetitive tasks: package common generative AI patterns into reusable units, simplifying the development process.
- Data transformation: enrich the data sent to the language model (LLM) and optimize the format of the response returned to the client, improving the quality of the interaction.
- Portability: create reusable transformation components that work across different models and use cases, improving code flexibility and adaptability.
You might think this sounds a lot like the way we use AspectJ in Spring. In fact, as you read through today's article, you'll see that it's largely a change of name: the core idea is the same, enhancing the capabilities of the application.
Advisors vs. Advised
Let's briefly clarify the terms "Advisors" and "Advised". They are not directly related to each other; we simply run into both of them frequently when reading the source code. Advisors are the enhancement classes we create, each responsible for handling the request chain in a different way. Advised, by contrast, is an adjective describing a class that is no longer an ordinary class but has been enhanced to take on new characteristics; although this also enhances the requesting class, the enhancement is realized mainly through attribute migration.
The following illustration will help to further understand this.
How Advisors Work
If you've written AspectJ annotated classes before, then you should be able to easily surmise how Advisors work. In the Advisor system, individual Advisors operate in a chained structure, with each Advisor in the sequence having the opportunity to process incoming requests and outgoing responses. This chained processing mechanism ensures that each Advisor can add its own logic to the request and response streams, allowing for more flexible and customizable functionality.
To help understand the process, here's a quick look at the official flowchart provided. This diagram shows in detail how individual Advisors interact in the request chain and how they work together to enhance the overall functionality.
Let me explain the whole process in general:
First, we encapsulate the various request parameter configurations, as shown in the earlier screenshot of AdvisedRequest; I won't repeat the details here. Next, each Advisor in the chain processes the request, possibly modifying it, and forwards the execution flow to the next Advisor in the chain. It is worth noting that an Advisor may also choose not to invoke the next entry, thereby preventing the request from being passed on.
Eventually, the last Advisor sends the request to the Chat Model. The Chat Model's response is then passed back up through the Advisor chain along the original request path, combining the original context with the advised context. Each Advisor has the opportunity to process or modify this response to ensure it meets expectations. Finally, the system returns an AdvisedResponse to the client.
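The request/response flow described above is essentially the chain-of-responsibility pattern. Here is a minimal, framework-free sketch in plain Java (all names here are invented for illustration and are not Spring AI types): each advisor can rewrite the request on the way in, post-process the response on the way out, or stop the chain entirely.

```java
import java.util.List;

public class AdvisorChainDemo {

    // A simplified stand-in for an "around" advisor: it receives the request
    // plus the rest of the chain, and returns the (possibly modified) response.
    interface Advisor {
        String around(String request, Chain chain);
    }

    // The chain hands the request to the next advisor, or to the terminal
    // "chat model" when no advisors remain.
    record Chain(List<Advisor> advisors, int index) {
        String next(String request) {
            if (index >= advisors.size()) {
                return "model-answer(" + request + ")"; // terminal: the chat model
            }
            return advisors.get(index).around(request, new Chain(advisors, index + 1));
        }
    }

    static String call(String request, List<Advisor> advisors) {
        return new Chain(advisors, 0).next(request);
    }

    // Two sample advisors: one decorates the request and response,
    // one short-circuits the chain for certain inputs.
    static String demo(String request) {
        Advisor decorate = (req, chain) -> chain.next("[decorated]" + req) + "[post]";
        Advisor guard = (req, chain) ->
                req.contains("forbidden") ? "blocked" : chain.next(req);
        return call(request, List.of(decorate, guard));
    }

    public static void main(String[] args) {
        System.out.println(demo("hi"));             // model-answer([decorated]hi)[post]
        System.out.println(demo("forbidden word")); // blocked[post]
    }
}
```

Note how the blocking advisor never invokes `chain.next(...)`, which is exactly the "prevent the request from being passed on" behavior mentioned above.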
Next, we'll dive into how to actually use Advisor.
Using Advisor
Embedded Advisor
As part of Spring AI, several official example Advisors are built into the system. Though few in number, each has distinct features that demonstrate the practical application scenarios of Advisors well. Let's go through these built-in Advisors one by one to understand the role each plays in the request processing chain.
- MessageChatMemoryAdvisor is a commonly used class that we mentioned before and plays an important role in the request handling process. The main function of this Advisor is to add the questions asked by the user and the answers given by the model to the history, thus creating a contextual memory mechanism. In this way, the system is able to better understand the user's needs and provide more coherent and relevant responses.
- It is important to note that not all AI models support this type of contextual memory storage and management. Some models may not implement the appropriate history functionality, so when using MessageChatMemoryAdvisor, it is critical to ensure that the model being used supports it.
- PromptChatMemoryAdvisor builds further on MessageChatMemoryAdvisor; the difference lies in how the contextual chat history is handled. Unlike MessageChatMemoryAdvisor, PromptChatMemoryAdvisor does not pass the chat history directly into the messages parameter, but instead folds it into the systemPrompt text. This design lets the system add contextual history memory regardless of whether the model in use supports the messages parameter.
- The main function of QuestionAnswerAdvisor is to perform RAG (Retrieval-Augmented Generation) retrieval, a process that involves querying the knowledge base. When a user asks a question, QuestionAnswerAdvisor first searches the knowledge base and appends the referenced text that matches the user's question, providing richer and more accurate context for the generated answer.
- In addition, this Advisor sets a default prompt designed to ensure the quality and relevance of the answer. If no matching text can be found in the knowledge base, the system will refuse to answer the user's question.
- The core function of SafeGuardAdvisor is to perform sensitive-word checks, ensuring security and compliance when processing user input. When a user submits input that triggers the sensitive-word mechanism, SafeGuardAdvisor immediately intercepts the request mid-chain, avoiding an unnecessary call to the large model.
- SimpleLoggerAdvisor: a tool for log printing that we've already practiced with and dug into before, so I won't go into it again here.
- VectorStoreChatMemoryAdvisor: This component implements a long-term memory function by storing each user question and each model answer in a vector database. Every time the user asks a question, the system performs a retrieval and appends the retrieved information to the system prompt, giving the large model more accurate contextual hints. Be aware, however, that without proper maintenance of the chat_memory_conversation_id, writes and retrievals grow without bound, which can lead to potentially catastrophic bugs. Managing and updating this identifier is therefore critical to avoiding system instability and data pollution.

Here it is worth focusing on the chat_memory_conversation_id parameter, a key element in all the memory-related Advisors. It is important to maintain this parameter properly for each user; otherwise a new ID is generated by default on every call, creating a lot of garbage data in the vector database. In addition, once this parameter is maintained, it can be used by background jobs to clean up old data in the vector database. Note that because records are keyed by chat_memory_conversation_id rather than a simple ID, a query is required before deletion and cleanup.
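The conversation-id advice above can be sketched as a small id-management helper: keep one stable id per user, and reuse that same id later when cleaning the vector store. All class and method names here are hypothetical, framework-free illustrations.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class ConversationIdDemo {

    // Keep a stable conversation id per user instead of letting a fresh id be
    // generated on every request; otherwise each call writes history under a
    // new key and the vector store fills up with orphaned data.
    private final Map<String, String> userToConversation = new HashMap<>();

    public String conversationIdFor(String userId) {
        return userToConversation.computeIfAbsent(userId, id -> UUID.randomUUID().toString());
    }

    // Because history is keyed by conversation id (not a simple row id),
    // cleanup means: look up the id for the user, then delete everything
    // stored under that id in the vector store.
    public String releaseConversation(String userId) {
        return userToConversation.remove(userId); // id to use in the store's delete query
    }

    public static void main(String[] args) {
        ConversationIdDemo ids = new ConversationIdDemo();
        String first = ids.conversationIdFor("user-42");
        String second = ids.conversationIdFor("user-42");
        System.out.println(first.equals(second)); // true: stable across calls
    }
}
```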
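To make the MessageChatMemoryAdvisor bullet above more concrete, here is a tiny framework-free sketch of the underlying idea: an in-memory history keyed by conversation id and truncated to a fixed window. ChatMemoryDemo and its methods are made-up names, not the Spring AI ChatMemory API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ChatMemoryDemo {

    // A toy in-memory chat history, keyed by conversation id, keeping only
    // the most recent `maxMessages` entries.
    private final Map<String, List<String>> store = new HashMap<>();
    private final int maxMessages;

    public ChatMemoryDemo(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    public void add(String conversationId, String message) {
        List<String> history = store.computeIfAbsent(conversationId, id -> new ArrayList<>());
        history.add(message);
        while (history.size() > maxMessages) {
            history.remove(0); // evict the oldest entry once the window is full
        }
    }

    public List<String> history(String conversationId) {
        return store.getOrDefault(conversationId, List.of());
    }

    public static void main(String[] args) {
        ChatMemoryDemo memory = new ChatMemoryDemo(3);
        memory.add("user-1", "user: hi");
        memory.add("user-1", "assistant: hello!");
        memory.add("user-1", "user: what's Spring AI?");
        memory.add("user-1", "assistant: a Spring project for AI apps");
        // Only the 3 most recent messages survive.
        System.out.println(memory.history("user-1"));
    }
}
```

On each call, an advisor following this pattern would prepend `history(conversationId)` to the outgoing messages and append the new question/answer pair afterwards.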
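The QuestionAnswerAdvisor flow described above (retrieve, augment, or refuse) can also be sketched without the framework. The real advisor uses embedding similarity against a vector store; the keyword matching, document text, and refusal message below are stand-ins for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class RagDemo {

    // A toy "retrieval" step: keep any document sharing a meaningful word
    // with the question (a crude substitute for vector similarity search).
    static List<String> retrieve(String question, List<String> documents) {
        List<String> hits = new ArrayList<>();
        for (String doc : documents) {
            for (String word : question.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (word.length() > 3 && doc.toLowerCase(Locale.ROOT).contains(word)) {
                    hits.add(doc);
                    break;
                }
            }
        }
        return hits;
    }

    // Augment the user question with retrieved context, or refuse when the
    // knowledge base has nothing relevant (mirroring the default behavior
    // described in the bullet above).
    static String augment(String question, List<String> documents) {
        List<String> context = retrieve(question, documents);
        if (context.isEmpty()) {
            return "I'm sorry, the knowledge base has no information about this question.";
        }
        return question + "\n\nAnswer using only this context:\n" + String.join("\n", context);
    }

    public static void main(String[] args) {
        List<String> kb = List.of("Spring AI provides Advisors for request interception.");
        System.out.println(augment("What are Spring Advisors?", kb));
        System.out.println(augment("How do I bake bread?", kb));
    }
}
```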
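Finally, SafeGuardAdvisor's short-circuit behavior boils down to a check before the model call. A minimal sketch follows, with an invented word list and refusal message; this is the pattern, not the real SafeGuardAdvisor API.

```java
import java.util.List;

public class SafeGuardDemo {

    // Check the input against a sensitive-word list and short-circuit before
    // the (expensive) model call when anything matches.
    static String guard(String userInput, List<String> sensitiveWords) {
        for (String word : sensitiveWords) {
            if (userInput.contains(word)) {
                return "I'm unable to respond to that request."; // never reaches the model
            }
        }
        return callModel(userInput);
    }

    // Placeholder for the downstream chat-model call.
    static String callModel(String input) {
        return "model-answer(" + input + ")";
    }

    public static void main(String[] args) {
        List<String> banned = List.of("secret-project");
        System.out.println(guard("tell me about secret-project", banned));
        System.out.println(guard("tell me a joke", banned));
    }
}
```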
Customize Advisor
In fact, we've already implemented a simple logging Advisor before. Today, we're going to build a more advanced Advisor based on the Re-Reading (Re2) technique. The process is not that complicated to understand; its core lies in wrapping the request's question ahead of time.
For those of you who are interested, I recommend checking out the implementation and effects of the Re2 technique; details can be found in this paper: Re2 Technology Realization.
Also, regarding prompt optimization, if you're particularly interested in this area, I recommend browsing the free, open-source blog documentation, which has many valuable resources to consult: Prompt word guide.
Next, without further ado, let's take a look at the official sample code:
public class ReReadingAdvisor implements CallAroundAdvisor, StreamAroundAdvisor {

    private static final String DEFAULT_USER_TEXT_ADVISE = """
            {re2_input_query}
            Read the question again: {re2_input_query}
            """;

    @Override
    public String getName() {
        return this.getClass().getSimpleName();
    }

    @Override
    public int getOrder() {
        return 0;
    }

    private AdvisedRequest before(AdvisedRequest advisedRequest) {
        String inputQuery = advisedRequest.userText(); // original user query
        Map<String, Object> params = new HashMap<>(advisedRequest.userParams());
        params.put("re2_input_query", inputQuery);
        return AdvisedRequest.from(advisedRequest)
                .withUserText(DEFAULT_USER_TEXT_ADVISE)
                .withUserParams(params)
                .build();
    }

    @Override
    public AdvisedResponse aroundCall(AdvisedRequest advisedRequest, CallAroundAdvisorChain chain) {
        return chain.nextAroundCall(before(advisedRequest));
    }

    @Override
    public Flux<AdvisedResponse> aroundStream(AdvisedRequest advisedRequest, StreamAroundAdvisorChain chain) {
        return chain.nextAroundStream(before(advisedRequest));
    }
}
As you can see, there isn't much code in this implementation: just a global text template and a simple wrapping of the request before it is sent. The core idea is to improve the validity and contextual relevance of the request through template preprocessing. According to Re2's official description, this approach not only keeps the code simple but also significantly improves the model's answering effectiveness.
Advisor Shared Parameters
In previous lectures, for example, we discussed messageChatMemoryAdvisor, whose defaults were written as fixed parameter configuration at the time of the bean declaration. Nonetheless, we can still configure it dynamically by passing in parameters, and this flexibility allows us to adapt the parameters to actual needs. You can pass in any parameter you need and then read and apply it in an overridden method, making the Advisor more flexible and adaptable to different scenarios.

Here, we'll take the official messageChatMemoryAdvisor as an example and show how it was previously written, to better understand this configuration process.
ChatDataPO functionGenerationByText(@RequestParam("userInput") String userInput) {
    OpenAiChatOptions openAiChatOptions = OpenAiChatOptions.builder()
            .withModel("hunyuan-pro").withTemperature(0.5).build();
    String content = chatClient // the injected ChatClient instance
            .prompt()
            .system("Please, as Xiaoyu's little AI assistant, format the data returned by the tools to respond to the user's questions in a friendly way. Develop a travel guide with flight, hotel, and train information.")
            .user(userInput)
            .options(openAiChatOptions)
            .advisors(messageChatMemoryAdvisor, myLoggerAdvisor, promptChatKnowledageAdvisor)
            .call()
            .content();
    // ... build and return the ChatDataPO from content
}

// The configuration class is as follows:
@Bean
MessageChatMemoryAdvisor messageChatMemoryAdvisor() {
    InMemoryChatMemory chatMemory = new InMemoryChatMemory();
    return new MessageChatMemoryAdvisor(chatMemory, "123", 10);
}
The same effect can be achieved by passing the parameters at call time instead; redundant and repetitive code is omitted below to show the main logic more clearly:

.advisors(messageChatMemoryAdvisor, myLoggerAdvisor, promptChatKnowledageAdvisor)
.advisors(advisor -> advisor.param("chat_memory_conversation_id", "678")
        .param("chat_memory_response_size", 100))
In this way, the Advisor can read the parameters in real time and apply them before the call is made. Spring AI officially encapsulates the commonly used parameter-reading logic into a series of efficient, easy-to-use interfaces; we just need to call these predefined helpers directly to quickly implement the required operations.
Updating parameters
In addition to setting some shared parameters before we start calling, we can dynamically adjust these parameters during runtime to better adapt to real-time changing requirements and environments:
@Override
public AdvisedResponse aroundCall(AdvisedRequest advisedRequest, CallAroundAdvisorChain chain) {
    advisedRequest = advisedRequest.updateContext(context -> {
        context.put("aroundCallBefore" + getName(), "AROUND_CALL_BEFORE " + getName()); // Add multiple key-value pairs
        context.put("lastBefore", getName()); // Add a single key-value pair
        return context;
    });
    // Method implementation continues...
}
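To see why this update pattern is useful, here is a framework-free sketch of a shared context map flowing through two advisors. SharedContextDemo.updateContext only mimics the shape of the update callback shown above and is not the Spring AI type; advisor names are invented.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

public class SharedContextDemo {

    // Apply a mutation function to a copy of the context and return the
    // updated map, so later advisors can read what earlier ones wrote.
    static Map<String, Object> updateContext(Map<String, Object> context,
                                             UnaryOperator<Map<String, Object>> update) {
        return update.apply(new LinkedHashMap<>(context));
    }

    public static void main(String[] args) {
        Map<String, Object> context = new LinkedHashMap<>();

        // "First advisor" records that it ran.
        context = updateContext(context, c -> {
            c.put("lastBefore", "LoggingAdvisor");
            return c;
        });

        // "Second advisor" reads the earlier entry and adds its own.
        context = updateContext(context, c -> {
            c.put("seenBy", c.get("lastBefore") + " -> MemoryAdvisor");
            return c;
        });

        System.out.println(context);
    }
}
```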
That's it for today's introduction to Advisors; I hope it has given you some new inspiration and food for thought.
Summary
Spring AI Advisors provides a powerful and flexible approach designed to significantly enhance the functionality and performance of your AI applications. By leveraging this API, you can create more complex, reusable, and easy-to-maintain AI components that improve development efficiency and system scalability.
Whether you're implementing custom logic to meet specific business needs, managing conversation history to optimize the user experience, or improving model inference for more accurate results, Advisors provides you with a clean and efficient solution. This flexibility allows developers to respond quickly to changes while keeping code clean and readable, which in turn provides a smoother and smarter experience for users.
I'm Rain, a Java server-side developer exploring the mysteries of AI technology. I love technical communication and sharing, and I'm passionate about the open-source community. I'm also a Tencent Cloud Creative Star, an Alibaba Cloud Expert Blogger, a Huawei Cloud Enjoyment Expert, and a Nuggets Excellent Author.
💡 I won't be shy about sharing my personal explorations and experiences on the path of technology, in the hope that I can bring some inspiration and help to your learning and growth.
🌟 Welcome to the effortless drizzle! 🌟