AI Practical: Spring AI + Hybrid Hands-on takes you to realize enterprise-grade stable and deployable AI business intelligences

preamble

In the previous content, we explained in detail the basic usage of Spring AI and its underlying principles. If there are still partners confused about this, welcome to refer to the following article, in-depth study and further master the relevant knowledge:/guoxiaoyu/p/18441709

Today, we will focus on the on-the-ground performance of AI in practical applications, especially how Spring AI can help enterprises achieve functional optimization as well as promote the deep integration of AI and business. We will take the current intelligent body track, which is widely pursued by large manufacturers, as an entry point to explore its application in practical scenarios. Considering that many students may have already been exposed to intelligent bodies, using this topic as the basis for discussion can more effectively help you understand the practical operation and effect of related concepts and technologies.

Therefore, in this section, we will use intelligences as a starting point and walk you through the process of easily implementing a locally stable and deployable intelligences solution. Along the way, I will go through each step in detail to ensure that you can follow along smoothly. In addition, at the end of the chapter, I will analyze the pros and cons of this solution versus existing intelligences based on my understanding to help you fully understand the pros and cons of the different options.

preliminary

Of course, Spring AI integrates interface implementations from many well-known companies. If you really want to use OpenAI's interfaces, you can consider the domestic mashup API. mashup API is compatible with OpenAI's interface specification, which means that you can directly use the official SDK provided by OpenAI to call the mashup's big model. This design greatly simplifies the migration process. You only need to replace the base_url and api_key with the configuration related to the mashups without additional modifications to your existing application. In this way, you will be able to seamlessly switch your application to the hybrid big model and enjoy the powerful AI features and support.

Request API KEY

There is no need to worry at all, after my personal testing, all interfaces are currently compatible and no anomalies or problems have been found. Applications can be made through the following link:Hybrid API Application Address

Make sure to request the relevant API KEY under your personal account.

Please be sure to keep your API KEY information safe, as it will become very important during subsequent use.

Docking Documents

Understanding some of the considerations here is not mandatory, as we don't need to interface directly with Hunyuan. In fact, we can use OpenAI-compatible interfaces directly in Spring AI, which can greatly simplify our process. If you are interested in learning more about the API documentation, you can find the interface documentation, which provides detailed instructions and guidance:API documentation

Please note in particular that the models that support function callbacks are limited to the following three, as the intelligences need to call the relevant plugins or workflows at runtime. This means that no other models than these three have this support. Please make sure to consider this when selecting a model.

Please note that at this time, hybrid has not yet released pre-paid large model resource packs, and users can only make pre-purchases for concurrent packs. For billing details, please see the graphic below.

Project Configuration

Next, we will continue to use the previous Spring AI demo project and make the necessary changes to it. The specific Maven POM dependencies that need to be adjusted are shown below:

<dependency>
    <groupId></groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>

As shown in the figure, I chose to use the functioncall model in my first profile because of its relatively low cost. However, I later found that the model did not perform well in recognizing system prompt words, and I switched to the pro model later on, so you can make your choice accordingly to your specific needs and budget.

functioncall is not sensitive to the prompt word but the result of the function callback can be well parsed, pro is sensitive to the prompt word but the result of the function callback he does not answer directly, has been outputting the contents of the planner but just do not reply to the user. There will be a detailed explanation later.

file for global configuration, and all ChatClients follow this setup. A significant benefit of this is that developers can easily switch to a different AI big model vendor without making any changes at the code level, simply by changing the appropriate dependencies in the Maven POM file. This flexibility not only improves project maintainability, but also facilitates model replacement and upgrades.

Spring AI Intelligence Body Building

Now, assuming you have done all the preparatory work, we can start building our own intelligences. First, we will focus on individually customizing the configuration parameters. As mentioned before, the documentation is a global setting that applies to all ChatClients, but each model actually has its own specific domain and application scenario. Therefore, we first need to configure how to personalize each interface to ensure that the model behaves more closely to the actual business requirements.

Personalized Configuration Model

ordinary call

First, let's observe how the code should be written under normal circumstances:

@PostMapping("/ai-function")
ChatDataPO functionGenerationByText(@RequestParam("userInput") String userInput) {
    String content =
            .prompt()
            .system("You are the hard-working Xiaoyu, a Java server-side coder, studying the mysteries of AI technology. Passionate about technical communication and sharing, and passionate about the open source community.")
            .user(userInput)
            .advisors(messageChatMemoryAdvisor)
            .functions("CurrentWeather")
            .call()
            .content();
    ("content: {}", content);
    ChatDataPO chatDataPO = ().code("text").data(().text(content).build()).build();;;
    return chatDataPO;
}

As shown in the figure, if we set a breakpoint ahead of time before we initiate a request, we'll be able to see at that moment thechatOptions parameter, which represents our default configuration settings. Therefore, our main goal is to explore how to configure thechatOptions parameters are effectively modified.

In the course of testing the cue words, we found that thefunctioncall Models forsystem The response of the cue words is not significant and does not seem to work as expected. However, a significant advantage of this model is that it supports function callback functionality (as explained in detail in the previous sections), and in addition, with thepro model compared to thefunctioncall The model is also relatively inexpensive to use, which makes it a more cost-effective option in some cases.

special call

The model can be configured individually in order to tailor the responses to the cue word. If you want to adjust a specific method instead of using a global setting like the one in this section, you can do so by modifying the corresponding parameter yourself. The configuration is shown below:

// Omit the repetitive code
OpenAiChatOptions openAiChatOptions = ()
        .withModel("hunyuan-pro").withTemperature(0.5f).build();
String content =
        .prompt()
        .system("You are the hard-working Xiaoyu, a Java server-side coder, studying the mysteries of AI technology. Passionate about technical communication and sharing, and passionate about the open source community.")
        .user(userInput)
        .options(openAiChatOptions)
        .advisors(messageChatMemoryAdvisor)
        // Omit duplicate code
}

Here, we simply configure the relevant options to complete the setup. Next, we can check the relevant configurations in the section on breakpoints to make sure that these settings have taken effect and are working properly.

In the same way of writing, for example, the pro model we set up before appears to be more friendly in handling system prompt words compared to the function-call model.

Paths of reflection

In fact, in the vast majority of intelligences, these thought paths are not displayed, only the Baidu side of the intelligences system presents them to the user. These thought paths are generated and returned by the big model, so I didn't do any additional configuration here. In fact, we can choose to return these paths as well, and the source code for that is here:

private void writeWithMessageConverters(Object body, Type bodyType, ClientHttpRequest clientRequest)
        throws IOException {

//omit code
    for (HttpMessageConverter messageConverter : ) {
        if (messageConverter instanceof GenericHttpMessageConverter genericMessageConverter) {
            if ((bodyType, bodyClass, contentType)) {
                logBody(body, contentType, genericMessageConverter);
                (body, bodyType, contentType, clientRequest);
                return;
            }
        }
        if ((bodyClass, contentType)) {
            logBody(body, contentType, messageConverter);
            (body, contentType, clientRequest);
            return;
        }
    }
//omit code
}

As you can see in the figure, we currently only have a simple printing operation and have not implemented a message converter. Considering that our business system does not need to display this information to customers, we believe that the current implementation is sufficient.

You can see the printout of the Think Path information as shown below:

[453] -| Writing [ChatCompletionRequest[messages=[ChatCompletionMessage[

Omitting the rest, the key code is as follows:

role=SYSTEM, name=null, toolCallId=null, toolCalls=null, refusal=null], ChatCompletionMessage[rawContent=What's the weather like in Changchun? , role=USER, name=null, toolCallId=null, toolCalls=null, refusal=null], ChatCompletionMessage[rawContent=Use the 'CurrentWeather' function to get the weather in Changchun. The user wants to know the current weather conditions in Changchun. The user's request is about getting weather information for a specific location, which matches the 'CurrentWeather' function provided by the tool.

,## Omit others

Configuring Plug-ins

I've previously explained in detail in a video how Smartbody creates custom plugins. In this practice, we will continue to utilize the Baidu Weather plugin to get real-time weather information. However, unlike before, this time we will integrate this functionality into the Spring AI project.

Database Configuration

Every business system usually comes with its own database to better serve its users. To demonstrate this, we will create a MySQL example with the specifics of getting the area code values and passing them to the API for calling. During this process, you can perform various operations on the database through plugins, but here we will mainly focus on the demonstration of queries.

In this example, I will continue to use Tencent Cloud Lightweight Application Server to build a MySQL standalone environment. After successfully setting up the environment, we will move on to the subsequent operations. Please make sure that all the necessary configurations and settings are completed before you start so that database queries and API calls can be made smoothly.

The following POM file dependencies are relevant to the configuration in question:

<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>5.1.49</version>
</dependency>
<dependency>
    <groupId></groupId>
    <artifactId>mybatis-plus-spring-boot3-starter</artifactId>
    <version>3.5.7</version>
</dependency>

The database connection configuration information is as follows:

=jdbc:mysql://ip:3306/agent?useSSL=false&serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8
=agent
=password
-class-name=

Import data

I have successfully completed the import of the data provided by Baidu Maps, as shown in the figure below:

Operational databases

Next, you can simply perform database operations directly inside the plugin class. AboutSearchHttpAK entity class, you can copy it directly from the Java SDK provided by Baidu Maps without additional instructions. Also, please note that theareaInfoPOMapper You need to do your own bean injection in the configuration class to make sure it works.

public class BaiDuWeatherService implements Function<Request, Response> {

    AreaInfoPOMapper areaInfoPOMapper;
    
    public BaiDuWeatherService(AreaInfoPOMapper areaInfoPOMapper) {
         = areaInfoPOMapper;
    }
    @JsonClassDescription("location:City Address,for example：Changchun prefecture level city in Jiangsu")
    public record Request(String location) {}
    public record Response(String weather) {}

    public Response apply(Request request) {
        SearchHttpAK snCal = new SearchHttpAK();
        Map params = new LinkedHashMap<String, String>();
        QueryWrapper<AreaInfoPO> queryWrapper = new QueryWrapper<>();
        ("city", ());
        List<AreaInfoPO> areaInfoPOS = (queryWrapper);
        String reslut = "";
        try {
            ("district_id", (0).getCityGeocode());
            reslut = "Weather information is available.，Please organize the information，Respond to users in a clear and understandable way:" + (params);
            ("reslut:{}", reslut);
        } catch (Exception e) {
            //Use this return with caution，It will result in a wireless call to the toolchain，So please set the number of times by yourself or just return the error.。
            //reslut = "This call failed，Please re-callCurrentWeather！";
            reslut = "This call failed了！";
    }
    return new Response(reslut);
}

Regardless of whether this operation is successful or not, be sure to avoid letting the big model initiate another call on its own. Doing so may cause the program to fall into a dead loop, which can affect the stability and reliability of the system. Always ensure proper control and management after the operation to prevent this from happening.

plug-in call

In this way, when we ask again about the weather in Changchun, the Big Model will be able to effectively utilize the data returned by the plugin to answer our questions accurately and in a timely manner.

In the previous discussion, we mentioned that the Pro model is very sensitive to system prompt words. However, it is important to note that it does not directly optimize the returned callback results.

To ensure that the system responds as expected, the use of system cue words is again recommended here for limitation and guidance. With explicit cue words, we can better control the output of the model.

Please format the data returned by the tool to respond to the user's question in a friendly manner.

Optimized to return normal results:

Workflow configuration

I will not demonstrate workflow in Spring AI here; in fact, the business logic written by some of our plugins essentially forms a logical framework for workflow. Next, I want to focus on how to utilize third-party workflow tools to quickly meet business needs.

Integration with third-party workflows

When considering the use of Spring AI for smart body functionality, we should not easily discard third-party visualization platforms. Integrating these third-party workflows can help us quickly realize the required functionality, especially in the development process, where the requirements for writing Java code are often tedious and complex, and a simple requirement may involve the creation and maintenance of multiple entity classes. In contrast, certain simple business logic can be realized through third-party workflows, which can undoubtedly improve our development efficiency and reduce unnecessary workload.

Taking the Coze Intelligent Body platform as an example, we can first focus on writing an efficient workflow. The main goal of this workflow is to provide users with a comprehensive query service, including information on travel flights, train schedules, hotel bookings, and so on.

We need to do the subsequent interfacing work after applying for the API key and scrutinize the development documentation to ensure smooth integration and implementation of the required functionality.

Workflow plug-ins

Based on the above information, we can encapsulate workflow calls into plugins. In fact, for the Intelligent Body Platform, both workflows and plug-ins essentially exist in the form of function calls, so the process of converting workflows to plug-ins is relatively simple and straightforward.

public class TravelPlanningService implements Function<RequestParamer, ResponseParamer> {

    @JsonClassDescription("dep_city:Departure City Address,For example, Changchun;arr_city:arrival city,For example, Beijing")
    public record RequestParamer(String dep_city, String arr_city) {}
    public record ResponseParamer(String weather) {}

    public ResponseParamer apply(RequestParamer request) {
        CozeWorkFlow cozeWorkFlow = new CozeWorkFlow<RequestParamer>();

        Map params = new LinkedHashMap<String, String>();
        String reslut = "";
        try {
          //I've wrapped it up here.httpcall (programming)
            reslut = ("7423018070586064915",request);;
            ("reslut:{}", reslut);
        } catch (Exception e) {
            reslut = "本次call (programming)失败了！";
        }
        return new ResponseParamer(reslut);
    }
}

Since our RequestParamer uses the record recording feature introduced in Java 14, and older versions of Fastjson do not support converting it to JSON format, it is important to use the latest version of the Fastjson dependency in your project. If you use an incompatible older version, the feature will not be executed properly or will fail.

<dependency>
    <groupId></groupId>
    <artifactId>fastjson</artifactId>
    <version>2.0.31</version>
</dependency>

After configuration, if the Coze plugin is working properly, then we can start providing the appropriate answers for the hybrid macromodel.

workflow invocation

We have successfully integrated the plugin into the request handling process, the code for the implementation is shown below:

// Omit duplicate code
.functions("CurrentWeather", "TravelPlanning")
.call()
.call() .content();
//Omit duplicate code

The response time of a hybrid macromodel is usually significantly longer due to the more lengthy information returned. In this case, our normal API calls may time out, resulting in unsuccessful retrieval of the expected results. Specific error messages are shown below:

I/O error on POST request for "/v1/chat/completions": timeout

retryTemplate timeout fix

We need to retool the current configuration. At first, I thought the problem was with the configuration of the retryTemplate, as we mentioned this in the previous discussion. However, upon closer inspection, I realized that the retryTemplate is only responsible for retry-related message configuration, and there is no timeout setting involved. To further troubleshoot the issue, I dug deeper into the back source code and eventually realized that I needed to modify the RestClientAutoConfiguration class accordingly.

It is worth noting that the RestClientAutoConfiguration class provides options for customized configuration, allowing us to have more granular control over the behavior of the request. Below is an example of the source code for this class, showing what specific tweaks we can make:

@Bean
@ConditionalOnMissingBean
RestClientBuilderConfigurer restClientBuilderConfigurer(ObjectProvider<RestClientCustomizer> customizerProvider) {
    RestClientBuilderConfigurer configurer = new RestClientBuilderConfigurer();
    (().toList());
    return configurer;
}

@Bean
@Scope("prototype")
@ConditionalOnMissingBean
 restClientBuilder(RestClientBuilderConfigurer restClientBuilderConfigurer) {
     builder = ()
        .requestFactory(());
    return (builder);
}

Therefore, we need to make the necessary changes to restClientBuilder. Currently, the DEFAULTS configuration in restClientBuilder is all null, which means it is using the default configuration. And when we call the coze workflow, it actually integrates okhttp internally as it uses the okhttp class and therefore follows the okhttp configuration.

To solve this problem, we can directly adjust the configuration of ClientHttpRequestFactorySettings to set our desired timeout. The specific configuration adjustments are shown below:

@Bean
 restClientBuilder(RestClientBuilderConfigurer restClientBuilderConfigurer) {
    ClientHttpRequestFactorySettings defaultConfigurer =  
            .withReadTimeout((5))
            .withConnectTimeout((30));
     builder = ()
            .requestFactory((defaultConfigurer));
    return (builder);
}

Note that the messageConverter is also configured here in the thought path just mentioned. You can completely personalize it if there is a specific need. The key code section is as follows, this code will call our custom method in order to implement the customized logic.

If you wish to set additional personalized configurations or information, you can refer to the following example to make adjustments.

public  configure( builder) {
    applyCustomizers(builder);
    return builder;
}

private void applyCustomizers(Builder builder) {
    if ( != null) {
        for (RestClientCustomizer customizer : ) {
            (builder);
        }
    }
}

At this point, after a series of tweaks and configurations, we have successfully solved the timeout problem. This means that when calling the hunyuan model, we can now get the returned results without any problems.

Private Knowledge Base

Since intelligences have the common and important feature of a knowledge base, we will implement this part as well. It is worth noting that hunyuan's API is compatible with vector functionality, which means that we can directly utilize the knowledge base to enhance the capabilities of the intelligences. With this implementation, we will not only enjoy unlimited access, but also a high degree of customization to meet specific business needs.

What's more, this design allows us to use the knowledge base with complete autonomy and control, and you don't have to worry about data leakage.

Vector Database Configuration

Next, we'll move on to integrating Milvus, a vector database feature we've used before. While Tencent Cloud offers its own vector database solution, it has not yet been integrated into Spring AI. For demonstration and development purposes, we have decided to use Milvus as our vector database first.

In order to successfully complete this integration, we need to configure the appropriate dependencies as follows:

<dependency>
    <groupId></groupId>
    <artifactId>spring-ai-milvus-store-spring-boot-starter</artifactId>
</dependency>

The required configuration files are listed below:

# Configure Milvus client host address
=
# Configure Milvus client port number
= #19530

# Configure Milvus database name
=
# Configure Milvus collection name
=
# If there is no collection one will be created by default, default value is false
-schema=true
# Configure the vector embedding dimension
=1024
# Configure the index type
=IVF_FLAT
# Configure distance metric type
=COSINE

Tencent hybrid embedding interface currently only supports input and model parameters, model is currently fixed to hunyuan-embedding, dimensions are fixed to 1024.

-url=
=hunyuan-embedding
=1024

Here, we still use the API-key of the requested hybrid macromodel, so there is no need to configure it again. It is worth emphasizing that the correct configuration of these parameters is crucial. Failure to set them properly will result in errors when the system is invoked.

basic operation

Most intelligent body platforms will have full access to the knowledge base so that users can freely perform operations such as viewing, modifying, deleting and adding. Next, we will demonstrate how to perform these operations:

@GetMapping("/ai/embedding")
public Map embed(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
    EmbeddingResponse embeddingResponse = ((message));
    return ("embedding", embeddingResponse);
}

@GetMapping("/ai/addKnowledage")
public boolean addKnowledage(@RequestParam(value = "meta-message") String message,@RequestParam(value = "vector-content") String content) {
    String uuid = ().toString();
    DocumentInfoPO documentInfoPO = new DocumentInfoPO();
    (uuid);
    (message);
    (content);
    (documentInfoPO);
    List<Document> documents = (
            new Document(uuid,content, ("text", message)));
    (documents);
    return true;
}

@GetMapping("/ai/selectKnowledage")
public List<Document> selectKnowledage(@RequestParam(value = "vector-content") String content) {
    List<Document> result = ((content).withTopK(5).withSimilarityThreshold(0.9));
    return result;
}

@GetMapping("/ai/deleteKnowledage")
public Boolean deleteKnowledage(@RequestParam(value = "vector-id") String id) {
    Optional<Boolean> delete = ((id));
    return ();
}

The following is my personal opinion: the basic realization of the add, delete and check operation has been completed. The reason for the third-party intelligent body platform to provide the modification operation is that the subsequent process is to re-insert the data after deleting it, and this operation is unavoidable because everyone has the need to modify. In addition, it is worth noting that the default vector database does not support the display of all data, and this limitation prompts the need to introduce the corresponding database operations to compensate for this deficiency and ensure the integrity and operability of the data.

To better validate the process, I went ahead and called the interface and uploaded some knowledge base data. Next, I'll show how querying this data works.

This is the knowledge base information I just uploaded. For the sake of efficiency, next I will directly show the RAG (Retrieval-Augmented Generation) retrieval function of the knowledge base in our intelligences.

automatic calling

According to my current observations, all intelligent body platform can be divided into two main ways of implementation: automatic invocation and on-demand invocation. The implementation of most platforms is still based on automatic invocation, unless written in the workflow is also on our function, then it is the same as the plug-in above, I will not explain. Today, I will focus on how the automatic call is realized.

The implementation of automatic knowledge base invocation relies on the Advisor interface by constructing an additional cue word before each request. Currently, Spring AI has implemented long-term memory functionality with the specific class VectorStoreChatMemoryAdvisor.Therefore, we can directly refer to the implementation of this class in order to build an automatic knowledge base invocation system that meets our needs.

We can perform one implementation. Since our main goal is to enable a better understanding of the context when providing reference information to the larger model, the enhancement part of the response can be simply ignored. This means that we don't need to do any additional processing or optimization of the content of the response in the process, as shown in the specific code example below:

public class PromptChatKnowledageAdvisor implements RequestResponseAdvisor {

    private VectorStore vectorStore;
    private static final String userTextAdvise = """

            Please use the following references to answer the questions.If there is no reference，Then please just answer.。

            ---------------------
            The reference information is as follows:
            {memory}
            ---------------------

            """;

    public PromptChatKnowledageAdvisor(VectorStore vectorStore) {
         = vectorStore;
    }
    @Override
    public AdvisedRequest adviseRequest(AdvisedRequest request, Map<String, Object> context) {
        // 1. Add a Knowledge Base Tip
        String advisedSystemText = () + () + ;
        List<Document> documents = (());
        // 2. Splicing Knowledge Base Data
        String data = ().map(d -> ().get("text").toString()).collect((","));
        Map<String, Object> advisedParams = new HashMap<>(());
        ("memory", data);
        // 3. Assigning Prompt Word Parameters
        AdvisedRequest advisedRequest = (request)
            .withSystemText(advisedSystemText)
            .withSystemParams(advisedParams) //repositoryRAGretrieve
            .build();

        return advisedRequest;
    }

    @Override
    public ChatResponse adviseResponse(ChatResponse chatResponse, Map<String, Object> context) {
        //No need to modify anything
        return chatResponse;
    }

    @Override
    public Flux<ChatResponse> adviseResponse(Flux<ChatResponse> fluxChatResponse, Map<String, Object> context) {
        //No need to modify anything
        return fluxChatResponse;
    }

}

need to be passed in the configuration class via constructor injection for the sameVectorStore Example.

@Bean
PromptChatKnowledageAdvisor promptChatKnowledageAdvisor(VectorStore vectorStore) {
    return new PromptChatKnowledageAdvisor(vectorStore);
}

Next, we simply add the appropriate code or configuration to the request method in order to integrate the new functionality.

//Omit duplicate code
.advisors(messageChatMemoryAdvisor,promptChatKnowledageAdvisor)
.functions("CurrentWeather","TravelPlanning")
.call()
.content();
//Omit duplicate code

This is the dramatic effect of auto-calls, where all operations are fully encapsulated, clear and easy to understand.

Next, let's look at the second call-on-demand approach, which is realized through the use of plug-ins (i.e., function callbacks). In this mode, the system can dynamically call the corresponding plug-ins according to the actual need to provide flexible and efficient functional support. We have previously demonstrated two related plug-ins, so we will not show them in detail here.

Online Deployment

Instead of deploying it separately to the server, I decided to expose the interface using a local startup. In addition, I have created a separate page specifically for this purpose, and given that this part is not the focus of this chapter, I will not go into detail about the front-end knowledge.

To better demonstrate these, I have provided the relevant demo videos for your reference:

weigh the pros and the cons

First, I'd like to talk about some of the notable advantages of the current major smart body platforms:

visualization: These platforms provide intuitive visualization interfaces that enable even beginners to quickly develop business intelligence to better meet their business needs.
Multiple Distribution Channels: Many platforms support multiple publishing channels, such as public, which is very friendly to newbies. In contrast to simply configuring the server backend, which often requires specialized knowledge, these platforms greatly lower the barrier to entry.
Extensive plugin store: Regardless of the smart body platform, the diversity of plugins is crucial. These platforms usually offer a variety of official and developer-created plugins that help users extend their functionality to meet different needs.
Diversified workflow: The workflow functionality is actually similar to that of a plug-in, just with a different name. For external systems, these workflows are integrated through API interfaces, improving interoperability and flexibility between systems.

Everything in this world has flaws, and intelligences are no exception. Even a powerful platform like Coze has some of the same shortcomings. The following points are particularly obvious:

Functional Exception Handling: When an intelligent body is functioning abnormally, even if you submit a work order, customer service and technicians are often slow to solve the problem. In this case, you can only wait helplessly, unable to determine where the problem lies. If it's just an individual user's problem, they may not even give feedback on scheduling. Whereas, if it's a smart body you've developed yourself, you can quickly pinpoint the problem when you encounter an error, and you can always make a fix and release a new version regardless of demand.
Knowledge base storage limitations: Since these intelligences are intended for a wide range of users, the amount of storage in the knowledge base is often limited, and charges may start to be levied in the future. coze has gradually introduced different tariffs, with a variety of rates that will make you look at them. In this case, it is undoubtedly more cost-effective to maintain a server yourself. In addition, the current major cloud service providers and domestic databases have vector database recommendations, and usually provide incentives, very attractive.
Optimization of knowledge base information: Knowledge bases are managed differently by the major intelligence platforms, and users need to spend time adapting to the way they operate. The advantage of maintaining the vector database yourself is that all the additional metadata information can be freely configured, and you are able to filter the information according to your specific business needs, so as to better meet your own business standards. This is a flexibility that other intelligence platforms cannot offer.
Costs are not controllable: It is critical for organizations to manage the controllability of various fees. However, the fees of intelligent body platforms are often uncontrolled with the increase of traffic, which may lead to indiscriminate fees and put enterprises in a passive situation. In contrast, when developing smart bodies on your own, you are free to change models and the fees are under your control, whether it's server fees or large model fees, all of which can be effectively managed.
Weak selectivity: Intelligent body platforms are usually tied to their own organization, limiting the user's freedom of choice. One day, the platform may decide to no longer support a particular big model, in which case the associated workflows will also need to be replaced altogether, as there are significant differences in the response capabilities of different big models, resulting in the user having to re-adapt.
Wait .....

Having said that, it's not to say that Spring AI will completely replace the Intelligent Body platform in the future. After all, for niche customers, there is usually a lack of development and maintenance staff to manage code. Therefore, the future trend is likely to be for the two to complement each other. Intelligent body platforms have the development speed and capability to essentially meet 80% of the needs of the business, a principle that coincides with the rule of two or eight practiced by the big players. The remaining 20% may need to be made up by the company's own in-house development of smart body platforms, and this percentage may even be higher.

Therefore, it is the mastery of the relevant technology that is the most critical factor for enterprises in this change. Possessing technical capabilities will enable enterprises to be more flexible in choosing and using intelligent body platforms, and be able to customize and optimize them according to their specific needs. At the same time, I also hope that the hybrid big model can be compatible with OpenAI's interface as soon as possible, or integrated into the Spring AI family, which will provide users with more choices and flexibility.

summarize

Today, we dive into the practical application of Spring AI in building intelligences, especially the value and efficacy in enterprise environments. By taking a step-by-step approach to realizing a locally deployed intelligent body solution, we not only demonstrate the flexibility and ease of use of Spring AI, but also highlight its potential to drive deeper integration of AI technologies with the business.

The core of the intelligent body lies in its ability to efficiently deal with complex business needs, which can not be realized without reasonable architectural design and technology selection. Through the integration of Spring AI, we can flexibly call different APIs, whether using the domestic hybrid API or other mainstream AI interfaces, developers can quickly switch in the project to ensure the maintainability and scalability of the system. This feature not only improves the development efficiency, but also makes the enterprise in the face of changes in market demand can quickly react and flexibly adjust the technical route.

The personalized configurations and plug-in invocations involved in our process fully demonstrate how to combine traditional development models with modern AI technology. By customizing plug-ins and workflows, enterprises can design more targeted intelligent bodies according to specific business needs, thus improving service quality and customer satisfaction. For example, in the scenario of weather query, the intelligent body can not only obtain real-time data through the API, but also combine it with the information in the database to achieve accurate and personalized services. This deep functional integration not only simplifies the user's operation process, but also improves the system's response speed.

In addition, we mentioned the integration of private knowledge bases, emphasizing the importance of data security and autonomy. By utilizing vector databases such as Milvus, enterprises are not only able to efficiently manage massive amounts of data, but also enhance the intelligence of intelligences through embedding technology. This provides enterprises with a more solid guarantee in terms of information security and intellectual property protection, which is especially important in the current context of rapid development of information technology.

In short, this paper is not only an elaboration of the Spring AI intelligent body building process, but also an in-depth reflection on how enterprises can effectively use this technology to realize business upgrading and transformation. We hope that through our discussion, we can provide you with new perspectives and insights in the development and application of intelligent bodies, and help you walk more steadily on the road of AI in the future.