Preamble
This post is a learning record. Its main purpose is hands-on practice with Semantic Kernel, plus full-stack development experience with .NET Aspire. I use Aspire to start the API service, the Blazor front end, and the WinUI desktop project all at once, and to pick up code changes across all three projects. Overall it feels very convenient: after changing code I only need to start the Aspire project rather than each project individually, and startup is fast even when container services are involved.
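For reference, wiring the API and the Blazor front end into a single Aspire AppHost looks roughly like the sketch below. The project names here are hypothetical, not the actual names from my repository, and the WinUI client simply calls the API rather than being hosted by Aspire:

```csharp
// Program.cs of the Aspire AppHost project (project names are hypothetical)
var builder = DistributedApplication.CreateBuilder(args);

// The ASP.NET Core Web API that does the search and parsing
var api = builder.AddProject<Projects.BingSearchSummary_Api>("api");

// The Blazor admin front end, given a reference to the API
builder.AddProject<Projects.BingSearchSummary_Blazor>("web")
       .WithReference(api);

builder.Build().Run();
```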
Technical approach
1. Framework selection
- The Web API is implemented with ASP.NET Core Web API.
- The Playwright library is used to fetch Bing search results and to extract page content for parsing.
- Web page summarization is done in a WinUI client, which uses Semantic Kernel to call the Chinese ZhiPu QingYan (GLM) large language model.
- The backend admin page uses Blazor, though it is only a demo page.
2. Why these choices?
As a .NET developer, I naturally prioritized .NET-related technologies, and I also made some new choices so I could practice the latest tools.
The main reasons for choosing these frameworks are:
- Playwright: testing showed it performed best. I also tried other libraries, such as Selenium and HtmlAgilityPack. HtmlAgilityPack is fine for parsing static pages, but it struggles with pages whose content is rendered by JavaScript, and Selenium extracted somewhat worse content than Playwright. Playwright launches a browser and simulates user actions to get the content, so I suspect it would be a heavy load if you had to process many pages at once.
- Aspire: this is Microsoft's newest tool built specifically for developers, so I naturally wanted to try it firsthand. Having used it, it really does feel good and saves a lot of steps.
- Semantic Kernel: it is widely considered the most popular framework in the .NET community for working with large language models. It provides a lot of out-of-the-box functionality, helps greatly when building intelligent apps, and has an enthusiastic community.
- The LLM choice came down to several considerations: first, it is compatible with the OpenAI interface, so Semantic Kernel can use it through configuration alone; second, it supports function calling. That makes it a good domestic stand-in for OpenAI and well suited to developing intelligent apps.
- WinUI: my personal client development mainly uses WinUI. Doing the LLM integration in the client rather than in the backend also lays the groundwork for later integrating offline local models, such as Microsoft's Phi-3.
Code walkthrough
The link to the code covered in this blog is below:
/GreenShadeZhang/BingSearchSummary
1. Getting search results
The sample code is as follows:
A Playwright instance is created first, and then user actions are simulated.
// Create the Playwright instance and launch a headless browser
var playwright = await Playwright.CreateAsync();
var browser = await playwright.Chromium.LaunchAsync(new BrowserTypeLaunchOptions { Headless = true });
var page = await browser.NewPageAsync();
// Hide the webdriver flag so the page is less likely to detect automation
var js = @"Object.defineProperties(navigator, {webdriver:{get:()=>false}});";
await page.AddInitScriptAsync(js);
await page.GotoAsync("https://www.bing.com");
// Simulate user input of the search keyword
await page.FillAsync("input[name=q]", keyword);
await page.Keyboard.PressAsync("Enter");
// Wait for the search results to load
await page.WaitForLoadStateAsync();
// Get the search result content
var content = await page.ContentAsync();
var dataList = BingSearchHelper.ParseHtmlToJson(content);
var result = new List<BingSearchItem>();
Because I don't have a Bing Search API subscription, I can only parse the page; if you do have a subscription, this step can be skipped. The search results are parsed into JSON data as follows:
using HtmlAgilityPack;

namespace BingSearchSummary; // namespace name assumed from the repository name
public class BingSearchHelper
{
    public static List<BingSearchItem> ParseHtmlToJson(string htmlContent)
    {
        var htmlDocument = new HtmlDocument();
        htmlDocument.LoadHtml(htmlContent);
        var results = new List<BingSearchItem>();
        // Each organic Bing result sits in an <li class="b_algo"> element
        var nodes = htmlDocument.DocumentNode.SelectNodes("//li[@class='b_algo']");
        if (nodes == null)
        {
            return results;
        }
        foreach (var node in nodes)
        {
            var titleNode = node.SelectSingleNode(".//h2/a");
            var snippetNode = node.SelectSingleNode(".//p");
            var urlNode = node.SelectSingleNode(".//cite");
            var title = titleNode?.InnerText.Trim();
            var snippet = snippetNode?.InnerText.Trim();
            var url = urlNode?.InnerText.Trim();
            if (string.IsNullOrEmpty(title))
            {
                continue;
            }
            var searchItem = new BingSearchItem
            {
                Title = title,
                Snippet = snippet ?? "",
                Url = url ?? ""
            };
            results.Add(searchItem);
        }
        return results;
    }
}
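The `BingSearchItem` model itself isn't shown in the snippets; a minimal version consistent with the properties used above would be:

```csharp
// Minimal sketch of the model, inferred from the properties used in the code above
public class BingSearchItem
{
    public string Title { get; set; } = "";
    public string Snippet { get; set; } = "";
    public string Url { get; set; } = "";
    public string PageContent { get; set; } = "";
}
```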
With the code above, the URLs of the search result pages for the keyword have been obtained; the next step is to parse the content of each page.
2. Web content parsing
The client calls the API, which fetches the first three search results for the keyword along with each page's content.
// Get the search result content
var content = await page.ContentAsync();
var dataList = BingSearchHelper.ParseHtmlToJson(content);
var result = new List<BingSearchItem>();
foreach (var data in dataList)
{
    // Process only the first three results
    if (result.Count >= 3)
    {
        break;
    }
    await page.GotoAsync(data.Url);
    // Prefer a ".content" element; fall back to the whole body
    var divContent = await page.QuerySelectorAsync(".content");
    divContent ??= await page.QuerySelectorAsync("body");
    if (divContent != null)
    {
        var pageContent = await divContent.InnerTextAsync();
        result.Add(new BingSearchItem
        {
            Title = data.Title,
            Url = data.Url,
            Snippet = data.Snippet,
            PageContent = pageContent
        });
    }
}
The Swagger results are shown below:
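The screenshot isn't reproduced here, but based on the model used in the code, the JSON returned by the API would look roughly like this (the values are made up for illustration):

```json
[
  {
    "title": "Example result title",
    "snippet": "Short snippet from the Bing results page...",
    "url": "example.com/some-page",
    "pageContent": "Full text extracted from the page..."
  }
]
```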
3. Summarizing the page content
This part is implemented in the WinUI project. WinUI calls the API to get the results, uses Liquid message templates to generate messages dynamically, and calls Semantic Kernel to summarize the content.
The Semantic Kernel injection code is as follows:
// The test token has been deleted and is no longer valid; replace it with your own ZhiPu token.
// (The builder variable name below is assumed; the call is Semantic Kernel's OpenAI connector.)
kernelBuilder.AddOpenAIChatCompletion(
    modelId: "GLM-4-Air",
    apiKey: "4827638425a6b9d48bea3b0599246ff2.pFjhEKShPOZE8OFd",
    httpClient: GetProxyClient("/api/paas/v4/chat/completions"));
kernelBuilder.Plugins.AddFromType<TimeInformationPlugin>();
#pragma warning disable SKEXP0040 // Type is for evaluation purposes only and may be changed or removed in future updates.
services.AddSingleton<IPromptTemplateFactory, LiquidPromptTemplateFactory>();
#pragma warning restore SKEXP0040
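`GetProxyClient` isn't shown in the snippet. Since ZhiPu exposes an OpenAI-compatible endpoint, one common implementation is an `HttpClient` whose handler redirects every request the OpenAI connector builds to that endpoint. The class below is a hypothetical sketch, assuming `open.bigmodel.cn` as ZhiPu's public API host:

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch of GetProxyClient: redirect the OpenAI connector's
// requests to ZhiPu's OpenAI-compatible chat-completions endpoint.
public class ProxyOpenAIHandler : DelegatingHandler
{
    private readonly Uri _target;

    public ProxyOpenAIHandler(Uri target) : base(new HttpClientHandler())
        => _target = target;

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Replace whatever URL the SDK built with the ZhiPu endpoint
        request.RequestUri = _target;
        return base.SendAsync(request, cancellationToken);
    }
}

public static class ProxyClientFactory
{
    public static HttpClient GetProxyClient(string path)
        => new HttpClient(new ProxyOpenAIHandler(new Uri("https://open.bigmodel.cn" + path)));
}
```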
The content summary code is as follows:
[RelayCommand]
private async Task SummaryAndUploadAsync(BingSearchItem item)
{
    _chatHistory.Clear();
    SummaryProcessRingStatus = true;
    try
    {
        var arguments = new KernelArguments
        {
            // The time format comes from the Semantic Kernel samples; feeding the
            // fetched page content in as "userMessage" is assumed from the flow above
            ["startTime"] = DateTime.Now.ToString("hh:mm:ss tt zz", CultureInfo.InvariantCulture),
            ["userMessage"] = item.PageContent
        };
        var systemMessage = await _promptTemplateFactory.Create(new PromptTemplateConfig(_systemPromptTemplate)
        {
            TemplateFormat = "liquid",
        }).RenderAsync(_kernel, arguments);
        var userMessage = await _promptTemplateFactory.Create(new PromptTemplateConfig(_userPromptTemplate)
        {
            TemplateFormat = "liquid",
        }).RenderAsync(_kernel, arguments);
        _chatHistory.AddSystemMessage(systemMessage);
        _chatHistory.AddUserMessage(userMessage);
        var chatResult = await _chatCompletionService.GetChatMessageContentAsync(_chatHistory, _openAIPromptExecutionSettings, _kernel);
        SummaryResult = chatResult.ToString();
        await _apiClient.PostContentsAsync(new BingSearchSummaryItem
        {
            Title = item.Title,
            Summary = chatResult.ToString(),
            Url = item.Url
        });
    }
    catch (Exception ex)
    {
        // Debug.WriteLine stands in here for the original logging call
        Debug.WriteLine(ex.Message);
        SummaryProcessRingStatus = false;
    }
    SummaryProcessRingStatus = false;
}
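The prompt templates themselves (`_systemPromptTemplate` and `_userPromptTemplate`) aren't shown. Given the `startTime` and `userMessage` arguments above, a minimal pair of Liquid templates might look like this (the wording is purely illustrative):

```liquid
{%- comment -%} _systemPromptTemplate {%- endcomment -%}
You are an assistant that summarizes web pages. The current time is {{startTime}}.
Reply with a concise summary of the content the user provides.

{%- comment -%} _userPromptTemplate {%- endcomment -%}
Please summarize the following page content:
{{userMessage}}
```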
The effect is shown below:
That completes the summarization. Feel free to check out the code and see if it helps you.
Personal experience
After a period of study, I've gained a more complete understanding of large language models. They are not a panacea, but they can easily do things that used to require very sophisticated engineering, provided you give them good prompts.
If a large language model is a fighter jet, then the prompt is the pilot: the quality of the prompt directly determines the accuracy of the model's output.
As a software developer, you should keep learning and practicing how to write good prompts.
Recommended references:
- Blog Code
- BotSharp
- Rodel Agent