Andrew Ng has proposed a Large Language Model (LLM) based AI translation workflow that rethinks machine translation: GitHub - andrewyng/translation-agent. The specific workflow is as follows:
- Prompt an LLM to translate the text from the source_language to the target_language;
- Have the LLM reflect on its own translation and make constructive suggestions for improvement;
- Use those suggestions to improve the translation.
This agentic process is a relatively new approach to machine translation: the LLM critiques and refines its own output, yielding noticeably better results than a single-pass translation.
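To make the loop concrete before we build it in FastGPT, here is a minimal code sketch, assuming a generic chat(prompt) helper that wraps whatever LLM API you use; the prompts are abbreviated stand-ins for the real ones shown later in this article:

// Minimal sketch of the translate -> reflect -> improve loop.
// `chat` is a hypothetical helper that sends a prompt to an LLM and
// returns the completion text; it is not part of translation-agent.
async function reflectiveTranslate(
  chat: (prompt: string) => Promise<string>,
  sourceLang: string,
  targetLang: string,
  sourceText: string,
  country: string
): Promise<string> {
  // Step 1: initial translation
  const translation1 = await chat(
    `Translate this ${sourceLang} text to ${targetLang}. ` +
    `Output only the translation.\n\n${sourceText}`
  );
  // Step 2: reflection - ask for concrete, targeted suggestions
  const reflection = await chat(
    `Review this ${sourceLang} to ${targetLang} translation for a reader in ` +
    `${country} and list specific suggestions covering accuracy, fluency, ` +
    `style and terminology.\n\nSource:\n${sourceText}\n\nTranslation:\n${translation1}`
  );
  // Step 3: improvement - rewrite the translation using the suggestions
  return chat(
    `Edit the translation using these expert suggestions. Output only the new ` +
    `translation.\n\nSource:\n${sourceText}\n\nTranslation:\n${translation1}\n\n` +
    `Suggestions:\n${reflection}`
  );
}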
The project also demonstrates that long texts can be split into chunks, with each chunk reflectively translated in turn, breaking through the LLM's token limit and enabling genuinely one-click, high-quality translation of long texts.
The project further improves accuracy by constraining the target country or region for the model (distinguishing, say, American English from British English), and it proposes optimizations that may bring better results, such as building a glossary for terms the LLM was not trained on (or terms with more than one accepted translation), which can raise translation accuracy even further.
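As a rough sketch of the glossary idea (my own illustration, not code from the project): agreed term translations can simply be prepended to the translation prompt so the model applies them consistently.

// Illustrative sketch: inject fixed term translations into a prompt
// so terminology stays consistent across chunks. The helper name and
// prompt wording are hypothetical.
function withGlossary(prompt: string, glossary: Record<string, string>): string {
  const entries = Object.entries(glossary)
    .map(([term, translation]) => `- "${term}" must be translated as "${translation}"`)
    .join("\n");
  return `Strictly follow this glossary:\n${entries}\n\n${prompt}`;
}
// Usage: withGlossary(basePrompt, { "token": "词元", "workflow": "工作流" })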
And all of this can be built with FastGPT. This article will show you how to use FastGPT to replicate Andrew Ng's translation-agent.
Single Text Block Reflective Translation
Let's start with the simple case: a single text block whose length does not exceed the LLM's token limit.
Initial translation
The first step is to have the LLM produce an initial translation of the source text block:
The "text splicing" module takes three parameters (source language, target language, and source text) and assembles the prompt that is passed to the LLM, which returns the first version of the translation.
Prompt:
This is an {{source_lang}} to {{target_lang}} translation, please provide the {{target_lang}} translation for this text.
Do not provide any explanations or text apart from the translation.
{{source_lang}}: {{source_text}}
{{target_lang}}:
Reflection
The LLM is then asked to suggest improvements to the initial translation generated in the first step; this step is called the reflection.
Prompt:
Your task is to carefully read a source text and a translation from {{source_lang}} to {{target_lang}}, and then give constructive criticism and helpful suggestions to improve the translation.
The final style and tone of the translation should match the style of {{target_lang}} colloquially spoken in {{country}}.
The source text and initial translation, delimited by XML tags <SOURCE_TEXT></SOURCE_TEXT> and <TRANSLATION></TRANSLATION>, are as follows:
<SOURCE_TEXT>
{{source_text}}
</SOURCE_TEXT>
<TRANSLATION>
{{translation_1}}
</TRANSLATION>
When writing suggestions, pay attention to whether there are ways to improve the translation's
(i) accuracy (by correcting errors of addition, mistranslation, omission, or untranslated text),
(ii) fluency (by applying {{target_lang}} grammar, spelling and punctuation rules, and ensuring there are no unnecessary repetitions),
(iii) style (by ensuring the translations reflect the style of the source text and take into account any cultural context),
(iv) terminology (by ensuring terminology use is consistent and reflects the source text domain; and by only ensuring you use equivalent idioms {{target_lang}}).
Write a list of specific, helpful and constructive suggestions for improving the translation.
Each suggestion should address one specific part of the translation.
Output only the suggestions and nothing else.
Here the prompt takes five parameters: the source text, the initial translation, the source language, the target language, and the country/region qualifier. With these, the LLM produces a substantial list of suggested changes to the earlier translation, ready for the improvement step that follows.
Improving the translation
Prompt:
Your task is to carefully read, then edit, a translation from {{source_lang}} to {{target_lang}}, taking into account a list of expert suggestions and constructive criticisms.
The source text, the initial translation, and the expert linguist suggestions are delimited by XML tags <SOURCE_TEXT></SOURCE_TEXT>, <TRANSLATION></TRANSLATION> and <EXPERT_SUGGESTIONS></EXPERT_SUGGESTIONS> as follows:
<SOURCE_TEXT>
{{source_text}}
</SOURCE_TEXT>
<TRANSLATION>
{{translation_1}}
</TRANSLATION>
<EXPERT_SUGGESTIONS>
{{reflection}}
</EXPERT_SUGGESTIONS>
Please take into account the expert suggestions when editing the translation. Edit the translation by ensuring:
(i) accuracy (by correcting errors of addition, mistranslation, omission, or untranslated text),
(ii) fluency (by applying {{target_lang}} grammar, spelling and punctuation rules and ensuring there are no unnecessary repetitions),
(iii) style (by ensuring the translations reflect the style of the source text)
(iv) terminology (inappropriate for context, inconsistent use), or
(v) other errors.
Output only the new translation and nothing else.
With the initial translation and the corresponding reflection generated in the previous steps, both are fed into this third LLM call, which yields a distinctly higher-quality translation.
Results
Since this reflective translation will be reused later, I packaged it as a plugin; from here on I can run a reflective translation simply by calling the plugin. The effect is as follows:
A randomly selected passage from Harry Potter.
You can see that the reflective translation is noticeably better. The output of the reflection step looks like this:
Reflective translation of long texts
Having mastered reflective translation of a single short text block, we can readily extend it to long texts, i.e., multiple text blocks, through slicing and looping.
The overall logic: first check the token count of the incoming text. If it is within the configured token limit, call the single-text-block reflective translation directly; if it exceeds the limit, cut the text into reasonably sized chunks and run the reflective translation on each chunk in turn (see the sketch after the list below).
As for why we chunk the text, there are two reasons:
1. The model's output window is only about 4k tokens, so it cannot produce more than that in a single reply.
2. Chunking the input reduces hallucinations caused by overly long inputs.
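Here is a minimal sketch of that branching logic; in FastGPT it is wired up with workflow nodes rather than code, and the helper functions below are hypothetical placeholders for the modules built in the following sections:

// Hypothetical helpers: in FastGPT these are workflow nodes, not code.
declare function countTokens(text: string): Promise<number>;
declare function splitIntoChunks(text: string, chunkSize: number): Promise<string[]>;
declare function reflectiveTranslate(chunk: string): Promise<string>;

// Sketch: short texts go straight to the single-block path;
// long texts are chunked and each chunk is translated in turn.
async function translateLongText(sourceText: string, tokenLimit: number): Promise<string> {
  const tokenCount = await countTokens(sourceText);
  if (tokenCount <= tokenLimit) {
    return reflectiveTranslate(sourceText);
  }
  const chunks = await splitIntoChunks(sourceText, tokenLimit);
  const translated: string[] = [];
  for (const chunk of chunks) {
    translated.push(await reflectiveTranslate(chunk));
  }
  return translated.join("");
}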
Counting tokens
First, I use the "Laf function" module to compute the token count of the input text.
Laf functions are essentially ready to use out of the box: create an app on the Laf Cloud Development platform, install the tiktoken dependency, and paste in the following code:
const { Tiktoken } = require("tiktoken/lite");
const cl100k_base = require("tiktoken/encoders/cl100k_base.json");

interface IRequestBody {
  str: string
}

interface RequestProps extends IRequestBody {
  systemParams: {
    appId: string,
    variables: string,
    histories: string,
    cTime: string,
    chatId: string,
    responseChatItemId: string
  }
}

interface IResponse {
  message: string;
  tokens: number;
}

export default async function (ctx: FunctionContext): Promise<IResponse> {
  const { str = "" }: RequestProps = ctx.body;

  // Build a cl100k_base encoder (the encoding used by GPT-3.5/4 models)
  const encoding = new Tiktoken(
    cl100k_base.bpe_ranks,
    cl100k_base.special_tokens,
    cl100k_base.pat_str
  );
  // Encode the input and count the resulting tokens
  const tokens = encoding.encode(str);
  // Release the encoder's WASM memory
  encoding.free();

  return {
    message: 'ok',
    tokens: tokens.length
  };
}
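For a quick sanity check, you can invoke the published function over HTTP; the URL below is illustrative, so use the invoke URL Laf shows for your own app:

// Hypothetical usage: POST the text to the function's invoke URL.
const res = await fetch("https://your-app.laf.run/count-tokens", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ str: "The quick brown fox jumps over the lazy dog." }),
});
console.log(await res.json()); // e.g. { message: 'ok', tokens: 10 }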
Back in FastGPT, click "Synchronize parameters", then connect the line that passes in the source text, and you get the token count of the input.
Calculate single text block size
Since no third-party packages are involved here, just a bit of data processing, the "Code Run" module is enough:
function main({tokenCount, tokenLimit}){
    const numChunks = Math.ceil(tokenCount / tokenLimit);
    let chunkSize = Math.floor(tokenCount / numChunks);
    const remainingTokens = tokenCount % tokenLimit;
    if (remainingTokens > 0) {
        chunkSize += Math.floor(remainingTokens / numChunks);
    }
    return {chunkSize};
}
With the code above, we can compute a reasonable single-block size that stays within the token limit.
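For example, with tokenCount = 2500 and tokenLimit = 1000: numChunks = ceil(2500 / 1000) = 3 and chunkSize = floor(2500 / 3) = 833; the remainder 2500 % 1000 = 500 is positive, so chunkSize grows by floor(500 / 3) = 166 to 999, which keeps each of the three chunks just under the limit.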
Obtaining the chunked source text
Given the single-block size and the source text, we write another Laf function that calls langchain's textsplitters package to slice the text. The code is as follows:
import cloud from '@lafjs/cloud'
import { TokenTextSplitter } from "@langchain/textsplitters";

interface IRequestBody {
  text: string
  chunkSize: number
}

interface RequestProps extends IRequestBody {
  systemParams: {
    appId: string,
    variables: string,
    histories: string,
    cTime: string,
    chatId: string,
    responseChatItemId: string
  }
}

interface IResponse {
  output: string[];
}

export default async function (ctx: FunctionContext): Promise<IResponse> {
  const { text = '', chunkSize = 1000 }: RequestProps = ctx.body;

  const splitter = new TokenTextSplitter({
    encodingName: "gpt2",
    chunkSize: Number(chunkSize),
    chunkOverlap: 0,
  });

  const initialChunks = await splitter.splitText(text);
  console.log(initialChunks)

  // Define sentence delimiters for different languages
  const sentenceDelimiters = /[。！？.!?]/;

  // Post-process each preliminary chunk so cuts land on sentence boundaries
  const output: string[] = [];
  let currentChunk = initialChunks[0];

  for (let i = 1; i < initialChunks.length; i++) {
    const sentences = initialChunks[i].split(sentenceDelimiters);
    if (sentences.length > 0) {
      currentChunk += sentences[0]; // Splice the first sentence onto the current chunk
      output.push(currentChunk.trim()); // Add the current chunk to the output array
      currentChunk = sentences.slice(1).join(''); // The remaining sentences become the new current chunk
    }
  }

  // Add the last chunk to the output array
  if (currentChunk.trim().length > 0) {
    output.push(currentChunk.trim());
  }

  console.log(output);

  return {
    output
  }
}
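To see what the sentence-boundary fixup does, here is a hypothetical trace (the values are illustrative):

// Suppose the token splitter cut mid-sentence:
// initialChunks = ["Harry looked up. He saw the",
//                  " owl at the window. It held a letter."]
// The loop splices " owl at the window" onto the first chunk and starts a
// new chunk with the remainder, so every output chunk ends at a sentence
// boundary. (Note that String.split consumes the delimiter, so this simple
// approach drops the trailing punctuation of each chunk.)
// output = ["Harry looked up. He saw the owl at the window",
//           "It held a letter"]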
This gives us the sliced text; the rest of the pipeline mirrors the single-text-block reflective translation.
Multi Text Block Translation
Note that we can't call the earlier single-text-block plugin directly here, because the prompts need some context handling (alternatively, you could modify the earlier plugin to accept more parameters).
The details are similar to before: the prompts are swapped out and a little very simple data processing is added. The overall layout looks like this:
Initial translation of multiple text blocks
Reflection on multiple text blocks
Improved translation of multiple text blocks
Loop execution
A key part of long-text reflective translation is looping the reflective translation over the multiple text blocks.
FastGPT workflow edges can loop back to an earlier node, so we can write a very simple judgment function to decide whether to stop or continue.
js code:
function main({chunks, currentChunk}){
    const findIndex = chunks.findIndex((item) => item === currentChunk)
    return {
        isEnd: chunks.length - 1 === findIndex,
        i: findIndex + 1,
    }
}
In other words, we check whether the block currently being processed is the last one, and from that decide whether execution should continue; with this, long-text reflective translation is complete.
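A quick trace of this judgment function with illustrative values:

// chunks = ["chunk A", "chunk B", "chunk C"]
// main({chunks, currentChunk: "chunk B"}) -> { isEnd: false, i: 2 }
//   not the last block, so the workflow loops again with chunks[2]
// main({chunks, currentChunk: "chunk C"}) -> { isEnd: true, i: 3 }
//   last block reached, so the loop ends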
Results
First, fill in the global settings:
Then input the text to be translated. Here I chose a chapter from the original English of Harry Potter; its length, as counted by OpenAI's tokenizer, is as follows:
The actual running results are as follows:
As you can see, the result is perfectly readable.
Further optimization
Prompt optimization
In the source project, the system prompts given to the AI are fairly brief. We can use better prompts to push the LLM toward more appropriate translations and further improve quality. For example, we can use a CoT (chain-of-thought) approach to have the LLM explicitly and systematically generate a reasoning chain that lays out its complete translation thinking.
For example, the prompt for the initial translation can be replaced with the following:
# Role: Senior Translation Specialist
## Background.
You are an experienced translation specialist who is good at translating {{source_lang}} and {{target_lang}}, especially at translating {{source_lang}} texts into {{target_lang}} that are easy to understand. You have led your team to complete large-scale translation projects for many times, and your translations have been widely praised.
## Attention.
- Throughout the translation process, always adhere to the principles of "faithfulness, expressiveness, and elegance", with "expressiveness" being especially important.
- The translation should be in line with {{target_lang}}'s expression habits, easy to understand, coherent and fluent.
- Avoid overly literal expressions and obscure allusions.
- Proper nouns and technical terms may be kept as-is or transliterated as appropriate.
## Constraints.
- The four-round translation process must be strictly followed: literal translation, free translation, proofreading, and finalization.
- The translation must be faithful to the original text, accurate, and must not omit or misinterpret the original meaning.
- Be careful to judge the context and avoid repetitive translations
## Goals.
- Produce a high-quality {{target_lang}} translation of the {{source_lang}} text through the four-round translation process.
- The translation should accurately convey the meaning of the original text, and the language should be easy to understand and catchy.
- Moderate use of some familiar phrases and popular Internet terms to enhance the affinity of the translation.
- On the basis of the literal translation, provide at least two free translations in different styles to choose from.
## Skills.
- Proficient in both {{source_lang}} and {{target_lang}}, with solid linguistic skills and extensive translation experience.
- Specialized in translating {{source_lang}} idioms into natural {{target_lang}}.
- Have a keen insight into the development of contemporary {{target_lang}} language, good at grasping the language trend.
## Workflow.
1. First round, literal translation: render the original faithfully, word by word and sentence by sentence, without omitting any information.
2. Second round, free translation: based on the literal translation, re-express the original in fluent {{target_lang}}, providing at least two versions in different styles.
3. Third round, proofreading: scrutinize the translation, eliminate inconsistencies and omissions, and make it read more naturally.
4. Fourth round, finalization: select the best version, revise and polish it repeatedly, and finalize a concise, smooth translation that suits the reading habits of a general audience.
## OutputFormat.
- Before each round of translation, use [Think] to explain the main points of the round.
- After each round of translation, use [Translate] to present the translation.
## Suggestions.
- In the literal translation round, strive to be faithful to the original, but don't be rigidly word-for-word.
- In the free translation round, express the original meaning in the plainest, simplest language.
- In the proofreading round, focus on whether the translation matches {{target_lang}}'s habits of expression and whether it is easy to understand.
- In the finalization round, adopt familiar expressions and popular Internet phrases where appropriate to make the translation feel more local.
- Make good use of {{target_lang}}'s flexibility, presenting the same content in different ways to improve the readability of the translation.
With this, a more accurate, higher-quality initial translation comes back. We also need to add a node that extracts the finalized fourth-round translation from the output:
The js code is as follows:
function main({data1}){
    const result = data1.split("```").filter(item => !!item.trim())
    if(result[result.length - 1]) {
        return {
            result: result[result.length - 1]
        }
    }
    return {
        result: 'Translation not captured'
    }
}
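A hypothetical trace, assuming the model wraps each round's translation in ``` code fences (which is what this extraction relies on):

// data1 = "[Think] ...\n```\ndraft one\n```\n[Think] ...\n```\nfinal version\n```"
// split("```") and filter(empty) -> ["[Think] ...\n", "\ndraft one\n",
//                                    "\n[Think] ...\n", "\nfinal version\n"]
// main({data1}) -> { result: "\nfinal version\n" }  // the last fenced block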
The reflection and improvement steps can likewise be given more precise prompts, for example:
The prompt is as follows:
# Role: Senior Translation Specialist
## Background.
You're an experienced translation specialist who is good at translating between {{source_lang}} and {{target_lang}}, and especially at rendering {{source_lang}} texts into {{target_lang}} that is easy to understand. You have proofread and reviewed many translations and can offer insightful comments on translated articles.
## Attention.
- Translations should follow the principles of "faithfulness, expressiveness, and elegance", but "expressiveness" is especially important.
- The translation should conform to {{target_lang}}'s expression habit, easy to understand, coherent and fluent.
- The translation should avoid overly literal expressions and obscure allusions.
## Constraints.
- The translation should be faithful to the original text, accurate, and should not omit or misinterpret the original meaning.
- Recommendations should be clear and implementable, and to the point.
- Recommendations should be as detailed as possible for each passage.
## Goals.
- You will be provided with a copy of the original text of {{source_lang}}, along with its initial translation, and you will be asked to give your suggestions for improving this translation.
- Judge each passage in as much detail as possible, suggesting the changes that need to be made, without forcing changes where none are needed.
- The translation should accurately convey the meaning of the original text, and the language should be easy to understand and catchy.
- Moderate use of some familiar phrases and popular Internet terms to enhance the affinity of the translation.
## Skills.
- Proficient in both {{source_lang}} and {{target_lang}}, with solid linguistic skills and rich translation experience.
- Specializes in translating {{source_lang}} idioms into natural {{target_lang}} expressions.
- Have a keen insight into the development of contemporary {{target_lang}} language, good at grasping the language trend.
Let's look at the final run, taking a passage from a technical article as a test:
In February of 1992, the development of Windows 3.1 was nearing a close, and the Windows team was trying to figure out what their next steps would be. By the 5th of March, the team knew that they’d be focusing on desktops, laptops, mobile, and pen with NT taking servers and workstations. The team also knew that they needed to address three major areas: UI, hardware support, networking.
There was a ton of stuff being worked on at this time (and through the rest of the 1990s) within Microsoft. Just within the Systems group (as distinct from the Apps group) Janus would release on the 6th of April as Windows 3.1, Astro would release in March of 1993 as MS-DOS 6.0, Winball would release in October of 1992 as Windows for Workgroups 3.1, Jaguar while being worked on at this time would never see an independent release (more on that in a bit), and then came the next windows projects: Cougar, Panther, Rover, NT, and Cairo. Cougar was a project to build a fully 32 bit Windows kernel, evolving the Windows 386 mode kernel for 386-class and higher machines. Panther was a project to port the win32 API to this new kernel. Rover was a project to make a mobile computing version of Cougar/Panther. The NT project was Microsoft’s first steps into a dedicated workstation and server release of Windows, and it would release in July of 1993. Cairo was a project for the next major release of NT, and it would mirror many of the changes to Windows from Cougar/Panther (and the reverse is also true). This system comprised of Cougar and Panther was known as Chicago. The Cougar portion of this system was vital to making a more stable and robust Windows. Beyond being a fully 32 bit protected-mode system, this new kernel would feature dynamically loaded and unloaded protected-mode device drivers. This system would also be threaded and fully support any MS-DOS program running from Windows (where previously in Windows 2 and 3, programs that wrote directly to video RAM would require Windows to terminate and stay resident, one side effect being that in really big Command and Conquer maps, the memory space of Windows would be overwritten and as a result Windows would not restore on exit).
These moves were huge for Chicago and for Microsoft more generally. When Chicago was taking shape in 1992, MS-DOS was still Microsoft’s bread and butter. Brad Silverberg was relatively new to Microsoft, but he had a very strong background. He had worked at Apple on the Lisa, and he had worked at Borland. By early 1992, he was the project leader of Chicago and the SVP of Microsoft’s personal systems division. In an internal Microsoft memo Silverberg said:
Lest anyone be confused, ms-dos is the the bedrock product of the company, accounting for a very major portion of Microsoft’s profits (ie, stock price). Further, it is under strong competitive pressures (I am more inclined to say “under attack”) from DR-DOS and IBM. We must protect this franchise with our lives. Short term, that means continued aggressive marketing plans. In addition, it also means we need to get yearly product releases out so we put the other guys on a treadmill, rather than be put on the treadmill. As a result, we are going to release a new version of MS-DOS this year, chock full of new goodies, while we move with full-speed toward cougar.
That new MS-DOS release was MS-DOS 6 mentioned earlier. The most visible and important new “goodies” referenced by Silverberg were disk defragmentation, disk compression, anti-virus, a new backup system, and file transfer tools. MS-DOS 6 was released in March of 1993 with updates being pushed until June of 1994.
I bring this up to try and portray where Microsoft and the industry were at this time. IBM compatible computers outnumbered all other computers by nearly 80 million units. MS-DOS or a compatible DOS system was installed on almost all of them (with OS/2 or Linux being rare). Most software on these computers ran in 16 bit real mode. Most hardware was configured with dip switches, and the config had to match that setting exactly. Loading a driver required knowledge of autoexec and load-high tools. Windows 3 was a huge success, and Windows 3.1 was an even greater success. Despite these successes and the resultant changes in Microsoft’s future plans, MS-DOS was still the market leader in PC operating systems by a very wide margin. Windows 3x did ameliorate some problems, but the old systems remained dominant. Due to this, Microsoft absolutely needed to ensure that MS-DOS was still part of their future despite having a more technically advanced system in NT. Adding to this, most computers that home users were purchasing were incapable of providing a good experience with NT. Chicago needed to provide the best experience possible for win16, win32, and MS-DOS applications on modest hardware, and it needed to be a noticeable improvement over Windows 3. If Microsoft failed in either case, they would be yielding ground to Digital Research or to IBM.
Ultimately, the need for backwards compatibility meant that some 16 bit code remained in Chicago. Without this, the backwards compatibility wouldn’t have been as good. In hindsight, given that IBM’s OS/2 could run DOS and Windows software, this was a very good decision on the part of Microsoft.
Chicago was structured in a way that is similar to Windows for Workgroups 3.1 (386 enhanced), but is far more refined. There are a large number of virtual device drivers (VxDs) running in 32 bit protected mode alongside virtual DOS machines (VDMs) running in a virtual real mode. These virtual device drivers are used for real physical hardware, for emulating devices for virtual machines, and for providing services to other software. Three of these VxDs comprise the very heart of Chicago: Virtual Machine Manager (VMM32), Configuration Manager (CONFIGMG), Installable Filesystem Manager (IFM). VMM32 is essentially the Chicago kernel. It handles memory management, event handling, interrupt handling, device driver loading and initialization, the creation of virtual machines, and the scheduling. CONFIGMG handles plug and play. IFM coordinates filesystem access, provides a disk buffer, and provides a 32 bit protected mode I/O access system. This bypasses MS-DOS entirely and was first seen in 386 Windows 3 releases.
The translated result is as follows:
It's seriously impressive!
From now on, no matter what article you want to translate, and no matter how long it is, you can just hand it to this translation expert, go do whatever you need to do, and come back a little later to collect a near-perfect translation. Who else can do that?
Other optimizations
For example, the region qualifier already demonstrated in the source project: constraining the country and region does improve results noticeably in real-world testing.
Thanks to the LLM's flexibility, different prompts yield different translations, so we can easily add special qualifiers to obtain specific, more accurate translations.
For terms beyond the LLM's knowledge, FastGPT's knowledge base feature can also be used to extend the translation bot.
Concluding remarks
The next post will bring you an even more powerful agent: a specialist in reflective subtitle translation.
What can this expert do? For example: say you have English subtitles. No matter how long they are, you can copy the entire contents, hand them straight to the subtitle translation expert, go do what you need to do, and come back a little later to collect a polished set of bilingual Chinese-English subtitles. Who else can do that?
Finally, it's time for the giveaway. I've shared this translation specialist's complete workflow, so help yourself: Long-text reflective translation specialist workflow