
TEN Framework

2024-10-08 09:34:47

TL;DR

TEN Framework was originally called Astra; it was later renamed TEN, short for Transformative Extensions Network.

I first came across TEN (then known as Astra) in June of this year, at the RTE OpenDay held during the Geek Park AGI Playground conference. The show floor was noisy, but the conversations worked well enough. We were demonstrating multimodal dialog support with XSwitch, which could already connect to various videoconferencing systems and to APIs from various AI providers, but not yet to TEN.

XSwitch is a multi-protocol audio/video and AI connector that aims to connect to every audio/video and AI platform and service, so TEN is naturally a framework and protocol we need to integrate with.

The TEN framework is actually very well written, and its Docker containers are easy to run, but while integrating with it we still ran into some problems. Some have been solved and some are still being worked on. Stumbling through these pitfalls gave me a deeper understanding of the framework, and also showed me how good it is.

Front-end problems and optimization

While trying it out on the official website, I sometimes found the front end unresponsive: nothing happened when I clicked the Join button. I later realized that the server is probably hosted overseas. The homepage seems to load quickly, but JS and other content may still be lazy-loading in the background, and until that finishes the Join button does not respond, so the page looks stuck.

We made a small optimization and submitted a PR. While loading is in progress, the button shows "Loading ..." and cannot be clicked, so the user at least understands that the page has not finished loading because it is waiting on the network, and that it is not a bug.

Docker Related Issues

TEN is developed in a mix of C, Rust, and Go, and therefore uses Cgo. Some of the underlying components have not been adapted for ARM, so only x86-64 images are provided. That is not very friendly to Apple Silicon users, but the images run without problems if you follow the instructions in the official documentation.

Compiling, however, was a headache. I use OrbStack, which the TEN developers had probably not tried, and one check failed during compilation. After some digging I found the build script and removed the check, which got me past the problem. The TEN developers confirmed the issue, and it has since been fixed.

Problems with compilation

TEN provides a one-click build script, but TEN involves Python, Go, and many other languages and dependencies, so the script downloads dependencies automatically during the build. Sometimes a download takes a long time and sometimes it fails, and for a while Docker images could not be pulled from within China at all. So in practice, the one-click build could not be completed in a single click.

This is more or less daily life for programmers in China: if you use a system this complex, you need some networking skills. I have a socks5 proxy, but some Python environments do not support socks5, so I had to disable the proxy first and install the following in the development image:

pip install pysocks
pip install "httpx[socks]"

Then, turn on the HTTPS proxy:

export HTTPS_PROXY=socks5://:8888

Here, :8888 is the address of my socks5 proxy, written the way OrbStack lets a Docker container reach the host. This way, I don't have to change the IP address every time I travel around with my laptop.

For OpenAI to be able to use the proxy, configure the following in .env:

OPENAI_PROXY_URL=socks5://:8888

The good news is that the build script is essentially idempotent, so running it a few more times eventually succeeds.
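Because the script is idempotent, the "run it again" step can be automated. The following is a minimal Python sketch, assuming the script reports failure through a non-zero exit status. The helper name run_until_success and its parameters are hypothetical and not part of TEN.

```python
import subprocess
import time


def run_until_success(cmd, max_attempts=5, delay_seconds=0.0, runner=subprocess.run):
    """Re-run an idempotent command until it exits with status 0.

    Returns the number of the attempt that succeeded.
    Raises RuntimeError if every attempt fails.
    `runner` is injectable so the loop can be exercised without a real build.
    """
    for attempt in range(1, max_attempts + 1):
        result = runner(cmd)
        if result.returncode == 0:
            return attempt
        # Pause before the next try; skip the pause after the final attempt.
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    raise RuntimeError(f"{cmd!r} still failing after {max_attempts} attempts")


# Example call (the script's location and arguments are assumptions):
# run_until_success(["bash", "install_deps_and_build.sh"], delay_seconds=10)
```

This only makes sense because re-running the build is safe; a non-idempotent script should not be retried blindly.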

Graph Designer Related Issues

TEN's Graph Designer is a great tool for designing dialog flows. However, the initial version was weak (and it still isn't very strong; it's still in Beta). Graph Designer reads a file on the backend, and after that file is modified, refreshing the page does not pick up the change; you need to restart all the Docker containers. But during development, restarting all the containers meant recompiling and reinstalling a lot of things, which was slow and did not always succeed. When I asked in the WeChat group, the developers were very responsive, but no effective solution came up until I later found the cause myself.

The ten_graph_designer container is just a front end. It connects to a tman service process inside the astra_agents_dev container, which listens on port 49483 by default and is started by the make run-gd-server command. So it is enough to restart this tman service process; there is no need to restart the whole Docker container.

But it's not that simple. make run-gd-server is started together with the Docker container, so once tman is killed the whole container exits as well! So I had to modify the container's configuration, adding an entrypoint so that it starts bash rather than tman:

entrypoint: ["/usr/bin/bash"]

After starting Docker, enter the container again with docker exec -it astra_agents_dev bash and start the tman service process manually with make run-gd-server. From then on, after each modification, simply restarting the tman service process inside the container is enough.

As a newcomer, after making and saving changes in Graph Designer, I habitually check git diff. I found that the formatting of the saved file changes so much that it is impossible to tell what was actually changed. It would be good to keep the format consistent. There seems to be a PR already following up on this.
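Until the format is stable, one workaround is to normalize both versions before comparing them. The following is a minimal Python sketch, assuming the saved graph file is plain JSON. The function names are hypothetical, and sorting keys is my own choice rather than anything Graph Designer does.

```python
import difflib
import json


def normalize_json(text, indent=2):
    """Re-serialize JSON with sorted keys and fixed indentation."""
    data = json.loads(text)
    return json.dumps(data, indent=indent, sort_keys=True, ensure_ascii=False) + "\n"


def normalized_diff(old_text, new_text):
    """Unified diff of two JSON documents, ignoring formatting and key order."""
    old_lines = normalize_json(old_text).splitlines(keepends=True)
    new_lines = normalize_json(new_text).splitlines(keepends=True)
    return "".join(difflib.unified_diff(old_lines, new_lines, "before", "after"))


# Two documents that differ only in layout produce an empty diff;
# a real value change shows up as a -/+ pair of lines.
```

List order is preserved, so reordered array elements still show up as changes; only whitespace and object key order are ignored.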

In fact, this problem was not hard to track down. I was simply in a hurry to get my own work on the framework done, and a full restart was at least usable. Looking back, the old saying holds that sharpening the axe does not delay cutting the firewood: had I dealt with this earlier, I would have saved quite a bit of time.

Cache Optimization

When the astra_agents_dev container installs Python or Go dependency packages, a lot gets cached in the /root/.cache directory. Persisting it by mapping it to a .cache directory on the host speeds up subsequent builds. Of course, this change requires a restart of the container.

volumes:
  - ./:/app
  - ./.cache:/root/.cache

Compilation process optimization

The whole build is basically done by the install_deps_and_build.sh script, which compiles everything. This is very time consuming. For now I am only developing a Go extension and have not touched the Python code, so there is no need to check and download Python dependencies every time. Fortunately the script is clearly written, so I simply commented out every line unrelated to Go, which made the build much faster.

  echo "install dependencies..."
  #tman install

  # build extensions and app
  echo "build_cxx_extensions..."
  #build_cxx_extensions $APP_HOME
  echo "build_go_app..."
  build_go_app $APP_HOME
  echo "install_python_requirements..."
  #install_python_requirements $APP_HOME

  echo "post installation..."
  #post_install $APP_HOME

Of course, if you only want to run TEN Framework locally, there is no need to go to these lengths, since you usually compile only once. I, however, need to write plugins.

Writing a Hello World plugin

Following the official documentation, I wrote a Hello World plugin in Go. It went fairly smoothly, but a few changes were needed. After generating the code with tman, you have to edit the init function in hello_world/default_extension.go and change one string to hello_world; otherwise you get an error saying the plugin cannot be found.

	// Register addon
	(
		"hello_world",
		(newDefaultExtension),
	)

After generating the plugin with tman, git diff shows a change in a manifest file, but after compilation the change is partially reverted. The developers said it might be a bug, but in my testing it does not seem to affect operation.

A bit of advice

Graph Designer is still in Beta, but the functionality is already very good. However, there are still some improvements that can be made.

First, it looks cool, but it took me a long time to figure out how it works. The documentation is not very complete either; flush in particular is not explained at all, so I had to dig through the source code.

Graphs look cool, and the dotted lines representing data flow are animated, so the data appears to be flowing. But once the graph gets complex, it becomes hard on the eyes. I feel the lines should be splines rather than straight lines, and different colors could represent different data types. Compare the two diagrams below to see which one is easier to understand.

Original graph

Graphviz graph

I generated the latter graph with Graphviz. I'm not really familiar with Python, but with ChatGPT I can still write Python code. I used the following prompt:

write a python script, convert json into graphviz digraph

graph nodes take from nodes
nodes label use html syntax
graph edge take from connections
node has ports like data and text_data, make them in a sub table, in ports at left, out ports and right
an example json follows

... Followed by a simplified element ...

Of course, the generated code did not run as-is. I fixed the errors, made a lot of improvements, and finally submitted a PR.
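For readers who want to try the idea, here is a rough Python sketch of the kind of script the prompt describes. It is not the code from the PR. The input layout (a nodes list with name, and a connections list with from, to, and port) is a simplified schema I made up for illustration and is not TEN's actual graph format. The script only emits DOT text, so it needs no Graphviz bindings.

```python
from html import escape


def quote(s):
    """Quote a string as a DOT identifier."""
    return '"' + s.replace('"', '\\"') + '"'


def collect_ports(graph):
    """Map each node name to its in/out port names, taken from connections."""
    ports = {n["name"]: {"in": [], "out": []} for n in graph["nodes"]}
    for c in graph["connections"]:
        if c["port"] not in ports[c["from"]]["out"]:
            ports[c["from"]]["out"].append(c["port"])
        if c["port"] not in ports[c["to"]]["in"]:
            ports[c["to"]]["in"].append(c["port"])
    return ports


def port_table(names, prefix):
    """Render one cell holding a sub table with a PORT-tagged row per port."""
    if not names:
        return "<TD> </TD>"
    rows = "".join(
        f'<TR><TD PORT="{prefix}_{escape(n)}">{escape(n)}</TD></TR>' for n in names
    )
    return f'<TD><TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0">{rows}</TABLE></TD>'


def to_dot(graph):
    """Convert the (hypothetical) graph dict into a Graphviz digraph."""
    lines = ["digraph G {", "  rankdir=LR;", "  splines=spline;", "  node [shape=plain];"]
    for name, p in collect_ports(graph).items():
        # In ports on the left, node name in the middle, out ports on the right.
        label = (
            '<TABLE BORDER="0" CELLBORDER="1" CELLSPACING="0"><TR>'
            + port_table(p["in"], "in")
            + f"<TD><B>{escape(name)}</B></TD>"
            + port_table(p["out"], "out")
            + "</TR></TABLE>"
        )
        lines.append(f"  {quote(name)} [label=<{label}>];")
    for c in graph["connections"]:
        lines.append(
            f'  {quote(c["from"])}:{quote("out_" + c["port"])}:e -> '
            f'{quote(c["to"])}:{quote("in_" + c["port"])}:w;'
        )
    lines.append("}")
    return "\n".join(lines) + "\n"
```

Load the JSON with json.load, pass the resulting dict to to_dot, write the text to a file, and render it with Graphviz's dot command, for example dot -Tsvg graph.dot -o graph.svg.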

What else have we done?

We have actually explored quite a lot. The TEN framework includes a front end and a back end. Before we understood the back end, we built our own front end and pointed it at the official TEN (then called Astra) back end to give demos to customers, freeloading on TEN's traffic until one day we found it had been rate limited. However, we only tested it ourselves and never publicized it, so the limit was presumably due to the large number of TEN trial users rather than us 😂.

Of course, this method is a secret, so we won't talk about it.

In addition, we have connected XSwitch directly to TEN. XSwitch is a softswitch platform that supports SIP, H323, WebRTC, and other protocols. We first connected through Agora, which is supported by the agora_rtc component, and the signaling is not complicated: just three APIs, start, ping, and stop. We wrote a Lua script directly in XSwitch and got it working, so with any PSTN or SIP phone you can dial a number and chat directly with TEN.

After the tinkering of the last few days, I have learned to write plugins. I wrote an xswitch plugin directly in TEN, so xswitch can now be used in Graph Designer as well. But the TEN Store is not ready yet, so we have not released it. We are only at Alpha, and some of the flows are not yet tuned.

On the day before the National Day holiday, I modified the agents framework so that the XSwitch plugin can be started directly from the TEN framework, without even needing agora_rtc. But I haven't told anyone 😂.

More about TEN

The TEN framework actually has three parts. The bottom layer is called TEN Framework, written in C, Rust, and Go. I have looked at the source code but have not yet learned how to compile it. It seems you do not need to compile it unless you are developing the underlying layers.

Most people do not need to change TEN Framework at all; what I have mainly been working with these days is TEN Agent. The magic of TEN is that you can write plugins in Python, Go, C++, and other languages, link them into a graph through Graph Designer (or JSON), and plugins written in different languages can even run in the same process!

The third part is the Graph Designer mentioned above. In the future, most developers may only need to know Graph Designer to assemble different dialog flows. If Graph Designer matures and TEN Agent has a rich set of plugins, we developers may no longer be needed.

That said, the TEN framework is actually very well written, and reading the source code shows that its authors are very good at what they do. Although there is not much documentation yet, it covers a lot of concepts and the basics of the framework's design, and it is fairly developer-friendly: it even covers the tman tool, how to set up VS Code, how to debug, and so on. But because the documentation is thin, a lot still has to be figured out on your own. The WeChat group support is very active, but you can't ask about everything, haha.

More to look forward to

I expect TEN to keep getting better. I also look forward to a better way of organizing source code, so that an Extension you develop yourself can live in a separate repository (open source or not) instead of being mixed in with the existing Agent code.

Wrap-up

I did not go out on the first day of the National Day holiday; instead I tinkered with TEN and wrote this post along the way.

Needless to say, there is still a barrier to entry.

  • First, Docker itself is a hurdle for a lot of people.
  • Second, although official Docker images are provided, you still need to compile things yourself, and because some dependencies cannot be downloaded smoothly, that is a high bar.
  • Third, the framework supports a lot of languages, which is a good thing but also demands a lot of the developer. That said, anyone who has read my book The Great Tao is Simple should have nothing to fear.
  • Finally, TEN is still very young, the documentation is incomplete, and what documentation there is is mostly in English.

Of course, none of this is really a problem. Many people say that with AI, programmers are no longer needed. But I am a programmer who knows a lot of languages and has been in the RTC business for many years, and it still took me a lot of time to get the TEN framework working. It is not that programmers will not be needed in the future; it is that future programmers will need to know a lot of languages. The TEN framework alone contains C/C++, Rust, Go, Python, Shell, JavaScript, and more, and if you know none of them it is very hard to get anywhere. That was my original purpose in writing The Great Tao is Simple: to give a fundamental, systematic understanding of the nature of programming and development.

I'm short on time, so this is only a brief note, and I hope it helps. The TEN framework is a promising one, so if I find anything new later, I'll share it with you.

Update

The day after writing this post, something big happened: OpenAI announced the Realtime API and a partnership with Agora. Agora's stock (ticker symbol API, which looks classy at first glance) has presumably gone up a lot, and I rather regret not having a US brokerage account. Related links are below:

  • /index/introducing-the-realtime-api/
  • /en/blog/agora-and-openai-enabling-natural-real-time-conversational-ai/

Not sure how much the TEN framework helps with this, but it seems that everything is cause and effect.

The TEN team has been working through the National Day holiday, not only merging the 5.0 branch but also implementing OpenAI's latest Realtime API, a pace I have not been able to keep up with yet. It looks like I'm going to have a lot of work to do in the next few days.

Another thing: I also have a video about XSwitch interfacing with TEN on my WeChat Channels account, but Channels videos don't seem to have direct links, so if you're interested, you'll have to look it up yourself.

Permanent link to this article:/2024/10/01/