When it comes to the hottest projects in AI video generation, China's Kuaishou Kling is arguably even hotter than OpenAI's Sora. Some time ago, bloggers who applied for Kling's trial channel queued behind hundreds of thousands of people before finally getting access, while overseas users who couldn't use it at all could only look on anxiously and post "begging for an invite!" on social media.
But just recently, Kling started charging, and with an annual fee of several thousand yuan, that's a bit steep!
Then, just a few days ago, CogVideoX, an open-source counterpart to Kuaishou's Kling, was released. It can even be deployed on your local computer, and as the image below shows, its results are by no means inferior to Kling's!
Latest Chinese-language build of CogVideoX:
Baidu Netdisk: /s/1b2dS7Wj6-yta7xo7NrjihQ?pwd=7hmo
CogVideoX limits prompts to 226 tokens. Generated videos are 6 seconds long at a frame rate of 8 frames/second, with a resolution of 720×480. Users can generate imaginative videos simply by typing in a piece of text, and can freely choose the style of the generated video, including cartoon, 3D, black-and-white, oil painting, film, and more; the software also comes with built-in music.
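To make those limits concrete, here is a tiny pre-flight check (a sketch with made-up constant and function names; note that the real 226-token limit is counted by the model's own tokenizer, which the whitespace split below only roughly approximates):

```python
# Hypothetical pre-flight check for the limits quoted above.
# NOTE: CogVideoX counts tokens with its own tokenizer; splitting
# on whitespace is only a rough approximation of that limit.
MAX_TOKENS = 226
DURATION_S = 6
FPS = 8
RESOLUTION = (720, 480)

def check_prompt(prompt: str) -> bool:
    """Return True if the prompt is (approximately) within the token limit."""
    return len(prompt.split()) <= MAX_TOKENS

total_frames = DURATION_S * FPS  # 6 s * 8 fps = 48 frames per clip
print(total_frames, check_prompt("a panda playing guitar in a bamboo forest"))
```

Keeping prompts well under the limit is safer than skirting it, since real tokenizers often produce more tokens than a whitespace count suggests.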
CogVideoX achieves efficient and coherent video generation by fusing the textual, temporal, and spatial dimensions, using a DiT (Diffusion Transformer) architecture and optimized algorithms that significantly improve inference speed and generation quality.
Workflow
- Data preprocessing: convert the input image or video data into a format the model can process, slicing each frame into fixed-size patches (small blocks) and then converting these patches into feature vectors
- Noise introduction: gradually add noise to the preprocessed feature vectors, forming a forward diffusion process of increasing noise; this can be viewed as a transformation from the original data into pure noise
- Model training: using the noised feature vectors as input, train a Transformer model whose goal is to learn to reverse the noising process, i.e., to recover the original data from the noisy data
- Video generation: once training is complete, randomly sampled noise is fed into the model, which has learned the mapping from noise back to data; the model processes it to produce a new image or video
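The first two steps above can be sketched in a few lines of NumPy (a toy illustration only, not CogVideoX's actual code; the patch size, noise schedule, and dimensions are assumptions chosen for demonstration):

```python
import numpy as np

def patchify(frame: np.ndarray, patch: int = 8) -> np.ndarray:
    """Slice an H x W x C frame into fixed-size patches and flatten
    each patch into a feature vector (the preprocessing step)."""
    h, w, c = frame.shape
    assert h % patch == 0 and w % patch == 0
    frame = frame.reshape(h // patch, patch, w // patch, patch, c)
    # Result: (num_patches, patch * patch * c)
    return frame.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * c)

def add_noise(x0: np.ndarray, t: int, T: int = 1000):
    """Forward diffusion: mix clean vectors x0 with Gaussian noise.
    alpha_bar shrinks from ~1 toward 0 as t grows, so x_t drifts
    from the original data toward pure noise (the noising step)."""
    alpha_bar = np.cos(0.5 * np.pi * t / T) ** 2  # a common cosine schedule
    eps = np.random.randn(*x0.shape)
    xt = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    return xt, eps

# Demo: one 480x720 RGB frame, matching CogVideoX's 720x480 output size.
frame = np.random.rand(480, 720, 3)
tokens = patchify(frame)            # shape (5400, 192) with 8x8 patches
noisy, eps = add_noise(tokens, t=500)
# During training, the Transformer takes (noisy, t) and is trained to
# predict eps; generation then runs this process in reverse from noise.
print(tokens.shape, noisy.shape)
```

The real model adds a temporal axis (patches across frames) and conditions on the text prompt, but the patchify-then-noise structure is the same idea.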
Usage
1. Click the Load button in the bottom-right corner of the software interface, select the file bundled with the project, and load it
2. In the page that opens, enter the prompt for the video you want to generate
3. Set the parameters (the defaults are generally fine)
4. Click the Queue Prompt button
Wait a little while and the video is ready. The emperor from Empresses in the Palace gnawing on a chicken leg? How come I've never seen that scene?
Caveats
① The project's installation path must not contain Chinese characters
② A GTX 1070 or better graphics card is recommended for running this program
③ If you accidentally close the program's background console while using it, reopen it and refresh the web page