DeepSeek is a rising star in domestic large-model technology, and it has recently drawn wide attention from both the industry and the media, so I wanted to learn about its technology.
In the era of large models, the natural first step in learning something new is, of course, to ask a large model. Since I had not used DeepSeek much over the past year, I decided to compare it against Wenxin Yiyan (4.0 Turbo).
I put the same question to both: "DeepSeek embraces an open-source culture and publishes its core technology as papers; what are its key technical papers?" Wenxin Yiyan's answer was clearly better, though this may be related to its "web search" capability.
The results are shown below.
Wenxin Yiyan's answer:
DeepSeek's answer:
Verifying the answers above
The four papers in Wenxin Yiyan's answer are all real. The results are correct and sorted chronologically, so the overall experience is better.
In DeepSeek's answer:
《DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models》 is a real paper; this part is correct.
For the other papers listed, I could not find any corresponding pages through search engines, so they are most likely hallucinated answers:
《DeepSeek-Long: Advancing LLMs' Context Window to 128K through Positional Interpolation and Attention Optimization》
《DeepSeek-R1: Retrieval-Augmented LLM with Real-Time Knowledge Update》
《DeepSeek-Math: Improving Mathematical Reasoning in LLMs via Symbolic Supervision》