The evolution of "SenseChat 2.0": standing at the front of SenseTime's large-model layout
We are in the midst of a massive wave of new AI infrastructure.
Within half a year, large models have rapidly spread from a small-circle consensus into the mainstream. According to a report released by CITIC, nearly 80 large models with more than one billion parameters have been released so far, half from enterprises and half from research institutions.
As the domestic large-model ecosystem gradually takes shape, it has also begun to move beyond chasing OpenAI and to find its own path. The yardstick for measuring large models has likewise shifted from head-to-head parameter competitions to solving real problems.
SenseTime first announced its "SenseNova" ("Daily New") large-model system in April this year, releasing a number of large AI models and applications, including the self-developed Chinese large language model "SenseChat". Recently, at the World Artificial Intelligence Conference, SenseTime announced the first major iteration of the SenseNova system: the large language model "SenseChat" has been upgraded to version 2.0.
It is stronger, and its role in SenseTime's overall large-model layout is becoming increasingly clear.
A Stronger "SenseChat 2.0"
How can the capability improvement of "SenseChat 2.0" be shown intuitively? Xu Li, chairman and CEO of SenseTime, demonstrated a dialogue that never happened: a conversation between Lao Tzu and Confucius.
Confucius said: "I have long heard of the Master's name; it is truly a great fortune to meet you today!"
Lao Tzu replied with a smile: "Not at all. You and I walk the same path; why speak of 'the luck of three lifetimes'?"
As requested, the entire dialogue was rendered in classical Chinese. And to avoid confusion, "SenseChat 2.0" prefaced its answer with the caveat that "this is only fiction and should not be taken as a true historical record".
When "SenseChat 1.0" first launched, the live demonstration already showed excellent multi-turn dialogue and human-machine co-creation capabilities. Three months later, "SenseChat 2.0" has further improved in factual accuracy, logical reasoning, contextual understanding, and creativity.
For example, ask "SenseChat 2.0" to plan a trip and tell it to present the result as a table:
In terms of languages, "SenseChat 2.0" adds Arabic, Cantonese, and others, and supports interaction across Simplified Chinese, Traditional Chinese, English, and more. Its support for very long texts has also grown from 2k to 32k, enabling a better understanding of context.
For ToB-oriented large-model vendors such as SenseTime, the quality of the base model itself is only the starting point. How enterprise customers can shape a large model to their specific needs, and how the vendor can deliver a stable, step-by-step iteration process toward that goal: this is the real pain point, and where the winners will be decided.
Open Knowledge Base Fusion Capabilities
Now that SenseTime has trained a "SenseChat 2.0" with strong understanding, dialogue, and reasoning abilities, enterprise customers can also use their accumulated corporate knowledge to turn the large model into a "professional" that serves their own business well.
Solving these engineering problems efficiently is crucial.
Wang Xiaogang, co-founder and chief scientist of SenseTime, said: "With a knowledge base, it is relatively simple and convenient to bring in the relevant domain knowledge without folding it into the model itself." And because the information is more accurate, this approach also mitigates the hallucination problem.
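The knowledge-base approach Wang describes resembles retrieval-grounded generation: relevant enterprise documents are looked up first and supplied to the model as context, so answers come from company data rather than the model's parameters. The sketch below is a minimal, hypothetical illustration of that pattern; the function names, the naive keyword scoring, and the prompt template are all assumptions for illustration, not SenseTime's actual API.

```python
# Hypothetical sketch of knowledge-base "fusion": retrieve the most
# relevant enterprise documents for a query, then prepend them to the
# model prompt so the answer is grounded in company data.

def score(query_words, doc):
    """Naive relevance score: count of query words present in the document."""
    return sum(1 for w in query_words if w in doc.lower())

def retrieve(query, documents, top_k=2):
    """Return the top_k documents most relevant to the query."""
    words = query.lower().split()
    return sorted(documents, key=lambda d: score(words, d), reverse=True)[:top_k]

def build_prompt(query, documents):
    """Assemble a grounded prompt: retrieved facts first, question last."""
    background = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return (
        "Answer using only the background facts below.\n"
        f"Background:\n{background}\n"
        f"Question: {query}"
    )

# Toy enterprise knowledge base (illustrative data).
docs = [
    "Product A warranty lasts 24 months from date of purchase.",
    "Product B ships only within mainland China.",
    "Support hours are 9:00-18:00 on weekdays.",
]
prompt = build_prompt("How long is the Product A warranty?", docs)
```

A production system would replace the keyword scoring with embedding-based semantic search, but the shape of the pipeline (retrieve, then assemble a grounded prompt) stays the same.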
Digital Human as a Productivity Tool
Alongside the comprehensive upgrade of "SenseChat 2.0", the AIGC platforms in the SenseNova system keep breaking new ground, and after integrating the language model's capabilities they have achieved a leap in quality.
For example, the text-to-image creation platform "Miaohua" has been upgraded to version 3.0, with model parameters increased to around 7 billion; the detail of generated images now approaches professional photography. As for the headache of writing prompts, "SenseChat 2.0" gives "Miaohua 3.0" the ability to expand prompts automatically, which means users need only a few simple words to get a richly detailed image.
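Automatic prompt expansion of this kind can be sketched as a language model rewriting a terse user prompt into a detailed description before it reaches the image model. Everything below is a hypothetical illustration: the template, the `expand_prompt` helper, and the offline stand-in chat function are assumptions, not the SenseNova API.

```python
# Hypothetical sketch of automatic prompt expansion for text-to-image
# generation: a chat model enriches a short prompt with concrete detail
# before the image model sees it.

EXPANSION_TEMPLATE = (
    "Expand this image prompt with concrete details about subject, "
    "lighting, composition, and style, in one sentence: {prompt}"
)

def expand_prompt(user_prompt, chat):
    """Ask a chat model to enrich a terse prompt before image generation."""
    return chat(EXPANSION_TEMPLATE.format(prompt=user_prompt))

def fake_chat(message):
    """Offline stand-in for a real chat model, so the sketch runs anywhere."""
    terse = message.split(": ", 1)[1]
    return terse + ", golden-hour light, 35mm, rule of thirds"

detailed = expand_prompt("a cat on a roof", fake_chat)
```

In a real deployment, `fake_chat` would be replaced by a call to the language model, and the expanded string would be passed straight to the image-generation endpoint.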
In the digital-human field, SenseTime's digital-human video generation platform "Ruying" has also been upgraded to version 2.0. The fluency of voice and lip movement in "Ruying 2.0" has improved by more than 30%, and it can now produce 4K video. At the launch event, digital-human avatars of economist Ren Zeping, Master Yancan, and Xu Li appeared, and the results were convincingly realistic.
Digital humans are an important vehicle for landing large models in real scenarios, and the recently popular digital-human livestreaming is a typical example. Livestreaming, along with short video, was also among the scenarios customers focused on most during the three months of internal and public testing of "Ruying 2.0".
Luan Qing, general manager of SenseTime's Digital Entertainment Department, said that within the AIGC framework, "SenseChat 2.0" can handle copywriting and script creation for short-video livestreams. How "Ruying 2.0" keeps its delivery on trend likewise depends on "SenseChat 2.0"'s ability to learn from the latest short-video corpora.
Beyond short video and livestreaming, "Ruying 2.0" is accelerating its entry into all walks of life.
In the insurance industry, for example, every insurance agent needs to push new products or other personalized, service-oriented content to customers. "Ruying 2.0" can stand in for agents by delivering personalized content and services on customers' birthdays or when certain wealth-management products launch. In the education industry, "Ruying 2.0" has begun helping teachers on leading domestic vocational-education platforms produce teaching materials, meeting their internal video-production needs.
"The digital human is a typical efficiency tool within an enterprise," Luan Qing said.
As an AIGC creation platform, Ruying will continue to go deeper into video generation. Luan Qing believes this is because content creation is undergoing a dimensional shift from text and images to video.
Towards Multimodal
Since images and video account for a far larger share of real-world information than language, the need to understand the real world will push foundation models toward multimodality, and "SenseChat 2.0" already offers the first clues.
Beyond text, "SenseChat 2.0" can already analyze image and video content.
Current large-model research is built on the Transformer architecture. "SenseTime has been doing large-model research since 2019; at that time it was on the vision route," said Wang Xiaogang. Today, he noted, vision and natural-language approaches are gradually converging: "As we move in a multimodal direction, language and vision are beginning to integrate more deeply, and this reflects our relatively strong accumulation and capability in this area."
Many application scenarios we encounter in real life, in fields such as autonomous driving and robotics, must rely on multimodality. "But multimodal data and tasks are often hard to obtain and require deep industry accumulation. That is also SenseTime's advantage," Wang Xiaogang said.
Three months after its first public appearance, SenseTime's "SenseNova" large-model system has been fully upgraded at this year's World Artificial Intelligence Conference and opened to enterprise users. Meanwhile, largely unnoticed, SenseTime has also released the "Shusheng" multimodal large model jointly with the Shanghai Artificial Intelligence Laboratory. Whether SenseTime can be the first to find the key to the multimodal road is worth watching.