Shanghai Artificial Intelligence Laboratory releases Shusheng·Tianji LandMark, a large 3D real-scene model

2023-07-07 03:00:09

Source: The Paper

Reporter Wu Tianyi Intern Chen Xiaorui


·Shusheng·Tianji LandMark is the world's first NeRF large 3D real-scene model at the 100-billion-parameter scale.

·Shusheng·Tianji LandMark supports multiple city-level editing functions. In the demonstration, NeRF technology was used to change the style and the lighting of the Wukang Building according to the time of day, and the Chinese Culture Palace could be rotated as a whole or layer by layer.

·Shanghai Artificial Intelligence Laboratory launched the Shusheng general large model system, which comprises three foundation models (Shusheng·Multimodal, Shusheng·Puyu, and Shusheng·Tianji), together with the first full-chain open-source system for large model development and application.

On July 6, at the Frontiers of Science plenary session of the 2023 World Artificial Intelligence Conference (WAIC), Lin Dahua, an expert in deep learning and computer science, a professor at the Shanghai Artificial Intelligence Laboratory, and a professor at the Chinese University of Hong Kong, released Shusheng·Tianji LandMark, a large 3D real-scene model, and introduced its technical principles and applications.

Lin Dahua said that Shusheng·Tianji LandMark is the world's first NeRF large 3D real-scene model at the 100-billion-parameter scale, jointly developed by the Shanghai Artificial Intelligence Laboratory, the Chinese University of Hong Kong, and the Shanghai Surveying and Mapping Institute. It extends NeRF (neural radiance field) light field modeling from the object level to the city level. Lin Dahua said that the release of Shusheng·Tianji LandMark is an innovative application of large models, which "provides the technical possibility for us to realize city-level AIGC (Artificial Intelligence Generated Content) in the future."

The "Shusheng general large model system" (hereinafter the "Shusheng large model") was also released for the first time at the conference. It comprises three foundation models, Shusheng·Multimodal, Shusheng·Puyu, and Shusheng·Tianji, as well as the first full-chain open-source system for large model R&D and application.

From an apple to a whole city

"In addition to generating text, large models can also give us a more imaginative world." Lin Dahua said that Shusheng·Tianji LandMark uses NeRF technology to open up more possibilities for applying large model technology.

NeRF is a new type of 3D light field modeling technology, first proposed by a Google research team in March 2020. It was initially applied to 3D modeling and was limited to small objects about the size of an apple. "But we think NeRF technology can do more than that," Lin Dahua said. "On December 10, 2021, our team was the first to propose extending NeRF light field modeling from the object level of a small apple to the city level. This was the first time in the world that the capabilities of NeRF technology were extended from objects to cities." He said that some time after his research team proposed city-level NeRF, Carnegie Mellon University and Google released their own city-level NeRF technologies.

On December 10, 2021, Lin Dahua's team first proposed to extend the NeRF light field modeling capability from the object level of a small apple to the city level.

"Based on the core technology of city-level NeRF, we are constantly improving its scalability and capabilities." Lin Dahua explained that Shusheng·Tianji LandMark is based on the research team's second-generation CT NeRF technology and algorithms and supports full-range, high-precision real-time rendering. The model contains 200 billion parameters and covers 100 square kilometers, and every detail in the real scene supports 4K high-definition resolution.

Real-scene 3D is a digital space that reflects and expresses, in real, three-dimensional, and time-sequenced form, the spaces of human production, living, and ecology within a given area. According to reports, Shusheng·Tianji LandMark integrates algorithms, operators, and computing systems, and proposes a new representation and training paradigm for real-scene 3D at the model level. It trains efficiently while accurately representing large-scale 3D urban scenes and achieving high-quality neural rendering. It leads in four respects: high-precision modeling, high-precision rendering, functional scalability, and the integration of training and interaction.

Shusheng·Tianji LandMark can also support functions such as city-level editing and style conversion. In the demonstration, the Wukang Building can use NeRF technology to change its style and light and shadow effects according to different time periods; the Chinese Culture Palace can perform overall rotation or rotation of different layers. "This provides a technical possibility for our city-level AIGC in the future." Lin Dahua said.

Various parts of the Chinese Culture Palace can be "rotated".

Lin Dahua said, "I hope that through the new 3D real-scene generation technology, we can inject new imagination and room for innovation into our future urban space. In the future, Shanghai AI Lab will expand the modeling scope and functions of Shusheng·Tianji, and the algorithms, operators, and systems of Shusheng·Tianji will all be open source."

First release of the Shusheng general large model system

At the meeting, Lin Dahua also introduced the Shusheng general large model system, which comprises three foundation models: Shusheng·Multimodal, Shusheng·Puyu, and Shusheng·Tianji. He also launched the first full-chain open-source system for large model development and application. Among the three, the multimodal large model has 20 billion parameters, supports 3.5 million semantic tags, and leads the world on more than 80 tasks. Shusheng·Puyu is the first officially released 100-billion-parameter large model in China that supports multiple languages.

"Shusheng·Puyu has surpassed LLaMA-7B (an artificial intelligence language model developed by Meta AI's FAIR team) in all dimensions." Lin Dahua said that, as a large model at the hundred-billion-parameter scale, Shusheng·Puyu has reached a high level of accuracy in all dimensions, in each case surpassing the best existing open-source models in China.

On June 7 this year, Shanghai AI Lab and SenseTime, together with the Chinese University of Hong Kong, Fudan University, and Shanghai Jiaotong University, released the "Shusheng·Puyu" large language model. The model has 104 billion parameters, placing it among the current large language models at the hundred-billion-parameter scale. It was trained on a high-quality multilingual data set containing 1.6 trillion tokens.

According to reports, since its official debut in June, Shusheng·Puyu has undergone a comprehensive upgrade within one month, covering five aspects. First, the context window has been extended from 2K to 8K, which enables the model to understand long inputs, carry out complex reasoning, and sustain long multi-turn dialogues. Second, its multilingual and structured expression capabilities have been further strengthened: the new version supports more than 20 languages and can summarize and present complex information in tables and charts. Third, its capabilities have improved across multiple dimensions: performance on 42 mainstream evaluation sets has improved significantly, and on 35 of them it surpasses ChatGPT. Fourth, its mathematical and logical ability has improved significantly, with large gains in numerical calculation, function operations, and equation solving. Its score on the mathematics evaluation set GSM8K has risen from 62.9 to 73.2, and on the multiple-choice questions of the 2023 college entrance examination, the accuracy rate has increased by more than 70%. Fifth, its safety and alignment capabilities have been significantly enhanced. Through more effective instruction fine-tuning, including reinforcement learning from human feedback (RLHF), the new version follows human instructions more reliably, and its safety has also improved noticeably.

"The ultimate value of all large models is still to create value for life and production. The Shanghai Artificial Intelligence Laboratory not only achieves technological breakthroughs through innovation, but is also committed to promoting the implementation of these technologies in specific industries." Lin Dahua said at the meeting.

Lin Dahua said that in addition to the large model itself, the team has also open-sourced a full-chain tool system covering the five main stages of large model development: data, pre-training, fine-tuning, deployment, and evaluation. "Through the open-source tool system, the value of the model can be fully realized. I believe that open source can really help developers develop and innovate on the basis of large models."

According to reports, the officially open-sourced version is the lightweight InternLM-7B, with 7 billion parameters. It shows excellent and balanced performance in a full-dimensional evaluation spanning 40 evaluation sets, ahead of existing open-source models.
