2023.06.30
On June 30, the DataCanvas new product launch event, "New AI · New Data · New Software", was held in Beijing. Guests from more than 50 universities, institutions, and media outlets, including Tsinghua University, the China Academy of Information and Communications Technology, IDC China, and Xinhua News Agency, joined DataCanvas to witness the debut of two revolutionary AI product series: "AIFS Artificial Intelligence Foundation Software" and "DataPilot".
Technological breakthroughs in large models have opened up a corner of a new AI universe and let people imagine a kaleidoscopic future. When belief becomes a driving force, it can propel immeasurable leaps in development. At the launch event, DataCanvas Chairman Dr. Fang Lei set out his own view of the large-model era from the perspective of an AI technology company.
Dr. Fang Lei, Chairman of DataCanvas
"The era of large models requires a complete infrastructure upgrade rather than reliance on a single large model to solve every problem. Putting large models into practice will solve harder problems and have deeper impact, and it is no easier than doing so with small models."
—— Dr. Fang Lei, Chairman of DataCanvas
Fang Lei pointed out that a complete infrastructure has three major elements: computing power, data, and foundation software. Whether large models can achieve epoch-making development depends on all three elements advancing in step.
Computing power reflects the progress of hardware, and on current trends computing resources will no longer be in short supply. The effective storage, computation, and circulation of data still have broad room to develop. In the real world, industries, enterprises, and professions each hold their own independent data territories. The sheer volume of data and the difficulty of connecting these territories make general-purpose large models hard to put into practice. Large models will therefore be applied earlier, and more widely, as vertical models for specific industries and enterprises, and vertical large models will far outnumber general ones. Foundation software, meanwhile, varies significantly in performance and cost. Fang Lei pointed out that there is huge room for the unified optimization of software, models, and hardware, and that of the three elements, foundation software is the most active area of innovation.
DataCanvas AIFS & DataPilot Product System
AI technology in the large-model era still has to cover the last mile to real-world application. Powerful and flexible foundation software, open and flexible white-box models, and professionals who know the business well will speed up the crossing of that last mile. The AIFS & DataPilot product system, the centerpiece of this launch event, is a new technological achievement aimed at this goal.
AIFS, using AI Force to explore the boundaries of model application capabilities
AIFS (AI Foundation Software) is DataCanvas's answer to building comprehensive AI capabilities in the large-model-driven era of New AI, and it is also the latest upgrade of the DataCanvas product family.
As an industry-leading infrastructure platform for building AI applications, AIFS covers the full lifecycle of both large and small models: training, fine-tuning, compression, deployment, inference, and monitoring. It provides a set of tools for data scientists, application developers, and business experts, so that people in different roles can collaborate, process data easily, use it for development, and train and deploy models of any size.
Yu Jiangang, Vice President of DataCanvas
As a foundation software system for artificial intelligence, AIFS consists mainly of a series of fully open, highly automated, and highly collaborative software tools: the DataCanvas Alaya large-model matrix, the DataCanvas APS AI infrastructure service platform, the DataCanvas BAP automated business modeling platform, the open-source DAT automated machine learning software, and the open-source YLearn causal learning software. Together they give users one-stop support for independently building "big+small" models across the entire lifecycle.
The newly released DataCanvas Alaya large model offers a "General+Industry" matrix of models, multimodal large models, an optimized training mechanism, and user-friendly open-source licensing. On the open-source side, DataCanvas Alaya not only adopts the Apache 2.0 license but also provides users with white-box models, which stands out in today's large-model industry. Yu Jiangang emphasized that this reflects the company's commitment to product openness: the aim is to give users greater freedom to innovate with AI and so accelerate the application of large models in diverse business scenarios.
Miao Xu, Chief AI Scientist of DataCanvas
Multimodality is the next important technical frontier for large AI models. Miao Xu, Chief AI Scientist of DataCanvas, outlined the technical roadmap of DataCanvas Alaya in the multimodal direction. To support a wide range of industries, DataCanvas Alaya is investing heavily in the integration and alignment of structured and unstructured data, so that large models can draw not only on text but also on industry databases and large volumes of hard-to-process sensor sequences, bringing intelligence closer to business scenarios. DataCanvas Alaya also offers an innovation in fine-tuning for enterprises building their own large models: it improves efficiency by dividing a complex fine-tuning task into subtasks, which makes customizing a large model easier and more flexible.
DataCanvas AIFS is more than a small step from small models into the era of "big+small" models. It is a big step toward fusing core software capabilities with white-box models in the large-model era. AIFS also greatly empowers the application side: it helps enterprises build personalized, autonomous large models, integrate them with the small models accumulated in the past, and apply both to the business, pressing the "Enter" key on large-model adoption.
DataPilot, a new-era spacecraft for exploring the "vector ocean" of data
AI and data have always been closely linked. For the past decade or so, data has generally been treated as the raw material and fundamental ingredient of AI, an architecture we call the era of "Data for AI". With the emergence of large models, AI now empowers data in return, which is the hallmark of the new era of New Data.
When data meets the breakthrough in AI capabilities, how will its future change? DataPilot offers an answer, and "vector ocean" is its keyword.
Zhou Xiaoling, Vice President of DataCanvas
DataPilot, a new paradigm for data processing, is a next-generation data architecture tool that DataCanvas has built on large models. By drawing on DataCanvas Alaya's general text understanding and generation abilities, together with fine-tuning and optimization for the data domain, DataPilot helps users make their data work intelligent and automated across the entire modeling lifecycle.
Vector Ocean, on the other hand, is the ultimate form of data development creatively proposed by DataCanvas based on years of research and practice in the field of databases, combined with the development direction of vector data.
Zhou Xiaoling explained that DataPilot's features include a multimodal "vector ocean" data architecture; on-demand automated data integration, code generation, process orchestration, and analytical computation; and natural-language-based data acquisition, analysis, and machine learning modeling. DataPilot can significantly lower the technical barriers across the whole process of data integration, governance, modeling, computation, querying, analysis, and machine learning modeling, reduce the cost of data-driven business development, and accelerate digital innovation.
Hu Zongxing, Senior Product Director at DataCanvas
Built on the concept of the "vector ocean", DataPilot includes data software such as the DataCanvas RT real-time decision-making center platform and the open-source DingoDB multimodal vector database, giving users the real-time and multimodal data capabilities that breakthroughs in AI technology urgently call for.
DingoDB, an open-source multimodal vector database, is positioned as a powerful engine for the vector ocean era. It combines the characteristics of data lakes and vector databases and supports storing data of any type (key-value, PDF, audio, video, etc.) and any size. With DingoDB, users can build a dedicated "vector ocean" from structured and unstructured data alike, and can analyze and run scientific computation on multimodal data using SQL alone.
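To make the idea of searching mixed data types by vector more concrete, here is a minimal sketch in plain Python. It is not DingoDB's API and does not reflect how DingoDB is implemented; the records, the `search` function, and the toy three-dimensional vectors are all hypothetical, and a real system would use learned embeddings and an index rather than a brute-force scan.

```python
import math

# Hypothetical records: each item keeps its original type plus an embedding.
# The vectors are made-up toy values, not real embeddings.
RECORDS = [
    {"id": "contract.pdf", "kind": "pdf",   "vector": [0.9, 0.1, 0.0]},
    {"id": "call.wav",     "kind": "audio", "vector": [0.1, 0.9, 0.1]},
    {"id": "order:1001",   "kind": "kv",    "vector": [0.8, 0.2, 0.1]},
]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vector, records, top_k=2, kind=None):
    """Return the ids of the top_k records most similar to query_vector.

    An optional `kind` filter restricts the scan to one data type, which
    stands in for combining a structured condition with vector similarity.
    """
    candidates = [r for r in records if kind is None or r["kind"] == kind]
    ranked = sorted(
        candidates,
        key=lambda r: cosine(query_vector, r["vector"]),
        reverse=True,
    )
    return [r["id"] for r in ranked[:top_k]]

if __name__ == "__main__":
    print(search([1.0, 0.0, 0.0], RECORDS))                # all types
    print(search([1.0, 0.0, 0.0], RECORDS, kind="audio"))  # audio only
```

The point of the sketch is only that a PDF, an audio file, and a key-value record can sit in one collection and be ranked by the same similarity measure once each has a vector representation.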
From Software to Thought-ware, a new form of software evolution
While empowering Data, AI is also having a profound and even disruptive impact on the evolution of software forms, officially opening the chapter of the era of New Software.
Empowered by new breakthroughs in AI technology, the world is experiencing a leap from the "Software" era to the "Thought-ware" era. The traditional "software" is an iterative process that revolves around requirement analysis, product design, and code implementation, and is a paradigm of "passive response to requirements". "Thought-ware" is a new paradigm of software evolution centered around "thought".
Yang Jian, Chief Architect of DataCanvas
Yang Jian elaborated on "Thought-ware" at the launch event, stating that it has three abilities: independent thinking, controlled action, and proactive self-evolution.
Through independent thinking, Thought-ware can understand user intentions and, combined with its ability to act, offer users constructive solutions. Because a system that thinks and acts on its own raises concerns about the legitimacy, security, and compliance of what it is allowed to do, its ability to act must be "controlled". Thought-ware also needs proactive self-evolution, a form of active learning and self-optimization. During operation, it learns autonomously through interaction with users and the environment, and in this way corrects errors, adapts to requirements, and accumulates and optimizes its capabilities.
Yang Jian revealed that DataCanvas has begun exploring this new "Thought-ware" and demonstrated the experimental product TableGPT at the launch event.
TableGPT follows the concept of "what you need is what you get". Users only need to describe their problems and requirements in natural language, without entering complex commands or manually selecting algorithms. TableGPT automatically understands the user's intent, selects appropriate statistical analysis and machine learning methods, and completes automatic modeling on private-domain data. It then returns the data analysis results the user needs, together with explanations, and can also suggest directions for further data mining. Throughout the process, users do not need to attend to or even see any specific technical details, which greatly lowers the barrier to learning and using the tool.
This language-based interaction with AI software lets anyone gain insights from data easily. It offers a minimalist, novel interaction experience, along with results that can exceed expectations, and will greatly promote the democratization and popularization of data analysis. Yang Jian stated that although TableGPT is still in its early stages, it points the way for the further integration and development of data science and AI software.
From AIFS to AIFS
With the whole country now interacting with AIGC, the artificial intelligence industry is showing a new round of explosive vitality. Whichever direction the AI industry takes, high-performance foundation software and data architecture will always play an important role as its core foundation. The two major product systems released this time continue to implement DataCanvas's core product philosophy of "openness, automation, and cloud native", and will shoulder heavy responsibilities in the rapidly changing new world of AI.
Through the mutual empowerment of AIFS, DataPilot, and Thought-ware, DataCanvas software tools and solutions will continue to empower a broad range of industries, continuously integrating cutting-edge AI innovations and helping those industries accelerate their independent digital upgrading and large-scale application of AI in the era of large models.
Looking to the future, DataCanvas will adhere to its top-level strategic plan of "AI cloud in the clouds", working with cloud vendors, intelligent computing centers, and other partners to provide one-stop services and to complete the transformation from AIFS (AI Foundation Software) to AIFS (AI Foundation Service).