2023.07.03 | Source: Equal Ocean
On June 30th, DataCanvas held a product launch event and released two new product series, AIFS (AI Foundation Software) and DataPilot. The AIFS line includes DataCanvas's self-developed white-box large model, the DataCanvas Alaya Large Model, as well as a software platform that helps enterprises train their own large models from scratch.
Since ChatGPT's surge in popularity at the beginning of the year, a number of large models and industry solutions have been released. The companies behind them include internet giants such as Baidu, Tencent, and Alibaba, as well as leaders of the AI 1.0 era such as SenseTime and CloudWalk.
However, DataCanvas differs significantly from these technology companies, which have their own application scenarios:
1. DataCanvas is an AI foundation software company: its focus is on providing software platforms that enable clients to develop AI models more efficiently and independently;
2. This positioning determines that of DataCanvas's large-model products. The significance of the DataCanvas Alaya Large Model, a white-box pre-trained model, is that its transparency lets users build their own large models on top of it with confidence. The AIFS large-model training platform inherits the functions of APS (the DataCanvas AI Infrastructure Service platform) and adds the tools users need to develop their own large models.
Over the past few months, ChatGPT has fired people's imagination about AI 2.0, and the market quickly followed suit, with large models and related vendors springing up like mushrooms after rain. Now, past those stunning first months, the industry has entered the fuzzy early stage of actually deploying large models. Because no one can yet say which scenarios offer the highest return on investment or what the optimal deployment path is, many vendors, anxious not to fall behind, are developing large models while also taking on deployment projects and customer service. Playing multiple roles at once, their positioning is inevitably unclear.
In contrast, DataCanvas has firmly anchored its identity as an artificial intelligence foundation software supplier based on long-term business accumulation.
This time, Equal Ocean TE had a conversation with Fang Lei, co-founder and CEO of DataCanvas. Fang Lei holds a Bachelor's degree in Electronic Engineering from Tsinghua University and a Ph.D. in Electronic Engineering and Computer Science (EECS) from Virginia Tech in the United States. He returned to China in 2013 to start his own business and has been deeply involved in the field of data science for more than ten years. We discussed the opportunities of AI 2.0, as well as DataCanvas's strategic layout and thinking.
New AI, New Data, New Software
After completing his PhD, Fang Lei joined Microsoft in 2008, where he participated in the incubation and development of Microsoft's cloud computing platform Azure and worked on the Bing search team.
It was his work experience at Microsoft that helped Fang Lei find a direction for entrepreneurship. In 2013, Fang Lei ended his 10-year study and work life in the United States and officially returned to China to start his own business.
Since DataCanvas was founded, demand in the AI market has gone through several phases, and the company's growth has not been smooth sailing. For a long time, DataCanvas, which provides AI modeling and production tools to clients, was not really understood by the market. "After the Four Little Dragons emerged in the AI 1.0 stage, clients initially felt that they only needed an application and did not need tools."
That only changed after 2020, when national policies began to push for autonomy and controllability. Clients also found that the last mile required a great deal of business integration, so building independent AI capabilities had to be put on the enterprise agenda.
After 2020, market demand surged, and DataCanvas rode the trend onto a fast track of development.
Currently, AI 2.0 is a new opportunity in front of DataCanvas.
In the 1.0 era, AI vision companies represented by the Four Little Dragons never formed a set of products that could cover an entire industry. In an interview with Caixin Magazine, Yang Fan, co-founder of SenseTime, described the limitation: "The cost of meeting user needs is too high, and each new scenario requires collecting new data and creating a new set of algorithms. Some small scenarios may only have 3 to 5 clients nationwide where they can be reused, which greatly restricts the business model."
Large models represented by ChatGPT and GPT-4, however, have already proven themselves in general-purpose scenarios. Once combined with industry-specific data, a picture of large models landing quickly in one industry after another seems clearly visible.
AI 2.0 has greater capabilities and value than AI 1.0, and DataCanvas is bound to board this train to the future.
As a new kind of AI, AI 2.0 requires new data and new software to be implemented, both fields in which DataCanvas has long-standing accumulation.
In terms of data, Fang Lei told Equal Ocean that DataCanvas sees vector data becoming the protagonist of the large-model era. On this basis, DataCanvas has proposed the Vector Ocean, which goes beyond data silos and data lakes to become a new form of data storage and management for the AI 2.0 era.
Specifically, data warehouses and data lakes are currently the mainstream ways to store and manage structured, semi-structured, and unstructured data. The Vector Ocean proposed by DataCanvas refers to storing, querying, and analyzing all of these data types in a unified way, as vectors. It is a new data architecture formed by fusing traditional data analysis with AI algorithms, one better suited to the needs of multimodal large models.
Because attention in China is currently focused on getting large models into production, there is relatively little discussion of these underlying technologies. "However, I believe that the new data architecture represented by the Vector Ocean will quickly become mainstream in the industry," said Fang Lei.
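To make the idea concrete, here is a minimal, hypothetical Python sketch of the "everything as vectors" pattern described above: heterogeneous records are embedded into one shared vector space, stored in a single index, and retrieved by similarity rather than by schema. The embed() function and the toy records are illustrative stand-ins, not DataCanvas's implementation; a real system would use a multimodal embedding model and a dedicated vector store.

# A toy illustration (not DataCanvas's code) of unified vector storage and querying.
import numpy as np

DIM = 8  # toy embedding dimension

def embed(record: str) -> np.ndarray:
    """Stand-in embedding: hash characters into a fixed-size, normalized vector."""
    vec = np.zeros(DIM)
    for i, ch in enumerate(record):
        vec[i % DIM] += ord(ch)
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# Unified store: every record, regardless of original type, becomes a vector.
corpus = [
    "contract.pdf: supplier agreement, payment terms net 30",    # unstructured document
    "image_0042.jpg: warehouse shelf, SKU 8812, low stock",      # image caption
    "orders row: id=991, customer=ACME, amount=12800, overdue",  # structured row
]
index = np.stack([embed(r) for r in corpus])

# Query by meaning rather than by schema: cosine similarity against the index.
query = embed("which customers have overdue payments?")
scores = index @ query
print(corpus[int(np.argmax(scores))])

The point of the sketch is the architecture, not the embedding quality: once documents, images, and table rows live in the same vector index, one similarity query can span data that would otherwise sit in separate warehouses and lakes.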
In terms of AI software, Fang Lei believes that "software" will become "thought-ware": because the capabilities of AI models and of data have changed, the way AI software is developed and presented will also be strongly impacted.
Fang Lei told Equal Ocean that software production has always been process-driven: scenarios are exhaustively segmented, and software then walks users through step-by-step flows within each specific context. In the future, however, some software will be produced and presented in ways that map more closely onto how people actually think.
Take recruitment as an example. At most companies, hiring is a process of resume collection, interviews, and evaluation, all of which require people to digest candidate information. In AI 2.0, the equivalent application might be AI software that accepts resumes; HR simply types "Is this person suitable for us, and why?" into a dialog box, and the AI model absorbs the candidate's information and provides constructive feedback, as in the sketch below.
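The following is a hypothetical sketch of that "thought-ware" flow, not a description of any DataCanvas product: instead of a fixed screening workflow, HR asks a natural-language question and a large model reasons over the resume. ask_llm() and screen_candidate() are placeholder names for whatever chat-completion endpoint the deployment actually uses.

# Hypothetical example of question-driven resume screening.
def ask_llm(system_prompt: str, user_prompt: str) -> str:
    """Placeholder: send prompts to a hosted or self-trained large model."""
    raise NotImplementedError("wire this to your model endpoint")

def screen_candidate(resume_text: str, question: str) -> str:
    """Answer a recruiter's free-form question about one candidate."""
    system_prompt = (
        "You are a recruiting assistant. Read the resume and answer the "
        "recruiter's question with concrete evidence from the resume."
    )
    user_prompt = f"Resume:\n{resume_text}\n\nQuestion: {question}"
    return ask_llm(system_prompt, user_prompt)

# Usage: HR types a question in plain language instead of running a workflow.
# answer = screen_candidate(open("candidate.txt").read(),
#                           "Is this person suitable for us, and why?")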
In its own business, DataCanvas is likewise committed to letting developers produce AI software using human ways of thinking and expressing. This is the value direction of the DataCanvas AIFS platform: a new paradigm of AI software development for thousands of industries.
In AI 2.0, computing power is a bigger market than models, and intelligent computing centers are a layer that cannot be ignored
While New Data and New AI are bringing New Software, Fang Lei also pointed out that in the AI 2.0 era, computing power is a bigger market than models.
This can be seen in the contrast between the weakening growth of ChatGPT, currently the strongest model application, and Nvidia's continued strength.
According to SimilarWeb data, ChatGPT's traffic growth has slowed markedly. From January to May this year, its month-on-month growth rates were 131.6%, 62.5%, 55.8%, 12.6%, and 2.8%, and some predict the figure is likely to turn negative in June.
The computing power market, by contrast, is running hot, with supply falling short of demand. The first driver of large models is training, which has brought GPU giant Nvidia huge profits.
On May 25th, Nvidia released its first quarter financial report for 2023, showing a year-on-year increase of 46% in revenue to $8.29 billion, and raised its revenue guidance for the second quarter to $11 billion, far exceeding expectations of $7.11 billion.
On May 30th, Nvidia's market value also surpassed one trillion dollars, making it the seventh company to cross that threshold, after Apple, Microsoft, Amazon, Alphabet, Meta, and Tesla.
The GPU chips Nvidia develops meet the computing needs of the large-model era. In China, technology companies such as BAT hold large reserves of GPU chips and have begun offering computing power services for large model training on top of their existing businesses.
But Fang Lei believes the AI 2.0 computing market is not only an opportunity for BAT; it is also a market for intelligent computing centers of all sizes, "because the usage cost of intelligent computing centers is lower than that of cloud computing platforms."
In early June, CNBC reported that Microsoft would invest billions of dollars in cloud computing infrastructure construction for startup CoreWeave in the coming years to ensure that OpenAI, which operates ChatGPT, has sufficient computing power in the future.
Fang Lei's analysis is that Microsoft, despite already running the Azure cloud platform, still turned to CoreWeave to secure computing power because a large public cloud offers many convenient services but also carries a huge hidden infrastructure that every cloud customer ends up paying for. By comparison, a new platform built purely for AI computing power is cheaper both to construct and to use.
As an example, CoreWeave's website claims that the company can provide computing power that is "80% cheaper than traditional cloud vendors."
At the same time, the opportunity for intelligent computing centers is different from the IDC market of the past, which large companies could dominate through sheer scale. Fang Lei told Equal Ocean that in IDC there is a real gap between investing a few hundred million yuan and investing tens of billions: IDC mainly sells bandwidth, which has economies of scale, so the more users there are, the lower the unit cost, and a few hundred million yuan of investment naturally cannot compete with tens of billions.
Intelligent computing centers, by contrast, sell computing power, and the cost of the upstream GPU chips is equally rigid for every enterprise, so investments of different sizes can each earn a corresponding return.
Furthermore, Fang Lei believes that intelligent computing centers will become an important investment target for a large amount of state-owned capital.
Beyond the relative certainty of returns described above, the payback period is also attractive: compared with the 8-10 year return cycle of the innovative enterprises and technologies that state capital can currently choose from, "the return cycle in the computing power market is estimated at around three years."
The opportunity that state-owned capital sees in the computing power market is also the opportunity for DataCanvas.
In the cloud computing era, the closed ecosystems and all-in-one product suites of large companies such as BAT made it difficult for independent startups to integrate into their clouds.
Fortunately, in 2021 DataCanvas caught the trend of a fragmenting cloud market and bet on state-owned clouds. It has since established cooperative relationships with more than 50 clouds, including Mobile Cloud, Tianyi Cloud, and China Electronics Cloud, built its AI foundation platform on top of them, and developed rapidly over the past two years.
This is the "AI Cloud in the Clouds" strategy proposed by DataCanvas in 2021.
Now, Fang Lei told Equal Ocean that "AI Cloud in the Clouds" is still the core strategy of DataCanvas. At the same time, DataCanvas has identified where the real opportunity lies for AI 2.0 and has incorporated intelligent computing centers into this strategy.
The significance of DataCanvas rolling out AIFS (AI Foundation Software) over the coming months is that the company will continue to cooperate with state-owned cloud vendors while also bringing a large number of intelligent computing centers on board as partners, providing customers with an AI model development platform that integrates software and hardware.
Fang Lei told Equal Ocean that by then, the DataCanvas AI model development platform will offer white-box pre-trained large models and help enterprise clients customize large and small models from scratch. The platform itself will not be priced higher; customer spending will instead be tied to computing power consumption.
The computing power market is a trillion-dollar market, and the software market is a $20-30 billion market. Together they will create an even larger market, and DataCanvas is right there.
Author | koko