Digital human created by Baidu AI Cloud and modeled after Chinese celebrity Simon Gong.
“Rising demand is driving the rise of digital people,” said Shiyan Li, head of the digital human and robotics business at Baidu, which created the digital version of Gong. “In China alone, there are more than 400 million ACGN (animation, comics, games and novels) fans, and a business market worth hundreds of billions of dollars, targeting digital people.” According to Qichacha, a company that maintains business records, China now has more than 280,000 enterprises engaged in activities related to digital people.
A different kind of digital
The debut of Baidu’s digital celebrity may not seem like much at first, as the concept of “virtual idols” has been around for years. For example, American virtual influencer Lil Miquela has appeared alongside real human celebrities in online advertisements and TV commercials since 2016, gaining more than three million Instagram followers. However, there is something different about the virtual Chinese star: it is a digital human with the ability to listen, speak and interact with real people on a level never seen before. And Gong’s digital duties are not limited to singing. With the latest update of the Baidu app, China’s leading search-plus-feed app, Gong appears on users’ phones and helps with searches and queries using the actor’s real voice. Since this interactive search experience was launched in 2021, the number of voice searches on the Baidu app has increased by 18.2%.
Baidu AI Cloud started developing a digital worker in 2019 in collaboration with Shanghai Pudong Development (SPD) Bank. They then focused their efforts on building a digital financial advisor to provide a service equivalent to that of a human bank representative when real employees were unavailable. According to SPD Bank, more than 460,000 customers rely on digital people for banking services and portfolio management every month. “Access to digital people outside of regular business hours enables SPD Bank to provide 24/7 customer service at low cost and high efficiency,” said a bank representative.
More recently, a Baidu-created virtual anchor provided live sign language commentary during the Beijing 2022 Winter Games for hearing-impaired viewers. The avatar not only looked like a real person, but was also equipped with speech recognition and sign language interpretation to ensure fast and highly accurate input and output. With approximately 430 million people worldwide experiencing hearing loss, according to the World Health Organization, there is great potential for using this technology to increase their access to a wide variety of content.
A sign language interpreter created with Baidu AI Cloud’s XiLing.
XiLing: a new generation on an AI platform
From entertainment to public services, digital people will play a greater role in our daily lives. But behind their natural and effortless appearance lies a complex web of new and emerging technologies that push the boundaries of AI innovation.
Baidu AI Cloud’s digital celebrity and virtual sign language anchors were created through XiLing, a new digital platform launched in 2021. At the Baidu World 2022 event on July 21, the company announced a new capability on XiLing that will enable the creation of digital people who can serve as livestream hosts, singing, dancing and responding to comments in real time without ever needing to take a single pause. XiLing is unique in its ability to support the entire process of creating a digital human, from creating a realistic persona to empowering conversation and content generation skills. One of its most notable features is speed. The platform can generate a 3D avatar based on a real person in one to two weeks, while a 2D avatar can be created in just minutes.
In addition, XiLing’s intelligent dialogue tools let creators quickly tailor a digital human’s conversational ability, so that it can adapt and learn over time. This capability is enabled by Baidu’s PLATO, a 100 billion-parameter dialogue model that enables digital people to participate in open-domain conversations, i.e. to understand any topic and provide relevant answers. High-precision speech recognition and lip-syncing with over 98.5% accuracy allow the digital human to have smoother, more human-like interactions. “Using advanced AI technologies will continue to lower the cost of building digital humans and significantly improve their interactions with real humans,” said Li.
Just as every real person has their own skills and talents, so does the new generation of digital people. This could even mean giving digital people the power to be creative themselves, thanks to recent advances made by major AI models like Baidu’s ERNIE, which can generate text and create realistic images when asked. For example, digital people designed to serve as brand spokespersons can independently create and post social media content, design posters, and act in videos.