When man's word is machine's command in artificial intelligence, the next frontier of growth

BEIJING (CHINA DAILY/ASIA NEWS NETWORK) - Shortly after reaching home from a long day's work, Mr Ma Bin, 30, a software engineer in Beijing, walks into his flat's living room and, in the style of Ali Baba's "Open sesame" in One Thousand And One Nights, says to a remote controller on the sofa: "Let's talk."

Immediately, his Internet-connected TV hears his command, switches itself on automatically in utter obedience and begins beaming Mr Ma's favourite TV show.

After relaxing for a while watching the show, Mr Ma heads out for a bite, settles in the driver's seat in his car and slips into the Ali Baba mode again: "Lower the temperature to 19 degrees."

The car's air conditioner obeys the Master instantly.

While still driving, hands very much on the wheel, he issues another oral command: "Recommend the nearest coffee shop."

It is now the turn of the car's navigation system to obey the Master.

Welcome to the your-word-is-my-command age where devices use audio as input or medium for rendering services powered by AI (artificial intelligence).

Ask and you shall receive, as it were. The consumer, it appears, never had a more powerful voice.

Mr Ma accomplishes many everyday tasks by uttering instructions to voice-based digital assistants - he has quite a few of them - that control his devices and appliances.

For instance, Siri, the assistant on his iPhone, helps him schedule calendar appointments and stream music.

Rapid advances in speech recognition and language understanding technologies are making the human voice the next major medium for communicating with computers, which are at the heart of almost all devices and appliances these days.

Devices are getting better at processing voice commands from across different rooms and against background noise.

No need to type on keyboards; no need to tap, swipe or draw on touchscreens; no need to press buttons, levers and such things. Just say: "Open sesame."

Early converts such as Mr Ma are embracing the era of voice computing with gusto. "If I can control the surroundings simply by uttering a few words, why should I bother to touch screens or buttons?"

In China, conversation-savvy electronics are on the rise as local tech heavyweights vie for an early lead in the next frontier of growth and innovation.

The scene is not much different from the United States, where Apple, Microsoft, Google, Facebook and Amazon.com are all battling for slices of the AI pie.

The trend will be further stoked by China's plan to build a one trillion yuan (S$202 billion) AI industry by 2030, which was unveiled by the State Council. Voice computing is an important part of that ambitious goal, which the private sector is determined to reach.

For instance, on July 5, e-commerce behemoth Alibaba Group Holding unveiled its Tmall Genie X1, its voice-driven digital speaker, which is modelled on Amazon.com's Echo and Google's Home. The same day, Baidu, the Chinese Internet search leader, showcased its Mandarin-speaking DuerOS personal assistant.

Such voice-based speakers can stream music and newscasts, among other things, and can be improved to perform other tasks.

Towards that end, Baidu announced a new deal to acquire a start-up specialising in the development of voice recognition technology.

Not to be left behind, Tencent Holdings, China's social networking and gaming titan, is developing its own voice-based speaker for launch within months.

Huawei Technologies, the world's third-largest smartphone manufacturer, jumped onto the voice-based technology bandwagon, hiring more than 100 researchers to work on developing a Siri-like assistant.

According to a Bloomberg report, more than 60 companies in China are working with US-based Conexant Systems, an audio technology player, to introduce voice-activated intelligent devices.

"Voice interaction, though still nascent, will be of utmost importance in future. In the Internet-of-things era, most Internet-connected devices won't have screens. Voice control will be the most convenient way to interact with them," said Mr Liu Xingliang, president of the Data Centre of China Internet, a Beijing-based market research company.

Recent facts and figures appear to back Mr Liu's vision.

In China, the speech recognition market expanded by about 40 per cent to 4.03 billion yuan in 2015, faster than the US$6.12 billion (S$8.3 billion) global market which grew at 34 per cent, according to a report by the Speech Industry Alliance of China.

The China market is expected to grow almost 70 per cent year-on-year to 10.07 billion yuan in sales this year. Some two million smart speakers will likely be shipped in China this year, a fraction of the 14 million in the US; and 22 million will be sold in China in 2022, according to Counterpoint Research estimates.

With potential applications of the technology growing by the day on the back of constant improvements, Grand View Research projects the global market will reach US$128 billion in 2024.

That kind of optimism stems from the high level of accuracy of the technology.

For instance, in 2015, Mr Andrew Ng, former chief scientist at Baidu, said the technology was about 95 per cent accurate. Stated differently, devices were able to hear and act on about 19 out of 20 words correctly.

That is, devices mishearing words and acting contrary to commands were not seen as posing many serious risks to consumers.

And now, the accuracy rate is said to be higher - 97 to 98 per cent. Baidu and iFlytek are leading the voice technology pack.

To be sure, technological hurdles exist. Mr James Yan, research director at Counterpoint, said: "More efforts are needed so that third-party services can be swiftly activated through voice control."

Improvements are coming at a faster rate than expected as big data is crunched, analysed and made to yield insights, which, in turn, are opening up voice recognition platforms to third-party services, according to Analysys, a Beijing-based market research company.

With market potential increasing, Chinese companies are scrambling to unveil always-on listening devices that are eager to communicate or interact with their "masters".

For instance, e-commerce giant Alibaba is emulating Amazon in envisioning a central role for voice-driven smart speakers that consumers can use to control almost everything at home. Its Tmall Genie X1 speaker can simplify online shopping by executing purchases based on voice commands.

Similarly, JD.com, another leading online marketplace, has unveiled several versions of smart speakers by using iFlytek's voice recognition technology.

JD said it sold around 10,700 speakers during last year's Nov 11 online shopping festival and the following two weeks.

"Many domestic players are inspired by (Amazon) Echo's phenomenal success in the United States," said analyst Zhang Yin at Orient Securities.

In the fourth quarter of 2016, the Echo accounted for about 88 per cent of shipments of 4.2 million intelligent home speakers in the US. In that quarter, US shipments were up nearly 600 per cent year-on-year, according to Strategy Analytics.
