National AI database gains 1,000 hours of local English voice samples

The addition of natural conversational speech would enhance the accuracy of speech application development for software developers and researchers.

SINGAPORE - Speech apps and tools such as voice transcription apps could soon be able to pick up Singapore English more accurately, as their developers can now access better data from an expanded corpus of local speech.

Some 1,000 hours of natural conversations on topics such as favourite foods and holidays have been added to the National Speech Corpus (NSC), an artificial intelligence (AI) database of locally accented speech maintained by the Infocomm Media Development Authority (IMDA).

The NSC previously comprised only 2,000 hours of read speech.

The enhanced corpus can be used by software developers and researchers for a variety of purposes, including analysing contact centre conversation logs and creating virtual customer service officers.

The move was announced by Minister for Communications and Information S. Iswaran on Thursday (Oct 17) during the IMDA's SG:Digital Industry Day.

"The ICM (infocomm media) industry is a key enabler of Singapore's digital transformation," said Mr Iswaran.

"It... catalyses the transformation of other industries, spurring the adoption of technologies such as cloud computing and AI, allowing the creation of new business ventures and models."

Since its launch in November 2018, the NSC has been downloaded more than 100 times by a mix of local and international companies, start-ups and research institutes.

These include local AI start-up Sentient.io, which has used the corpus for its speech transcription app, and national broadcaster Mediacorp, which has used it to automate the subtitling of television dramas.


On Thursday, the IMDA also announced a new technical ability self-assessment tool that evaluates tech firms' capabilities and readiness to adopt cloud technology.

The free online tool developed by the IMDA will guide small and medium-sized enterprises (SMEs) through a series of questions and then provide insights on what gaps there are and how to address them.

The tool's launch follows GoCloud, another IMDA initiative, which was launched in March.

GoCloud trains and coaches SMEs in the information and communications technology sector to adopt cloud technology solutions.

To date, more than 50 SMEs have signed up to be trained by domain experts such as IBM Singapore and Amazon Web Services.


Correction note: An earlier version of the story wrongly stated that Mr Iswaran announced a new technical ability self-assessment tool. We are sorry for the error.