Thursday, January 28, 2021

New AI model processes text in 11 Indian languages

The models and datasets will give students, faculty, and startups essential building blocks for working on Indian language tools.


Faculty from the Indian Institute of Technology Madras (IIT-M) and AI4Bharat have developed artificial intelligence (AI) models and datasets to process texts in 11 Indian languages.

AI4Bharat is a platform for building AI solutions for problems of relevance to India.


According to IIT-M, its researchers and AI4Bharat released AI models and datasets for the following languages: Tamil, Hindi, Malayalam, Telugu, Kannada, Punjabi, Bengali, Odia, Assamese, Gujarati, and Marathi.

The multilingual AI models and datasets developed through this initiative will give students, faculty, startups, and industry the essential building blocks to work on Indian language tools and push the frontiers of technology.

The faculty have released these resources as open source and completely free of cost. Anyone can download the models from a GitHub repository.

“We have a very rich diversity of languages in our country. As we move towards a digital economy, it is important that our languages find a space online. This requires a lot of innovation in creating input tools, datasets, and AI models for Indian languages,” said Mitesh M. Khapra, Assistant Professor, Department of Computer Science and Engineering.

For example, imagine a learner who posts a question on an e-learning platform in Tamil, Hindi, or any of the numerous other Indian regional languages.

There is a need for tools that can automatically process such questions written in Indian languages and classify them into specific topics.



“While such tools are available for English and other foreign languages, there are hardly any tools for Indian languages and this is the critical gap that we are trying to address through this initiative. These models are available free of cost as we want the entire country to benefit from them,” added Khapra.

AI4Bharat is an initiative co-founded by Khapra and Pratyush Kumar of IIT Madras that works to solve India-specific problems in a community-driven, open-source manner.

“We have an urgent responsibility to take the rapid advances of AI and make them accessible to the common man. One way of achieving this is to improve interactions between humans and machines. That is where the field of Natural Language Processing (NLP) comes in. NLP is a branch of AI that deals with the interaction between computers and humans using natural language,” said Anoop Kunchukuttan, a volunteer at AI4Bharat and the lead researcher on this project.

Over the past year, a team of researchers comprising students, faculty, and volunteers from IIT Madras and AI4Bharat has worked on collecting data and training powerful models for processing text written in Indian languages.

The models take advantage of the similarities between Indian languages to make efficient use of data.

