Science

Scientists may not be far from decoding the language of whales. An interdisciplinary group of scientists has started collecting data in order to use artificial intelligence (AI) to understand how whales communicate. The project has been named the Cetacean Translation Initiative (CETI). The first talks about the possibility of decoding sperm whale sounds took place at Harvard University, where an international group of scientists spent a year together on Radcliffe fellowships in 2017. Research and data collection began in earnest in 2020. If the project succeeds, it would be the first time that humans have understood the language of another species. Humans might then also be able to build a system for communicating with whales.

Shafi Goldwasser, director of the Simons Institute for the Theory of Computing at the University of California, Berkeley, noticed that a series of whale clicking sounds resembled Morse code, or the noise of a faulty electronic circuit. She pitched the idea of translating whale language through these click sequences, or “codas,” to David Gruber, a marine biologist at the City University of New York. Michael Bronstein, an Israeli computer scientist teaching at Imperial College London, then suggested a link between the codas and natural language processing (NLP).

Biologist Shane Gero supplied recordings of sperm whale codas from around the Caribbean island of Dominica, and Bronstein applied machine-learning algorithms to this data. He told Hakai Magazine, “They seemed to be working very well, at least with some relatively simple tasks.” But this was only a proof of concept.

Scientists and linguists still do not know whether animals have language. Animal utterances can be called a language only if they have semantics (vocalisations with fixed meanings), have grammar (a fixed way of structuring the sounds), and are not merely innate sounds.

Whales usually dive into deep waters and communicate over large distances. Hence, facial expressions and body language do not affect their communication. Bronstein added, “It is realistic to assume that whale communication is primarily acoustic.”

However, learning to decipher and communicate in whale language is difficult even for AI. One of the best-known AI language models, GPT-3, has 175 billion parameters and was trained on text running to hundreds of billions of words. In comparison, CETI’s database has fewer than 100,000 sperm whale codas. Scientists plan to expand the database to four billion codas.