Outstanding performance of Google’s state-of-the-art chatbot ‘Meena’

Latest News

Beautiful-Templates.com Joomla Extensions - Joomla Templates

Outstanding performance of Google’s state-of-the-art chatbot ‘Meena’

Category: February 2020 - Automation & AI: accelerator in digital marketing

Google just presented Meena, an end-to-end, neural conversational model that learns to respond sensibly to a given conversational context. Meena can conduct conversations that are more sensible and specific than existing state-of-the-art chatbots, according to Google’s blog post. Such improvements are reflected through a new human evaluation metric that Google proposes for open domain chatbots, called Sensibleness and Specificity Average (SSA), which captures basic, but important attributes for human conversation.

Automatic Metric: Perplexity

Google demonstrates that perplexity, an automatic metric that is readily available to any neural conversational models, highly correlates with SSA. The training objective is therefore to minimize perplexity, the uncertainty of predicting the next token (in this case, the next word in a conversation). At its heart lies the Evolved Transformer seq2seq architecture, a Transformer architecture discovered by evolutionary neural architecture search to improve perplexity.

According to Google, the Meena model has 2.6 billion parameters and is trained on 341 GB of text, filtered from public domain social media conversations. Compared to an existing state-of-the-art generative model, OpenAI GPT-2, Meena has 1.7x greater model capacity and was trained on 8.5x more data.

Mastered Challenges

“Researchers have long sought for an automatic evaluation metric that correlates with more accurate, human evaluation. Doing so would enable faster development of dialogue models, but to date, finding such an automatic metric has been challenging”, Google states, being the more proud about the discovery that perplexity, an automatic metric that is readily available to any neural seq2seq model, exhibits a strong correlation with human evaluation, such as the SSA value.
Perplexity measures the uncertainty of a language model: the lower the perplexity, the more confident the model is in generating the next token (character, word, or sub word). Conceptually, perplexity represents the number of choices the model is trying to choose from when producing the next token.

“During development, we benchmarked eight different model versions with varying hyperparameters and architectures, such as the number of layers, attention heads, total training steps, whether we use Evolved Transformer or regular Transformer, and whether we train with hard labels or with distillation: the lower the perplexity, the better the SSA score for the model, with a strong correlation coefficient (R2 = 0.93)’, Google explains, adding; “Our best end-to-end trained Meena model, referred to as Meena (base), achieves a perplexity of 10.2 (smaller is better) and that translates to an SSA score of 72%. Compared to the SSA scores achieved by other chabots, our SSA score of 72% is not far from the 86% SSA achieved by the average person. The full version of Meena, which has a filtering mechanism and tuned decoding, further advances the SSA score to 79%.”

Future Research

Last but not least, Google promises to continue working on lowering the perplexity of neural conversational models through improvements in algorithms, architectures, data, and compute, stating: “While we have focused solely on sensibleness and specificity in this work, other attributes such as ‘personality’ and ‘factuality’ are also worth considering in subsequent works. Also, tackling safety and bias in the models is a key focus area for us, and given the challenges related to this, we are not currently releasing an external research demo. We are evaluating the risks and benefits associated with externalizing the model checkpoint, however, and may choose to make it available in the coming months to help advance research in this area. (Source: Google Blog)

By Daniela La Marca

Latest News

Messengers and social media can serve to disclose personal data

Not all data is created equal

What needs to be considered in terms of data protection when cooperating with freelancers

Growing security risks to drive APAC managed security services revenue to US$17bn in 2024

The importance of E-A-T in content creation

List of Asian countries most at risk of RDP brute-force attacks

Malaysia successfully fighting streaming piracy

Novel Linux malware targets VoIP softswitches

Manipulated language AI can write manifestos that appear human and radicalize automatically

Sentryc stops product piracy on the internet

Fileless malware continues to grow

Competitive advantage of cybersecurity often still undervalued

Apple operating system increasingly the target of malware

ConnecTechAsia announces headliners and themes for virtual conference

10 golden rules for a safe home office

ioGates ultra secure content sharing service adds desktop app and smart link sharing

Privileged accounts remain a popular gateway for hackers

Thycotic deploys advanced machine learning to control dangerous applications on endpoints

It’s not about technology, but people

How and especially why do human beings select or decide against a product or service?

Outstanding performance of Google’s state-of-the-art chatbot ‘Meena’

February 2020 - Automation & AI: accelerator in digital marketing

Latest News

February 2020 - Automation & AI: accelerator in digital marketing

About Us

Contact Us