Enhancing the Performance of Legal Search Queries on Google

welocalize June 10, 2020

Late last year, Google started rolling out what has been touted as the most significant leap in search since the introduction of RankBrain five years ago. Known as BERT – Bidirectional Encoder Representations from Transformers – the new NLP framework is set to significantly enhance the performance of global search.

Understanding the Role of BERT in Legal Content

What is Natural Language Processing (NLP)?

To truly understand BERT and how it impacts search, we first need to understand the wider discipline of Natural Language Processing (NLP). Melding elements of computer science, artificial intelligence, and linguistics, NLP is the field concerned with teaching machines how human language works, so that computers can recognize and interpret its nuances.

‘Deep’ NLP as we know it emerged in the early 2010s, and today we see it applied in many aspects of everyday life – from online chatbots, to predictive text messages, to trending topics on Twitter, to voice assistants like Apple’s Siri, Amazon’s Alexa, and Google Assistant.

NLP goes beyond training machines to understand spelling and grammar; it also involves teaching machines to understand what a word means in different contexts. For instance, the meaning of the word ‘running’ differs in the phrases ‘running an event’, ‘running away’, and ‘running for president’; NLP is used to help computers distinguish between these meanings based on the context of the overall input. It’s also used to help computers recognize the tone or sentiment behind a piece of text or a word. A great example of this is how tools like Grammarly can identify whether the tone of a passage is optimistic, aggressive, formal, neutral, etc.
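To make the ‘running’ example concrete, here is a minimal sketch of how context can select a meaning. It is a toy word-overlap approach, not how BERT or any production system works, and the sense labels and cue words are hypothetical values chosen purely for illustration.

```python
# Toy word-sense disambiguation: pick the sense whose cue words overlap
# most with the other words in the phrase.
# The sense labels and cue words below are hypothetical, for illustration only.
SENSES = {
    "organizing": {"event", "meeting", "business"},
    "fleeing": {"away", "from", "escape"},
    "campaigning": {"for", "president", "office", "election"},
}


def disambiguate(phrase: str) -> str:
    """Return the sense of 'running' best supported by the phrase's context."""
    context = set(phrase.lower().split()) - {"running"}
    # Score each sense by how many of its cue words appear in the context.
    scores = {sense: len(cues & context) for sense, cues in SENSES.items()}
    return max(scores, key=scores.get)


for phrase in ["running an event", "running away", "running for president"]:
    print(phrase, "->", disambiguate(phrase))
```

Modern NLP models learn these contextual cues from data rather than from a hand-written list, but the principle is the same: the surrounding words decide the meaning.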

Many NLP models use a recurrent neural network (RNN) to solve linguistic tasks. A recurrent neural network allows a machine to retain what it has learned from earlier in a body of text and use it to predict what may come next. This helps the machine recognize patterns and understand context as it scans the text.
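The idea of ‘retaining knowledge from earlier in the text’ can be sketched with a single recurrent unit. This is a simplified illustration, not a trained model: the function name `rnn_forward` and the weights `w_in` and `w_rec` are assumptions made up for the example.

```python
import math


def rnn_forward(inputs, w_in=0.5, w_rec=0.9, h0=0.0):
    """Run a one-unit recurrent network over a sequence of numbers.

    Each step mixes the current input with the previous hidden state,
    so the hidden state carries a summary of everything seen so far.
    The weights are arbitrary illustrative values, not learned ones.
    """
    h = h0
    states = []
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states


# Two sequences that end with the same inputs but start differently.
a = rnn_forward([1.0, 0.0, 0.0])
b = rnn_forward([-1.0, 0.0, 0.0])

# The final hidden states differ because the first input is still "remembered".
print(a[-1], b[-1])
```

Although the last two inputs are identical, the final states differ, which is the memory effect that lets an RNN use earlier words as context for later ones.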

What is BERT then?

What BERT achieves is quite simple: it uses a number of innovative mechanisms and processes to understand human language better than any previous NLP framework.

BERT is taught a general understanding of how language works using a massive corpus of text data, and then this general knowledge can be fine-tuned for any specific language-related problem you might have.

Prior to being rolled out in search, BERT had already achieved state-of-the-art results on 11 different natural language processing tasks. If, for example, you wanted to create a chatbot for your law firm, you could take the pre-trained BERT model and fine-tune it for that task, using your firm’s specialties and clients. Or, to teach it sentiment analysis, you could input a dataset containing thousands of reviews, each tagged ‘positive’ or ‘negative’, and further train BERT to distinguish between positive and negative reviews it has not seen before.
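The workflow described here (labeled reviews in, a classifier for new reviews out) can be sketched in a few lines. To be clear, this is not BERT and not fine-tuning: it swaps in a simple bag-of-words perceptron trained from scratch, only to show the shape of supervised training on tagged examples. The reviews are hypothetical, and `train` and `predict` are names invented for this sketch.

```python
from collections import defaultdict

# Hypothetical labeled reviews; a real dataset would hold thousands.
REVIEWS = [
    ("helpful and professional attorney", "positive"),
    ("great communication and a great result", "positive"),
    ("responsive team and clear advice", "positive"),
    ("slow and unhelpful staff", "negative"),
    ("poor communication and rude service", "negative"),
    ("never returned my calls", "negative"),
]


def train(reviews, epochs=10):
    """Learn one weight per word with the perceptron update rule."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for text, label in reviews:
            target = 1 if label == "positive" else -1
            score = sum(weights[w] for w in text.split())
            predicted = 1 if score > 0 else -1
            if predicted != target:
                # Nudge every word in a misclassified review toward its label.
                for w in text.split():
                    weights[w] += target
    return weights


def predict(weights, text):
    """Classify unseen text by summing the learned word weights."""
    score = sum(weights.get(w, 0.0) for w in text.split())
    return "positive" if score > 0 else "negative"


model = train(REVIEWS)
print(predict(model, "great advice and helpful staff"))
print(predict(model, "rude and slow service"))
```

Fine-tuning BERT follows the same outline, but it starts from a model that already understands language in general, which is why it needs far less task-specific data to reach strong results.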

Benefits of Recent BERT Updates

Before BERT, a search for “Hit in a rear-end collision in Denver is the driver responsible?” would offer a mixed bag of results. You could expect a bunch of news articles about collisions in Denver, perhaps some advice on what to do if you were involved in a collision, but you might not immediately be given the exact answer to the question. Pre-BERT Google would have recognized only a few words in the query – collision, Denver, driver – and selected results based on what it believed to be the focus of the query, without fully understanding the intent of the user.

Now, BERT can build a representation of the meaning for both the entire query and for each word simultaneously. The model is able to recognize all of the ways that each word may interact and, using bidirectional Transformers, can determine the true intent of the query, and subsequently provide the most relevant results. When we put it to the test, we can see how Google can now recognize the full context and intent of the query; the results immediately provide the answer.
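The difference between reading a query in one direction and reading it bidirectionally can be illustrated with a stripped-down attention calculation. This is a sketch under heavy simplifying assumptions: the two-number word vectors are hypothetical, and real Transformers add learned projections, multiple attention heads, and many layers.

```python
import math


def attention_weights(vectors, causal=False):
    """Softmax-normalized dot-product attention between every pair of tokens.

    With causal=True each token may only attend to itself and earlier
    tokens (left-to-right); with causal=False it sees the whole query.
    """
    rows = []
    for i, q in enumerate(vectors):
        scores = []
        for j, k in enumerate(vectors):
            if causal and j > i:
                scores.append(float("-inf"))  # hide tokens to the right
            else:
                scores.append(sum(a * b for a, b in zip(q, k)))
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows


# Hypothetical 2-d vectors standing in for learned word embeddings.
tokens = ["collision", "denver", "driver", "responsible"]
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]

bidirectional = attention_weights(vectors)
left_to_right = attention_weights(vectors, causal=True)

# How much the first word draws on each word in the query, in each mode.
print(dict(zip(tokens, bidirectional[0])))
print(dict(zip(tokens, left_to_right[0])))
```

In the left-to-right case the first word can only look at itself, while in the bidirectional case it draws on every other word in the query, including those that come after it.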

Currently, 10% of Google searches in the U.S. use BERT to serve the most relevant results – typically on “longer, more conversational queries”. BERT is also currently only trained for the English language. While there is no defined timeline, Google is committed to expanding the update to both a larger percentage of queries and to more languages in the future.

What does BERT mean for users?

For users on Google, BERT means improved search query results, and therefore an enhanced user experience. As the BERT algorithm develops and Google rolls the update out further, the search engine’s understanding of human language will improve considerably. Search results will become more relevant and responsive, and better suited to your specific needs. It will become easier and easier to find the information you need.

BERT is also used for Google’s featured snippets, again providing more relevant, accurate results. It is likely you’ll begin to notice these improvements in featured snippets like Answer Boxes and ‘People Also Ask’ lists.

What impact does BERT have on SEO for the legal industry?

You cannot optimize for BERT, so the only way for SEO to really leverage this update is to ensure that content is always focused on the audience and their needs.

BERT is not a ranking tool, and it doesn’t assign values to pages; it is simply used so Google can better understand the intent of the user.

As search engines push towards a more human way of understanding queries, the content people are searching for should become more human too. The more focused your content is on the specific intent of the user, the more likely it is that BERT will recognize this connection. Understand your audience, what they search for, and how they search for it; less keyword stuffing and more natural, human content is key.

How will BERT impact legal translation?

While current BERT models concentrate only on English, BERT will become hugely useful for machine translation (MT) as it develops. If BERT can learn the nuances of English, then it can do so for any language, and in time we will very likely see BERT, or new natural language processing models built upon BERT’s architecture, greatly improve the accuracy and performance of MT.

A system like BERT is capable of learning from the English language and applying these learnings to other languages. Already, Google’s BERT algorithm is being used to improve featured snippets in 24 countries, and this has seen improvements in languages such as Korean, Portuguese, and Hindi.

Engaging with BERT as it continues to develop gives our legal industry digital marketing experts the opportunity to anticipate these innovations and capitalize on them for global brands running multilingual SEO campaigns in an ever-evolving search environment.

Author: This article was written by Michael de Alwis. It is an abridged version of a longer article that appears in full on the Adapt Worldwide website.