Voice Search, AI And Natural Language Processing

Artificial Intelligence, Natural Language Processing, Machine Learning And Voice Search: A Match Made In Heaven

Published Under: Technology | Artificial Intelligence
Last Updated: December 22, 2019

Voice search as we know it, though still being perfected, is awesome. It's 2019, and the use of voice search is gaining ground like fire ants. What does the future hold for voice search? Let's see how some technologies, if combined, can transform the experience for us.

Asad Butt

Embedded content: https://www.youtube.com/watch?v=FPfQMVf4vwQ

We know voice search is here; it's banging on the doors. Everyone is using it. Marketers and Search Engine Optimisation specialists cannot stop talking about it. There is a reason for that. Not long from today, by the end of next year, 2020, half of all searches on the internet are expected to be carried out using voice search.

Search Engine Optimisation experts are still very much reeling from the Google algorithm updates of the last few years. The way Google has shifted strictly towards serving user intent has made things difficult for publishers and SEOs, and they are still in the phase of "catching up".

Many of us, in fact, are still living in the world of keywords.

So, here is the thing: if we had a tough time managing changes in the search algorithms like Hummingbird, Penguin and Panda, which made it very difficult to carry on with traditional methods of gaining search rankings, how can we expect to successfully align and optimize for voice search?

It is something we are still guessing about.

And we have a new kid in town: artificial intelligence. I say it's new because we have only just started to experience it.

Natural language processing, machine learning and speech recognition coupled with artificial intelligence are not going to be easy to work with.

We know that all major players are investing heavily in voice search. Virtual assistants are basically the new customer service assistants.

Google Assistant, Siri by Apple, Alexa, Cortana, and Bixby. Who are they?

Customer service representatives, customer service assistants, a friend, a personal assistant: whatever term you use, we know where we are heading. Ever wondered why their names sound the way they do?

As they say, the great thing about the future is that no one really knows what it's going to be like. We can make predictions, but only predictions backed by facts matter.

Most predictions we have about voice search are contradictory and over-ambitious.

For this article, I would like to go through a few ingenious technologies which, when combined with voice search, can truly transform the experience and will probably take over the voice search market.

Embedded content: https://www.youtube.com/watch?v=3kLUP__uKF0

Natural Language Processing has too many challenges at the moment.

For a human being, using language might appear to be the easiest thing to do, like taking a breath. Sometime between the ages of 18 months and 2 years, we begin to form two- to four-word sentences.

It's not that simple from the inside.

For technology, it is a set of many complex processes:

  • Audio sampling
  • Feature extraction
  • Speech recognition
  • Recognizing individual sounds
  • Converting them to text
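To give a feel for the first two steps in the list above, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the "voice" is a synthetic 400 Hz tone rather than real speech, the sample rate and frame size are arbitrary values picked for the example, and the function names are hypothetical. Real speech recognition systems use far richer features and trained models for the recognition steps, which this sketch does not attempt.

```python
import cmath
import math

SAMPLE_RATE = 8000  # samples per second (assumed value for this sketch)
FRAME_SIZE = 80     # 10 ms per frame at 8 kHz (assumed value)


def sample_tone(freq_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Audio sampling: measure a synthetic sound wave at regular intervals."""
    n = round(duration_s * sample_rate)
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


def split_frames(samples, frame_size=FRAME_SIZE):
    """Cut the signal into short, non-overlapping frames."""
    return [samples[i:i + frame_size]
            for i in range(0, len(samples) - frame_size + 1, frame_size)]


def magnitude_spectrum(frame):
    """Feature extraction: naive DFT magnitudes for the lower half of the bins."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2)]


def dominant_frequency(frame, sample_rate=SAMPLE_RATE):
    """Return the frequency in Hz of the strongest bin in one frame."""
    spectrum = magnitude_spectrum(frame)
    peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
    return peak_bin * sample_rate / len(frame)


signal = sample_tone(freq_hz=400, duration_s=0.05)
frames = split_frames(signal)
features = [dominant_frequency(frame) for frame in frames]
```

Each entry in `features` is one number per 10 ms frame. A recognizer would consume a sequence of such per-frame features (in practice, many numbers per frame) to decide which sounds were spoken.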

Once the voice input has been converted into text, the uphill struggle of understanding it only starts.

Two of the most difficult tasks are:

  • Processing text accurately
  • Providing an accurate output

Other challenges NLP faces in processing voice inputs include understanding and differentiating between accents, separating background noise from a meaningful statement (separating signal from noise) and handling overlapping conversations.

On the output side, NLP faces the challenge of producing answers which are not only grammatically accurate but also understandable and sensible. Add "serving the user intent" on top of that.

An additional layer of problems arises when it needs to process human emotions, follow developing situations and understand the context.

We need to remember that when human beings communicate, they also make use of body language and gestures alongside natural language.

One way Natural Language Processing is overcoming the hurdles is by starting small. For example, smart speakers and virtual assistants offer only a limited set of domains in which they can assist a user for now. New domains are added regularly, but it is a step-by-step process. These systems take one domain at a time, perfect it, and build upon the knowledge and expertise gained before testing new waters.
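The "one domain at a time" idea can be sketched as a tiny router. This is a toy illustration, not how any real assistant works: the domains, the keywords and the function names are all made up for the example, and real systems use trained models rather than keyword matching.

```python
# Hypothetical domains and keywords, chosen only for illustration.
DOMAIN_KEYWORDS = {
    "weather": {"weather", "rain", "umbrella", "forecast"},
    "music": {"play", "song", "music"},
}


def detect_domain(utterance):
    """Return the first supported domain whose keywords appear in the utterance."""
    words = set(utterance.lower().replace("?", " ").split())
    for domain, keywords in DOMAIN_KEYWORDS.items():
        if words & keywords:
            return domain
    return None


def respond(utterance):
    """Answer only within supported domains; otherwise admit the limitation."""
    domain = detect_domain(utterance)
    if domain is None:
        return "Sorry, I can't help with that yet."
    return f"Handling a {domain} request."
```

Calling `respond("Do I need an umbrella?")` is routed to the weather domain, while `respond("Book me a flight")` falls through to the polite refusal. Supporting a new domain means adding one more entry to `DOMAIN_KEYWORDS`, without touching the domains that already work.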

Working on a domain by domain basis can also help in another way.

Instead of taking on the big challenge of understanding everything and every question, the companies leading NLP technology are focusing on a particular domain and trying to master it.

This way, they can possibly innovate more and also support businesses interested in licensing their technology.

Third parties can take these NLP units and integrate them into their existing applications. For example, if a company gains expertise in processing language related to "customer service interactions" and opts to license it for broader commercial use, its service can be adopted and integrated into many applications used by small to medium-sized businesses.

This could kick-start voice-based customer services, backed by artificial intelligence and machine learning.

The mainstream acceptance by users of smart voice-based devices with limited capabilities is helping as well. We know our smart speakers, like Amazon Echo, can help us only in a very limited way, but most users are OK with that because of the psychology of "every little helps".

Virtual And Augmented Reality

In November 2018, we were all intrigued by the news of China's Xinhua news agency unveiling its virtual newsreader.

Embedded content: https://www.youtube.com/watch?v=bmqd9nYH5Fw

Imagine swapping this character with your favorite one.

A friend, your personal assistant or maybe your girlfriend. Does it sound convincing? Does it look convincing? Just use your imagination a bit. Try a role you trust. Try someone you are comfortable with.

Now imagine having the kind of conversation with that character that you normally couldn't.

Like asking your personal assistant: What is the weather going to be like today? Do I need an umbrella? What do you reckon, should I take an Uber or the Underground to work? Now imagine your virtual personal assistant answering you within a few seconds, without you going through a weather app or weather service website.

Imagine having a peaceful conversation with your wife. No ifs, no buts. No arguments. Just an informative discussion. I know, we dream of this every day and wish it were possible.

As more and more consumers become used to voice search, voice controls and voice-based interactions with technology, it will become the norm. And a virtual person replacing the gadgets, computers, screens and keyboards is going to provide a more natural experience.

Users would like their voice interactions to be seamless, easy and natural. It wouldn't be a surprise if virtual assistants replace voice-based devices like smart speakers in the future.

The Smart Move By Holograms

“Bringing computing into the three-dimensional world that humans have always existed in is the next step in making computing more personal,”

Greg Sullivan | Microsoft’s director of communications

Embedded content: https://www.youtube.com/watch?v=thOxW19vsTg

Until recently, hands-on training for mechanics meant attending practical sessions at a real hangar, where trainees could watch a plane being maintained by professionals. They could see the aircraft and, with the cover removed, what was inside the engine. This was the only way they could see the parts functioning and interacting.

Japan Airlines, however, has an innovative method of doing this training.

The use of HoloLens.

HoloLens is a mixed reality headset by Microsoft, and it lets trainees observe the different parts of an aircraft virtually.

It has two benefits.

It is cost-efficient, and it is available 24/7.

We already have the use of holography in gaming. It is already moving into industrial settings, as successful users like Japan Airlines clearly show. Can you imagine a HoloLens taking the place of your virtual assistant?

The use of holograms and augmented reality is increasing. More and more domains are being explored and tested.

A company called Looking Glass, based in Brooklyn, is already developing a product called HoloPlayer, which can display interactive 3D images that we can manipulate with hand gestures.

The combination of holograms and voice search can be very powerful, as demonstrated in the video above. We have another example: Pokémon Go. Why was Pokémon Go so popular? The interactivity is to blame.

Regardless of the device used, the ability to manipulate a virtual person using our voice or gestures can change how we interact with it.

Artificially Intelligent Virtual Assistants

Embedded content: https://www.youtube.com/watch?v=wqhxVwXI6q8

We all like straight and quick answers. Our questions include words like where, when and how. There is a context, a gesture and a situation. It's not that easy to interpret. Even a real person can make mistakes. We would like our virtual assistants to understand what our intent is and be able to answer our questions based on our preferences. Virtual assistants need to be able to adapt to our behavior, our moods and our interests.

Accurate identification of the user intent would mean accurate answers.
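As a rough sketch of what "identifying intent" and "answering based on our preferences" could look like in code, consider the toy example below. Everything in it is hypothetical: the intent names, the preference keys and the idea of reading intent from a single question word are assumptions made for illustration, and real assistants rely on trained language models instead.

```python
# Hypothetical mapping from question words to intents.
QUESTION_INTENTS = {
    "where": "find_location",
    "when": "find_time",
    "how": "find_instructions",
}


def identify_intent(question):
    """Guess the intent from the first recognised question word."""
    for word in question.lower().strip("?!. ").split():
        if word in QUESTION_INTENTS:
            return QUESTION_INTENTS[word]
    return "unknown"


def build_query(question, preferences):
    """Attach stored user preferences so the answer can be personalised."""
    return {
        "intent": identify_intent(question),
        "units": preferences.get("units", "metric"),
        "transport": preferences.get("transport", "any"),
    }
```

With this sketch, `build_query("Where is the nearest station?", {"transport": "underground"})` yields a location intent tagged with the user's preferred transport. It also shows how fragile the approach is: "How is the weather today?" would be labelled as a request for instructions, which is exactly the kind of misreading that makes intent so hard to get right.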

Traditionally, a keyword search returns a list of results. Sometimes relevant, sometimes not. We are all used to trying several combinations of keywords before we reach the desired information. And sometimes we don't reach it at all, despite that information being available somewhere.

This is different when we are speaking to our personal assistant in real life.

As I mentioned earlier, the room for improvement lies with the algorithms.

And algorithms are increasingly integrating artificial intelligence into their workings. This does not mean that we do not need improvements in voice recognition technology or in presentation, like a hologram.

What it means is that the software has to catch up with the hardware.

Because if only the algorithms needed to catch up, why would Google be requesting publishers to provide information in the form of structured data?
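For readers who have not seen structured data, here is a small sketch of how a publisher might generate it. It builds a schema.org `FAQPage` description in JSON-LD using Python's standard `json` module. The helper name and the question and answer text are invented for the example, and whether any given search engine or assistant makes use of this particular markup is not something the sketch can promise.

```python
import json


def faq_structured_data(question, answer):
    """Build a schema.org FAQPage description as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [{
            "@type": "Question",
            "name": question,
            "acceptedAnswer": {"@type": "Answer", "text": answer},
        }],
    }
    return json.dumps(data, indent=2)


# Example question and answer, made up for illustration.
markup = faq_structured_data(
    "What is voice search?",
    "Voice search lets users query a search engine by speaking.",
)
```

The resulting string would typically be placed on the page inside a `<script type="application/ld+json">` element, so that a machine gets the question and its answer spelled out instead of having to infer them from the prose.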

And we also need to remember that a whole lot of information is still available in the form of websites, videos, podcasts and other formats. According to estimates, the world will hold 163 zettabytes of data by 2025. Current estimates are around 16.3 ZB of data created per annum.

No one is going to waste all that data; it is going to be put to better use, of course. In the meantime, until the market giants catch up on natural language processing and machine learning, they need some help, and that's why structured data is in place.

Machine Learning

Embedded content: https://www.youtube.com/watch?v=MuWWZ91-G6w

"If machine learning is a pack horse for information processing, a neural network is the carrot that draws the horse forward."

Deep learning and neural networks | theconversation.com

As of 2019, machine learning is extensively used in many areas.

New studies show that machines can understand what you are saying far better without even hearing a word, just by processing the movement of your lips.

From self-driving cars to algorithms trading on behalf of humans to the detection of fraud in online transactions: you name it, and machine learning is helping us do the job better.

Google is in pursuit of creating something that can even mimic a thought process. Another way of saying this is "wisdom". And we all know DeepMind's AlphaGo has beaten world-class human players at Go, while its successor AlphaZero went on to master chess and shogi. It is believed that AlphaZero has even taught itself some new tricks that baffle chess players.

Advances are being made in computer speech generation, also called speech synthesis or text-to-speech, but nothing comes close to Google's WaveNet.


I hear folks saying that voice search and artificial intelligence are here to stay and are going nowhere. It is not a question of whether they are going to stay or go away. They are here, and they are taking over because they are natural.

From experience, we have learned that technology tends to move as close to a natural experience as possible. Virtual reality, for instance, looks more natural than it did about a decade ago. It is easy to deduce that voice search, coupled with artificial intelligence, natural language processing, machine learning and technologies like holography, is going to be a permanent part of our future.


Meet The Author

Asad Butt

Asad likes anything creative, but mainly developing web applications, optimizing websites for search engine algorithms and writing about all things creative.

He can be reached on LinkedIn or Twitter. He is also an active contributor at stackoverflow.com.


Disclaimer: Whilst we have made every effort to ensure that the information we have provided is accurate, it is not advice. We cannot accept any responsibility or liability for any errors or omissions. Visit third party sites at your own risk. This article does not constitute legal advice.