By Pratik Kumar
It’s 2026. A customer calls a support line, and the call lands on a voice-based AI system.
The call is answered instantly, and the AI greets the customer with a customised avatar. Using voice ID, the AI identifies the customer and knows not only her product usage and recent query history, but also her preferences around gender, tone of voice, and language. The customer knows she's speaking to a computer, and actually prefers it. Most customer-centric companies now use voice-based AI for support, and it gives the customer confidence that her query will be resolved in the best way possible.
The AI has become extremely good at answering frequently encountered queries. Every conversation is recorded and transcribed with real-time sentiment analysis, training the AI to get better and more personalised with every call. The AI is also getting better at recognising queries it cannot solve on its own — queries and problems that require a more general intelligence: being politically correct, emotionally sensitive, introspective, philosophical, or creative (drawing on seemingly unconnected learnings from unrelated industries). For such support, the AI forwards the call to the most appropriate agent available on the human cloud.
The human cloud is a network of human agents available on demand. Every agent has an app that signals their availability to the cloud, either through explicit logins or other forms of behaviour tracking. The AI has become very good at forecasting demand for different types of agents, and manages supply levels when necessary through incentive mechanisms such as surge pricing.
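As a toy illustration of what such a surge-pricing rule might look like — every name, number, and threshold here is invented, not a description of any real system — the incentive could simply scale agent pay with the forecast demand-to-capacity ratio:

```python
# Hypothetical sketch: raise the pay multiplier when forecast call
# demand outstrips the capacity of the agents currently available.

def surge_multiplier(forecast_calls: int, available_agents: int,
                     calls_per_agent: int = 6) -> float:
    """Return a pay multiplier from the forecast demand/capacity ratio."""
    capacity = available_agents * calls_per_agent
    if capacity == 0:
        return 2.0  # cap the incentive when no agents are online
    ratio = forecast_calls / capacity
    if ratio <= 1.0:
        return 1.0  # enough supply: base pay
    # scale pay up with the shortfall, capped at 2x base pay
    return min(2.0, 1.0 + (ratio - 1.0) * 0.5)

print(surge_multiplier(60, 10))   # capacity matches demand -> 1.0
print(surge_multiplier(90, 10))   # 50% over capacity -> 1.25
```

A real platform would presumably forecast per skill, per geography, and per time slot, but the basic supply-demand feedback loop is the same.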
If the AI determines that the call needs to be forwarded to a human agent, it prompts the customer and transfers the call to the most appropriate available agent. The agent takes the call on his mobile like any other call, using an in-ear headphone — he needs the phone screen during the call for information. Regardless of the environment of either the customer or the agent, the call is crystal-clear: the AI built into the headphone isolates only the voices of the customer and the agent, and cancels all other noise. Neither person on the call would know if the other was in a train station or a movie theatre.
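Picking the "most appropriate available agent" could be as simple as scoring each available agent on skill overlap with the query, using past-call ratings as a tiebreaker. This is a minimal sketch under invented assumptions (the `Agent` fields, weights, and scoring rule are all hypothetical):

```python
# Hypothetical routing sketch: match a call's required skills against
# the agents currently available on the human cloud.

from dataclasses import dataclass

@dataclass
class Agent:
    name: str
    skills: set       # e.g. {"billing", "refunds"}
    rating: float     # 0.0 - 5.0, from past call assessments
    available: bool

def route_call(required_skills: set, agents: list):
    """Return the best available agent for the call, or None."""
    candidates = [a for a in agents if a.available]
    if not candidates:
        return None
    def score(a: Agent) -> float:
        # fraction of required skills the agent covers
        overlap = len(required_skills & a.skills) / max(1, len(required_skills))
        return overlap * 10 + a.rating  # skill match dominates; rating breaks ties
    return max(candidates, key=score)
```

For example, for a call needing both billing and refunds expertise, an available agent covering both skills would beat a higher-rated agent covering only one.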
There is no call centre. Everything is in the cloud.
The agent app UI is highly intuitive, and immediately gives the agent the complete context of the AI-led conversation up to that point, so he can pick up the conversation from there. The UI also understands which information to show automatically during the call, based on its context. The conversation continues to be transcribed in real-time, with sentiment analysis. The AI is trained to flag a conversation when the sentiment analysis warrants it. Tickets are raised in real-time for supervisors to analyse the transcript and jump in live if necessary. These jump-ins are seamless, and the customer is used to such situations anyway — she knows they only ensure her query is resolved in the best possible way.
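The flagging step might look something like the following toy sketch: track a rolling window of per-utterance sentiment scores and raise a flag when the recent average turns sharply negative. The window size, threshold, and score range are illustrative assumptions, not a real product's parameters:

```python
# Toy sketch of flagging a live call for supervisor review when recent
# sentiment (one score per utterance, in [-1.0, 1.0]) trends negative.

from collections import deque

class SentimentFlagger:
    def __init__(self, window: int = 3, threshold: float = -0.3):
        self.scores = deque(maxlen=window)  # keeps only the last `window` scores
        self.threshold = threshold

    def add(self, score: float) -> bool:
        """Record an utterance's sentiment; return True if the call
        should be flagged (i.e. a supervisor ticket raised)."""
        self.scores.append(score)
        if len(self.scores) < self.scores.maxlen:
            return False  # not enough signal yet
        return sum(self.scores) / len(self.scores) < self.threshold
```

In practice the trigger would likely combine sentiment with other signals (silence, interruptions, escalation keywords), but a rolling-window average captures the basic idea of flagging on a sustained negative trend rather than a single bad utterance.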
Once the call is over, an immediate quantitative and qualitative assessment is made of the agent's performance, based on which the corresponding payment is settled immediately. The assessment summary is shared with the agent and feeds into personalised training materials generated for him. The agent must keep completing his training materials to maintain and grow his expertise for the corresponding client and industry, and the corresponding pay level.
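A settlement rule combining the two assessments could be sketched as below; the weights, score ranges, and rate structure are purely hypothetical:

```python
# Hypothetical post-call settlement: blend a quantitative score (e.g.
# resolution, handle time) with a qualitative score (e.g. transcript
# review) into an immediate payout for the agent.

def settle_call(base_rate: float, quant_score: float, qual_score: float,
                surge: float = 1.0) -> float:
    """Return the payment for one call; both scores are in [0, 1]."""
    quality = 0.6 * quant_score + 0.4 * qual_score  # assumed weighting
    # a call always pays at least half the base rate; quality earns the rest
    return round(base_rate * surge * (0.5 + 0.5 * quality), 2)
```

The same `quality` number could then feed the agent's running pay level and the generation of his personalised training materials.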
The AI continues to use the transcriptions of these calls to train itself further, getting better and better at managing more complex queries on its own. However, it will never be able to completely handle calls that require a high degree of general intelligence, and so the industry does not foresee the elimination of human agents any time soon.
The voice industry itself is exploding. It seems humans prefer talking to any other kind of interface, and with the explosion of IoT, they can now talk to everything. Alexa and Google Home were just the beginning — people are talking to their cars, refrigerators, microwave ovens, even lights and fans and doors and windows. And so the use cases for voice-based support are exploding too.
Smart machines are now smart enough to initiate voice-support calls on behalf of their owners. You're talking to Alexa to turn on your A/C, and the next thing you know, you're automatically connected to voice support for your A/C company, because Alexa and the IoT module on board the A/C couldn't figure out why it wouldn't turn on.
Voice-support companies are building their own niches and relationships. Machine learning, language APIs, and distributed-workforce-management infrastructure are easily available in the cloud, mostly as open source. The differentiators are a company's choice of geographies and industries to service, its industry-specific tools (built on top of open source), and its sales capabilities. Client relationships are very sticky — once you start servicing a particular company's customers, your AI gets better with every support call and your human cloud grows more experienced with those customers, making it very hard for the client to shift to another provider.
Correspondingly, it gets easier to win new clients in an industry where you have established yourself. With increasing scale and deeper knowledge, costs come down and profitability goes up, significantly.
Industry forecasters are very bullish. As long as people want to talk, the voice-support industry will keep prospering, and new use-cases will keep coming up.
Unless people stop talking. Some eccentric entrepreneurs are working on brain-computer interfaces, which will eliminate the need to speak. Communication will happen at the speed of thought. IoT will evolve into IoP (Internet of People).
However, even futurists believe this is wishful, and shouldn’t be a concern for the foreseeable future.
Pratik Kumar, VP Solutions iSON BPO