Speak up: Speech recognition is big business, thanks to HPC

It wasn’t long ago that being put through to a robot on a company’s phone line was a painful experience, to say the least. I’m sure that, like me, you have had that moment where you’ve wanted to hang up out of pure frustration at not being understood. Fortunately, speech recognition has vastly improved in recent years, with high performance computing (HPC) and forward-thinking companies like Speechmatics at the heart of these improvements.

The power of HPC has opened doors to many technological advances within business and commerce, and nowhere more so than in speech recognition and natural language translation. Translating human voice has become a big, multi-million-pound business.

Back in October of last year, LivePerson acquired VoiceBase and Tenfold, which both used artificial intelligence (AI) and complex natural language processing to turn audio into structured, rich data. This was closely followed by Microsoft closing on its approximately $16 billion acquisition of speech recognition company Nuance – an early pioneer of speech recognition AI, best known as the source of the speech recognition engine for Siri, Apple’s ever-popular personal assistant.

And, even more recently, Speechmatics, a global expert in deep learning and speech recognition whose latest HPC cluster is based within our Harlow data centre, announced that it had raised an impressive $62m in Series B funding.

Of course, digital personal assistants like Siri and Alexa are where most people will have encountered this technology, but it is also being used to make driving safer through voice-activated navigation, save time for medical staff by automating dictation, and increase security with voice-based authentication.

As the LivePerson acquisition suggests, customer service is another area where voice recognition is growing rapidly in significance. Most people prefer voice requests to typing or clicking through menus, but few companies can afford to have a person available to handle every conversation, and customers are increasingly complaining about ‘call centre fatigue’.

Technology now makes it easier to handle some of those interactions efficiently and it allows organisations to collect voice data. The data can be used to train AI-powered neural networks to understand speech better, but organisations naturally want to analyse it further and AI can help here too.

As a result, the global voice recognition market is booming. It was valued at $10.7 billion in 2020 and is expected to be worth $27.2 billion by 2026, spanning several overlapping technology types and use cases.
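To put those two figures in perspective, the short sketch below works out the annual growth rate they imply. It assumes steady compound growth across the six years from 2020 to 2026, which is a simplification of my own rather than something the market forecast states, and the function name is purely illustrative.

```python
def compound_annual_growth_rate(start_value: float, end_value: float, years: int) -> float:
    """Return the constant yearly growth rate that turns start_value into end_value."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Market size in billions of US dollars, as quoted above.
value_2020 = 10.7
value_2026 = 27.2

cagr = compound_annual_growth_rate(value_2020, value_2026, 2026 - 2020)
print(f"Implied annual growth: {cagr:.1%}")
```

Under that assumption, the market would need to grow by a little under 17% every year to reach the 2026 figure.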

Voice recognition could be used by dictation software to convert the content of a meeting into a text document, for example. To do that, the computer needs to identify the words spoken but it doesn’t have to be very concerned about their meaning.

That changes somewhat when a voice assistant needs to tell you about the weather by ‘understanding’ a range of commands. In most cases, though, the device will recognise a variety of ways of asking the question but not every single one. These tools can also be used to analyse large quantities of audio and identify key qualities, such as sentiment.
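A toy sketch makes the point about coverage. The phrase patterns and function name below are hypothetical, and real voice assistants rely on trained language understanding models rather than hand-written rules, but the limitation is the same in spirit: phrasings the system has not been prepared for fall through.

```python
import re

# Hypothetical patterns for a single "weather" intent (illustrative only).
WEATHER_PATTERNS = [
    re.compile(r"\bweather\b"),
    re.compile(r"\b(rain|snow|sunny)\b"),
    re.compile(r"\bdo i need (an umbrella|a coat)\b"),
]


def detect_intent(transcript: str) -> str:
    """Return "weather" if any known phrasing appears in the transcript, else "unknown"."""
    text = transcript.lower()
    if any(pattern.search(text) for pattern in WEATHER_PATTERNS):
        return "weather"
    return "unknown"


# Several ways of asking are recognised...
print(detect_intent("What's the weather like today?"))
print(detect_intent("Will it rain tomorrow?"))
# ...but a phrasing nobody anticipated is not.
print(detect_intent("Should I pack sunscreen?"))
```

The last request is clearly about the weather to a human listener, yet it matches none of the patterns, which is the gap that better models and more training data aim to close.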

These solutions are constantly evolving. For example, Speechly, a Finnish start-up, recently patented a new technological approach that combines speech-to-text with natural language understanding in a novel way. The company claims that this enables faster and more complex voice interactions than current solutions.

All this is done with a variety of algorithms, which use variations of grammar rules, probability, and speaker recognition to identify and classify spoken phrases. The difference between a good speech recognition algorithm and a poor one comes down to accuracy and speed. Companies want the system to work as quickly as possible, with minimal errors.
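Accuracy can be made concrete. One widely used measure is word error rate (WER): the number of word substitutions, deletions and insertions needed to turn the system's transcript into the correct one, divided by the number of words in the correct transcript. This post doesn't name a specific metric, so treat the choice of WER, the function name and the sample sentences below as my own illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two transcripts, divided by reference length."""
    ref_words = reference.lower().split()
    hyp_words = hypothesis.lower().split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # previous_row[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        current_row = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current_row[j] = min(
                previous_row[j] + 1,                      # deletion
                current_row[j - 1] + 1,                   # insertion
                previous_row[j - 1] + substitution_cost,  # match or substitution
            )
        previous_row = current_row

    return previous_row[-1] / len(ref_words)


# One substituted word ("the" -> "a") and one dropped word ("today").
print(word_error_rate("what is the weather today", "what is a weather"))
```

A lower score is better: a perfect transcript scores 0, and the example above, with two errors against a five-word reference, scores 0.4.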

Key to all of this, of course, is data: specifically, voice data that is tagged and categorised so that AI can understand what it is and how it relates to everything else in the sphere of the spoken word. As we’ve already seen with visual data (behind, say, autonomous driving), labelling data is a labour-intensive process, and the end product is only as good as the initial input. Manual labelling is accurate but limited, both in speed and volume and in its ability to capture the infinitely varied intricacies of global speech: tone, delivery, accents, slang and so on.

How can we label data faster? Or maybe we need to change how we look at this problem and refine our machine and deep learning models. The amount of data is an issue, but also a blessing, and if machines can be trained to pull more accurate insights from raw data, then we’re really moving into the world of seamless voice translation at mass volume. It’s a game-changing moment.

As I mentioned at the start of this blog, much of this work relies on high performance computing (HPC) in both the cloud and colocation data centres. Voice recognition is a very compute-intensive operation that requires classic parallel processing, high-bandwidth, low-latency interconnectivity, and lots and lots and lots of GPUs. The cloud is a perfect receptacle for collecting, inputting and distributing data, and industrial-scale data centres like Kao Data are ideal for the data-intensive, fine-touch processing tasks – enabling an HPC cluster, supported by GPUs, to be tailored within a close, connected environment to crunch data.

It’s possible that within a few years, voice will be our main method of interacting with computers, and they will ‘speak’ their information back to us through smart earphones or project data onto augmented reality glasses. The cloud, high performance computing and 5G connectivity will ensure that all of this happens with lightning speed. We are still in the early stages of voice recognition, but as the investment in the technology shows, there is plenty more to come.



