Keeping AI cost-effective in the move to cloud

Can Artificial Intelligence Deliver Real Value Today?

AI is in its infancy, but the early shoots of growth hold great promise for the industry. According to the Boston Consulting Group, although 59% of organisations have developed an AI strategy and 57% have carried out pilot projects in the area, only 11% have seen any kind of return from AI. That said, the potential is vast: some sources estimate that the global AI market could grow tenfold, from $15bn in 2021 to $150bn by 2028, while in the UK, expenditure on AI technologies could reach £83bn by 2040, up from £16.7bn in 2020. 

Whatever the application, most AI projects start as small, experimental tests hosted on an in-house server, and eventually graduate to cloud environments, where their uptime, security, scalability and maintenance can be assured. However, this migration – the ‘teenage’ stage of an AI application’s lifecycle, as it were – is often the most difficult and painful. 

Growing Pains

Moving an AI application to the cloud isn’t just a matter of ensuring greater scalability and improving uptime – it’s often a matter of cost. AI applications usually rely heavily on GPUs and GPU-like accelerators, which represent a significant investment for any start-up or lab. Although a single specialised card can be found for around a thousand pounds, more advanced, high-performance GPUs can cost in the region of £5,000 to £15,000 each. Delivering this level of performance at scale is often out of the question from a CapEx point of view, especially for a start-up. 
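To make that trade-off concrete, a back-of-the-envelope break-even calculation can help; the sketch below is purely illustrative, and every figure in it (card price, cluster size, rental rate, utilisation) is an assumption to replace with your own numbers, not a quoted price.

```python
# Back-of-the-envelope CapEx vs. cloud-rental comparison for GPU capacity.
# Every figure here is an illustrative assumption, not a quoted price.

gpu_purchase_price = 10_000   # GBP per card, mid-range of the £5k-£15k figure
gpus_needed = 8               # hypothetical cluster size
cloud_gpu_hourly = 2.50       # GBP per GPU-hour, assumed rental rate
hours_used_per_month = 160    # assumed real utilisation, not 24/7

capex = gpu_purchase_price * gpus_needed
monthly_rental = cloud_gpu_hourly * gpus_needed * hours_used_per_month
breakeven_months = capex / monthly_rental

print(f"Upfront purchase: £{capex:,}")
print(f"Cloud rental per month: £{monthly_rental:,.0f}")
print(f"Break-even after ~{breakeven_months:.0f} months")
```

Under these assumptions the purchase only pays for itself after roughly two years of steady use – which is exactly the kind of arithmetic that pushes early-stage teams towards renting capacity.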

Furthermore, AI application developers eventually reach the limits of their in-house machines; AI usually needs to be trained on exceptionally large datasets, which can mean running out of RAM and storage space fairly rapidly. Upgrading to a high-performance machine in the cloud can remove this bottleneck at both the development and production stages. However, there are a number of things that teams should be aware of and prepare for if they are to make the migration to cloud as painless and productive as possible. 
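Even before that upgrade, one practical way of working within in-house limits is to stream data through memory rather than loading it wholesale. The minimal sketch below shows the idea using the widely used pandas library; the file name and the "feature" column are placeholders for your own data.

```python
import pandas as pd

# Minimal sketch: compute a statistic over a dataset too large for RAM
# by streaming it in fixed-size chunks. "training_data.csv" and the
# "feature" column are placeholders for your own data.
total, rows = 0.0, 0

for chunk in pd.read_csv("training_data.csv", chunksize=100_000):
    # Each chunk is an ordinary DataFrame of up to 100,000 rows; process
    # it (or feed it to the model as a batch) and let it be freed.
    total += chunk["feature"].sum()
    rows += len(chunk)

print(f"Mean of 'feature' over {rows:,} rows: {total / rows:.4f}")
```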

When Plans Come Together

In the very early stages, research and preparation are essential. Portability, for example, is key: working with a platform like Docker from the get-go can greatly help you both before and after migration. Even before moving to a third-party datacentre, working in a containerised environment means that your coworkers and collaborators can quickly replicate the app and its dependencies and run it under exactly the same conditions as you do, allowing for robust and reliable testing. Running the AI application in a container also means that you’ll minimise re-configuration during the migration process itself.
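As a minimal sketch, a Dockerfile along the following lines captures the app and its dependencies in a single reproducible image; it assumes a Python-based service, and the base image, file names and port are all placeholders for your own project.

```dockerfile
# Illustrative Dockerfile for a Python-based AI service; image, file
# names and port are placeholders for your own project.
FROM python:3.11-slim

WORKDIR /app

# Pinning dependencies in requirements.txt means collaborators - and the
# eventual cloud environment - rebuild exactly the same stack.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8000
CMD ["python", "serve.py"]
```

The same image that collaborators test locally can then be pushed, unchanged, to the cloud provider’s registry when migration day arrives.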

From a provider point of view, it’s worth examining the credentials of cloud companies: for example, is their security regularly audited by independent bodies? Do they have specific security accreditations from the vendors they use in turn? AI applications often handle extremely sensitive data – from simple chatbots in retail banking to complex healthcare analytics systems – so making sure that this data will be handled, stored and protected appropriately is a must. 

Similarly, sustainability is an important consideration. AI requires high computing power, and the Wall Street Journal recently reported that handling a search query via ChatGPT was seven times more compute-intensive than handling one via a standard search engine. Researchers at the University of Massachusetts Amherst, meanwhile, found that training the GPT-2 algorithm (ChatGPT’s older sibling) created approximately 282 tons of CO2 equivalent – a similar amount to what the entire global clothing industry generated in producing polyester in 2015. AI application developers should be considering sustainability from the get-go, as well as how their partners manage recycling and electronic waste. 

At a more specific level, it’s also important to be clear about scaling. Having frank discussions with cloud providers about the specifics of app functionality, who will be using the app, and what that means for the technical architecture ensures that no aspect is neglected. After all, most large-scale cloud providers can offer automatic and effectively unlimited scaling, but there’s a big difference between the set-up needed for a system receiving ten requests a day and one receiving ten thousand a minute, so it’s important to be clear about instance ranges, for example. 
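A rough capacity estimate of the kind sketched below can anchor that conversation; every figure in it is an assumption to swap for your own measurements.

```python
# Back-of-the-envelope instance-range estimate for a scaling discussion.
# All figures are illustrative assumptions.

peak_requests_per_minute = 10_000   # the busy end of the scale above
inference_time_s = 0.2              # assumed time per request on one instance
headroom = 1.5                      # buffer for bursts and retries

peak_rps = peak_requests_per_minute / 60
rps_per_instance = 1 / inference_time_s
instances_needed = (peak_rps / rps_per_instance) * headroom

print(f"Peak load: {peak_rps:.0f} req/s -> plan for ~{instances_needed:.0f} instances")
```

Even with generous assumptions, ten thousand requests a minute translates to around fifty instances at peak – a very different conversation with a provider than a ten-requests-a-day prototype.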

Similarly, latency considerations are crucial: chatbots and other real-time systems need to respond instantly to web users. This means that both code and infrastructure must be sufficiently low-latency, and developers and deployers will need to shave off every possible millisecond. In terms of deployment, that includes checking that compute resources sit as close as possible to (or in the same location as) the data, which will help to keep things as fast as possible. 
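Measuring, rather than guessing, is the first step. A quick sketch like the one below samples round-trip latency to a deployed endpoint – the URL is a placeholder – making it easy to compare a deployment co-located with the data against a distant one.

```python
import statistics
import time
import urllib.request

# Minimal sketch: sample end-to-end latency to an inference endpoint.
# The URL is a placeholder for your own deployment.
ENDPOINT = "https://example.com/predict"

samples_ms = []
for _ in range(20):
    start = time.perf_counter()
    urllib.request.urlopen(ENDPOINT, timeout=5).read()
    samples_ms.append((time.perf_counter() - start) * 1000)

print(f"median: {statistics.median(samples_ms):.1f} ms")
print(f"p95:    {statistics.quantiles(samples_ms, n=20)[-1]:.1f} ms")
```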

Finally, once the application has been deployed, continuous monitoring is important. There may be alternative configurations within the cloud provider’s environment that would better suit the app’s needs – or, in some cases, moving to an alternative provider may be the best option. Working with open standards, in an open-source cloud environment such as OpenStack, can often make such a move less challenging. 
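As one sketch of what that monitoring can look like, the example below exposes request-latency metrics using the open-source prometheus_client library; the metric name, port and simulated workload are all illustrative stand-ins.

```python
import random
import time

from prometheus_client import Histogram, start_http_server

# Illustrative metric; the name and simulated workload are placeholders.
INFERENCE_LATENCY = Histogram(
    "inference_latency_seconds",
    "Time spent handling one inference request",
)

def handle_request():
    # Stand-in for the real model call, timed automatically.
    with INFERENCE_LATENCY.time():
        time.sleep(random.uniform(0.05, 0.2))

if __name__ == "__main__":
    start_http_server(9100)  # metrics served at http://localhost:9100/metrics
    while True:
        handle_request()
```

Latency histograms of this kind make it obvious when an app has outgrown its current configuration – which is precisely the signal you need before renegotiating instance ranges or moving provider.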

When AI Grows Up

Nobody knows if AI will ever reach the lofty – and sometimes terrifying – heights that science fiction films have promised for decades. However, if this incredibly promising and powerful technology is to reach its full potential, especially in the face of the current energy crisis, it needs to be deployed as efficiently and effectively as possible, allowing its creators to focus on their core work of building AI systems rather than worrying about infrastructure and operational concerns. 

If AI developers plan carefully, choose their partners well, and streamline their processes when they move applications from their on-premises training-wheels environment to the bigger, wider and more flexible world of the cloud, they will considerably increase their chances of a successful redeployment, keeping costs down and end-users happy. And although that’s not the same as building WALL-E, a T-1000 or Chappie, it’s a step in the right direction.

Alexis Gendronneau

Head of Data Products at OVHcloud
