Flash Storage – Powering the AI Future

By James Petter, EMEA VP, Pure Storage

Artificial Intelligence (AI) is starting to change how many businesses operate. The ability to process and deliver data faster than any human could is already transforming how we do everything from studying diseases and understanding road traffic behavior to managing finances and predicting weather patterns.

For organizations like Global Response, AI represents an opportunity to reinvent existing business models. With the help of storage industry leaders, Global Response has begun developing a state-of-the-art call centre system that allows for real-time transcription and analysis of customer support calls. This will deliver a better customer experience and faster resolutions, both increasingly important as consumer expectations shift toward personalized service.

Similarly, Paige.AI is an organization focused on revolutionizing clinical diagnosis and treatment in oncology through the use of AI. Pathology is the cornerstone of most cancer diagnoses, yet most pathologic diagnoses rely on manual, subjective processes developed more than a century ago. By leveraging the potential of AI, Paige.AI aims to transform pathology and diagnostics from a highly qualitative discipline into a more rigorous, quantitative one.

With so much on offer and at stake, the question is no longer simply what AI is capable of, but rather where AI can best be used to deliver immediate business benefits.

According to a 2018 report from PwC, AI is expected to contribute US$320 billion to the Middle East economy by 2030, with annual growth rates of 20-34% across the region. This is not surprising given the findings of our recent Evolution report, which revealed that 45% of IT decision makers in the Middle East plan to increase their IT budget for AI and machine learning projects in the next financial year. A further 46% plan to invest in AI skills and personnel over the same timeframe.

For those looking to implement AI or machine learning projects, the compute bottleneck that once held such projects back has largely been eliminated, thanks in large part to graphics processing unit (GPU) technology from the likes of NVIDIA. As a result, the challenge for many projects is now delivering data fast enough to feed the data analysis pipelines at the heart of AI.
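To make that concrete, the sketch below is a hypothetical PyTorch example; the dataset, paths and parameters are illustrative and not drawn from any project mentioned here. It shows the common pattern of using parallel data loaders so that storage and I/O throughput, rather than the GPU, set the pace of training.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class TrainingSet(Dataset):
    """Hypothetical dataset that would read samples from shared storage."""

    def __init__(self, root="/mnt/datahub/train", size=100_000):
        self.root = root  # assumed mount point of a shared data hub
        self.size = size

    def __len__(self):
        return self.size

    def __getitem__(self, idx):
        # In practice this would read and decode a file under self.root;
        # random tensors stand in for decoded samples here.
        return torch.randn(3, 224, 224), idx % 10


loader = DataLoader(
    TrainingSet(),
    batch_size=256,
    num_workers=8,      # parallel reader processes keep the GPU fed
    pin_memory=True,    # faster host-to-GPU transfers
    prefetch_factor=4,  # batches staged ahead of each training step
)

for images, labels in loader:
    pass  # the training step would run here; throughput depends on the readers above
```

The point of the sketch is simply that once the compute side is fast, sustained training speed comes down to how quickly the underlying storage can serve many concurrent readers.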

It is critical that organizations also carefully consider the infrastructure needed to support their AI ambitions. To innovate and improve AI algorithms, storage has to deliver uncompromised performance across all manner of access patterns: small and large files, random and sequential access, low and high concurrency. It must also be able to scale linearly and non-disruptively as capacity and performance needs grow.

For legacy storage systems, meeting these requirements is no mean feat. As a result, data can easily end up in infrastructure silos at each stage of the AI pipeline (ingest, clean and transform, explore, train), making projects more time-intensive, complex and inflexible.

Bringing data together into a single, centralized storage hub as part of a deep learning architecture enables far more efficient access to information, increasing the productivity of data scientists and making scaling and operations simpler and more agile for the data architect. Modern all-flash data platforms are ideal candidates to act as that central data hub: flash is the only storage technology capable of underpinning and releasing the full potential of projects operating in environments that demand high-performance compute, such as AI and deep learning.
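As a rough illustration only, the sketch below uses hypothetical paths and functions (not any particular product's API) to show the pipeline stages named above, from ingest through clean and transform, explore and train, all reading from and writing to one shared data hub rather than per-stage silos.

```python
from pathlib import Path

DATA_HUB = Path("/mnt/datahub")  # assumed mount point of the central, flash-backed hub


def ingest(source_files):
    """Land raw data in the hub so every later stage works from the same copy."""
    raw_dir = DATA_HUB / "raw"
    raw_dir.mkdir(parents=True, exist_ok=True)
    for src in source_files:
        (raw_dir / Path(src).name).write_bytes(Path(src).read_bytes())
    return raw_dir


def clean_and_transform(raw_dir):
    """Write curated records back to the hub instead of copying to a separate silo."""
    curated_dir = DATA_HUB / "curated"
    curated_dir.mkdir(parents=True, exist_ok=True)
    for f in raw_dir.iterdir():
        (curated_dir / f.name).write_text(f.read_text().strip().lower())
    return curated_dir


def explore(curated_dir):
    """Quick profiling pass over the curated data, here just counting records per file."""
    return {f.name: len(f.read_text().splitlines()) for f in curated_dir.iterdir()}


def train(curated_dir):
    """Training jobs read the same curated data the exploration step used."""
    ...  # model training would consume curated_dir here
```

Because every stage points at the same hub, there is no copy-out and copy-back between silos, which is where much of the time and complexity in legacy setups tends to accumulate.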

UC Berkeley’s AMPLab created Apache Spark™, a pioneering engine for large-scale, real-time analytics. The university’s genomics department then ran Apache Spark on top of flash storage to accelerate its genomic sequencing work.

Man AHL, a pioneer in the field of systematic quantitative investing, likewise runs Apache Spark on top of flash storage to create and execute the computer models behind its investment decisions. Roughly 50 quantitative researchers and more than 60 technologists collaborate to formulate, develop and drive new investment models and strategies that can be executed by computer. The firm adopted flash storage to deliver the massive throughput and scalability required by its most demanding simulation applications.
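In practice, workloads like these often boil down to Spark jobs that read a shared dataset from fast central storage and compute over it in parallel across the cluster. The snippet below is a minimal, hypothetical PySpark sketch of that pattern; the paths, column names and aggregation are illustrative and not drawn from either organization's actual pipelines.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("shared-storage-analytics").getOrCreate()

# Every executor reads the same Parquet dataset from the shared mount; job
# throughput scales with how fast the storage layer can serve concurrent readers.
trades = spark.read.parquet("/mnt/datahub/curated/trades")

daily = (
    trades
    .groupBy("instrument", F.to_date("timestamp").alias("day"))
    .agg(F.avg("price").alias("avg_price"), F.count("*").alias("n_trades"))
)

# Results land back on the shared hub, where downstream models can pick them up.
daily.write.mode("overwrite").parquet("/mnt/datahub/results/daily_summary")
spark.stop()
```

The heavier the concurrency of such jobs, the more the storage layer's ability to serve many parallel readers determines end-to-end runtime.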

Flash storage arrays are well suited to these AI projects because they offer a degree of parallelism, loosely analogous to the human brain, that lets multiple queries or jobs run simultaneously. Building this kind of flash technology into the very foundation of an AI project vastly improves the rate at which AI and machine learning initiatives can develop. For years, slow, complex legacy storage systems have been unable to cope with modern data volume and velocity, acting as a roadblock to next-generation insights. Purpose-built flash storage arrays eliminate that roadblock, removing storage infrastructure as a barrier to fully leveraging data analytics and AI.

Whether or not AI is central to your company's core competency, it is a tool all organizations should consider using to bring efficiency and accuracy to their data-heavy projects. Those that don't risk leaving their business at a severe competitive disadvantage.
