EMBRACE ALGORITHMS! EXPLORING THE DATA ANALYTICS LIFECYCLE
“I am excited about the possibilities of data analytics. Although there is a lot of community fear about creating and implementing systems that can scale so rapidly, when done correctly there are enormous business benefits. It sounds super-cheesy, but to quote Uncle Ben from Spider-Man, ‘with great power comes great responsibility’.”
Jaydin Nathan is a Data & AI Consultant within Lumen, LAB3’s cutting-edge research arm. As somebody who studied philosophy and economics at university and is now passionate about crunching algorithms, Jaydin is well placed to explain and reflect on the potential of data analytics and how Lumen can minimise the risks.
Breaking down data analytics into 3 easy parts.
I often get asked to explain data analytics by people outside the tech sector. To help get your head around the concept, it is easiest to start out with the end goal. Essentially, this is to collect information and turn this into insight. This can then be further broken down into three levels of maturity.
First, there is descriptive analytics – to look at data from a historical or real-time perspective and observe what has happened or is happening.
Next is predictive analytics – once we know what has happened in the past and what the inputs and outcomes were, we can then model what is likely to happen in the future.
Third, there is prescriptive analytics – now that the predictive model tells us what is likely to happen, the question to ask is: what is the best course of action given our goal?
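To make the three levels concrete, here is a minimal sketch in Python. The tread-depth readings, the safety threshold and the function names are all invented for illustration, and the "model" is nothing more than a straight-line extrapolation; a real project would use far richer data and models.

```python
from statistics import mean

# Hypothetical monthly tread-depth readings in millimetres (invented values).
tread_mm = [30.0, 28.0, 26.0, 24.0, 22.0]

# Descriptive: what has happened? Average wear per month so far.
average_wear = mean(prev - curr for prev, curr in zip(tread_mm, tread_mm[1:]))


def predict_tread(months_ahead):
    """Predictive: naive linear extrapolation of future tread depth."""
    return tread_mm[-1] - average_wear * months_ahead


MIN_SAFE_MM = 15.0  # assumed replacement threshold, not a real standard


def months_until_replacement():
    """Prescriptive: how many more months can the tyre run before
    the predicted depth drops below the assumed threshold?"""
    if average_wear <= 0:
        return None  # no measurable wear, so the naive model cannot answer
    months = 0
    while predict_tread(months + 1) >= MIN_SAFE_MM:
        months += 1
    return months
```

The same data feeds all three functions; what changes at each level is the question being asked of it.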
To build on this, AI (through machine learning) is a continuation of the predictive and prescriptive elements. Once developed, an AI system can learn autonomously, both quickly and at scale. And because of this, AI is a natural evolution of data analytics maturity.
Along the way, risks arise. For example, was the information collected sufficiently substantial and accurate? Were the algorithms used appropriately scrutinised to avoid inadvertent distortion of the results? And then of course, in some cases, there will be ethical dilemmas, about the morality of algorithms making decisions that have tangible effects on the way people are able to live their lives.
To minimise these risks over the data analytics lifecycle, the team at Lumen has developed purpose-built hardware, best-practice principles, and modular reference architectures.
To kick-start data analytics, getting the raw data is critical.
Of course, for organisations to benefit from data analytics at all, they need to be able to consistently and accurately capture relevant information, which is where IoT (the Internet of Things) can play a key role.
Within Lumen, we have our own in-house R&D capability dedicated to IoT sensors. While we’ve developed a range of sensors for a variety of use cases, we’ve also been able to create custom sensor solutions to fit less common edge cases for our customers. These can be integrated with any existing sensors our customers may have.
From the sensors, the data is then sent to Azure for processing and analytics. A benefit of our ability to integrate sensors is that all of this data can be processed together in Azure. Within Azure we typically leverage PaaS products such as IoT Hub, Stream Analytics, and Time Series Insights, before sending the data to our custom predictive models.
Most recently, the team and I have been working on using techniques typically used for image classification on our IoT data, which allows us to perform anomaly detection and classify time series data to a high degree of accuracy.
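One well-known way to apply image techniques to time series is to encode each series as a two-dimensional matrix, for example a Gramian Angular Field, and pass that matrix to an image classifier. The sketch below shows only the encoding step and is an illustration of the general idea, not necessarily the approach the Lumen team uses; the function name is hypothetical.

```python
import math


def gramian_angular_field(series):
    """Encode a 1-D series as a 2-D Gramian Angular Summation Field.

    Entry [i][j] is cos(angle_i + angle_j), where each angle is the
    arccosine of the series value rescaled to the range [-1, 1].
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Rescale to [-1, 1] so every value can be read as a cosine.
    scaled = [2 * (x - lo) / (hi - lo) - 1 for x in series]
    # Clamp to guard against tiny floating-point overshoot before acos.
    angles = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    return [[math.cos(a + b) for b in angles] for a in angles]
```

A window of n sensor readings becomes an n-by-n "image", which a convolutional classifier could then label as normal or anomalous.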
Given Lumen’s end-to-end IoT capability, our time-to-value is very fast, which means our customers can reach predictive maturity sooner – for example, through predictive maintenance. In one case, Lumen used our sensors to monitor tyres on mining trucks, with the aim of optimising their maintenance schedule and extending the life of the tyres. Not only did this benefit the client’s bottom line, it was good for the environment as well!
The team and I have also found that the large-scale data collection required for big data analytics can bring unintended benefits to the organisation, such as reducing information silos. Typically, an analytics project will need to harvest data from many disparate areas of the business. And it may also involve conducting interviews with domain experts who are close to the work – whether that be engineers, drivers, maintenance specialists or managers.
Thus, when an organisation backs these wide-reaching initiatives for information collection – whether that be information as data or information gained from interviews – silos that were previously unknown begin to break down.
To reduce risk, you can incorporate the principle of Positive Automation.
I believe there is a healthy amount of fear around algorithms used in AI given the automation they deliver and the speed at which they can scale.
To allow organisations to develop trust in models, the approach Lumen takes is to slowly increase the decision-making ability of models in tandem with the increased trust of our clients.
As a first step, the model can be built to simply make recommendations – to then be actioned by a person. This is sometimes termed ‘augmented intelligence’, and while still requiring input from a person, it is nevertheless possible to gain efficiencies due to the extensive information collection involved in machine learning systems. The ability to better aggregate information and present that to a decision maker will almost always result in better and more informed decisions.
The next stage is to progress the model to automatically make decisions in certain scenarios. To provide a layer of safety, this can still involve the input of a person where decisions could be considered negative (for example, for a customer). This stage allows clients to gain further trust in any changes to existing systems.
Once the model has gained the confidence of the organisation, it can then be safely embedded as an automated process within a business. But this isn’t the end of the AI lifecycle.
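The staged approach described above can be sketched as a simple routing rule. The stage names and the `route_decision` function are hypothetical labels chosen for this illustration, not part of any Lumen product.

```python
from enum import Enum


class Stage(Enum):
    RECOMMEND = 1  # model only recommends; a person actions every decision
    PARTIAL = 2    # model decides, except where the outcome is negative
    AUTOMATED = 3  # model decides in all scenarios


def route_decision(stage, is_negative):
    """Return who actions a decision: 'human' or 'model'."""
    if stage is Stage.RECOMMEND:
        return "human"
    if stage is Stage.PARTIAL and is_negative:
        return "human"  # safety layer for decisions that could hurt a customer
    return "model"
```

Moving a model from one stage to the next then becomes an explicit, reviewable configuration change rather than a rewrite.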
The right mix of monitoring is key to successful AI.
There is another area of concern with models which arises after they are implemented. It is prudent for organisations to include monitoring, and not simply adopt a set and forget mentality.
Once algorithms are embedded within business processes and the initial benefits realised, they should not be forgotten. These models may be making or aiding incredibly important decisions that have ripple effects throughout organisations and wider afield.
For this reason, Lumen has developed an MLOps reference architecture for our clients (MLOps, or Machine Learning Operations, is the data science equivalent of DevOps).
To make this possible, Lumen has leveraged LAB3’s strong Terraform footprint, and the team has used parameterised Terraform templates to rapidly deploy MLOps environments into Azure.
The MLOps architecture is centred on Azure Machine Learning, to which the team has added monitoring and alerting for events such as data and model drift, as well as security and networking controls. These controls help ensure data remains accessible only to those who should have it.
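To give a feel for what a data drift check involves, here is a minimal sketch of one common statistic, the Population Stability Index (PSI). It is a generic illustration written in plain Python and is not how Azure Machine Learning or the Lumen architecture computes drift; the function name, bin count and floor value are assumptions.

```python
import math


def population_stability_index(baseline, current, bins=5):
    """Compare the distribution of `current` against `baseline`.

    A result of 0 means identical binned distributions; larger values
    mean the incoming data has moved further from the training data.
    """
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(baseline), max(baseline)
    if hi == lo:
        raise ValueError("baseline must contain at least two distinct values")
    width = (hi - lo) / bins

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp so values outside the baseline range land in the edge bins.
            counts[max(0, min(bins - 1, idx))] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    p, q = proportions(baseline), proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift worth an alert, although the right threshold depends on the data and the use case.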
Not only does this keep clients in touch with the models across their organisation, Lumen has also found that MLOps greatly increases the success rate of analytics projects by encouraging reuse and knowledge sharing of any analytics artifacts.
Conveniently, each of Lumen’s offerings is completely modular, so it is possible to integrate around existing environments and deploy only what is needed for each use case. For example, if a client only wants to add a Lambda architecture for IoT, this is possible. Or if they want to branch out into MLOps, the Lumen solution can sit on top of their existing architecture.
Lumen has found that baking best practices into reference architectures and principles, and then using Terraform to modularise these, creates the right balance between speed and safety when working with AI.
Through best practices, you can feel confident about the numbers.
Given the risks that can arise at any stage of the data analytics lifecycle, I cannot stress enough how important it is for organisations to be confident that best practices have been followed. LAB3-Lumen can help ensure this, and holds gold partner status with Microsoft in both Data Platforms and Data Analytics.