Preparing and Architecting for Machine Learning

Gartner Technical Professional Advice

by Carlton E. Sapp

Preface

Key Findings

  • Machine learning (ML), a subset of artificial intelligence (AI), is more than a technique for analyzing data. It's a system that is fueled by data, with the ability to learn and improve by using algorithms that provide new insights without being explicitly programmed to do so.
  • Preparing data for ML pipelines is challenging when end-to-end data and analytic architectures are not refined to interoperate with underlying analytic platforms.
  • ML is best suited for dealing with big data. Organizations overwhelmed with data are using multiple ML frameworks to increase operational efficiencies and achieve greater business agility.
  • Technical professionals are using machine learning to add elements of intelligence to software development and IT operations (DevOps) to gain operational efficiencies.
  • The ML compute and storage cluster — which is the heart of the ML system — will vary based on learning method, learning application and need for automation.

Recommendations

To modernize your organization's business intelligence and analytics capabilities to support machine learning:
  • Update the data organization layer in end-to-end analytics architectures to support data preparation for ML algorithms.
  • Incorporate a development life cycle that supports learning models when the organization plans to aggressively build custom ML algorithms and applications.
  • Choose an ML platform that supports and interoperates with multiple ML frameworks when the organization plans to leverage service providers or commercial off-the-shelf solutions. As AI and ML gain momentum, more frameworks will be packaged with solutions and service providers.
  • Focus on storage and compute clusters to support machine learning capabilities. Choose the public cloud when you don't have the appropriate staff for engineering infrastructures for ML. The cloud is a great place for designing ML capabilities because of its elastic capabilities for scaling algorithms.

Analysis

The ability to autonomously learn and evolve as new data is introduced, without being explicitly programmed to do so, is the holy grail of business intelligence. That's what machine learning offers: a capability that accelerates data-driven insights and knowledge acquisition. Information is being collected and generated from more sources than ever before, yet many organizations don't have the resources to derive all the business value they could from this mountain of information. The capability to transform learned data into business insight and action, extremely rapidly, is a disruptive one that can provide any organization with a competitive edge.

What Is Machine Learning?

Learning From Data Without Being Explicitly Programmed

Machine learning is a technical discipline that aims to extract knowledge or patterns from a series of observations. It can be split into three major subdisciplines:
  • Supervised learning, where observations contain input/output pairs (aka labeled data): These sample pairs are used to "train" the machine learning system to recognize certain rules for correlating inputs to outputs. Examples include types of ML that are trained to recognize a shape based on a series of shapes in pictures.
  • Unsupervised learning, where those labels are omitted: In this form of ML, rather than being "trained" with sample data, the machine learning system finds structures and patterns in the data on its own. Examples include types of ML that recognize patterns in attributes from input data that can be used to make a prediction or classify an object.
  • Reinforcement learning, where evaluations are given about how good or bad a certain situation is: Examples include types of ML that enable computers to learn to play games or drive vehicles.
ML is a discipline that evolved from artificial intelligence, but it focuses more on cognitive learning capabilities. AI has many other aspects that attempt to model human function and intelligence (such as problem solving). However, ML is a subset technology specific to the use of data to simulate human learning. Once data is acquired and prepared for ML, and algorithms are selected, modeled and evaluated, the learning system proceeds through learning iterations on its own to uncover latent business value from data. Deep learning is a type of machine learning that is based on algorithms with extensive connections or layers between inputs and outputs.
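
To make the distinction between supervised and unsupervised learning concrete, the following is a minimal sketch using only the Python standard library. The data values, the labels "small" and "large", and the function names (train_centroids, predict, two_means) are illustrative assumptions, not part of the original research. The supervised half learns from labeled input/output pairs; the unsupervised half receives the same inputs without labels and finds two groups on its own.

```python
from statistics import mean

# Hypothetical labeled observations: (input value, output label).
labeled = [(1.0, "small"), (1.2, "small"), (0.8, "small"),
           (5.0, "large"), (5.5, "large"), (4.5, "large")]

def train_centroids(pairs):
    """Supervised: learn the mean input value for each label."""
    by_label = {}
    for x, label in pairs:
        by_label.setdefault(label, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}

def predict(centroids, x):
    """Assign x the label whose learned centroid is closest."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

def two_means(values, iterations=10):
    """Unsupervised: split unlabeled 1-D values into two clusters (k-means, k=2).

    Assumes the input holds at least two distinct values.
    """
    lo, hi = min(values), max(values)  # initial cluster centers
    for _ in range(iterations):
        group_lo = [v for v in values if abs(v - lo) <= abs(v - hi)]
        group_hi = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo, hi = mean(group_lo), mean(group_hi)
    return sorted(group_lo), sorted(group_hi)

centroids = train_centroids(labeled)
clusters = two_means([x for x, _ in labeled])
```

Here predict(centroids, 1.1) uses the labels seen during training, whereas two_means never sees a label and only reports which values belong together.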

What Business Trends and Benefits Are Driving Machine Learning?

Examples of How Machine Learning Can Deliver Value to Organizations

Machine Learning Can Also Provide Process Benefits for IT Organizations

Business Strengths and Challenges of Machine Learning

How Should IT Prepare for Machine Learning?

Learn the Stages of the Machine Learning Process


  • Classify the problem. Build your problem taxonomy that describes how to classify the problem or business question to solve.
  • Acquire data. Identify where the data exists to support the problem you're trying to solve. Data used in ML can come from a variety of sources, such as ERP systems, IoT edge devices or mainframe data. The data used may be structured (such as NoSQL database records) or unstructured (such as emails).
  • Process data. Identify how to prepare data for ML execution. Steps here include data transformation, normalization and cleansing, as well as the selection of training sets (for supervised learning).
  • Model the problem. Determine the ML algorithms to be used for training or clustering. A range of algorithms can be acquired and extended to suit different purposes.
  • Validate and execute. Validate results, determine the platform to execute models and algorithms, and then execute the ML routines. The execution process will likely comprise many cycles of running the ML routine and tuning and refining results.
  • Deploy. Finally, the output of the ML process is deployed to provide some form of business value. This value may come in the form of data that will inform decisions, feed applications or systems, or be stored for future analysis. Depending on the type of ML routines executed, the output may also take the form of new models or routines that may supplement existing systems or applications (such as predictive models). Whatever the form of the results, this phase entails determining where and how to deploy them for consumption and decision making.
Model development differs from traditional software development because of the requirements to monitor and tune ML models in short iterations.
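
The stages listed above can be sketched as a chain of small functions. This is an illustrative outline under stated assumptions, not a prescribed implementation: the data is synthetic, the model is a one-variable least-squares line, the processing stage only selects training and test sets (a real pipeline would also transform, normalize and cleanse), and every function name is hypothetical.

```python
import random
from statistics import mean

def acquire():
    """Acquire data: synthetic (x, y) records standing in for a real source."""
    rng = random.Random(0)
    return [(x, 3.0 * x + 2.0 + rng.uniform(-0.5, 0.5)) for x in range(20)]

def process(records, train_fraction=0.75):
    """Process data: shuffle deterministically, then select training and test sets."""
    rows = records[:]
    random.Random(1).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

def model(train):
    """Model the problem: fit y = slope * x + intercept by ordinary least squares."""
    mx = mean(x for x, _ in train)
    my = mean(y for _, y in train)
    slope = (sum((x - mx) * (y - my) for x, y in train)
             / sum((x - mx) ** 2 for x, _ in train))
    return slope, my - slope * mx

def validate(params, test):
    """Validate: mean absolute error on records held out from training."""
    slope, intercept = params
    return mean(abs(slope * x + intercept - y) for x, y in test)

def deploy(params):
    """Deploy: expose the fitted model as a callable for downstream consumers."""
    slope, intercept = params
    return lambda x: slope * x + intercept

train, test = process(acquire())
params = model(train)
error = validate(params, test)
predictor = deploy(params)
```

In practice the model and validate steps would be repeated many times with different settings, which is the short tuning iteration that distinguishes model development from traditional software development.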

Understand the Model Development Life Cycle Needed for Machine Learning

Understand the Basic Architecture Needed for Machine Learning

A Comprehensive End-to-End Architecture


Understand What Skills Will Be Needed for Machine Learning


Steps to Get Started With Machine Learning

Start by investigating the technology, identifying value opportunities and trying to launch your first ML solutions to gain experience and demonstrate value.

Learn About and Experiment With ML Concepts and Technology

The first important step is to learn as much as possible about ML technology and to begin experimenting with it to gain firsthand experience of how ML solutions operate. Recommended learning and experimentation steps include:
  • Participate in online courses. An abundance of online training is available. Good places to start include the "Machine Learning" course offered by ML pioneer Andrew Ng, as well as "Intro to Machine Learning" offered by Udacity, an online university.
  • Pick a simple algorithm to study. Your goal is to get a brief understanding of what an algorithm looks like. Many toolkits are available for users to experiment with. Scikit-learn, for example, is an excellent source for understanding ML models and algorithms using dynamic programming languages, such as Python.
  • Experiment with ML technology in the cloud. Try conducting an experiment in the cloud now by using a cloud-based offering such as Amazon Machine Learning or Microsoft Azure Machine Learning. To learn more about how ML algorithms behave and execute at runtime, monitor these experimental projects with a service such as Amazon CloudWatch.
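
As an example of a simple algorithm to study, the sketch below implements k-nearest neighbors classification with the Python standard library alone. It is an assumption-laden illustration: the training points, the labels "A" and "B", and the function name knn_predict are invented for this example, and a toolkit such as scikit-learn provides a tested version of the same idea.

```python
from collections import Counter
from math import dist

def knn_predict(training, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort labeled points by Euclidean distance to the query and keep the closest k.
    nearest = sorted(training, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labeled training data: ((feature_1, feature_2), label).
training = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
    ((6.0, 6.0), "B"), ((6.5, 7.0), "B"), ((7.0, 6.5), "B"),
]

near_a = knn_predict(training, (1.2, 1.4))
near_b = knn_predict(training, (6.8, 6.2))
```

The algorithm has no separate training step: all of the "learning" is the stored labeled data, which makes it a convenient first case for seeing how predictions depend on the examples supplied.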

Work Closely With Data Science Teams and Business Users to Identify a Use Case

Build a Use Case in the Cloud

Iteratively Expand Your ML Platform and Services Over Time

Recommendations

  • Build a taxonomy for classifying the problems or challenges to be solved by ML. The cheat sheet shown in Figure 5 offers a good template for applying this technique. ML algorithms can be overwhelming because there are many to choose from. Organizations often spend too much time debugging models that don't fit the data, business problem or challenge they are trying to address. Start by categorizing problems to narrow the set of candidate algorithms and to avoid overwhelming users.
  • Evaluate self-service platforms that support data preparation and applied machine learning. For example, C3 IoT (formerly C3 Energy) offers a platform product for self-service ML called C3 Ex Machina. This tool provides a designer interface that aids developers in building ML applications. C3 IoT also offers a significant ML toolkit for data science teams to explore.
    Note: There are a variety of ML platforms that support proprietary deep learning frameworks, but don't support common frameworks offered by the open-source community (such as Google TensorFlow, Caffe, Torch, Deeplearning4j and so on). Gartner recommends evaluating self-service ML platforms against their capability to interoperate with multiple deep learning frameworks.
  • Offer ML as a toolkit to data scientists rather than allowing them to build their own customized algorithms. There are extensive toolkits available, and they will likely support your use case or business challenge. Developing customized algorithms can be a nontrivial undertaking and can expand your architecture with unconventional integration to third-party tools. Gartner recommends offering toolkits to be exploited by data science teams to avoid potential integration challenges.
  • Use the public cloud to start your initiative because it can elastically scale to accommodate any requirement. Amazon, Microsoft, IBM, Google and many other cloud providers offer ML capabilities that can be leveraged to achieve self-service capabilities.
    However, Gartner recommends exploring the capability to interoperate with multiple ML frameworks and toolkits in order to design an open architecture.
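
A problem taxonomy of the kind recommended above can start as a small lookup from problem characteristics to a learning method. The sketch below is a hypothetical starting point, not the cheat sheet referenced in the text: the Problem fields, the function name classify_problem and the task-family names ("classification", "regression", "clustering", "policy learning") are assumptions chosen to mirror the three subdisciplines described earlier.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Problem:
    has_labels: bool                 # observations include known outputs
    has_reward_signal: bool          # feedback scores how good a situation is
    output_is_numeric: bool = False  # predicting a quantity rather than a category

def classify_problem(problem):
    """Map a problem description to (learning method, example task family)."""
    if problem.has_reward_signal:
        return ("reinforcement", "policy learning")
    if problem.has_labels:
        task = "regression" if problem.output_is_numeric else "classification"
        return ("supervised", task)
    return ("unsupervised", "clustering")

# Hypothetical business questions expressed as Problem records.
churn = classify_problem(Problem(has_labels=True, has_reward_signal=False))
segments = classify_problem(Problem(has_labels=False, has_reward_signal=False))
```

Categorizing each business question this way before selecting an algorithm keeps the list of candidates short and reduces time spent debugging models that do not fit the problem.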
