How AutoML Is Simplifying Data Science
Originally Posted: Indiaai
A recent NASSCOM survey found that 60 percent of enterprise executives believe that investments in AI are a priority, and 45 percent want to use the technology for strategic decision-making, but only 20 percent believe they have done so successfully. The report also highlights challenges with respect to shortage of talent, complex workflows, data quality, unexplainable AI black-box models and lack of business expertise within data science teams.
Automated Machine Learning, or AutoML, is an exciting new development in the way companies apply data science within their business workflows: it uses AI to automate the time-consuming aspects of ML applications.
Quite simply, AutoML puts the power of machine learning in the hands of everyone - from CXOs to data experts. Now, anyone within the organisation can run complex data science models. It creates a new class of citizen data scientists who can build advanced ML models with support from automation at each step of the workflow.
The current challenges with ML workflows
Currently, users have to select and test individual ML models on their data, and fine-tune them tediously in order to select and deploy the best performing models. This makes data science difficult for functional experts to understand, test and develop by themselves.
ML currently involves many steps, such as raw data ingestion, data cleaning, feature selection and construction, parameter optimization and tuning, and so on - and requires a lot of manual programming. Machine learning analysis can also be extremely complex, and what we need right now is smarter optimization techniques.
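To make the manual effort concrete, here is a minimal sketch of the first three of those steps (ingestion, cleaning and feature selection) written by hand in plain Python. The dataset, the column names and the function names are all invented for illustration, and the cleaning and selection rules are deliberately simplistic assumptions rather than recommended practice.

```python
import csv
import io
import statistics

# Hypothetical raw data: one row has a missing income, and region_code never varies.
RAW_CSV = """age,income,region_code,bought
34,52000,1,1
45,,1,0
29,48000,1,0
51,91000,1,1
38,67000,1,1
"""

def ingest(text):
    """Raw data ingestion: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def clean(rows):
    """Data cleaning: drop rows with any empty field, convert the rest to float."""
    cleaned = []
    for row in rows:
        if all(value.strip() for value in row.values()):
            cleaned.append({name: float(value) for name, value in row.items()})
    return cleaned

def select_features(rows, target):
    """Feature selection: keep only non-target columns whose values vary."""
    names = [name for name in rows[0] if name != target]
    return [
        name for name in names
        if statistics.pvariance([row[name] for row in rows]) > 0
    ]

rows = clean(ingest(RAW_CSV))
features = select_features(rows, target="bought")
print(len(rows), features)  # prints: 4 ['age', 'income']
```

Even in this toy form, every rule (what counts as missing, which columns are worth keeping) is a decision someone has to code and maintain by hand, which is the overhead the article is describing.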
So, how does AutoML fit here?
AutoML helps automate as many of these steps as possible without compromising the accuracy of the results. It automates the data workflow end to end by integrating with ML algorithms and systematically comparing different models, while giving the user transparency into the results for predictive decision making. AutoML takes advantage of the strengths of both humans and computers, and helps with data identification, data preparation, feature engineering, pre-processing, human-friendly insights, easy deployment, model management and monitoring.
Essentially, it is a productivity tool: it frees data scientists to focus on the creative aspects of the data science process, such as deciding how to frame a data science problem, how to incorporate domain knowledge, how to interpret results and how to communicate those results to their team.
In the future, it will only make data science jobs more accessible. As the demand for analysis increases, the demand for AutoML will increase too, because businesses will become ever hungrier for data. Data scientists will still be needed to frame the problem, interpret results, and apply models effectively and correctly. That said, experts will need to be better educated and trained - upskilling will become paramount to stay ahead of changing times.
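The idea of "systematically comparing different models" can be sketched in a few lines. The example below is a toy stand-in, not how any particular AutoML product works: it scores a handful of hand-written candidate models with the same cross-validation routine and returns a leaderboard. The candidate models, the synthetic data and every function name here are assumptions made for illustration.

```python
import statistics

def fit_mean(xs, ys):
    """Baseline candidate: always predict the training mean."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Candidate: ordinary least squares on a single feature."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def make_knn(k):
    """Candidate factory: k-nearest-neighbour regression, with k as the tuned parameter."""
    def fit(xs, ys):
        pairs = list(zip(xs, ys))
        def predict(x):
            nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
            return statistics.fmean([y for _, y in nearest])
        return predict
    return fit

def cross_val_mse(fit, xs, ys, folds=4):
    """Mean squared error over `folds` interleaved held-out slices."""
    errors = []
    for fold in range(folds):
        test_idx = set(range(fold, len(xs), folds))
        train_x = [x for i, x in enumerate(xs) if i not in test_idx]
        train_y = [y for i, y in enumerate(ys) if i not in test_idx]
        predict = fit(train_x, train_y)
        errors.extend((predict(xs[i]) - ys[i]) ** 2 for i in sorted(test_idx))
    return statistics.fmean(errors)

def auto_select(candidates, xs, ys):
    """Score every candidate identically; return (name, mse) pairs, best first."""
    scores = {name: cross_val_mse(fit, xs, ys) for name, fit in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1])

# Synthetic data: a straight line plus a small deterministic wobble.
xs = [float(i) for i in range(12)]
ys = [2.0 * x + 1.0 + (0.3 if i % 2 else -0.3) for i, x in enumerate(xs)]

candidates = {
    "mean": fit_mean,
    "linear": fit_line,
    "knn_k1": make_knn(1),
    "knn_k3": make_knn(3),
}
leaderboard = auto_select(candidates, xs, ys)
for name, mse in leaderboard:
    print(f"{name}: {mse:.3f}")
```

Because every candidate goes through the same scoring function, the leaderboard is directly comparable and the user can see why one model was preferred, which is the kind of transparency the paragraph above refers to. Real tools search far larger spaces of models and settings, but the loop has the same shape.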
Popular AutoML tools and platforms
What do AutoML tools look like? There is a host of tools out there - from open source tools and research prototypes to commercial tools - that can help automate some or all parts of the machine learning pipeline. TPOT, devol and H2O.ai AutoML are examples of open source tools, which largely help with configuring the ML pipeline, deep learning architecture search and basic data preparation for the ML algorithms.
Some of the commercial tools, in comparison, have much simpler and more seamless interfaces. Examples include Google AutoML; H2O.ai Driverless AI, which provides better feature construction; and DataRobot, whose web-based interface eliminates the reliance on manual workflows and which supports external open-source algorithms and 24x7 availability in the cloud, giving users the power of AI to drive better business outcomes.
Future of AutoML
The era of manual scripting for ML is reaching a critical point - the field is constantly changing and evolving. In the coming years, we will see AutoML handle even more aspects of the data cleaning process, scale to larger datasets than it does now, and vastly improve deep learning. Going forward, AutoML as a practice will transform data science as we know it, as it will continue to free data experts to focus on posing the right questions, collecting and curating the right data, and thinking like a data scientist.