Data Action Layer August 27, 2019

Alkymi CTO on lessons learned implementing active learning

by Alkymi

Over 11,000 people attended the Amazon Web Services (AWS) Summit in New York and had the opportunity to learn how Alkymi uses active learning to enable self-automation. Alkymi’s CTO Steven She walked the audience through how Alkymi leverages AWS for machine learning in his talk titled "ML on AWS + Kubernetes."

How Alkymi uses active learning for document understanding

In the financial services industry, data often arrives as documents sent via email. A business analyst's typical workflow includes sifting through those documents to locate critical information, then copying and pasting it into a structured data store such as Excel. This is an arduous, error-prone process, and the result is data that either goes unused or may be wrong.

Alkymi uses computer vision to identify the basic components of the document, such as tables and charts, and uses Natural Language Processing (NLP) and machine learning to understand the intent of the text and summarize its content. The computer vision step works much like the perception system of a self-driving car: it applies labels to elements of the page, such as charts, tables, and paragraphs.

Next, a selection model employs active learning to understand what the user is looking for and interactively trains the product to extract that data. The system identifies low-confidence documents and sends them to an annotation queue, where they are presented to the user for labeling. That information is retained using SageMaker. Alkymi uses SageMaker on Kubernetes, allowing for real-time training, deployment, and prediction loops between the user and the model. Within seconds, the newly learned information is applied to the remaining unlabeled data, and the user benefits immediately by saving time. The result is a more accurate model that requires less labeled data.
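The routing step described above can be illustrated with a minimal sketch. It is not Alkymi's implementation: the `route_by_confidence` function, the 0.8 threshold, the document ids, and the label names are all hypothetical, chosen only to show how low-confidence predictions might be sent to an annotation queue while confident ones are accepted automatically.

```python
from typing import Dict, List, Tuple

# Hypothetical cutoff: predictions below this confidence go to a human.
CONFIDENCE_THRESHOLD = 0.8


def route_by_confidence(
    predictions: Dict[str, Tuple[str, float]],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> Tuple[Dict[str, str], List[str]]:
    """Split predictions into auto-accepted labels and an annotation queue.

    `predictions` maps a document id to (predicted_label, confidence).
    The queue is ordered least-confident first, so the user labels the
    documents the model is most unsure about before any others.
    """
    accepted: Dict[str, str] = {}
    uncertain: List[Tuple[float, str]] = []
    for doc_id, (label, confidence) in predictions.items():
        if confidence >= threshold:
            accepted[doc_id] = label
        else:
            uncertain.append((confidence, doc_id))
    uncertain.sort()  # ascending confidence
    return accepted, [doc_id for _, doc_id in uncertain]


# Illustrative model output; ids, labels, and scores are made up.
predictions = {
    "doc-1": ("capital_call", 0.97),
    "doc-2": ("distribution", 0.55),
    "doc-3": ("capital_call", 0.62),
    "doc-4": ("statement", 0.91),
}

accepted, annotation_queue = route_by_confidence(predictions)
print(accepted)          # labels the model is confident about
print(annotation_queue)  # documents to show the user, least confident first
```

In a full loop, the labels collected from the queue would be added to the training set, the model retrained, and the remaining unlabeled documents scored again, so each round of labeling targets the examples that are most informative.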

From the presentation abstract:

The process of automating workflows with machine learning models often requires a significant amount of labeled data. Acquiring this data can be a costly and time-consuming process. Active learning is a type of machine learning that reduces the amount of labeled data required by allowing the model to select which examples will be labeled. In this talk, we describe the challenges and solutions Alkymi has encountered while implementing active learning on AWS using Kubernetes and SageMaker.

A replay of the talk is available.

Follow Alkymi to learn more about self-automation and active learning, or get in touch for a live demo.
