Data Action Layer August 27, 2019

Alkymi CTO on lessons learned implementing active learning

by Alkymi


More than 11,000 people attended the Amazon Web Services (AWS) Summit in New York, where they had the opportunity to learn how Alkymi uses active learning to enable self-automation. In his talk, "ML on AWS + Kubernetes," Alkymi’s CTO Steven She walked the audience through how Alkymi uses AWS for machine learning.

How Alkymi uses active learning for document understanding

In the financial services industry, data often arrives as documents sent via email. A business analyst’s typical workflow includes sifting through those documents to locate critical information, then copying and pasting it into a structured data store such as Excel. This is an arduous, risky process that results in data either going unused or being entered incorrectly.

Alkymi uses computer vision to identify the basic components of a document, such as tables and charts, and natural language processing (NLP) and machine learning to understand the intent of the text and summarize its content. The computer vision step works much as it does in a self-driving car: it applies labels to elements of the document, such as charts, tables, and paragraphs, so that each component can be recognized.

Next, a selection model employs active learning to understand what the user is looking for and interactively trains the product to extract that data. The system identifies low-confidence documents and sends them to an annotation queue, where they are presented to the user for labeling. That information is retained using SageMaker. Alkymi uses SageMaker on Kubernetes, allowing for real-time training, deployment, and prediction loops between the user and the model. Within seconds, the newly learned information is applied to the remaining unlabeled data, and the user benefits immediately by saving time. The result is a more accurate model that requires less labeled data.
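To make the routing step concrete, here is a minimal sketch of confidence-based selection, one common form of active learning often called uncertainty sampling. It is not Alkymi’s implementation and does not use SageMaker or Kubernetes. The Prediction class, the split_by_confidence function, the 0.8 threshold, the document IDs, and the labels are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Prediction:
    doc_id: str        # hypothetical document identifier
    label: str         # the model's top predicted label
    confidence: float  # model's score for that label, in [0, 1]


def split_by_confidence(
    predictions: List[Prediction], threshold: float = 0.8
) -> Tuple[List[Prediction], List[Prediction]]:
    """Split predictions into (auto-accepted, annotation queue).

    Predictions below the threshold go to the queue, ordered so the
    least confident document is shown to the user first.
    """
    accepted = [p for p in predictions if p.confidence >= threshold]
    queue = sorted(
        (p for p in predictions if p.confidence < threshold),
        key=lambda p: p.confidence,
    )
    return accepted, queue


# Illustrative model output; these values are made up for the example.
predictions = [
    Prediction("doc-1", "capital_call", 0.97),
    Prediction("doc-2", "distribution", 0.55),
    Prediction("doc-3", "capital_call", 0.62),
    Prediction("doc-4", "statement", 0.91),
]

accepted, queue = split_by_confidence(predictions, threshold=0.8)
print("auto-accepted:", [p.doc_id for p in accepted])
print("needs labeling:", [p.doc_id for p in queue])
```

In a full loop, the labels the user supplies for the queued documents would be added to the training set, the model retrained, and the remaining unlabeled documents scored again. The threshold trades off user effort against the risk of accepting a wrong extraction.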

From the presentation abstract:

The process of automating workflows with machine learning models often requires a significant amount of labeled data. Acquiring this data can be a costly and time-consuming process. Active learning is a type of machine learning that reduces the amount of labeled data required by allowing the model to select which examples will be labeled. In this talk, we describe the challenges and solutions Alkymi has encountered while implementing active learning on AWS using Kubernetes and SageMaker.

Watch the replay below or here.

Follow Alkymi to learn more about self-automation, active learning, or get in touch for a live demo.
