Tech Corner September 22, 2023
If you’ve explored using large language models (LLMs) to generate insights from your data, you’ve probably heard the words “fine-tuning.” News stories about large financial services firms like Morgan Stanley offering generative AI assistants to their staff often mention that they’ve fine-tuned the model using their own databases of hundreds of thousands of proprietary research reports and documents. That’s a big undertaking, one that requires technology, resources, and a team of data scientists. If you don’t have those resources, it might seem like getting accurate answers from an LLM is out of reach, but that’s not the case. Fine-tuning is not the only way to get relevant, validated, domain-specific responses out of a model. Alkymi’s team of expert data scientists explains an alternate route:
Large language models like OpenAI’s GPT models or Google’s PaLM 2 are trained on extensive datasets, primarily using publicly accessible data sources like public websites, Wikipedia, and digitized books, available at a specific point in time. That means that there are countless proprietary, secured datasets that these models have never seen before and are not aware of, like a pharma company’s proprietary trial data or a financial services firm’s secure database of investment data. If you were to ask a question about that data, the LLM would not know the answer.
Fine-tuning is a method of additional training to make a model aware of your dataset, so you can leverage it to perform tasks based on your data, like answering specific questions about items that only you might care about or want to focus on. Essentially, it’s taking a generalist model and training it to specialize in your specific dataset, so you can ask questions about your data and get the right answer. It’s a highly technical process that requires significant resources, including data science expertise, sufficient computing power and data storage, and access to a pre-trained LLM. While there are use cases for fine-tuning, there are also more cost-effective and scalable alternatives.
If you want to take advantage of generative AI in your workflows without the time and resources required for fine-tuning, an alternative that verticalized platforms like Alkymi can offer is a process called Retrieval Augmented Generation (RAG). Instead of changing the underlying model with fine-tuning, RAG provides the LLM with relevant information it can use to answer a question, at the moment the user asks it. Using semantic search and vector databases, the Alkymi platform can quickly identify the relevant passages in a document or dataset and surface them to the model. Your questions are then augmented with your domain-specific information, with no need to fine-tune the model itself.
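To make the retrieval step concrete, here is a minimal Python sketch of the idea. It is not Alkymi's implementation: the `embed`, `cosine`, and `retrieve` functions are hypothetical names, the sample passages are made up, and the word-count "embedding" is a deliberately simple stand-in for a real embedding model and vector database.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Made-up passages standing in for a customer's documents.
CHUNKS = [
    "Capital call notice: Fund A requests 250,000 USD due on 15 March.",
    "Brokerage statement: the account held 1,200 shares of ACME at quarter end.",
    "Equity research note: ACME is rated overweight with a 12-month target.",
]

top = retrieve("When is the capital call for Fund A due?", CHUNKS, k=1)
print(top[0])
```

With these sample passages, the capital call notice ranks first because it shares the most terms with the question. A production system would swap the word counts for dense embeddings so that passages can match on meaning rather than exact wording.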
For example, when our customers use Alpha to ask a question about their documents or data, an internal system prompt is generated around the user’s question, combining it with relevant content from their documents to give the LLM the context it needs. The LLM itself might not know anything internally about the set of capital calls, brokerage statements, or equity research reports the user is asking about. Using RAG, the LLM doesn’t need to be trained on that user’s dataset to find the answer; we can provide it with the information it needs to answer the question, when it needs it. With our platform providing the wider context from the data within the prompt, the LLM can find an accurate and traceable answer in seconds, without seeing the whole dataset. Retrieval Augmented Generation and our robust semantic search system enable our customers to experience all the benefits of LLMs to get insights and answers out of their data, without the cost and resources associated with fine-tuning.
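As a rough illustration of how a question and the retrieved content might be combined into one prompt, consider the Python sketch below. The `build_prompt` function and its wording are assumptions made for this example, not the actual system prompt used by Alpha, and the passages are invented.

```python
def build_prompt(question: str, passages: list[str]) -> str:
    """Combine retrieved passages and the user's question into one prompt.

    Hypothetical template: numbering the passages lets the model cite
    which one it relied on, which keeps answers traceable.
    """
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the context below. "
        "Cite the passage number you relied on. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


# Invented passages standing in for retrieved document content.
prompt = build_prompt(
    "When is the capital call due?",
    ["Capital call notice: due on 15 March.", "Brokerage statement for Q1."],
)
print(prompt)
```

The resulting string is what would be sent to the LLM. The model only ever sees the handful of passages selected for this question, not the whole dataset.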
In some cases it makes sense to use your resources to fine-tune an LLM: for example, if your company has an expansive knowledge base and needs not only to ask specific questions but also to have the generated responses imitate a particular style and form. For most firms looking to integrate generative AI into their workflows, options like Retrieval Augmented Generation and the opportunity to choose a model will enable you to explore how LLM-powered technology can unlock your data, without sacrificing security or performance.