AI safeguarding tips for large language models

Generative AI has taken the world by storm in the past six months, and concerns about safeguarding have been raised alongside the breathless hype around the technology. Even OpenAI, the creator of ChatGPT, has made its dedication to AI safety very clear.

But it isn’t just the developers of these tools who must ensure user safety when interacting with Large Language Models (LLMs). Given the occasionally unpredictable nature of LLMs, those of us integrating this cutting-edge technology into our businesses need to know how to safeguard effectively while the rough edges are sanded off.

Safeguarding against AI hallucinations

One of the most talked-about issues with generative AI and LLMs is the possibility of “hallucinations.” These occur when the AI responds tangentially to the initial prompt (the request given to the model), producing irrelevant or factually incorrect information. This lack of control, especially with such a new technology, is sure to concern anyone wishing to implement an LLM in their business. But rest assured: there are many ways to mitigate the risks.

Indeed, the best way, in my experience, to train a new LLM for a customer-facing role is to treat it as if you were onboarding a new employee. When you take on an employee to represent your company, you give up some control. Yet with the proper training and onboarding, you can ensure the new team member is prepared to give your customers factual answers to their queries.

When a new salesperson or customer service agent joins a company, there’s a lot they have to learn: the company’s values, the products or services it sells, its internal guidelines, and more. The same goes for an LLM powering a chatbot on your store. Out of the box, a custom LLM instance will have only a generic understanding of ecommerce and no knowledge of your company. Thankfully, you can teach it.

An illustration explaining how to train your LLM.

Training your LLM

Training your LLM on this data is vital for AI safeguarding: the more knowledge it has of these subjects, the less likely it is to stray from them. Once it knows your FAQs, branding guidelines, and store inventory, it has a stable “source of truth” to draw its answers from, rather than the entirety of the internet, as the base model does. It therefore becomes far less likely that your custom instance will deviate from the source of truth you have provided.

You can do this by:

  • Customizing the tone of voice to align with your brand’s
  • Uploading your business policies, catalog, and FAQs into the LLM or connecting it to your help center platform
  • Integrating it into your ecommerce platform so that it has live knowledge of information such as product availability; Shopify, for example, is an excellent source of structured store information for an LLM (a minimal sketch of this grounding approach follows below)
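
To make this concrete, here is a minimal sketch of what grounding a chatbot in a “source of truth” can look like, assuming the openai Python SDK. The file names, model name, and the inventory dictionary standing in for a live Shopify feed are all hypothetical placeholders, not a description of any particular platform’s implementation.

```python
from pathlib import Path

from openai import OpenAI  # assumes the openai Python SDK (v1+)

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical company documents; in production these might come from
# your help center or ecommerce platform rather than local files.
FAQS = Path("faqs.md").read_text()
POLICIES = Path("policies.md").read_text()


def build_system_prompt(inventory: dict[str, int]) -> str:
    """Assemble the 'source of truth' the model must answer from."""
    stock = "\n".join(f"- {item}: {qty} in stock" for item, qty in inventory.items())
    return (
        "You are a friendly support agent for our store. Answer ONLY from "
        "the reference material below. If the answer is not covered, say "
        "you don't know and offer to connect the customer with a human.\n\n"
        f"## FAQs\n{FAQS}\n\n## Policies\n{POLICIES}\n\n## Live inventory\n{stock}"
    )


def answer(question: str, inventory: dict[str, int]) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; use whichever model you deploy
        temperature=0,  # a low temperature keeps answers close to the source
        messages=[
            {"role": "system", "content": build_system_prompt(inventory)},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content


print(answer("Do you have the trail runners in stock?", {"Trail Runner 2": 14}))
```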

Of course, in some instances you don’t want the LLM to paraphrase specific copy, such as legal terms. In those cases, the bot sends the text verbatim, like giving an agent a script to follow. But unique responses that react to what the customer has written are usually preferable to canned answers; customers appreciate a more human, personalized experience, after all.
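
As a simple illustration, routing can be a check for “scripted” topics before falling back to generation. The topics and scripted texts below are hypothetical, keyword matching stands in for a real intent classifier, and the fallback reuses the answer() helper from the sketch above.

```python
# Verbatim routing: scripted topics bypass the LLM entirely, so approved
# copy (e.g. legal terms) is never paraphrased.
LEGAL_SCRIPTS = {
    "refund policy": "Refunds are accepted within 30 days of purchase ...",
    "privacy policy": "We process personal data in accordance with ...",
}


def respond(message: str, inventory: dict[str, int]) -> str:
    lowered = message.lower()
    for topic, script in LEGAL_SCRIPTS.items():
        if topic in lowered:
            return script  # sent word for word, like an agent's script
    # Everything else gets a personalized, generated reply.
    return answer(message, inventory)  # answer() from the sketch above
```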

These generated responses are only helpful if they align with your internal policies and brand identity. Once the LLM is trained, the next step is to test the model to ensure it consistently answers factually. In the same way that you wouldn’t give a new team member a handful of training sessions and never check in with them again, you should run regular test scenarios to audit your LLM.
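
One lightweight way to do this is a recurring suite of test prompts, each paired with facts the answer must contain. The sketch below is a simplified illustration: the scenarios are hypothetical, substring checks stand in for whatever grading your QA process actually uses, and it reuses the answer() helper from the grounding sketch.

```python
# A recurring audit: fixed questions paired with facts the answer must
# contain. Run after every retraining or knowledge-base update.
TEST_SCENARIOS = [
    ("How long do I have to return an item?", ["30 days"]),
    ("Do you have the trail runners in stock?", ["trail runner"]),
]


def audit(inventory: dict[str, int]) -> None:
    passed = 0
    for question, required_facts in TEST_SCENARIOS:
        reply = answer(question, inventory).lower()
        missing = [fact for fact in required_facts if fact not in reply]
        if missing:
            print(f"FAIL: {question!r} is missing {missing}")
        else:
            passed += 1
    print(f"{passed}/{len(TEST_SCENARIOS)} scenarios passed")
```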

A flow chart showing how to test your bot for inconsistencies as part of AI safeguarding.

Ensuring GDPR compliance and data safeguarding with AI

Another significant AI safeguarding issue is GDPR compliance. The base LLM is a third-party service, after all, to which you will constantly be sending data back and forth. However, your customer is ultimately conversing with you, not with OpenAI or whichever provider you choose.

That’s why, at Certainly, our LLM integration anonymizes all information that isn’t crucial to a smooth conversational flow. For instance, email addresses are detected and sanitized before anything is sent to the LLM; the model itself only ever receives the placeholder “<EMAIL_ADDRESS>.” As such, no sensitive customer data leaves your tech ecosystem. This is part of our wider commitment to keeping your data, and your customers’ data, secure.
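
Here is a stripped-down sketch of the general pattern (not Certainly’s actual implementation): a regular expression detects email addresses and swaps them for the placeholder before the text leaves your systems, while the originals stay local. A real pipeline would cover further PII types such as names, phone numbers, and order IDs.

```python
import re

# Matches most everyday email addresses; intentionally simple for the sketch.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def anonymize(text: str) -> tuple[str, list[str]]:
    """Return sanitized text plus the original values, which stay local."""
    originals = EMAIL_RE.findall(text)
    return EMAIL_RE.sub("<EMAIL_ADDRESS>", text), originals


sanitized, originals = anonymize("Send my invoice to jane.doe@example.com, please.")
print(sanitized)   # Send my invoice to <EMAIL_ADDRESS>, please.
print(originals)   # ['jane.doe@example.com'] -- never sent to the LLM
```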

The process of anonymizing data between the Certainly platform and the LLM, a key aspect of AI safeguarding.

LLMs are the future. We are ready.

Large Language Models, whether GPT, LLaMA, Bard, or any other, will become a core technology for most industries in the near future. So, we need to ensure that we’re using them in ways that are safe for our businesses and customers.

This is something we’re deeply aware of at Certainly. We’re working hard to provide solutions that allow our customers to use this new technology safely and effectively. To learn more about what we’re doing with LLMs, take a look at our recent series on OpenAI.

Michael Larsen & Fergus Doyle wrote this article with visuals by Vital Sinkevich.
