Playing with Fire – Enterprise-Grade ChatGPT Integration – A Compliance Nightmare?
In my previous post I discussed how easy it is, at least at a technical level, to integrate ChatGPT and other LLM providers into your stack using the Marvin LLM Framework for Python.
So, after a profitable morning of coffee and coding, I’m happily sending automated requests from my server to ChatGPT when some bright spark in the office asks, “Is that actually a safe thing to do?”
Thankfully, I had done my homework before even considering piping juicy nuggets of corporate data to an external service that some claim may be the downfall of humanity, so I had something of a reasoned answer.
The use case I was working on was to feed ChatGPT excerpts of contractual documents (redacting all personally identifiable information, of course!) and then to automate a series of questions about the content in order to extract structured data for further automated processing. Technically, this process works surprisingly well. However, from a SecOps perspective there is the obvious risk of ChatGPT taking umbrage at my interactions, doxing me and creating a whole world of pain. Whilst that might not spell the downfall of humanity, it could spell the end of my business!
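To make the extraction step concrete, here’s a minimal sketch of the sort of pipeline I mean, built on Marvin’s `@ai_model` decorator over a Pydantic model. The `ContractTerms` fields and the sample excerpt are purely illustrative, not from any real document, and the decorator usage follows Marvin 1.x as I used it, so do check the docs for your release.

```python
from datetime import date
from typing import Optional

from pydantic import BaseModel
from marvin import ai_model


@ai_model
class ContractTerms(BaseModel):
    """Structured data to extract from a contract excerpt.

    Fields here are illustrative; define whatever your downstream
    processing actually needs.
    """
    counterparty: str
    start_date: Optional[date]
    term_months: Optional[int]
    notice_period_days: Optional[int]
    auto_renews: Optional[bool]


# The excerpt would already have been redacted of PII upstream.
excerpt = (
    "This Agreement between [REDACTED] and Acme Widgets Ltd commences on "
    "1 June 2023 for an initial term of 24 months, renewing automatically "
    "unless either party gives 90 days' written notice."
)

# One LLM call; Marvin returns a validated ContractTerms instance.
terms = ContractTerms(excerpt)
print(terms.counterparty, terms.term_months, terms.auto_renews)
```

The nice part is that the model doubles as validation: if ChatGPT returns something that doesn’t fit the schema, Pydantic complains rather than letting junk flow into your automated processing.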
Through careful poring over the OpenAI Privacy Policy, it is clear that the primary risk of data leakage is that OpenAI stores all user prompt submissions for use in further training of the model. In my case the prompt submissions contain fragments of sensitive commercial information. That could really backfire! I could imagine the next incremental version of ChatGPT being able to wax lyrical about the intimate details of Encircle’s commercial arrangements. That would really not do! Thankfully, OpenAI’s privacy policy provides a clear link to the OpenAI opt-out form, which allows a user to opt out of the user prompt collection. This mitigates the risk, up to a point…
With my compliance hat on, I’m still not happy about sending out potentially sensitive information to a disruptive start-up that, I would imagine, works on a “move fast, break things” principle. What we really need is a single-tenancy ChatGPT instance with a stable release schedule, ideally allowing multiple environments on which we can test and stage releases before letting them loose on our precious data.
Thankfully, there is such an offering: look no further than Microsoft’s Azure OpenAI Service. Azure OpenAI offers the requisite data collection opt-out facility and provides single-tenancy ChatGPT instances on a competitive pricing model.
The real beauty of the Azure OpenAI service is that you can create multiple ChatGPT instances for development, test, staging and live environments, each bound to a specific ChatGPT release version. Each instance can be secured using Azure’s best-in-class identity and access management and security frameworks. The icing on the cake is that by using the Marvin LLM Framework we were up and running with Azure OpenAI without changing any code. The future is looking bright!
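For the curious, the “without changing any code” claim boils down to configuration. The sketch below shows the idea using environment variables; the exact setting names follow Marvin 1.x conventions as I understand them and may differ between releases, so treat them as assumptions and consult the Marvin settings reference for your version.

```python
import os

# Point Marvin at an Azure OpenAI deployment instead of api.openai.com.
# NOTE: these variable names are assumptions based on Marvin 1.x
# conventions; verify them against the settings docs for your release.
os.environ["MARVIN_LLM_MODEL"] = "azure_openai/gpt-35-turbo"
os.environ["MARVIN_AZURE_OPENAI_API_KEY"] = "<key from the Azure portal>"
os.environ["MARVIN_AZURE_OPENAI_API_BASE"] = "https://my-instance.openai.azure.com"
os.environ["MARVIN_AZURE_OPENAI_DEPLOYMENT_NAME"] = "my-gpt35-deployment"
os.environ["MARVIN_AZURE_OPENAI_API_VERSION"] = "2023-05-15"

import marvin  # settings are read at import time, so set the env vars first

# Everything downstream, including the ContractTerms model above, now
# runs against our single-tenancy Azure instance, unchanged.
```

Because each environment simply points at a different Azure deployment, promoting code from development through staging to live never touches the application itself.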
Full disclosure: we have yet to pipe any real sensitive information into ChatGPT, as we are naturally proceeding with caution. However, we are much less fearful about adopting this technology and are really excited by the potential to enhance our internal business processes in this way.
I was going to write my next article about why ChatGPT won’t end humanity just yet due to it having the memory of a goldfish, seriously diminishing its ability to work with large data sets. However, I need to ruminate on that a little further, as OpenAI have just announced that they have massively extended the GPT-4 context size (ChatGPT’s memory of the previous conversation).
If you need any help or simply want a confidential discussion concerning integrating ChatGPT and/or other LLM providers into your enterprise, please feel free to drop us a line using our contact page.