If you are worried about sharing private data, you really can’t beat running the model on your own computer. There are open source and open weight models that you can run right from your command line. Ollama makes it easy on a Mac. I haven’t looked into options for other OSs. Once you have the software, you don’t even need an internet connection.
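For anyone who wants to see roughly what that looks like, here is a minimal sketch of talking to a locally running Ollama server from Python. It assumes Ollama is installed and serving on its default local port (11434) and that a model has already been pulled. The model name `llama3.2` is only a placeholder, and the function names `build_request` and `ask_local_model` are mine, not part of Ollama.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3.2"):
    """Build a POST request for the local Ollama server (nothing is sent yet)."""
    payload = {
        "model": model,    # placeholder; use whichever model you have pulled
        "prompt": prompt,
        "stream": False,   # ask for one JSON reply instead of a token stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt, model="llama3.2"):
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `ask_local_model("Summarize my notes: ...")` only talks to `localhost`, so the prompt never leaves the machine. That is the whole privacy argument in one line.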
I don’t think what you describe exists. I’m a software engineer, but not in this field, so please take that as a weak signal at best.
Yes, I'm assuming a locally run open weight model will be useful, but ultimately not sufficient for very complex tasks. I hope that something of the sort I describe can and will exist before too much regulation and optimized monetization occur.
I don't work in cyber security, so others will have to teach me.
I'm interested in the question of how AI systems can become private: how to make communications with an AI system as protected as the confessional. Some AI capabilities are throttled not for public interest reasons but because if those private conversations became public, the company would suffer reputational damage.
I'm not libertarian enough to mind that AI companies don't allow certain unsavory conversations to occur, but I do think they could be more permissive if there were less risk of blowback.
A lot of high-value uses of AIs are impossible without data security for the inputs and outputs. Sensitive financial information, state secrets, health data: this isn't information you can just hand over to an AI company, no matter the promise of security.
Similarly, a lot of individuals are going to want to cordon off certain parts of their life, including their own mental health.
The obvious answer is to have locally hosted AI. However, even vast improvements in data cleaning and learning algorithms are unlikely to get locally hosted models to acceptably high performance.
You could start out with your local host, send an encrypted file, and receive an encrypted file back from a huge network-hosted model. But I don't see how that model could work with the encrypted file, since it was not trained on that kind of input. There's no point in sending the key along with it.
Or is there?
If there is an encryption and decryption layer in the AI system for the inputs and the outputs, an AI service could probably use zero-knowledge proofs (or something else) to help create trust that it does not have a method to read your messages. At the very least, this would help with blocking out third parties.
But I don't know enough about software architecture to say how to create an audit that would show the AI company did not have access to the unencrypted input or output.
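To make the encrypted-file dilemma above concrete, here is a toy sketch. It uses a one-time pad (XOR with a random key of the same length) purely for illustration. It is not what a real service would use, and `remote_model` is a hypothetical stand-in for the hosted model, which here only checks whether it can find a word in whatever it is given.

```python
import secrets


def encrypt(plaintext: bytes):
    """One-time pad: XOR the message with a fresh random key of equal length."""
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return ciphertext, key


def decrypt(ciphertext: bytes, key: bytes) -> bytes:
    """XOR with the same key recovers the original message."""
    return bytes(c ^ k for c, k in zip(ciphertext, key))


def remote_model(data: bytes) -> bool:
    """Hypothetical hosted model: it can only act on content it can read."""
    return b"diagnosis" in data


message = b"Patient notes: the diagnosis is still pending."
ciphertext, key = encrypt(message)

# Without the key, the service sees only random-looking bytes.
sees_content_encrypted = remote_model(ciphertext)

# With the key, the service can do its job, but it can also read everything.
sees_content_with_key = remote_model(decrypt(ciphertext, key))
```

With the key withheld, the model has nothing meaningful to work on. With the key sent along, the encryption protects the message in transit but not from the service itself. That is the gap an audit or proof scheme would have to close.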