Large Language Models (LLMs) and generative AI, such as GPT-based engines, have become enormously popular in the AI field. Individuals and companies alike are excited about this new technology. However, as its usage expands, we must also pay attention to the security risks that come with it, particularly around open-source LLMs. Rezilion, a well-known automated software supply chain security platform, conducted research on this very topic, and its findings are surprising.
To conduct its research, Rezilion used the OpenSSF Scorecard, a security assessment tool for open-source projects. It evaluates a repository across a set of checks, such as known vulnerabilities, maintenance activity, and the presence of binary files, with the goal of ensuring that projects adhere to security best practices and industry standards. The scorecard assigns each check a risk level and an ordinal score between 0 and 10.
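To make that assessment concrete, below is a minimal sketch of how a repository's Scorecard results could be pulled programmatically. It assumes the Scorecard project's public REST API endpoint and its JSON layout (an aggregate score plus per-check scores); the repository name example-org/example-llm is a placeholder, not a project from the study.

```python
import requests

def fetch_scorecard(owner: str, repo: str) -> dict:
    # Assumed endpoint of the public OpenSSF Scorecard REST API;
    # verify the path and response fields against the current docs.
    url = f"https://api.securityscorecards.dev/projects/github.com/{owner}/{repo}"
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    # "example-org/example-llm" is a hypothetical repository name.
    result = fetch_scorecard("example-org", "example-llm")
    print(f"Aggregate score: {result['score']}/10")
    for check in result.get("checks", []):
        # Each check (e.g. Vulnerabilities, Maintained, Binary-Artifacts)
        # carries its own 0-10 score and a short reason string.
        print(f"  {check['name']}: {check['score']}")
```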
The research revealed that most open-source LLMs and related projects carry significant security concerns, which fall into four main areas: trust boundary risk, data management risk, inherent model risk, and basic security best practices. Trust boundary risks involve issues such as inadequate sandboxing, unauthorized code execution, and insufficient access controls. Data management risks include data leakage and training data poisoning. Inherent model risks arise from limitations in the underlying ML model, such as inadequate AI alignment and overreliance on LLM-generated content. The fourth category covers lapses in basic security best practices, such as improper error handling and insufficient access controls.
What’s concerning is how low these projects scored. On average, the checked projects received only 4.6 out of 10 for security, with an average age of 3.77 months and an average of 15,909 GitHub stars. Projects that gain popularity quickly are at higher risk than those developed over a longer period.
Rezilion not only highlighted the current security issues but also offered suggestions for mitigating these risks and improving the projects' long-term security. It is crucial to administer security protocols properly, identify weak points, and implement the necessary changes to maintain a secure environment. By conducting thorough risk assessments and implementing robust security measures, organizations can leverage the power of open-source LLMs while protecting sensitive information.
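One simple way such a risk assessment could be automated is by gating candidate dependencies on their Scorecard score. The sketch below is illustrative only: the project list and the 6.0 threshold are hypothetical policy choices, not figures from the report, and it reuses the fetch_scorecard helper sketched earlier.

```python
# Illustrative policy gate: flag open-source LLM dependencies whose
# aggregate Scorecard score falls below an assumed organizational threshold.
MIN_ACCEPTABLE_SCORE = 6.0  # hypothetical policy value, not from the study

# Placeholder repositories; replace with the projects under evaluation.
projects = [("example-org", "example-llm"), ("another-org", "llm-toolkit")]

for owner, repo in projects:
    report = fetch_scorecard(owner, repo)
    if report["score"] < MIN_ACCEPTABLE_SCORE:
        print(f"REVIEW: {owner}/{repo} scored {report['score']}/10 - "
              f"inspect failing checks before adoption")
    else:
        print(f"OK: {owner}/{repo} scored {report['score']}/10")
```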
References:
– Dark Reading: “Open Source LLM Project Insecure, Risky to Use”
– Rezilion: “Explaining the Risk: Exploring the Large Language Models Open-Source Security Landscape”
About the Author:
Anant is a computer science engineer and data scientist with experience in finance and AI products. He is passionate about developing AI-powered solutions that solve everyday problems in an impactful and efficient manner.