AI Companies at Risk: Study Reveals Widespread Secret Leaks on GitHub

Two-Thirds of Leading AI Companies Leaking Secrets on GitHub, Report Finds
A new study from cloud security firm Wiz has revealed that 65% of leading private AI companies on the Forbes AI 50 list have leaked sensitive secrets including API keys, tokens, and credentials on GitHub, collectively putting over $400 billion in company valuations at risk.
The findings highlight how the rapid pace of AI development is outstripping security measures, even among specialized companies tasked with safeguarding valuable intellectual property. Despite their technical sophistication, many AI firms appear to be neglecting fundamental security practices in their rush to innovate and collaborate openly.
The scope of AI security vulnerabilities
Wiz researchers Shay Berkovich and Rami McCarthy employed an advanced scanning methodology called the "Depth, Perimeter, and Coverage" framework to uncover these leaks. Their approach went beyond standard security scans by examining commit histories, deleted forks, gists, and contributors' personal repositories.
"By analyzing deeper layers of developer activity," the report notes, "we found credentials embedded in places that traditional tools never check."
The investigation discovered that even companies with minimal public GitHub presence had leaked sensitive information. Among the most frequently exposed credentials were API keys from platforms central to AI development, including:
- Weights & Biases
- Hugging Face
- ElevenLabs
These exposed keys could potentially give unauthorized parties access to private data, model weights, or inference endpoints—essentially the crown jewels of AI companies' intellectual property.
Interestingly, the size of a company's GitHub footprint didn't necessarily correlate with security risks. One company with no public repositories and only 14 organization members still leaked sensitive information, while another with 60 public repositories maintained perfect security hygiene, which researchers attributed to stronger security discipline and automated secret management.
Organizations should consider implementing a comprehensive cybersecurity risk assessment framework to identify potential vulnerabilities in their development practices before they lead to serious data exposures.
Security debt in the AI ecosystem
The research revealed concerning gaps in security governance among AI startups. When Wiz attempted to notify affected companies about their exposed secrets, nearly half of their responsible disclosure attempts went unanswered—suggesting many AI companies lack formal vulnerability response processes.
"Secrets management is still treated as an afterthought, even by companies whose entire business depends on safeguarding data and algorithms," the researchers concluded.
Randolph Barr, CISO at Cequence Security, characterized the findings as the predictable result of "hyper-speed AI development colliding with long-standing security debt."
"The majority of these exposures stem from traditional weaknesses such as misconfigurations, unpatched dependencies, and exposed API keys in developer repositories," Barr explained. "What's changed is the scale and impact. In AI environments, a single leaked key doesn't just expose infrastructure; it can unlock private training data, model weights, or inference endpoints—the intellectual property that defines a company's competitive advantage."
This situation mirrors the early days of cloud adoption, when developers inadvertently exposed AWS keys and S3 buckets. However, in the AI context, the stakes are potentially much higher, with the entire foundation of a company's competitive advantage at risk.
Historical context and lessons learned
Looking at the history of technology adoption, this pattern of security lagging behind innovation is unfortunately common. When cloud computing first gained prominence, similar security oversights led to massive data breaches. The lesson seems clear: new technology paradigms require updated security approaches implemented from the beginning, not added retrospectively.
Companies developing AI solutions should learn from these historical patterns by implementing robust application security strategies that account for the unique challenges of AI development environments.
New dimensions to familiar problems
Security experts emphasize that while these vulnerabilities aren't new in concept, they take on greater significance in AI environments. The exposed credentials represent just one facet of what Barr calls "AI-native" risks—including model and data poisoning, prompt injection, and autonomous agents chaining together API calls with minimal human oversight.
Shane Barney, CISO at Keeper Security, highlighted the growing challenge of managing machine-based credentials at scale: "Each of these credentials represents an access pathway that, if left unsecured, can expose sensitive systems or data. As organizations adopt AI and cloud-native development, the number of non-human accounts and automated processes continues to rise. These machine identities are critical—but they often exist outside traditional identity and access management frameworks."
The explosion of machine identities in AI development creates a blind spot for many organizations. These non-human credentials often proliferate outside traditional identity governance frameworks, yet they can provide the same level of access as human accounts.
Machine identity management challenges
A particularly concerning aspect of this problem is that machine identities—the digital credentials used by automated processes, containers, and services—often outnumber human users by factors of 10 or more in modern cloud environments. Each of these identities requires proper authentication, authorization, and auditing controls, yet many organizations lack visibility into the full scope of their machine identity landscape.
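As a rough illustration of what closing that visibility gap might look like, the hypothetical sketch below records each machine identity's owner and granted scopes and flags identities that are unowned or broader than least privilege. The field names and checks are assumptions for illustration, not an established schema.

```python
"""Hypothetical machine-identity inventory sketch; fields and checks are
illustrative assumptions, not a standard."""
from dataclasses import dataclass

@dataclass
class MachineIdentity:
    name: str            # e.g. a CI service account, deploy key, or bot token
    owner: str | None    # team accountable for the credential, if anyone
    scopes: list[str]    # permissions granted to the credential

def audit(inventory: list[MachineIdentity]) -> list[str]:
    """Return human-readable findings for unowned or over-privileged identities."""
    findings = []
    for ident in inventory:
        if ident.owner is None:
            findings.append(f"{ident.name}: no accountable owner")
        if "*" in ident.scopes or "admin" in ident.scopes:
            findings.append(f"{ident.name}: scope broader than least privilege")
    return findings

if __name__ == "__main__":
    sample = [
        MachineIdentity("ci-deploy-bot", "platform-team", ["deploy"]),
        MachineIdentity("legacy-etl-key", None, ["admin"]),
    ]
    for finding in audit(sample):
        print("[!]", finding)
```

Even a simple inventory like this forces the basic governance questions — who owns this credential, and why does it have this much access — that machine identities usually escape.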
As AI systems become more autonomous and interconnected, implementing comprehensive data protection strategies becomes essential for safeguarding both the models themselves and the sensitive information they process.
Building security into development pipelines
Jason Soroko, Senior Fellow at Sectigo, stressed that avoiding leaks requires engineering secure defaults, not luck: "If a company with many public repositories can avoid leaks, the lesson is not luck, but investment in plumbing that makes the safe path the fast path."
Security experts recommend several approaches to mitigate these risks:
- Default to secretless authentication with short-lived tokens and workload identity
- Implement pre-commit scanning for sensitive information (see the sketch after this list)
- Deploy canary keys that alert on use to track how quickly leaks are detected
- Treat machine-to-machine credentials with the same rigor as human ones
- Combine Privileged Access Management (PAM) with enterprise secrets management
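As a deliberately minimal example of the pre-commit scanning recommendation above, the sketch below could be wired in as a local git pre-commit hook: it inspects only the staged diff and aborts the commit if anything resembling a credential appears. The patterns and hook wiring are assumptions; in practice teams typically rely on dedicated scanners such as gitleaks or GitHub's built-in secret scanning.

```python
#!/usr/bin/env python3
"""Minimal pre-commit secret check (sketch). Save as .git/hooks/pre-commit
and make it executable; a non-zero exit blocks the commit."""
import re
import subprocess
import sys

# Illustrative patterns; dedicated scanners ship far more complete rule sets.
SUSPICIOUS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                              # AWS access key ID
    re.compile(r"-----BEGIN (?:RSA|EC|OPENSSH) PRIVATE KEY-----"),
    re.compile(r"(?i)(api[_-]?key|secret|token)\s*[:=]\s*['\"][^'\"]{16,}['\"]"),
]

def staged_diff() -> str:
    """Return the diff of what is about to be committed."""
    return subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout

def main() -> int:
    added_lines = [l for l in staged_diff().splitlines()
                   if l.startswith("+") and not l.startswith("+++")]
    for line in added_lines:
        for pattern in SUSPICIOUS:
            if pattern.search(line):
                print(f"Possible secret in staged change: {line.strip()[:60]}",
                      file=sys.stderr)
                return 1  # non-zero exit blocks the commit
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

The point of catching the secret before the commit exists is that nothing ever needs to be scrubbed from history, forks, or mirrors afterward.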
Soroko also warned that even after key rotation, model artifacts may still contain old credentials: "Rotation fixes the immediate door lock, yet the old combination can live on inside models and evaluation artifacts for months."
Practical implementation steps
For organizations looking to improve their security posture around AI development, several concrete steps can make an immediate difference:
- Implement GitOps security tools that scan not just current code but historical commits and branches for sensitive information
- Establish automated key rotation policies with expiration dates for all credentials (a minimal enforcement sketch follows this list)
- Create a comprehensive machine identity inventory to track all non-human credentials across your ecosystem
- Employ least-privilege access models for both human and machine identities
- Institute mandatory security reviews before code merges to main branches
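A rotation policy is only useful if something enforces it. The sketch below is a hypothetical scheduled job that flags credentials past a maximum age; the 90-day window, the metadata shape, and the rotate_credential() hook are all assumptions to be adapted to whatever secrets manager a team actually uses.

```python
"""Hypothetical rotation-policy enforcement sketch. The metadata format,
the 90-day maximum age, and rotate_credential() are illustrative
assumptions -- adapt them to your actual secrets manager."""
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=90)

# In practice this metadata would come from a secrets manager's API;
# it is inlined here to keep the sketch self-contained.
CREDENTIALS = [
    {"name": "model-registry-token", "created": "2025-08-01T00:00:00+00:00"},
    {"name": "inference-api-key",    "created": "2024-11-15T00:00:00+00:00"},
]

def rotate_credential(name: str) -> None:
    """Placeholder: call your secrets manager's rotation API here."""
    print(f"[rotate] requesting new credential for {name}")

def enforce_rotation(now: datetime) -> None:
    """Flag and rotate any credential older than the policy allows."""
    for cred in CREDENTIALS:
        age = now - datetime.fromisoformat(cred["created"])
        if age > MAX_AGE:
            print(f"[!] {cred['name']} is {age.days} days old "
                  f"(limit {MAX_AGE.days}); rotating")
            rotate_credential(cred["name"])

if __name__ == "__main__":
    enforce_rotation(datetime.now(timezone.utc))
```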
These practices should be embedded into CI/CD pipelines rather than treated as optional steps in the development process. According to a 2023 GitHub security report, organizations that integrate automated security scanning into their pipelines detect and remediate vulnerabilities 26 times faster than those relying on manual processes.
Practical implications for businesses
The Wiz report offers several key lessons for organizations developing or implementing AI systems:
First, security must evolve as rapidly as the systems it protects. As AI development accelerates, security controls need to be automated and integrated into the development pipeline rather than applied as an afterthought.
Second, traditional security practices remain essential but must be adapted for AI's unique characteristics. Basic hygiene like credential management takes on heightened importance when those credentials protect not just infrastructure, but intellectual property and competitive advantage.
Finally, the report highlights the need for formal vulnerability response processes. The fact that nearly half of disclosure attempts went unanswered suggests many AI companies lack mature security operations—a significant liability as these firms grow in value and importance.
For businesses leveraging these AI platforms, the research underscores the importance of conducting thorough security assessments before entrusting sensitive data to third-party AI services. Just as the "move fast and break things" ethos of early social media development eventually gave way to more responsible practices, the AI industry appears to be experiencing growing pains that will require similar maturation.
The bottom line: AI innovation is currently outpacing security controls. The solution isn't necessarily to slow innovation but to automate defense mechanisms that can keep pace with development—continuous secret scanning, runtime credential management, and governance built directly into development pipelines.