ThreatNG Open-Source Governance and Compliance Dataset Project
Shatter the Black Box: Reclaim Contextual Certainty in Your Digital Risk Strategy
https://github.com/ThreatNGSecurity
You have invested millions in security ratings and dedicated thousands of team hours to manual questionnaires, yet you are still "flying blind" when the board asks for irrefutable proof of your supply chain's resilience. This is the Contextual Certainty Deficit, a state of chronic anxiety in which your career and organization rely on Black Box scores that you cannot see or verify. The ThreatNG Open-Source Governance and Compliance Dataset Project was founded to end this era of opacity. By democratizing primary-source intelligence mapping everything from ESG Violations to the Conversational Attack Surface hidden in job postings, we provide the Legal-Grade Attribution required to move beyond "checkbox" compliance and reclaim control of your digital narrative.
The ThreatNG Open-Source Governance and Compliance Dataset
ESG Filings: The Reputation Shield
Reclaim confidence in your partners' ethical claims by bypassing inconsistent third-party ratings with primary-source evidence. This repository provides direct links to official Sustainability and CSR filings, ensuring your due diligence is backed by verifiable disclosures rather than guesswork.
ESG Violations: The Risk Truth Serum
Identify systemic governance failures and "insider threat" red flags before they manifest as a catastrophic breach. This "Truth Serum" dataset indexes documented transgressions and associated fines to provide the Legal-Grade Attribution needed to hold partners accountable for their historical behavior.
Ethics & Governance: Peering into Shadow Governance
Peer into the "Shadow Governance" of any partner to ensure their operational DNA aligns with your organization's risk appetite. This dataset maps internal ethics policies and leadership structures, transforming corporate transparency into a verifiable competitive advantage.
Privacy Policies: The Hidden Defensive Blueprint
Stop ignoring legal noise and start uncovering the hidden blueprint of how your sensitive data is actually handled and shared. This repository centralizes complex privacy documents, allowing you to track historical changes and audit the external journey of your information.
Join the Movement. Contribute to the mission on GitHub.
Compliance & Trust: The Burnout Antidote
Reclaim your team’s sanity by ending the estimated 15,000-hour annual "Compliance Scavenger Hunt" for vendor trust centers and certifications. This community-driven Source of Truth provides instant, self-service access to validated security portals and compliance links, eliminating manual spreadsheet management.
Careers & Jobs Pages: Mapping the Adversary’s Roadmap
Audit the "Conversational Attack Surface" by uncovering internal tech stacks and architectural shifts inadvertently leaked in job postings. This dataset democratizes reconnaissance, allowing you to identify exploitable "Pivot Points" in your supply chain before an adversary weaponizes them.
Terms of Service: Identifying the Legal Trapdoor
Expose the "Legal Trapdoors" hidden in the fine print that could subject your intellectual property to third-party AI or subprocessor exploitation. This repository indexes Terms of Service to reveal the operational realities of data processing rights and third-party dependencies often buried in legal jargon.
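Tracking historical changes in a Privacy Policy or Terms of Service document, as described above, comes down to comparing two saved snapshots of the same text. The following is a minimal sketch using only the Python standard library; the snapshot strings and the function name `policy_changes` are hypothetical and are not part of the project's repositories.

```python
import difflib

def policy_changes(old_text: str, new_text: str) -> list[str]:
    """Return the removed ('- ') and added ('+ ') lines between two snapshots."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff also emits unchanged ('  ') and hint ('? ') lines; keep only real changes.
    return [line for line in diff if line.startswith(("+ ", "- "))]

# Hypothetical snapshots of one policy captured at two points in time.
old_snapshot = """We share data with service providers.
We retain data for 12 months."""
new_snapshot = """We share data with service providers and AI partners.
We retain data for 12 months."""

changes = policy_changes(old_snapshot, new_snapshot)
for change in changes:
    print(change)
```

In this illustration, the only reported change is the sharing clause that now mentions AI partners, which is the kind of wording shift a reviewer would want to flag.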
Reclaiming the Digital Narrative: A Blueprint for Contextual Certainty
The ThreatNG Open-Source Governance and Compliance Dataset Project is the tangible engine of our "Transparency Movement." It represents a fundamental refusal to let critical risk data be hoarded behind proprietary paywalls or obscured by "Black Box" algorithms. By mapping the digital and regulatory DNA of global organizations, we provide the primary-source fuel every defender needs to transition from "guessing" to "knowing." This initiative bolsters our mission by providing the global community with the Contextual Certainty required to master the adversary’s narrative.
Attaining Contextual Certainty
True security isn't found in a numeric score; it’s found in the evidence. This dataset peels back the layers of an organization’s digital footprint, moving beyond technical indicators to reveal the ethical, governance, and compliance realities that define true resilience. It replaces the chronic anxiety of "flying blind" with a deep, primary-source understanding of your entire external ecosystem, which includes ESG filings and the shadow governance gaps that traditional scanners miss.
Mastering the Adversary’s Narrative
Don't wait for a vendor's "secret sauce" score to drop to know your supply chain is vulnerable. We empower your team to see what the attacker sees by auditing the Conversational Attack Surface in job postings and examining architectural shifts that are hidden in Terms of Service updates. This dataset provides the raw fuel for DarChain analysis, giving you the "Truth Serum" needed to identify Pivot Points and neutralize threats before they mature into a crisis.
Legal-Grade Risk Attribution
Managing risk in silos is a recipe for career-ending disaster. By unifying technical exposures with decisive business context, such as documented ESG Violations and U.S. State Data-Breach records, we provide the irrefutable evidence needed to defend your strategy at the board level. This holistic view transforms fragmented alerts into Legal-Grade Attribution, bridging the "Crisis of Context" and allowing you to justify security investments with absolute proof.
Architecting the Future of Open Governance
By providing the industry's most comprehensive open repository, we are setting a new standard for how risk data is communicated. This dataset serves as the reference implementation for a transparent digital future, catalyzing the creation of next-generation tools that prioritize Primary-Source Verification (PSV) over proprietary guesswork. We are not just building a dataset; we are building the infrastructure for a more resilient and accountable global community.
Fueling the Defenders’ Alliance
Adversaries collaborate and share data at no cost; we believe defenders must do the same. By making this critical data a public utility, ThreatNG is breaking the "private monopoly" on risk intelligence and ending the era of "Shame-as-a-Service." This initiative fosters a global "neighborhood watch" in which collective intelligence is a shared asset, ensuring that high-fidelity intelligence is never a luxury reserved for the few.
The Ultimate Antidote to Burnout
We recognize the crushing psychological weight of manual GRC "scavenger hunts" and the exhaustion of "vendor hell." This project is designed to eliminate the "Hidden Tax on the SOC" by providing instant, self-service access to the Compliance & Trust data your team currently spends an estimated 15,000+ hours a year chasing. We are here to give you back your time, your sanity, and your professional confidence.
Reclaiming Clarity: Transforming Digital Risk into Contextual Certainty
The cybersecurity landscape has evolved from a battle of technical exploits to a crisis of context. Traditional security efforts are often paralyzed by the "Black Box" nature of proprietary ratings and the exhaustive "Compliance Scavenger Hunt" for vendor data.
The ThreatNG Open-Source Governance and Compliance Dataset Project, founded by ThreatNG Security and maintained by DarcSight Labs, provides the primary-source "Source of Truth" required to move beyond guesswork. By leveraging Primary-Source Verification (PSV), organizations can now attain the Contextual Certainty and Legal-Grade Attribution needed to defend their digital narrative.
-
The Problem: Managing "Machine Ghosts" and misconfigured cloud assets that leak sensitive data across the decentralized edge.
How the Dataset Enhances Effort: Rather than relying on technical scans alone, the project uses Terms of Service (ToS) and Privacy Policies to reveal the legal and operational realities of cloud providers. By indexing Career/Jobs Pages, defenders can audit the internal tech stacks of their cloud vendors, identifying legacy systems and "Pivot Points" that traditional scanners miss.
-
The Problem: The "Hidden Tax on the SOC": spending an estimated 15,000+ hours annually chasing vendors for trust centers and certifications.
How the Dataset Enhances Effort: The Compliance & Trust repository serves as a centralized, community-driven "Truth Serum." It provides instant, self-service access to vendor security portals and compliance links, eliminating the spreadsheet madness of manual assessments and giving practitioners back the time needed to stop real breaches.
-
The Problem: Relying on "claims-based" security from vendors who may be hiding a history of instability or breach velocity.
How the Dataset Enhances Effort: The U.S. State Data Breach List and ESG Violations repositories provide irrefutable evidence of a vendor's historical performance. By correlating technical vulnerabilities with these primary-source records, risk managers can achieve Legal-Grade Attribution, justifying security decisions to the Board with binary truth rather than subjective scores.
-
The Problem: Adversaries mine "conversational" data to predict an organization's next move while defenders focus only on unpatched ports.
How the Dataset Enhances Effort: This initiative democratizes reconnaissance by mapping the Conversational Attack Surface. By auditing Career/Jobs Pages and Security Advisories, defenders can view their own technology stack through the eyes of an attacker. This provides the raw fuel for DarChain logic, allowing teams to disrupt the adversary’s narrative before a crisis matures.
-
The Problem: "Greenwashing" and the inconsistency of commercial ESG ratings, which show a low correlation of just 0.54 with one another.
How the Dataset Enhances Effort: The ESG Filings and Violations repositories move organizations beyond the "Reputation Trap." By providing direct links to primary CSR reports and documented fines, the project empowers users to assess a partner’s true commitment to ethical governance based on verifiable primary data rather than a vendor’s "secret sauce."
-
The Problem: Rising personal liability for CISOs and the "Crisis of Context" during rapid corporate acquisitions.
How the Dataset Enhances Effort: During M&A activities, the project’s Ethics & Governance and SEC Filing repositories provide a blueprint of an acquisition's digital DNA. This allows leaders to identify systemic governance gaps and "Shadow Governance" issues early, ensuring they have the Contextual Certainty needed to safeguard their careers and their organizations.
Democratizing Resilience: How the Global Ecosystem Reclaims Control
The ThreatNG Open-Source Governance and Compliance Dataset Project is a public utility designed to foster a more resilient global ecosystem. By providing a shared "Source of Truth," it empowers organizations of all sizes to move from reactive anxiety to proactive Contextual Certainty.
Enterprise Resilience (Public & Private): Both publicly traded and private enterprises gain the Legal-Grade Attribution required to defend their digital narratives. It allows public companies to validate ESG disclosures with primary-source evidence—bolstering investor trust—while enabling private companies to benchmark their "Digital DNA" against peers without the "Hidden Tax" of proprietary ratings.
The Cyber-Governance Commons (Public Sector & Non-Profit): Government agencies and non-profits often manage critical infrastructure with limited resources. This initiative democratizes high-level OSINT, giving these vital entities the same visibility as Fortune 500 companies. It provides a baseline for regulatory oversight and safeguards mission-critical data by closing the "Visibility Gap" in the deep supply chain.
The Innovation Engine (Researchers, Ethical Hackers, & Vendors): Academic institutions and security researchers gain access to a standardized repository of real-world organizational data to study systemic risk. For security vendors, this data catalyzes the development of next-generation tools that prioritize transparency over "Black Box" algorithms, elevating the collective security posture of the entire community.
The Antidote to Burnout: Empowering the Modern Defender
Managing risk in the modern era takes a profound psychological toll. This dataset is the ultimate antidote to the industry’s "Burnout Epidemic," providing the clarity and efficiency needed to reclaim professional confidence.
The Accountable CISO: In a world where CISOs face rising personal liability, this dataset provides the "Professional Insurance Policy" you need. It equips you with the primary-source evidence required to stand before the Board and regulators with absolute certainty, ensuring your career isn't tied to a "Black Box" score you can't defend.
The Strategic Risk Manager: End the "Spreadsheet Madness." This repository allows Risk Managers to move from manual "Compliance Scavenger Hunts" to Primary-Source Verification (PSV). By having instant, self-service access to vendor Compliance & Trust links, you give your team back their time and eliminate the "Hidden Tax on the SOC."
The Tactical Analyst & Penetration Tester: Stop chasing ghosts and start mastering the adversary’s narrative. By auditing the Conversational Attack Surface through job pages and archived web records, security professionals can identify Pivot Points and systemic gaps that traditional technical scanners miss. This dataset provides the high-fidelity intelligence needed to prioritize remediation and break the kill chain.
Architecting Industry-Wide Trust: Specialized Value for High-Stakes Sectors
Different industries face unique trust gaps. This initiative provides the specialized, verifiable data required to navigate specific regulatory and reputational minefields.
Regulated Infrastructure (Finance & Healthcare): For sectors where integrity and patient/client privacy are paramount, this dataset offers a "Truth Serum." It allows financial and healthcare entities to instantly verify a partner’s compliance with frameworks such as SOC 2 or HIPAA through primary documentation, ensuring the supply chain is as secure as the core infrastructure.
The Digital Supply Chain (Technology & SaaS): As SaaS adoption grows, so does concentration risk. This dataset enables tech buyers to peel back the layers of a partner’s Terms of Service and Privacy Policies to uncover the "Shadow Governance" of third-party AI and subprocessors. It turns vendor transparency from a sales pitch into a verifiable operational asset.
Consumer Integrity (Retail & Media): In an era of "greenwashing," consumers and regulators demand proof. The ESG Filings and Violations repositories enable retail and media brands to protect their reputations by ensuring their partners' ethical statements align with their verifiable behavior. This moves industry-wide alignment from a "checkbox" exercise to a strategic commitment to accountability.
Living the Manifesto: Security Centric, Not Security Exclusive
The mantra "Security Centric, Not Security Exclusive" is more than a slogan; it is ThreatNG’s declaration of independence from the era of opaque risk management. By open-sourcing these Governance and Compliance datasets, we are not just sharing data; we are fueling a Transparency Movement that returns control to the defenders.
"Security Centric" is our commitment to Contextual Certainty. We believe that security must be the foundational layer of every digital operation, but it cannot be built on guesswork. By providing these datasets openly, ThreatNG offers the industry a "Truth Serum": A primary-source intelligence that moves the global community from "claims-based" security to Primary-Source Verification (PSV). This approach acknowledges that a more secure digital environment is a shared human necessity, and providing the "Digital DNA" required to defend it is our collective responsibility.
"Not Security Exclusive" is our call to Shatter the Black Box. For too long, critical risk data has been hoarded behind proprietary paywalls, creating an "Attribution Chasm" that leaves CISOs vulnerable. We believe that while the tools used to analyze risk can be proprietary, the evidence of that risk should be publicly available. By making these repositories accessible worldwide, ThreatNG invites a global alliance of researchers, risk managers, and analysts to collaborate against a common enemy. We provide the Legal-Grade Attribution every organization needs to defend its narrative, ensuring that high-fidelity intelligence is never a luxury reserved only for the few.
ThreatNG’s initiative to open-source these datasets is a practical manifestation of this manifesto. It is our way of eliminating the "Hidden Tax on the SOC" by replacing manual "Compliance Scavenger Hunts" with a unified, transparent, and community-driven Source of Truth. This move benefits our users and elevates global cybersecurity standards, providing every stakeholder with the clarity, confidence, and certainty required to counter the adversary’s narrative.
Reclaim Your Certainty. Explore the repository on GitHub.
The Transparency Movement: Frequently Asked Questions
-
It is a community-driven initiative, founded by ThreatNG Security, that provides a centralized, open-access repository of primary-source links mapping the digital and regulatory DNA of global organizations. Guided by the mantra "Security Centric, Not Security Exclusive," the project democratizes high-level risk data—including ESG filings, compliance links, and privacy policies—to ensure every defender has the intelligence needed to protect their organization, regardless of budget.
-
Most commercial vendors offer "Black Box" scores based on proprietary, opaque algorithms that often lack verifiable evidence. This project replaces "secret sauce" with Primary-Source Verification (PSV). Instead of a subjective "B" grade, you get the actual link to a vendor’s SOC 2 page, their documented ESG violations, or their specific data use terms. It moves the industry from "claims-based" security to observed evidence.
-
You can access this directly through our ESG Violations Repository on GitHub. This dataset provides direct links to documented transgressions and fines related to competition, consumer protection, employment, and environmental offenses. By using these primary sources, you bypass the inconsistency of commercial ESG ratings, which show a low correlation of just 0.54 with one another, and gain the transparency needed to hold partners accountable.
-
The project maintains a dedicated U.S. State Data Breach List Repository. This resource aggregates fragmented reporting information from various states into a single, queryable location. It is an essential tool for risk managers to track a vendor's "Breach Velocity" and historical security performance using verifiable state records.
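One way to put a number on "Breach Velocity" is to count a vendor's reported breaches per calendar year. The following is a minimal sketch under that assumption: the record fields (`organization`, `state`, `reported`) and the sample entries are hypothetical, and the repository's actual layout may differ.

```python
from collections import Counter
from datetime import date

# Hypothetical records; the repository's real schema and field names may differ.
records = [
    {"organization": "Example Vendor", "state": "CA", "reported": date(2022, 3, 14)},
    {"organization": "Example Vendor", "state": "WA", "reported": date(2023, 1, 9)},
    {"organization": "Example Vendor", "state": "CA", "reported": date(2023, 8, 21)},
    {"organization": "Other Corp", "state": "TX", "reported": date(2023, 5, 2)},
]

def breaches_per_year(records, organization):
    """Count reported breaches per calendar year for one organization."""
    counts = Counter(
        r["reported"].year
        for r in records
        if r["organization"].casefold() == organization.casefold()  # case-insensitive match
    )
    return dict(sorted(counts.items()))

velocity = breaches_per_year(records, "example vendor")
print(velocity)
```

A rising count from one year to the next is a prompt to go back to the underlying state records. It is not a verdict on its own.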
-
The ThreatNG Open-Source Governance and Compliance Dataset is the most comprehensive resource for mapping an organization’s "Digital DNA." It goes beyond simple technical scans by indexing nine distinct repositories, including Terms of Service, Privacy Policies, and Career/Jobs pages. This breadth allows practitioners to see the full Conversational Attack Surface and the regulatory footprint that traditional scanners miss.
-
Yes. Adversaries routinely use them to identify an organization's internal technology stack and upcoming architectural shifts. Our Career/Jobs Pages Repository democratizes this reconnaissance for defenders. By monitoring these listings, you can discover "Shadow IT" or identify legacy systems (like an old VPN) that serve as Pivot Points for an attacker, even if the vendor claims to be fully compliant.
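Monitoring listings for technology signals can be as simple as matching posting text against a watchlist of terms. The following is a minimal sketch of that idea. The watchlist labels, the patterns, and the sample posting are all illustrative assumptions, not content drawn from the repository.

```python
import re

# Hypothetical watchlist; the labels and patterns are illustrative only.
WATCHLIST = {
    "legacy VPN": r"\b(?:pptp|l2tp)\b",
    "cloud platform": r"\b(?:aws|azure|gcp)\b",
    "end-of-life OS": r"\bwindows server 2008\b",
}

def scan_posting(text: str) -> dict[str, list[str]]:
    """Map each watchlist label to the distinct terms found in a job posting."""
    hits = {}
    for label, pattern in WATCHLIST.items():
        # Lowercase and de-duplicate matches so repeated mentions count once.
        found = sorted({m.lower() for m in re.findall(pattern, text, flags=re.IGNORECASE)})
        if found:
            hits[label] = found
    return hits

# Hypothetical posting text.
posting = (
    "Seeking a network engineer to maintain our PPTP remote access gateway "
    "and support migration of Windows Server 2008 hosts to Azure."
)
findings = scan_posting(posting)
print(findings)
```

Keyword matches like these are leads for an analyst to verify. A posting may describe a system being retired rather than one still in production.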
-
You provide Legal-Grade Attribution by using the primary-source data within our repositories to correlate technical findings with decisive business context. By showing the Board a direct link to a vendor’s unpatched security advisory or a conflicting privacy policy, you resolve the Contextual Certainty Deficit. This transforms an ambiguous security alert into an irrefutable, decision-ready mandate for remediation.
-
Burnout is often driven by the anxiety of "flying blind" and the exhaustion of manual "Compliance Scavenger Hunts." Security teams spend an estimated 15,000+ hours a year chasing vendor trust data and filling out repetitive questionnaires. This project provides instant, self-service access to the Compliance & Trust data you need, giving your team back their time and the relief of Contextual Certainty.
-
Yes. While ThreatNG Security founded and maintains the project, and incorporates this data into its advanced solutions, the dataset itself is a separate, purely open-source public utility. It is free for use by anyone in the global security community, provided proper attribution is given.
-
We welcome contributions via Pull Requests on our GitHub organization. Our OSINT experts at DarcSight Labs review and merge updates on a bi-weekly cycle. By contributing, you help build a global "neighborhood watch" that makes the entire digital ecosystem more resilient against adversarial collaboration.

