The Data Broker Accountability Gap: Why AI Has Outpaced Oversight, and What Organizations Must Do Now
A nearly $300 billion industry is being turbocharged by AI. Its regulatory infrastructure is fragmented, politically contested, and years behind the technology. This is far more than a compliance problem.
There is a class of economic actors that knows more about you than your closest friends, operates almost entirely outside your awareness, and faces fewer meaningful constraints on how it uses that knowledge than virtually any comparably powerful industry. Data brokers—companies that collect, aggregate, and sell personal information about consumers to third parties—have existed in some form since the 1960s. What has changed, with the broad deployment of AI and machine learning, is the scale, sophistication, and opacity of what they can do with that information. The result is a structural accountability gap that is growing faster than the regulatory frameworks designed to close it.
For senior leaders, this is not a peripheral compliance matter. It’s a strategic exposure point touching vendor relationships, reputational risk, employee data, customer trust, and an increasingly volatile regulatory environment that is shifting—simultaneously and in opposite directions—at the state and federal levels.
The Signal: A Market Turbocharged by AI, Governed by Patchwork
The data broker market was valued at approximately $294 billion in 2025 and is projected to reach nearly $450 billion by 2031, growing at roughly 7.3% annually. The market’s growth trajectory is inseparable from AI: machine learning algorithms allow brokers to process datasets at speeds and scales that would have been computationally infeasible a decade ago, to build predictive behavioral profiles from fragmented data points, and to cross-reference disparate datasets to generate inferences no single data source could support.
The industry includes what market analysts estimate to be up to 5,000 brokers globally, ranging from multinational firms like Experian, TransUnion, and Acxiom, which reported year-on-year revenue growth of 8-12% in recent periods, to smaller specialized players operating in health data, financial intelligence, and location analytics. What unites them is a business model premised on the insight that consumer data is more valuable in aggregate than in any individual transaction, and that buyers—from advertisers and insurers to employers and government agencies—will pay substantial premiums for predictive behavioral intelligence.
AI has qualitatively transformed this model in three ways that deserve strategic attention. First, it has expanded the inferential range of brokered data: AI systems can now generate reliable predictions about income, health status, political views, and creditworthiness from data that is not, on its face, sensitive, such as purchase histories, location patterns, and device behavior. Second, it has collapsed the practical distinction between “public” and “private” information, since machine learning can reconstruct private attributes from publicly available signals. Third, it has automated the assembly of detailed individual profiles at a scale that makes meaningful human oversight of data flows structurally impossible under current frameworks.
The Analysis: A Fractured Regulatory Landscape
The Federal Vacuum
The most significant recent development in data broker regulation is, paradoxically, a non-event. In December 2024, the CFPB proposed an ambitious rule that would have extended the Fair Credit Reporting Act to cover data brokers, treating those that sell certain sensitive consumer data as consumer reporting agencies with attendant accuracy, access, and consent obligations. The proposal would have curtailed the sale of Social Security numbers, financial data, and sensitive personal information and restricted use to explicitly “permissible purposes,” essentially eliminating data-for-marketing as a business model for brokers handling this category of data.
In May 2025, the CFPB withdrew the rule entirely, citing statutory authority concerns and questions raised by commenters. The withdrawal reflects the current administration’s broader posture toward deregulation. The practical consequence is that, at the federal level, the primary regulatory tool for data brokers remains the FTC’s authority under Section 5 of the FTC Act to challenge unfair or deceptive practices, a meaningful but fundamentally reactive instrument that cannot systematically govern an industry this large.
The State Mosaic
In the absence of federal action, states have moved aggressively and inconsistently. California’s Delete Act requires data brokers to register with the California Privacy Protection Agency and to honor centralized consumer deletion requests, while separate CPPA regulations approved in July 2025 impose phased cybersecurity audit requirements and risk assessment obligations on businesses handling California consumer data, with large enterprises (annual revenue over $100 million) facing the first compliance deadlines for 2027 audits by April 2028. Vermont, Oregon, and Texas have enacted their own data broker registration and transparency requirements. The result is a significantly fragmented compliance landscape: organizations operating across state lines face overlapping, sometimes conflicting obligations that are rapidly becoming a material operational cost.
The international dimension adds further complexity. Europe’s GDPR, now the de facto global standard for serious data governance, applies to any organization processing the personal data of EU residents, regardless of where the organization is headquartered. The GDPR’s requirements for lawful basis, data minimization, purpose limitation, and cross-border transfer safeguards represent the upper bound of what robust data broker oversight looks like. The Irish Data Protection Commission’s enforcement actions against LinkedIn (€310m), Meta (multiple nine-figure fines), and, in 2025, TikTok (€530m) demonstrate that this is not theoretical exposure.
The AI-Specific Accountability Gap
The most analytically important dimension of data broker oversight is the one least addressed by existing frameworks: the accountability gap created specifically by AI-driven inference. Existing privacy law, including GDPR and CCPA, generally regulates data collection and use based on the category of data collected. AI inference systems collapse this framework by generating sensitive predictions—about health, financial status, political orientation, or behavioral vulnerability—from data that was collected for other, apparently innocuous purposes. The organization that purchased location data to optimize logistics and the organization that purchased it to predict which consumers are experiencing financial distress are using the same data product in ways that existing frameworks treat identically. The inference process itself—the most consequential step in the value chain—remains largely unregulated.
The Asset: An Organizational Data Broker Oversight Framework
The following framework draws on guidance from the IAPP (International Association of Privacy Professionals), FTC enforcement principles, and GDPR best practices to give senior leaders a structured approach to managing data broker risk. It is organized into four operational domains:
1. Vendor Due Diligence and Third-Party Data Governance
The starting point is knowing what data you are purchasing and from whom. This sounds obvious; in practice, most organizations’ procurement processes for data products are less rigorous than those for any other significant vendor relationship. At minimum, organizations should require data brokers to document: (a) the source and collection method for all data provided; (b) the consent architecture (i.e., what, if anything, individuals were told about their data’s use); (c) whether the data has been cross-referenced with other sources and what inferences have been applied; and (d) compliance certifications against applicable frameworks (GDPR, CCPA, FTC guidelines).
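One way to make requirements (a) through (d) operational is to encode them as a structured intake record that procurement must complete before any data purchase. The sketch below is purely illustrative; the class and field names are assumptions, not a reference to any real schema or tool.

```python
from dataclasses import dataclass


@dataclass
class BrokerDueDiligenceRecord:
    """Hypothetical intake record mirroring items (a)-(d) above."""
    broker_name: str
    data_sources: list[str]        # (a) where and how the data was collected
    consent_disclosure: str        # (b) what individuals were told, if anything
    cross_referenced: bool         # (c) was the data merged with other sources?
    inferences_applied: list[str]  # (c) inferences the broker has already layered on
    certifications: list[str]      # (d) e.g. GDPR, CCPA, FTC guidance attestations

    def gaps(self) -> list[str]:
        """Return the unanswered items that should block procurement."""
        missing = []
        if not self.data_sources:
            missing.append("source and collection method undocumented")
        if not self.consent_disclosure:
            missing.append("consent architecture undocumented")
        if self.cross_referenced and not self.inferences_applied:
            missing.append("cross-referencing declared but inferences unlisted")
        if not self.certifications:
            missing.append("no compliance certifications provided")
        return missing
```

In practice, a record with any non-empty `gaps()` result would be routed back to the broker rather than approved, making the due diligence checklist enforceable rather than advisory.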
2. AI-Inference Audit
Given the inferential capabilities of modern data systems, organizations should not assess data broker risk based solely on the category of data they have purchased but on the categories of inference that data can support. An internal audit should answer one question: which behavioral, financial, health, or identity inferences can be derived from our current data inputs, whether or not we explicitly perform them? This is both a risk management exercise and a compliance imperative: under GDPR’s automated decision-making provisions and California’s ADMT rules (effective 2026), organizations can face liability for inferences made by their systems even when derived from data that was not originally sensitive.
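A lightweight starting point for such an audit is a lookup table mapping each data input the organization holds to the sensitive inference categories it is known to support. The mapping below is an illustrative assumption drawn from the examples in this article, not a validated taxonomy.

```python
# Illustrative (assumed) mapping from purchased data inputs to the
# sensitive inference categories they can plausibly support.
INFERENCE_RISKS = {
    "location_history": {"health", "financial_distress", "political_views"},
    "purchase_history": {"income", "health", "creditworthiness"},
    "device_behavior": {"income", "behavioral_vulnerability"},
}


def audit_inputs(inputs):
    """Aggregate the inference categories reachable from a set of data inputs."""
    reachable = set()
    for name in inputs:
        reachable |= INFERENCE_RISKS.get(name, set())
    return sorted(reachable)
```

Even a table this crude surfaces the core point of the audit: two individually unremarkable inputs can jointly support inferences that neither supports alone, so risk must be assessed over the combined holdings, not input by input.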
3. State Compliance Mapping and Monitoring
Given the pace of state-level legislative activity—California, Colorado, Texas, Oregon, Vermont, and several others have enacted or are advancing data broker-specific obligations—organizations need a live compliance map, not a static legal opinion. This means designating ownership for tracking legislative developments (the IAPP’s Privacy Tracker is a useful tool), and building compliance monitoring into the product and procurement calendar rather than treating it as a reactive legal function. The key operational implication: compliance decisions made in the next 12 months will determine audit posture for 2027 and beyond, as California’s phased requirements come into effect.
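The difference between a live compliance map and a static legal opinion can be made concrete: each jurisdiction entry carries a named owner and a next-review date, so staleness is visible and queryable. The structure below is a hypothetical sketch; the jurisdictions, obligations, and dates are placeholders, not legal guidance.

```python
from datetime import date

# Hypothetical live compliance map. Every entry has an owner and a
# next-review date so stale entries can be flagged automatically.
COMPLIANCE_MAP = {
    "CA": {"obligations": ["broker registration", "phased cybersecurity audits"],
           "owner": "privacy-counsel", "next_review": date(2026, 1, 15)},
    "VT": {"obligations": ["broker registration"],
           "owner": "privacy-counsel", "next_review": date(2026, 3, 1)},
}


def stale_entries(as_of):
    """Jurisdictions whose obligations are overdue for review as of a given date."""
    return [state for state, entry in COMPLIANCE_MAP.items()
            if entry["next_review"] <= as_of]
```

Running a check like `stale_entries` on a schedule turns compliance monitoring into a standing operational process rather than an ad hoc legal request.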
4. Internal Data Minimization as Strategic Posture
The most durable data broker risk management strategy is not compliance optimization but demand reduction. Organizations that have developed a structural dependence on externally brokered data for core business processes (targeting, underwriting, pricing, hiring) are exposed to regulatory, reputational, and supply disruption risks that organizations with strong first-party data strategies are not. Investing in first-party data quality—such as developing direct, consent-based relationships with customers and building inferential capability from ethically collected data—is simultaneously a privacy risk mitigation strategy and a competitive positioning move in an environment where brokered data faces escalating legal uncertainty.
The data broker accountability gap is not closing. In the near term, federal deregulatory posture has created a vacuum that state-level action is filling inconsistently, while AI-driven inference capabilities are advancing faster than either regime can track. The organizations that navigate this environment most effectively will be those that treat data governance not as a compliance checkbox but as an operational capability built into vendor relationships, product design, and strategic planning from the ground up.
Thank you for reading. If this analysis provided value, please consider sharing it with a colleague or subscribing to receive future posts directly in your inbox.
A final note: The analyses and perspectives shared here are my own, developed independently. They do not represent the views of any client, employer, or affiliated institution.


