What can we learn from recent cases related to data scraping? A comparative overview of international practice

14.01.2026.

Data scraping is the process of extracting content from a website and downloading it to a computer. The content can then be analyzed or used to train an artificial intelligence algorithm. This is the definition provided by the Intellectual Property Committee of the GPAI (Global Partnership on Artificial Intelligence) Innovation & Commercialization Working Group. As early as 2022, they published guidelines on data scraping and copyright protection, noting that there is no harmonized international solution and that different jurisdictions have different approaches.

Looking beyond intellectual property law, this article highlights recent cases, focusing particularly on European interpretations. It can be said that legal gaps regarding data scraping are gradually being filled—whether in cases involving intellectual property rights, breaches of terms of use, or data protection—all in light of new examples from practice.

hiQ Labs v. LinkedIn

After years of litigation, hiQ Labs and LinkedIn settled their dispute. The case was turbulent, to say the least: a preliminary injunction was issued in 2017, the Ninth Circuit's judgment was later vacated by the U.S. Supreme Court and remanded for reconsideration in light of Van Buren v. United States, and the matter finally concluded with a settlement in late 2022.

The question posed by the Ninth Circuit was: After hiQ received LinkedIn’s cease-and-desist letter, did any further collection and use of data constitute a violation of law? The data analytics company hiQ prevailed on the issue of unauthorized access under the U.S. Computer Fraud and Abuse Act (CFAA) concerning data from publicly accessible web pages. However, in November 2022, the U.S. District Court for the Northern District of California held that the provisions of LinkedIn’s user agreement prohibiting data scraping techniques and the creation of fake profiles were enforceable in a breach of contract claim (hiQ Labs, Inc. v. LinkedIn Corporation, Case No. 17-cv-03301-EMC, November 4, 2022).

The European approach 

In Europe, the approach is somewhat different. In January 2025, the French Data Protection Authority (CNIL) imposed a fine on the company KASPR for collecting LinkedIn contacts, including hidden email addresses, without informing users (Délibération de la formation restreinte n° SAN-2024-020 du 5 décembre 2024 concernant la société KASPR). The focus here was not on LinkedIn's terms of use but on compliance with the GDPR. The authority explained that KASPR collected not only publicly available data but also data from users who had restricted their visibility settings, and that there was no consent for such collection; its practices demonstrated a lack of transparency and a disregard for the fundamental rights the GDPR grants to individuals.

A stricter stance on data scraping was also taken by the Dutch Data Protection Authority, which published guidelines in May 2024 stating that the mere fact that data is publicly available does not mean there is consent for its collection. Publishing data online is not an automatic license for further collection and processing. Furthermore, scraping may be based on legitimate interest only in exceptional cases, and rarely in a commercial context, especially where it involves large-scale collection of data for commercial use.

In Europe, the fact that something is publicly accessible does not mean it is free to use.

What does this mean for legal entities using data scraping techniques? Nothing new: continuous monitoring of regulations, compliance with relevant laws, careful reading of terms of use, and recognition of the importance of data protection rules.

Are legal entities always exempt from data protection rules?

When discussing data scraping and data protection, we usually think only of natural persons. However, the Spanish Data Protection Authority (AEPD) issued a notable decision on April 14, 2025 (EXP202404644), imposing a fine of €1.8 million on the company Informa D&B for misuse of the personal data of self-employed individuals. The data originated from the tax authority, which had provided it to official registers for legitimate public purposes; it later ended up with Informa D&B under a commercial agreement. The AEPD found that the company lacked a lawful basis for processing the data of 1.67 million entrepreneurs. More details and explanations can be found in the decision itself.

The Polish Data Protection Authority (UODO) issued a decision in March 2019 imposing a fine on Bisnode for failure to fulfill the information obligation and violation of Article 14 GDPR. Bisnode (now Dun & Bradstreet), through its Polish branch, collected data from public registers and databases concerning 6 million company owners for the purpose of assessing their creditworthiness. The data included names, identification numbers, and information related to business activities. Bisnode sent email notifications to only 90,000 individuals, of whom 12,000 objected, and the company published a publicly accessible document on its website as an additional means of notification.

The Polish authority argued that Bisnode, as a controller, was aware of its obligation to provide information but failed to properly comply with the obligation imposed by the GDPR. Bisnode disputed these claims, arguing that the costs of notifying individuals by phone or mail would be excessive and constitute disproportionate effort.

Without delving further into the substance and circumstances of the case: after more than four years of litigation, the Supreme Administrative Court in Poland dismissed Bisnode's appeal against the judgment of the Administrative Court in Warsaw. What is important to note here, as in the Spanish case, is that the matter sparked significant reactions, particularly given that Bisnode collected data from publicly available registers. The data protection authority, for its part, did not question the sources of data collection but reacted to the failure to inform the majority of individuals whose data had been collected.

The Administrative Court held that the defendant was obliged to fulfill the obligation under Article 14 GDPR, but only in relation to individuals who were actively engaged in business or had temporarily suspended their activity at the time of the decision. The Supreme Administrative Court emphasized that under the GDPR, transparency of processing is the rule and any exception to it must be interpreted restrictively. The authority's decision was annulled in the part concerning the processing of data of individuals who had conducted business activity in the past, and in the part concerning the amount of the fine (III OSK 2538/21).

CK v. Dun & Bradstreet – What does the CJEU say?

Another development from 2025: on February 27, 2025, the Court of Justice of the European Union (CJEU) delivered a judgment in the case of CK v. Dun & Bradstreet (C-203/22). The judgment primarily concerns the right of access to personal data (Article 15(1)(h) GDPR) and the obligations of controllers in cases of automated decision-making and profiling (Article 22 GDPR). It also addressed the balance between transparency requirements and the protection of trade secrets.

In short, the claimant, as a natural person, sued Dun & Bradstreet Austria GmbH after being refused a mobile phone contract based on an automated creditworthiness assessment carried out by the defendant. The claimant requested access rights, but the defendant refused, arguing that the algorithm used in the decision-making process was protected as a trade secret.

The CJEU concluded that Article 15(1)(h) GDPR must be interpreted to mean that, in cases of automated decision-making, including profiling within the meaning of Article 22(1) GDPR, the data subject may require the controller, as 'meaningful information about the logic involved', to explain, by way of relevant information and in a concise, transparent, intelligible, and easily accessible form, the procedure and principles actually applied in using the personal data of that person, by automated means, to obtain a specific result, such as a creditworthiness score.

The provision must also be interpreted to mean that, where the controller considers that the information to be provided to the data subject under that provision contains third-party data protected by the GDPR or trade secrets, the controller is obliged to provide the allegedly protected information to the competent authority or court, in order to establish a balance between the rights and interests at stake and determine the scope of the data subject’s right of access.

In paragraph 59 of the judgment, the Court stated that merely providing a mathematical formula or a description of the steps in automated decision-making does not satisfy the requirement for a concise and intelligible explanation.

Last but not least, the CJEU added that a Member State cannot determine the outcome of this balancing exercise and the scope of the right of access solely by applying its national legislation; rather, the competent authority must assess each case individually, taking into account all specific circumstances.

Note: This text reflects the author’s personal opinion and does not constitute legal advice.
