Why Lawyers’ Duty of Confidentiality Restricts Legal AI Training – And How Regulators Can Step In

Written by Joshua Lenon

As legal professionals increasingly explore the use of generative AI to assist with research, drafting, and other legal tasks, a significant challenge emerges: the strict duty of confidentiality that governs lawyers’ practice is at odds with the needs of large language model (LLM) training.

For AI to effectively understand and respond to the nuances of legal jurisdictions, it requires comprehensive data—data that, in the legal field, is often shrouded in confidentiality.

We have more insight than ever into how these groundbreaking AI technologies will influence the future of the legal industry. Read more in our latest Legal Trends Report.

The competence paradox

A lawyer’s primary obligation is to provide competent representation, as outlined in the legal profession’s competency rules. This includes the responsibility to maintain up-to-date knowledge and, as specified in the Federation of Law Societies of Canada’s Model Code of Professional Conduct Rule 3.1-2, to understand and appropriately use relevant technology, including AI. However, lawyers are also bound by Rule 3.3-1, which mandates strict confidentiality of all client information.

Similarly, in the United States, the American Bar Association released Formal Opinion 512 on generative artificial intelligence tools. This document emphasizes that lawyers must consider their ethical duties when using AI, including competence, client confidentiality, supervision, and reasonable fees.

This creates a catch-22 for legal professionals: while they must use AI to remain competent, confidentiality prevents them from sharing the case details that would improve AI models. Without comprehensive legal data, LLMs are often undertrained, particularly in specialized areas of law and specific jurisdictions.

As a result, AI tools may produce incorrect or jurisdictionally irrelevant outputs, increasing the risk of “legal hallucinations”—fabricated or inaccurate legal information.

Legal hallucinations: A persistent problem

Legal hallucinations are a significant issue when using LLMs in legal work. Studies have shown that LLMs such as ChatGPT and Llama frequently generate incorrect legal conclusions. Hallucinations are especially prevalent when models are asked about specific court cases, with error rates as high as 88%.

This is particularly problematic for lawyers who rely on AI to expedite research or drafting, as the models may fail to differentiate between nuanced regional laws or provide false legal precedents. The inability of AI to correctly handle the variation in laws across jurisdictions points to a fundamental lack of training data.

The confidentiality trap

The heart of the issue is that confidentiality obligations prohibit legal professionals from sharing their work product with AI training systems. Lawyers cannot ethically disclose the intricacies of their clients’ cases, even for the benefit of training a more competent AI. While LLMs need this vast pool of legal data to improve, lawyers are bound by confidentiality rules that prohibit them from sharing client information without express permission.

However, maintaining this strict siloing of information across the legal profession limits the development of competent AI. Without access to diverse and jurisdiction-specific legal data, AI models become stuck in a “legal monoculture”—reciting overly generalized notions of law that fail to account for local variations, particularly in smaller or less prominent jurisdictions.

The solution: Regulated information sharing

One potential solution to this problem is to empower legal regulators, such as law societies and bar associations, to act as intermediaries for AI training.

Most professional conduct rules already treat the sharing of case files with regulators as an exception to confidentiality. Regulators could therefore mandate the sharing of anonymized or filtered case files from their members for the specific purpose of training legal AI models, ensuring that AI tools receive a broad spectrum of legal data while preserving client confidentiality.

If lawyers submitted their data through a regulatory body, the process could be closely monitored to ensure that no identifying information is shared. These anonymized files would be invaluable in training AI models to understand the complex variations in law across jurisdictions, reducing the likelihood of legal hallucinations and enabling more reliable AI outputs. A sketch of what such a redaction pass might look like appears below.
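
To make the idea concrete, here is a minimal, hypothetical sketch of the kind of automated redaction pass a regulator might run on a submitted case file before adding it to a shared training corpus. The patterns and function names are illustrative assumptions only; a production pipeline would pair trained named-entity recognition with human review rather than rely on simple regular expressions.

```python
# Hypothetical sketch only: a regex-based redaction pass a regulator
# might run before a submitted case file enters a training corpus.
# Real pipelines would use trained NER models plus human review.
import re

# Simplistic patterns for obviously identifying strings (illustrative).
REDACTION_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace each matched pattern with a labeled placeholder."""
    for label, pattern in REDACTION_PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    return text

def prepare_submission(case_file_text: str) -> str:
    """Redact a case file; in practice, a reviewer at the regulator
    would confirm the output before release."""
    return redact(case_file_text)

if __name__ == "__main__":
    sample = "Contact opposing counsel at jane.doe@example.com or 604-555-0123."
    print(prepare_submission(sample))
    # -> Contact opposing counsel at [EMAIL REDACTED] or [PHONE REDACTED].
```

Even this simple filter illustrates the division of labor the article proposes: lawyers submit files under an existing regulatory exception, and the regulator, not the lawyer, takes responsibility for verifying that nothing identifying survives.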

Benefits to the legal profession and public

The benefits of this approach are twofold:

  • First, lawyers would have access to far more accurate and jurisdictionally appropriate AI tools, making them more efficient and improving the overall standard of legal services.
  • Second, the public would benefit from improved legal outcomes, as AI-assisted lawyers would be better equipped to handle cases in a timely and competent manner.

By mandating this data-sharing process, regulators can help break the current cycle where legal professionals are unable to contribute to, or benefit fully from, AI models. Shared models could be published under open-source or Creative Commons licenses, allowing legal professionals and technology developers to continually refine and improve legal AI.

This open access would ultimately democratize legal resources, giving even small firms or individual practitioners access to powerful AI tools previously limited to those with significant technological resources.

Conclusion: A path forward

The strict duty of confidentiality is vital to maintaining trust between lawyers and their clients, but it is also hampering the development of competent legal AI. Without access to the vast pool of legal data locked behind confidentiality rules, AI will continue to suffer from gaps in jurisdiction-specific knowledge, producing outputs that may not align with local laws.

The solution lies with legal regulators, who are in the perfect position to facilitate the sharing of anonymized legal data for AI training purposes. By filtering contributed client files through regulatory bodies, lawyers can continue to honor their duty of confidentiality while also enabling the development of better-trained AI models.

This approach ensures that legal AI will benefit not only the legal profession but the public at large, helping to create a more efficient, effective, and just legal system. By addressing this “confidentiality trap,” the legal profession can advance into the future, harnessing the power of AI without sacrificing ethical obligations.

Read more on how AI is impacting law firms in our latest Legal Trends Report. Automation is not just reshaping the legal industry; it is also leaving vast opportunities on the table for law firms to close the justice gap while increasing profits.

Note: This article was originally posted by Joshua Lenon on LinkedIn and is based on a lightning talk he gave at a recent Vancouver Tech Week event hosted by Toby Tobkin.
