The Data Deluge: Navigating the AI and Data Revolution as a Lawyer

December 2023

The Onslaught of the AI Era – It’s Not Coming, It’s Here

Welcome to the advent of AI, and to the data revolution – a mantra which we have seen some version of for many years now! Some of you may be grizzled and tired, and others raring to go yet again.

As we stand amidst another new tsunami of tech, it is time to shift our narrative from mantras like “Let’s use X”, “X will be a boon to the legal industry”, or some variation on that theme. These mantras serve up the hype, but do not provide enough analytical clarity about using X.

Instead, we must say “Let’s use X data to achieve Y”.

Clarity about X and Y goes a long way. Not just in achieving new goals, but in getting you from X to Y in the first place. For example: “let’s use contract negotiation data (X) to achieve quicker contract turnaround times (Y)”. We do not want pipe dreams to remain pipe dreams, after all.

In addition, having been around for the previous AI wave, this author suspects that this particular tsunami of tech is going to be different, and the abovementioned clarity will become a starting point rather than something to strive for.

Lawyers cannot just be legal experts anymore. Everywhere one looks, risks increasingly take a digital, and perhaps, numerical form. If we do not have in our mind’s eye a clear view of the data ecosystems surrounding even everyday software like e-mail and Microsoft Word, are we in a position to take on greater legal challenges surrounding technology?

The goal here is not to be a data-munger extraordinaire, but rather to understand what X and Y mean, and what it takes to get from X to Y effectively and efficiently. For example, asking these questions would help:

What information does the data contain?
What encryption, if any, does the information use?
What insights may such information yield?
What other information can the information be aggregated with to yield even more insights?
What form of automation is possible?
Who should use the information?

(more questions in the Annex)

Most, if not all, of these questions would apply to any machine, whether it is the laptop in your hands, the servers on the premises at your office, or those in the cloud.

In addition to the questions above, take some time to consider the following examples of data literacy. Some of them have been around in the legal sector for a while, but comprehension across the sector still cannot be taken for granted. Other examples represent new frontiers. And new frontiers always present new opportunities, at the same time that they present new risks.

Some Brief Real-World Applications

Contract Analysis: In corporate law, AI algorithms can analyse contracts in seconds, identifying risks and non-compliance issues. Lawyers, equipped with statistical knowledge, can understand how this happens, and interpret these findings to provide even more insightful and strategic advice. Remember, algorithms are recipe steps for how to process and manipulate data, to get you the dish that you want. It is not enough to refer to the names of recipes, most of the time. All of this is a long way from asking a vendor “Is AI smart enough to do X?”
Contract Lifecycle Management: viewing contract management as a series of lifecycles rather than discrete tasks opens up a broad vista of possibilities ranging from AI drafting, to automated management of deadlines and key figures in contracts. Statistical understanding allows one to better understand turn-around times, negotiation times, and risk areas.
Intellectual Property and Big Data: IP lawyers can use data analytics to monitor trends across the IP landscape, and even to scan patents.
Business Process Review and Redesign: Programmers and computer scientists have a word for the discipline of rewriting code so that it is more efficient, more understandable, and more concise: refactoring. The simplicity of the word (no, it is not factoring as lawyers understand it) belies the huge gains that programmers make when refactoring: better integrations with other chunks of code, maintainability over years, 10x improvements in efficiency. It is no wonder that they have also coined the related concept of the 10x developer.
Suffice it to say, many processes in law have not been subject to the same iterative and intensive process in a while. Perhaps, one may argue, this is because there is a reason for their existence in the first place (see Chesterton’s fence), or because they are so intensively used. The lack of downtime is often a reason why changes are not carried out. But can the same reason be used to justify not reviewing the process in the first place? As the changes around us accelerate, probably not. We will also find it increasingly difficult to justify not implementing changes for intensively used processes. Developers have made solving this issue a craft: Continuous Integration/Continuous Deployment. Timely and well-made documentation (and automated accompanying processes, to boot) addresses the issue of Chesterton’s fence as well.

Going Paperless: you might roll your eyes, and be bored out of your mind at reading another mention of this. It is certainly not sexy. But stay with me for a moment, and consider this: if the scanning of your entry pass to your office took five seconds every instance, how much time in a day would you save eliminating that step every time you entered the office, and every time you went to the loo? The amount of time you would save in a year would leave you gobsmacked. It is probably enough time for another episode of Netflix, or another dram of whisky.
For the modern office, or rather the office that has implemented all of the trappings of the modern world but not its second-order efficiencies (e.g. one that still uses paper-based client onboarding forms), the time wasted would be on the order of multiple binge-worthy series. In other words, these are the efficiencies that you would have to take a step back to consider without any haste, and then slowly but surely implement.
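
To put rough numbers on the entry-pass example, here is a back-of-the-envelope sketch in Python. Only the five-second figure comes from the example above; the ten scans a day and the 250 working days a year are assumptions made purely for illustration.

```python
# Back-of-the-envelope estimate of time lost to one small, repeated step.
SECONDS_PER_SCAN = 5          # from the entry-pass example
SCANS_PER_DAY = 10            # assumption: entries, exits, and breaks
WORKING_DAYS_PER_YEAR = 250   # assumption: a typical working year


def hours_lost_per_year(seconds_per_step: float, steps_per_day: int, days_per_year: int) -> float:
    """Return the hours spent per year on one small repeated step."""
    total_seconds = seconds_per_step * steps_per_day * days_per_year
    return total_seconds / 3600


hours = hours_lost_per_year(SECONDS_PER_SCAN, SCANS_PER_DAY, WORKING_DAYS_PER_YEAR)
print(f"About {hours:.1f} hours per person per year")  # About 3.5 hours per person per year
```

Under these assumptions the figure is per person. Multiply it by the headcount of an office, and by every form that is still keyed in by hand, and the second-order costs become easier to see.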

Computational Law: imagine if we moved away from having law contained in text-based Microsoft Word documents and very ordered lists, and instead expressed it in a data format that could be run as programs. Could we find edge cases in a contract ahead of time? Could we find lacunas in the law? Could we simulate a contract while we sleep? Could citizens spot errors in the law and suggest legislative amendments? The answer to all of that is yes. The movement towards such a world is very much alive at SMU’s Centre for Computational Law, and the worldwide Rules as Code movement.
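
As a minimal sketch of what a clause expressed as a program might look like, consider a late-payment clause written in Python. The clause, its figures, and the function name are all invented for illustration; they do not come from any real contract, nor from the projects mentioned above.

```python
from math import ceil

# Hypothetical clause, invented for illustration:
# "Interest of 1% of the invoice amount accrues for each month (or part of a
#  month) that payment is overdue beyond a 7-day grace period, capped at 10%."
GRACE_DAYS = 7
MONTHLY_RATE_PERCENT = 1
CAP_PERCENT = 10
DAYS_PER_MONTH = 30  # assumption: the clause does not define "month"


def late_interest(invoice_cents: int, days_overdue: int) -> int:
    """Return the interest owed, in cents, under the hypothetical clause."""
    if days_overdue < 0:
        raise ValueError("days_overdue cannot be negative")
    chargeable_days = days_overdue - GRACE_DAYS
    if chargeable_days <= 0:
        return 0  # still within the grace period
    months = ceil(chargeable_days / DAYS_PER_MONTH)
    percent = min(months * MONTHLY_RATE_PERCENT, CAP_PERCENT)
    return invoice_cents * percent // 100


# "Simulate" the clause at its boundaries to surface edge cases early.
for days in (0, 7, 8, 37, 38, 400):
    print(days, late_interest(100_000, days))
```

Even this toy version forces drafting decisions that prose can leave open: whether day 7 falls inside the grace period, and how long a "month" is. Running the clause across many scenarios is one way such gaps might be found before a dispute finds them.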

These are just some of the possibilities. You might be wondering: what about disputes?

How Can Data Help Lawyers?

Imagine a scenario where a lawyer is sifting through thousands of case files. In the past, this would have been a Herculean task, involving countless hours of reading. For a while now, we have had eDiscovery, and even homegrown vendors who are consummate experts in such tools and services.

These tools can dissect, categorize, and analyse data at an unprecedented scale. But here is the catch: to effectively wield these tools, one must be fluent in the language of data, and know how data works in the torrent of everyday use.

This is all too easy to dismiss amidst the daily deluge of our own emails, documents, and cases. So perhaps another example may be more apt. Let us use a holiday to that most hallowed of Singapore holiday destinations: Japan.

Now, imagine that you were there with your family. You would, of course, be wary of the tourist traps. But consider the insider opportunities beyond travel guides, the discounts being broadcast in a different language, and the possibilities that just trying to connect with the language and culture may bring.

The gulf between trying to connect with the language and culture, and being completely fluent, is vast and full. So it is for the language and culture of data as well. And when one can access plenty of resources about data in English, and many other languages, there is no reason for a lawyer to take their traditional refuge in being joyously bad with numbers or tech.

Data literacy and digital competency are about understanding not just what the data is, but also where it comes from.

Where did an e-mail come from? Where will its data go? Is the data repeated elsewhere? Worse still, is it manually input by someone again? Do Word documents follow the same journey? These seeming ephemera sit for years in hard drives, without ever being exploited, besides the odd search.

That means the data cannot be put to good use. No auto-magic workflows, staffed by erstwhile document automation, and a vanguard of AI. No data story-telling. No revealing patterns of data. No insight into the vast warrens of data or evidence provided by a client. No intelligent guesses at a more economically secure future. And ultimately, less time working towards a more just and fair future, if the vocational ideals of the profession remain intact.

Why would anyone risk continuous and abject failure at these tasks?

The frequent answer is “I don’t need to do so at the moment”. Well, no one ever needs to do something until they have to. A tautology, but everyone forgets the phrase “when push comes to shove”. Darwin definitely saw the effects of that writ large, over generations and species: only the fittest survive.

So what can one do? On top of the questions above, sign up for courses, both online and offline, offered by educational institutions such as SMU and NUS, or even platforms such as Coursera and edX. At Dentons Rodyk, we run our own in-house seminars on such topics.

From Averages to Algorithms: Why Statistical Knowledge Is No Longer Your Nemesis

Apart from understanding how data works, gone are the days when knowing your averages was enough. In our sophisticated digital era, a lawyer’s toolkit must include statistical familiarity: if not with the actual handling of numbers, then with the intuitions underlying important concepts.

For instance, consider risk assessment: evaluating the likelihood of litigation or regulatory issues may require more than just surface-level number crunching of potential amounts of liability. What factors should one consider? Are they qualitative or quantitative? Are they discrete or continuous variables? Can factors come together to reveal new insights? Does the mean of a dataset count for anything? Is it big enough?
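
As a small illustration of why the mean alone may mislead, here is a sketch using Python's standard `statistics` module. The settlement figures are made up for illustration and are not drawn from any real matters.

```python
import statistics

# Hypothetical settlement amounts in dollars (invented, not real data).
settlements = [40_000, 45_000, 50_000, 55_000, 60_000, 2_000_000]

mean = statistics.mean(settlements)
median = statistics.median(settlements)

print(f"n = {len(settlements)}")
print(f"mean = {mean:,.0f}")      # mean = 375,000
print(f"median = {median:,.0f}")  # median = 52,500

# One outlier drags the mean far above what a typical matter settles for.
without_outlier = settlements[:-1]
print(f"mean without outlier = {statistics.mean(without_outlier):,.0f}")  # 50,000
```

A single large matter moves the mean from 50,000 to 375,000, while the median barely shifts. With only six data points, neither figure deserves much weight, which is the point of asking whether the dataset is big enough.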

Another area is dispute resolution. Using dispute resolution analytics software involves sophisticated statistics about judges, arbitrators, and counsel as well. Both risk assessment and dispute resolution involve deeper dives into predictive analytics, understanding the nuances of correlation versus causation, and recognizing statistical outliers.

To paraphrase authors such as Nassim Nicholas Taleb, your usual assumptions about the bell-shaped curves that so often mark the Singapore experience, perhaps do not matter as much as the outlier events that characterise the modern world. So it is for law as well.

For example, if the law is as elite as it is, does placing performance on the bell curve, which we take for granted, actually reflect the realities of practice? In other words, is performance of a select group of professionals anything like a naturally occurring attribute such as physical height? Perhaps, one can be more honest and transparent when one approaches it as a resource allocation tool. Statistics may prove useful in providing insights into issues such as retention, and other human resource issues in the profession. Asking questions through a statistical lens also provides another avenue for reflection, and self-awareness in the profession. If we looked at other statistics, besides the bell curves that yielded most of our places as the crème de la crème of academic crops and entry to law school, do other statistics support the pride, prestige, and elite status of the profession?

Extrapolating, the same analysis can be applied to risks facing clients. In many ways, the profession can apply these statistical tools to the risks that clients face – especially where finance is involved. There is a subtle distinction between being sensitive to outliers, and an approach where one assumes that the worst that can happen, happens – for everything!

Yet another area is that of pricing analysis, a discipline well on its way in other jurisdictions. How can we charge better in such a competitive space? Will Singapore follow?

Conclusion

As we embrace AI and data, remember, it’s not just about using AI as a tool; it’s about harnessing the power of data and algorithms to inform our decisions, and improve our work.

For the legal profession, this means diving far deeper into data literacy and statistical knowledge than we have. It requires a cultural sea-change, and in many ways a humility to admit that lawyers have lagged behind technologically.

The future is not just about understanding the law, but understanding the language of data that increasingly defines, constrains, and yet amplifies, the human condition. So, grab your digital compass, and start navigating this brave new world. There is no better way to start than to try.

Thank you to Professors Alexander Joseph Woon, and Jerrold Soh, for reading drafts of this. Any errors and omissions are my own.

Annex

Where is the information stored?
How is the information stored?
What data format is the information stored in? E.g. both Microsoft Word documents and webpages use markup languages under the hood.
How does the data format work? E.g. Does your PDF store image or structured document information?
How is the information transferred?
What can be done with such information?
Who uses the information?
Who should not be able to?
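
To make the questions about data formats concrete: a .docx file is a ZIP archive whose main text lives in an XML file called word/document.xml. The sketch below builds a tiny stand-in archive in memory and reads the text back out using only Python's standard library. The stand-in is not a complete, valid Word file, and the sample sentence and the function name `extract_text` are invented for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML, the markup inside .docx files.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# A stripped-down stand-in for word/document.xml (illustrative only).
DOCUMENT_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>This Agreement is made on 1 January.</w:t></w:r></w:p>"
    "</w:body></w:document>"
)


def extract_text(docx_bytes: bytes) -> list[str]:
    """Return the text runs found in the archive's word/document.xml."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    return [node.text or "" for node in root.iter(f"{{{W_NS}}}t")]


# Build the stand-in archive in memory, then read it back.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("word/document.xml", DOCUMENT_XML)

print(extract_text(buffer.getvalue()))
```

Passing the bytes of a real .docx file to the same function should return its text runs as well, though real documents tend to split a sentence across many runs. The words on the page are structured data that a program can read, not just pixels.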