
Smart Data Extraction



In February, leading UK insurance risk and commercial law firm BLM LLP officially adopted iManage RAVN Extract to capture data from its documents, analyse it, and make accurate predictions about claims outcomes.

Ian Rogers, Legal Practice Lead at iManage, and Andrew Dunkley, Head of Analytics at BLM, highlight how AI can make law firms more efficient and the impact the technology will have.

Overview of iManage RAVN Extract

Ian Rogers

Essentially, the RAVN platform can read documents. It uses an OCR platform together with the structure of the document to understand the text inside each document, and it can be trained to classify documents. Our current out-of-the-box classification models cover 20–30 different agreements that the RAVN product can classify automatically. The ability to train the system to identify different types of documents is something that people in the market are finding really useful. Depending on practice area, they can determine which types of documents they want the RAVN system to be able to read – software agreements, staff agreements, share purchase agreements etc. – and they can train the RAVN engine to read them in a way that the firm itself is comfortable with, so the firm essentially owns the process of training the engine to spot different arguments.

Andrew Dunkley

The idea is to use RAVN to identify and extract information from key parts of documents that, depending on the use case, my team can then use to build predictive models. We’ll also look at process improvement and process automation use cases as we go on. The problem is more deciding which good idea to do first than struggling to think what to do with it.

Reception and takeup of AI

IR

We’re seeing a lot of interest across the market, and the BLM adoption is typical in that sense. It’s fair to say that a lot of early adopters are coming forward to use AI. You see a lot of firms that have introduced innovation teams and transformation heads who are leading their firms on the journey into innovation. The adoption is being driven by the fact that there are now more players in this field. AI vendors are selling directly to lawyers, so the lawyers themselves are becoming more familiar with the different platforms and the features that they can expect to see. There’s also the commercial aspect: new business models, such as fixed matter fees rather than billable hours, are driving innovation because firms now see a benefit in closing the productivity gap.

AD

We’re really fortunate to work with some statistically significant data-driven clients: insurers are perfectly comfortable with using data to make forecasts, they’re happy with the concept of managing risk as a portfolio. It’s probably reasonably safe to say that in insurance and litigation – certainly in the UK – the insurance law space is at the forefront of this sort of development, and that’s due to a confluence of sophisticated clients that are used to working with data, and also the availability of large amounts of data in relation to relatively homogenous litigation. 

Barriers, obstacles, and learning processes

IR

Everyone who’s applying AI across business sectors is having to learn the same lessons; they’re not issues unique to the legal sector. It comes down to education around what machine learning can and cannot do, and what tools are available to supplement it so that the work is more cost-effective.

At iManage RAVN we’re providing what we call an “AI University” to firms, which helps them enhance their AI capabilities. It teaches participants the principles of data curation – how you curate the data, collect the data, and what you can do around the data itself – and it also touches on principles of machine learning, that is, how the machine actually reads through the agreement. This goes down to quite a technical level, and it helps the lawyers to understand what these types of AI tools are doing. Familiarity with how the RAVN engine reads documents makes people a lot more receptive to the technology and a lot more keen to employ it once they understand it, because it loses its mystique and is simply a way of doing your job better.

We also show them what would be known as rules-based tools. So, for certain agreements, if there’s not enough context around a statement, it may actually be more time efficient to simply write a regular expression or similar to identify the information they want to extract. So we’re helping the participants to make the decision between where machine learning, as it currently sits, is most useful and where sometimes more cost-effective solutions could be used. That’s something we’re getting a lot of positive feedback for. The way that we’re doing this is by asking participants to bring their own data. If you wanted to use it on software agreements, we would ask the firm to bring maybe two hundred samples of software agreements and then we would train it to extract key pieces of information that they actually want to use themselves. It’s not a hypothetical exercise, it actually shows machine learning and rules-based tools working on their own documents, and that’s something which has been really well received in the market. Rather than seeing these things as obstacles, it’s helpful to see these as learning points for a new generation of lawyers.
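As a rough illustration of the rules-based approach described above, the sketch below uses Python regular expressions to pull two pieces of information out of a clause. It is not how the RAVN engine works; the sample text, the two patterns, and the function name extract_terms are all hypothetical and chosen only for this example. Real agreements vary far more in wording, which is where a trained model may be the better choice.

```python
import re

# Hypothetical sample text; not taken from any real agreement.
SAMPLE = (
    "This Agreement shall be governed by the laws of England and Wales. "
    "Either party may terminate this Agreement on 30 days' written notice."
)

# Illustrative patterns only (assumptions for this sketch).
# Captures the jurisdiction named after "governed by the law(s) of".
GOVERNING_LAW = re.compile(r"governed by the laws? of ([A-Z][A-Za-z ]+?)\.")
# Captures a number and unit before "notice", e.g. "30 days' written notice".
NOTICE_PERIOD = re.compile(
    r"(\d+)\s+(days?|months?)['\u2019]?\s+(?:written\s+)?notice",
    re.IGNORECASE,
)


def extract_terms(text: str) -> dict:
    """Return the governing law and notice period found in text (None if absent)."""
    law = GOVERNING_LAW.search(text)
    notice = NOTICE_PERIOD.search(text)
    return {
        "governing_law": law.group(1) if law else None,
        "notice_period": f"{notice.group(1)} {notice.group(2)}" if notice else None,
    }


print(extract_terms(SAMPLE))
```

When the wording is this predictable, a pattern like these can be quick to write and easy to audit. When a clause can be phrased in many different ways, the patterns multiply and become harder to maintain.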

It all fits together: as the market becomes more educated, the willingness to adopt AI has increased proportionately. Every time we get people in and talk about exactly how it works, rather than having grand visions of how AI can do their jobs, they start to understand how AI can actually augment certain tasks that they already do, and that this will free them up to do clients’ billable work. A lot of the use cases we’re seeing have to do with business development, essentially trying to work out what is market standard on certain items. A lot of firms use manual processes: they go through past documents and maintain a spreadsheet that sets out what has been going on over time. A tool like the RAVN engine, by contrast, is able to extract the information from the documents that are already sat in their document management system. That frees up senior associates from having to do very labour-intensive work with no immediate value; they can then transfer that time, which would have been non-billable, into billable work, which will give the best value to their clients.

AD 

Law has a somewhat deserved reputation for being a reasonably conservative marketplace. That’s changing. There have been huge strides made in recent years – iManage have been a big part of that movement – in relation to AI-driven data extraction. We’re working with RAVN to keep pushing that forwards. 

As an industry, we are only scratching the surface of what you can then do with the information you have extracted. We don’t think you can just buy any AI tool out of the box, plug it in, and have AI; in a legal setting that’s just not how it works. The magic is where you take that data and use it to build something that supplements your human expertise in a way that leads to better predictions or recommendations than you would otherwise be able to make. As an industry, we’re still in relatively early days.

There are also regulatory challenges. We need to be very careful about how we work with client data, chiefly in our section of the market, where my team works with medical reports and records for injured children – which is basically as sensitive as it gets from a data protection perspective. We are also a regulated industry, so there are regulatory considerations about what sorts of recommendations we can make, and we’re having to think through some of those issues for the first time. So there’s a lot to do and a lot that can be done. Certainly, from our perspective, in our section of the market, if we were not pushing this forwards, we wouldn’t be competitive.

 

Automation

AD 

There are certainly fears of automation of both tasks and jobs. I get asked that question by colleagues and people in the industry and I think it’s an important question, one that we need to grapple with honestly. I don’t think you can duck it. I get nervous whenever I hear someone claiming very confidently that it’s going to make huge numbers of people redundant or that it won’t have any effect at all. 

I have a hypothesis that one of the effects will be that we get better at working out which cases will likely win at trial. The most efficient way of resolving litigation is either to settle as early as possible, so that you don’t pay lawyers’ fees throughout the process, or, if you can accurately predict right from the start that you’re going to win, to take it right the way through to trial. It’s like poker: you start out with two cards and some communal cards that you have to pay to see progressively. What you don’t want to do is to pay to see all those cards and then fold just before the end. I suspect that we might get better at valuing the hand right at the start.

What effect is that going to have on the litigation process and on how many lawyers we need? Well, I can imagine a future world where we get better at forecasting which cases we’re just going to lose so we advise the clients to get the chequebook out right at the start and those cases go away. And in that situation, you can imagine a world where fewer lawyers are required. The flip side is to imagine a situation where, because we are better at identifying cases that we are likely to win at trial, we take more cases all the way through. We know, for a fact, that a hugely disproportionate amount of the effort involved in litigation happens just before trial. If that factor outweighs the reduction in legal work required from the reduction in number of cases overall, then it’s possible that the amount of legal work required might stay the same or increase. It just moves around. 

Ultimately this is about delivering better results for our clients and for their customers. The idea is to get the appropriate results as early in the process as possible. Yes, this technology is going to change the way that we do things and I can imagine scenarios in either direction in terms of the impact that it will have on automation. 

Let’s say that we get genuinely good at this and we do drive a reduction in the cost of litigation overall; what that potentially does is increase access to justice. If you reduce the cost of legal services, then potentially the people who need them are better able to get them. Legal aid in this country isn’t really a thing anymore. So we just don’t know what knock-on effects the technology is going to have; we can only hypothesise about what they might be. This is why I say we need to be agile and to be open-minded.

We need to recognise that there are going to be nuances. There isn’t going to be a single answer, because unintended consequences will happen.