News You Can Use

Edition 1 · 1st - 13th Sep 2024


Deep Dives

Three stories worth sitting with

OpenAI Releases o1

What
o1 is a step forward in reasoning capability: the model applies chain-of-thought techniques at inference time, spending additional compute working through possible approaches to the instructions it receives before answering.
So what
This could help us with complex reasoning tasks, such as identifying risks in contracts alongside our Deal Clarity offering, and it may support building out AI agents in AGPT, where the model's reasoning process could decide which types of data are needed or which personas to use.

AI Agents / Agentic AI

What
Agentic AI refers to AI systems that act autonomously to complete tasks without constant human input. These agents can understand high-level goals, plan complex tasks, and perform actions, making them ideal for automating repetitive or time-consuming tasks across industries. They can also hand over to other agents specialising in different tasks, because each agent is built with an understanding of what the other agents can do.
So what
This could have a significant impact on AG projects like AGPT, enabling us to design AI agents capable of independently navigating legal workflows. It could be especially beneficial in areas such as e-discovery and contract review, reducing time spent on mundane tasks while maintaining accuracy and compliance. We could potentially build agents that review incoming emails for specific tasks and hand over to other agents to complete them, e.g. a playbook review for a supply agreement that is emailed into a legal team's inbox.
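The handover pattern described above can be sketched in a few lines of Python. Everything here is illustrative: the agent names, capabilities, and the keyword-based router are hypothetical stand-ins for what would, in practice, be an LLM classifying the incoming email.

```python
# A minimal sketch of agent handover. The agents, their capabilities, and the
# keyword-based routing are hypothetical; a real system would use an LLM to
# decide which specialist agent should receive the task.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    name: str
    capabilities: list[str]          # tasks this agent advertises it can handle
    handle: Callable[[str], str]     # the work the agent actually performs

def route(email_body: str, agents: list[Agent]) -> str:
    """Hand the email to the first agent whose capability matches its content."""
    text = email_body.lower()
    for agent in agents:
        if any(cap in text for cap in agent.capabilities):
            return agent.handle(email_body)
    return "No agent available - escalate to a human reviewer."

agents = [
    Agent("triage", ["invoice"], lambda e: "Forwarded to finance."),
    Agent("playbook", ["supply agreement"], lambda e: "Playbook review started."),
]

result = route("Please review the attached supply agreement.", agents)
```

The key design point is that routing relies only on each agent's advertised capabilities, so new specialist agents can be added without changing the dispatcher.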

Claude's System Prompt

What
Anthropic published the system prompt used by Claude.ai, which sets out the persona and instructions given to the underlying models.
So what
This reveals some of the prompt engineering needed to make these LLMs work well, some of which we are already doing within AGPT and Deal Clarity.
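As a rough illustration, a system prompt is supplied separately from the user conversation. The sketch below uses the request shape of Anthropic's Messages API; the persona text is invented for this example and is not Claude's actual system prompt.

```python
# Sketch of how a system prompt sets persona and rules, using the request
# shape of Anthropic's Messages API. The persona text below is invented
# for illustration - it is not Claude's published system prompt.
system_prompt = (
    "You are a legal research assistant. Answer concisely, cite sources, "
    "and say you do not know rather than guessing."
)

request = {
    "model": "claude-3-5-sonnet-20240620",
    "max_tokens": 1024,
    # The system prompt is a top-level field, kept apart from the conversation.
    "system": system_prompt,
    "messages": [
        {"role": "user", "content": "Summarise the limitation clause risks."},
    ],
}
```

Keeping the persona in a dedicated field rather than in the user messages is the same separation we already rely on in AGPT and Deal Clarity prompt design.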

Harvey's LLM Benchmark and Osborne Clarke Test

What
Harvey released a GenAI evaluation system for legal benchmarking, with ongoing comparisons between Harvey and GPT-4o. Osborne Clarke claims its results surpass Harvey's on the test.
So what
The evaluation system offers an opportunity for us to test our AI tools in direct comparison with Harvey and GPT-4o. This could help AG refine our own AI offerings, such as Deal Clarity and AGPT, ensuring that we stay at the forefront of legal AI developments and select the best tools for our needs.
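A rubric-based benchmark of this kind can be sketched simply: score each answer against weighted criteria and compare totals. The criteria, weights, and marks below are hypothetical and are not Harvey's actual rubric.

```python
# Sketch of rubric-based scoring in the spirit of a legal AI benchmark.
# The criteria, weights, and marks are hypothetical examples only.
RUBRIC = {
    "answer_quality": 0.6,   # did the answer address the question correctly?
    "source_accuracy": 0.4,  # were the cited sources real and on point?
}

def score(marks: dict[str, float]) -> float:
    """Combine per-criterion marks (0-1) into one weighted score."""
    return sum(RUBRIC[criterion] * mark for criterion, mark in marks.items())

# Comparing two hypothetical tools on the same task:
tool_a = score({"answer_quality": 0.9, "source_accuracy": 0.5})
tool_b = score({"answer_quality": 0.7, "source_accuracy": 1.0})
```

Publishing the rubric alongside the scores is what makes this kind of benchmark reproducible, and it is the shape we would need if we ran Deal Clarity or AGPT through a comparable test.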

New vLex Announcements

What
vLex showcased Vincent AI's capability to conduct legal research more efficiently, showing how AI-powered search tools can drastically reduce time spent searching for precedents, case law, and relevant statutes. They have also added contract review and redlining solutions to the product under a "Transaction" banner.
So what
This is directly relevant to AG's efforts to integrate AI into our legal research tools. By adopting similar AI search functionality within AGPT or Deal Clarity, we could enhance the speed and precision of our internal research processes. It may also allow us to access some document comparison features through vLex's API rather than building them ourselves.

Worth Reading

Everything else worth a click

OpenAI o1 model release

OpenAI's announcement of o1, the first model designed to "think before it speaks" by spending more compute on internal reasoning before answering.

o1 for legal

Artificial Lawyer's take on why o1's reasoning capability matters for legal AI, particularly for agentic workflows where a model has to plan across multiple steps.

Claude System Prompt

Anthropic publishes the actual system prompt behind Claude.ai - a rare look at how a frontier lab shapes model behaviour in production.

McKinsey Report about AI Agents

McKinsey's pitch that agents are the next wave of GenAI, moving from single-turn chat to systems that plan, act, and complete multi-step business processes.

vLex Vincent AI Overview

Artificial Lawyer TV walkthrough of vLex's Vincent AI - useful for seeing how a research-led vendor is layering generative capability on top of a deep legal content set.

Harvey benchmark

Harvey's public benchmark for legal AI, scored against lawyer-quality work with rubrics for answer quality and source accuracy - a template for how vendors should evidence performance claims.