Why ChatGPT wrappers don't work for legal research

February 15, 2025

Open ChatGPT. Ask it a legal question. Get an answer that sounds confident, reads well, and might be completely wrong.

That's the wrapper problem. A surprising number of "legal AI" products are exactly this: ChatGPT's API with a legal-looking interface. They add some prompting, maybe some templates, definitely some marketing. But underneath, it's a general-purpose model doing general-purpose things.

For legal work, that's not good enough.

What wrappers get wrong

No source verification. ChatGPT doesn't know where its information comes from. It was trained on internet text—some of it accurate legal content, some of it outdated blog posts, some of it American law that doesn't apply in your jurisdiction. When you ask a question, you get a synthesis. You don't get citations you can verify.

No temporal awareness. The model's knowledge has a cutoff date. Law changes constantly. That case that was good authority in 2022 might have been overruled. That statutory provision might have been amended. ChatGPT doesn't know and can't check.

Confident errors. General language models are optimised to produce fluent, confident-sounding text. When they're wrong about legal questions—and they often are—they're wrong confidently. The citation looks real. The statute number is formatted correctly. It just doesn't exist.

No jurisdictional precision. Ask ChatGPT about contract law and you might get UK law, US law, or a muddled mix. The model doesn't naturally distinguish between jurisdictions unless you're extremely specific in your prompting—and even then, it's not reliable.

What actually matters for legal AI

Building something useful for legal work requires more than prompting a general model:

Verified legal databases. Not internet training data. Actual case law, legislation, and regulations from authoritative sources. Updated when the law changes. Structured so the system knows what's current.
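
One concrete way to make "structured so the system knows what's current" real is a source record with explicit validity dates, so currency is a field the system checks rather than something a model has to remember. A minimal sketch; the field names and schema are illustrative assumptions, not Andri's actual data model:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class LegalSource:
    """One authority in a verified database, with explicit validity dates."""
    citation: str              # e.g. "Art. 6:74 BW" or a case reference
    jurisdiction: str          # "NL", "UK", "EU", ...
    valid_from: date           # when this version entered into force
    valid_to: Optional[date]   # None while still current
    superseded_by: Optional[str] = None  # citation of the amending or overruling source

    def is_current(self, on: Optional[date] = None) -> bool:
        """True if this version was in force on the given date (default: today)."""
        on = on or date.today()
        return self.valid_from <= on and (self.valid_to is None or on < self.valid_to)

art_674 = LegalSource("Art. 6:74 BW", "NL", date(1992, 1, 1), None)
print(art_674.is_current())  # True: no end date recorded for this version
```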

Multi-step reasoning. Legal questions rarely have simple answers. A useful system needs to find relevant law, check how courts have interpreted it, verify the precedents are still good, and synthesise—with citations at every step.
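
As a rough illustration of that loop, here is a minimal Python sketch. The four steps map onto the description above; `find_relevant_law`, `find_interpretations`, `is_still_good_law` and `synthesise` are hard-coded stubs standing in for the real database and model calls:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    claim: str
    citation: str

# Stubs standing in for real retrieval, database and model calls.
def find_relevant_law(question: str, jurisdiction: str) -> list[Finding]:
    return [Finding("Non-performance of an obligation can give a right to damages",
                    "Art. 6:74 BW")]

def find_interpretations(source: Finding) -> list[Finding]:
    return [Finding("The creditor must prove the shortcoming",
                    "(placeholder case citation)")]

def is_still_good_law(citation: str) -> bool:
    return True  # real check: consult the verified database, not the model

def synthesise(question: str, findings: list[Finding]) -> str:
    return "\n".join(f"- {f.claim} [{f.citation}]" for f in findings)

def answer(question: str, jurisdiction: str) -> str:
    sources = find_relevant_law(question, jurisdiction)               # step 1: retrieve
    holdings = [h for s in sources for h in find_interpretations(s)]  # step 2: interpret
    verified = [f for f in sources + holdings
                if is_still_good_law(f.citation)]                     # step 3: verify currency
    return synthesise(question, verified)                             # step 4: cited synthesis

print(answer("When are damages owed for breach?", "NL"))
```

Nothing reaches the synthesis step without a citation attached and a currency check passed.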

Jurisdiction awareness. The system needs to know it's working in Dutch law, or English law, or EU law overlapping with national implementation. Not as an afterthought, but as a core part of how it reasons.
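
One way to make jurisdiction core rather than an afterthought is to bake it into the query type itself, so a jurisdiction-free lookup can't even be expressed. A hypothetical sketch; the types and index contents are illustrative:

```python
from dataclasses import dataclass
from enum import Enum

class Jurisdiction(Enum):
    NL = "Dutch law"
    EN = "English law"
    EU = "EU law"

@dataclass(frozen=True)
class LegalQuery:
    question: str
    jurisdiction: Jurisdiction  # required field: no jurisdiction-free queries exist

def search(query: LegalQuery) -> list[str]:
    # Every lookup is scoped to exactly one legal system.
    index = {
        Jurisdiction.NL: ["Art. 6:74 BW"],
        Jurisdiction.EN: ["(placeholder English authority)"],
        Jurisdiction.EU: ["(placeholder EU instrument)"],
    }
    return index[query.jurisdiction]

print(search(LegalQuery("remedies for breach of contract", Jurisdiction.NL)))
```

With jurisdiction as a required constructor argument, the "muddled mix" failure mode described earlier cannot even be written down.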

Model flexibility. Tying yourself to one AI provider is a strategic mistake. Models improve. New capabilities emerge. A system locked to GPT-3.5 in 2023 looks outdated by 2025. We run multiple models, selecting based on task requirements.
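
A task-based routing table is one simple way to keep that flexibility. The tasks and model names below are illustrative assumptions, not a description of any specific deployment:

```python
from enum import Enum, auto

class Task(Enum):
    QUERY_REWRITE = auto()   # turn a question into database search queries
    CASE_SUMMARY = auto()    # condense a single judgment
    SYNTHESIS = auto()       # draft the final, cited answer

# Illustrative routing table; swap entries as better models appear.
ROUTES: dict[Task, str] = {
    Task.QUERY_REWRITE: "small-fast-model",
    Task.CASE_SUMMARY: "mid-size-model",
    Task.SYNTHESIS: "strongest-available-model",
}

def pick_model(task: Task) -> str:
    return ROUTES[task]

print(pick_model(Task.SYNTHESIS))  # strongest-available-model
```

Because callers depend on the task, not the provider, upgrading a model is a one-line change to the table.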

The depth vs. breadth trade-off

ChatGPT is designed for breadth. It has to be somewhat helpful to everyone: students, hobbyists, casual users, professionals in any field. That's an impressive engineering achievement, but it means the model is a generalist by design.

Legal work rewards depth. Knowing that Article 6:74 BW governs damages for non-performance in Dutch contract law isn't enough. You need to know how courts have applied it, what the burden of proof requirements are, how it interacts with limitation periods, and what happens in specific factual scenarios.

A general model has seen some of this in training. A purpose-built legal system has it structured, verified, and connected.

What we do differently

Andri doesn't call a general model and hope for the best. The architecture is built specifically for legal work:

  • Verified sources only: Case law from official reporters, legislation from government sources, regulations from authoritative databases
  • Multi-stage verification: Findings are checked against current law before being presented
  • Jurisdictional context: The system knows which legal system it's operating in and reasons accordingly
  • Transparent citations: Every claim links to a verifiable source (a minimal sketch follows this list)
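
The citation point in particular can be enforced rather than hoped for: if a claim arrives without a source link, the system refuses to render it. A minimal sketch of that invariant; the types and the link are illustrative:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CitedClaim:
    text: str
    source_url: str  # a link the reader can open and check

def render_answer(claims: list[CitedClaim]) -> str:
    # Reject any claim that arrives without a verifiable source.
    for claim in claims:
        if not claim.source_url:
            raise ValueError(f"Unsourced claim rejected: {claim.text!r}")
    return "\n".join(f"{c.text} [{c.source_url}]" for c in claims)

print(render_answer([
    CitedClaim("Non-performance can give a right to damages under Dutch law",
               "https://wetten.overheid.nl/"),  # illustrative link
]))
```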

This isn't about being anti-ChatGPT. The underlying model technology is impressive and useful. But wrapping it in a legal interface and calling it legal AI isn't enough. The hard work is in everything else.

Try it and see the difference.