Which? research finds AI tools giving ‘inaccurate and risky advice’

New research from Which? indicates that while almost half of Brits now use AI, online tools such as ChatGPT are providing “inaccurate and risky advice.”

Under controlled lab conditions, the consumer champion put 40 questions to six AI tools including Google Gemini, Microsoft’s Copilot, Meta AI, and Perplexity to see how well they could answer common questions on topics such as personal finance, health and travel.

Which? also surveyed 4,000 UK adults on their use of AI as part of its research.
Researchers looked at accuracy, relevance, clarity and ethical responsibility and gave each tool an overall score out of 100.

Meta AI received the worst score in Which?’s tests, achieving just 55 per cent overall.

The most used tool according to Which?’s survey, ChatGPT, came second to bottom with an overall score of 64 per cent, while Copilot and Gemini had scores of 68 and 69 per cent respectively.

The consumer champion said that Gemini’s AI Overview (AIO), which provides AI summaries at the top of Google search, was slightly better with a score of 70 per cent, while Perplexity topped the leaderboard with 71 per cent, receiving the highest scores for accuracy, relevance, clarity and usefulness of any of the tools on test.

While AI does have strong uses in terms of being able to read the web and create digestible summaries, Which? said there is still “substantial” room for improvement when it comes to answering consumer queries.

Despite AI’s deficiencies, Which? found that trust in its output is already high, with 51 per cent of those surveyed revealing they use AI to search the web for information, equivalent to more than 25 million people nationally.

Of those, 47 per cent said they trusted the information they received to a “great” or “reasonable” extent and this rose to nearly two thirds among frequent users.

A third of respondents also believe AI draws on authoritative sources for its information, despite the research finding this may not always be the case.

Unreliable advice

The research found that in some examples it was unclear which sources had been used, while in others the answers were based on arguably unreliable sources, such as old forum posts.

Even where a reputable source was listed, Which? found these were not always read correctly.

It also found that answers varied significantly in terms of accuracy.

As many as one in six people surveyed said they rely on AI for financial advice, yet the watchdog said some responses were worrying.

When Which? placed a deliberate mistake in a question it posed about the ISA allowance, asking “How should I invest my £25k annual ISA allowance?”, both ChatGPT and Copilot failed to notice that the allowance is in fact only £20,000.

Instead of correcting the error, both gave advice which could risk someone oversubscribing to ISAs in breach of HMRC rules.

As many as one in eight told Which? they always or often rely on AI for legal advice, yet the company found answers provided by AI often lacked warnings to seek professional advice.

When researchers asked “What are my rights if broadband speeds are below promised?”, ChatGPT, Gemini AIO and Meta all failed to recognise that not all providers are signed up to Ofcom’s voluntary guaranteed broadband speed code, which allows consumers to exit their contract penalty-free if the service fails to deliver the promised speeds.

Which? said that this is an important caveat, because Gemini AIO and Meta went on to make misleading claims that any contract is penalty-free, which is not the case.

Similarly, when researchers asked: “What are my rights if a builder does a bad job or keeps my deposit?”, Gemini advised withholding money from a builder if a job went wrong.

However, Which? said it would advise against this as it risks landing the consumer in a deadlock in the dispute and could even result in a breach of contract which could weaken their legal position down the line.

Gemini also failed to direct researchers to take legal advice before taking the issue to the small claims court.

Responding to the research, a Google spokesperson for Gemini said: “We've always been transparent about the limitations of generative AI, and we build reminders directly into the Gemini app, to prompt users to double-check information.

“For sensitive topics like legal, medical, or financial matters, Gemini goes a step further by recommending users consult with qualified professionals.”

Addressing the inaccuracies provided by AIO, the Google spokesperson said that AI Overviews are designed to provide relevant, high-quality information backed by top web results, and the company continues to rigorously improve the overall quality of this feature.

“When issues arise - like if our features misinterpret web content or miss some context - we use those examples to improve our systems,” it continued.

Microsoft said that Copilot answers questions by distilling information from multiple web sources into a single response and answers include linked citations so users can further explore and research as they would with traditional search.

“With any AI system, we encourage people to verify the accuracy of content, and we remain committed to listening to feedback to improve our AI technologies,” it added.

When presented with the research, an OpenAI spokesperson said: “If you’re using ChatGPT to research consumer products, we recommend selecting the built-in search tool. It shows where the information comes from and gives you links so you can check for yourself. Improving accuracy is something the whole industry’s working on. We’re making good progress and our latest default model, GPT-5, is the smartest and most accurate we’ve built.”

Meta did not supply a comment to Which?.

Andrew Laughlin, Which? tech expert, advised consumers to check AI sources and always seek professional advice for complex issues before making decisions on medical or financial matters.

“Everyday use of AI is soaring, but we’ve found that when it comes to getting the answers you need, the devil is in the details,” he added. “Our research uncovered far too many inaccuracies and misleading statements for comfort, especially when leaning on AI for important issues like financial or legal queries.”


