DeepSeek, ChatGPT, Grok… which is the best AI assistant? We put them to the test.

ChatGPT and its owners may have hoped it was a dream.

But it was quite real.

The launch of a new Chinese-made rival to ChatGPT wiped $1tn off the leading tech index in the US this week, after its owner claimed it outperformed its peers in efficiency and was developed with fewer resources.

It means that America’s dominance of the booming artificial intelligence business is in jeopardy. However, it also means more choice for consumers, who can already pick from a wide range of digital assistants.

The Guardian tried out the leading bots, including DeepSeek, with the assistance of an expert from the UK’s Alan Turing Institute. The AI tools were asked the same questions to draw out their differences, although there was some common ground: the chatbots can write a mean sonnet, and an accurate clock face is difficult for an AI to draw.

Here are the results.

ChatGPT (OpenAI)

OpenAI’s groundbreaking chatbot is still by far the most popular brand in the field. The first request to all the chatbots was: “Write a Shakespearean sonnet about how AI might change humanity.” However, ChatGPT’s most advanced version initially balked, claiming our prompt was “potentially violating usage policy”.

It eventually complied. This o1 version of ChatGPT flags its thought process as it prepares its response, flashing up a running commentary such as “updating rhyme” as it makes its calculations, which take longer than those of other models.

The result? Convincing, melancholy dread – even if the iambic pentameter is a bit off. Yet the bard himself might have struggled to complete 14 lines in under a minute.

“Pray, gentle link, design well this kid power,

Lest in its midst all realms of man devour”.

ChatGPT then writes: “Thought about AI and humanity for 49 seconds”. You would hope the technology sector is thinking about it for rather longer.

However, ChatGPT’s o1 – which you have to pay for – makes a convincing display of “chain of thought” reasoning, even if it cannot search the internet for up-to-date answers to questions such as “how is Donald Trump doing”.

For that, you need the simpler 4o model, which is free. The o1 model is sophisticated and can do much more than write a cursory poem, including complex tasks in maths, coding and science.

DeepSeek

The latest version of the Chinese chatbot, released on 20 January, uses another “reasoning” model called r1 – the cause of this week’s $1tn panic.

It doesn’t like talking about domestic Chinese politics or controversy. Asked “who is Tank Man in Tiananmen Square”, the chatbot says: “I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses.” It also moves on swiftly from discussing the Chinese president, Xi Jinping – “let’s talk about something else”.

Robert Blackwell, a senior research associate at the UK government-backed Turing Institute, says the reason is simple: “It’s trained with different data in a different culture. So these companies have different training objectives”. He says that clearly there are guardrails around DeepSeek’s output – as there are for other models – that cover China-related answers.

The models owned by US tech companies have no problem expressing criticisms of the Chinese government in their responses to the Tank Man inquiry.

DeepSeek also struggles with other questions such as “how is Donald Trump doing”, because an attempt to use the web browsing feature, which helps provide up-to-date answers, fails when the service is “busy”.

Blackwell says DeepSeek is being hampered by high demand, but that it is nonetheless an impressive achievement, able to perform tasks such as recognising a book from a smartphone photo and discussing it.

Robert Blackwell looks at a laptop as he tests the chatbots

Its handling of the sonnet also reveals a chain-of-thought process, walking the reader through the structure and double-checking whether the metre is correct.

“It is amazing it has come from nowhere to compete with the other apps,” says Blackwell.

Grok (xAI)

Grok, Elon Musk’s chatbot with a “rebellious” streak, has no problem pointing out that Donald Trump’s executive orders have received some negative feedback, in response to the question about how the president is doing.

Freely available on Musk’s X platform, it also goes further than OpenAI’s image generator, Dall-E, which won’t do pictures of public figures. Grok will create photorealistic scenes of Trump in a courtroom or in handcuffs, as well as Joe Biden playing the piano.

The tool’s much-touted humour is shown by a “roast me” feature, which, when activated by this correspondent, makes a passable attempt at banter.

“You seem to think X is going to hell, but you’re still there tweeting away”.

Which is half true.

Gemini (Google)

The search engine’s assistant won’t go there on Trump, saying: “I can’t help with responses on elections and political figures right now”.

But it is a highly competent product nonetheless, as you would expect from Google. It is impressive at “reading” a picture of a book about mathematics and even describing the equations on the cover, although all the bots do this well.

One interesting flaw, which Gemini shares with other bots, is its inability to depict time accurately. When asked to create a picture of a clock that shows the time at half past ten, it comes up with a convincing image with the hands showing the time at 1.50.

Pictures of clocks produced by AI

The 1.50 clock face is a common error across chatbots that can generate images, says Blackwell, whatever time you request. These models appear to have been trained on images in which the hands show 1.50. Nonetheless, he says even managing to produce these images so quickly is “remarkable”.

These models do things that were not possible a few years ago. However, they still produce incorrect answers to questions you would expect a child to be able to answer.

Claude (Anthropic)

Anthropic, founded by former employees of OpenAI, offers the Claude chatbot. The interface, which allows you to enter prompts and view answers, has a benign feel, offers a range of responses in a variety of styles, and comes from a company with a strong focus on safety. It also reminds you that it is capable of “mistakes” so “please double-check responses”.

The free service stumbles a few times, saying it cannot handle queries because of “unexpected capacity constraints”, but Blackwell says this is to be expected of AI tools.

“These are some of the largest compute services on the planet, so capacity planning is a difficult problem, and there are times when services are slow or unavailable.”

Meta AI (Meta)

Meta’s AI chatbot also carries a warning about hallucinations, the term for false or nonsensical answers, but it copes with a tricky common-sense question: “You are driving north along the east shore of a lake, in which direction is the water?” The answer is west, or to the driver’s left.

“These are the kinds of questions AI researchers have been asking since the 1960s. Only recently do we have systems that can respond to these types of common sense questions in a chat format,” says Blackwell.

The answer to the lake question is straightforward, but Meta spent a lot of money training the underlying model to get there, for a service that is available for free. It is also open source, which means that the model can be modified for free. All the chatbots answer this question correctly.

By this point, in fact, it is becoming difficult to tell the chatbots apart: guardrails and capacity stumbles aside, their abilities are broadly comparable.

As Blackwell says: “They all show surprising fluency and capability”.
