#deepresearch


"Ordinary users don’t want to learn about the relative strengths and weaknesses of various products like Operator and Deep Research. They just want to ask ChatGPT a question and have it figure out the best way to answer it.

It’s a promising idea, but how well does it work in practice? On Friday, I asked ChatGPT Agent to perform four real-world tasks for me: buying groceries, purchasing a light bulb, planning an itinerary, and filtering a spreadsheet.

I found that ChatGPT Agent is dramatically better than its predecessor at grocery shopping. But it still made mistakes at this task. More broadly, the agent is nowhere close to the level of reliability required for me to really trust it.

And as a result I doubt that this iteration of computer-use technology will get a lot of use. Because an agent that frequently does the wrong thing is often worse than useless."

understandingai.org/p/chatgpt-

Understanding AI · ChatGPT Agent: a big improvement but still not very useful, by Timothy B. Lee

"Kimi-Researcher:
End-to-End RL Training for Emerging Agentic Capabilities"

Kimi-Researcher is an agentic thinking model that can do multi-step planning, reasoning, and tool use. It uses 3 main tools: a parallel, real-time internal search tool; a text-based browser tool for web tasks; and a coding tool for code execution. Kimi-Researcher was trained entirely through end-to-end agentic reinforcement learning.
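Out of curiosity about what such an agent loop looks like in practice, here is a minimal, hypothetical Python sketch of a plan-act-observe loop with search, browser, and code-execution tools. Every name in it (call_model, the tool functions, the dict-based action format) is an assumption for illustration, not Kimi-Researcher's actual interface; the real system would put an RL-trained model behind call_model and optimize it end to end on the final result.

```python
# Minimal sketch of a plan -> act -> observe agent loop with three tool types,
# loosely mirroring the search / browser / code-execution setup described above.
# All names here are hypothetical placeholders, not Kimi-Researcher's real API.
import subprocess
import sys


def search_tool(query: str) -> str:
    """Placeholder for a real-time search backend."""
    return f"[search results for: {query}]"


def browser_tool(url: str) -> str:
    """Placeholder for a text-based browser fetch."""
    return f"[page text from: {url}]"


def code_tool(source: str) -> str:
    """Run a short Python snippet in a subprocess and capture its output."""
    result = subprocess.run([sys.executable, "-c", source],
                            capture_output=True, text=True, timeout=30)
    return result.stdout or result.stderr


TOOLS = {"search": search_tool, "browser": browser_tool, "code": code_tool}


def call_model(history: list[dict]) -> dict:
    """Stub for the reasoning model; a real agent would call an LLM here."""
    # After one tool observation, pretend the model is ready to answer.
    if any(turn["role"] == "tool" for turn in history):
        return {"type": "answer", "text": "Draft report based on gathered sources."}
    return {"type": "tool", "name": "search", "input": history[0]["content"]}


def run_agent(task: str, max_steps: int = 10) -> str:
    """Loop until the model emits a final answer or the step budget runs out."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        action = call_model(history)
        if action["type"] == "answer":
            return action["text"]
        observation = TOOLS[action["name"]](action["input"])
        history.append({"role": "tool", "content": observation})
    return "Stopped: step budget exhausted."


if __name__ == "__main__":
    print(run_agent("Survey recent work on agentic RL training."))
```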

OpenAI's ChatGPT Deep Research report generator now lets you save your reports as nicely formatted PDF files with inline images and tables. It still has an issue: the references at the end of the report are not in proper bibliographic format for academic settings.
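If you need those references in an academic format, one stopgap is to convert them to BibTeX yourself. This is purely a sketch: it assumes the exported references are simple "Title - URL" lines, which may not match what the report actually contains, so adjust the parsing accordingly.

```python
# Sketch only: turn plain "Title - URL" reference lines into rough BibTeX
# @misc entries. The input format is an assumption, not ChatGPT's actual output.
import re


def to_bibtex(lines: list[str]) -> str:
    entries = []
    for i, line in enumerate(lines, start=1):
        title, _, url = line.partition(" - ")
        # Build a crude citation key from the first word of the title.
        key = re.sub(r"\W+", "", title.split()[0].lower()) + str(i)
        entry = (
            "@misc{" + key + ",\n"
            "  title = {" + title.strip() + "},\n"
            "  howpublished = {\\url{" + url.strip() + "}}\n"
            "}"
        )
        entries.append(entry)
    return "\n\n".join(entries)


print(to_bibtex(["Attention Is All You Need - https://arxiv.org/abs/1706.03762"]))
```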

You can now connect your Box and #Dropbox accounts to #DeepResearch on #ChatGPT and pull in data for the #AI to use in its #research. Please do not do this: you will add to the unending pool of data being sucked out of creators and repurposed for profit, and not your profit or benefit. #copyright #KM #research #legalresearch #education #intellectualproperty bleepingcomputer.com/news/arti

I am having trouble generating deep research reports from the AI2 Scholar QA tool because they time out at 240 seconds. So I have to add a clause to my prompt: "Generate only 5 sections in your report so that you do not time out on writing a complete report." Otherwise, the tool tries to generate a report with six to eight sections and aborts with a timeout error that does not save any of the generated report text or references.
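For what it's worth, the same workaround can be scripted: retry with the section-limiting clause whenever the call times out. This is only a sketch; generate_report is a hypothetical stand-in for however Scholar QA is actually called, and the 240-second limit comes from the behavior described above.

```python
# Hypothetical sketch of the workaround above: retry with a 5-section cap
# when the (assumed) 240-second limit is hit. generate_report() is a
# placeholder that simulates the timeout so the retry path can be exercised.
SECTION_CAP_CLAUSE = (
    "Generate only 5 sections in your report "
    "so that you do not time out on writing a complete report."
)


def generate_report(prompt: str, timeout_s: int = 240) -> str:
    """Placeholder for the real Scholar QA call; simulates the observed timeout."""
    if SECTION_CAP_CLAUSE not in prompt:
        raise TimeoutError(f"report generation exceeded {timeout_s} seconds")
    return "[5-section report]"


def robust_report(question: str) -> str:
    try:
        return generate_report(question)
    except TimeoutError:
        # Fall back to the explicit section cap from the prompt clause above.
        return generate_report(f"{question}\n\n{SECTION_CAP_CLAUSE}")


print(robust_report("State of the art in retrieval-augmented generation"))
```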

AIのDeep Researchを使い比べ--ChatGPT、Gemini、Perplexity、Grokの違いは?
(Comparing AI Deep Research--What's the difference between ChatGPT, Gemini, Perplexity, and Grok?)

Nice recent article comparing several tools for deep research. The article is in Japanese, so open it in a browser such as Chrome that can auto-translate each page for you (right-click and choose the translate option).

news.yahoo.co.jp/articles/ac54

I've been exploring this deep research tool mentioned by Tom Dörr on X. The tool, available at research.u14.app, has many configuration options for how you set it up. It also offers more interactive features during report generation that you can use to steer the direction of the report, ask it to generate more output, and so on. It is from the repo: github.com/u14app/deep-researc
This would be good for exploratory research.

I'm looking at various AI deep research tools, and I'm finding that what is just as valuable, or sometimes far more valuable, than the actual report is the ability of tools like Genspark to show how they reasoned and which articles they "read" in producing their final report. These reasoning breadcrumbs are great for exploratory search and lateral thinking.

"DeepSeek-R1 Thoughtology:
Let’s <think> about LLM reasoning"

Interesting, very long paper about how reasoning works in DeepSeek-R1. One finding is that enhanced reasoning creates a dual-use risk: better capabilities but worse safety. Future models might move away from this model's single chain of reasoning toward diverse reasoning strategies to enhance problem-solving flexibility.

mcgill-nlp.github.io/thoughtol

McGill NLP · DeepSeek-R1 Thoughtology: Let's <think> about LLM reasoning

Large Reasoning Models like DeepSeek-R1 mark a fundamental shift in how LLMs approach complex problems. Instead of directly producing an answer for a given input, DeepSeek-R1 creates detailed multi-step reasoning chains, seemingly "thinking" about a problem before providing an answer. This reasoning process is publicly available to the user, creating endless opportunities for studying the reasoning behaviour of the model and opening up the field of Thoughtology. Starting from a taxonomy of DeepSeek-R1's basic building blocks of reasoning, our analyses on DeepSeek-R1 investigate the impact and controllability of thought length, management of long or confusing contexts, cultural and safety concerns, and the status of DeepSeek-R1 vis-à-vis cognitive phenomena, such as human-like language processing and world modelling. Our findings paint a nuanced picture. Notably, we show DeepSeek-R1 has a 'sweet spot' of reasoning, where extra inference time can impair model performance. Furthermore, we find a tendency for DeepSeek-R1 to persistently ruminate on previously explored problem formulations, obstructing further exploration. We also note strong safety vulnerabilities of DeepSeek-R1 compared to its non-reasoning counterpart, which can also compromise safety-aligned LLMs.

I ran a little experiment with the free "Deep Research" feature of perplexity.ai on the question of "State of the Art in Generative Retrieval-Augmented Models in Information Retrieval."

The generated report is shown below. It consulted 71 academic sources, but only one source was listed in the report. Also, I can browse the list of 71 sources, but there is no way to export it, and it does not appear in the Perplexity Page below:

perplexity.ai/page/state-of-th