AI Brainstorming Sameness: Why ChatGPT and

What’s Flint and what changed?

Flint is a new large language model built by Australian startup Springboards to produce more varied responses than mainstream AI tools.

Most LLMs, including ChatGPT and Claude, converge on similar answers to open-ended questions. A research paper titled “Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)” won the best paper award at NeurIPS, a major AI conference, after finding that 25 different models, each asked 50 times to write a metaphor about time (1,250 responses in all), mostly returned “Time is a river” or “Time is a weaver.” Flint was built on Alibaba’s open-source Qwen 3 model specifically to break that pattern.

Flint represents a deliberate engineering choice to prioritize variety over predictability in AI output.

What’s the evidence behind Flint?

Asked for a random number between 1 and 10, ChatGPT and Claude both returned 7. Flint returned 3.7916. For a New Balance running shoe tagline, both mainstream models said “Run your way.” Flint said “Built to last, run to win.”

The pattern shows up in naming tasks too. When testers asked mainstream models for band names, the suggestions leaned on the same handful of words (“glass,” “neon,” “velvet,” “static”) so often that one AI-suggested name, Sofa Astronauts, turned out to already belong to a real band. In a finance company reinvention exercise, 3 mainstream models all suggested teaching financial literacy in a “fun and funky way,” while Flint proposed rebranding the entire concept of wealth accumulation instead.

The evidence shows Flint consistently diverges from mainstream LLM consensus, though it remains an early prototype.

Springboards did not solve this with the obvious lever. Most models expose a “temperature” setting that adds randomness, and turning it up is the standard advice for more creative output. Springboards found that cranking temperature to maximum on a mainstream model produced incoherent responses that switched from English into code mid-sentence. So the team trained Flint to identify the specific points in a response where more variety is possible, such as the exact word naming a destination or a material, and to add randomness only there rather than across the entire output.
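To make the difference between the two approaches concrete, here is a minimal toy sketch in Python. It is not Springboards' training method, which the source does not describe in detail. It only illustrates the general idea of temperature sampling and what changes when randomness is raised at selected positions instead of everywhere. The word lists, scores, the `is_open` flag, and every function name below are hypothetical and chosen for illustration.

```python
import math
import random


def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_word(candidates, scores, temperature, rng):
    """Pick one candidate word according to temperature-adjusted probabilities."""
    probs = softmax_with_temperature(scores, temperature)
    return rng.choices(candidates, weights=probs, k=1)[0]


def generate(slots, base_temperature, open_temperature, rng):
    """Build a sentence slot by slot.

    Each slot is (candidates, scores, is_open). Slots flagged is_open use
    open_temperature; all other slots use base_temperature.
    """
    words = []
    for candidates, scores, is_open in slots:
        temperature = open_temperature if is_open else base_temperature
        words.append(sample_word(candidates, scores, temperature, rng))
    return " ".join(words)


# Hypothetical toy "model": three structural slots with one sensible word and
# one code-like stray token, then one slot where several words would work.
SLOTS = [
    (["Time", "def"], [6.0, 0.0], False),
    (["is", "return"], [6.0, 0.0], False),
    (["a", "{"], [6.0, 0.0], False),
    (["river", "weaver", "tide", "lantern"], [3.0, 2.5, 1.0, 0.5], True),
]

rng = random.Random(0)

# Uniformly high temperature: every slot gets noisier, including the
# structural ones, so stray tokens can leak into the sentence.
uniform_hot = [generate(SLOTS, 5.0, 5.0, rng) for _ in range(5)]

# Selective temperature: structure stays near-deterministic, and only the
# flagged slot is sampled with extra randomness.
selective = [generate(SLOTS, 0.2, 2.0, rng) for _ in range(5)]

for line in uniform_hot:
    print("uniform  :", line)
for line in selective:
    print("selective:", line)
```

In this toy setup, the selective run keeps "Time is a" intact and varies only the final word, while the uniform run can swap structural words for code-like tokens. A real model would have to learn which positions are open, which is the part the article attributes to Flint's training.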

How does Flint affect day-to-day operations for small businesses?

Small businesses using AI for marketing and strategy currently receive outputs that closely resemble their competitors’, because most models are trained in similar ways on similar data to do similar tasks.

Springboards is positioning Flint as an alternative model inside its broader brainstorming tool for creative professionals in advertising and marketing. Marketing executives at Uncommon and Bodacious report using Flint to push their thinking in new directions, though both caution it does not work every time and needs a human filtering the output. Uncommon’s chief strategy officer put it plainly: tools like Flint should widen the field of ideas, not replace the judgment call on which idea actually ships. He specifically warns against a team leaning on any AI output, Flint included, instead of thinking through the problem themselves.

Flint is not yet reliable enough for production use, but it exposes a genuine operational risk: AI tools that converge on identical outputs erase competitive differentiation.

You sit down to name your new product line. You fire up ChatGPT. It gives you “Glass Horizon,” “Neon Hearts,” “Velvet Echo.” You try Claude. It serves up “Static Empire” and “Sofa Astronauts,” which turns out to be a real band already. You’ve spent 45 minutes and have a page of names that sound like they came from the same generator every other founder in your space is using, because they did. The meter’s running on your subscription, your deadline’s approaching, and your brand sounds like a gray paste of every other AI-assisted launch that week. The convergence of 1,250 responses on 2 metaphors in the NeurIPS study is not a quirk. It’s a tax on originality that compounds every time you accept the first output instead of pushing for a second or third opinion the model was never going to volunteer on its own.

What’s the final verdict on Flint?

Flint is an early-stage prototype that demonstrably diverges from mainstream models but is not yet reliable enough for small business deployment.

Marketing executives at Uncommon and Bodacious confirm useful brainstorming applications, with the explicit caveat that the model “sometimes falls over when pushed too far.” OpenAI has responded to the broader convergence research by noting that training models for reliable, coherent answers can push them toward familiar, high-probability responses, and that pushing harder for novelty risks weaker or less reliable output, a tradeoff the major vendors have not resolved.

Small business owners should monitor Flint’s development but should not alter their current AI stack until the tool matures well past its current prototype stage.

The bigger takeaway travels past Flint itself. If a marketing team, a naming exercise, or a strategy session leans on a single mainstream model without a second pass, the output is statistically likely to resemble what a competitor pulled from the same tool last week. That is not a reason to abandon AI-assisted brainstorming. It is a reason to treat the first answer as a starting point instead of a final draft, and to build in a deliberate step where a second model, a human editor, or a tool built specifically for variety gets a look before anything ships.
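One way to make that second pass routine is a quick overlap check across sources before anyone shortlists ideas. The sketch below, in Python, is an illustration and not a feature of Flint or any other tool. The function names and source labels are hypothetical, and apart from “Run your way,” which the article reports from both mainstream models, the sample taglines are made-up placeholders.

```python
import re


def normalize(idea):
    """Lowercase and drop punctuation so near-identical phrasings compare equal."""
    return " ".join(re.findall(r"[a-z0-9]+", idea.lower()))


def flag_shared_ideas(ideas_by_source):
    """Return each normalized idea that more than one source suggested."""
    seen = {}
    for source, ideas in ideas_by_source.items():
        for idea in ideas:
            seen.setdefault(normalize(idea), set()).add(source)
    return {
        idea: sorted(sources)
        for idea, sources in seen.items()
        if len(sources) > 1
    }


# Hypothetical brainstorm results from two tools.
candidates = {
    "model_a": ["Run your way.", "Built for the long road"],
    "model_b": ["Run Your Way", "Miles ahead"],
}

shared = flag_shared_ideas(candidates)
for idea, sources in shared.items():
    print(f"'{idea}' came from {', '.join(sources)}; treat it as a likely cliche")
```

An exact-match check like this only catches the most obvious duplicates. It does not replace the human editor the article recommends, who still has to judge ideas that are different in wording but identical in spirit.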

Source: MIT Technology Review