JUNE 14-15 2026 / DATE-HONEST LANDSCAPE / THE LIMIT NOBODY BENCHMARKS

The models that week bragged about context windows. The number that actually bit was 20.

Every roundup of these two days hands you model cards and a column of context-window specs. None of them mention the limit that decides whether you can even reopen the work you did last week, because that limit lives in your agent client, not in the model. This page gives the date-honest answer first, then the part nobody covers.

Matthew Diakonov

Direct answer, verified 2026-06-16

No major foundation model carries a precise June 14 or June 15, 2026 release date. The nearest dated open releases bracket the window rather than landing on it: DiffusionGemma 26B-A4B from Google on June 10, and GLM-5.2 from Zhipu AI on June 16. Because no platform publishes a list keyed to a calendar day, the only date-honest sources for any 48-hour window are the live feeds: huggingface.co/models sorted by trending, huggingface.co/papers, and github.com/trending.

That is the literal answer. The rest of this page is the thing the spec tables leave out: once you have a model worth testing, the limit that decides whether you can find your own past work is set by your client, not the model.

The landscape, dated honestly

The first half of June 2026 was one of the denser open-weight stretches on record, with the Qwen, DeepSeek, MiniMax, Hunyuan, ERNIE, and GLM families all shipping new versions inside a couple of weeks. But a calendar-day query for June 14-15 lands between the dated drops, not on one. Here is the window with real dates attached.

What is actually dated around June 14-15, 2026

  1. June 10, 2026 - DiffusionGemma 26B-A4B (Google)

     The nearest dated open release before the window. A diffusion-style open-weight model in the Gemma line. Listed on release trackers with a June 10 date, not June 14 or 15.

  2. June 14-15, 2026 - no precise foundation-model date

     No major model carries a hard June 14 or June 15 timestamp on the public trackers. A calendar-day search lands here because the platforms order by trending score, not ship date. The honest answer is the live feeds, not a single name.

  3. June 16, 2026 - GLM-5.2 (Zhipu AI)

     The nearest dated open release after the window, billed as a top frontend-coding model. It bookends the search window from the other side, one day later than the date in the query.

  4. The durable feeds for any 48-hour window

     huggingface.co/models sorted by trending, huggingface.co/papers, and github.com/trending. These carry timestamps you can trust (model-card commit dates, arXiv submission dates), unlike a homepage trending order that has no notion of dates at all.

The lesson is not that nothing shipped. It is that a homepage trending order has no notion of dates, so the trustworthy timestamps live in model-card commit histories and arXiv submission dates, not in a feed that re-sorts itself hour to hour.
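The bracketing logic is easy to reproduce once you have real dates. The sketch below is a minimal illustration, not a tracker client: the `Release` type and the `bracket` function are hypothetical names, and the only data is the three dated releases this page cites. Dates are kept as ISO 8601 strings because those sort chronologically under plain string comparison.

```swift
// Hypothetical record for a dated release; dates are ISO 8601 (YYYY-MM-DD).
struct Release {
    let name: String
    let date: String
}

// The dated releases named on this page.
let releases = [
    Release(name: "MiniMax M3", date: "2026-06-01"),
    Release(name: "DiffusionGemma 26B-A4B", date: "2026-06-10"),
    Release(name: "GLM-5.2", date: "2026-06-16"),
]

// Returns the releases inside the window plus the nearest one on each side.
func bracket(_ releases: [Release], from start: String, through end: String)
    -> (inside: [Release], before: Release?, after: Release?) {
    let sorted = releases.sorted { $0.date < $1.date }
    let inside = sorted.filter { $0.date >= start && $0.date <= end }
    let before = sorted.last { $0.date < start }   // latest release before the window
    let after = sorted.first { $0.date > end }     // earliest release after the window
    return (inside, before, after)
}

let result = bracket(releases, from: "2026-06-14", through: "2026-06-15")
print(result.inside.map(\.name))                                       // []
print(result.before?.name ?? "none", result.after?.name ?? "none")
```

An empty `inside` list with a populated `before` and `after` is exactly the shape of the June 14-15 answer: nothing on the days themselves, one release on either side.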

Two limits people keep confusing

Every model in that wave advertised a context window: how much text it can read inside a single turn. It is a real number and the vendor sets it. But there is a second limit, and it is the one that quietly governs your day: how many of your past conversations the app you run the model inside will keep and let you reopen. The model does not set that one. The client does.

A 1M-token context window is useless for finding the debugging session you ran last Tuesday if your client only lists your 20 most recent chats. That is not a hypothetical. It is exactly the bug Fazm shipped a fix for in the middle of this very window.

The fix that landed June 15: LIMIT 20 to LIMIT 500

Fazm is a native macOS app that wraps Claude Code, Codex, and Gemini as agent-loop backends and keeps every chat persistent, so a closed window reopens with its full history after a Mac restart. The history list that surfaces those chats is backed by a SQL query. On June 15, 2026, that query changed by exactly one number.

ConversationHistorySection.swift, the history query

The conversation history list loaded the 20 most recent chats and stopped. There was no search and no pagination to reach anything older, so a power user's earlier sessions were present in the database but invisible in the UI.

  • SQL ended in LIMIT 20
  • Older chats existed but could not be reached
  • Reported by a power user (credited in-source as Farid Mammadov)

Desktop/Sources/MainWindow/Pages/ConversationHistorySection.swift
        ORDER BY lastMessageDate DESC
-       LIMIT 20
+       LIMIT 500
    """)
+   // 2026-06-15: bumped from LIMIT 20 -> 500. The 20 cap (added
+   // 2026-05-28 alongside the SUBSTR perf fix) silently hid older
+   // conversations from power users with no search/pagination to
+   // reach them. The actual perf storm was string LENGTH (40KB
+   // pasted prompts), already fixed by SUBSTR(...,1,300); row count
+   // is cheap because the list is a LazyVStack that only renders
+   // visible rows. 500 keeps a sane upper bound until proper
+   // search/pagination lands.

The detail that makes this checkable rather than marketing: the comment names why the original cap existed and why removing it is safe. The slowdown that prompted the 20-cap on May 28 was string LENGTH on large pasted prompts, not row count, and it had already been fixed separately with SUBSTR(...,1,300), which loads only the first 300 characters of each preview. So the cap was solving a problem that no longer existed, at the cost of hiding real work.
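To see what the cap did in practice, here is a small in-memory sketch that mimics the shape of the query. It is not Fazm's code: the `Conversation` type, the `historyRows` function, and the sample of 120 chats are assumptions for illustration. Only the ordering, the caps of 20 and 500, and the 300-character preview come from the diff and its comment.

```swift
// Hypothetical in-memory stand-in for a stored conversation row.
struct Conversation {
    let id: Int
    let lastMessageDate: Int   // larger value = more recent (assumed unit)
    let prompt: String
}

// Mirrors ORDER BY lastMessageDate DESC ... LIMIT n with a SUBSTR(...,1,300)-style preview.
func historyRows(_ all: [Conversation], limit: Int, previewLength: Int = 300)
    -> [(id: Int, preview: String)] {
    return all
        .sorted { $0.lastMessageDate > $1.lastMessageDate }
        .prefix(limit)
        .map { (id: $0.id, preview: String($0.prompt.prefix(previewLength))) }
}

// Assumed sample data: 120 chats, each opened with a 40,000-character paste.
let stored = (1...120).map { i in
    Conversation(id: i, lastMessageDate: i, prompt: String(repeating: "x", count: 40_000))
}

let capped = historyRows(stored, limit: 20)
let raised = historyRows(stored, limit: 500)

print("LIMIT 20 hides", stored.count - capped.count, "chats")    // hides 100
print("LIMIT 500 hides", stored.count - raised.count, "chats")   // hides 0
print("longest preview:", raised.map { $0.preview.count }.max() ?? 0, "characters")
```

With 120 stored chats, the 20-row cap leaves 100 of them unreachable while every preview stays at 300 characters either way, which is the point of the comment: the preview cap, not the row cap, bounds the text per row.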

The number, in one line

20 past conversations listed before June 15, 2026
500 past conversations listed after commit 03499909

Why this is the right lens on release-day news

When a model lands, the useful move is not to read the benchmark table and move on; it is to try the model on a real task inside the loop you already use, then compare it against the model you ran last week. That comparison only works if last week's session is still reachable. A context window does not give you that. Retention does, and so does the absence of auto-compacting inside the live window.

That is the whole reason a one-line SQL change deserves a place next to a model launch. The releases are loud and frequent; the thing that decides whether you can actually evaluate them over time is quiet and lives in your client. On June 15 that quiet number went from 20 to 500.

Verify every claim here

  1. For the release dates: open llm-stats.com/ai-news and check that DiffusionGemma 26B-A4B is dated June 10 and GLM-5.2 is dated June 16, with nothing major pinned to June 14 or 15.
  2. For the live landscape: open huggingface.co/papers and github.com/trending. The trending order shifts, but the submission and commit dates are stable.
  3. For the Fazm change: open github.com/mediar-ai/fazm, read Desktop/Sources/MainWindow/Pages/ConversationHistorySection.swift for the LIMIT 500 query and its dated comment, and check the git log for commit 03499909, "Increase conversation history limit to 500," authored June 15, 2026.

Frequently asked questions

What AI model released on June 14 or June 15, 2026?

Nothing major is dated precisely to either day. Release trackers like llm-stats.com show the nearest dated open releases bracketing the window rather than landing on it: DiffusionGemma 26B-A4B from Google on June 10, and GLM-5.2 from Zhipu AI on June 16, billed as a top frontend-coding model. The reason a calendar-day search rarely returns a clean answer is that neither Hugging Face nor GitHub publishes a list keyed to a date. Both order discovery by a rolling trending score with no notion of when something shipped, so the date-honest feeds for any 48-hour window are huggingface.co/models sorted by trending, huggingface.co/papers, and github.com/trending.

Was there a big open-weight wave around mid-June 2026?

Yes, broadly. The first half of June 2026 was one of the denser open-weight stretches on record, with a run of competitive Chinese releases (the Qwen, DeepSeek, MiniMax, Hunyuan, ERNIE, and GLM families all shipping new versions within a couple of weeks). MiniMax M3 landed June 1. GLM-5.2 landed June 16. But none of those carry a hard June 14 or June 15 timestamp, and a vendor's own benchmark numbers are self-reported until an independent harness re-runs them. Treat a fresh release as a thing to test, not a leaderboard fact.

Why do all these releases brag about context window size?

Because context window is the spec that is easy to print and easy to compare: 256K, 512K, 1M tokens. It is a real capability, but it describes how much the model can read inside a single turn. It says nothing about whether your agent client will still have last Tuesday's conversation available to reopen. Those are two different limits. The model's context window is set by the vendor; the session-retention limit is set by the app you run the model inside, and it is the one that quietly decides whether your past work is reachable.

What is the retention limit this page is about?

It is how many of your past conversations an agent client keeps in its history list and can restore. On June 15, 2026, Fazm raised that from 20 to 500. The change is a one-line edit to the SQL query in Desktop/Sources/MainWindow/Pages/ConversationHistorySection.swift, from LIMIT 20 to LIMIT 500, in commit 03499909. The inline comment records why: the 20 cap had been added on May 28 alongside a string-length performance fix, and it silently hid older chats from a power user (credited in the comment as Farid Mammadov) who had no search or pagination to reach them.
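The comment calls 500 a stopgap until search and pagination land. The source does not say how Fazm plans to paginate, so the following is only a generic sketch of offset pagination over an already-sorted list; the `page` helper and the 45-chat sample are hypothetical. In SQLite the same idea is usually written as `LIMIT size OFFSET index * size`.

```swift
// Hypothetical helper: returns one page of an already-sorted history list.
func page<T>(_ rows: [T], size: Int, index: Int) -> [T] {
    guard size > 0, index >= 0 else { return [] }
    let start = index * size
    guard start < rows.count else { return [] }          // past the last page
    return Array(rows[start..<min(start + size, rows.count)])
}

let ids = Array(1...45)   // stand-in for 45 chats, newest first

print(page(ids, size: 20, index: 0).count)   // 20
print(page(ids, size: 20, index: 2))         // [41, 42, 43, 44, 45]
print(page(ids, size: 20, index: 3))         // []
```

The difference from a fixed cap is that every row stays reachable: the list loads a bounded page at a time instead of stopping at the first one.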

Did raising the limit to 500 hurt performance?

No, and the commit comment explains the reasoning rather than guessing. The performance storm that prompted the original cap was not row count; it was string LENGTH on large pasted prompts (40KB and up), and that was already fixed separately with SUBSTR(...,1,300), which loads only the first 300 characters of each preview. Row count is cheap because the history list is a SwiftUI LazyVStack that renders only the rows currently on screen. So 500 is a sane upper bound that stays until proper search and pagination land, without reintroducing the slowdown.
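A rough piece of arithmetic shows why the preview cap matters more than the row cap. This is a back-of-the-envelope sketch, not a measurement: `previewLoad` is a hypothetical helper, it treats 40KB as 40,000 characters, and it assumes the worst case where every row starts from a paste that large.

```swift
// Worst-case characters of preview text pulled by one history load.
// previewCap == nil models a query with no SUBSTR-style truncation.
func previewLoad(rows: Int, sourceLength: Int, previewCap: Int?) -> Int {
    let perRow = previewCap.map { min($0, sourceLength) } ?? sourceLength
    return rows * perRow
}

// 20 rows of untruncated 40,000-character text.
let untruncated = previewLoad(rows: 20, sourceLength: 40_000, previewCap: nil)
// 500 rows, each truncated to 300 characters.
let truncated = previewLoad(rows: 500, sourceLength: 40_000, previewCap: 300)

print(untruncated)   // 800000
print(truncated)     // 150000
```

Under those assumptions, 500 truncated rows carry less text than 20 untruncated ones, which is consistent with the comment's claim that string length, not row count, was the expensive part.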

How is conversation retention different from auto-compacting?

Auto-compacting happens inside a single live chat: the tool silently summarizes or drops earlier turns to keep the running context under a budget, which can erase decisions you made an hour ago. Retention is about chats you already closed: how many of them the app still lists and lets you reopen at all. Fazm addresses both. It does not auto-compact a live window, so the full history stays in context for the window's lifetime, and as of June 15 it lists up to 500 prior conversations instead of 20. A new model with a 1M-token window does not give you either property; the client does.
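To make the auto-compacting half concrete, here is one simple policy a tool could use: drop the oldest turns until the chat fits a token budget. This is an assumed policy for illustration, not a description of how any particular tool compacts; the `Turn` type, the `compact` function, and the token counts are all hypothetical.

```swift
// Hypothetical turn in a live chat; `tokens` is an assumed per-turn cost.
struct Turn {
    let text: String
    let tokens: Int
}

// Assumed policy: discard the oldest turns until the total fits the budget.
func compact(_ turns: [Turn], budget: Int) -> [Turn] {
    var kept = turns
    var total = kept.reduce(0) { $0 + $1.tokens }
    while total > budget, let oldest = kept.first {
        total -= oldest.tokens
        kept.removeFirst()
    }
    return kept
}

let chat = [
    Turn(text: "Decision: keep the 300-character preview", tokens: 600),
    Turn(text: "Long pasted log", tokens: 900),
    Turn(text: "Follow-up question", tokens: 300),
]

let compacted = compact(chat, budget: 1_000)
print(compacted.map(\.text))   // ["Follow-up question"]
```

The earliest turn, the one holding the decision, is the first to go. Retention is a separate question: it only concerns whether the closed chat still appears in the history list at all.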

How do I verify the Fazm change myself?

Open github.com/mediar-ai/fazm and read Desktop/Sources/MainWindow/Pages/ConversationHistorySection.swift. The query near the conversation summary load reads LIMIT 500, with a dated comment explaining the bump from 20. The git log shows commit 03499909, titled "Increase conversation history limit to 500," authored June 15, 2026. The changelog entry that followed (commit f9be2fd0) records the same fix. Everything is public, so you do not have to take any of this on faith.

Where should I track model and paper releases day to day?

Three feeds plus one habit. huggingface.co/models sorted by trending for weights and quantized variants. huggingface.co/papers for research that links an implementation. github.com/trending for the application layer of agents, MCP servers, and inference engines. The habit: keep the model you actually want to evaluate wired into the agent loop you use every day, with a session history deep enough that you can reopen the comparison you ran last week rather than starting over.
