We've been incredibly fortunate with how things have developed over the past year
I still remember how in late 2023, people were speculating that Mixtral-8x7b was the best open-weights model that the community would get "for a long time", and possibly ever. Shortly afterwards, Mistral published a controversial blog post that appeared to indicate that they were moving away from open weights – an ominous sign at a time when there were very few open-weights models available, and Anthropic and OpenAI seemed as far out of reach as the stars.
But since then:
- Meta released the excellent Llama 3 series as open weights (though not entirely free software).
- Contrary to what many had feared, Mistral continued to publish open-weights models. They even released the weights for Mistral Large, which was previously API-only, and have now published their latest Mistral Small under the Apache License, whereas the previous version was still under their proprietary MRL.
- Yi-34b transitioned from a proprietary license to Apache.
- Microsoft has been publishing a number of excellent small models under permissive licenses.
- Qwen came out of nowhere, and released the best models that can be run on consumer hardware, almost all of them under permissive licenses.
- DeepSeek upended the entire industry, and an MIT-licensed model is now ranked joint #1 on style-controlled LMSYS, on par with cutting-edge, proprietary, API-only models.
This was completely unforeseeable a year ago. Reality has outpaced the wildest dreams of the most naive optimists. Some doomsayers even predicted that open-weights models would soon be outlawed. The exact opposite has happened, and continues to happen.
To get an idea of what could easily have been, just look at the world of image generation models. In 15 months, there have only been two significant open-weights releases: SD3, and Flux.1D. SD3 was mired in controversy due to Stability's behavior and has been all but ignored by the community, and Flux is crippled by distillation. Both models are censored to a degree that has become the stuff of memes, and their licenses essentially make them unusable for anything except horsing around.
That is how the LLM world could have turned out. Instead, we have a world where I don't even download every new model anymore, because there are multiple exciting releases every week and I simply lack the time to take all of them for a spin. I now regularly delete models from my hard drive that I would have given my right hand for not too long ago. It's just incredible.