Discovering Hidden Gems in Model Repositories
Abstract
Public repositories host millions of fine-tuned models, yet community usage remains disproportionately concentrated on a small number of foundation checkpoints. We investigate whether this concentration reflects efficient market selection or whether superior models are systematically overlooked. Through an extensive evaluation of over 2,000 models, we show the prevalence of “hidden gems”: unpopular fine-tunes that significantly outperform their popular counterparts. Notably, within the Llama-3.1-8B family, we find rarely downloaded checkpoints that improve GSM8K accuracy from 83.2% to 96.0% without increasing inference costs. However, exhaustively evaluating every uploaded model to discover such gems is computationally infeasible. We therefore formulate model discovery as a Multi-Armed Bandit problem and accelerate the Sequential Halving search algorithm with shared query sets and aggressive elimination schedules. Our method retrieves top models with as few as 50 queries per candidate, accelerating discovery by over 50×.
Repository Inefficiency
Cumulative Downloads Distribution: Usage is extremely concentrated in a tiny fraction of top models; fewer than 0.15% of models account for more than 95% of all downloads.
Percentage of Unused Models: The vast majority of models are rarely explored, reinforcing the concept of repository inefficiency where many potentially valuable models remain undiscovered.
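The concentration statistic above is easy to reproduce from repository metadata. The sketch below computes what fraction of models is needed to cover a given share of total downloads; `concentration` and the toy download counts are illustrative, not the paper's data (real counts would come from repository metadata such as the Hugging Face Hub).

```python
def concentration(downloads, share=0.95):
    """Fraction of models (most-downloaded first) needed to cover
    `share` of all downloads. Illustrative helper, not from the paper."""
    totals = sorted(downloads, reverse=True)
    target = share * sum(totals)
    running, k = 0, 0
    for d in totals:
        running += d
        k += 1
        if running >= target:
            break
    return k / len(totals)

# Toy example: one blockbuster model and 999 rarely used ones.
downloads = [1_000_000] + [10] * 999
print(f"{concentration(downloads):.2%} of models cover 95% of downloads")
# → 0.10% of models cover 95% of downloads
```

With a heavy-tailed distribution like this, a tiny head of models absorbs almost all usage, which is exactly the inefficiency the figures illustrate.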
Detecting "Hidden Gems"
Information Asymmetry: While most downloads are concentrated in a small subset of models, we believe this popularity stems from information asymmetry: users cannot evaluate every model to find the best one available, and therefore default to popular, well-tested, but sub-optimal base versions.
Hidden Gems Discovery: To test this hypothesis, we evaluate over 2,000 models from 4 popular model trees on diverse tasks. The evaluation validates our hypothesis by revealing the existence of "Hidden Gems": highly unpopular models that significantly outperform their popular counterparts.
Model Trees Evaluated: Llama 3 8B, Mistral 7B, Qwen 3B, Qwen 7B.
Failure of Heuristics: Why are these models missed? Over 90% of identified gems lacked performance documentation relevant to their specific strengths, leaving users with no signal to identify them. Moreover, these gems do not cluster along predictable trajectories, implying that simple search heuristics based on popularity or structural properties are likely to fail.
Efficient Model Discovery
Exhaustively evaluating every model is infeasible. Instead, we view model discovery as a Multi-Armed Bandit (MAB) problem and propose an approach that accelerates the Sequential Halving algorithm using shared query sets and aggressive early elimination. This enables us to efficiently find high-performing candidates with only 50 queries per model, expediting the search by over 50×. Crucially, our method consistently outperforms standard MAB baselines and the original Sequential Halving algorithm itself.
Method Overview: Our accelerated bandit-based search iteratively evaluates models on small query batches and eliminates the lowest-performing models at each round. This enables efficient discovery of top-performing models without exhaustive evaluation.
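The round-based procedure above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `evaluate(model, query)` is a hypothetical scoring function standing in for running a model on a benchmark question, and the batch size and elimination fraction are placeholder settings.

```python
import math
import random

def sequential_halving_search(models, queries, evaluate,
                              batch_size=10, keep_frac=0.5):
    """Sequential Halving-style search with a shared query set:
    every surviving model answers the same queries, and the
    lowest-scoring fraction is eliminated after each round."""
    queries = list(queries)
    random.shuffle(queries)
    scores = {m: 0.0 for m in models}
    answered = 0
    survivors = list(models)
    while len(survivors) > 1 and answered < len(queries):
        batch = queries[answered:answered + batch_size]
        answered += len(batch)
        for m in survivors:
            # Shared query set: all survivors see the SAME batch,
            # so their cumulative scores stay directly comparable.
            scores[m] += sum(evaluate(m, q) for q in batch)
        survivors.sort(key=lambda m: scores[m], reverse=True)
        survivors = survivors[:max(1, math.ceil(len(survivors) * keep_frac))]
    return survivors[0]
```

Because weak candidates are dropped after only a few shared batches, the total query budget is spent almost entirely on the strongest models.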
Baseline Comparison: We compare our method against standard MAB algorithms (UCB, TTTS, Successive Rejects) and the Sequential Halving baseline. "Best Base" refers to selecting the most popular official model.
Superior Efficiency: Our method consistently achieves the best rank and accuracy across all model trees, finding near-optimal models with only 50-100 queries per candidate.
Practical Impact: At just 50 queries per model, our method retrieves models ranked in the top 3 out of hundreds of candidates.
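For context, a standard MAB baseline of the kind compared against can be sketched as a UCB1 loop over models. This is a generic textbook UCB1, not the paper's baseline implementation; `evaluate(model)` is a hypothetical per-query score in [0, 1].

```python
import math

def ucb1(models, evaluate, budget):
    """Generic UCB1: repeatedly query the model with the highest
    upper confidence bound, then recommend the best empirical mean."""
    counts = {m: 0 for m in models}
    sums = {m: 0.0 for m in models}
    # Query each model once to initialize its estimate.
    for m in models:
        sums[m] += evaluate(m)
        counts[m] = 1
    for t in range(len(models) + 1, budget + 1):
        def ucb(m):
            bonus = math.sqrt(2.0 * math.log(t) / counts[m])
            return sums[m] / counts[m] + bonus
        m = max(models, key=ucb)
        sums[m] += evaluate(m)
        counts[m] += 1
    return max(models, key=lambda m: sums[m] / counts[m])
```

Unlike the round-based halving search, UCB never permanently eliminates a candidate, so part of the budget keeps revisiting weak models; this is one intuition for why aggressive elimination reaches top-ranked models with fewer queries per candidate.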
BibTeX
@misc{kahana2026discoveringhiddengemsmodel,
  title={Discovering Hidden Gems in Model Repositories},
  author={Jonathan Kahana and Eliahu Horwitz and Yedid Hoshen},
  year={2026},
  eprint={2601.22157},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2601.22157}
}