This explains why Falcon 40B outperforms LLaMA 33B on several benchmarks at a broadly similar scale: the gain comes from cleaner data, not more compute.
There is constant confusion in the LLM community about this. Many users download the model weights via transformers and assume they now have the source. They do not: a checkpoint is a set of serialized tensors plus configuration, not the code that defines or trains the model.
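To make the weights-versus-source distinction concrete, here is a minimal sketch that sorts a repository file listing into serialized weights, Python code, and everything else. The function name `split_repo_files`, the suffix sets, and the example file names are all assumptions made for illustration; they are not the actual contents of any Falcon repository.

```python
from pathlib import PurePosixPath

# Extensions that usually hold serialized tensors rather than readable code.
# This set is an assumption for the sketch, not an exhaustive list.
WEIGHT_SUFFIXES = {".bin", ".safetensors", ".pt", ".pth"}
CODE_SUFFIXES = {".py"}


def split_repo_files(filenames):
    """Partition file names into weights, code, and everything else."""
    groups = {"weights": [], "code": [], "other": []}
    for name in filenames:
        suffix = PurePosixPath(name).suffix.lower()
        if suffix in WEIGHT_SUFFIXES:
            groups["weights"].append(name)
        elif suffix in CODE_SUFFIXES:
            groups["code"].append(name)
        else:
            groups["other"].append(name)
    return groups


# Hypothetical listing, for illustration only.
example_files = [
    "config.json",
    "tokenizer.json",
    "model-00001-of-00002.safetensors",
    "model-00002-of-00002.safetensors",
    "modeling_example.py",
]

groups = split_repo_files(example_files)
print("weights:", groups["weights"])
print("code:", groups["code"])
print("other:", groups["other"])
```

Run against a real listing (for example one returned by `list_repo_files` from the `huggingface_hub` package, assuming it is installed), the same split shows how little of a download is readable source: the files under "weights" carry the bulk of the bytes but none of the logic.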
Key resources for exploring the Falcon 40B source code and its implementation include the official model repository.