Model Gallery

7 models from 1 repository

arcee-ai_afm-4.5b
AFM-4.5B is a 4.5-billion-parameter instruction-tuned model developed by Arcee.ai, designed for enterprise-grade performance across diverse deployment environments, from cloud to edge. The base model was trained on 8 trillion tokens: 6.5 trillion tokens of general pretraining data followed by 1.5 trillion tokens of midtraining data with an enhanced focus on mathematical reasoning and code generation. Following pretraining, the model underwent supervised fine-tuning on high-quality instruction datasets, and was further refined through reinforcement learning, both against verifiable rewards and for human preference. We use a modified version of TorchTitan for pretraining, Axolotl for supervised fine-tuning, and a modified version of Verifiers for reinforcement learning.

The development of AFM-4.5B prioritized data quality as a fundamental requirement for robust model performance. We collaborated with DatologyAI, a company specializing in large-scale data curation. DatologyAI's curation pipeline integrates a suite of proprietary algorithms: model-based quality filtering, embedding-based curation, target distribution matching, source mixing, and synthetic data. Their expertise enabled the creation of a curated dataset tailored to support strong real-world performance.

The model architecture follows a standard decoder-only transformer design based on Vaswani et al., incorporating several key modifications for enhanced performance and efficiency. Notable architectural features include grouped-query attention for improved inference efficiency and ReLU^2 activation functions instead of SwiGLU, which enables sparsification while maintaining or exceeding performance benchmarks. The model available in this repo is the instruct model, following supervised fine-tuning and reinforcement learning.
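
Since the entry calls out ReLU^2 in place of SwiGLU, here is a minimal PyTorch sketch of what a squared-ReLU feed-forward block looks like; the class name and dimensions are illustrative, not AFM-4.5B's actual configuration:

```python
import torch
import torch.nn as nn

class ReLU2MLP(nn.Module):
    """Feed-forward block with squared-ReLU activation (illustrative sizes)."""
    def __init__(self, d_model: int = 1024, d_ff: int = 4096):
        super().__init__()
        self.up = nn.Linear(d_model, d_ff, bias=False)
        self.down = nn.Linear(d_ff, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = torch.relu(self.up(x))
        # relu(x)**2 keeps exact zeros on inactive units, which is what
        # makes activation sparsification possible at inference time.
        return self.down(h * h)

block = ReLU2MLP()
print(block(torch.randn(2, 8, 1024)).shape)  # torch.Size([2, 8, 1024])
```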

Repository: localai
License: aml

jina-reranker-v1-tiny-en
This model is designed for blazing-fast reranking while maintaining competitive performance. It builds on our JinaBERT model, a variant of the BERT architecture that supports the symmetric bidirectional variant of ALiBi. This allows jina-reranker-v1-tiny-en to process significantly longer sequences of text than other reranking models, up to an impressive 8,192 tokens.
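
A minimal scoring sketch, assuming the standard Hugging Face cross-encoder interface; the `jinaai/...` repo id is inferred from the entry name, and `trust_remote_code=True` is assumed for the custom ALiBi variant:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "jinaai/jina-reranker-v1-tiny-en"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id, trust_remote_code=True
)

query = "How do rerankers handle long documents?"
docs = [
    "ALiBi lets BERT-style models extrapolate to sequences far beyond training length.",
    "The recipe calls for two cups of flour and one egg.",
]

# Score each (query, document) pair; higher logit = more relevant.
inputs = tokenizer([query] * len(docs), docs, padding=True, truncation=True,
                   max_length=8192, return_tensors="pt")
with torch.no_grad():
    scores = model(**inputs).logits.squeeze(-1)

for doc, score in sorted(zip(docs, scores.tolist()), key=lambda p: p[1], reverse=True):
    print(f"{score:.3f}  {doc}")
```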

Repository: localai

eurollm-9b-instruct
The EuroLLM project aims to create a suite of LLMs capable of understanding and generating text in all European Union languages, as well as some additional relevant languages. EuroLLM-9B is a 9B-parameter model trained on 4 trillion tokens divided across the considered languages and several data sources: web data, parallel data (en-xx and xx-en), and high-quality datasets. EuroLLM-9B-Instruct was further instruction-tuned on EuroBlocks, an instruction-tuning dataset focused on general instruction following and machine translation.
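
A minimal usage sketch for the machine-translation use case, assuming the `utter-project/EuroLLM-9B-Instruct` repo id and a standard chat template:

```python
from transformers import pipeline

# Chat-style generation; torch_dtype/device_map defer hardware choices to HF.
pipe = pipeline(
    "text-generation",
    model="utter-project/EuroLLM-9B-Instruct",
    torch_dtype="auto",
    device_map="auto",
)
messages = [{"role": "user",
             "content": "Translate to Portuguese: The weather is lovely today."}]
out = pipe(messages, max_new_tokens=100)
# With chat input, the returned conversation's last message is the reply.
print(out[0]["generated_text"][-1]["content"])
```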

Repository: localai
License: apache-2.0

logics-qwen3-math-4b
**Model Name:** Logics-Qwen3-Math-4B
**Base Model:** Qwen/Qwen3-4B-Thinking-2507
**Size:** 4B parameters
**Fine-Tuned For:** Mathematical reasoning, logical problem solving, and algorithmic coding
**Training Data:** OpenMathReasoning, OpenCodeReasoning, Helios-R-6M

**Description:** A lightweight, high-precision 4B-parameter model optimized for mathematical and logical reasoning. Fine-tuned from Qwen3-4B-Thinking-2507, it excels at solving equations, performing step-by-step reasoning, and handling algorithmic tasks with structured outputs in LaTeX, Markdown, JSON, and more. Ideal for education, research, and deployment on mid-range hardware.

**Use Case:** Perfect for math problem-solving, code reasoning, and technical content generation in resource-constrained environments.

**Tags:** #math #code #reasoning #4B #Qwen3 #text-generation #open-source
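
As a quick illustration of the structured-output claim, here is a hedged usage sketch; the bare model id below is a placeholder for the full Hugging Face repo path:

```python
from transformers import pipeline

model_id = "Logics-Qwen3-Math-4B"  # placeholder: substitute the full HF repo id
pipe = pipeline("text-generation", model=model_id,
                torch_dtype="auto", device_map="auto")

messages = [{
    "role": "user",
    "content": 'Solve x^2 - 5x + 6 = 0. Reply as JSON: {"roots": [...], "steps": [...]}',
}]
# Thinking-style models emit reasoning first, so allow generous token budget.
out = pipe(messages, max_new_tokens=512)
print(out[0]["generated_text"][-1]["content"])
```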

Repository: localai
License: apache-2.0

a2fm-32b-rl
**A²FM-32B-rl** is a 32-billion-parameter adaptive foundation model designed for hybrid reasoning and agentic tasks. It dynamically selects between *instant*, *reasoning*, and *agentic* execution modes using a **route-then-align** framework, enabling smarter, more efficient AI behavior. Trained with **Adaptive Policy Optimization (APO)**, it achieves state-of-the-art performance on benchmarks like AIME25 (70.4%) and BrowseComp (13.4%), while reducing inference cost by up to **45%** compared to traditional reasoning methods, delivering high accuracy at low cost. Originally developed by **PersonalAILab**, this model is optimized for tool-aware, multi-step problem solving and is ideal for advanced AI agents requiring both precision and efficiency. A toy illustration of the three modes follows below.

🔹 *Model Type:* Adaptive Agent Foundation Model
🔹 *Size:* 32B
🔹 *Use Case:* Agentic reasoning, tool use, cost-efficient AI agents
🔹 *Training Approach:* Route-then-align + Adaptive Policy Optimization (APO)
🔹 *Performance:* SOTA on reasoning and agentic benchmarks

📄 [Paper](https://arxiv.org/abs/2510.12838) | 🐙 [GitHub](https://github.com/OPPO-PersonalAI/Adaptive_Agent_Foundation_Models)
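
To make the three execution modes concrete, here is a purely illustrative dispatcher; A²FM's actual route-then-align policy is learned end-to-end inside the model, not a keyword heuristic like this:

```python
from enum import Enum

class Mode(Enum):
    INSTANT = "instant"      # answer directly, no extended chain-of-thought
    REASONING = "reasoning"  # think step by step before answering
    AGENTIC = "agentic"      # plan, call tools, iterate over observations

def route(query: str) -> Mode:
    """Toy keyword router, for illustration of the mode split only."""
    q = query.lower()
    if any(k in q for k in ("search the web", "browse", "look up")):
        return Mode.AGENTIC
    if any(k in q for k in ("prove", "derive", "step by step")):
        return Mode.REASONING
    return Mode.INSTANT

print(route("Prove that sqrt(2) is irrational."))  # Mode.REASONING
```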

Repository: localai
License: aml

gpt-oss-20b-esper3.1-i1
**Model Name:** gpt-oss-20b-Esper3.1
**Repository:** [ValiantLabs/gpt-oss-20b-Esper3.1](https://huggingface.co/ValiantLabs/gpt-oss-20b-Esper3.1)
**Base Model:** openai/gpt-oss-20b
**Type:** Instruction-tuned, reasoning-focused language model
**Size:** 20 billion parameters
**License:** Apache 2.0

---

### 🔍 **Overview**

gpt-oss-20b-Esper3.1 is a specialized, instruction-tuned variant of the 20B open-source GPT model, developed by **Valiant Labs**. It excels at **advanced coding, software architecture, and DevOps reasoning**, making it ideal for technical problem-solving and AI-driven engineering tasks.

### ✨ **Key Features**

- **Expert in DevOps & Cloud Systems:** Trained on high-difficulty datasets (e.g., Titanium3, Tachibana3, Mitakihara), it delivers precise, actionable guidance for AWS, Kubernetes, Terraform, Ansible, Docker, Jenkins, and more.
- **Strong Code Reasoning:** Optimized for complex programming tasks, including full-stack development, scripting, and debugging.
- **High-Quality Inference:** Uses `bf16` precision for full-precision performance; quantized versions (e.g., GGUF) are available for efficient local inference.
- **Open-Source & Free to Use:** Fully open-access, built on the public gpt-oss-20b foundation and trained with community datasets.

### 📌 **Use Cases**

- Designing scalable cloud architectures
- Writing and optimizing infrastructure-as-code
- Debugging complex DevOps pipelines
- AI-assisted software development and documentation
- Real-time technical troubleshooting

### 💡 **Getting Started**

Use the standard `text-generation` pipeline with the `transformers` library. Supports role-based prompting (e.g., `user`, `assistant`) and performs best with high-reasoning prompts.

```python
from transformers import pipeline

# Automatic dtype and device placement handled by transformers.
pipe = pipeline(
    "text-generation",
    model="ValiantLabs/gpt-oss-20b-Esper3.1",
    torch_dtype="auto",
    device_map="auto",
)

messages = [{
    "role": "user",
    "content": "Design a Kubernetes cluster for a high-traffic web app with CI/CD via GitHub Actions.",
}]
outputs = pipe(messages, max_new_tokens=2000)
# With chat input, the last message of the returned conversation is the reply.
print(outputs[0]["generated_text"][-1])
```

---

> 🔗 **Model Gallery Entry**:
> *gpt-oss-20b-Esper3.1 – A powerful, open-source 20B model tuned for expert-level DevOps, coding, and system architecture. Built by Valiant Labs using high-quality technical datasets. Perfect for engineers, architects, and AI developers.*

Repository: localai
License: apache-2.0

qwen3-4b-thinking-2507-gspo-easy
**Model Name:** Qwen3-4B-Thinking-2507-GSPO-Easy
**Base Model:** Qwen3-4B (by Alibaba Cloud)
**Fine-tuned With:** GRPO (Group Relative Policy Optimization)
**Framework:** Hugging Face TRL (Transformer Reinforcement Learning)
**License:** [MIT](https://huggingface.co/leonMW/Qwen3-4B-Thinking-2507-GSPO-Easy/blob/main/LICENSE)

---

### 📌 Description:

A fine-tuned 4-billion-parameter version of **Qwen3-4B**, optimized for **step-by-step reasoning and complex problem-solving** using **GRPO**, a reinforcement learning method designed to enhance mathematical and logical reasoning in language models. This model excels at tasks requiring **structured thinking**, such as solving math problems, logical puzzles, and multi-step reasoning, making it ideal for applications in education, AI assistants, and reasoning benchmarks.

### 🔧 Key Features:

- Trained with **TRL 0.23.1** and **Transformers 4.57.1**
- Optimized for **high-quality reasoning output**
- Part of the **Qwen3-4B-Thinking** series, designed to simulate human-like thought processes
- Compatible with the Hugging Face `transformers` and `pipeline` API

### 📚 Use Case:

Perfect for applications demanding **deep reasoning**, such as:

- AI tutoring systems
- Advanced chatbots with explanation capabilities
- Automated problem-solving in STEM domains

### 📌 Quick Start (Python):

```python
from transformers import pipeline

question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"

# device="cuda" assumes a GPU is available; use device_map="auto" otherwise.
generator = pipeline(
    "text-generation",
    model="leonMW/Qwen3-4B-Thinking-2507-GSPO-Easy",
    device="cuda",
)
output = generator(
    [{"role": "user", "content": question}],
    max_new_tokens=128,
    return_full_text=False,
)[0]
print(output["generated_text"])
```

> ✅ **Note**: This is the **original, non-quantized base model**. Quantized versions (e.g., GGUF) are available separately under the same repository for efficient inference on consumer hardware.

---

🔗 **Model Page:** [https://huggingface.co/leonMW/Qwen3-4B-Thinking-2507-GSPO-Easy](https://huggingface.co/leonMW/Qwen3-4B-Thinking-2507-GSPO-Easy)
📝 **Training Details & Visualizations:** [WandB Dashboard](https://wandb.ai/leonwenderoth-tu-darmstadt/huggingface/runs/t42skrc7)

---

*Fine-tuned using GRPO, a method proven to boost mathematical reasoning in open language models. Cite: Shao et al., 2024 (arXiv:2402.03300)*
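
Since the entry leans on GRPO, the sketch below shows its core idea: advantages computed relative to a group of sampled completions for the same prompt, rather than from a learned value network. This is an illustration of the method from Shao et al., 2024, not TRL's implementation:

```python
import torch

def group_relative_advantages(rewards: torch.Tensor) -> torch.Tensor:
    """Normalize rewards within a group of G completions of one prompt."""
    # Each completion's advantage is its reward relative to the group,
    # standardized; epsilon guards against zero-variance groups.
    return (rewards - rewards.mean()) / (rewards.std() + 1e-8)

# Four sampled answers to the same question, scored by a reward function:
rewards = torch.tensor([1.0, 0.0, 1.0, 0.5])
print(group_relative_advantages(rewards))
```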

Repository: localai
License: apache-2.0