TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools
Project page for "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"
Latest Advances on Long Chain-of-Thought Reasoning
Deep Reasoning Translation via Reinforcement Learning (arXiv preprint 2025); DRT: Deep Reasoning Translation via Long Chain-of-Thought (arXiv preprint 2024)
ToolUniverse is a collection of biomedical tools designed for AI agents
OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement
Official Implementation of "Reasoning Language Models: A Blueprint"
A Comprehensive Survey on Evaluating Reasoning Capabilities in Multimodal Large Language Models.
This repo develops reasoning models for the financial domain, aiming to enhance models' capabilities in handling financial reasoning tasks.
Lightweight replication study of DeepSeek-R1-Zero. Interesting findings include "No Aha Moment", "Longer CoT ≠ Accuracy", and "Language Mixing in Instruct Models".
Pure RL to post-train base models for social reasoning capabilities. Lightweight replication of DeepSeek-R1-Zero with Social IQa dataset.
🔥🔥🔥Breaking long thought processes of o1-like LLMs, such as DeepSeek-R1, QwQ
☁️ KUMO: Generative Evaluation of Complex Reasoning in Large Language Models
Reasoning-from-Zero using gemma.JAX.nnx on TPUs
Official code for "Divide and Translate: Compositional First-Order Logic Translation and Verification for Complex Logical Reasoning", ICLR 2025.
📖 Curated list about the reasoning ability of MLLMs, including OpenAI o1, OpenAI o3-mini, and slow thinking.
An effective weight-editing method for mitigating overly short reasoning in LLMs, and a mechanistic study uncovering how reasoning length is encoded in the model’s representation space.
Structured test tasks and model tuning scripts for multiple subjects from ZNO - the Ukrainian External Independent Evaluation (ЗНО)
This repository contains the implementation of our research on optimizing Retrieval-Augmented Generation (RAG) systems for technical domains. Our work addresses the unique challenges of precise information extraction from complex, domain-specific documents by introducing token-aware evaluation metrics and a synthetic data generation pipeline.