Microsoft

Enterprise

company

Verified

https://www.microsoft.com/en-us/research/

microsoft

Recent Activity

frontierai updated a model 1 day ago

microsoft/bitnet-embedding-0.6b

frontierai updated a model 1 day ago

microsoft/bitnet-embedding-270m

Yif29 updated a dataset 1 day ago

microsoft/RESOURCE2SKILL

Papers

ResearchStudio-Reel: Automate the Last Mile of Research from Paper to Poster, Video, and Blog

ResearchStudio-Idea: An Evidence-Grounded Research-Ideation Skill Suite from ML Conference Outcomes

microsoft's collections (30)

HARC

A family of safety-aligned instruction models trained with HARC

microsoft/HARC

Text Generation • Updated 16 days ago • 9
microsoft/HARC-Llama-3.1-8B-Instruct

Text Generation • 8B • Updated 16 days ago • 407
microsoft/HARC-Qwen2.5-7B-Instruct

Text Generation • 8B • Updated 16 days ago • 422
HARC: Coupling Harmfulness and Refusal Directions for Robust Safety Alignment

Paper • 2607.00572 • Published 18 days ago • 3

GridSFM

Collection of datasets and models developed to support research in power grid modeling

microsoft/GridSFM_US_power_grid

Updated May 25 • 547 • 6
microsoft/GridSFM_Open

Graph Machine Learning • Updated May 27 • 22

Skala

Accurate and scalable exchange-correlation with deep learning

microsoft/skala-1.1

Updated 2 days ago • 63.8k • 10
Accurate and scalable exchange-correlation with deep learning

Paper • 2506.14665 • Published Apr 21 • 6
microsoft/skala-baselines

Updated May 4 • 720 • 6
microsoft/skala-1.0

Updated Apr 23 • 13.1k • 5

VibeVoice

Frontier Text-to-Speech Models https://microsoft.github.io/VibeVoice/

microsoft/VibeVoice-1.5B

Text-to-Speech • 3B • Updated Jan 22 • 36k • 2.43k
microsoft/VibeVoice-Realtime-0.5B

Text-to-Speech • 1B • Updated Dec 12, 2025 • 648k • 1.24k
VibeVoice Technical Report

Paper • 2508.19205 • Published Aug 26, 2025 • 174
microsoft/VibeVoice-ASR

Automatic Speech Recognition • 9B • Updated Jan 27 • 699k • 1.22k

Dayhoff Atlas

The models and datasets that comprise the Dayhoff Atlas

microsoft/Dayhoff

Viewer • Updated Apr 2 • 1.77B • 2.3k • 12
microsoft/Dayhoff-170m-UR50

Text Generation • 0.2B • Updated Jan 16 • 53 • 5
microsoft/Dayhoff-170m-UR90

Text Generation • 0.2B • Updated Jan 26 • 233 • 1
microsoft/Dayhoff-170m-GR

Text Generation • 0.2B • Updated Jan 26 • 249 • 2

Paza

Paza is a collection of speech models and benchmarks for low-resource languages by the Microsoft Research Africa - Nairobi Lab

PazaBench

Running • Agents • 22 • ASR Leaderboard for low resource languages
microsoft/paza-Phi-4-multimodal-instruct

Automatic Speech Recognition • 6B • Updated Feb 4 • 121 • 5
microsoft/paza-whisper-large-v3-turbo

Automatic Speech Recognition • 0.8B • Updated Feb 4 • 538 • 10

Phi-4

Phi-4 family of small language, multi-modal and reasoning models.

microsoft/Phi-4-mini-flash-reasoning

Text Generation • 4B • Updated Dec 10, 2025 • 816 • 281
microsoft/Phi-4-mini-reasoning

Text Generation • 4B • Updated Dec 10, 2025 • 35k • 240
microsoft/Phi-4-reasoning

Text Generation • 15B • Updated Nov 24, 2025 • 20.9k • 228
microsoft/Phi-4-reasoning-plus

Text Generation • 15B • Updated Nov 24, 2025 • 13.8k • 346

Phi-1

Phi-1 family of small language models.

microsoft/phi-1

Text Generation • 1B • Updated Nov 24, 2025 • 15.4k • 222
microsoft/phi-1_5

Text Generation • 1B • Updated Nov 24, 2025 • 60.5k • 1.36k
Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 159
Textbooks Are All You Need II: phi-1.5 technical report

Paper • 2309.05463 • Published Sep 11, 2023 • 92

BitNet

🔥BitNet family of large language models (1-bit LLMs).

microsoft/bitnet-b1.58-2B-4T

Text Generation • 0.8B • Updated Dec 17, 2025 • 9.42k • 1.48k
microsoft/bitnet-b1.58-2B-4T-bf16

Text Generation • 2B • Updated Dec 17, 2025 • 6.85k • 44
microsoft/bitnet-b1.58-2B-4T-gguf

Text Generation • 2B • Updated Dec 17, 2025 • 21.2k • 287
BitNet b1.58 2B4T Technical Report

Paper • 2504.12285 • Published Apr 16, 2025 • 87

LLM2CLIP

LLM2CLIP makes SOTA pretrained CLIP models more SOTA than ever.

microsoft/LLM2CLIP-EVA02-L-14-336

Zero-Shot Image Classification • Updated Nov 22, 2024 • 93 • 61
microsoft/LLM2CLIP-Openai-L-14-336

Zero-Shot Classification • 0.6B • Updated Nov 24, 2024 • 1.54k • 45
microsoft/LLM2CLIP-EVA02-B-16

Updated Feb 8, 2025 • 21 • 11
microsoft/LLM2CLIP-Openai-B-16

Zero-Shot Classification • 0.4B • Updated Nov 24, 2024 • 24 • 19

TAPEX

TAPEX is a state-of-the-art table pre-training model that can be used for table-based question answering and table-based fact verification.

TAPEX: Table Pre-training via Learning a Neural SQL Executor

Paper • 2107.07653 • Published Jul 16, 2021 • 3
microsoft/tapex-large-finetuned-wtq

Table Question Answering • 0.4B • Updated Jan 12, 2024 • 760 • 79
microsoft/tapex-base-finetuned-wikisql

Table Question Answering • Updated Jan 24, 2023 • 948k • 25
microsoft/tapex-large-sql-execution

Table Question Answering • 0.4B • Updated Sep 15, 2023 • 272 • 19

LayoutLM

The LayoutLM series are Transformer encoders useful for document AI tasks such as invoice parsing, document image classification and DocVQA.

microsoft/layoutlmv3-base

0.1B • Updated Apr 10, 2024 • 1.21M • 504
microsoft/layoutlmv2-base-uncased

Updated Sep 16, 2022 • 614k • 68
microsoft/layoutlm-base-uncased

0.1B • Updated Apr 16, 2024 • 200k • 62
microsoft/layoutxlm-base

Updated Sep 16, 2022 • 98.3k • 75

Orca

The Orca family of LMs developed by Microsoft.

microsoft/Orca-2-7b

Text Generation • Updated Nov 22, 2023 • 491 • 224
microsoft/Orca-2-13b

Text Generation • Updated Nov 22, 2023 • 776 • 668

GIT

GIT (Generative Image-to-text Transformer) is a model useful for vision-language tasks such as image/video captioning and question answering.

GIT: A Generative Image-to-text Transformer for Vision and Language

Paper • 2205.14100 • Published May 27, 2022 • 2
microsoft/git-base

Image-to-Text • 0.2B • Updated Apr 24, 2023 • 11.4k • 111
microsoft/git-large

Image-to-Text • Updated Feb 8, 2023 • 690 • 19
microsoft/git-base-vqav2

Visual Question Answering • 0.2B • Updated Mar 9, 2024 • 719 • 21

IFMs

Industrial Foundation Models

microsoft/LLaMA-2-7b-GTL-Delta

Text Generation • 7B • Updated Aug 12, 2024 • 27 • 10
microsoft/LLaMA-2-13b-GTL-Delta

Text Generation • 13B • Updated Aug 12, 2024 • 36 • 6

Froggy-Models

microsoft/FrogBoss-32B-2510

Text Generation • 677k • Updated Jan 22 • 219 • 35
microsoft/FrogMini-14B-2510

Text Generation • 425k • Updated Jan 15 • 133 • 63

MSR-ACC

Microsoft Research - Accurate Chemistry Collection

microsoft/msr-acc-tae25

Viewer • Updated Apr 24 • 72.6k • 76 • 6
Accurate Chemistry Collection: Coupled cluster atomization energies for broad chemical space

Paper • 2506.14492 • Published Jun 17, 2025 • 5

ChatBench

ChatBench Datasets and Simulators (same prompt + fine-tuning set-up) from the ChatBench paper.

microsoft/ChatBench

Preview • Updated Apr 28, 2025 • 162 • 13
microsoft/chatbench-distilgpt2

Text Generation • 81.9M • Updated Aug 23, 2025 • 26 • 4
microsoft/chatbench-llama3-8b

Updated Aug 23, 2025 • 12 • 6
microsoft/chatbench-mistral-7b

Updated Aug 23, 2025 • 7 • 5

MediPhi

A collection of SLMs based on Phi3.5-mini-instruct adapted to clinical natural language processing tasks: https://arxiv.org/abs/2505.10717

A Modular Approach for Clinical SLMs Driven by Synthetic Data with Pre-Instruction Tuning, Model Merging, and Clinical-Tasks Alignment

Paper • 2505.10717 • Published May 15, 2025 • 5
microsoft/MediPhi-Instruct

Text Generation • 4B • Updated Dec 15, 2025 • 1.97k • 69
microsoft/MediPhi

Text Generation • 4B • Updated Dec 15, 2025 • 1.04k • 24
microsoft/MediPhi-PubMed

Text Generation • 4B • Updated Dec 15, 2025 • 335 • 13

NatureLM

microsoft/NatureLM-8x7B

47B • Updated Jun 20, 2025 • 15 • 22
microsoft/NatureLM-8x7B-Inst

47B • Updated Jun 20, 2025 • 35 • 26

NextCoder

NextCoder family of code-editing LMs developed with Selective Knowledge Transfer and its training data.

microsoft/NextCoder-7B

Text Generation • 8B • Updated Jun 12, 2025 • 630 • 34
microsoft/NextCoder-14B

Text Generation • 15B • Updated Jun 12, 2025 • 140 • 19
microsoft/NextCoder-32B

Text Generation • 33B • Updated Jun 12, 2025 • 103 • 69
microsoft/NextCoderDataset

Viewer • Updated Jul 8, 2025 • 381k • 253 • 55

Phi-3

Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths.

microsoft/Phi-3.5-mini-instruct

Text Generation • 4B • Updated Dec 10, 2025 • 767k • 996
microsoft/Phi-3.5-MoE-instruct

Text Generation • 42B • Updated Dec 10, 2025 • 165k • 574
microsoft/Phi-3.5-vision-instruct

Image-Text-to-Text • 4B • Updated Dec 10, 2025 • 1.18M • 736
microsoft/Phi-3-mini-4k-instruct

Text Generation • 4B • Updated Dec 10, 2025 • 578k • 1.44k

Controllable Safety Alignment

Artifacts for the paper "Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements" (https://arxiv.org/abs/2410.08968)

Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements

Paper • 2410.08968 • Published Oct 11, 2024 • 14
microsoft/CoSApien

Viewer • Updated Aug 1, 2025 • 200 • 71 • 3
microsoft/CoSAlign-Test

Viewer • Updated May 5, 2025 • 3.2k • 90 • 3
microsoft/CoSAlign-Train

Viewer • Updated Aug 1, 2025 • 125k • 127 • 4

MAI-DS-R1

MAI-DS-R1 is a DeepSeek-R1 reasoning model that has been post-trained by the Microsoft AI team.

microsoft/MAI-DS-R1

Text Generation • 671B • Updated Dec 15, 2025 • 220 • 305
microsoft/MAI-DS-R1-FP8

Text Generation • 671B • Updated Dec 15, 2025 • 63 • 28

SpeechT5

The SpeechT5 framework consists of a shared seq2seq (encoder-decoder) network and six modal-specific (speech/text) pre/post-nets, and it can address several audio-related tasks.

SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing

Paper • 2110.07205 • Published Oct 14, 2021 • 6
microsoft/speecht5_tts

Text-to-Speech • Updated Nov 8, 2023 • 67.1k • 836
SpeechT5 Speech Synthesis Demo

Runtime error • Agents • Featured • 220
microsoft/speecht5_vc

Audio-to-Audio • Updated Mar 22, 2023 • 3.06k • 112

Table Transformer

The Table Transformer (TATR) is a series of object detection models useful for table extraction from PDF images.

microsoft/table-transformer-detection

Object Detection • 28.8M • Updated Sep 6, 2023 • 1.15M • 429
microsoft/table-transformer-structure-recognition

Object Detection • 28.8M • Updated Sep 6, 2023 • 1.84M • 225
microsoft/table-transformer-structure-recognition-v1.1-all

Object Detection • 28.8M • Updated Nov 18, 2023 • 251k • 83
microsoft/table-transformer-structure-recognition-v1.1-fin

Object Detection • 28.8M • Updated Nov 27, 2023 • 644 • 2

Biomedical

Models for biomedical research applications, such as radiology report generation and biomedical language understanding.

microsoft/maira-2

Text Generation • 7B • Updated Aug 14, 2025 • 4.92k • 80
microsoft/rad-dino-maira-2

Image Feature Extraction • 86.6M • Updated Aug 22, 2024 • 14.9k • 25
microsoft/rad-dino

Image Feature Extraction • 86.6M • Updated May 12 • 119k • 82
microsoft/radedit

0.4B • Updated Dec 8, 2025 • 32

UDOP

UDOP is a general multimodal model for document AI

Unifying Vision, Text, and Layout for Universal Document Processing

Paper • 2212.02623 • Published Dec 5, 2022 • 12
microsoft/udop-large

Image-Text-to-Text • 0.7B • Updated Dec 2, 2025 • 72.2k • 125
microsoft/udop-large-512

Image-Text-to-Text • 0.7B • Updated Dec 2, 2025 • 58 • 6
microsoft/udop-large-512-300k

Image-Text-to-Text • 0.7B • Updated Dec 2, 2025 • 168 • 34

Florence

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Paper • 2311.06242 • Published Nov 10, 2023 • 97
microsoft/Florence-2-large

Image-Text-to-Text • 0.8B • Updated Aug 4, 2025 • 788k • 1.83k
microsoft/Florence-2-base

Image-Text-to-Text • 0.2B • Updated Aug 4, 2025 • 2.69M • 389
microsoft/Florence-2-large-ft

Image-Text-to-Text • 0.8B • Updated Aug 4, 2025 • 34.9k • 388

MoCapAct

Locomotion policies for hundreds of simulated humanoid locomotion clips and demonstration data for training them.

microsoft/mocapact-models

Updated Aug 17, 2024 • 10
microsoft/mocapact-data

Updated Aug 17, 2024 • 147 • 6
MoCapAct: A Multi-Task Dataset for Simulated Humanoid Control

Paper • 2208.07363 • Published Aug 15, 2022 • 2