Infrastructure

23 posts in Infrastructure

Unified Latents Hits 1.4 FID by Replacing Stable Diffusion's Ad Hoc VAE with a Diffusion Prior

February 2026 · Vision

80.3 on ScreenSpotPro: GUI-Owl-1.5 Sets New Bar for Open-Source GUI Agents

February 2026 · AI Agents

3.5x Faster Image Generation: DDiT Dynamically Resizes Patches in Diffusion Transformers

February 2026 · Vision

Reasoning Overthinking Solved: SAGE Cuts Tokens 44% While Improving Accuracy

February 2026 · Infrastructure

Baidu Introduces ERNIE 5.0: Trillion-Parameter Unified Multimodal MoE Rivals GPT-5

February 2026 · Vision

First Holistic OCR Model: OCRVerse Unifies Document Parsing and Code Generation

January 2026 · Vision

175% Faster Prefill with Better Accuracy: ConceptMoE's Adaptive Token Compression for MoE

January 2026 · Infrastructure

6x Fewer Tokens, Better OCR: DeepSeek's Visual Causal Flow Beats GPT-4o and Gemini

January 2026 · Vision

2x Faster VLA Inference with 70% Fewer Layers: Shallow-π Distillation for Edge Robotics

January 2026 · Infrastructure

UPLiFT vs Cross-Attention Upsamplers: Linear Scaling Meets SOTA Quality

January 2026 · Vision

90% Attention Sparsity with Zero Quality Loss: SALAD Speeds Up Video Diffusion 1.7x

January 2026 · Infrastructure

FP8 Rollout Instability Solved: Jet-RL Unifies Precision for Stable RL Training

January 2026 · Infrastructure

73% on BrowseComp: Meituan's 560B Open-Source Model Leads Agentic Benchmarks

January 2026 · AI Agents

Tsinghua Researchers Show Diffusion LLMs Reason Better When You Take Away Their Flexibility

January 2026 · Infrastructure

97ms First-Packet Latency: Qwen3-TTS Beats ElevenLabs in Voice Cloning Across 10 Languages

January 2026 · Voice AI

56.7% on OSWorld: EvoCUA's Evolutionary Training Beats Closed-Source Computer Use Agents

January 2026 · AI Agents

Microsoft's Agent Lightning Decouples RL Training from Agent Logic, Enabling Fine-Tuning of Any AI Agent with Zero Code Changes

January 2026 · Infrastructure

16x Faster On-Device Video Generation: Qualcomm's ReHyAt Distills Attention in 160 GPU Hours

January 2026 · Vision

Gold Medal at IMO and IOI: DeepSeek-V3.2 Matches GPT-5 with Open Weights

January 2026 · Infrastructure

6% Better Math Reasoning in Fewer Tokens: Multiplex Thinking Merges Multiple Paths into One

January 2026 · Infrastructure

Zero Training Data, Full Performance: Dr. Zero Matches Supervised Search Agents

January 2026 · Infrastructure

Why Reasoning Models Cheat on Efficiency: TNT's Fix Cuts Tokens 50%

January 2026 · Infrastructure

Why Hyper-Connections Explode at Scale: DeepSeek's Manifold Fix

January 2026 · Infrastructure
