Mar 2 · Tech Intel

Research

Cutting-edge research, technical deep-dives, and R&D intelligence

Sources

DeepMind Blog: 27
Apple Machine Learning: 10
Microsoft Research: 10
PyTorch Blog: 10
NVIDIA Deep Learning: 2
The Gradient: 1
60 research articles analyzed
Key Papers (most recent)
Apple Machine Learning · 3 days ago

Scaling Search Relevance: Augmenting App Store Ranking with LLM-Generated Judgments

Large-scale commercial search systems optimize for relevance to drive successful sessions that help users find what they are looking for. To maximize relevance, we leverage two complementary objectives: behavioral relevance (results users tend to click or download) and textual relevance (a result’s

Apple Machine Learning · 3 days ago

The Way We Notice, That's What Really Matters: Instantiating UI Components with Distinguishing Variations

Front-end developers author UI components to be broadly reusable by parameterizing visual and behavioral properties. While flexible, this makes instantiation harder, as developers must reason about numerous property values and interactions. In practice, they must explore the component’s large design

Microsoft Research · 4 days ago

CORPGEN advances AI agents for real work

At a glance: Today’s AI agent benchmarks test one task at a time, while real workplace productivity requires managing dozens of interdependent tasks at once. To reflect this, we created a setting called Multi-Horizon Task Environments (MHTEs). Under multi-task loads, leading computer-using agents deg

DeepMind Blog · 4 days ago

Nano Banana 2: Combining Pro capabilities with lightning-fast speed

Our latest image generation model offers advanced world knowledge, production ready specs, subject consistency and more, all at Flash speed.

PyTorch Blog · 5 days ago

Enhancing Multimodal Training and Memory Efficiency with DeepSpeed

Overview: This blog walks through two crucial DeepSpeed updates: (1) a PyTorch-identical backward API that enables efficient training of multimodal, multi-component models (including non-scalar backward calls), and (2) low-precision model training that significantly reduces peak memory, especially. F

All Research (60)

Apple Machine Learning · 5 days ago

Constructive Circuit Amplification: Improving Math Reasoning in LLMs via Targeted Sub-Network Updates

Prior studies investigating the internal workings of LLMs have uncovered sparse subnetworks, often referred to as circuits, that are responsible for performing specific tasks. Additionally, it has been shown that model performance improvement through fine-tuning often results from the strengthening

Apple Machine Learning · 5 days ago

A.R.I.S.: Automated Recycling Identification System for E-Waste Classification Using Deep Learning

Traditional electronic recycling processes suffer from significant resource loss due to inadequate material separation and identification capabilities, limiting material recovery. We present A.R.I.S. (Automated Recycling Identification System), a low-cost, portable sorter for shredded e-waste that a

Apple Machine Learning · 5 days ago

Closing the Gap Between Text and Speech Understanding in LLMs

Large Language Models (LLMs) can be adapted to extend their text capabilities to speech inputs. However, these speech-adapted LLMs consistently underperform their text-based counterparts—and even cascaded pipelines—on language understanding tasks. We term this shortfall the text-speech understanding

PyTorch Blog · 6 days ago

Accelerating Autotuning in Helion with Bayesian Optimization

Introduction: As introduced in a previous blog post, Helion is a high-level DSL that empowers developers to write high-performance ML kernels using a familiar PyTorch-like syntax, delegating the complex task of optimization to its autotuning engine. This autotuner explores a vast, high-dimensional s

Apple Machine Learning · 6 days ago

Beyond a Single Extractor: Re-thinking HTML-to-Text Extraction for LLM Pretraining

One of the first pre-processing steps for constructing web-scale LLM pretraining datasets involves extracting text from HTML. Despite the immense diversity of web content, existing open-source datasets predominantly apply a single fixed extractor to all webpages. In this work, we investigate whether

Apple Machine Learning · 6 days ago

depyf: Open the Opaque Box of PyTorch Compiler for Machine Learning Researchers

PyTorch 2.x introduces a compiler designed to accelerate deep learning programs. However, for machine learning researchers, adapting to the PyTorch compiler to full potential can be challenging. The compiler operates at the Python bytecode level, making it appear as an opaque box. To addres

Apple Machine Learning · 6 days ago

AMUSE: Audio-Visual Benchmark and Alignment Framework for Agentic Multi-Speaker Understanding

Recent multimodal large language models (MLLMs) such as GPT-4o and Qwen3-Omni show strong perception but struggle in multi-speaker, dialogue-centric settings that demand agentic reasoning: tracking who speaks, maintaining roles, and grounding events across time. These scenarios are central to multimo

Apple Machine Learning · 6 days ago

The Potential of CoT for Reasoning: A Closer Look at Trace Dynamics

Chain-of-thought (CoT) prompting is a de-facto standard technique to elicit reasoning-like responses from large language models (LLMs), allowing them to spell out individual steps before giving a final answer. While the resemblance to human-like reasoning is undeniable, the driving forces underpinni

Apple Machine Learning · 7 days ago

Apple Workshop on Reasoning and Planning 2025

Reasoning and planning are the bedrock of intelligent AI systems, enabling them to plan, interact, adapt, and ultimately, operate independently. At Apple, understanding and advancing reasoning capabilities in AI systems has long been an area of active research, and has resulted in numerous publicat

DeepMind Blog · 11 days ago

Gemini 3.1 Pro: A smarter model for your most complex tasks

3.1 Pro is designed for tasks where a simple answer isn’t enough.

Microsoft Research · 11 days ago

Media Authenticity Methods in Practice: Capabilities, Limitations, and Directions

Insights from Microsoft’s Media Integrity and Authentication: Status, Directions, and Futures report. It has become increasingly difficult to distinguish fact from fiction when viewing online images and videos. Resilient, trustworthy technologies can help people determine whether the content they are

The Gradient · 11 days ago

After Orthogonality: Virtue-Ethical Agency and AI Alignment

Preface: This essay argues that rational people don’t have goals, and that rational AIs shouldn’t have goals. Human actions are rational not because we direct them at some final ‘goals,’ but because we align actions to practices [1]: networks of actions, action-dispositio

Microsoft Research · 12 days ago

Project Silica’s advances in glass storage technology

At a glance: Microsoft Research publishes breakthrough in Nature on glass-based data storage that could preserve information for 10,000 years. New technique extends technology from expensive fused silica to ordinary borosilicate glass found in kitchen cookware. Innovations enable faster parallel writ

DeepMind Blog · 12 days ago

A new way to express yourself: Gemini can now create music

The Gemini app now features our most advanced music generation model Lyria 3, empowering anyone to make 30-second tracks using text or images.

DeepMind Blog · 13 days ago

Accelerating discovery in India through AI-powered science and education

Google DeepMind brings National Partnerships for AI initiative to India, scaling AI for science and education

PyTorch Blog · 18 days ago

Pyrefly Now Type Checks PyTorch

We’re excited to share that PyTorch now leverages Pyrefly to power type checking across our core repository, along with a number of projects in the PyTorch ecosystem: Helion, TorchTitan and Ignite. For a project the size of PyTorch, leveraging typing and type checking has long been essential for en

DeepMind Blog · 18 days ago

Gemini 3 Deep Think: Advancing science, research and engineering

Our most specialized reasoning mode is now updated to solve modern science, research and engineering challenges.

DeepMind Blog · 21 days ago

Accelerating Mathematical and Scientific Discovery with Gemini Deep Think

Research papers point to the growing impact of Deep Think across fields

PyTorch Blog · 23 days ago

Accelerating Mamba2 with Kernel Fusion

Summary: In this post, we discuss how we optimized the Mamba-2 State-Space Dual (SSD) module with a fused Triton kernel that yields speedups of 1.50x-2.51x on NVIDIA A100 and H100 GPUs. To achieve this, we fused all five SSD kernels into a single Triton kernel with careful synchronization. To our kno

PyTorch Blog · 23 days ago

Some Matrix Multiplication Engines Are Not As Accurate As We Thought

What is an accumulator in an accelerator’s GEMM engine and why does it matter? GPUs and custom accelerators include specialized compute engines for matrix multiplication (also known as matmul or GEMM), such as NVIDIA’s Tensor Cores. These engines efficiently perform matmul on small tenso

PyTorch Blog · 25 days ago

Building Highly Efficient Inference System for Recommenders Using PyTorch

Why Choose PyTorch for Recommendation System: PyTorch has emerged as the de facto framework in the AI community, with the majority of cutting-edge research, especially in areas like recommendation systems, retrieval, and ranking, being conducted with PyTorch. Developers are eager to bring the latest

Microsoft Research · 25 days ago

Rethinking imitation learning with Predictive Inverse Dynamics Models

At a glance: Imitation learning becomes easier when an AI agent understands why an action is taken. Predictive Inverse Dynamics Models (PIDMs) predict plausible future states, clarifying the direction of behavior during imitation learning. Even imperfect predictions reduce ambiguity, making it cleare

Microsoft Research · 25 days ago

Paza: Introducing automatic speech recognition benchmarks and models for low resource languages

At a glance: Microsoft Research releases PazaBench and Paza automatic speech recognition models, advancing speech technology for low resource languages. Human-centered pipeline for low-resource languages: Built for and tested by communities, Paza is an end-to-end, continuous pipeline that elevates h

PyTorch Blog · 27 days ago

Portable Paged Attention in Helion

Recently, the PyTorch team released Helion, a new domain-specific and PyTorch-based language to make the development of high-performing but portable kernels easier. With extensive autotuning built in, Helion has the promise to move the forefront of performance portability further than Triton. To te

PyTorch Blog · 28 days ago

Unlock Reasoning in Llama 3.1-8B via Full Fine-Tuning on NVIDIA DGX Spark

What is the unsaid joy of local LLMs? The magic of downloading weights, running some experiments overnight, maybe your room gets a bit toasty, and voila, you create a small but performant model that runs on your desktop. Often this involves a big GPU machine and lots of cables; in our case, it was a

DeepMind Blog · about 1 month ago

Project Genie: Experimenting with infinite, interactive worlds

Google AI Ultra subscribers in the U.S. can try out Project Genie, an experimental research prototype that lets you create and explore worlds.

PyTorch Blog · about 1 month ago

Accelerating On-Device ML Inference with ExecuTorch and Arm SME2

Interactive image segmentation has become a defining mobile experience across the world’s most popular apps. In plain terms, you tap (or draw a rough hint) on an image, and the app instantly “cuts out” the object by producing a pixel mask. This enables familiar features such as creating personalized

Microsoft Research · about 1 month ago

UniRG: Scaling medical imaging report generation with multimodal reinforcement learning

At a glance: AI-driven medical image report generation can help medical providers become more efficient and productive. Current models are difficult to train because reporting practices vary widely among providers. Universal Report Generation (UniRG) uses reinforcement learning to align model trainin

PyTorch Blog · about 1 month ago

PyTorch 2.10 Release Blog

We are excited to announce the release of PyTorch® 2.10 (release notes)! This release features a number of improvements for performance and numerical debugging. Performance has been a focus for PyTorch throughout the 2.x release series, building on the capabilities of the PyTorch compiler stack in

Microsoft Research · about 1 month ago

Multimodal reinforcement learning with agentic verifier for AI agents

At a glance: Today’s multimodal AI systems can give answers that sound right but may not be grounded in what they actually observe over time, leading to unpredictable errors and safety risks in real-world settings. Argos is a verification framework for multimodal reinforcement learning that tra

DeepMind Blog · about 1 month ago

D4RT: Teaching AI to see the world in four dimensions

D4RT: Unified, efficient 4D reconstruction and tracking up to 300x faster than prior methods.

Microsoft Research · about 2 months ago

OptiMind: A small language model with optimization expertise

At a glance: Many real-world business problems can benefit from optimization, but translating decisions, constraints, and goals from natural language into optimization algorithms is slow. OptiMind is a small language model designed to convert business problems described in natural language into the m

DeepMind Blog · about 2 months ago

Veo 3.1 Ingredients to Video: More consistency, creativity and control

Our latest Veo update generates lively, dynamic clips that feel natural and engaging — and supports vertical video generation.

NVIDIA Deep Learning · about 2 months ago

NVIDIA Rubin Platform, Open Models, Autonomous Driving: NVIDIA Presents Blueprint for the Future at CES

NVIDIA founder and CEO Jensen Huang took the stage at the Fontainebleau Las Vegas to open CES 2026, declaring that AI is scaling into every domain and every device. “Computing has been fundamentally reshaped as a result of accelerated computing, as a result of artificial intelligence,” Huang said. “

DeepMind Blog · 2 months ago

Google's year in review: 8 areas with research breakthroughs in 2025

Google 2025 recap: Research breakthroughs of the year

DeepMind Blog · 2 months ago

Gemini 3 Flash: frontier intelligence built for speed

Gemini 3 Flash offers frontier intelligence built for speed at a fraction of the cost.

DeepMind Blog · 3 months ago

Gemma Scope 2: helping the AI safety community deepen understanding of complex language model behavior

Open interpretability tools for language models are now available across the entire Gemma 3 family with the release of Gemma Scope 2.

NVIDIA Deep Learning · 3 months ago

As AI Grows More Complex, Model Builders Rely on NVIDIA

Unveiling what it describes as the most capable model series yet for professional knowledge work, OpenAI launched GPT-5.2 in December. The model was trained and deployed on NVIDIA infrastructure, including NVIDIA Hopper and GB200 NVL72 systems. GPT-5.3 Codex — the first OpenAI agentic coding model t

Microsoft Research · 3 months ago

Agent Lightning: Adding reinforcement learning to AI agents without code rewrites

AI agents are reshaping software development, from writing code to carrying out complex instructions. Yet LLM-based agents are prone to errors and often perform poorly on complicated, multi-step tasks. Reinforcement learning (RL) is an approach where AI systems learn to make optimal decisions by rec

DeepMind Blog · 3 months ago

Deepening our partnership with the UK AI Security Institute

Google DeepMind and UK AI Security Institute (AISI) strengthen collaboration on critical AI safety and security research

Microsoft Research · 3 months ago

Promptions helps make AI prompting more precise with dynamic UI controls

Anyone who uses AI systems knows the frustration: a prompt is given, the response misses the mark, and the cycle repeats. This trial-and-error loop can feel unpredictable and discouraging. To address this, we are excited to introduce Promptions (prompt + options), a UI framework that helps develop

DeepMind Blog · 3 months ago

Strengthening our partnership with the UK government to support prosperity and security in the AI era

Deepening our partnership with the UK government to support prosperity and security in the AI era

DeepMind Blog · 3 months ago

FACTS Benchmark Suite: Systematically evaluating the factuality of large language models

Systematically evaluating the factuality of large language models with the FACTS Benchmark Suite.

DeepMind Blog · 3 months ago

Engineering more resilient crops for a warming climate

Scientists are using AlphaFold to strengthen a photosynthesis enzyme for resilient, heat-tolerant crops.

DeepMind Blog · 3 months ago

AlphaFold: Five years of impact

Explore how AlphaFold has accelerated science and fueled a global wave of biological discovery.

DeepMind Blog · 3 months ago

Revealing a key protein behind heart disease

AlphaFold has revealed the structure of a key protein behind heart disease

DeepMind Blog · 3 months ago

Google DeepMind supports U.S. Department of Energy on Genesis: a national mission to accelerate innovation and scientific discovery

Google DeepMind and the DOE partner on Genesis, a new effort to accelerate science with AI.

DeepMind Blog · 3 months ago

Introducing Nano Banana Pro

DeepMind Blog · 3 months ago

Start building with Gemini 3

DeepMind Blog · 3 months ago

We’re expanding our presence in Singapore to advance AI in the Asia-Pacific region

Google DeepMind opens a new Singapore research lab, accelerating AI progress in the Asia-Pacific region.

DeepMind Blog · 3 months ago

Introducing Google Antigravity