AI Research Radar

A dashboard for AI research papers, research-tool releases, and lab updates.

I built this small AI research radar to keep up with research-related AI news. It automatically collects research-focused AI signals: new papers, lab announcements, model and developer-tool updates, research-writing tools, AI-for-science work, mathematical reasoning, literature-search systems, and selected company news. Collected items are then filtered against a set of research-oriented keywords.
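As a rough illustration of keyword filtering of this kind, here is a minimal Python sketch. It is not the radar's actual code: the keyword weights, the `Item` fields, the additive scoring rule, and the `min_score` parameter are all hypothetical, chosen only to show how a "Matched: ... · score" line could be produced.

```python
import re
from dataclasses import dataclass, field

# Hypothetical keyword weights; the radar's real keyword list is not published here.
KEYWORD_WEIGHTS = {
    "formal proof": 10.0,
    "Lean": 8.0,
    "benchmark": 5.0,
    "agentic": 4.0,
}


@dataclass
class Item:
    source: str
    title: str
    summary: str = ""
    matched: list = field(default_factory=list)
    score: float = 0.0


def match_keywords(text, weights):
    """Return the keywords that occur in text as whole words (case-insensitive)."""
    found = []
    for keyword in weights:
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(keyword)
    return sorted(found, key=str.lower)


def score_item(item, weights):
    """Fill in matched keywords and an additive score (an assumed scoring rule)."""
    item.matched = match_keywords(f"{item.title} {item.summary}", weights)
    item.score = sum(weights[keyword] for keyword in item.matched)
    return item


def filter_items(items, weights, min_score=0.0):
    """Keep items with at least one keyword match, highest score first."""
    scored = [score_item(item, weights) for item in items]
    kept = [item for item in scored if item.matched and item.score >= min_score]
    return sorted(kept, key=lambda item: item.score, reverse=True)


if __name__ == "__main__":
    feed = [
        Item("arXiv", "A Lean 4 benchmark for formal proof search"),
        Item("News", "Phone update rollout timeline"),
    ]
    for item in filter_items(feed, KEYWORD_WEIGHTS):
        print(f"{item.title} | Matched: {', '.join(item.matched)} | score {item.score}")
```

Whole-word matching keeps a short keyword such as "Lean" from firing inside longer words, although a case-insensitive match will still catch the ordinary adjective "lean"; a real filter would likely need per-keyword case rules.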

Updated: May 31, 2026, 4:35 PM ET

Items: 75

Priority cutoff: 15

75 visible items

arXiv — formal proof search May 21, 2026

Advancing Mathematics Research with AI-Driven Formal Proof Search

Large language models (LLMs) increasingly excel at mathematical reasoning, but their unreliability limits their utility in mathematics research. A mitigation is using LLMs to generate formal proofs in languages like Lean. We perform the first large-scale evaluation of this method's ability to solve open problems. Our most capable agent autonomously resolved 9 of 353 open Erdős problems at the per-problem cost of a few hundred dollars, proved 44/492 OEIS conjectures, and is being deployed in combinatorics, optimiza…

Matched: evaluation, formal proof, formal proof search, Lean, LLM · score 102

Source Mix

Interleaved papers, lab updates, and research-tool signals.

NVIDIA Developer Blog May 26, 2026

Run Key Genomics and Protein Folding Workloads Faster with NVIDIA RTX PRO 4500 Blackwell

Precision medicine depends on two fundamental capabilities: understanding disease at the genomic level and identifying treatments at the molecular level. ...

Matched: Blackwell, life sciences, molecular, NVIDIA, protein · score 47

NVIDIA Blog May 27, 2026

AI Factories: The New Infrastructure of Intelligence

AI factories are token factories, converting power into intelligence in real time. And as agentic AI scales and autonomous, always-on special agents are deployed in the enterprise, performance per watt and cost per token become the economics that matter.

Matched: agentic, agents, Blackwell, NVIDIA · score 29

Google Research Blog May 19, 2026

Empirical Research Assistance (ERA): From Nature publication to catalyzing Computational Discovery

General Science

Matched: computational discovery, empirical research assistance · score 26.6

Tesla-related News May 28, 2026

Three Humanoid Robotics ETFs Built for the Tesla Optimus and Figure AI Era Most Investors Have Never Heard Of - 24/7 Wall St.

Matched: Optimus, robotics · score 21.2

Hugging Face Blog May 27, 2026

ITBench-AA: Frontier Models Score Below 50% on the First Benchmark for Agentic Enterprise IT Tasks — by Artificial Analysis and IBM

No summary available.

Matched: agentic, benchmark · score 20.2

OpenAI Codex changelog May 29, 2026

Computer use and mobile access on Windows

Matched: changelog, computer use · score 20.1

Microsoft Research Blog May 21, 2026

MagenticLite, MagenticBrain, Fara1.5: An agentic experience optimized for small models

MagenticLite is an agentic system for small models that works across the browser and local file system in a single workflow. It combines specialized models and orchestration to support efficient agentic performance on everyday tasks.

Matched: agentic, workflow · score 19.1

OpenAI News May 27, 2026

Warp’s big bet on building open source with GPT-5.5

Warp uses GPT-5.5 and OpenAI models to coordinate coding agents across local, cloud, and open-source development workflows.

Matched: agents, open-source · score 19

Xiaomi-related News May 31, 2026

HyperOS 4 Eligible Device List & Rollout Timeline Leaks: Is Your Xiaomi Phone Getting Android 17? - nokiapoweruser.com

Matched: HyperOS · score 13.1

Google DeepMind News May 18, 2026

Fast-tracking genetic leads to reverse cellular aging

Biologists use Co-Scientist to find novel factors that successfully rejuvenate human cells.

Matched: co-scientist · score 13

Anthropic News May 31, 2026

Introducing Claude Opus 4.8 (Product, May 28, 2026)

An upgrade to our Opus class of models, with stronger performance across coding, agentic tasks, and professional work, and the consistency to handle long-running work.

Current highlighted link on Anthropic News.

Matched: agentic · score 12.1

Apple Newsroom May 19, 2026

Apple unveils new accessibility features, and updates with Apple Intelligence

Apple announced major accessibility updates powered by Apple Intelligence, including new capabilities for VoiceOver, Magnifier, and Voice Control.

Matched: Apple Intelligence · score 12.1

Anthropic Research May 31, 2026

Alignment

Current highlighted link on Anthropic Research.

Matched: alignment · score 7.7

Apple Machine Learning Research May 31, 2026

Machine Learning Research

Current highlighted link on Apple Machine Learning Research.

Matched: source/tag match · score 6

Microsoft AI News May 18, 2026

At aged care provider Regis, AI takes on paperwork so staff can focus on residents

Matched: source/tag match · score 5

arXiv — formal proof and mathematical reasoning May 26, 2026

MerLean-Prover: A Recursive Looping Harness for Lean 4 Theorem Proving

MerLean-Prover is an end-to-end Lean4 theorem prover that replaces sorry declarations with kernel-checkable proofs. It is built from three agent types (Planning, Check, and Lean) composed by a recursive outer loop whose unit of revision is the proof plan itself, and uses no fine-tuning, no custom RL objective, and no theorem-specific scaffolding. On FormalQualBench, a benchmark of 23 PhD-qualifying-exam theorems, Me…

Matched: benchmark, fine-tuning, Lean, Lean 4, open-source, theorem proving · score 61.5

NVIDIA Developer Blog May 27, 2026

NVIDIA Blackwell Sets STAC-AI Record for LLM Inference in Finance

Large language models (LLMs) are revolutionizing the financial trading landscape by enabling sophisticated analysis of vast amounts of unstructured data to...

Matched: agentic, Blackwell, LLM, NVIDIA, RAG, retrieval augmented generation · score 40.9

NVIDIA Blog May 18, 2026

Vera Arrives: NVIDIA’s First CPU Built for Agents Lands at Top AI Labs

The first NVIDIA Vera CPUs arrived at three of the world's leading AI labs on Friday — Anthropic in San Francisco, OpenAI in Mission Bay, SpaceXAI in Palo Alto — followed by a delivery to Oracle Cloud Infrastructure in Santa Clara on Monday. NVIDIA Vice President of Hyperscale and High-Performance Computing Ian Buck hand-delivered them.

Matched: agentic, agents, NVIDIA · score 27.2

Google Research Blog May 28, 2026

A New Era of Innovation: Google Research at I/O 2026

General Science

Matched: source/tag match · score 5

Tesla-related News May 24, 2026

Weekend Round-Up: Tesla's FSD In China, Nvidia's Uber Partnership, Boeing's 737 Max Case Victory And More - Benzinga

Matched: FSD, NVIDIA · score 21.2

Hugging Face Blog May 23, 2026

Towards Speed-of-Light Text Generation with Nemotron-Labs Diffusion Language Models

No summary available.

Matched: diffusion · score 6.7

Microsoft Research Blog May 28, 2026

Data Formulator 0.7: AI-powered data analytics for enterprise data

Data Formulator introduces AI-powered analytics for enterprise data workflows. Data teams can easily bring enterprise data into an AI-ready workspace where users can explore, analyze, and visualize data with AI agents to turn raw data into actionable insights.

Matched: agents · score 11

OpenAI News May 28, 2026

How Endava builds an agentic organization with Codex

Learn how Endava uses Codex to build an agentic organization, accelerating software delivery and reducing requirements analysis from weeks to hours.

Matched: agentic · score 15.1

Xiaomi-related News May 30, 2026

40 Xiaomi devices have now received the HyperOS 3.1 update - MSN

Matched: HyperOS · score 13.1

Google DeepMind News May 21, 2026

We’re launching the Google DeepMind Accelerator program in Asia Pacific to tackle environmental risks

No summary available.

Matched: source/tag match · score 5

arXiv — formal proof and mathematical reasoning May 22, 2026

Agentic Proving for Program Verification

Agentic systems have recently emerged as state-of-the-art approaches for automated theorem proving in formal mathematics. To assess how far these capabilities extend to program verification, we evaluate Claude Code in an agentic proving framework on CLEVER, a Lean 4 benchmark for verifiable code generation. Our results show that Claude generates arguably valid specifications for 98.8% of problems (with 81.3% also ac…

Matched: agentic, benchmark, evaluation, Lean, Lean 4, theorem proving · score 57.1

NVIDIA Developer Blog May 19, 2026

NVIDIA-Verified Agent Skills Provide Capability Governance for AI Agents

Autonomous AI agents are becoming more capable. Open models, Model Context Protocol (MCP)-connected tools, and portable skills are also making agents easier to...

Matched: agentic, agents, MCP, Model Context Protocol, NVIDIA · score 38.2

NVIDIA Blog May 26, 2026

NVIDIA Vera CPU Is ‘Packing a Heavy-Hitting Punch’ Against Competition

The shift to agentic AI creates a new CPU requirement for the AI factory: fast cores, massive memory bandwidth and the ability to sustain high performance when all cores are active. Initial benchmark results published by Phoronix today show that the NVIDIA Vera CPU meets this need. For this first public look, the benchmark scope […]

Matched: agentic, benchmark, NVIDIA · score 25.1

Google Research Blog May 27, 2026

Private analytics via zero-trust aggregation

Security, Privacy and Abuse Prevention

Matched: source/tag match · score 5

Tesla-related News May 21, 2026

Tesla's FSD entry into China accelerates the autonomous driving sector; Zidatech (02650) pioneers the global launch of a home charging robot compatible with Tesla. - Moomoo

Matched: autonomous driving, FSD · score 21.2

Microsoft Research Blog May 27, 2026

Extending Human Intelligence Through AI

Understanding AI as an extension of human intelligence—not a replacement for it—offers a more grounded path for building trustworthy AI systems.

Matched: source/tag match · score 5

OpenAI News May 27, 2026

Building self-improving tax agents with Codex

See how OpenAI, Thrive, and Crete built a self-improving tax agent with Codex, automating filings, improving accuracy, and accelerating workflows.

Matched: agents · score 15.1

Xiaomi-related News May 30, 2026

HyperOS 4 confirmed to bring MiClaw autonomous AI assistant support, launch in July or August - MSN

Matched: HyperOS · score 13.1

Anthropic Research May 31, 2026

Economic Futures

Current highlighted link on Anthropic Research.

Matched: source/tag match · score 5

Apple Machine Learning Research May 31, 2026

Publications

Current highlighted link on Apple Machine Learning Research.

Matched: source/tag match · score 6

arXiv — formal proof and mathematical reasoning May 26, 2026

ReasonOps: A Unified Operational Paradigm for Trustworthy Verified LLM Reasoning

Large Language Models (LLMs) have transformed artificial intelligence from primarily generative systems into increasingly capable reasoning agents. Recent advances in theorem proving, autoformalization, symbolic reasoning, and tool-augmented language models demonstrate substantial progress toward machine-assisted formal reasoning. However, current reasoning systems still suffer from hidden logical inconsistencies, h…

Matched: agents, autoformalization, formal verification, LLM, symbolic reasoning, theorem proving · score 55.7

NVIDIA Developer Blog May 20, 2026

Mastering Agentic Techniques: AI Agent Customization

Autonomous AI agents are taking on all types of work for businesses: routing logistics fleets, triaging support tickets, generating code, and orchestrating...

Matched: agentic, agents, NVIDIA, RAG, retrieval augmented generation · score 36.1

NVIDIA Blog May 19, 2026

NVIDIA and Google Cloud Empower the Next Wave of AI Builders

At this year’s Google I/O conference, NVIDIA and Google Cloud are accelerating the work of more than 100,000 developers in the companies’ joint developer community, which provides curated learning paths, hands-on labs and events that help them build using the full-stack NVIDIA AI platform on Google Cloud. Launched at Google I/O last year, the community […]

Matched: agentic, Blackwell, NVIDIA · score 25.1

Tesla-related News May 20, 2026

Tesla’s Full Self-Driving software is creeping into Europe - TechCrunch

Matched: Full Self-Driving, self-driving · score 21.2

Microsoft Research Blog May 21, 2026

Vega: Zero-knowledge proofs for digital identity in the age of AI

Vega turns a full credential into a single proof, sharing only what is needed and nothing more, with performance that works in real apps.

Matched: source/tag match · score 5

OpenAI News May 22, 2026

OpenAI named a Leader in enterprise coding agents by Gartner

OpenAI is named a leader in the 2026 Gartner Magic Quadrant for Enterprise AI Coding Agents, with Codex recognized for innovation and enterprise-scale deployment.

Matched: agents · score 15.1

Xiaomi-related News May 30, 2026

Xiaomi HyperOS 4 (Android 17) update: Full list of eligible devices - MSN

Matched: HyperOS · score 13.1

Anthropic Research May 31, 2026

Economic Research

Current highlighted link on Anthropic Research.

Matched: source/tag match · score 5

arXiv — formal proof and mathematical reasoning May 25, 2026

Keep the Proof State Live: Snapshotting for Efficient Tactic Search in Lean 4

Automated theorem proving systems built on Lean 4 increasingly rely on parallel tactic search over partially specified proofs, such as those generated by Draft-Sketch-Prove (DSP) pipelines. In current systems, each search branch reconstructs a proof state by re-running elaboration, leading to substantial per-branch overhead. In Lean 4 with Mathlib, this cost has two components: (1) import loading, which deserializes…

Matched: Lean, Lean 4, proof search, theorem proving · score 54

NVIDIA Developer Blog May 19, 2026

Mastering Agentic Techniques: AI Agent Evaluation

Evaluating an AI model and evaluating an AI agent are related—but they answer fundamentally different questions. A model benchmark tests the capability of a...

Matched: agentic, benchmark, evaluation, NVIDIA · score 32.2

NVIDIA Blog May 18, 2026

NVIDIA CEO Jensen Huang at Dell Technologies World: ‘Demand Is Going Parabolic, Utterly Parabolic’

Agentic AI inference at one-tenth the cost per token with NVIDIA Vera Rubin NVL72. Agent sandboxes run 50% faster on NVIDIA Vera than traditional CPUs — while enterprise data queries are up to 3x faster with the Vera CPU. And 5,000 enterprises like Lilly, Samsung and Honeywell are running AI workloads on Dell AI Factories […]

Matched: agentic, CUDA, NVIDIA · score 25.1

Tesla-related News May 21, 2026

Tesla FSD Now Nags Less, And That Has Safety Critics Worried - Autoblog

Matched: FSD, safety · score 15.8

OpenAI News May 18, 2026

OpenAI and Dell partner to bring Codex to hybrid and on-premise enterprise environments

OpenAI and Dell partner to bring Codex to hybrid and on-premise environments, helping enterprises deploy AI coding agents securely across data and workflows.

Matched: agents · score 13

Xiaomi-related News May 30, 2026

HyperOS 3.1 update is yet to reach these Xiaomi devices - MSN

Matched: HyperOS · score 13.1

Anthropic Research May 31, 2026

Interpretability

Current highlighted link on Anthropic Research.

Matched: source/tag match · score 5

Apple Machine Learning Research May 31, 2026

ParaRNN: Large-Scale Nonlinear RNNs, Trainable in Parallel

Current highlighted link on Apple Machine Learning Research.

Matched: source/tag match · score 6

arXiv — formal proof and mathematical reasoning May 28, 2026

Formalizing Mathematics at Scale

We present AutoformBot, a multi-agent system for building an Autoformalized Textbook Library At Scale (Atlas) in Lean 4. AutoformBot orchestrates thousands of LLM agents, equipped with formal verification tools, dependency-aware task scheduling, and collaborative version control, to translate informal textbook prose into machine-checked definitions and proofs. We apply our methods to a corpus of 26 open-access textb…

Matched: agents, formal verification, Lean, Lean 4, LLM, open-source · score 51

NVIDIA Developer Blog May 26, 2026

Develop High-Performance GPU Kernels in C++ with NVIDIA CUDA Tile

Developers can now use NVIDIA CUDA Tile programming within large existing C++ GPU codebases to develop highly optimized GPU kernels using tile-based...

Matched: CUDA, GPU, NVIDIA · score 28.3

NVIDIA Blog May 28, 2026

NVIDIA Research Advances Robotics From Simulation to the Real World

Robotics is entering a new phase: moving from controlled demos and scripted automation toward generalizable, reliable embodied autonomy in the real world. At the International Conference on Robotics and Automation (ICRA), eight of NVIDIA Research’s 28 accepted papers show how simulation-to-real transfer is becoming a foundation for that shift, helping robots perceive, reason, plan and […]

Matched: NVIDIA, robotics · score 21.2

Tesla-related News May 31, 2026

Tesla Quietly Renamed FSD Just Before China Owners Took It To Court - Autoblog

Matched: FSD · score 13.1

OpenAI News May 28, 2026

OpenAI’s Frontier Governance Framework

Explore OpenAI’s Frontier Governance Framework and how our AI safety, security, and risk practices align with emerging EU and California regulations.

Matched: safety · score 9

Xiaomi-related News May 28, 2026

HyperOS 4 to debut in July/August and these devices can get it first - Huawei Central

Matched: HyperOS · score 13.1

arXiv — formal proof and mathematical reasoning May 27, 2026

Risk-Controlled Lean-as-Judge for Natural-Language Mathematical Reasoning

Lean is increasingly used to judge natural-language mathematical answers, but its signal is partial: many answers never formalize, and a failed proof may reflect an ill-typed statement or a missing library fact, not a wrong answer. On MATH-500 we show this signal is (i) sharply coverage-dependent, that is, the proof-winning answer is correct 96% of the time at high proved coverage but 20% at low, and (ii) sparse and…

Matched: autoformalization, Lean, mathematical reasoning · score 44

NVIDIA Developer Blog May 26, 2026

NVIDIA CUDA 13.3 Enhances GPU Development with Tile Programming in C++, Compiler Autotuning, and Python Updates

NVIDIA CUDA 13.3 brings new capabilities and performance optimizations to developers across the CUDA ecosystem. The launch of NVIDIA CUDA Tile programming in...

Matched: CUDA, GPU, NVIDIA · score 28.3

NVIDIA Blog May 21, 2026

NVIDIA GTC Taipei at COMPUTEX: Live Updates on What’s Next in AI

At NVIDIA GTC Taipei at COMPUTEX, the world’s developers, researchers and industry leaders are converging to dive into the latest breakthroughs shaping every industry, covering topics spanning AI factories and scaling infrastructure to agentic and physical AI and more.

Matched: agentic, NVIDIA · score 19.1

Tesla-related News May 31, 2026

Autonomous & Self-Driving Vehicle News: Tesla, Waymo, WeRide, Helm.ai, Torc & More - AUTO Connected Car News

Matched: self-driving · score 13.1

OpenAI News May 19, 2026

Advancing content provenance for a safer, more transparent AI ecosystem

OpenAI advances AI content provenance with Content Credentials, SynthID, and a verification tool to help people identify and trust AI-generated media.

Matched: safety · score 9

Xiaomi-related News May 27, 2026

Xiaomi announced the launch of a new HyperOS in July and August - Root-Nation.com

Matched: HyperOS · score 13.1

arXiv — AI/ML/CL/CV/stat.ML May 28, 2026

SoundnessBench: Can Your AI Scientist Really Tell Good Research Ideas from Bad Ones?

Autonomous AI research agents aim to accelerate scientific discovery by automating the research pipeline, from hypothesis generation to peer review. However, existing benchmarks rarely test a fundamental bottleneck: whether Large Language Models can judge the methodological viability of a research idea before expending time and computational resources. We introduce SoundnessBench, a curated benchmark of 1,099 machin…

Matched: agents, AI scientist, benchmark, hypothesis generation, scientific discovery · score 41.8

NVIDIA Developer Blog May 27, 2026

NVIDIA Dynamo Snapshot: Fast Startup for Inference Workloads on Kubernetes

The cold-start problem: In production inference deployments, demand fluctuates over time, requiring inference replicas to scale elastically. However,...

Matched: agentic, agents, NVIDIA · score 24.1

NVIDIA Blog May 28, 2026

The Name’s Gaming … Cloud Gaming: ‘007 First Light’ Launches on GeForce NOW

License to stream, shaken and stirred. GeForce NOW is dialing up the espionage with the launch of 007 First Light, letting members slip into James Bond’s reimagined origin story from almost any device — no tux or preloads required. For a limited time, the game is included with the purchase of a 12‑month GeForce NOW […]

Matched: NVIDIA · score 11

Tesla-related News May 31, 2026

2 top AI robotics stocks to consider above Tesla - Yahoo Finance UK

Matched: robotics · score 13.1

OpenAI News May 29, 2026

Boston Children’s uses AI to unlock new diagnoses

Boston Children’s Hospital uses OpenAI technology to improve patient care, reduce operational burden, and help diagnose more than 40 rare disease cases.

Matched: source/tag match · score 7

Xiaomi-related News May 26, 2026

Xiaomi Confirms Next-Generation HyperOS Launch for July and August - Gizchina.com

Matched: HyperOS · score 13.1

arXiv — formal proof and mathematical reasoning May 27, 2026

On Compositional Learning Behaviours in Formal Mathematics

Self-evolving scientific agents capable of conquering the hard tail of formal mathematics require Compositional Learning Behaviours (CLBs) -- the capacity to ground and recombine novel symbolic structures in context, beyond mere recombination of prelearned atoms. We propose S2B-LM, an adaptation of the Symbolic Behaviour Benchmark that removes numerical processing as a confound and adds chain-of-thought sca…

Matched: agents, benchmark, Lean, olympiad · score 39

NVIDIA Developer Blog May 22, 2026

Synthesize Realistic 3D Medical Images at Scale to Ship Pre‑Trained Models

High‑quality 3D medical imaging data is the foundation of modern radiology AI, but access to it is often constrained by data scarcity, privacy restrictions,...

Matched: agentic, life sciences, NVIDIA · score 24