MedGemma: Our most capable open models for health AI development
2025-10-25 · via Google DeepMind News

Healthcare is increasingly embracing AI to improve workflow management, patient communication, and diagnostic and treatment support. It’s critical that these AI-based systems are not only high-performing, but also efficient and privacy-preserving. It’s with these considerations in mind that we built and recently released Health AI Developer Foundations (HAI-DEF). HAI-DEF is a collection of lightweight open models designed to offer developers robust starting points for their own health research and application development. Because HAI-DEF models are open, developers retain full control over privacy, infrastructure and modifications to the models. In May of this year, we expanded the HAI-DEF collection with MedGemma, a collection of generative models based on Gemma 3 that are designed to accelerate healthcare and life sciences AI development.

Today, we’re proud to announce two new models in this collection. The first is MedGemma 27B Multimodal, which complements the previously released 4B Multimodal and 27B text-only models by adding support for complex multimodal and longitudinal electronic health record interpretation. The second new model is MedSigLIP, a lightweight image and text encoder for classification, search, and related tasks. MedSigLIP is based on the same image encoder that powers the 4B and 27B MedGemma models.

MedGemma and MedSigLIP are strong starting points for medical research and product development. MedGemma is useful for medical text or imaging tasks that require generating free text, like report generation or visual question answering. MedSigLIP is recommended for imaging tasks that involve structured outputs like classification or retrieval. All of the above models can be run on a single GPU, and MedGemma 4B and MedSigLIP can even be adapted to run on mobile hardware.

Full details of MedGemma and MedSigLIP development and evaluation can be found in the MedGemma technical report.

MedGemma: A multimodal generative model for health

The MedGemma collection includes variants in 4B and 27B sizes, both of which now accept image and text inputs and produce text outputs.

  • MedGemma 4B Multimodal: MedGemma 4B scores 64.4% on MedQA, which ranks it among the best very small (<8B) open models. In an unblinded study, 81% of MedGemma 4B–generated chest X-ray reports were judged by a US board-certified radiologist to be of sufficient accuracy to result in similar patient management compared to the original radiologist reports. It additionally achieves performance on medical image classification tasks that is competitive with task-specific state-of-the-art models.
  • MedGemma 27B Text and MedGemma 27B Multimodal: Based on internal and published evaluations, the MedGemma 27B models are among the best performing small open models (<50B) on the MedQA medical knowledge and reasoning benchmark; the text variant scores 87.7%, which is within 3 points of DeepSeek R1, a leading open model, but at approximately one tenth the inference cost. The MedGemma 27B models are competitive with larger models across a variety of benchmarks, including retrieval and interpretation of electronic health record data.

We developed these models by training a medically optimized image encoder (independently released as MedSigLIP, described below), followed by training the corresponding 4B and 27B versions of the Gemma 3 model on medical data. We took care to retain the general (non-medical) capabilities of Gemma throughout this process. This allows MedGemma to perform well on tasks that mix medical and non-medical information while preserving its instruction-following ability and its capabilities in non-English languages.

A key aspect of these models is their adaptability. For instance, after fine-tuning, MedGemma 4B is able to achieve state-of-the-art performance on chest X-ray report generation, with a RadGraph F1 score of 30.3. The straightforward ability for developers to improve performance on their target applications highlights the value of MedGemma as a starting point for developers looking to build AI for healthcare.

MedSigLIP: A specialized image encoder for healthcare

MedSigLIP is a lightweight image encoder of only 400M parameters that uses the Sigmoid loss for Language Image Pre-training (SigLIP) architecture. MedSigLIP was adapted from SigLIP via tuning with diverse medical imaging data, including chest X-rays, histopathology patches, dermatology images, and fundus images, allowing the model to learn nuanced features specific to these modalities. Importantly, we also took care to ensure that MedSigLIP retains strong performance on the natural images on which the original SigLIP model was trained, maintaining its versatility.
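To make the "Sigmoid loss" in the SigLIP name concrete, here is a minimal sketch of a pairwise sigmoid loss over a batch of image and text embeddings. It is an illustration only, not MedSigLIP's training code: the function names, the toy 2-D embeddings, and the temperature and bias values are assumptions chosen for the example. Each image-text pair is treated as an independent binary decision (matched or not), rather than being normalized across the batch with a softmax.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid_pairwise_loss(image_embs, text_embs, temperature=10.0, bias=-10.0):
    """Pairwise sigmoid loss for a batch of L2-normalized embeddings.

    Pair (i, j) is a positive when i == j and a negative otherwise.
    temperature and bias are illustrative values (assumptions).
    """
    n = len(image_embs)
    total = 0.0
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(image_embs[i], text_embs[j]))
            logit = temperature * dot + bias
            label = 1.0 if i == j else -1.0
            # -log(sigmoid(label * logit)) equals softplus(-label * logit)
            total += softplus(-label * logit)
    return total / n


# Toy batch: hypothetical 2-D unit vectors standing in for encoder outputs.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned_texts = [[1.0, 0.0], [0.0, 1.0]]      # each text matches its image
mismatched_texts = [[0.0, 1.0], [1.0, 0.0]]   # texts swapped

aligned_loss = sigmoid_pairwise_loss(images, aligned_texts)
mismatched_loss = sigmoid_pairwise_loss(images, mismatched_texts)
print(aligned_loss < mismatched_loss)
```

The loss is lower when matching images and texts point in the same direction, which is what pushes the two modalities into a shared embedding space.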

MedSigLIP is designed to bridge the gap between medical images and medical text by encoding them into a common embedding space. MedSigLIP achieves similar or improved classification performance compared to task-specific vision embedding models while being far more versatile across medical imaging domains.

MedSigLIP is ideal for:

  • Traditional image classification: Build performant models to classify medical images.
  • Zero-shot image classification: Classify images without specific training examples by comparing image embeddings to the embeddings of textual class labels.
  • Semantic image retrieval: Find visually or semantically similar images from large medical image databases.
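The zero-shot classification and retrieval uses listed above both reduce to comparing vectors in the shared embedding space. The sketch below assumes the embeddings have already been computed by an encoder; the 3-D vectors, label names, and image IDs are hypothetical placeholders, and cosine similarity is one reasonable choice of comparison rather than a documented requirement.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def zero_shot_classify(image_emb, label_embs):
    """Pick the label whose text embedding is closest to the image embedding.

    label_embs maps a label string to its text embedding.
    """
    scores = {label: cosine(image_emb, emb) for label, emb in label_embs.items()}
    return max(scores, key=scores.get), scores


def retrieve(query_emb, index, k=2):
    """Return the k image IDs in `index` most similar to the query embedding."""
    ranked = sorted(index, key=lambda key: cosine(query_emb, index[key]), reverse=True)
    return ranked[:k]


# Hypothetical toy embeddings standing in for real encoder outputs.
label_embs = {
    "pneumothorax": [0.9, 0.1, 0.0],
    "no finding": [0.0, 0.2, 0.9],
}
image_emb = [0.8, 0.3, 0.1]
index = {
    "img_a": [0.7, 0.4, 0.0],
    "img_b": [0.0, 0.1, 1.0],
    "img_c": [0.9, 0.2, 0.1],
}

best_label, scores = zero_shot_classify(image_emb, label_embs)
top_matches = retrieve(image_emb, index, k=2)
print(best_label, top_matches)
```

No task-specific training happens here: changing the set of class labels only means embedding new label text, which is what makes the zero-shot setting attractive.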

The power of open models

Because the MedGemma collection is open, the models can be downloaded, built upon, and fine-tuned to support developers’ specific needs. Particularly in the medical space, this open approach offers several distinct advantages over API-based models:

  • Flexibility and privacy: Models can be run on proprietary hardware in the developer’s preferred environment, including on Google Cloud Platform or locally, which can address privacy concerns or institutional policies.
  • Customization for high performance: Models can be fine-tuned and modified to achieve optimal performance on target tasks and datasets.
  • Reproducibility and stability: Because the models are distributed as snapshots, their parameters are frozen and, unlike an API, will not change unexpectedly over time. This stability is particularly crucial for medical applications where consistency and reproducibility are paramount.
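One way a team might act on the reproducibility point is to record a checksum of the downloaded weights file and verify it before every run. This is a generic sketch of that practice, not a procedure described for MedGemma: the helper names are hypothetical, and a temporary file stands in for a real weights file.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_snapshot(path, expected_sha256):
    """Raise if the file on disk differs from the pinned snapshot digest."""
    actual = sha256_of_file(path)
    if actual != expected_sha256:
        raise ValueError(f"snapshot mismatch: expected {expected_sha256}, got {actual}")
    return True


# Demo with a temporary file standing in for a weights file (placeholder bytes).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"placeholder model weights")
    weights_path = tmp.name

pinned_digest = sha256_of_file(weights_path)   # recorded once, at download time
verified = verify_snapshot(weights_path, pinned_digest)

# Simulate an unexpected change to the file and confirm it is detected.
with open(weights_path, "ab") as f:
    f.write(b"!")
try:
    verify_snapshot(weights_path, pinned_digest)
    tamper_detected = False
except ValueError:
    tamper_detected = True

os.remove(weights_path)
print(verified, tamper_detected)
```

Storing the pinned digest alongside evaluation results lets a later reader confirm that the same parameters produced them.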

To ensure broad accessibility and ease of use, our Hugging Face collection offers MedSigLIP and MedGemma in the popular Hugging Face safetensors format.

What developers are building with MedGemma & MedSigLIP

Researchers and developers have been exploring the MedGemma models for their use cases and have found the models adept at solving some crucial problems. Developers at DeepHealth in Massachusetts, USA have been exploring MedSigLIP to improve their chest X-ray triaging and nodule detection. Researchers at Chang Gung Memorial Hospital in Taiwan noted that MedGemma works well with traditional Chinese-language medical literature and can respond well to medical staff questions. Developers at Tap Health in Gurgaon, India, remarked on MedGemma’s superior medical grounding, noting its reliability on tasks that require sensitivity to clinical context, such as summarizing progress notes or suggesting guideline-aligned nudges.

We’re excited to continue to learn about these and other use cases from developers as they create the next generation of Health AI tools with MedGemma and MedSigLIP.

Get started and explore

To help developers get started, we’ve provided detailed notebooks on GitHub for MedGemma and MedSigLIP that demonstrate how to create instances of MedSigLIP and MedGemma for both inference and fine-tuning on Hugging Face. When developers are ready to scale, MedGemma and MedSigLIP can be seamlessly deployed in Vertex AI as dedicated endpoints, and we provide examples on GitHub of how to run inference on these endpoints. We’ve also added a new demo to our HAI-DEF Hugging Face demo collection that shows how MedGemma can be built into an application to streamline pre-visit information gathering ahead of a patient appointment.

Refer to the following table to understand which model from the MedGemma family is ideal for your use case.

Please visit the HAI-DEF site for these resources and to learn more about the MedGemma collection and other Health AI Developer Foundations models. The HAI-DEF forum is available for questions or feedback.

Note on training datasets

Models were trained on a mix of public and private de-identified datasets. Google and its partners utilize datasets that have been rigorously anonymized or de-identified to ensure the protection of individual research participants and patient privacy.

Disclaimer

MedGemma and MedSigLIP are intended to be used as a starting point that enables efficient development of downstream healthcare applications involving medical text and images. MedGemma and MedSigLIP are not intended to be used without appropriate validation, adaptation and/or making meaningful modification by developers for their specific use case. The outputs generated by these models are not intended to directly inform clinical diagnosis, patient management decisions, treatment recommendations, or any other direct clinical practice applications. Performance benchmarks highlight baseline capabilities on relevant benchmarks, but even for image and text domains that constitute a substantial portion of training data, inaccurate model output is possible. All model outputs should be considered preliminary and require independent verification, clinical correlation, and further investigation through established research and development methodologies.

Acknowledgements

MedGemma is the product of a collaboration between Google Research and Google DeepMind. We thank the many people who contributed to this work, including the engineering and cross-functional members of the Google Health AI and Gemma teams, as well as our sponsors in Google Research and Google DeepMind.