Physics-Grounded AI Ultrasound: Raw Signals Bypass Decades of Imaging Dogma

Hugging Face April 2026
NV-Raw2Insights-US embeds the wave equation into a neural network and processes raw radiofrequency ultrasound data directly, rather than reconstructing an image first. The system adapts its imaging parameters to the tissue type in real time, delivering expert-level diagnostic quality even in the hands of non-specialist operators.

A new AI system, NV-Raw2Insights-US, is challenging the fundamental pipeline of medical ultrasound. Instead of the conventional 'image first, interpret later' workflow, the model ingests raw radiofrequency (RF) signals, the unprocessed electrical echoes from tissue, and maps them directly to diagnostic insights. The core innovation is the integration of the acoustic wave equation into the neural network's loss function and architecture. This physics-informed approach forces the model to learn physically plausible tissue reconstructions, filtering out the noise and motion artifacts that plague traditional beamforming and post-processing.

Crucially, the system adapts in real time: it analyzes the statistical properties of the returning RF signal to identify tissue type (fat, muscle, bone, blood) and dynamically adjusts reconstruction parameters, such as frequency weighting and speckle reduction, to optimize resolution and contrast for that specific tissue. This adaptive capability directly addresses the long-standing bottleneck of operator dependency. In clinical benchmarks, NV-Raw2Insights-US reduced inter-operator variability by over 60% and improved lesion detection sensitivity by 22 percentage points compared with state-of-the-art commercial systems.

The implications extend beyond static imaging: the system's low-latency, raw-signal processing pipeline makes it suitable for continuous monitoring applications, such as tracking tumor perfusion during chemotherapy or assessing cardiac function in ambulatory settings. By moving AI from a post-hoc image analyst to a co-processor that understands the physics of sound propagation, this work signals a shift toward 'physics-aware' medical AI that could redefine the cost and accessibility of diagnostic ultrasound globally.

Technical Deep Dive

NV-Raw2Insights-US fundamentally rethinks the ultrasound signal chain. Traditional ultrasound imaging is a multi-stage process: a transducer emits acoustic pulses; the returning echoes (raw RF data) are digitized; a beamforming algorithm reconstructs a B-mode image by assuming a uniform speed of sound and applying delay-and-sum operations; then post-processing (gain, dynamic range compression, speckle reduction) creates the final visual image. This pipeline discards vast amounts of information—phase relationships, frequency-dependent attenuation, and non-linear scattering—that are present in the raw RF signal.
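
To make the discarded information concrete, here is a minimal sketch of the delay-and-sum step in Python (NumPy), assuming a linear array, a single scan line at the array centre, and one uniform speed of sound. Function and variable names are illustrative, not taken from any vendor implementation; real beamformers add apodization, dynamic focusing, and interpolation.

```python
import numpy as np

def delay_and_sum(rf, element_x, fs, c=1540.0):
    """Minimal delay-and-sum beamformer for a single scan line at x = 0.

    rf        : (n_elements, n_samples) raw RF channel data
    element_x : (n_elements,) lateral element positions in metres
    fs        : sampling rate in Hz
    c         : assumed uniform speed of sound in m/s
    """
    n_elements, n_samples = rf.shape
    depths = np.arange(n_samples) * c / (2.0 * fs)   # depth of each output sample
    line = np.zeros(n_samples)
    for i in range(n_elements):
        # Two-way path: transmit straight down to depth, echo back to element i.
        dist = depths + np.sqrt(depths**2 + element_x[i] ** 2)
        idx = np.round(dist / c * fs).astype(int)     # "delay" in samples
        valid = idx < n_samples
        line[valid] += rf[i, idx[valid]]              # "sum" across channels
    return line / n_elements
```

Note how the summation collapses the per-channel phase and frequency content into a single amplitude trace: everything NV-Raw2Insights-US feeds to its network is thrown away at this line.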

The NV-Raw2Insights-US architecture replaces this with an end-to-end neural network that operates directly on the raw RF time-series data. The network is a hybrid model combining a convolutional encoder-decoder with a differentiable physics simulator embedded in the latent space. The key architectural components are:

1. Physics-Informed Loss Function: The training objective includes a term that constrains the network's output to satisfy the acoustic wave equation (a second-order partial differential equation) under the measured boundary conditions. This is implemented using automatic differentiation to compute the PDE residual, penalizing physically implausible reconstructions. It is conceptually similar to the Physics-Informed Neural Networks (PINNs) framework popularized by Raissi et al., but applied to a high-dimensional, real-time inverse problem.
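
A minimal sketch of such a PDE-residual penalty, reduced to one spatial dimension for clarity. The published system reportedly uses automatic differentiation in higher dimensions; this version uses finite differences on a space-time grid, and the function name is illustrative. A travelling wave p(x, t) = sin(k(x - ct)) should incur a near-zero penalty, while an arbitrary field should not.

```python
import numpy as np

def wave_residual_loss(p, dx, dt, c=1540.0):
    """Mean squared residual of the 1-D acoustic wave equation p_tt = c^2 * p_xx.

    p      : (n_t, n_x) predicted pressure field on a space-time grid
    dx, dt : grid spacings in metres and seconds
    """
    # Second-order central differences, evaluated on interior grid points only.
    p_tt = (p[2:, 1:-1] - 2.0 * p[1:-1, 1:-1] + p[:-2, 1:-1]) / dt**2
    p_xx = (p[1:-1, 2:] - 2.0 * p[1:-1, 1:-1] + p[1:-1, :-2]) / dx**2
    residual = p_tt - c**2 * p_xx
    return float(np.mean(residual**2))
```

Adding this term to a reconstruction loss pushes the network toward outputs that a real acoustic field could actually have produced, which is the sense in which implausible reconstructions are "penalized".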

2. Adaptive Tissue Parameterization: The network includes a small, fast sub-network that analyzes the raw RF signal's spectral content and statistical moments (e.g., Nakagami distribution parameters, which correlate with tissue scatterer density). This sub-network outputs a set of 'tissue priors'—estimates of speed of sound, attenuation coefficient, and backscatter cross-section—that are fed as conditioning inputs to the main reconstruction decoder. This allows the decoder to dynamically adjust its filters and activation functions based on whether it is imaging a homogeneous fluid-filled cyst, a fibrous muscle, or a calcified bone surface.
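
The Nakagami statistics mentioned above can be sketched with a standard method-of-moments estimator on the echo envelope. This is a generic textbook estimator, not the paper's sub-network; the shape parameter m correlates with scatterer density, and m = 1 recovers fully developed Rayleigh speckle.

```python
import numpy as np

def nakagami_params(envelope):
    """Method-of-moments estimate of Nakagami parameters from an RF envelope.

    envelope : 1-D array of echo envelope amplitudes (e.g. |analytic RF signal|)
    Returns (m, omega): the shape parameter m and the scaling parameter omega.
    """
    r2 = envelope.astype(float) ** 2
    omega = r2.mean()                 # omega = E[R^2]
    m = omega**2 / r2.var()           # m = E[R^2]^2 / Var(R^2)
    return m, omega
```

A fast estimator like this, computed per image patch, is the kind of statistic a lightweight sub-network can turn into conditioning inputs (speed of sound, attenuation, backscatter priors) for the main decoder.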

3. Real-Time Inference Pipeline: The system is implemented on NVIDIA's Clara AGX platform, leveraging TensorRT for optimized inference. The end-to-end latency from raw RF input to diagnostic output is under 30 milliseconds for a 128-channel, 1024-sample RF frame, enabling real-time video-rate processing (30+ fps). This is critical for continuous monitoring applications.
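
The quoted 28 ms figure can be sanity-checked against the video-rate budget with simple arithmetic. The 16-bit sample width below is an assumption (the article does not state the ADC bit depth); everything else comes from the numbers above.

```python
def frame_budget(channels=128, samples=1024, bytes_per_sample=2, fps=30):
    """Back-of-envelope throughput check for the real-time RF pipeline.

    Returns (frame_bytes, raw_rate_mb_s, budget_ms): bytes per RF frame,
    raw input data rate at the given frame rate, and the per-frame latency
    budget that video-rate processing implies.
    """
    frame_bytes = channels * samples * bytes_per_sample   # 128*1024*2 = 262,144 B
    raw_rate_mb_s = frame_bytes * fps / 1e6               # ~7.9 MB/s at 30 fps
    budget_ms = 1000.0 / fps                              # ~33.3 ms per frame
    return frame_bytes, raw_rate_mb_s, budget_ms
```

At 30 fps each frame must clear the pipeline in about 33 ms, so the reported 28 ms end-to-end latency fits the budget with only modest headroom, which is why on-device acceleration matters here.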

| Benchmark | Traditional Beamforming | NV-Raw2Insights-US | Improvement |
|---|---|---|---|
| Contrast-to-Noise Ratio (CNR) in fatty liver phantoms | 1.8 dB | 3.4 dB | +89% |
| Axial Resolution (mm) at 5 cm depth | 0.8 mm | 0.4 mm | 2x |
| Lesion Detection Sensitivity (in-vivo, n=150) | 71% | 93% | +22 pts |
| Inter-Operator Variability (Dice coefficient) | 0.62 | 0.88 | +42% |
| Inference Latency per Frame | 45 ms (GPU beamformer) | 28 ms | -38% |

Data Takeaway: The physics-informed approach delivers dramatic improvements in both image quality metrics (CNR, resolution) and clinical utility (lesion detection, operator consistency). The latency is low enough for real-time use, a critical requirement for clinical adoption.
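
For readers who want to reproduce the CNR comparison on their own phantom data, here is a minimal sketch using one common CNR definition; several variants exist in the literature, and the article does not specify which one it used.

```python
import numpy as np

def cnr_db(lesion_roi, background_roi):
    """Contrast-to-noise ratio between two image regions, in dB.

    Uses one common definition, CNR = |mu_l - mu_b| / sqrt(var_l + var_b),
    reported as 20*log10(CNR). Inputs are pixel intensities sampled from a
    lesion ROI and a background ROI of the same image.
    """
    mu_l, mu_b = lesion_roi.mean(), background_roi.mean()
    var_l, var_b = lesion_roi.var(), background_roi.var()
    return 20.0 * np.log10(abs(mu_l - mu_b) / np.sqrt(var_l + var_b))
```

Because CNR definitions differ (some omit the dB conversion or use a single pooled variance), absolute numbers are only comparable when the same formula and ROI placement are used for both systems.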

A relevant open-source project for readers is the `deep-ultrasound` repository (currently 1,200+ stars on GitHub), which provides a PyTorch-based framework for simulating ultrasound RF data and training deep learning models. While it does not yet incorporate physics-informed losses, it offers a strong starting point for researchers looking to replicate or extend this work.

Key Players & Case Studies

The development of NV-Raw2Insights-US is attributed to a team led by Dr. Elena Vasquez at NVIDIA's Medical AI research division, in collaboration with clinical partners at the Mayo Clinic and the University of Cambridge. Dr. Vasquez's prior work on physics-informed neural networks for seismic imaging (published at NeurIPS 2022) provided the theoretical foundation. The project leverages NVIDIA's Clara platform for deployment and the MONAI framework for medical imaging AI.

Competing approaches are emerging from several quarters:

- Butterfly Network (Butterfly iQ+): Uses a single-crystal ultrasound-on-chip and relies on cloud-based deep learning for image enhancement, but still processes reconstructed B-mode images, not raw RF data. Their approach improves image quality but does not fundamentally address the information loss inherent in the beamforming step.
- GE HealthCare (Vscan Air): Focuses on wireless, pocket-sized devices with AI-assisted guidance and automated measurements. Their AI models operate on the final image, not the raw signal, and do not adapt to tissue properties in real-time.
- Samsung Medison (SonoSync): Offers AI-based scan plane recognition and automated measurements, but again, operates on reconstructed images.
- Startup: DeepSono (Stealth mode, raised $15M Series A): Claims to be developing a raw-RF processing pipeline, but has not released clinical data. Their approach reportedly uses a transformer-based architecture without explicit physics integration.

| Feature | NV-Raw2Insights-US | Butterfly iQ+ | GE Vscan Air |
|---|---|---|---|
| Input Data | Raw RF signals | B-mode images | B-mode images |
| Physics Embedding | Yes (wave equation) | No | No |
| Adaptive Tissue Tuning | Yes (real-time) | No | No |
| Real-Time Inference | On-device (Clara AGX) | Cloud-dependent | On-device |
| Inter-Operator Variability Reduction | 42% | ~15% (est.) | ~10% (est.) |

Data Takeaway: NV-Raw2Insights-US is the only system that combines raw-signal processing with physics-informed learning and adaptive tissue tuning. Competitors have not yet matched this level of architectural innovation, though DeepSono may be a future threat if they can demonstrate clinical efficacy.

Industry Impact & Market Dynamics

The global ultrasound market was valued at approximately $8.5 billion in 2025, with a CAGR of 5.2%. The largest growth segments are point-of-care ultrasound (POCUS) and handheld devices, driven by the need for low-cost, portable diagnostics in primary care and low-resource settings. However, the single biggest barrier to POCUS adoption is operator dependency—studies show that non-expert users produce diagnostic-quality images only 40-60% of the time.

NV-Raw2Insights-US directly addresses this barrier. If the system can consistently deliver expert-level image quality regardless of operator skill, it could unlock a massive expansion of the addressable market. We estimate that reducing operator dependency could increase POCUS adoption rates by 3-5x in primary care clinics, urgent care centers, and rural hospitals.

| Market Segment | Current Annual Volume (2025) | Projected Volume with NV-Raw2Insights-US (2030) | Growth Driver |
|---|---|---|---|
| Primary Care POCUS | 12M scans | 60M scans | Reduced training requirements |
| Emergency/Trauma | 25M scans | 35M scans | Faster, more reliable triage |
| Remote/Monitoring | 5M scans | 25M scans | Continuous monitoring feasibility |
| Low-Resource Settings | 3M scans | 20M scans | Affordable, portable devices |

Data Takeaway: The primary care and remote monitoring segments could see explosive growth, potentially adding $3-5 billion in new market value by 2030, if the technology achieves widespread regulatory clearance and clinical validation.

From a business model perspective, NVIDIA is well-positioned to monetize this through its Clara platform and hardware sales (Orin/AGX modules), rather than selling complete ultrasound systems. This allows them to partner with existing OEMs (GE, Philips, Siemens) to license the software stack, or to enable new entrants (Butterfly, Clarius) to leapfrog their current capabilities. A licensing model with per-scan or per-device fees could generate recurring revenue streams.

Risks, Limitations & Open Questions

Despite the promise, several challenges remain:

1. Regulatory Hurdles: The FDA and CE mark processes for AI-based medical devices that modify imaging parameters in real-time are still evolving. NV-Raw2Insights-US would likely be classified as a Class II or Class III device (depending on intended use), requiring extensive clinical trials to demonstrate safety and efficacy. The adaptive nature of the system—where the AI changes behavior based on tissue type—raises questions about validation: how do you test every possible tissue combination and pathology?

2. Generalization to Pathological Tissues: The system's adaptive tuning relies on statistical properties of the raw RF signal. While this works well for normal tissue types, pathological tissues (tumors, abscesses, fibrosis) may have unexpected RF signatures that could confuse the tissue-classification sub-network. Training data must include a diverse range of pathologies, which is notoriously difficult to acquire for ultrasound.

3. Interpretability: The end-to-end nature of the system makes it a 'black box'—clinicians cannot easily understand why the AI chose a particular reconstruction parameter. This is a significant barrier to clinical trust. The physics-informed loss function provides some level of interpretability (the output must satisfy the wave equation), but the internal representations remain opaque.

4. Hardware Dependency: The system requires a GPU-capable compute module (Clara AGX) that adds cost and power consumption. While this is acceptable for cart-based systems, it may be challenging for truly handheld, battery-powered devices. NVIDIA's next-generation Orin Nano modules may address this, but thermal management remains a concern.

5. Data Privacy and Latency: For remote monitoring applications, raw RF data is much larger than compressed images (typically 10-50 MB per second of video). Transmitting this to the cloud for inference is impractical in low-bandwidth settings. On-device inference is essential, but raises questions about model updates and data logging for quality assurance.

AINews Verdict & Predictions

NV-Raw2Insights-US represents a genuine architectural breakthrough in medical AI. By embedding physics into the learning process, it moves beyond the 'pattern matching on images' paradigm that has dominated the field. The adaptive tissue tuning is particularly elegant—it turns a weakness of traditional ultrasound (signal variability) into a strength.

Our predictions:

1. Within 18 months, NVIDIA will announce a commercial licensing deal with at least two of the top five ultrasound OEMs (GE, Philips, Siemens, Samsung, Canon). The first product will be a software upgrade for existing cart-based systems, targeting radiology and cardiology.

2. Within 3 years, a version of this technology will be integrated into a handheld POCUS device, likely through a partnership with Butterfly Network or Clarius. This will be the 'iPhone moment' for ultrasound—a device that consistently produces diagnostic-quality images in the hands of a nurse or paramedic.

3. The biggest impact will not be in high-end imaging, but in low-resource settings. The combination of low-cost hardware, AI-driven image quality, and remote monitoring capability could make ultrasound as ubiquitous as the stethoscope in primary care clinics across Sub-Saharan Africa and Southeast Asia. This is where the technology could save the most lives.

4. Watch for a competing approach from Google Health or a similar AI lab using a foundation model trained on massive datasets of raw RF data (if such data can be aggregated). The physics-informed approach may give NVIDIA a 2-3 year lead, but a pure data-driven approach could eventually catch up if training data scales sufficiently.

The transition from 'AI as image analyst' to 'AI as co-processor that understands physics' is the next frontier in medical imaging. NV-Raw2Insights-US is the first credible proof point. The question is no longer whether AI can interpret ultrasound, but whether AI can *do* ultrasound better than a human. The answer, increasingly, appears to be yes.
