Optical character recognition has moved from plain text extraction to document intelligence. Modern systems must read scanned and digital PDFs in one pass, preserve layout, detect tables, extract key ...
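As a rough illustration of one common building block for this kind of pipeline (not the specific system the article describes), here is a minimal sketch using pdfplumber to pull text and detected tables from a digital PDF. The file path is a placeholder, and scanned pages would additionally need an OCR engine.

```python
# Minimal sketch: extract text and detect tables from a digital PDF with
# pdfplumber. Scanned (image-only) pages would need a separate OCR pass.
import pdfplumber

with pdfplumber.open("report.pdf") as pdf:        # "report.pdf" is a placeholder
    for i, page in enumerate(pdf.pages):
        text = page.extract_text() or ""          # layout-aware text extraction
        tables = page.extract_tables()            # list of row-lists per detected table
        print(f"page {i}: {len(text)} chars, {len(tables)} table(s)")
```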
Can large language models collaborate without sending a single token of text? A team of researchers from Tsinghua University, Infinigence AI, The Chinese University of Hong Kong, Shanghai AI ...
AI companies use model specifications to define target behaviors during training and evaluation. Do current specs state the intended behaviors with enough precision, and do frontier models exhibit ...
The landscape of AI is expanding. Today, many of the most powerful large language models (LLMs) reside primarily in the cloud, offering incredible capabilities but also raising concerns about privacy and ...
Vibe Coding is redefining the software landscape by harnessing artificial intelligence to make code creation faster, more intuitive, and accessible to virtually anyone. In 2025, this trend has moved ...
Canary-1b-v2: Multilingual ASR + Translation (En ↔ 24 Languages)
Canary-1b-v2 is a billion-parameter encoder-decoder model trained on Granary, delivering high-quality transcription and translation ...
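A minimal usage sketch, assuming the checkpoint is published as nvidia/canary-1b-v2 and follows the NeMo EncDecMultiTaskModel interface used by earlier Canary releases; the model ID, keyword arguments, and audio path below are assumptions, not details confirmed by this excerpt.

```python
# Hedged sketch: load Canary-1b-v2 via NVIDIA NeMo and run transcription plus
# speech translation. Model ID, kwargs, and audio path are assumptions.
from nemo.collections.asr.models import EncDecMultiTaskModel

model = EncDecMultiTaskModel.from_pretrained("nvidia/canary-1b-v2")

# ASR: English speech -> English text
asr_out = model.transcribe(["sample_en.wav"], source_lang="en", target_lang="en")

# AST: English speech -> German text (language kwargs assumed from the
# Canary family's typical transcribe() interface)
ast_out = model.transcribe(["sample_en.wav"], source_lang="en", target_lang="de")
print(asr_out[0], ast_out[0])
```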
The Model Context Protocol (MCP) team has released the preview version of the MCP Registry, a system that could be the final puzzle piece for making enterprise AI truly production-ready. More than ...
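As a purely hypothetical illustration of what programmatic discovery against such a registry could look like, the sketch below issues a plain HTTP query; the base URL, path, and response shape are assumptions for illustration, not a documented API.

```python
# Hypothetical sketch: list servers from an MCP registry over HTTP.
# The endpoint URL and JSON field names are assumptions, not a published spec.
import requests

REGISTRY = "https://registry.modelcontextprotocol.io"  # assumed base URL

resp = requests.get(f"{REGISTRY}/v0/servers", timeout=10)  # assumed path
resp.raise_for_status()
for server in resp.json().get("servers", []):              # assumed field
    print(server.get("name"), "-", server.get("description"))
```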
Xiaomi’s MiMo team released MiMo-Audio, a 7-billion-parameter audio-language model that runs a single next-token objective over interleaved text and discretized speech, scaling pretraining beyond 100 ...
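To make that training objective concrete, here is a toy sketch of how text tokens and discretized speech tokens can be interleaved into one stream for a single next-token objective. All token IDs and delimiters are invented; MiMo-Audio's actual tokenizer, codebooks, and special tokens are not given in this excerpt.

```python
# Toy illustration of interleaving text tokens and discretized speech tokens
# into one sequence for next-token prediction. All IDs below are made up.
TEXT = [101, 734, 52]               # e.g. a short text span as token IDs
SPEECH = [9001, 9002, 9003, 9004]   # codec/RVQ codes for an audio span
BOS_AUDIO, EOS_AUDIO = 9998, 9999   # assumed delimiters around speech spans

sequence = TEXT + [BOS_AUDIO] + SPEECH + [EOS_AUDIO] + TEXT
# Training then predicts sequence[t+1] from sequence[:t+1] at every position,
# regardless of whether the next token is text or speech.
```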
Agentic RAG combines the strengths of traditional RAG—where large language models (LLMs) retrieve and ground outputs in external context—with agentic decision-making and tool use. Unlike static ...
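A minimal sketch of that agentic loop, with llm() and retriever() as hypothetical stand-in interfaces rather than any specific library's API: at each step the model decides whether to retrieve again or to answer.

```python
# Hypothetical agentic-RAG loop: the model chooses between issuing another
# retrieval query and answering. llm() and retriever() are stand-ins.
def agentic_rag(question, llm, retriever, max_steps=3):
    context = []
    for _ in range(max_steps):
        action = llm(f"Question: {question}\nContext: {context}\n"
                     "Reply RETRIEVE:<query> or ANSWER:<answer>")
        if action.startswith("RETRIEVE:"):
            # Model asked for more evidence: run another retrieval round
            context.extend(retriever(action[len("RETRIEVE:"):]))
        else:
            return action[len("ANSWER:"):]
    # Step budget exhausted: force a final grounded answer
    return llm(f"Answer using only this context: {context}\nQ: {question}")
```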
At the center of this release is the evolution of RAG architectures. Traditional RAG pipelines typically involve static queries to vector stores followed by synthesis via large language models.
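For contrast with the agentic loop above, the traditional pipeline is a single fixed pass; a sketch with the same kind of hypothetical stand-ins (embed(), vector_store.search(), llm() are illustrative, not a particular framework):

```python
# Traditional RAG for contrast: one static query, then one synthesis step.
# embed(), vector_store.search(), and llm() are hypothetical stand-ins.
def static_rag(question, embed, vector_store, llm, k=5):
    docs = vector_store.search(embed(question), k=k)  # single static lookup
    prompt = f"Context:\n{docs}\n\nQuestion: {question}"
    return llm(prompt)                                # single synthesis step
```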
What is catastrophic forgetting in foundation models? Foundation models excel in diverse domains but are largely static once deployed. Fine-tuning on new tasks often introduces catastrophic forgetting ...
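A common way to see the effect, sketched with hypothetical helpers: evaluate the model on its original task before and after fine-tuning on a new one, and measure how far the first score drops.

```python
# Hypothetical sketch of measuring catastrophic forgetting: fine-tune on
# task B and compare task-A accuracy before and after. evaluate() and
# finetune() are stand-ins, not a specific framework's API.
def forgetting_gap(model, task_a, task_b, evaluate, finetune):
    before = evaluate(model, task_a)   # e.g. 0.91 accuracy on task A
    model = finetune(model, task_b)    # adapt the deployed model to task B
    after = evaluate(model, task_a)    # e.g. 0.62 on task A afterwards
    return before - after              # a positive gap indicates forgetting
```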