site:www.marktechpost.com

Cache-to-Cache(C2C): Direct Semantic Communication Between Large Language Models via KV-Cache Fusion

Can large language models collaborate without sending a single token of text? a team of researchers from Tsinghua University, Infinigence AI, The Chinese University of Hong Kong, Shanghai AI ...

marktechpost

Comparing the Top 6 OCR (Optical Character Recognition) Models/Systems in 2025

Optical character recognition has moved from plain text extraction to document intelligence. Modern systems must read scanned and digital PDFs in one pass, preserve layout, detect tables, extract key ...

marktechpost

Google vs OpenAI vs Anthropic: The Agentic AI Arms Race Breakdown

In this article we will analyze how Google, OpenAI, and Anthropic are productizing ‘agentic’ capabilities across computer-use control, tool/function calling, orchestration, governance, and enterprise ...

marktechpost

OpenAI Introduces ChatGPT Atlas: A Chromium-based browser with a built-in AI agent

Atlas is a Chromium-based browser that keeps a persistent ChatGPT interface in the new tab page and as an “Ask ChatGPT” sidebar on any site. Users can summarize pages, compare products, extract data, ...

marktechpost

Context Engineering

Anthropic recently released a guide on effective Context Engineering for AI Agents — a reminder that context is a critical yet limited resource. The... In this tutorial, we explore how to build a ...

marktechpost

Alibaba’s Qwen AI Releases Compact Dense Qwen3-VL 4B/8B (Instruct & Thinking) With FP8 Checkpoints

Do you actually need a giant VLM when dense Qwen3-VL 4B/8B (Instruct/Thinking) with FP8 runs in low VRAM yet retains 256K→1M context and the full capability surface? Alibaba’s Qwen team has expanded ...

marktechpost

Model Context Protocol (MCP) vs Function Calling vs OpenAPI Tools — When to Use Each?

Orchestration Host routes across many servers/tools App-local chaining Agent/toolkit routes intents → operations ...

marktechpost

Google Proposes TUMIX: Multi-Agent Test-Time Scaling With Tool-Use Mixture

TUMIX runs a group of heterogeneous agents—text-only Chain-of-Thought, code-executing, web-searching, and guided variants—in parallel, then iterates a small number of refinement rounds where each ...

marktechpost

IBM Released new Granite 4.0 Models with a Novel Hybrid Mamba-2/Transformer Architecture: Drastically Reducing Memory Use without Sacrificing Performance

IBM just released Granite 4.0, an open-source LLM family that swaps monolithic Transformers for a hybrid Mamba-2/Transformer stack to cut serving memory while keeping quality. Sizes span a 3B dense ...

marktechpost

Google AI Proposes ReasoningBank: A Strategy-Level I Agent Memory Framework that Makes LLM Agents Self-Evolve at Test Time

Google AI Proposes ReasoningBank: A Strategy-Level AI Agent Memory Framework that Makes LLM Agents Self-Evolve at Test Time ...

marktechpost

MLPerf Inference v5.1 (2025): Results Explained for GPUs, CPUs, and AI Accelerators

decoding MLPerf Inference v5.1 2025 results, scenarios, TTFT/TPOT, power metrics for GPUs, CPUs, accelerators, datacenter, edge ...

marktechpost

Sakana AI Released ShinkaEvolve: An Open-Source Framework that Evolves Programs for Scientific Discovery with Unprecedented Sample-Efficiency

An open-source framework that couples LLM-driven program mutations with evolutionary search to automate algorithm discovery and optimization. Code and report are public. 2) How does it achieve higher ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results