Introduction – Web Search and Extract API for AI Agents and RAG Workflows

Introduction

Web search and content extraction API for AI agents and RAG workflows

One-sentence introduction

Similar to OpenRouter, but focused on web search and content extraction APIs.

Background

AI agents and RAG workflows need real-time, dynamic knowledge to understand problems more deeply and perform better. A web search API provides that access, overcoming the static-knowledge limitation inherent in large language models (LLMs). With a web search API integrated, AI agents can perform intent-aware retrieval: they break complex queries into sub-tasks and fetch the most relevant, up-to-date information from the web.

This capability is critical for multi-step reasoning and planning. It lets agents run independent actions in parallel, such as checking weather, booking hotels, and planning routes, through directed acyclic graph (DAG)-based task planners. For instance, Mistral AI's Agents API shows that augmenting LLMs with web search tools significantly improves accuracy: with web search enabled, SimpleQA benchmark scores rose from 23% to 75% for Mistral Large and from 22% to 82% for Mistral Medium.
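As a rough illustration of the DAG-based planning described above, the following sketch runs every task whose dependencies are satisfied in parallel, then moves on to the next layer. The task names, the `TASKS` graph, and the `run_task` stand-in are hypothetical; a real agent would call search or booking tools where the sketch just sleeps.

```python
import asyncio

# Hypothetical task graph: each task maps to the tasks it depends on.
TASKS = {
    "weather": [],
    "hotels": [],
    "route": [],
    "itinerary": ["weather", "hotels", "route"],
}

async def run_task(name, inputs):
    # Stand-in for a real web search or tool call.
    await asyncio.sleep(0.01)
    return f"{name}({', '.join(sorted(inputs))})"

async def run_dag(tasks):
    results = {}
    pending = dict(tasks)
    while pending:
        # Every task whose dependencies have all finished can run now.
        ready = [n for n, deps in pending.items()
                 if all(d in results for d in deps)]
        if not ready:
            raise ValueError("cycle or missing dependency in task graph")
        # Run the whole ready layer concurrently.
        outputs = await asyncio.gather(
            *(run_task(n, {d: results[d] for d in pending[n]}) for n in ready)
        )
        for name, out in zip(ready, outputs):
            results[name] = out
            del pending[name]
    return results

if __name__ == "__main__":
    print(asyncio.run(run_dag(TASKS)))
```

Here the three independent lookups run in one concurrent batch, and the dependent `itinerary` task starts only after all three have returned.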

Web search API access is also expensive. Here are the quoted prices from major providers:

Provider	Cost (per 1,000 searches)
Google	$35
Bing	$35
OpenAI	$10
Anthropic	$10
Tavily	$8~$16
Brave	$3~$5
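To make the price gap concrete, the short sketch below scales the per-1,000-search prices from the table to a monthly volume. The 1,000,000 searches per month figure is an assumed workload for illustration only, and the function name `monthly_cost` is hypothetical.

```python
# Prices in USD per 1,000 searches, copied from the table above.
# Ranges are stored as (low, high); single prices repeat the value.
PRICES_PER_1K = {
    "Google": (35, 35),
    "Bing": (35, 35),
    "OpenAI": (10, 10),
    "Anthropic": (10, 10),
    "Tavily": (8, 16),
    "Brave": (3, 5),
}

def monthly_cost(provider, searches_per_month):
    """Return the (low, high) monthly cost in USD for a provider."""
    low, high = PRICES_PER_1K[provider]
    scale = searches_per_month / 1000
    return (low * scale, high * scale)

if __name__ == "__main__":
    volume = 1_000_000  # assumed workload, not a figure from this document
    for provider in PRICES_PER_1K:
        low, high = monthly_cost(provider, volume)
        if low == high:
            print(f"{provider}: ${low:,.0f}")
        else:
            print(f"{provider}: ${low:,.0f} to ${high:,.0f}")
```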

What are we doing?

We are building a web search API router service that aggregates web search API providers in the market to offer a cost-effective and high-quality web search API service.
We crawl and analyze web pages to extract the main content and generate summaries, saving both the raw content and the summaries as Markdown so that web information reaches LLMs in a clean, faithful form.
We will prioritize content related to Wikipedia, programming, medicine, law, and similar fields.
In the future, we will build a search engine specifically for text-based web pages. For now, we will not process video or audio, because in most current scenarios the core input to LLMs is text.
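The router idea in the first point can be sketched as follows. This is a minimal illustration, not the service's actual design: it assumes routing purely by price with fallback on failure, and the `Provider` class, the `route` function, and the stub backends are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Provider:
    name: str
    cost_per_1k: float                  # USD per 1,000 searches (illustrative)
    search: Callable[[str], list[str]]  # returns result URLs or raises

class AllProvidersFailed(Exception):
    pass

def route(query: str, providers: list[Provider]) -> tuple[str, list[str]]:
    """Try providers from cheapest to most expensive; return the first success."""
    errors = []
    for p in sorted(providers, key=lambda p: p.cost_per_1k):
        try:
            results = p.search(query)
        except Exception as exc:
            # Record the failure and fall through to the next provider.
            errors.append(f"{p.name}: {exc}")
            continue
        if results:
            return p.name, results
        errors.append(f"{p.name}: no results")
    raise AllProvidersFailed("; ".join(errors))

# Stub backends standing in for real provider clients.
def cheap_backend(query):
    raise TimeoutError("upstream timed out")

def pricey_backend(query):
    return [f"https://example.com/search?q={query.replace(' ', '+')}"]

providers = [
    Provider("pricey", 10.0, pricey_backend),
    Provider("cheap", 3.0, cheap_backend),
]

if __name__ == "__main__":
    print(route("rag agents", providers))
```

In this run the cheaper stub fails, so the query falls back to the more expensive one. A production router would presumably also weigh result quality, latency, and query type rather than price alone.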

Feasibility Analysis

There are many web search API providers on the market. We can integrate them and route queries to suitable providers.
Historical data can be used to train AI models, reducing future operational costs.
Users can contribute data and receive rewards, which will also make our system more competitive.
Common Crawl has billions of web pages, and we have nearly 60 million (query, SERP) pairs. Based on these, we can build our own index and ranker.
We have algorithm experts who have previously worked at Google and Alibaba, bringing extensive experience.
Our peers have all raised money and are growing rapidly. Tavily raised $25M and Firecrawl just raised a $14.5M Series A.

Dev team promise

All creator fees will be used to purchase servers.
A Pump.fun livestream every Monday night to report on project progress.
Gross profit distribution: 50% toward the token, 50% toward project operations.

FAQ