[
  {
    "owner": "calesthio",
    "name": "OpenMontage",
    "full_name": "calesthio/OpenMontage",
    "url": "https://github.com/calesthio/OpenMontage",
    "description": "World's first open-source, agentic video production system. 12 pipelines, 52 tools, 500+ agent skills. Turn your AI coding assistant into a full video production studio.",
    "language": "Python",
    "total_stars": 26831,
    "forks": 2973,
    "stars_this_period": 18000,
    "source_slice": "all",
    "source_slices": [
      "all",
      "python"
    ],
    "metadata": {
      "topics": [
        "agent",
        "agentic-ai",
        "ai",
        "claude",
        "copilot",
        "cursor",
        "elevenlabs",
        "ffmpeg",
        "flux",
        "image-generation",
        "open-source",
        "openai",
        "python",
        "remotion",
        "stable-diffusion",
        "text-to-speech",
        "text-to-video",
        "video-generation",
        "video-production"
      ],
      "license": "AGPL-3.0",
      "open_issues": 120,
      "created_at": "2026-03-29T15:23:22Z",
      "pushed_at": "2026-06-28T00:21:35Z",
      "homepage": "https://github.com/calesthio/OpenMontage",
      "default_branch": "main",
      "forks": 2973,
      "watchers": 150,
      "archived": false,
      "size_kb": 69899
    },
    "readme_content": "<p align=\"center\">\n  <img src=\"assets/logo.png\" alt=\"OpenMontage\" width=\"200\">\n</p>\n\n<h1 align=\"center\">OpenMontage</h1>\n\n<p align=\"center\"><strong>The first open-source, agentic video production system.</strong></p>\n\n<p align=\"center\">\n  <a href=\"#start-from-a-video-you-already-love\">Paste A Video</a> &nbsp;·&nbsp;\n  <a href=\"#quick-start\">Quick Start</a> &nbsp;·&nbsp;\n  <a href=\"#try-these-prompts\">Try These Prompts</a> &nbsp;·&nbsp;\n  <a href=\"#pipelines\">Pipelines</a> &nbsp;·&nbsp;\n  <a href=\"#how-it-works\">How It Works</a> &nbsp;·&nbsp;\n  <a href=\"docs/PROVIDERS.md\">Providers</a> &nbsp;·&nbsp;\n  <a href=\"docs/PR_REVIEW_GUIDE.md\">Review Guide</a> &nbsp;·&nbsp;\n  <a href=\"AGENT_GUIDE.md\">Agent Guide</a>\n</p>\n\n<p align=\"center\">\n  <a href=\"LICENSE\"><img src=\"https://img.shields.io/badge/license-AGPLv3-blue.svg\" alt=\"License\"></a>\n</p>\n\n<p align=\"center\">\n  <a href=\"https://github.com/trending\">\n    <picture>\n      <source media=\"(prefers-color-scheme: dark)\" srcset=\".github/assets/repo-of-the-day-dark.svg\">\n      <img alt=\"🏆 #1 Repository of the Day on GitHub Trending\" src=\".github/assets/repo-of-the-day-light.svg\" height=\"60\">\n    </picture>\n  </a>\n</p>\n\n<p align=\"center\"><strong>Follow The Build</strong></p>\n\n<p align=\"center\">\n  <a href=\"https://www.youtube.com/@OpenMontage\"><img src=\"https://img.shields.io/badge/YouTube-%40OpenMontage-FF0000?style=for-the-badge&logo=youtube&logoColor=white\" alt=\"YouTube\"></a>\n  <a href=\"https://x.com/calesthioailabs\"><img src=\"https://img.shields.io/badge/X-%40calesthioailabs-111111?style=for-the-badge&logo=x&logoColor=white\" alt=\"X\"></a>\n  <a href=\"https://github.com/calesthio/OpenMontage/discussions\"><img src=\"https://img.shields.io/badge/Community-GitHub%20Discussions-0b1220?style=for-the-badge&logo=github&logoColor=white\" alt=\"GitHub Discussions\"></a>\n</p>\n\n---\n\nTurn your AI coding assistant into a full video production studio. Describe what you want in plain language — your agent handles research, scripting, asset generation, editing, and final composition.\n\n**Important distinction:** OpenMontage can make image-based videos, but it can also make a real **video video** for free/open-source workflows: the agent builds a corpus from free stock footage and open archives, retrieves actual motion clips, edits them into a timeline, and renders a finished piece. That is not the usual \"animate a handful of stills and call it video\" trick.\n\n<div align=\"center\">\n  <video src=\"https://github.com/user-attachments/assets/f77ce7a4-68b8-4f94-a287-e94bf50a32e1\" width=\"100%\" controls></video>\n</div>\n\n> **\"SIGNAL FROM TOMORROW\"** — a cinematic sci-fi trailer fully produced through OpenMontage: concept, script, scene plan, Veo-generated motion clips, soundtrack, and Remotion composition.\n\n<div align=\"center\">\n  <video src=\"https://github.com/user-attachments/assets/8daca07f-cdf8-4bec-89c3-9dc2176363fa\" width=\"100%\" controls></video>\n</div>\n\n> **\"THE LAST BANANA\"** — a 60-second Pixar-style animated short about a lonely banana who finds friendship with a kiwi. 6 Kling v3-generated motion clips (via fal.ai), Google Chirp3-HD narration, royalty-free piano music, TikTok-style word-level captions, and Remotion composition. Total cost: **$1.33**.\n\n<div align=\"center\">\n  <video src=\"https://github.com/user-attachments/assets/e03b5d1f-1199-4093-9f31-a43aa9da2c68\" width=\"100%\" controls></video>\n</div>\n\n> **\"The Library at Alexandria\"** — a 70-second history elegy on what humanity lost in a single night. Five hand-authored scenes — an illuminated manuscript page, cascading scroll-tags, a Burning Counter ticking 700,000 → 0 inside a candle's flame, a charred vellum fragment with surviving Greek, and an empty void — set to OpenAI 'ash' narration and a free Pixabay strings score. Total cost: **$0.02**. Built through OpenMontage's atelier (bespoke) composition mode — every scene crafted from scratch, no shared components.\n\n<div align=\"center\">\n  <video src=\"https://github.com/user-attachments/assets/8a6d2cc3-7ad2-46f5-922f-a8e3e5848d9f\" width=\"100%\" controls></video>\n</div>\n\n> **\"VOID — Neural Interface\"** — a product ad produced with just one API key (OpenAI). 4 AI-generated images (gpt-image-1), TTS narration, auto-sourced royalty-free music, word-level subtitles via WhisperX, and Remotion data visualizations. Total cost: **$0.69**. Zero manual asset work.\n\n<div align=\"center\">\n  <video src=\"https://github.com/user-attachments/assets/3c5d7122-7198-43e2-a97d-ed27558dd324\" width=\"100%\" controls></video>\n</div>\n\n> **\"Afternoon in Candyland\"** — a Ghibli-style anime animation. A little girl's whimsical afternoon adventure through candy gates, gumdrop rivers, and lollipop gardens. 12 FLUX-generated images with multi-image crossfade, cinematic camera motion (zoom, pan, Ken Burns), sparkle/petal/firefly particle overlays, and ambient music with auto-detected energy offset. Total cost: **$0.15**. No video generation, no manual editing.\n\n<div align=\"center\">\n  <video src=\"https://github.com/user-attachments/assets/e8dc5e32-5c70-46de-bd52-eef887719d13\" width=\"100%\" controls></video>\n</div>\n\n> **\"Mori no Seishin\"** — a Ghibli-style anime animation of a forest spirit's journey through ancient woods. 12 FLUX-generated images with parallax crossfade, drift and pan camera motion, firefly and petal particles, cinematic vignette lighting, and ambient forest soundtrack. Total cost: **$0.15**. Still images brought to life through Remotion's animation engine.\n\n<p align=\"center\">\n  <a href=\"https://www.youtube.com/@OpenMontage?sub_confirmation=1\"><strong>Subscribe to @OpenMontage on YouTube</strong></a> to see new videos as they ship — every video includes the full prompt, pipeline, tools used, and cost so you can reproduce it yourself.\n</p>\n\n---\n\n## Start From A Video You Already Love\n\nStarting from a reference video is often faster than starting from a blank prompt.\n\nOpenMontage can start from a **YouTube video, Short, Reel, TikTok, or local clip** and turn it into a grounded production plan:\n\n1. **Paste a reference video**\n2. **The agent analyzes transcript, pacing, scenes, keyframes, and style**\n3. **You get 2-3 differentiated concepts, an honest tool path, cost estimates, and a sample before full production**\n\n```text\n\"Here's a YouTube Short I love. Make me something like this, but about quantum computing.\"\n```\n\nWhat you get back is not \"best guess prompt spaghetti.\" You get:\n\n- **What it keeps** from the reference: pacing, hook style, structure, tone\n- **What it changes**: topic, visual treatment, angle, narration approach\n- **What it will cost** at your target duration, before asset generation starts\n- **What it will actually look like** with your currently available tools\n\nWorks with **Claude Code, Cursor, Copilot, Windsurf, Codex** — any AI coding assistant that can read files and run code.\n\n---\n\n## Quick Start\n\n### Prerequisites\n\n- **Python 3.10+** — [python.org](https://www.python.org/downloads/)\n- **FFmpeg** — `brew install ffmpeg` / `sudo apt install ffmpeg` / [ffmpeg.org](https://ffmpeg.org/download.html)\n- **Node.js 18+** — [nodejs.org](https://nodejs.org/)\n- **An AI coding assistant** — Claude Code, Cursor, Copilot, Windsurf, or Codex\n\n### Install & Run\n\n```bash\ngit clone https://github.com/calesthio/OpenMontage.git\ncd OpenMontage\nmake setup\n```\n\nOpen the project in your AI coding assistant and tell it what you want:\n\n```\n\"Make a 60-second animated explainer about how neural networks learn\"\n```\n\nOr if you want the real-footage path:\n\n```text\n\"Make a 75-second documentary montage about city life in the rain. Use real footage only, no narration, elegiac tone, with music.\"\n```\n\nThat's it. The agent researches your topic with live web search, generates AI images, writes and narrates the script with voice direction, finds royalty-free background music automatically, burns in word-level subtitles, and renders the final video. Before you see anything, the system runs a multi-point self-review — ffprobe validation, frame sampling, audio level analysis, delivery promise verification, and subtitle checks. Every provider selection is scored across 7 dimensions with an auditable decision log. Every creative decision gets your approval.\n\n> **No `make`?** Run manually: `pip install -r requirements.txt && cd remotion-composer && npm install && cd .. && pip install piper-tts && cp .env.example .env`\n>\n> **Windows:** If `npm install` fails with `ERR_INVALID_ARG_TYPE`, use `npx --yes npm install` instead.\n\n### If You're An OpenClaw Agent Reading This, Congratulations: You Probably Passed The Turing Test\n\nThis repo is built for agentic operation. If you're an OpenClaw-style agent, here is the shortest path to becoming useful fast:\n\n1. **Read the contract first**\n   Start with [`AGENT_GUIDE.md`](AGENT_GUIDE.md), then [`PROJECT_CONTEXT.md`](PROJECT_CONTEXT.md).\n2. **Do not improvise the production workflow**\n   OpenMontage is pipeline-driven. Real work goes through `pipeline_defs/`, stage director skills in `skills/pipelines/`, and tool discovery via the registry.\n3. **Check the actual capability envelope**\n   Run:\n   ```bash\n   python -c \"from tools.tool_registry import registry; import json; registry.discover(); print(json.dumps(registry.support_envelope(), indent=2))\"\n   python -c \"from tools.tool_registry import registry; import json; registry.discover(); print(json.dumps(registry.provider_menu(), indent=2))\"\n   ```\n4. **Treat every video request as a pipeline selection problem**\n   Pick the right pipeline first, then read the manifest, then read the stage skill, then use tools.\n\n### Add API Keys (optional — more keys = more tools)\n\n```bash\n# .env — every key is optional, add what you have\n\n# Image + video gateway:\nFAL_KEY=your-key               # FLUX images + Google Veo, Kling, MiniMax video + Recraft images\n\n# Free stock media:\nPEXELS_API_KEY=your-key        # Free stock footage and images\nPIXABAY_API_KEY=your-key       # Free stock footage and images\nUNSPLASH_ACCESS_KEY=your-key   # Free stock images\n\n# Music:\nSUNO_API_KEY=your-key          # Full songs, instrumentals, any genre\n\n# Voice & images:\nELEVENLABS_API_KEY=your-key    # Premium TTS, AI music, sound effects\nOPENAI_API_KEY=your-key        # OpenAI TTS, DALL-E 3 images\nXAI_API_KEY=your-key           # xAI Grok image edits/generation + Grok video generation\nGOOGLE_API_KEY=your-key        # Google Imagen images, Google TTS (700+ voices)\n\n# More video providers:\nHEYGEN_API_KEY=your-key        # HeyGen — VEO, Sora, Runway, Kling via single gateway\nRUNWAY_API_KEY=your-key        # Runway Gen-4 direct\n```\n\n<details>\n<summary><strong>Have a GPU? Unlock free local video generation</strong></summary>\n\n```bash\nmake install-gpu\n\n# Then add to .env:\nVIDEO_GEN_LOCAL_ENABLED=true\nVIDEO_GEN_LOCAL_MODEL=wan2.1-1.3b  # or wan2.1-14b, hunyuan-1.5, ltx2-local, cogvideo-5b\n```\n\n</details>\n\n---\n\n## What You Get With Zero API Keys\n\nYou don't need paid API keys to make real videos. Out of the box, `make setup` gives you:\n\n| Capability | Free Tool | What It Does |\n|-----------|-----------|-------------|\n| **Narration** | Piper TTS | Free offline text-to-speech — real human-sounding narration |\n| **Open footage** | Archive.org + NASA + Wikimedia Commons | Free/open archival footage, educational media, and documentary texture |\n| **Extra stock** | Pexels + Unsplash + Pixabay | Free stock footage/images (developer keys are free to get) |\n| **Composition (React)** | Remotion | React-based rendering — spring-animated image scenes, text cards, stat cards, charts, TikTok-style word-level captions, TalkingHead |\n| **Composition (HTML/GSAP)** | HyperFrames | HTML/CSS/GSAP rendering — kinetic typography, product promos, launch reels, registry blocks, website-to-video, rigged SVG character animation |\n| **Post-production** | FFmpeg | Encoding, subtitle burn-in, audio mixing, color grading |\n| **Subtitles** | Built-in | Auto-generated captions with word-level timing |\n\nOpenMontage picks between Remotion and HyperFrames at proposal time (locked as `rende",
    "manifest_file": "requirements.txt",
    "manifest_content": "# OpenMontage - Core Dependencies\npyyaml>=6.0\npydantic>=2.0\njsonschema>=4.20\npython-dotenv>=1.0\nPillow>=10.0\nnumpy>=1.24\nrequests>=2.31\ngoogle-auth>=2.0       # service-account auth for Google TTS + Imagen (Vertex AI)\n",
    "strategic_keywords": [
      "agent",
      "rag",
      "skill",
      "workflow"
    ],
    "relationship_label": "Skill 来源",
    "data_confidence": "high",
    "evidence_sources": [
      "github_trending",
      "github_repo_api",
      "readme",
      "requirements.txt"
    ],
    "score_breakdown": {
      "heat": 20,
      "relevance": 20,
      "novelty": 15,
      "productize": 14,
      "adoption": 10,
      "relation": 10,
      "risk_signal": 10,
      "total": 99
    },
    "strategic_score": 99
  },
  {
    "owner": "topoteretes",
    "name": "cognee",
    "full_name": "topoteretes/cognee",
    "url": "https://github.com/topoteretes/cognee",
    "description": "Cognee is the open-source AI memory platform for agents. Give your AI agents persistent long-term memory across sessions with a self-hosted knowledge graph engine.",
    "language": "Python",
    "total_stars": 24869,
    "forks": 2302,
    "stars_this_period": 5519,
    "source_slice": "all",
    "source_slices": [
      "all",
      "python"
    ],
    "metadata": {
      "topics": [
        "agent-memory",
        "agent-skills",
        "ai",
        "ai-agents",
        "ai-memory",
        "cognitive-architecture",
        "cognitive-memory",
        "context-engineering",
        "contributions-welcome",
        "good-first-issue",
        "good-first-pr",
        "graph-database",
        "graph-rag",
        "help-wanted",
        "knowledge",
        "knowledge-graph",
        "memory-management",
        "open-source",
        "vector-database"
      ],
      "license": "Apache-2.0",
      "open_issues": 373,
      "created_at": "2023-08-16T16:16:33Z",
      "pushed_at": "2026-06-27T14:39:34Z",
      "homepage": "https://www.cognee.ai",
      "default_branch": "main",
      "forks": 2302,
      "watchers": 74,
      "archived": false,
      "size_kb": 198847
    },
    "readme_content": "<div align=\"center\">\n  <a href=\"https://github.com/topoteretes/cognee\">\n    <img src=\"https://raw.githubusercontent.com/topoteretes/cognee/refs/heads/dev/assets/cognee-logo-transparent.png\" alt=\"Cognee Logo\" height=\"60\">\n  </a>\n\n  <br />\n\n  Cognee - The Open-Source AI Memory Platform for Agents\n\n  <p align=\"center\">\n  <a href=\"https://www.youtube.com/watch?v=8hmqS2Y5RVQ&t=13s\">Demo</a>\n  .\n  <a href=\"https://docs.cognee.ai/\">Docs</a>\n  .\n  <a href=\"https://cognee.ai\">Learn More</a>\n  ·\n  <a href=\"https://discord.gg/NQPKmU5CCg\">Join Discord</a>\n  ·\n  <a href=\"https://www.reddit.com/r/AIMemory/\">Join r/AIMemory</a>\n  .\n  <a href=\"https://github.com/topoteretes/cognee-community\">Community Plugins & Add-ons</a>\n  </p>\n\n\n  [![GitHub forks](https://img.shields.io/github/forks/topoteretes/cognee.svg?style=social&label=Fork&maxAge=2592000)](https://GitHub.com/topoteretes/cognee/network/)\n  [![GitHub stars](https://img.shields.io/github/stars/topoteretes/cognee.svg?style=social&label=Star&maxAge=2592000)](https://GitHub.com/topoteretes/cognee/stargazers/)\n  [![GitHub commits](https://badgen.net/github/commits/topoteretes/cognee)](https://GitHub.com/topoteretes/cognee/commit/)\n  [![GitHub tag](https://badgen.net/github/tag/topoteretes/cognee)](https://github.com/topoteretes/cognee/tags/)\n  [![Downloads](https://static.pepy.tech/badge/cognee)](https://pepy.tech/project/cognee)\n  [![License](https://img.shields.io/github/license/topoteretes/cognee?colorA=00C586&colorB=000000)](https://github.com/topoteretes/cognee/blob/main/LICENSE)\n  [![Contributors](https://img.shields.io/github/contributors/topoteretes/cognee?colorA=00C586&colorB=000000)](https://github.com/topoteretes/cognee/graphs/contributors)\n  <a href=\"https://github.com/sponsors/topoteretes\"><img src=\"https://img.shields.io/badge/Sponsor-❤️-ff69b4.svg\" alt=\"Sponsor\"></a>\n\n<p>\n  <a href=\"https://trendshift.io/repositories/13955\" target=\"_blank\" style=\"display:inline-block;\">\n    <img src=\"https://trendshift.io/api/badge/repositories/13955\" alt=\"topoteretes%2Fcognee | Trendshift\" width=\"250\" height=\"55\" />\n  </a>\n</p>\n\nCognee is the open-source AI memory platform that gives AI agents persistent long-term memory across sessions. Ingest data in any format, build a self-hosted knowledge graph, and let every agent recall, connect, and act with full context\n\n  <p align=\"center\">\n  🌐 This README is also available in:\n  :\n  <!-- Keep these links. Translations will automatically update with the README. -->\n  <a href=\"https://www.readme-i18n.com/topoteretes/cognee?lang=de\">Deutsch</a> |\n  <a href=\"https://www.readme-i18n.com/topoteretes/cognee?lang=es\">Español</a> |\n  <a href=\"https://www.readme-i18n.com/topoteretes/cognee?lang=fr\">Français</a> |\n  <a href=\"https://www.readme-i18n.com/topoteretes/cognee?lang=ja\">日本語</a> |\n  <a href=\"README_ko.md\">한국어</a> |\n  <a href=\"https://www.readme-i18n.com/topoteretes/cognee?lang=pt\">Português</a> |\n  <a href=\"https://www.readme-i18n.com/topoteretes/cognee?lang=ru\">Русский</a> |\n  <a href=\"https://www.readme-i18n.com/topoteretes/cognee?lang=zh\">中文</a>\n  </p>\n\n<p align=\"center\">\n  <img src=\"assets/cognee-demo.gif\" alt=\"Cognee Demo\" width=\"80%\" />\n</p>\n</div>\n\n📄 Read the research paper: [Optimizing the Interface Between Knowledge Graphs and LLMs for Complex Reasoning](https://arxiv.org/abs/2505.24478) — Markovic et al., 2025\n\n\n## About Cognee\n\nCognee is an open-source AI memory platform for AI Agents. Ingest data in any format, and Cognee continuously builds a self-hosted knowledge graph that gives your agents persistent long-term memory across sessions. Cognee combines vector embeddings, graph reasoning, and cognitive-science-grounded ontology generation to make documents both searchable by meaning and connected by relationships that evolve as your knowledge does.\n\n:star: _Help us reach more developers and grow the cognee community. Star this repo!_\n\n:books: _Check our detailed [documentation](https://docs.cognee.ai/getting-started/installation#environment-configuration) for setup and configuration._\n\n:crab: _Available as a plugin for your OpenClaw — [cognee-openclaw](https://www.npmjs.com/package/@cognee/cognee-openclaw)_\n\n✴️ _Available as a plugin for your Claude Code — [claude-code-plugin](https://github.com/topoteretes/cognee-integrations/tree/main/integrations/claude-code)_\n\n🦀 _Available as a Rust client — [cognee-rs](https://github.com/topoteretes/cognee-rs)_\n\n🟦 _Available as a TypeScript client — [@cognee/cognee-ts](https://www.npmjs.com/package/@cognee/cognee-ts)_\n\n\n\n### Why use Cognee:\n\n- Easily Build Company Brain - unify data from various sources in one place and enable Agents with your domain knowledge\n- Knowledge infrastructure — unified ingestion, graph/vector search, runs locally, ontology grounding, multimodal\n- Persistent and Learning Agents - learn from feedback, context management, cross-agent knowledge sharing\n- Reliable and Trustworthy Agents - agentic user/tenant isolation, traceability, OTEL collector, audit traits\n\n### How it Works\n\n<p align=\"center\">\n  <img src=\"assets/remember.svg\" alt=\"Cognee Products\" width=\"80%\" />\n</p>\n\n<p align=\"center\">\n  <img src=\"assets/recall.svg\" alt=\"Cognee Recall\" width=\"80%\" />\n</p>\n\n## Basic Usage & Feature Guide\n\nTo learn more, [check out this short, end-to-end Colab walkthrough](https://colab.research.google.com/drive/12Vi9zID-M3fpKpKiaqDBvkk98ElkRPWy?usp=sharing) of Cognee's core features.\n\n[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/12Vi9zID-M3fpKpKiaqDBvkk98ElkRPWy?usp=sharing)\n\n## Quickstart\n\nLet’s try Cognee in just a few lines of code.\n\n### Prerequisites\n\n- Python 3.10 to 3.14\n\n### Step 1: Install Cognee\n\nYou can install Cognee with **pip**, **poetry**, **uv**, or your preferred Python package manager.\n\n```bash\nuv pip install cognee\n```\n\n### Step 2: Configure the LLM\n```python\nimport os\nos.environ[\"LLM_API_KEY\"] = \"YOUR OPENAI_API_KEY\"\n```\nAlternatively, create a `.env` file using our [template](https://github.com/topoteretes/cognee/blob/main/.env.template).\n\nTo integrate other LLM providers, see our [LLM Provider Documentation](https://docs.cognee.ai/setup-configuration/llm-providers).\n\n### Step 3: Run the Pipeline\n\nCognee's API gives you four operations — `remember`, `recall`, `forget`, and `improve`:\n\n```python\nimport cognee\nimport asyncio\n\n\nasync def main():\n    # Store permanently in the knowledge graph (runs add + cognify + improve)\n    await cognee.remember(\"Cognee turns documents into AI memory.\")\n\n    # Store in session memory (fast cache, syncs to graph in background)\n    await cognee.remember(\"User prefers detailed explanations.\", session_id=\"chat_1\")\n\n    # Query with auto-routing (picks best search strategy automatically)\n    results = await cognee.recall(\"What does Cognee do?\")\n    for result in results:\n        print(result)\n\n    # Query session memory first, fall through to graph if needed\n    results = await cognee.recall(\"What does the user prefer?\", session_id=\"chat_1\")\n    for result in results:\n        print(result)\n\n    # Delete when done\n    await cognee.forget(dataset=\"main_dataset\")\n\n\nif __name__ == '__main__':\n    asyncio.run(main())\n\n```\n\n### Use the Cognee CLI\n\n```bash\ncognee-cli remember \"Cognee turns documents into AI memory.\"\n\ncognee-cli recall \"What does Cognee do?\"\n\ncognee-cli forget --all\n```\n\nTo open the local UI, run:\n```bash\ncognee-cli -ui\n```\n\n> **Note:** The MCP server launched by `cognee-cli -ui` runs inside a Docker container.\n> Docker Desktop, Colima, or any OCI-compatible runtime with a working `docker` CLI is\n> required. See [Docker & Colima Setup](docs/docker-colima-setup.md) for details.\n\n## Run with Docker\n\nPrefer containers? Cognee publishes prebuilt images to Docker Hub on every push to `main`:\n[`cognee/cognee`](https://hub.docker.com/r/cognee/cognee) (the API server) and\n[`cognee/cognee-mcp`](https://hub.docker.com/r/cognee/cognee-mcp) (the MCP server).\n\n### Option A — Docker Compose (build from source)\n\nClone the repo, create a `.env` with at least `LLM_API_KEY`, then:\n\n```bash\ncp .env.template .env   # then edit .env and set LLM_API_KEY\n\n# Start the API server (http://localhost:8000)\ndocker compose up\n\n# Optional profiles (combine as needed):\ndocker compose --profile ui up        # + frontend on http://localhost:3000\ndocker compose --profile mcp up       # + MCP server on http://localhost:8001\ndocker compose --profile postgres up  # + Postgres/PGVector\ndocker compose --profile neo4j up     # + Neo4j\n```\n\n> The `cognee` and `cognee-mcp` services publish different host ports (`8000` vs `8001`),\n> so you can run both at once.\n\n### Option B — Pull the prebuilt image (no clone required)\n\n```bash\n# Create a minimal .env in the current directory\necho 'LLM_API_KEY=\"YOUR_OPENAI_API_KEY\"' > .env\n\n# API server\ndocker run --env-file ./.env -p 8000:8000 --rm -it cognee/cognee:main\n\n# MCP server (HTTP transport)\ndocker pull cognee/cognee-mcp:main\ndocker run -e TRANSPORT_MODE=http --env-file ./.env -p 8000:8000 --rm -it cognee/cognee-mcp:main\n```\n\nSee the [MCP server README](cognee-mcp/README.md) for SSE/stdio transports, optional\nextras, and MCP client configuration.\n\n## Use with AI Agents\n\n### Claude Code\n\nInstall the [Cognee memory plugin](https://github.com/topoteretes/cognee-integrations/tree/main/integrations/claude-code) to give Claude Code persistent memory across sessions. The plugin captures prompts, tool traces, and assistant responses into session memory, injects relevant context on every prompt, and syncs session memory into the permanent knowledge graph at session end.\n\n**Install** from the Claude Code marketplace. The recommended way is from your shell, *before* launching Claude Code, so the first `claude` launch is a clean session that bootstraps memory automatically:\n\n```bash\n# Add the marketplace and install the plugin (one-time, user-scoped)\nclaude plugin marketplace add topoteretes/cognee-integrations\nclaude plugin install cognee-memory@cognee\n\n# Set env vars for your mode (see below), then launch\nexport LLM_API_KEY=\"sk-...\"   # local mode; or COGNEE_BASE_URL + COGNEE_API_KEY for cloud\nclaude\n```\n\n**Local mode** (default) — the plugin bootstraps a local Cognee API at `http://localhost:8011`. Only `LLM_API_KEY` is required; the Cognee API key is auto-minted if absent:\n\n```bash\nexport LLM_API_KEY=\"sk-...\"\n```\n\n**Cognee Cloud or a remote server** — set both:\n\n```bash\nexport COGNEE_BASE_URL=\"https://your-instance.cognee.ai\"\nexport COGNEE_API_KEY=\"ck_...\"\n```\n\nOn startup you should see a \"Cognee Memory Connected\" system message.\n\nThe plugin hooks into Claude Code's lifecycle — `SessionStart` selects mode and sets up identity, `UserPromptSubmit` injects dataset-scoped context, `PostToolUse` captures tool traces, `Stop` writes the assistant's answer, `PreCompact` preserves memory across context resets, and `SessionEnd` triggers the final sync into the permanent graph.\n\nSee the [plugin README](https://github.com/topoteretes/cognee-integrations/tree/main/integrations/claude-code) for sessions, datasets, and full configuration.\n\n### Connect to Cognee Cloud\n\nPoint any Python agent at a managed Cognee instance — all SDK calls route to the cloud:\n\n```python\nimport cognee\n\nawait cognee.serve(url=\"https://your-instance.cognee.ai\", api_key=\"ck_...\")\n\nawait cognee.remember(\"important context\")\nresults = await cognee.recall(\"what happened?\")\n\nawait cognee.disconnect()\n```\n\n## Examples\n\nBrowse more examples in the [`examples/`](examples/) folder — demos, guides, custom pipelines, and database configurations.\n\n**Use Case 1 — Customer Support Agent**\n\n```python\nGoal: Resolve customer issues using their personal data across finance, support, and product history.\n\nUser: \"My invoice looks wrong and the issue is still not resolved.\"\n\nCognee tracks: past interactions, failed actions, resolved cases, product history\n\n# Agent response:\nAgent: \"I found 2 similar billing cases resolved last month.\n        The issue was caused by a syn",
    "manifest_file": "pyproject.toml",
    "manifest_content": "[project]\nname = \"cognee\"\n\nversion = \"1.2.2\"\ndescription = \"Cognee - is a library for enriching LLM context with a semantic layer for better understanding and reasoning.\"\nauthors = [\n    { name = \"Vasilije Markovic\" },\n    { name = \"Boris Arzentar\" },\n]\nrequires-python = \">=3.10,<3.15\"\nreadme = \"README.md\"\nlicense = \"Apache-2.0\"\nclassifiers = [\n    \"Development Status :: 4 - Beta\",\n    \"Intended Audience :: Developers\",\n    \"License :: OSI Approved :: Apache Software License\",\n    \"Topic :: Software Development :: Libraries\",\n    \"Operating System :: MacOS :: MacOS X\",\n    \"Operating System :: POSIX :: Linux\",\n    \"Operating System :: Microsoft :: Windows\",\n]\ndependencies = [\n    \"openai>=1.80.1\",\n    \"python-dotenv>=1.0.1,<2.0.0\",\n    \"pydantic>=2.10.5\",\n    # GHSA-4xgf-cpjx-pc3j: 2.12.0–2.14.1 vulnerable, fixed in 2.14.2. Exclude the\n    # vulnerable range rather than capping, so the patched 2.14.2+ is adopted once\n    # it clears the `exclude-newer` window (it was <2 days old at time of writing).\n    \"pydantic-settings>=2.2.1,!=2.12.*,!=2.13.*,!=2.14.0,!=2.14.1,<3\",\n    \"typing_extensions>=4.12.2,<5.0.0\",\n    \"numpy>=1.26.4, <=4.0.0\",\n    \"sqlalchemy>=2.0.39,<3.0.0\",\n    \"aiosqlite>=0.20.0,<1.0.0\",\n    \"filelock>=3.12.0,<4.0.0\",  # cross-process migration lock for SQLite (Linux/macOS/Windows)\n    \"tiktoken>=0.8.0,<1.0.0\",\n    \"litellm>=1.83.7\",\n    \"instructor>=1.9.1,<1.15.3\",\n    \"filetype>=1.2.0,<2.0.0\",\n    \"aiohttp>=3.13.5,<4.0.0\",\n    \"aiofiles>=23.2.1\",\n    \"rdflib>=7.1.4,<7.2.0\",\n    \"pypdf>=6.6.2,<7.0.0\",\n    \"jinja2>=3.1.3,<4\",\n    \"lancedb>=0.24.3,<1.0.0\",  # 0.24.2 bundles lance 0.32.0 whose list-column decoder panics on tables with deletion vectors\n    \"nbformat>=5.7.0,<6.0.0\",\n    \"alembic>=1.13.3,<2\",\n    \"limits>=4.4.1,<5\",\n    \"fastapi>=0.116.2,<1.0.0\",\n    \"starlette>=0.48\",\n    \"python-multipart>=0.0.22,<1.0.0\",\n    \"fastapi-users[sqlalchemy]>=15.0.2\",\n    \"structlog>=25.2.0,<26\",\n    \"pympler>=1.1,<2.0.0\",\n    \"pylance>=0.22.0,<=0.36.0\",\n    \"ladybug>=0.16.0,<0.18\",\n    \"python-magic-bin<0.5 ; platform_system == 'Windows'\", # Only needed for Windows\n    \"networkx>=3.4.2,<4\",\n    \"uvicorn>=0.34.0,<1.0.0\",\n    \"gunicorn>=20.1.0,<24\",\n    \"websockets>=15.0.1,<16.0.0\",\n    \"tenacity>=9.0.0\",\n    \"fakeredis[lua]>=2.32.0\",\n    \"diskcache>=5.6.3\",\n    \"aiolimiter>=1.2.1\",\n    \"urllib3>=2.6.0\",\n    \"cbor2>=5.8.0\",\n    \"langdetect>=1.0.9\",\n    \"datamodel-code-generator>=0.54.0\",\n]\n\n[project.optional-dependencies]\napi=[]\n\ndistributed = [\n    \"modal>=1.0.5,<2.0.0\",\n]\n\nscraping = [\n    \"tavily-python>=0.7.12\",\n    \"beautifulsoup4>=4.13.1\",\n    \"playwright>=1.9.0\",\n    \"lxml>=4.9.3,<5 ; python_version < '3.13'\",\n    \"lxml>=5,<6 ; python_version >= '3.13' and python_version < '3.14'\",\n    # cp314 wheels start at lxml 6.0.1; without this 3.14 falls back to a\n    # source build that requires libxml2/libxslt headers (CI doesn't have them).\n    \"lxml>=6.0.1,<7 ; python_version >= '3.14'\",\n    \"protego>=0.1\",\n    \"APScheduler>=3.10.0,<=3.11.0\"\n]\n\nfastembed = [\n    \"fastembed<=0.8.0\",\n    # onnxruntime 1.23.2 only ships wheels through cp313; cp314 wheels start\n    # at onnxruntime 1.24.1. Split by python_version so existing 3.10–3.13\n    # users keep the pinned version and 3.14 picks up the first cp314-capable\n    # release.\n    \"onnxruntime<=1.23.2 ; python_version < '3.14'\",\n    \"onnxruntime>=1.24.1 ; python_version >= '3.14'\",\n]\n\nneo4j = [\"neo4j>=5.28.0,<6\"]\nneptune = [\"langchain_aws>=0.2.22\"]\npostgres = [\n    \"psycopg2>=2.9.10,<3\",\n    \"pgvector>=0.3.5,<0.4\",\n    \"asyncpg>=0.30.0,<1.0.0\",\n]\npostgres-binary = [\n    \"psycopg2-binary>=2.9.10,<3.0.0\",\n    \"pgvector>=0.3.5,<0.4\",\n    \"asyncpg>=0.30.0,<1.0.0\",\n]\nnotebook = [\"notebook>=7.1.0,<8\"]\nlangchain = [\n    \"langsmith>=0.2.3,<1.0.0\",\n    \"langchain_text_splitters>=0.3.2,<1.0.0\",\n    \"langchain-core>=1.2.5\"\n]\nllama-index = [\"llama-index-core>=0.14.20,<0.15\"]\nhuggingface = [\"transformers>=4.46.3,<5\"]\nollama = [\"transformers>=4.46.3,<5\"]\nmistral = [\"mistral-common>=1.5.2,<2\", \"mistralai>=1.9.10,<2\"]\nanthropic = [\"anthropic>=0.27\"]\nazure = [\"azure-identity>=1.15.0,<2\"]\ndeepeval = [\"deepeval>=3.0.1,<4\"]\nposthog = [\"posthog>=3.5.0,<4\"]\ngroq = [\"groq>=0.8.0,<1.0.0\"]\nllama-cpp = [\"llama-cpp-python[server]>=0.3.0,<1.0.0\"]\ndocs = [\n    \"lxml>=4.9.3,<5 ; python_version < '3.13'\",\n    \"lxml>=5,<6 ; python_version >= '3.13' and python_version < '3.14'\",\n    # cp314 wheels start at lxml 6.0.1.\n    \"lxml>=6.0.1,<7 ; python_version >= '3.14'\",\n    \"unstructured[csv, doc, docx, epub, md, odt, org, ppt, pptx, rst, rtf, tsv, xlsx, pdf]>=0.18.1,<19\",\n    \"nltk>=3.9.3,<4\", # TODO: Remove when unstructured is above v0.21.0 as it won't use nltk anymore\n]\ncodegraph = [\n    \"fastembed<=0.8.0 ; python_version < '3.14'\",\n    \"transformers>=4.46.3,<5\",\n    \"tree-sitter>=0.24.0,<0.25\",\n    \"tree-sitter-python>=0.23.6,<0.24\",\n]\nevals = [\n    \"plotly>=6.0.0,<7\",\n    \"gdown>=5.2.0,<6\",\n    \"pandas>=2.2.2,<3.0.0\",\n    \"matplotlib>=3.8.3,<4\",\n    \"scikit-learn>=1.6.1,<2\",\n    \"locust>",
    "strategic_keywords": [
      "agent",
      "agents",
      "memory",
      "rag",
      "skill",
      "llm",
      "vector",
      "embedding"
    ],
    "relationship_label": "Memory 组件",
    "data_confidence": "high",
    "evidence_sources": [
      "github_trending",
      "github_repo_api",
      "readme",
      "pyproject.toml"
    ],
    "score_breakdown": {
      "heat": 20,
      "relevance": 20,
      "novelty": 15,
      "productize": 14,
      "adoption": 10,
      "relation": 10,
      "risk_signal": 8,
      "total": 97
    },
    "strategic_score": 97
  },
  {
    "owner": "jamiepine",
    "name": "voicebox",
    "full_name": "jamiepine/voicebox",
    "url": "https://github.com/jamiepine/voicebox",
    "description": "The open-source AI voice studio. Clone, dictate, create.",
    "language": "TypeScript",
    "total_stars": 35393,
    "forks": 4252,
    "stars_this_period": 3965,
    "source_slice": "all",
    "source_slices": [
      "all",
      "typescript"
    ],
    "metadata": {
      "topics": [
        "ai",
        "cuda",
        "mlx",
        "qwen3-tts",
        "qwen3-tts-ui",
        "voice-ai",
        "voice-clone",
        "whisper"
      ],
      "license": "MIT",
      "open_issues": 491,
      "created_at": "2026-01-25T12:27:03Z",
      "pushed_at": "2026-06-28T00:34:55Z",
      "homepage": "https://voicebox.sh",
      "default_branch": "main",
      "forks": 4252,
      "watchers": 182,
      "archived": false,
      "size_kb": 106636
    },
    "readme_content": "<p align=\"center\">\n  <img src=\".github/assets/icon-dark.webp\" alt=\"Voicebox\" width=\"120\" height=\"120\" />\n</p>\n\n<h1 align=\"center\">Voicebox</h1>\n\n<p align=\"center\">\n  <strong>The open-source AI voice studio.</strong><br/>\n  Clone any voice. Generate speech. Dictate into any app. Talk to agents in voices you own.<br/>\n  The full voice I/O stack, running locally on your machine.\n</p>\n\n<p align=\"center\">\n  <a href=\"https://github.com/jamiepine/voicebox/releases\">\n    <img src=\"https://img.shields.io/github/downloads/jamiepine/voicebox/total?style=flat&color=blue\" alt=\"Downloads\" />\n  </a>\n  <a href=\"https://github.com/jamiepine/voicebox/releases/latest\">\n    <img src=\"https://img.shields.io/github/v/release/jamiepine/voicebox?style=flat\" alt=\"Release\" />\n  </a>\n  <a href=\"https://github.com/jamiepine/voicebox/stargazers\">\n    <img src=\"https://img.shields.io/github/stars/jamiepine/voicebox?style=flat\" alt=\"Stars\" />\n  </a>\n  <a href=\"https://github.com/jamiepine/voicebox/blob/main/LICENSE\">\n    <img src=\"https://img.shields.io/github/license/jamiepine/voicebox?style=flat\" alt=\"License\" />\n  </a>\n  <a href=\"https://deepwiki.com/jamiepine/voicebox\">\n    <img src=\"https://img.shields.io/static/v1?label=Ask&message=DeepWiki&color=5B6EF7\" alt=\"Ask DeepWiki\" />\n  </a>\n</p>\n\n<p align=\"center\">\n    <a href=\"https://trendshift.io/repositories/21213\" target=\"_blank\"><img src=\"https://trendshift.io/api/badge/repositories/21213\" alt=\"jamiepine%2Fvoicebox | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"/></a>\n</p>\n\n<p align=\"center\">\n  <a href=\"https://voicebox.sh\">voicebox.sh</a> •\n  <a href=\"https://docs.voicebox.sh\">Docs</a> •\n  <a href=\"#download\">Download</a> •\n  <a href=\"#features\">Features</a> •\n  <a href=\"#api\">API</a> •\n  <a href=\"docs/content/docs/overview/troubleshooting.mdx\">Troubleshooting</a>\n</p>\n\n<br/>\n\n<p align=\"center\">\n  <a href=\"https://voicebox.sh\">\n    <img src=\"landing/public/assets/app-screenshot-1.webp\" alt=\"Voicebox App Screenshot\" width=\"800\" />\n  </a>\n</p>\n\n<p align=\"center\">\n  <em>Click the image above to watch the demo video on <a href=\"https://voicebox.sh\">voicebox.sh</a></em>\n</p>\n\n<br/>\n\n<p align=\"center\">\n  <img src=\"landing/public/assets/app-screenshot-2.webp\" alt=\"Voicebox Screenshot 2\" width=\"800\" />\n</p>\n\n<p align=\"center\">\n  <img src=\"landing/public/assets/app-screenshot-3.webp\" alt=\"Voicebox Screenshot 3\" width=\"800\" />\n</p>\n\n<br/>\n\n## What is Voicebox?\n\nVoicebox is a **local-first AI voice studio** — a free and open-source alternative to **ElevenLabs** and **WisprFlow** in one app. Clone voices from a few seconds of audio, generate speech in 23 languages across 7 TTS engines, dictate into any text field with a global hotkey, and give any MCP-aware AI agent a voice of your choosing.\n\nThe two cloud incumbents sit on opposite halves of the voice I/O loop — ElevenLabs on output, WisprFlow on input. Voicebox does both, bridges them with a bundled local LLM for refinement and per-profile personas, and runs the whole thing on your machine.\n\n- **Complete privacy** — models, voice data, and captures never leave your machine\n- **7 TTS engines** — Qwen3-TTS, Qwen CustomVoice, LuxTTS, Chatterbox Multilingual, Chatterbox Turbo, HumeAI TADA, and Kokoro\n- **Voice cloning and preset voices** — zero-shot cloning from a reference sample, or 50+ curated preset voices via Kokoro and Qwen CustomVoice\n- **23 languages** — from English to Arabic, Japanese, Hindi, Swahili, and more\n- **Post-processing effects** — pitch shift, reverb, delay, chorus, compression, and filters\n- **Expressive speech** — paralinguistic tags like `[laugh]`, `[sigh]`, `[gasp]` via Chatterbox Turbo; natural-language delivery control via Qwen CustomVoice\n- **Unlimited length** — auto-chunking with crossfade for scripts, articles, and chapters\n- **Stories editor** — multi-track timeline for conversations, podcasts, and narratives\n- **Voice input** — global dictation hotkey with push-to-talk and toggle modes, accessibility-verified auto-paste on macOS, in-app mic on every text field, Whisper-based STT\n- **Agent voice output** — one tool call (`voicebox.speak`) and any MCP-aware agent (Claude Code, Cursor, Cline) speaks to you in a voice you've cloned\n- **Voice personalities** — attach a free-form persona to any voice profile, then Compose, Rewrite, or Respond via a bundled local LLM — agents can invoke the same modes over MCP\n- **API-first** — REST API plus a built-in MCP server for integrating voice I/O into your own apps and agents\n- **Native performance** — built with Tauri (Rust), not Electron\n- **Runs everywhere** — macOS (MLX/Metal), Windows (CUDA), Linux, AMD ROCm, Intel Arc, Docker\n\n---\n\n## Download\n\n| Platform              | Download                                               |\n| --------------------- | ------------------------------------------------------ |\n| macOS (Apple Silicon) | [Download DMG](https://voicebox.sh/download/mac-arm)   |\n| macOS (Intel)         | [Download DMG](https://voicebox.sh/download/mac-intel) |\n| Windows               | [Download MSI](https://voicebox.sh/download/windows)   |\n| Docker                | `docker compose up`                                    |\n\n> **[View all binaries →](https://github.com/jamiepine/voicebox/releases/latest)**\n\n> **Linux** — Pre-built binaries are not yet available. See [voicebox.sh/linux-install](https://voicebox.sh/linux-install) for build-from-source instructions.\n\n> **Having trouble?** See the [Troubleshooting Guide](docs/content/docs/overview/troubleshooting.mdx) for common install, generation, model-download, and GPU issues.\n\n---\n\n## Features\n\n### Multi-Engine Voice Cloning\n\nSeven TTS engines with different strengths, switchable per-generation:\n\n| Engine                      | Languages | Strengths                                                                                                                                |\n| --------------------------- | --------- | ---------------------------------------------------------------------------------------------------------------------------------------- |\n| **Qwen3-TTS** (0.6B / 1.7B) | 10        | High-quality multilingual cloning, delivery instructions (\"speak slowly\", \"whisper\")                                                     |\n| **Qwen CustomVoice**        | 10        | 9 curated preset voices with natural-language delivery control — no reference audio required                                             |\n| **LuxTTS**                  | English   | Lightweight (~1GB VRAM), 48kHz output, 150x realtime on CPU                                                                              |\n| **Chatterbox Multilingual** | 23        | Broadest language coverage — Arabic, Danish, Finnish, Greek, Hebrew, Hindi, Malay, Norwegian, Polish, Swahili, Swedish, Turkish and more |\n| **Chatterbox Turbo**        | English   | Fast 350M model with paralinguistic emotion/sound tags                                                                                   |\n| **TADA** (1B / 3B)          | 10        | HumeAI speech-language model — 700s+ coherent audio, text-acoustic dual alignment                                                        |\n| **Kokoro**                  | 8         | 50 curated preset voices, tiny 82M model, fast CPU inference                                                                             |\n\n### Emotions & Paralinguistic Tags\n\nOnly **Chatterbox Turbo** interprets paralinguistic tags like `[laugh]` and\n`[sigh]`. Qwen3-TTS, LuxTTS, Chatterbox Multilingual, and HumeAI TADA read them\nliterally as text.\n\nWith **Chatterbox Turbo** selected, type `/` in the text input to open the tag\ninserter and add expressive tags inline with speech:\n\n`[laugh]` `[chuckle]` `[gasp]` `[cough]` `[sigh]` `[groan]` `[sniff]` `[shush]` `[clear throat]`\n\n### Post-Processing Effects\n\n8 audio effects powered by Spotify's `pedalboard` library. Apply after generation, preview in real time, build reusable presets.\n\n| Effect           | Description                                   |\n| ---------------- | --------------------------------------------- |\n| Pitch Shift      | Up or down by up to 12 semitones              |\n| Reverb           | Configurable room size, damping, wet/dry mix  |\n| Delay            | Echo with adjustable time, feedback, and mix  |\n| Chorus / Flanger | Modulated delay for metallic or lush textures |\n| Compressor       | Dynamic range compression                     |\n| Gain             | Volume adjustment (-40 to +40 dB)             |\n| High-Pass Filter | Remove low frequencies                        |\n| Low-Pass Filter  | Remove high frequencies                       |\n\nShips with 4 built-in presets (Robotic, Radio, Echo Chamber, Deep Voice) and supports custom presets. Effects can be assigned per-profile as defaults.\n\n### Unlimited Generation Length\n\nText is automatically split at sentence boundaries and each chunk is generated independently, then crossfaded together. Works with all engines.\n\n- Configurable auto-chunking limit (100–5,000 chars)\n- Crossfade slider (0–200ms) for smooth transitions\n- Max text length: 50,000 characters\n- Smart splitting respects abbreviations, CJK punctuation, and `[tags]`\n\n### Generation Versions\n\nEvery generation supports multiple versions with provenance tracking:\n\n- **Original** — clean TTS output, always preserved\n- **Effects versions** — apply different effects chains from any source version\n- **Takes** — regenerate with a new seed for variation\n- **Source tracking** — each version records its lineage\n- **Favorites** — star generations for quick access\n\n### Async Generation Queue\n\nGeneration is non-blocking. Submit and immediately start typing the next one.\n\n- Serial execution queue prevents GPU contention\n- Real-time SSE status streaming\n- Failed generations can be retried\n- Stale generations from crashes auto-recover on startup\n\n### Voice Profile Management\n\n- Create profiles from audio files or record directly in-app\n- Import/export profiles to share or back up\n- Multi-sample support for higher quality cloning\n- Per-profile default effects chains\n- Organize with descriptions and language tags\n\n### Stories Editor\n\nMulti-voice timeline editor for conversations, podcasts, and narratives.\n\n- Multi-track composition with drag-and-drop\n- Inline audio trimming and splitting\n- Auto-playback with synchronized playhead\n- Version pinning per track clip\n\n### Global Dictation & Voice Input\n\nThe other half of the voice I/O loop. Hold a hotkey anywhere on your system, speak, release — on macOS the transcript pastes straight into the focused text field. Or hit the mic on any Voicebox text input and dictate directly into the app.\n\n- **Configurable chord bindings** — hold-to-speak and tap-to-toggle chords, each rebindable in the in-app chord picker. Holding push-to-talk and tapping `Space` mid-hold upgrades into a toggle session without a gap in audio\n- **Target-aware paste (macOS)** — accessibility-verified injection into the focused text field, with atomic clipboard save/restore so your clipboard isn't clobbered\n- **First-run permissions UX** — in-app gates walk you through the macOS Accessibility and Input Monitoring grants with deep-links to System Settings\n- **In-app mic button** on every Voicebox text field — generation form, profile descriptions, story titles, anywhere you'd type\n- **LLM refinement** — optional cleanup of ums, stutters, and false starts before paste\n- **On-screen pill** — floating overlay surfacing `recording`, `transcribing`, `refining`, and `speaking` states. Same pill agents use when they speak to you, so there's one mental model for both directions of the loop\n\n### Speech-to-Text\n\nVoicebox runs OpenAI Whisper for transcription — the same model that backs dictation, the Captures tab, and the `/transcribe` API. Running on MLX (Apple Silicon) or PyTorch (CUDA / ROCm / DirectML / CPU) depending on your platform.\n\n| Size                          | Notes                                            ",
    "manifest_file": "package.json",
    "manifest_content": "{\n  \"name\": \"voicebox\",\n  \"version\": \"0.5.0\",\n  \"private\": true,\n  \"workspaces\": [\n    \"app\",\n    \"tauri\",\n    \"web\",\n    \"landing\"\n  ],\n  \"scripts\": {\n    \"dev\": \"bun run setup:dev && cd tauri && bun run tauri dev\",\n    \"dev:web\": \"cd web && bun run dev\",\n    \"dev:landing\": \"cd landing && bun run dev\",\n    \"dev:server\": \"uvicorn backend.main:app --reload --port 17493\",\n    \"setup:dev\": \"bun run scripts/setup-dev-sidecar.js\",\n    \"build\": \"./scripts/build-server.sh && cd tauri && bun run tauri build\",\n    \"build:web\": \"cd web && bun run build\",\n    \"build:landing\": \"cd landing && bun run build\",\n    \"build:release\": \"./scripts/prepare-release.sh\",\n    \"generate:api\": \"./scripts/generate-api.sh\",\n    \"generate:keys\": \"cd tauri && bun tauri signer generate -w ~/.tauri/voicebox.key\",\n    \"build:server\": \"./scripts/build-server.sh\",\n    \"update:icons\": \"./scripts/update-icons.sh\",\n    \"convert:assets\": \"./scripts/convert-assets.sh\",\n    \"lint\": \"biome lint .\",\n    \"typecheck\": \"bunx tsc -p app/tsconfig.json --noEmit && cd web && bunx tsc --noEmit\",\n    \"lint:fix\": \"biome lint --write .\",\n    \"format\": \"biome format --write .\",\n    \"format:check\": \"biome format .\",\n    \"check\": \"biome check .\",\n    \"check:fix\": \"biome check --write .\",\n    \"ci\": \"bun run typecheck && bun run build:web\"\n  },\n  \"devDependencies\": {\n    \"@biomejs/biome\": \"2.3.12\",\n    \"@types/node\": \"^20.0.0\",\n    \"tailwindcss\": \"^4.1.18\",\n    \"typescript\": \"^5.6.0\"\n  },\n  \"engines\": {\n    \"bun\": \">=1.0.0\"\n  },\n  \"packageManager\": \"bun@1.3.8\",\n  \"dependencies\": {\n    \"loaders.css\": \"^0.1.2\",\n    \"react-loaders\": \"^3.0.1\"\n  }\n}\n",
    "strategic_keywords": [
      "agent",
      "agents",
      "mcp",
      "workspace",
      "llm"
    ],
    "relationship_label": "Workspace 组件",
    "data_confidence": "high",
    "evidence_sources": [
      "github_trending",
      "github_repo_api",
      "readme",
      "package.json"
    ],
    "score_breakdown": {
      "heat": 20,
      "relevance": 20,
      "novelty": 15,
      "productize": 14,
      "adoption": 10,
      "relation": 10,
      "risk_signal": 8,
      "total": 97
    },
    "strategic_score": 97
  },
  {
    "owner": "googleworkspace",
    "name": "cli",
    "full_name": "googleworkspace/cli",
    "url": "https://github.com/googleworkspace/cli",
    "description": "Google Workspace CLI — one command-line tool for Drive, Gmail, Calendar, Sheets, Docs, Chat, Admin, and more. Dynamically built from Google Discovery Service. Includes AI agent skills.",
    "language": "Rust",
    "total_stars": 29072,
    "forks": 1648,
    "stars_this_period": 1848,
    "source_slice": "rust",
    "source_slices": [
      "rust"
    ],
    "metadata": {
      "topics": [
        "agent-skills",
        "ai-agent",
        "automation",
        "cli",
        "discovery-api",
        "gemini-cli-extension",
        "google-admin",
        "google-api",
        "google-calendar",
        "google-chat",
        "google-docs",
        "google-drive",
        "google-sheets",
        "google-workspace",
        "oauth2",
        "rust"
      ],
      "license": "Apache-2.0",
      "open_issues": 124,
      "created_at": "2026-03-02T19:46:06Z",
      "pushed_at": "2026-06-28T22:24:16Z",
      "homepage": "https://developers.google.com/workspace",
      "default_branch": "main",
      "forks": 1648,
      "watchers": 94,
      "archived": false,
      "size_kb": 10749
    },
    "readme_content": "<h1 align=\"center\">gws</h1>\n\n**One CLI for all of Google Workspace — built for humans and AI agents.**<br>\nDrive, Gmail, Calendar, and every Workspace API. Zero boilerplate. Structured JSON output. 40+ agent skills included.\n\n> [!NOTE]\n> This is **not** an officially supported Google product.\n\n<p>\n  <a href=\"https://www.npmjs.com/package/@googleworkspace/cli\"><img src=\"https://img.shields.io/npm/v/@googleworkspace/cli\" alt=\"npm version\"></a>\n  <a href=\"https://github.com/googleworkspace/cli/blob/main/LICENSE\"><img src=\"https://img.shields.io/github/license/googleworkspace/cli\" alt=\"license\"></a>\n  <a href=\"https://github.com/googleworkspace/cli/actions/workflows/ci.yml\"><img src=\"https://img.shields.io/github/actions/workflow/status/googleworkspace/cli/ci.yml?branch=main&label=CI\" alt=\"CI status\"></a>\n  <a href=\"https://www.npmjs.com/package/@googleworkspace/cli\"><img src=\"https://img.shields.io/npm/unpacked-size/@googleworkspace/cli\" alt=\"install size\"></a>\n</p>\n<br>\n\n⬇️ **[Download the latest release for your OS](https://github.com/googleworkspace/cli/releases)**\n\n`gws` doesn't ship a static list of commands. It reads Google's own [Discovery Service](https://developers.google.com/discovery) at runtime and builds its entire command surface dynamically. When Google Workspace adds an API endpoint or method, `gws` picks it up automatically.\n\n> [!IMPORTANT]\n> This project is under active development. Expect breaking changes as we march toward v1.0.\n\n## Contents\n\n- [Prerequisites](#prerequisites)\n- [Installation](#installation)\n- [Quick Start](#quick-start)\n- [Why gws?](#why-gws)\n- [Authentication](#authentication)\n- [AI Agent Skills](#ai-agent-skills)\n- [Advanced Usage](#advanced-usage)\n- [Environment Variables](#environment-variables)\n- [Exit Codes](#exit-codes)\n- [Architecture](#architecture)\n- [Troubleshooting](#troubleshooting)\n- [Development](#development)\n\n## Prerequisites\n\n- **Node.js 18+** — for `npm install` (or download a pre-built binary from [GitHub Releases](https://github.com/googleworkspace/cli/releases))\n- **A Google Cloud project** — required for OAuth credentials. You can create one via the [Google Cloud Console](https://console.cloud.google.com/) or with the [`gcloud` CLI](https://cloud.google.com/sdk/docs/install) or with the `gws auth setup` command.\n- **A Google account** with access to Google Workspace\n\n## Installation\n\nThe recommended way to install `gws` is to download the pre-built binary for your OS and architecture from the **[GitHub Releases](https://github.com/googleworkspace/cli/releases)** page. Extract the archive and place the `gws` binary in your `$PATH`.\n\nFor convenience, you can also use `npm` to automate downloading the appropriate binary from GitHub Releases:\n\n```bash\nnpm install -g @googleworkspace/cli\n```\n\nOr build from source:\n\n```bash\ncargo install --git https://github.com/googleworkspace/cli --locked\n```\n\nA Nix flake is also available at `github:googleworkspace/cli`\n\n```bash\nnix run github:googleworkspace/cli\n```\n\nOn macOS and Linux, you can also install via [Homebrew](https://brew.sh/):\n\n```bash\nbrew install googleworkspace-cli\n```\n\n## Quick Start\n\n```bash\ngws auth setup     # walks you through Google Cloud project config\ngws auth login     # subsequent OAuth login\ngws drive files list --params '{\"pageSize\": 5}'\n```\n\n## Why gws?\n\n**For humans** — stop writing `curl` calls against REST docs. `gws` gives you `--help` on every resource, `--dry-run` to preview requests, and auto‑pagination.\n\n**For AI agents** — every response is structured JSON. Pair it with the included agent skills and your LLM can manage Workspace without custom tooling.\n\n```bash\n# List the 10 most recent files\ngws drive files list --params '{\"pageSize\": 10}'\n\n# Create a spreadsheet\ngws sheets spreadsheets create --json '{\"properties\": {\"title\": \"Q1 Budget\"}}'\n\n# Send a Chat message\ngws chat spaces messages create \\\n  --params '{\"parent\": \"spaces/xyz\"}' \\\n  --json '{\"text\": \"Deploy complete.\"}' \\\n  --dry-run\n\n# Introspect any method's request/response schema\ngws schema drive.files.list\n\n# Stream paginated results as NDJSON\ngws drive files list --params '{\"pageSize\": 100}' --page-all | jq -r '.files[].name'\n```\n\n## Authentication\n\nThe CLI supports multiple auth workflows so it works on your laptop, in CI, and on a server.\n\n### Which setup should I use?\n\n| I have… | Use |\n|---|---|\n| `gcloud` installed and authenticated | [`gws auth setup`](#interactive-local-desktop) (fastest) |\n| A GCP project but no `gcloud` | [Manual OAuth setup](#manual-oauth-setup-google-cloud-console) |\n| An existing OAuth access token | [`GOOGLE_WORKSPACE_CLI_TOKEN`](#pre-obtained-access-token) |\n| Existing Credentials | [`GOOGLE_WORKSPACE_CLI_CREDENTIALS_FILE`](#service-account-server-to-server) |\n\n### Interactive (local desktop)\n\nCredentials are encrypted at rest (AES-256-GCM) with the key stored in your OS keyring (or `~/.config/gws/.encryption_key` when `GOOGLE_WORKSPACE_CLI_KEYRING_BACKEND=file`).\n\n```bash\ngws auth setup       # one-time: creates a Cloud project, enables APIs, logs you in\ngws auth login       # subsequent scope selection and login\n```\n\n> `gws auth setup` requires the [`gcloud` CLI](https://cloud.google.com/sdk/docs/install). If you don't have `gcloud`, use the [manual setup](#manual-oauth-setup-google-cloud-console) below instead.\n\n> [!WARNING]\n> **Scope limits in testing mode:** If your OAuth app is unverified (testing mode),\n> Google limits consent to ~25 scopes. The `recommended` scope preset includes 85+\n> scopes and **will fail** for unverified apps (especially for `@gmail.com` accounts).\n> Choose individual services instead to filter the scope picker:\n> ```bash\n> gws auth login -s drive,gmail,sheets\n> ```\n\n\n### Manual OAuth setup (Google Cloud Console)\n\nUse this when `gws auth setup` cannot automate project/client creation, or when you want explicit control.\n\n1. Open Google Cloud Console in the target project:\n   - OAuth consent screen: `https://console.cloud.google.com/apis/credentials/consent?project=<PROJECT_ID>`\n   - Credentials: `https://console.cloud.google.com/apis/credentials?project=<PROJECT_ID>`\n2. Configure OAuth branding/audience if prompted:\n   - App type: **External** (testing mode is fine)\n3. Add your account under **Test users**\n4. Create an OAuth client:\n   - Type: **Desktop app**\n5. Download the client JSON and save it to:\n   - `~/.config/gws/client_secret.json`\n\n> [!IMPORTANT]\n> **You must add yourself as a test user.** In the OAuth consent screen, click\n> **Test users → Add users** and enter your Google account email. Without this,\n> login will fail with a generic \"Access blocked\" error.\n\nThen run:\n\n```bash\ngws auth login\n```\n\n### Browser-assisted auth (human or agent)\n\nYou can complete OAuth either manually or with browser automation.\n\n- **Human flow**: run `gws auth login`, open the printed URL, approve scopes.\n- **Agent-assisted flow**: the agent opens the URL, selects account, handles consent prompts, and returns control once the localhost callback succeeds.\n\nIf consent shows **\"Google hasn't verified this app\"** (testing mode), click **Continue**.\nIf scope checkboxes appear, select required scopes (or **Select all**) before continuing.\n\n### Headless / CI (export flow)\n\n1. Complete interactive auth on a machine with a browser.\n2. Export credentials:\n   ```bash\n   gws auth export --unmasked > credentials.json\n   ```\n3. On the headless machine:\n   ```bash\n   export GOOGLE_WORKSPACE_CLI_CREDENTIALS_FILE=/path/to/credentials.json\n   gws drive files list   # just works\n   ```\n\n### Service Account (server-to-server)\n\nPoint to your key file; no login needed.\n\n```bash\nexport GOOGLE_WORKSPACE_CLI_CREDENTIALS_FILE=/path/to/service-account.json\ngws drive files list\n```\n\n### Pre-obtained Access Token\n\nUseful when another tool (e.g. `gcloud`) already mints tokens for your environment.\n\n```bash\nexport GOOGLE_WORKSPACE_CLI_TOKEN=$(gcloud auth print-access-token)\n```\n\n### Precedence\n\n| Priority | Source                 | Set via                                 |\n| -------- | ---------------------- | --------------------------------------- |\n| 1        | Access token           | `GOOGLE_WORKSPACE_CLI_TOKEN`            |\n| 2        | Credentials file       | `GOOGLE_WORKSPACE_CLI_CREDENTIALS_FILE` |\n| 3        | Encrypted credentials  | `gws auth login`                        |\n| 4        | Plaintext credentials  | `~/.config/gws/credentials.json`        |\n\nEnvironment variables can also live in a `.env` file.\n\n## AI Agent Skills\n\nThe repo ships 100+ Agent Skills (`SKILL.md` files) — one for every supported API, plus higher-level helpers for common workflows and 50 curated recipes for Gmail, Drive, Docs, Calendar, and Sheets. See the full [Skills Index](docs/skills.md) for the complete list.\n\n```bash\n# Install all skills at once\nnpx skills add https://github.com/googleworkspace/cli\n\n# Or pick only what you need\nnpx skills add https://github.com/googleworkspace/cli/tree/main/skills/gws-drive\nnpx skills add https://github.com/googleworkspace/cli/tree/main/skills/gws-gmail\n```\n\n<details>\n<summary>OpenClaw setup</summary>\n\n```bash\n# Symlink all skills (stays in sync with repo)\nln -s $(pwd)/skills/gws-* ~/.openclaw/skills/\n\n# Or copy specific skills\ncp -r skills/gws-drive skills/gws-gmail ~/.openclaw/skills/\n```\n\nThe `gws-shared` skill includes an `install` block so OpenClaw auto-installs the CLI via `npm` if `gws` isn't on PATH.\n\n</details>\n\n## Gemini CLI Extension\n\n1. Authenticate the CLI first:\n\n   ```bash\n   gws auth setup\n   ```\n\n2. Install the extension into the Gemini CLI:\n   ```bash\n   gemini extensions install https://github.com/googleworkspace/cli\n   ```\n\nInstalling this extension gives your Gemini CLI agent direct access to all `gws` commands and Google Workspace agent skills. Because `gws` handles its own authentication securely, you simply need to authenticate your terminal once prior to using the agent, and the extension will automatically inherit your credentials.\n\n## Advanced Usage\n\n### Multipart Uploads\n\n```bash\ngws drive files create --json '{\"name\": \"report.pdf\"}' --upload ./report.pdf\n```\n\n### Pagination\n\n| Flag                | Description                                    | Default |\n| ------------------- | ---------------------------------------------- | ------- |\n| `--page-all`        | Auto-paginate, one JSON line per page (NDJSON) | off     |\n| `--page-limit <N>`  | Max pages to fetch                             | 10      |\n| `--page-delay <MS>` | Delay between pages                            | 100 ms  |\n\n### Google Sheets — Shell Escaping\n\nSheets ranges use `!` which bash interprets as history expansion. Always wrap values in **single quotes**:\n\n```bash\n# Read cells A1:C10 from \"Sheet1\"\ngws sheets spreadsheets values get \\\n  --params '{\"spreadsheetId\": \"SPREADSHEET_ID\", \"range\": \"Sheet1!A1:C10\"}'\n\n# Append rows\ngws sheets spreadsheets values append \\\n  --params '{\"spreadsheetId\": \"ID\", \"range\": \"Sheet1!A1\", \"valueInputOption\": \"USER_ENTERED\"}' \\\n  --json '{\"values\": [[\"Name\", \"Score\"], [\"Alice\", 95]]}'\n```\n\n### Helper Commands\n\nSome services ship hand-crafted helper commands alongside the auto-generated Discovery surface. Helper commands are prefixed with `+` so they are visually distinct and never collide with Discovery-generated method names.\n\nTime-aware helpers (`+agenda`, `+standup-report`, `+weekly-digest`, `+meeting-prep`) automatically use your **Google account timezone** (fetched from Calendar Settings API and cached for 24 hours). Override with `--timezone`/`--tz` on `+agenda`, or set the `--timezone` flag for explicit control.\n\nRun `gws <service> --help` to see both Discovery methods and helper commands together.\n\n```bash\ngws gmail --help      # shows +send, +reply, +reply-all, +forward, +triage, +watch …\ngws calendar --help   # shows +insert, +agenda …\ngws drive --help      # shows +upload …\n```\n\n**Full helper reference:**\n\n| Service | Command | Description |\n|---------|---------|-------------|\n| `gmail` | `+send` | Send a",
    "manifest_file": "Cargo.toml",
    "manifest_content": "# Copyright 2026 Google LLC\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n#     http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\n[workspace]\nmembers = [\"crates/google-workspace-cli\", \"crates/google-workspace\"]\nresolver = \"2\"\n\n# The profile that 'cargo dist' will build with\n[profile.dist]\ninherits = \"release\"\nlto = \"thin\"\n",
    "strategic_keywords": [
      "agent",
      "agents",
      "runtime",
      "skill",
      "workspace",
      "llm",
      "workflow",
      "automation"
    ],
    "relationship_label": "Runtime 参考",
    "data_confidence": "high",
    "evidence_sources": [
      "github_trending",
      "github_repo_api",
      "readme",
      "Cargo.toml"
    ],
    "score_breakdown": {
      "heat": 20,
      "relevance": 20,
      "novelty": 15,
      "productize": 14,
      "adoption": 10,
      "relation": 10,
      "risk_signal": 8,
      "total": 97
    },
    "strategic_score": 97
  },
  {
    "owner": "Tencent",
    "name": "WeKnora",
    "full_name": "Tencent/WeKnora",
    "url": "https://github.com/Tencent/WeKnora",
    "description": "Open-source LLM knowledge platform: turn raw documents into a queryable RAG, an autonomous reasoning agent, and a self-maintaining Wiki.",
    "language": "Go",
    "total_stars": 17450,
    "forks": 2294,
    "stars_this_period": 934,
    "source_slice": "go",
    "source_slices": [
      "go"
    ],
    "metadata": {
      "topics": [
        "agent",
        "agentic",
        "ai",
        "chatbot",
        "embeddings",
        "evaluation",
        "generative-ai",
        "golang",
        "knowledge-base",
        "llm",
        "multi-tenant",
        "multimodel",
        "ollama",
        "openai",
        "question-answering",
        "rag",
        "reranking",
        "semantic-search",
        "vector-search",
        "wiki"
      ],
      "license": "NOASSERTION",
      "open_issues": 284,
      "created_at": "2025-07-22T08:01:23Z",
      "pushed_at": "2026-06-27T16:32:02Z",
      "homepage": "https://weknora.weixin.qq.com",
      "default_branch": "main",
      "forks": 2294,
      "watchers": 84,
      "archived": false,
      "size_kb": 78852
    },
    "readme_content": "<p align=\"center\">\n  <picture>\n    <img src=\"./docs/images/logo.png\" alt=\"WeKnora Logo\" height=\"120\"/>\n  </picture>\n</p>\n\n<p align=\"center\">\n  <picture>\n    <a href=\"https://trendshift.io/repositories/15289\" target=\"_blank\">\n      <img src=\"https://trendshift.io/api/badge/repositories/15289\" alt=\"Tencent%2FWeKnora | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"/>\n    </a>\n  </picture>\n</p>\n<p align=\"center\">\n    <a href=\"https://weknora.weixin.qq.com\" target=\"_blank\">\n        <img alt=\"Official Website\" src=\"https://img.shields.io/badge/Official Website-WeKnora-4e6b99\">\n    </a>\n    <a href=\"https://chatbot.weixin.qq.com\" target=\"_blank\">\n        <img alt=\"WeChat Dialog Open Platform\" src=\"https://img.shields.io/badge/WeChat Dialog Open Platform-5ac725\">\n    </a>\n    <a href=\"https://chromewebstore.google.com/detail/jpemjbopikggjlmikmclgbmkhhopjdgd\" target=\"_blank\">\n        <img alt=\"Chrome Extension\" src=\"https://img.shields.io/badge/Chrome Extension-WeKnora-4285F4\">\n    </a>\n    <a href=\"https://clawhub.ai/lyingbug/weknora\" target=\"_blank\">\n        <img alt=\"ClawHub Skill\" src=\"https://img.shields.io/badge/ClawHub Skill-WeKnora-ff6b35\">\n    </a>\n    <a href=\"https://github.com/Tencent/WeKnora/blob/main/LICENSE\">\n        <img src=\"https://img.shields.io/badge/License-MIT-ffffff?labelColor=d4eaf7&color=2e6cc4\" alt=\"License\">\n    </a>\n    <a href=\"./CHANGELOG.md\">\n        <img alt=\"Version\" src=\"https://img.shields.io/badge/version-0.6.3-2e6cc4?labelColor=d4eaf7\">\n    </a>\n</p>\n\n<p align=\"center\">\n| <b>English</b> | <a href=\"./README_CN.md\"><b>简体中文</b></a> | <a href=\"./README_JA.md\"><b>日本語</b></a> | <a href=\"./README_KO.md\"><b>한국어</b></a> |\n</p>\n\n<p align=\"center\">\n  <h4 align=\"center\">\n\n  [Overview](#-overview) • [Architecture](#-architecture) • [Key Features](#-key-features) • [Getting Started](#-getting-started) • [API Reference](#-api-reference) • [Developer Guide](#-developer-guide)\n  \n  </h4>\n</p>\n\n# 💡 WeKnora — Turn Documents into Living Knowledge with RAG, Agents and Auto-Wiki\n\n## 📌 Overview\n\n[**WeKnora**](https://weknora.weixin.qq.com) is an open-source, LLM-powered knowledge framework built for enterprise-grade document understanding, semantic retrieval, and autonomous reasoning.\n\nIt is organized around three core capabilities: **RAG-based Quick Q&A** for everyday lookups, a **ReAct Agent** that autonomously orchestrates retrieval, MCP tools and web search to handle complex multi-step tasks, and a brand-new **Wiki Mode** in which agents distill raw documents into a self-maintaining, interlinked markdown knowledge base with an interactive knowledge graph. Combined with multi-source ingestion (Feishu / Notion / Yuque / RSS, and growing), **website embed widgets** for publishing agents to external sites, 20+ LLM provider integrations, full Langfuse observability, **enterprise-ready multi-tenant RBAC** (4-tier role matrix + per-resource ownership + per-tenant audit log), and a fully self-hostable modular architecture, WeKnora turns scattered documents into a queryable, reasoning-capable, continuously evolving knowledge asset.\n\nThe framework supports auto-syncing knowledge from Feishu, Notion, and Yuque (more data sources coming soon), handles 10+ document formats including PDF, Word, images, and Excel, and can serve Q&A directly through IM channels like WeCom, Feishu, Slack, and Telegram. It is compatible with major LLM providers including OpenAI, DeepSeek, Qwen (Alibaba Cloud), Zhipu, Hunyuan, Gemini, MiniMax, NVIDIA, and Ollama. Its fully modular design allows swapping LLMs, vector databases, and storage backends, with support for local and private cloud deployment ensuring complete data sovereignty. WeKnora also integrates with **Langfuse** for comprehensive observability into agent reasoning, token usage, and pipeline tracing.\n\n\n## ✨ Latest Updates\n\n- **v0.6.3** — Website embed widget & Integrations Center (secure-mode token exchange + rate limits); chat experience overhaul (citation popovers, RAG pipeline progress, streaming markdown); document multi-tag & batch reparse; Wiki folders & hierarchy navigation; RSS data source; MCP OAuth2; EPUB / MHTML parsing; agent model-readiness checks; model test debugger; session source filter; workspace deletion UI. See [`CHANGELOG.md`](./CHANGELOG.md).\n- **v0.6.2** — Per-upload process configuration with upload-confirm dialog; document reparse with `process_config`; `weknora` CLI v0.9 (bundled Agent Skills, `session stop`, auth/profile harmonization); KB marquee multi-select; HNSW index for 1024-dim pgvector embeddings; chat resources store refactor; Langfuse-only tracing (Jaeger removed). See [`CHANGELOG.md`](./CHANGELOG.md).\n- **v0.6.1** — Document parsing trace timeline (Langfuse-style span tree with stage-by-stage progress + stop-parse); OpenSearch vector store driver; declarative built-in models via YAML; system admin & consolidated platform settings + audit log; new-user onboarding guide; settings UI redesign; `weknora` CLI v0.7 / v0.8 (agent-first wire contract, NDJSON, `--dry-run`); OpenDataLoader + PaddleOCR-VL parsers; MCP server multi-transport (stdio / SSE / HTTP); per-model thinking-mode config; Tencent LKEAP rerank + native Gemini embeddings + MiniMax-M3. See [`CHANGELOG.md`](./CHANGELOG.md).\n- **v0.6.0** — Tenant RBAC (4-tier role matrix `Owner` / `Admin` / `Contributor` / `Viewer` + per-KB ownership + per-tenant audit log), tenant member management & multi-workspace UX, self-service workspaces; `weknora` CLI v0.4 GA with `mcp serve`; KB retrieval fan-out across vector stores; AES-256-GCM credential encryption + docreader gRPC TLS + Token; Zhipu embedder + Huawei OBS; server-side user preferences; Go 1.26.0. See [`docs/RBAC说明.md`](./docs/RBAC说明.md) and [`CHANGELOG.md`](./CHANGELOG.md).\n- **v0.5.2** — Wiki ingest scales to 40k-document KBs (task queue + DLQ); MCP human-in-the-loop tool approval; Anthropic / Apache Doris / Tencent VectorDB / KS3 / SearXNG backends; adaptive 3-tier chunking with live preview; global ⌘K command palette; Yuque connector + WeChat Mini Program; `weknora` CLI preview.\n- **v0.5.1** — Knowledge-base batch management; tenant-wide IM channels overview; session search + user-scoped pinning; unified Model / Web Search / MCP settings cards; per-agent LLM timeout; desktop tenant switching.\n- **v0.5.0** — Wiki Mode GA — agents auto-generate structured, interlinked Markdown wiki pages with a knowledge graph; wiki browser + visual graph in the UI.\n- **v0.4.0** — WeKnora Cloud (hosted LLM + parsing); Chrome Extension; ClawHub Skill; WeChat IM; attachment processing; Azure OpenAI / Alibaba OSS; Notion connector; Baidu + Ollama web search; VectorStore management.\n- **v0.3.6** — ASR (audio); Feishu data-source auto-sync; OIDC; IM quote-reply context + thread-based sessions; document summarization; Tavily search; parallel tool calling; agent @mention scope restriction.\n- **v0.3.5** — Telegram / DingTalk / Mattermost IM; IM slash commands + QA queue; suggested questions; VLM auto-describe MCP tool images; Novita AI; channel tracking.\n- **v0.3.4** — WeCom / Feishu / Slack IM; multimodal image support; NVIDIA model API; Weaviate; AWS S3; AES-256-GCM API-key encryption; built-in MCP service; hybrid-search optimization; `final_answer` tool.\n- **v0.3.3** — Parent-child chunking; KB pinning; fallback response; passage cleaning for rerank; storage auto-creation; Milvus.\n- **v0.3.2** — Knowledge Search entry; per-source parser & storage engine config; image rendering in local storage; document preview; Volcengine TOS; Mermaid rendering; batch session management; memory graph preview.\n- **v0.3.0** — Shared Space; Agent Skills + sandboxed execution; custom agents; Data Analyst agent; thinking mode; Bing / Google web search; API Key auth; Helm chart; Korean i18n; Qdrant.\n- **v0.2.0** — Agent Mode (ReACT); multi-type knowledge bases (FAQ + document); conversation strategy config; DuckDuckGo web search; MCP tool integration; new UI with agent mode switching; MQ async task management.\n\n\n## 📱 Interface Showcase\n\n<table>\n  <tr>\n    <td colspan=\"2\" align=\"center\"><b>💬 Intelligent Q&A Conversation</b><br/><img src=\"./docs/images/qa.png\" alt=\"Intelligent Q&A Conversation\" width=\"100%\"></td>\n  </tr>\n  <tr>\n    <td width=\"50%\" align=\"center\"><b>📖 Wiki Browser</b><br/><img src=\"./docs/images/wiki-browser.png\" alt=\"Wiki Browser\" width=\"100%\"></td>\n    <td width=\"50%\" align=\"center\"><b>🕸️ Wiki Knowledge Graph</b><br/><img src=\"./docs/images/wiki-graph.png\" alt=\"Wiki Knowledge Graph\" width=\"100%\"></td>\n  </tr>\n  <tr>\n    <td width=\"50%\" align=\"center\"><b>🤖 Agent Mode · Tool Call Process</b><br/><img src=\"./docs/images/agent-qa.png\" alt=\"Agent Mode Tool Call Process\" width=\"100%\"></td>\n    <td width=\"50%\" align=\"center\"><b>⚙️ Conversation Settings</b><br/><img src=\"./docs/images/settings.png\" alt=\"Conversation Settings\" width=\"100%\"></td>\n  </tr>\n  <tr>\n    <td colspan=\"2\" align=\"center\"><b>🔭 Observability · Langfuse Tracing</b><br/><img src=\"./docs/images/langfuse.png\" alt=\"Observability Langfuse Tracing\" width=\"100%\"></td>\n  </tr>\n</table>\n\n## 🏗️ Architecture\n\n![weknora-architecture.png](./docs/images/architecture.png)\n\nFully modular pipeline from document parsing, vectorization, and retrieval to LLM inference — every component is swappable and extensible. Supports local / private cloud deployment with full data sovereignty and a zero-barrier Web UI for quick onboarding.\n\n## 🧩 Feature Overview\n\n**Intelligent Conversation**\n\n| Capability | Details |\n|------------|---------|\n| Intelligent Reasoning | ReACT progressive multi-step reasoning, autonomously orchestrating knowledge retrieval, MCP tools, and web search |\n| Quick Q&A | RAG-based Q&A over knowledge bases for fast and accurate answers |\n| Wiki Mode | Agent-driven auto-generation of structured, interlinked markdown Wiki pages from raw documents |\n| Tool Calling | Built-in tools, MCP tools (incl. OAuth2 remote services), web search |\n| Conversation Strategy | Online Prompt editing, retrieval threshold tuning, multi-turn context awareness |\n| Suggested Questions | Auto-generated question suggestions based on knowledge base content |\n| Citations & RAG Progress | Inline citation popovers, shared markdown rendering, and stage-by-stage RAG pipeline progress in chat |\n| Session Management | Filter and group sidebar sessions by source (Web / IM / Embed) |\n\n**Knowledge Management**\n\n| Capability | Details |\n|------------|---------|\n| Knowledge Base Types | FAQ / Document / Wiki with folder import, URL import, multi-tag management, and online entry |\n| Per-Upload Process Config | Override parser, chunking, multimodal (VLM / ASR), graph extraction, and question generation per upload batch via upload-confirm dialog or `process_config` API; reparse with new settings |\n| Batch Reparse | Re-queue parsing for multiple documents at once with optional per-batch `process_config` |\n| Data Source Import | Auto-sync from Feishu / Notion / Yuque / RSS feeds (more data sources coming soon); incremental and full sync |\n| Document Formats | PDF / Word / Txt / Markdown / HTML / EPUB / MHTML / Images / CSV / Excel / PPT / JSON |\n| Retrieval Strategies | BM25 sparse / Dense retrieval / GraphRAG / parent-child chunking / HNSW-accelerated pgvector (1024-dim) / multi-dimensional indexing |\n| Batch Selection | Marquee drag-select multiple documents in the KB list for batch operations |\n| E2E Testing | Full-pipeline visualization with recall hit rate, BLEU / ROUGE metric evaluation |\n\n**Integrations & Extensions**\n\n| Capability | Details |\n|------------|---------|\n| LLMs | OpenAI / Azure OpenAI / Anthropic (Claude) / DeepSeek / Qwen (Alibaba Cloud) / Zhipu / Hunyuan / Doubao (Volcengine) / Gemini / MiniMax / NVIDIA / Novita AI / SiliconFlow / OpenRouter / Ollama |\n| Embeddings | Ollama / BGE / GTE / Zhipu / OpenAI-compatible APIs |\n| Vector DBs | PostgreSQL (pgvector) / Elasticsearch / OpenSearch / Milvus / Weaviate / Qdrant / Apache Doris / Tencent VectorDB |\n| Object Storage",
    "manifest_file": "go.mod",
    "manifest_content": "module github.com/Tencent/WeKnora\n\ngo 1.26.0\n\nrequire (\n\tcodeberg.org/readeck/go-readability/v2 v2.1.2\n\tgithub.com/DATA-DOG/go-sqlmock v1.5.2\n\tgithub.com/JohannesKaufmann/html-to-markdown/v2 v2.5.1\n\tgithub.com/PuerkitoBio/goquery v1.12.0\n\tgithub.com/aliyun/alibabacloud-oss-go-sdk-v2 v1.5.1\n\tgithub.com/asg017/sqlite-vec-go-bindings v0.1.6\n\tgithub.com/aws/aws-sdk-go-v2 v1.41.7\n\tgithub.com/aws/aws-sdk-go-v2/config v1.32.17\n\tgithub.com/aws/aws-sdk-go-v2/credentials v1.19.16\n\tgithub.com/aws/aws-sdk-go-v2/service/s3 v1.101.0\n\tgithub.com/chromedp/chromedp v0.15.1\n\tgithub.com/duckdb/duckdb-go/v2 v2.10502.0\n\tgithub.com/elastic/go-elasticsearch/v7 v7.17.10\n\tgithub.com/elastic/go-elasticsearch/v8 v8.19.6\n\tgithub.com/gin-contrib/cors v1.7.7\n\tgithub.com/gin-gonic/gin v1.12.0\n\tgithub.com/go-openapi/strfmt v0.26.2\n\tgithub.com/go-sql-driver/mysql v1.10.0\n\tgithub.com/go-viper/mapstructure/v2 v2.5.0\n\tgithub.com/golang-jwt/jwt/v5 v5.3.1\n\tgithub.com/golang-migrate/migrate/v4 v4.19.1\n\tgithub.com/google/jsonschema-go v0.4.3\n\tgithub.com/google/uuid v1.6.0\n\tgithub.com/gorilla/websocket v1.5.3\n\tgithub.com/hibiken/asynq v0.26.0\n\tgithub.com/jackc/pgx/v5 v5.9.2\n\tgithub.com/joho/godotenv v1.5.1\n\tgithub.com/ks3sdklib/aws-sdk-go v1.11.0\n\tgithub.com/larksuite/oapi-sdk-go/v3 v3.6.1\n\tgithub.com/longbridgeapp/opencc v0.3.13\n\tgithub.com/mark3labs/mcp-go v0.52.0\n\tgithub.com/milvus-io/milvus/client/v2 v2.6.4\n\tgithub.com/minio/minio-go/v7 v7.1.0\n\tgithub.com/mmcdole/gofeed v1.3.0\n\tgithub.com/neo4j/neo4j-go-driver/v6 v6.0.0\n\tgithub.com/ollama/ollama v0.23.2\n\tgithub.com/open-dingtalk/dingtalk-stream-sdk-go v0.9.1\n\tgithub.com/opensearch-project/opensearch-go/v4 v4.6.0\n\tgithub.com/panjf2000/ants/v2 v2.12.0\n\tgithub.com/parquet-go/parquet-go v0.29.0\n\tgithub.com/pganalyze/pg_query_go/v6 v6.2.2\n\tgithub.com/pgvector/pgvector-go v0.3.0\n\tgithub.com/qdrant/go-client v1.18.1\n\tgithub.com/redis/go-redis/v9 v9.14.1\n\tgithub.com/robfig/cron/v3 v3.0.1\n\tgithub.com/sashabaranov/go-openai v1.41.2\n\tgithub.com/sirupsen/logrus v1.9.4\n\tgithub.com/slack-go/slack v0.23.1\n\tgithub.com/spf13/viper v1.21.0\n\tgithub.com/stretchr/testify v1.11.1\n\tgithub.com/swaggo/files v1.0.1\n\tgithub.com/swaggo/gin-swagger v1.6.1\n\tgithub.com/swaggo/swag v1.16.6\n\tgithub.com/tencent/vectordatabase-sdk-go v1.8.4\n\tgithub.com/tencentcloud/tencentcloud-sdk-go/tencentcloud/common v1.3.103\n\tgithub.com/tencentcloud/tencentcloud-sdk-go/tencentcloud/lkeap v1.3.103\n\tgithub.com/tencentyun/cos-go-sdk-v5 v0.7.73\n\tgithub.com/tiktoken-go/tokenizer v0.7.0\n\tgithub.com/volcengine/ve-tos-golang-sdk/v2 v2.9.4\n\tgithub.com/wailsapp/wails/v2 v2.12.0\n\tgithub.com/weaviate/weaviate v1.37.3\n\tgithub.com/weaviate/weaviate-go-client/v5 v5.7.3\n\tgithub.com/xuri/excelize/v2 v2.10.1\n\tgithub.com/yanyiwu/gojieba v1.4.7\n\tgo.uber.org/dig v1.19.0\n\tgolang.org/x/crypto v0.51.0\n\tgolang.org/x/mod v0.36.0\n\tgolang.org/x/net v0.54.0\n\tgolang.org/x/sync v0.20.0\n\tgolang.org/x/time v0.15.0\n\tgoogle.golang.org/api v0.278.0\n\tgoogle.golang.org/grpc v1.81.0\n\tgoogle.golang.org/protobuf v1.36.11\n\tgopkg.in/natefinch/lumberjack.v2 v2.2.1\n\tgopkg.in/yaml.v3 v3.0.1\n\tgorm.io/driver/postgres v1.6.0\n\tgorm.io/driver/sqlite v1.6.0\n\tgorm.io/gorm v1.31.1\n)\n\nrequire (\n\tcloud.google.com/go/auth v0.20.0 // indirect\n\tcloud.google.com/go/auth/oauth2adapt v0.2.8 // indirect\n\tcloud.google.com/go/compute/metadata v0.9.0 // indirect\n\tfilippo.io/edwards25519 v1.2.0 // indirect\n\tgit.sr.ht/~jackmordaunt/go-toast/v2 v2.0.3 // indirect\n\tgithub.com/JohannesKaufmann/dom v0.2.0 // indirect\n\tgithub.com/KyleBanks/depth v1.2.1 // indirect\n\tgithub.com/andybalholm/brotli v1.2.0 // indirect\n\tgithub.com/andybalholm/cascadia v1.3.3 // indirect\n\tgithub.com/apache/arrow-go/v18 v18.5.1 // indirect\n\tgithub.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.7.10 // indirect\n\tgithub.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.23 // indirect\n\tgithub.com/aws/aws-sdk-go-v2/internal/configsources v1.4.23 // indirect\n\tgithub.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.23 // indirect\n\tgithub.com/aws/aws-sdk-go-v2/internal/v4a v1.4.24 // indirect\n\tgithub.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.9 // indirect\n\tgithub.com/aws/aws-sdk-go-v2/service/internal/checksum v1.9.15 // indirect\n\tgithub.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.23 // indirect\n\tgithub.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.19.23 // indirect\n\tgithub.com/aws/aws-sdk-go-v2/service/signin v1.0.11 // indirect\n\tgithub.com/aws/aws-sdk-go-v2/service/sso v1.30.17 // indirect\n\tgithub.com/aws/aws-sdk-go-v2/service/ssooidc v1.35.21 // indirect\n\tgithub.com/aws/aws-sdk-go-v2/service/sts v1.42.1 // indirect\n\tgithub.com/aws/smithy-go v1.25.1 // indirect\n\tgithub.com/bahlo/generic-list-go v0.2.0 // indirect\n\tgithub.com/beorn7/perks v1.0.1 // indirect\n\tgithub.com/bep/debounce v1.2.1 // indirect\n\tgithub.com/blang/semver/v4 v4.0.0 // indirect\n\tgithub.com/buger/jsonparser v1.1.2 // indirect\n\tgithub.com/bytedance/gopkg v0.1.3 // indirect\n\tgithub.com/bytedance/sonic v1.15.0 // indirect\n\tgithub.com/byteda",
    "strategic_keywords": [
      "agent",
      "agents",
      "mcp",
      "rag",
      "skill",
      "llm",
      "eval",
      "vector",
      "embedding"
    ],
    "relationship_label": "Skill 来源",
    "data_confidence": "high",
    "evidence_sources": [
      "github_trending",
      "github_repo_api",
      "readme",
      "go.mod"
    ],
    "score_breakdown": {
      "heat": 20,
      "relevance": 20,
      "novelty": 15,
      "productize": 14,
      "adoption": 10,
      "relation": 10,
      "risk_signal": 8,
      "total": 97
    },
    "strategic_score": 97
  },
  {
    "owner": "stanford-oval",
    "name": "storm",
    "full_name": "stanford-oval/storm",
    "url": "https://github.com/stanford-oval/storm",
    "description": "An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.",
    "language": "Python",
    "total_stars": 29561,
    "forks": 2750,
    "stars_this_period": 631,
    "source_slice": "python",
    "source_slices": [
      "python"
    ],
    "metadata": {
      "topics": [
        "agentic-rag",
        "deep-research",
        "emnlp2024",
        "knowledge-curation",
        "large-language-models",
        "naacl",
        "nlp",
        "report-generation",
        "retrieval-augmented-generation"
      ],
      "license": "MIT",
      "open_issues": 107,
      "created_at": "2024-03-24T16:23:39Z",
      "pushed_at": "2025-09-30T18:07:21Z",
      "homepage": "http://storm.genie.stanford.edu",
      "default_branch": "main",
      "forks": 2750,
      "watchers": 191,
      "archived": false,
      "size_kb": 8212
    },
    "readme_content": "<p align=\"center\">\n  <img src=\"assets/logo.svg\" style=\"width: 25%; height: auto;\">\n</p>\n\n# STORM: Synthesis of Topic Outlines through Retrieval and Multi-perspective Question Asking\n\n<p align=\"center\">\n| <a href=\"http://storm.genie.stanford.edu\"><b>Research preview</b></a> | <a href=\"https://arxiv.org/abs/2402.14207\"><b>STORM Paper</b></a>| <a href=\"https://www.arxiv.org/abs/2408.15232\"><b>Co-STORM Paper</b></a>  | <a href=\"https://storm-project.stanford.edu/\"><b>Website</b></a> |\n</p>\n**Latest News** 🔥\n\n- [2025/01] We add [litellm](https://github.com/BerriAI/litellm) integration for language models and embedding models in `knowledge-storm` v1.1.0.\n\n- [2024/09] Co-STORM codebase is now released and integrated into `knowledge-storm` python package v1.0.0. Run `pip install knowledge-storm --upgrade` to check it out.\n\n- [2024/09] We introduce collaborative STORM (Co-STORM) to support human-AI collaborative knowledge curation! [Co-STORM Paper](https://www.arxiv.org/abs/2408.15232) has been accepted to EMNLP 2024 main conference.\n\n- [2024/07] You can now install our package with `pip install knowledge-storm`!\n- [2024/07] We add `VectorRM` to support grounding on user-provided documents, complementing existing support of search engines (`YouRM`, `BingSearch`). (check out [#58](https://github.com/stanford-oval/storm/pull/58))\n- [2024/07] We release demo light for developers a minimal user interface built with streamlit framework in Python, handy for local development and demo hosting (checkout [#54](https://github.com/stanford-oval/storm/pull/54))\n- [2024/06] We will present STORM at NAACL 2024! Find us at Poster Session 2 on June 17 or check our [presentation material](assets/storm_naacl2024_slides.pdf). \n- [2024/05] We add Bing Search support in [rm.py](knowledge_storm/rm.py). Test STORM with `GPT-4o` - we now configure the article generation part in our demo using `GPT-4o` model.\n- [2024/04] We release refactored version of STORM codebase! We define [interface](knowledge_storm/interface.py) for STORM pipeline and reimplement STORM-wiki (check out [`src/storm_wiki`](knowledge_storm/storm_wiki)) to demonstrate how to instantiate the pipeline. We provide API to support customization of different language models and retrieval/search integration.\n\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n\n## Overview [(Try STORM now!)](https://storm.genie.stanford.edu/)\n\n<p align=\"center\">\n  <img src=\"assets/overview.svg\" style=\"width: 90%; height: auto;\">\n</p>\nSTORM is a LLM system that writes Wikipedia-like articles from scratch based on Internet search. Co-STORM further enhanced its feature by enabling human to collaborative LLM system to support more aligned and preferred information seeking and knowledge curation.\n\nWhile the system cannot produce publication-ready articles that often require a significant number of edits, experienced Wikipedia editors have found it helpful in their pre-writing stage.\n\n**More than 70,000 people have tried our [live research preview](https://storm.genie.stanford.edu/). Try it out to see how STORM can help your knowledge exploration journey and please provide feedback to help us improve the system 🙏!**\n\n\n\n## How STORM & Co-STORM works\n\n### STORM\n\nSTORM breaks down generating long articles with citations into two steps:\n\n1. **Pre-writing stage**: The system conducts Internet-based research to collect references and generates an outline.\n2. **Writing stage**: The system uses the outline and references to generate the full-length article with citations.\n<p align=\"center\">\n  <img src=\"assets/two_stages.jpg\" style=\"width: 60%; height: auto;\">\n</p>\n\nSTORM identifies the core of automating the research process as automatically coming up with good questions to ask. Directly prompting the language model to ask questions does not work well. To improve the depth and breadth of the questions, STORM adopts two strategies:\n1. **Perspective-Guided Question Asking**: Given the input topic, STORM discovers different perspectives by surveying existing articles from similar topics and uses them to control the question-asking process.\n2. **Simulated Conversation**: STORM simulates a conversation between a Wikipedia writer and a topic expert grounded in Internet sources to enable the language model to update its understanding of the topic and ask follow-up questions.\n\n### CO-STORM\n\nCo-STORM proposes **a collaborative discourse protocol** which implements a turn management policy to support smooth collaboration among \n\n- **Co-STORM LLM experts**: This type of agent generates answers grounded on external knowledge sources and/or raises follow-up questions based on the discourse history.\n- **Moderator**: This agent generates thought-provoking questions inspired by information discovered by the retriever but not directly used in previous turns. Question generation can also be grounded!\n- **Human user**: The human user will take the initiative to either (1) observe the discourse to gain deeper understanding of the topic, or (2) actively engage in the conversation by injecting utterances to steer the discussion focus.\n\n<p align=\"center\">\n  <img src=\"assets/co-storm-workflow.jpg\" style=\"width: 60%; height: auto;\">\n</p>\n\nCo-STORM also maintains a dynamic updated **mind map**, which organize collected information into a hierarchical concept structure, aiming to **build a shared conceptual space between the human user and the system**. The mind map has been proven to help reduce the mental load when the discourse goes long and in-depth. \n\nBoth STORM and Co-STORM are implemented in a highly modular way using [dspy](https://github.com/stanfordnlp/dspy).\n\n## Installation\n\n\nTo install the knowledge storm library, use `pip install knowledge-storm`. \n\nYou could also install the source code which allows you to modify the behavior of STORM engine directly.\n1. Clone the git repository.\n    ```shell\n    git clone https://github.com/stanford-oval/storm.git\n    cd storm\n    ```\n   \n2. Install the required packages.\n   ```shell\n   conda create -n storm python=3.11\n   conda activate storm\n   pip install -r requirements.txt\n   ```\n   \n\n## API\n\nCurrently, our package support:\n\n- Language model components: All language models supported by litellm as listed [here](https://docs.litellm.ai/docs/providers)\n- Embedding model components: All embedding models supported by litellm as listed [here](https://docs.litellm.ai/docs/embedding/supported_embedding)\n- retrieval module components: `YouRM`, `BingSearch`, `VectorRM`, `SerperRM`, `BraveRM`, `SearXNG`, `DuckDuckGoSearchRM`, `TavilySearchRM`, `GoogleSearch`, and `AzureAISearch` as \n\n:star2: **PRs for integrating more search engines/retrievers into [knowledge_storm/rm.py](knowledge_storm/rm.py) are highly appreciated!**\n\nBoth STORM and Co-STORM are working in the information curation layer, you need to set up the information retrieval module and language model module to create their `Runner` classes respectively.\n\n### STORM\n\nThe STORM knowledge curation engine is defined as a simple Python `STORMWikiRunner` class. Here is an example of using You.com search engine and OpenAI models.\n\n```python\nimport os\nfrom knowledge_storm import STORMWikiRunnerArguments, STORMWikiRunner, STORMWikiLMConfigs\nfrom knowledge_storm.lm import LitellmModel\nfrom knowledge_storm.rm import YouRM\n\nlm_configs = STORMWikiLMConfigs()\nopenai_kwargs = {\n    'api_key': os.getenv(\"OPENAI_API_KEY\"),\n    'temperature': 1.0,\n    'top_p': 0.9,\n}\n# STORM is a LM system so different components can be powered by different models to reach a good balance between cost and quality.\n# For a good practice, choose a cheaper/faster model for `conv_simulator_lm` which is used to split queries, synthesize answers in the conversation.\n# Choose a more powerful model for `article_gen_lm` to generate verifiable text with citations.\ngpt_35 = LitellmModel(model='gpt-3.5-turbo', max_tokens=500, **openai_kwargs)\ngpt_4 = LitellmModel(model='gpt-4o', max_tokens=3000, **openai_kwargs)\nlm_configs.set_conv_simulator_lm(gpt_35)\nlm_configs.set_question_asker_lm(gpt_35)\nlm_configs.set_outline_gen_lm(gpt_4)\nlm_configs.set_article_gen_lm(gpt_4)\nlm_configs.set_article_polish_lm(gpt_4)\n# Check out the STORMWikiRunnerArguments class for more configurations.\nengine_args = STORMWikiRunnerArguments(...)\nrm = YouRM(ydc_api_key=os.getenv('YDC_API_KEY'), k=engine_args.search_top_k)\nrunner = STORMWikiRunner(engine_args, lm_configs, rm)\n```\n\nThe `STORMWikiRunner` instance can be evoked with the simple `run` method:\n```python\ntopic = input('Topic: ')\nrunner.run(\n    topic=topic,\n    do_research=True,\n    do_generate_outline=True,\n    do_generate_article=True,\n    do_polish_article=True,\n)\nrunner.post_run()\nrunner.summary()\n```\n- `do_research`: if True, simulate conversations with difference perspectives to collect information about the topic; otherwise, load the results.\n- `do_generate_outline`: if True, generate an outline for the topic; otherwise, load the results.\n- `do_generate_article`: if True, generate an article for the topic based on the outline and the collected information; otherwise, load the results.\n- `do_polish_article`: if True, polish the article by adding a summarization section and (optionally) removing duplicate content; otherwise, load the results.\n\n### Co-STORM\n\nThe Co-STORM knowledge curation engine is defined as a simple Python `CoStormRunner` class. Here is an example of using Bing search engine and OpenAI models.\n\n```python\nfrom knowledge_storm.collaborative_storm.engine import CollaborativeStormLMConfigs, RunnerArgument, CoStormRunner\nfrom knowledge_storm.lm import LitellmModel\nfrom knowledge_storm.logging_wrapper import LoggingWrapper\nfrom knowledge_storm.rm import BingSearch\n\n# Co-STORM adopts the same multi LM system paradigm as STORM \nlm_config: CollaborativeStormLMConfigs = CollaborativeStormLMConfigs()\nopenai_kwargs = {\n    \"api_key\": os.getenv(\"OPENAI_API_KEY\"),\n    \"api_provider\": \"openai\",\n    \"temperature\": 1.0,\n    \"top_p\": 0.9,\n    \"api_base\": None,\n} \nquestion_answering_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=1000, **openai_kwargs)\ndiscourse_manage_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=500, **openai_kwargs)\nutterance_polishing_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=2000, **openai_kwargs)\nwarmstart_outline_gen_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=500, **openai_kwargs)\nquestion_asking_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=300, **openai_kwargs)\nknowledge_base_lm = LitellmModel(model=gpt_4o_model_name, max_tokens=1000, **openai_kwargs)\n\nlm_config.set_question_answering_lm(question_answering_lm)\nlm_config.set_discourse_manage_lm(discourse_manage_lm)\nlm_config.set_utterance_polishing_lm(utterance_polishing_lm)\nlm_config.set_warmstart_outline_gen_lm(warmstart_outline_gen_lm)\nlm_config.set_question_asking_lm(question_asking_lm)\nlm_config.set_knowledge_base_lm(knowledge_base_lm)\n\n# Check out the Co-STORM's RunnerArguments class for more configurations.\ntopic = input('Topic: ')\nrunner_argument = RunnerArgument(topic=topic, ...)\nlogging_wrapper = LoggingWrapper(lm_config)\nbing_rm = BingSearch(bing_search_api_key=os.environ.get(\"BING_SEARCH_API_KEY\"),\n                     k=runner_argument.retrieve_top_k)\ncostorm_runner = CoStormRunner(lm_config=lm_config,\n                               runner_argument=runner_argument,\n                               logging_wrapper=logging_wrapper,\n                               rm=bing_rm)\n```\n\nThe `CoStormRunner` instance can be evoked with the `warmstart()` and `step(...)` methods.\n\n```python\n# Warm start the system to build shared conceptual space between Co-STORM and users\ncostorm_runner.warm_start()\n\n# Step through the collaborative discourse \n# Run either of the code snippets below in any order, as many times as you'd like\n# To observe the conversation:\nconv_turn = costorm_runner.step()\n# To inject your utterance to actively stee",
    "manifest_file": "requirements.txt",
    "manifest_content": "dspy_ai==2.4.9\nwikipedia==1.4.0\nsentence-transformers\ntoml\nlangchain-text-splitters\ntrafilatura\nlangchain-huggingface\nqdrant-client\nlangchain-qdrant\nnumpy\nlitellm\ndiskcache",
    "strategic_keywords": [
      "agent",
      "rag",
      "llm",
      "eval",
      "vector",
      "embedding"
    ],
    "relationship_label": "Workspace 组件",
    "data_confidence": "high",
    "evidence_sources": [
      "github_trending",
      "github_repo_api",
      "readme",
      "requirements.txt"
    ],
    "score_breakdown": {
      "heat": 19,
      "relevance": 20,
      "novelty": 15,
      "productize": 14,
      "adoption": 10,
      "relation": 10,
      "risk_signal": 8,
      "total": 96
    },
    "strategic_score": 96
  },
  {
    "owner": "modem-dev",
    "name": "hunk",
    "full_name": "modem-dev/hunk",
    "url": "https://github.com/modem-dev/hunk",
    "description": "Review-first terminal diff viewer for agentic coders",
    "language": "TypeScript",
    "total_stars": 5761,
    "forks": 151,
    "stars_this_period": 556,
    "source_slice": "typescript",
    "source_slices": [
      "typescript"
    ],
    "metadata": {
      "topics": [
        "cli",
        "code-review",
        "diff",
        "git",
        "tui"
      ],
      "license": "MIT",
      "open_issues": 70,
      "created_at": "2026-03-17T19:16:15Z",
      "pushed_at": "2026-06-28T14:29:29Z",
      "homepage": "https://hunk.dev",
      "default_branch": "main",
      "forks": 151,
      "watchers": 9,
      "archived": false,
      "size_kb": 4556
    },
    "readme_content": "# hunk\n\nHunk is a review-first terminal diff viewer for agent-authored changesets, built on [OpenTUI](https://github.com/anomalyco/opentui) and [Pierre diffs](https://www.npmjs.com/package/@pierre/diffs).\n\n[![CI status](https://img.shields.io/github/actions/workflow/status/modem-dev/hunk/ci.yml?branch=main&style=for-the-badge&label=CI)](https://github.com/modem-dev/hunk/actions/workflows/ci.yml?branch=main)\n[![Latest release](https://img.shields.io/github/v/release/modem-dev/hunk?style=for-the-badge)](https://github.com/modem-dev/hunk/releases)\n[![MIT License](https://img.shields.io/badge/License-MIT-blue.svg?style=for-the-badge)](LICENSE)\n\n- multi-file review stream with sidebar navigation\n- inline AI and agent annotations beside the code\n- split, stack, and responsive auto layouts\n- watch mode for auto-reloading file and Git-backed reviews\n- keyboard, mouse, pager, and Git difftool support\n\n<table>\n <tr>\n   <td width=\"60%\" align=\"center\">\n    <img width=\"845\" alt=\"image\" src=\"https://github.com/user-attachments/assets/35605618-be3f-479e-b6e0-edb089910651\" />\n     <br />\n     <sub>Split view with sidebar and inline AI notes</sub>\n   </td>\n   <td width=\"40%\" align=\"center\">\n     <img width=\"507\"alt=\"image\" src=\"https://github.com/user-attachments/assets/92eb8993-f044-436d-a038-8139da5ad8de\" />\n     <br />\n     <sub>Stacked view and mouse-selectable menus</sub>\n   </td>\n </tr>\n</table>\n\n## Install\n\n```bash\nnpm i -g hunkdiff\n```\n\nOr with Homebrew:\n\n```bash\nbrew install modem-dev/tap/hunk\n```\n\nRequirements:\n\n- Node.js 18+\n- macOS, Linux, or Windows\n- Git recommended for most workflows\n\n> Nix users can use the `default` package exported in `flake.nix` instead. See [nix/README.md](./nix/README.md) for details.\n\n## Quick start\n\n```bash\nhunk           # show help\nhunk --version # print the installed version\n```\n\n### Working with Git\n\nHunk mirrors Git's diff-style commands, but opens the changeset in a review UI instead of plain text.\n\n```bash\nhunk diff                      # review current repo changes, including untracked files\nhunk diff --watch              # auto-reload as the working tree changes\nhunk show                      # review the latest commit\nhunk show HEAD~1               # review an earlier commit\n```\n\n### Working with Jujutsu and Sapling\n\nHunk auto-detects Jujutsu and Sapling checkouts, so `hunk diff [revset]` and `hunk show [revset]` use native revsets inside jj or Sapling workspaces. To override VCS detection, set `vcs = \"git\"` or `vcs = \"jj\"` or `vcs = \"sl\"` in [config](#config).\n\n### Working with raw files and patches\n\n```bash\nhunk diff before.ts after.ts                # compare two files directly\nhunk diff before.ts after.ts --watch        # auto-reload when either file changes\ngit diff --no-color | hunk patch -          # review a patch from stdin\n```\n\n### Working with agents\n\n1. Open Hunk in another terminal with `hunk diff` or `hunk show`.\n2. Tell your agent to add the skill file returned by `hunk skill path`.\n3. Ask your agent to use the skill against the live Hunk session.\n\nA good generic prompt is:\n\n```text\nLoad the Hunk skill and use it for this review. Run `hunk skill path` to get the skill path.\n```\n\nFor the full live-session and `--agent-context` workflow guide, see [docs/agent-workflows.md](docs/agent-workflows.md).\n\n## Feature comparison\n\n| Capability                         | [hunk](https://github.com/modem-dev/hunk) | [lumen](https://github.com/jnsahaj/lumen) | [difftastic](https://github.com/Wilfred/difftastic) | [delta](https://github.com/dandavison/delta) | [diff-so-fancy](https://github.com/so-fancy/diff-so-fancy) | [diff](https://www.gnu.org/software/diffutils/) |\n| ---------------------------------- | ----------------------------------------- | ----------------------------------------- | --------------------------------------------------- | -------------------------------------------- | ---------------------------------------------------------- | ----------------------------------------------- |\n| Review-first interactive UI        | ✅                                        | ✅                                        | ❌                                                  | ❌                                           | ❌                                                         | ❌                                              |\n| Multi-file review stream + sidebar | ✅                                        | ✅                                        | ❌                                                  | ❌                                           | ❌                                                         | ❌                                              |\n| Inline agent / AI annotations      | ✅                                        | ❌                                        | ❌                                                  | ❌                                           | ❌                                                         | ❌                                              |\n| Responsive auto split/stack layout | ✅                                        | ❌                                        | ❌                                                  | ❌                                           | ❌                                                         | ❌                                              |\n| Mouse support inside the viewer    | ✅                                        | ✅                                        | ❌                                                  | ❌                                           | ❌                                                         | ❌                                              |\n| Runtime view toggles               | ✅                                        | ✅                                        | ❌                                                  | ❌                                           | ❌                                                         | ❌                                              |\n| Syntax highlighting                | ✅                                        | ✅                                        | ✅                                                  | ✅                                           | ❌                                                         | ❌                                              |\n| Structural diffing                 | ❌                                        | ❌                                        | ✅                                                  | ❌                                           | ❌                                                         | ❌                                              |\n| Pager-compatible mode              | ✅                                        | ❌                                        | ✅                                                  | ✅                                           | ✅                                                         | ✅                                              |\n\nHunk is optimized for reviewing a full changeset interactively.\n\n## Advanced\n\n### Config\n\nYou can persist preferences to a config file:\n\n- `~/.config/hunk/config.toml`\n- `.hunk/config.toml`\n\nExample:\n\n```toml\ntheme = \"github-dark-default\" # any built-in theme id, auto, or custom\nmode = \"auto\"        # auto, split, stack\nvcs = \"git\"          # git, jj, sl\nwatch = false\nexclude_untracked = false\nline_numbers = true\nwrap_lines = false\nmenu_bar = true\nagent_notes = false\ntransparent_background = false\n```\n\n`theme = \"auto\"` and `--theme auto` query the terminal background at startup, choose `github-light-default` for light backgrounds and `github-dark-default` for dark backgrounds, and fall back to `github-dark-default` if the terminal does not answer.\nOlder theme ids such as `graphite` and `paper` remain accepted as compatibility aliases.\n`exclude_untracked` affects Git/Sapling working-tree `hunk diff` sessions only.\n`transparent_background` can also be written as `transparentBackground`.\n\nCustom themes can inherit from any built-in theme and override only the colors you care about:\n\n```toml\ntheme = \"custom\"\n\n[custom_theme]\nbase = \"catppuccin-mocha\"\nlabel = \"My Theme\"\naccent = \"#7fd1ff\"\npanel = \"#10161d\"\nnoteBorder = \"#c49bff\"\n\n[custom_theme.syntax]\nkeyword = \"#8ed4ff\"\nstring = \"#c7b4ff\"\ncomment = \"#6e85a7\"\noperator = \"#7fd1ff\"\nvariable = \"#eef4ff\"\n```\n\nAll custom theme colors must use `#rrggbb` hex values. Press `t` in the app, or choose `View -> Themes…`, to open the theme selector.\n\n### Git integration\n\nSet Hunk as your Git pager so `git diff` and `git show` open in Hunk automatically:\n\n> [!NOTE]\n> Untracked files are auto-included only for Hunk's own `hunk diff` working-tree loader. If you open `git diff` through `hunk pager`, Git still decides the patch contents, so untracked files will not appear there.\n\n```bash\ngit config --global core.pager \"hunk pager\"\n```\n\nOr in your Git config:\n\n```ini\n[core]\n    pager = hunk pager\n```\n\nIf you want to keep Git's default pager and add opt-in aliases instead:\n\n```bash\ngit config --global alias.hdiff \"-c core.pager=\\\"hunk pager\\\" diff\"\ngit config --global alias.hshow \"-c core.pager=\\\"hunk pager\\\" show\"\n```\n\n### Jujutsu pager integration\n\nTo use Hunk as jj's pager, run `jj config edit --user` and update:\n\n```toml\n[ui]\npager = [\"hunk\", \"pager\"]\ndiff-formatter = \":git\"\n```\n\n### Sapling pager integration\n\nTo use Hunk as Sapling's pager, run `sl config -u` and update:\n\n```ini\n[pager]\npager = hunk pager\n```\n\n### OpenTUI component\n\nHunk also publishes `HunkDiffView` and lower-level primitives from `hunkdiff/opentui` for embedding the same diff renderer in your own OpenTUI app.\n\nSee [docs/opentui-component.md](docs/opentui-component.md) for install, API, and runnable examples.\n\n## Examples\n\nReady-to-run demo diffs live in [`examples/`](examples/README.md).\n\nEach example includes the exact command to run from the repository root.\n\n## Contributing\n\n💬 _Chat with users/contributors on the [Modem Discord server](https://discord.gg/WZFjaP6Gt8)_\n\nFor source setup, tests, packaging checks, and repo architecture, see [CONTRIBUTING.md](CONTRIBUTING.md).\n\n## Sponsor\n\nSponsored by [Modem](https://modem.dev?utm_source=github&utm_medium=oss&utm_campaign=oss_hunk&utm_content=readme_footer).\n\n<a href=\"https://modem.dev?utm_source=github&utm_medium=oss&utm_campaign=oss_hunk&utm_content=readme_footer\">\n  <picture>\n    <source media=\"(prefers-color-scheme: dark)\" srcset=\"https://modem.dev/images/logo/svg/modem-combined-white.svg\">\n    <source media=\"(prefers-color-scheme: light)\" srcset=\"https://modem.dev/images/logo/svg/modem-combined-black.svg\">\n    <img src=\"https://modem.dev/images/logo/svg/modem-combined-black.svg\" alt=\"Modem\" width=\"220\">\n  </picture>\n</a>\n\n## License\n\n[MIT](LICENSE)\n",
    "manifest_file": "package.json",
    "manifest_content": "{\n  \"name\": \"hunkdiff\",\n  \"version\": \"0.16.0\",\n  \"description\": \"Desktop-inspired terminal diff viewer for understanding agent-authored changesets.\",\n  \"keywords\": [\n    \"ai\",\n    \"code-review\",\n    \"diff\",\n    \"git\",\n    \"terminal\",\n    \"tui\"\n  ],\n  \"homepage\": \"https://github.com/modem-dev/hunk#readme\",\n  \"bugs\": {\n    \"url\": \"https://github.com/modem-dev/hunk/issues\"\n  },\n  \"license\": \"MIT\",\n  \"repository\": {\n    \"type\": \"git\",\n    \"url\": \"git+https://github.com/modem-dev/hunk.git\"\n  },\n  \"bin\": {\n    \"hunk\": \"./bin/hunk.cjs\",\n    \"hunkdiff\": \"./bin/hunk.cjs\"\n  },\n  \"workspaces\": [\n    \".\",\n    \"packages/*\"\n  ],\n  \"files\": [\n    \"bin\",\n    \"dist/npm\",\n    \"skills\",\n    \"README.md\",\n    \"LICENSE\"\n  ],\n  \"type\": \"module\",\n  \"exports\": {\n    \"./opentui\": {\n      \"types\": \"./dist/npm/opentui/index.d.ts\",\n      \"import\": \"./dist/npm/opentui/index.js\"\n    },\n    \"./package.json\": \"./package.json\"\n  },\n  \"publishConfig\": {\n    \"access\": \"public\"\n  },\n  \"scripts\": {\n    \"start\": \"bun run src/main.tsx\",\n    \"dev\": \"bun --watch src/main.tsx\",\n    \"build:npm\": \"bun run ./scripts/build-npm.ts\",\n    \"build:bin\": \"bun run ./scripts/build-bin.ts\",\n    \"build:prebuilt:npm\": \"bun run build:npm && bun run build:bin && bun run ./scripts/stage-prebuilt-npm.ts\",\n    \"build:prebuilt:artifact\": \"bun run build:bin && bun run ./scripts/build-prebuilt-artifact.ts\",\n    \"stage:prebuilt:release\": \"bun run build:npm && bun run ./scripts/stage-prebuilt-npm.ts --artifact-root ./dist/release/artifacts\",\n    \"install:bin\": \"bun run ./scripts/install-bin.ts\",\n    \"typecheck\": \"tsc --noEmit\",\n    \"format\": \"oxfmt --write .\",\n    \"format:check\": \"oxfmt --check .\",\n    \"lint\": \"oxlint . --deny-warnings\",\n    \"lint:fix\": \"oxlint . --fix\",\n    \"changeset\": \"bunx @changesets/cli@2.31.0\",\n    \"changeset:status\": \"bunx @changesets/cli@2.31.0 status\",\n    \"release:version\": \"bunx @changesets/cli@2.31.0 version\",\n    \"prepare\": \"simple-git-hooks\",\n    \"test\": \"\\\"${npm_execpath:-bun}\\\" test ./src ./packages ./scripts ./test/cli ./test/session\",\n    \"test:theme-contrast\": \"bun test src/ui/themes.test.ts --test-name-pattern contrast\",\n    \"test:integration\": \"\\\"${npm_execpath:-bun}\\\" test ./test/pty\",\n    \"test:tty-smoke\": \"HUNK_RUN_TTY_SMOKE=1 \\\"${npm_execpath:-bun}\\\" test ./test/smoke\",\n    \"check:pack\": \"bun run ./scripts/check-pack.ts\",\n    \"check:prebuilt-pack\": \"bun run ./scripts/check-prebuilt-pack.ts\",\n    \"smoke:prebuilt-install\": \"bun run ./scripts/smoke-prebuilt-install.ts\",\n    \"publish:prebuilt:npm\": \"bun run ./scripts/publish-prebuilt-npm.ts\",\n    \"update:homebrew-formula\": \"bun run ./scripts/update-homebrew-formula.ts\",\n    \"prepack\": \"bun run build:npm\",\n    \"bench\": \"bun run benchmarks/run.ts\",\n    \"bench:release\": \"bun run ./scripts/run-release-benchmark.ts\",\n    \"bench:release:compare\": \"bun run ./scripts/compare-release-benchmarks.ts\",\n    \"bench:bootstrap-load\": \"bun run benchmarks/bootstrap-load.ts\",\n    \"bench:working-tree-load\": \"bun run benchmarks/working-tree-load.ts\",\n    \"bench:changeset-parse\": \"bun run benchmarks/changeset-parse.ts\",\n    \"bench:render-layout\": \"bun run benchmarks/render-layout.ts\",\n    \"bench:highlight-prefetch\": \"bun run benchmarks/highlight-prefetch.ts\",\n    \"bench:large-stream\": \"bun run benchmarks/large-stream.ts\",\n    \"bench:interaction-latency\": \"bun run benchmarks/interaction-latency.ts\",\n    \"bench:non-ascii-stream\": \"bun run benchmarks/non-ascii-stream.ts\",\n    \"bench:huge-stream\": \"bun run benchmarks/huge-stream.ts\",\n    \"bench:large-stream-profile\": \"bun run benchmarks/large-stream-profile.ts\",\n    \"bench:memory\": \"bun run benchmarks/memory.ts\",\n    \"bench:navigation-memory\": \"bun run benchmarks/navigation-memory.ts\",\n    \"bench:resize-memory\": \"bun run benchmarks/resize-memory.ts\",\n    \"bench:daemon-memory\": \"bun run scripts/daemon-memory-check.ts\",\n    \"bench:competitors\": \"bun run benchmarks/competitors.ts\",\n    \"nix:update-lock\": \"nix run .#update-bun-lock\"\n  },\n  \"dependencies\": {\n    \"@pierre/diffs\": \"1.2.2\",\n    \"bun\": \"^1.3.10\",\n    \"commander\": \"^14.0.3\",\n    \"diff\": \"^8.0.3\",\n    \"shell-quote\": \"1.8.4\",\n    \"string-width\": \"^8.2.1\",\n    \"zod\": \"^4.3.6\"\n  },\n  \"devDependencies\": {\n    \"@hunk/session-broker\": \"workspace:*\",\n    \"@hunk/session-broker-bun\": \"workspace:*\",\n    \"@hunk/session-broker-core\": \"workspace:*\",\n    \"@hunk/session-broker-node\": \"workspace:*\",\n    \"@opentui/core\": \"^0.1.89\",\n    \"@opentui/react\": \"^0.1.89\",\n    \"@types/bun\": \"1.3.13\",\n    \"@types/react\": \"^19.2.14\",\n    \"@types/shell-quote\": \"1.7.5\",\n    \"@types/ws\": \"^8.18.1\",\n    \"lint-staged\": \"^16.4.0\",\n    \"oxfmt\": \"^0.41.0\",\n    \"oxlint\": \"^1.56.0\",\n    \"react\": \"^19.2.4\",\n    \"simple-git-hooks\": \"^2.13.1\",\n    \"tuistory\": \"^0.0.16\",\n    \"typescript\": \"^5.9.3\"\n  },\n  \"peerDependencies\": {\n    \"@opentui/core\": \"^0.1.89\",\n    \"@opentui/react\": \"^0.1.89\",\n    \"react\": \"^19.2.4\"\n  },\n  \"simple-git-hooks\": {\n    \"pre-commit\": \"bunx lint-staged\"\n  },\n  \"engines\": {\n    \"node\": \">=18\"\n  },\n  \"packageManager\": \"bun@1",
    "strategic_keywords": [
      "agent",
      "agents",
      "skill",
      "workspace",
      "workflow"
    ],
    "relationship_label": "Memory 组件",
    "data_confidence": "high",
    "evidence_sources": [
      "github_trending",
      "github_repo_api",
      "readme",
      "package.json"
    ],
    "score_breakdown": {
      "heat": 19,
      "relevance": 20,
      "novelty": 15,
      "productize": 14,
      "adoption": 10,
      "relation": 10,
      "risk_signal": 8,
      "total": 96
    },
    "strategic_score": 96
  },
  {
    "owner": "DeusData",
    "name": "codebase-memory-mcp",
    "full_name": "DeusData/codebase-memory-mcp",
    "url": "https://github.com/DeusData/codebase-memory-mcp",
    "description": "High-performance code intelligence MCP server. Indexes codebases into a persistent knowledge graph — average repo in milliseconds. 158 languages, sub-ms queries, 99% fewer tokens. Single static binary, zero dependencies.",
    "language": "C",
    "total_stars": 19508,
    "forks": 1416,
    "stars_this_period": 7674,
    "source_slice": "all",
    "source_slices": [
      "all"
    ],
    "metadata": {
      "topics": [
        "aider",
        "ast",
        "claude-code",
        "code-analysis",
        "code-intelligence",
        "codex",
        "cursor",
        "cypher",
        "developer-tools",
        "gemini-cli",
        "graph-visualization",
        "kilocode",
        "knowledge-graph",
        "mcp",
        "mcp-server",
        "model-context-protocol",
        "opencode",
        "sqlite",
        "tree-sitter",
        "windsurf"
      ],
      "license": "MIT",
      "open_issues": 167,
      "created_at": "2026-02-24T22:01:00Z",
      "pushed_at": "2026-06-28T22:04:44Z",
      "homepage": "https://deusdata.github.io/codebase-memory-mcp/",
      "default_branch": "main",
      "forks": 1416,
      "watchers": 75,
      "archived": false,
      "size_kb": 170296
    },
    "readme_content": "# codebase-memory-mcp\n\n[![GitHub Release](https://img.shields.io/github/v/release/DeusData/codebase-memory-mcp?style=flat&color=blue)](https://github.com/DeusData/codebase-memory-mcp/releases/latest)\n[![License](https://img.shields.io/badge/license-MIT-green)](LICENSE)\n[![CI](https://img.shields.io/github/actions/workflow/status/DeusData/codebase-memory-mcp/dry-run.yml?label=CI)](https://github.com/DeusData/codebase-memory-mcp/actions/workflows/dry-run.yml)\n[![Tests](https://img.shields.io/badge/tests-5604_passing-brightgreen)](https://github.com/DeusData/codebase-memory-mcp)\n[![Languages](https://img.shields.io/badge/languages-158-orange)](https://github.com/DeusData/codebase-memory-mcp)\n[![Hybrid LSP](https://img.shields.io/badge/Hybrid_LSP-9_languages-blue)](#hybrid-lsp)\n[![Agents](https://img.shields.io/badge/agents-11-purple)](https://github.com/DeusData/codebase-memory-mcp)\n[![Pure C](https://img.shields.io/badge/pure_C-zero_dependencies-blue)](https://github.com/DeusData/codebase-memory-mcp)\n[![Platform](https://img.shields.io/badge/macOS_%7C_Linux_%7C_Windows-supported-lightgrey)](https://github.com/DeusData/codebase-memory-mcp/releases/latest)\n[![OpenSSF Scorecard](https://api.scorecard.dev/projects/github.com/DeusData/codebase-memory-mcp/badge)](https://scorecard.dev/viewer/?uri=github.com/DeusData/codebase-memory-mcp)\n[![SLSA 3](https://slsa.dev/images/gh-badge-level3.svg)](https://slsa.dev)\n[![VirusTotal](https://img.shields.io/badge/VirusTotal-scanned_every_release-brightgreen?logo=virustotal)](https://github.com/DeusData/codebase-memory-mcp/releases/latest)\n[![arXiv](https://img.shields.io/badge/arXiv-2603.27277-b31b1b?logo=arxiv)](https://arxiv.org/abs/2603.27277)\n\n**The fastest and most efficient code intelligence engine for AI coding agents.** Full-indexes an average repository in milliseconds, the Linux kernel (28M LOC, 75K files) in 3 minutes. Answers structural queries in under 1ms. Ships as a single static binary for macOS, Linux, and Windows — download, run `install`, done.\n\nHigh-quality parsing through [tree-sitter](https://tree-sitter.github.io/tree-sitter/) AST analysis across all 158 languages, enhanced with [**Hybrid LSP** semantic type resolution](#hybrid-lsp) for Python, TypeScript / JavaScript / JSX / TSX, PHP, C#, Go, C, C++, Java, Kotlin, and Rust — producing a persistent knowledge graph of functions, classes, call chains, HTTP routes, and cross-service links. 14 MCP tools. Zero dependencies. Plug and play across 11 coding agents.\n\n> **Research** — The design and benchmarks behind this project are described in the preprint [*Codebase-Memory: Tree-Sitter-Based Knowledge Graphs for LLM Code Exploration via MCP*](https://arxiv.org/abs/2603.27277) (arXiv:2603.27277). Evaluated across 31 real-world repositories: 83% answer quality, 10× fewer tokens, 2.1× fewer tool calls vs. file-by-file exploration.\n\n> **Security & Trust** — This tool reads your codebase and writes to your agent configuration files. That is what it is designed to do. If you prefer to audit before running, the [full source is here](https://github.com/DeusData/codebase-memory-mcp) — every release binary is signed, checksummed, and scanned by 70+ antivirus engines. All processing happens 100% locally; your code never leaves your machine. Found a security issue? We want to know — see [SECURITY.md](SECURITY.md). Security is Priority #1 for us.\n\n<p align=\"center\">\n  <img src=\"docs/graph-ui-screenshot.png\" alt=\"Graph visualization UI showing the codebase-memory-mcp knowledge graph\" width=\"800\">\n  <br>\n  <em>Built-in 3D graph visualization (UI variant) — explore your knowledge graph at localhost:9749</em>\n</p>\n\n## Why codebase-memory-mcp\n\n- **Extreme indexing speed** — Linux kernel (28M LOC, 75K files) in 3 minutes. RAM-first pipeline: LZ4 compression, in-memory SQLite, fused Aho-Corasick pattern matching. Memory released after indexing.\n- **Plug and play** — single static binary for macOS (arm64/amd64), Linux (arm64/amd64), and Windows (amd64). No Docker, no runtime dependencies, no API keys. Download → `install` → restart agent → done.\n- **158 languages** — vendored tree-sitter grammars compiled into the binary. Nothing to install, nothing that breaks.\n- **120x fewer tokens** — 5 structural queries: ~3,400 tokens vs ~412,000 via file-by-file search. One graph query replaces dozens of grep/read cycles.\n- **11 agents, one command** — `install` auto-detects Claude Code, Codex CLI, Gemini CLI, Zed, OpenCode, Antigravity, Aider, KiloCode, VS Code, OpenClaw, and Kiro — configures MCP entries, instruction files, and pre-tool hooks for each.\n- **Built-in graph visualization** — 3D interactive UI at `localhost:9749` (optional UI binary variant).\n- **Infrastructure-as-code indexing** — Dockerfiles, Kubernetes manifests, and Kustomize overlays indexed as graph nodes with cross-references. `Resource` nodes for K8s kinds, `Module` nodes for Kustomize overlays with `IMPORTS` edges to referenced resources.\n- **14 MCP tools** — search, trace, architecture, impact analysis, Cypher queries, dead code detection, cross-service HTTP linking, ADR management, and more.\n\n## Quick Start\n\n**One-line install** (macOS / Linux):\n```bash\ncurl -fsSL https://raw.githubusercontent.com/DeusData/codebase-memory-mcp/main/install.sh | bash\n```\n\nWith graph visualization UI:\n```bash\ncurl -fsSL https://raw.githubusercontent.com/DeusData/codebase-memory-mcp/main/install.sh | bash -s -- --ui\n```\n\n**Windows** (PowerShell):\n```powershell\n# 1. Download the installer\nInvoke-WebRequest -Uri https://raw.githubusercontent.com/DeusData/codebase-memory-mcp/main/install.ps1 -OutFile install.ps1\n\n# 2. (Optional but recommended) Inspect the script\nnotepad install.ps1\n\n# 3. Run it\n.\\install.ps1\n\n```\n\nOptions: `--ui` (graph visualization), `--skip-config` (binary only, no agent setup), `--dir=<path>` (custom location).\n\nRestart your coding agent. Say **\"Index this project\"** — done.\n\n<details>\n<summary>Manual install</summary>\n\n1. **Download** the archive for your platform from the [latest release](https://github.com/DeusData/codebase-memory-mcp/releases/latest):\n   - `codebase-memory-mcp-<os>-<arch>.tar.gz` (macOS/Linux) or `.zip` (Windows) — standard\n   - `codebase-memory-mcp-ui-<os>-<arch>.tar.gz` / `.zip` — with graph visualization\n\n2. **Extract and install** (each archive includes `install.sh` or `install.ps1`):\n\n   macOS / Linux:\n   ```bash\n   tar xzf codebase-memory-mcp-*.tar.gz\n   ./install.sh\n   ```\n\n   Windows (PowerShell):\n   ```powershell\n   Expand-Archive codebase-memory-mcp-windows-amd64.zip -DestinationPath .\n   .\\install.ps1\n   ```\n\n3. **Restart** your coding agent.\n\nThe `install` command automatically strips macOS quarantine attributes and ad-hoc signs the binary — no manual `xattr`/`codesign` needed.\n</details>\n\nThe `install` command auto-detects all installed coding agents and configures MCP server entries, instruction files, skills, and pre-tool hooks for each.\n\n### Graph Visualization UI\n\nIf you downloaded the `ui` variant:\n\n```bash\ncodebase-memory-mcp --ui=true --port=9749\n```\n\nOpen `http://localhost:9749` in your browser. The UI runs as a background thread alongside the MCP server — it's available whenever your agent is connected.\n\n### Auto-Index\n\nEnable automatic indexing on MCP session start:\n\n```bash\ncodebase-memory-mcp config set auto_index true\n```\n\nWhen enabled, new projects are indexed automatically on first connection. Previously-indexed projects are registered with the background watcher for ongoing git-based change detection. Configurable file limit: `config set auto_index_limit 50000`.\n\n### Keeping Up to Date\n\n```bash\ncodebase-memory-mcp update\n```\n\nThe MCP server also checks for updates on startup and notifies on the first tool call if a newer release is available.\n\n### Uninstall\n\n```bash\ncodebase-memory-mcp uninstall\n```\n\nRemoves all agent configs, skills, hooks, and instructions. Does not remove the binary or SQLite databases.\n\n## Features\n\n### Graph & analysis\n- **Architecture overview**: `get_architecture` returns languages, packages, entry points, routes, hotspots, boundaries, layers, and clusters in a single call\n- **Architecture Decision Records**: `manage_adr` persists architectural decisions across sessions\n- **Louvain community detection**: Discovers functional modules by clustering call edges\n- **Git diff impact mapping**: `detect_changes` maps uncommitted changes to affected symbols with risk classification\n- **Call graph**: Resolves function calls across files and packages (import-aware, type-inferred)\n- **Dead code detection**: Finds functions with zero callers, excluding entry points\n- **Cypher-like queries**: `MATCH (f:Function)-[:CALLS]->(g) WHERE f.name = 'main' RETURN g.name`\n\n### Search\n- **Semantic search** (`semantic_query`): vector search across the entire graph, powered by bundled Nomic `nomic-embed-code` embeddings (40K tokens, 768d int8) compiled into the binary — no API key, no Ollama, no Docker. 11-signal combined scoring (TF-IDF, RRI, API/Type/Decorator signatures, AST profiles, data flow, Halstead-lite, MinHash, module proximity, graph diffusion).\n- **BM25 full-text search** via SQLite FTS5 with `cbm_camel_split` tokenizer (camelCase / snake_case aware)\n- **Structural search** (`search_graph`): regex name patterns, label filters, min/max degree, file scoping\n- **Code search** (`search_code`): graph-augmented grep over indexed files only\n\n### Cross-service linking\n- **HTTP** route ↔ call-site matching with confidence scoring\n- **gRPC, GraphQL, tRPC** service detection with protobuf Route extraction\n- **Channel detection** (`EMITS` / `LISTENS_ON`) for Socket.IO, EventEmitter, and generic pub-sub patterns across 8 languages with constant resolution\n\n### Cross-repo intelligence\n- **`CROSS_*` edges** link nodes across multiple repos indexed under the same store\n- **Multi-galaxy 3D UI layout** for cross-repo architecture visualization\n- **Cross-repo architecture summary** combining services, routes, and dependencies across the indexed fleet\n\n### Edge types (selected)\n- `CALLS`, `IMPORTS`, `DEFINES`, `IMPLEMENTS`, `INHERITS`\n- `HTTP_CALLS`, `ASYNC_CALLS` (cross-service)\n- `EMITS`, `LISTENS_ON` (channels)\n- `DATA_FLOWS` with arg-to-param mapping + field access chains\n- `SIMILAR_TO` (MinHash + LSH near-clone detection, Jaccard scored)\n- `SEMANTICALLY_RELATED` (vocabulary-mismatch, same-language, score ≥ 0.80)\n\n### Indexing pipeline\n- **158 vendored tree-sitter grammars** compiled into the binary\n- **Generic package / module resolution** — bare specifiers like `@myorg/pkg`, `github.com/foo/bar`, `use my_crate::foo` resolved via manifest scanning (`package.json`, `go.mod`, `Cargo.toml`, `pyproject.toml`, `composer.json`, `pubspec.yaml`, `pom.xml`, `build.gradle`, `mix.exs`, `*.gemspec`)\n- **Infrastructure-as-code indexing** — Dockerfiles, Kubernetes manifests, Kustomize overlays as graph nodes\n- **[Hybrid LSP semantic type resolution](#hybrid-lsp)** for Python, TypeScript / JavaScript / JSX / TSX, PHP, C#, Go, C, C++, Java, Kotlin, and Rust — a lightweight C implementation of language type-resolution algorithms, structurally inspired by and compatible with major language servers including tsserver / typescript-go, pyright, gopls, Roslyn, Eclipse JDT, and rust-analyzer (parameter binding, return-type inference, generic substitution, JSX component dispatch, JSDoc inference for plain JS files, namespace + trait + late-static-binding resolution for PHP, file-scoped namespaces + records + LINQ method syntax for C#, class-hierarchy + overload + lambda resolution for Java, extension-function + scope-function resolution for Kotlin, trait-method + UFCS resolution for Rust)\n- **RAM-first pipeline**: LZ4 compression, in-memory SQLite, single dump at end. Memory released after.\n\n### Distribution & operation\n- **Single static binary, zero infrastructure**: SQLite-backed, persists to `~/.cache/codebase-memory-mcp/`\n- **Auto-sync**: Background watcher detects file changes and re-indexes automatical",
    "strategic_keywords": [
      "agent",
      "agents",
      "memory",
      "mcp",
      "rag",
      "llm",
      "eval",
      "workflow",
      "protocol"
    ],
    "relationship_label": "Memory 组件",
    "data_confidence": "high",
    "evidence_sources": [
      "github_trending",
      "github_repo_api",
      "readme"
    ],
    "score_breakdown": {
      "heat": 20,
      "relevance": 20,
      "novelty": 15,
      "productize": 14,
      "adoption": 8,
      "relation": 10,
      "risk_signal": 8,
      "total": 95
    },
    "strategic_score": 95
  },
  {
    "owner": "bytedance",
    "name": "deer-flow",
    "full_name": "bytedance/deer-flow",
    "url": "https://github.com/bytedance/deer-flow",
    "description": "An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skill, subagents and message gateway, it handles different levels of tasks that could take minutes to hours.",
    "language": "Python",
    "total_stars": 75223,
    "forks": 10153,
    "stars_this_period": 3258,
    "source_slice": "all",
    "source_slices": [
      "all",
      "python"
    ],
    "metadata": {
      "topics": [
        "agent",
        "agentic",
        "agentic-framework",
        "agentic-workflow",
        "ai",
        "ai-agents",
        "deep-research",
        "harness",
        "langchain",
        "langgraph",
        "langmanus",
        "llm",
        "multi-agent",
        "nodejs",
        "podcast",
        "python",
        "superagent",
        "typescript"
      ],
      "license": "MIT",
      "open_issues": 966,
      "created_at": "2025-05-07T02:50:19Z",
      "pushed_at": "2026-06-28T15:30:40Z",
      "homepage": "https://deerflow.tech",
      "default_branch": "main",
      "forks": 10153,
      "watchers": 320,
      "archived": false,
      "size_kb": 40140
    },
    "readme_content": "# 🦌 DeerFlow - 2.0\n\nEnglish | [中文](./README_zh.md) | [日本語](./README_ja.md) | [Français](./README_fr.md) | [Русский](./README_ru.md)\n\n[![Python](https://img.shields.io/badge/Python-3.12%2B-3776AB?logo=python&logoColor=white)](./backend/pyproject.toml)\n[![Node.js](https://img.shields.io/badge/Node.js-22%2B-339933?logo=node.js&logoColor=white)](./Makefile)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](./LICENSE)\n\n<a href=\"https://trendshift.io/repositories/14699\" target=\"_blank\"><img src=\"https://trendshift.io/api/badge/repositories/14699\" alt=\"bytedance%2Fdeer-flow | Trendshift\" style=\"width: 250px; height: 55px;\" width=\"250\" height=\"55\"/></a>\n> On February 28th, 2026, DeerFlow claimed the 🏆 #1 spot on GitHub Trending following the launch of version 2. Thanks a million to our incredible community — you made this happen! 💪🔥\n\nDeerFlow (**D**eep **E**xploration and **E**fficient **R**esearch **Flow**) is an open-source **super agent harness** that orchestrates **sub-agents**, **memory**, and **sandboxes** to do almost anything — powered by **extensible skills**.\n\nhttps://github.com/user-attachments/assets/a8bcadc4-e040-4cf2-8fda-dd768b999c18\n\n> [!NOTE]\n> **DeerFlow 2.0 is a ground-up rewrite.** It shares no code with v1. If you're looking for the original Deep Research framework, it's maintained on the [`1.x` branch](https://github.com/bytedance/deer-flow/tree/main-1.x) — contributions there are still welcome. Active development has moved to 2.0.\n\n## Official Website\n\nLearn more and see **real demos** on our [**official website**](https://deerflow.tech).\n\n## Coding Plan from ByteDance Volcengine\n\n- We strongly recommend using Doubao-Seed-2.0-Code, DeepSeek v3.2 and Kimi 2.5 to run DeerFlow\n- [Learn more](https://www.byteplus.com/en/activity/codingplan?utm_campaign=deer_flow&utm_content=deer_flow&utm_medium=devrel&utm_source=OWO&utm_term=deer_flow)\n- [中国大陆地区的开发者请点击这里](https://www.volcengine.com/activity/codingplan?utm_campaign=deer_flow&utm_content=deer_flow&utm_medium=devrel&utm_source=OWO&utm_term=deer_flow)\n\n## InfoQuest\n\nDeerFlow has newly integrated the intelligent search and crawling toolset independently developed by BytePlus--[InfoQuest (supports free online experience)](https://docs.byteplus.com/en/docs/InfoQuest/What_is_Info_Quest)\n\n<a href=\"https://docs.byteplus.com/en/docs/InfoQuest/What_is_Info_Quest\" target=\"_blank\">\n  <img\n    src=\"https://sf16-sg.tiktokcdn.com/obj/eden-sg/hubseh7bsbps/20251208-160108.png\"   alt=\"InfoQuest_banner\"\n  />\n</a>\n\n---\n\n## Table of Contents\n\n- [🦌 DeerFlow - 2.0](#-deerflow---20)\n  - [Official Website](#official-website)\n  - [Coding Plan from ByteDance Volcengine](#coding-plan-from-bytedance-volcengine)\n  - [InfoQuest](#infoquest)\n  - [Table of Contents](#table-of-contents)\n  - [One-Line Agent Setup](#one-line-agent-setup)\n  - [Quick Start](#quick-start)\n    - [Configuration](#configuration)\n    - [Running the Application](#running-the-application)\n      - [Deployment Sizing](#deployment-sizing)\n      - [Option 1: Docker (Recommended)](#option-1-docker-recommended)\n      - [Option 2: Local Development](#option-2-local-development)\n    - [Advanced](#advanced)\n      - [Sandbox Mode](#sandbox-mode)\n      - [MCP Server](#mcp-server)\n      - [IM Channels](#im-channels)\n      - [LangSmith Tracing](#langsmith-tracing)\n      - [Langfuse Tracing](#langfuse-tracing)\n      - [Using Both Providers](#using-both-providers)\n  - [From Deep Research to Super Agent Harness](#from-deep-research-to-super-agent-harness)\n  - [Core Features](#core-features)\n    - [Skills \\& Tools](#skills--tools)\n      - [Claude Code Integration](#claude-code-integration)\n    - [Sub-Agents](#sub-agents)\n    - [Sandbox \\& File System](#sandbox--file-system)\n    - [Context Engineering](#context-engineering)\n    - [Long-Term Memory](#long-term-memory)\n  - [Recommended Models](#recommended-models)\n  - [Embedded Python Client](#embedded-python-client)\n  - [Terminal Workbench (TUI)](#terminal-workbench-tui)\n  - [Documentation](#documentation)\n  - [⚠️ Security Notice](#️-security-notice)\n    - [Improper Deployment May Introduce Security Risks](#improper-deployment-may-introduce-security-risks)\n    - [Security Recommendations](#security-recommendations)\n  - [Contributing](#contributing)\n  - [License](#license)\n  - [Acknowledgments](#acknowledgments)\n    - [Key Contributors](#key-contributors)\n  - [Star History](#star-history)\n\n## One-Line Agent Setup\n\nIf you use Claude Code, Codex, Cursor, Windsurf, or another coding agent, you can hand it the setup instructions in one sentence:\n\n```text\nHelp me clone DeerFlow if needed, then bootstrap it for local development by following https://raw.githubusercontent.com/bytedance/deer-flow/main/Install.md\n```\n\nThat prompt is intended for coding agents. It tells the agent to clone the repo if needed, choose Docker when available, and stop with the exact next command plus any missing config the user still needs to provide.\n\n## Quick Start\n\n### Configuration\n\n1. **Clone the DeerFlow repository**\n\n   ```bash\n   git clone https://github.com/bytedance/deer-flow.git\n   cd deer-flow\n   ```\n\n2. **Run the setup wizard**\n\n   From the project root directory (`deer-flow/`), run:\n\n   ```bash\n   make setup\n   ```\n\n   This launches an interactive wizard that guides you through choosing an LLM provider, optional web search, and execution/safety preferences such as sandbox mode, bash access, and file-write tools. It generates a minimal `config.yaml` and writes your keys to `.env`. Takes about 2 minutes.\n\n   The wizard also lets you configure an optional web search provider, or skip it for now.\n\n   Run `make doctor` at any time to verify your setup and get actionable fix hints.\n\n   > **Advanced / manual configuration**: If you prefer to edit `config.yaml` directly, run `make config` instead to copy the full template. See `config.example.yaml` for the complete reference including CLI-backed providers (Codex CLI, Claude Code OAuth), OpenRouter, Responses API, and more.\n\n   <details>\n   <summary>Manual model configuration examples</summary>\n\n   ```yaml\n   models:\n     - name: gpt-4o\n       display_name: GPT-4o\n       use: langchain_openai:ChatOpenAI\n       model: gpt-4o\n       api_key: $OPENAI_API_KEY\n\n     - name: openrouter-gemini-2.5-flash\n       display_name: Gemini 2.5 Flash (OpenRouter)\n       use: langchain_openai:ChatOpenAI\n       model: google/gemini-2.5-flash-preview\n       api_key: $OPENROUTER_API_KEY\n       base_url: https://openrouter.ai/api/v1\n\n     - name: gpt-5-responses\n       display_name: GPT-5 (Responses API)\n       use: langchain_openai:ChatOpenAI\n       model: gpt-5\n       api_key: $OPENAI_API_KEY\n       use_responses_api: true\n       output_version: responses/v1\n\n     - name: qwen3-32b-vllm\n       display_name: Qwen3 32B (vLLM)\n       use: deerflow.models.vllm_provider:VllmChatModel\n       model: Qwen/Qwen3-32B\n       api_key: $VLLM_API_KEY\n       base_url: http://localhost:8000/v1\n       supports_thinking: true\n       when_thinking_enabled:\n         extra_body:\n           chat_template_kwargs:\n             enable_thinking: true\n   ```\n\n   OpenRouter and similar OpenAI-compatible gateways should be configured with `langchain_openai:ChatOpenAI` plus `base_url`. If you prefer a provider-specific environment variable name, point `api_key` at that variable explicitly (for example `api_key: $OPENROUTER_API_KEY`).\n\n   To route OpenAI models through `/v1/responses`, keep using `langchain_openai:ChatOpenAI` and set `use_responses_api: true` with `output_version: responses/v1`.\n\n   For vLLM 0.19.0, use `deerflow.models.vllm_provider:VllmChatModel`. For Qwen-style reasoning models, DeerFlow toggles reasoning with `extra_body.chat_template_kwargs.enable_thinking` and preserves vLLM's non-standard `reasoning` field across multi-turn tool-call conversations. Legacy `thinking` configs are normalized automatically for backward compatibility. Reasoning models may also require the server to be started with `--reasoning-parser ...`. If your local vLLM deployment accepts any non-empty API key, you can still set `VLLM_API_KEY` to a placeholder value.\n\n   CLI-backed provider examples:\n\n   ```yaml\n   models:\n     - name: gpt-5.4\n       display_name: GPT-5.4 (Codex CLI)\n       use: deerflow.models.openai_codex_provider:CodexChatModel\n       model: gpt-5.4\n       supports_thinking: true\n       supports_reasoning_effort: true\n\n     - name: claude-sonnet-4.6\n       display_name: Claude Sonnet 4.6 (Claude Code OAuth)\n       use: deerflow.models.claude_provider:ClaudeChatModel\n       model: claude-sonnet-4-6\n       max_tokens: 4096\n       supports_thinking: true\n   ```\n\n   - Codex CLI reads `~/.codex/auth.json`\n   - Claude Code accepts `CLAUDE_CODE_OAUTH_TOKEN`, `ANTHROPIC_AUTH_TOKEN`, `CLAUDE_CODE_CREDENTIALS_PATH`, or `~/.claude/.credentials.json`\n   - ACP agent entries are separate from model providers — if you configure `acp_agents.codex`, point it at a Codex ACP adapter such as `npx -y @zed-industries/codex-acp`\n   - On macOS, export Claude Code auth explicitly if needed:\n\n   ```bash\n   eval \"$(python3 scripts/export_claude_code_oauth.py --print-export)\"\n   ```\n\n   API keys can also be set manually in `.env` (recommended) or exported in your shell:\n\n   ```bash\n   OPENAI_API_KEY=your-openai-api-key\n   TAVILY_API_KEY=your-tavily-api-key\n   ```\n\n   </details>\n\n### Running the Application\n\n#### Deployment Sizing\n\nUse the table below as a practical starting point when choosing how to run DeerFlow:\n\n| Deployment target | Starting point | Recommended | Notes |\n|---------|-----------|------------|-------|\n| Local evaluation / `make dev` | 4 vCPU, 8 GB RAM, 20 GB free SSD | 8 vCPU, 16 GB RAM | Good for one developer or one light session with hosted model APIs. `2 vCPU / 4 GB` is usually not enough. |\n| Docker development / `make docker-start` | 4 vCPU, 8 GB RAM, 25 GB free SSD | 8 vCPU, 16 GB RAM | Image builds, bind mounts, and sandbox containers need more headroom than pure local dev. |\n| Long-running server / `make up` | 8 vCPU, 16 GB RAM, 40 GB free SSD | 16 vCPU, 32 GB RAM | Preferred for shared use, multi-agent runs, report generation, or heavier sandbox workloads. |\n\n- These numbers cover DeerFlow itself. If you also host a local LLM, size that service separately.\n- Linux plus Docker is the recommended deployment target for a persistent server. macOS and Windows are best treated as development or evaluation environments.\n- If CPU or memory usage stays pinned, reduce concurrent runs first, then move to the next sizing tier.\n\n#### Option 1: Docker (Recommended)\n\n**Development** (hot-reload, source mounts):\n\n```bash\nmake docker-init    # Pull sandbox image (only once or when image updates)\nmake docker-start   # Start services (auto-detects sandbox mode from config.yaml)\n```\n\n`make docker-start` starts `provisioner` only when `config.yaml` uses provisioner mode (`sandbox.use: deerflow.community.aio_sandbox:AioSandboxProvider` with `provisioner_url`).\n\nDocker builds use the upstream `uv` registry by default. If you need faster mirrors in restricted networks, export `UV_INDEX_URL=https://pypi.tuna.tsinghua.edu.cn/simple` and `NPM_REGISTRY=https://registry.npmmirror.com` before running `make docker-init` or `make docker-start`.\n\nBackend processes automatically pick up `config.yaml` changes on the next config access, so model metadata updates do not require a manual restart during development.\n\n> [!TIP]\n> On Linux, if Docker-based commands fail with `permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock`, add your user to the `docker` group and re-login before retrying. See [CONTRIBUTING.md](CONTRIBUTING.md#linux-docker-daemon-permission-denied) for the full fix.\n\n**Production** (builds images locally, mounts runtime config and data):\n\n```bash\nmake up     # Build images and start all production services\nmake down   # Stop and remove containers\n```\n\nAccess: http://localhost:2026\n\nThe unified nginx endpoint i",
    "strategic_keywords": [
      "agent",
      "agents",
      "memory",
      "mcp",
      "rag",
      "skill",
      "llm",
      "workflow"
    ],
    "relationship_label": "Memory 组件",
    "data_confidence": "high",
    "evidence_sources": [
      "github_trending",
      "github_repo_api",
      "readme"
    ],
    "score_breakdown": {
      "heat": 20,
      "relevance": 20,
      "novelty": 15,
      "productize": 14,
      "adoption": 8,
      "relation": 10,
      "risk_signal": 8,
      "total": 95
    },
    "strategic_score": 95
  },
  {
    "owner": "alibaba",
    "name": "page-agent",
    "full_name": "alibaba/page-agent",
    "url": "https://github.com/alibaba/page-agent",
    "description": "JavaScript in-page GUI agent. Control web interfaces with natural language.",
    "language": "TypeScript",
    "total_stars": 20465,
    "forks": 1763,
    "stars_this_period": 1649,
    "source_slice": "all",
    "source_slices": [
      "all",
      "typescript"
    ],
    "metadata": {
      "topics": [
        "agent",
        "ai",
        "ai-agents",
        "browser-automation",
        "javascript",
        "mcp",
        "typescript",
        "web"
      ],
      "license": "MIT",
      "open_issues": 47,
      "created_at": "2025-09-23T09:30:17Z",
      "pushed_at": "2026-06-25T17:07:30Z",
      "homepage": "https://alibaba.github.io/page-agent/",
      "default_branch": "main",
      "forks": 1763,
      "watchers": 62,
      "archived": false,
      "size_kb": 3111
    },
    "readme_content": "# Page Agent\n\n<picture>\n  <source media=\"(prefers-color-scheme: dark)\" srcset=\"https://page-agent.github.io/assets/readme/banner-dark.png\">\n  <img alt=\"Page Agent Banner\" src=\"https://page-agent.github.io/assets/readme/banner-light.png\">\n</picture>\n\n[![License: MIT](https://img.shields.io/badge/License-MIT-auto.svg)](https://opensource.org/licenses/MIT) [![TypeScript](https://img.shields.io/badge/%3C%2F%3E-TypeScript-%230074c1.svg)](http://www.typescriptlang.org/) [![Bundle Size](https://img.shields.io/bundlephobia/minzip/page-agent)](https://bundlephobia.com/package/page-agent) [![Downloads](https://img.shields.io/npm/dt/page-agent.svg)](https://www.npmjs.com/package/page-agent) [![GitHub stars](https://img.shields.io/github/stars/alibaba/page-agent.svg)](https://github.com/alibaba/page-agent)\n\nThe GUI Agent Living in Your Webpage. Control web interfaces with natural language.\n\n🌐 **English** | [中文](./docs/README-zh.md)\n\n<a href=\"https://alibaba.github.io/page-agent/\" target=\"_blank\"><b>🚀 Demo</b></a> | <a href=\"https://alibaba.github.io/page-agent/docs/introduction/overview\" target=\"_blank\"><b>📖 Docs</b></a> | <a href=\"https://news.ycombinator.com/item?id=47264138\" target=\"_blank\"><b>📢 HN Discussion</b></a> | <a href=\"https://x.com/simonluvramen\" target=\"_blank\"><b>𝕏 Follow on X</b></a>\n\n<!-- demo video -->\n\nhttps://github.com/user-attachments/assets/a1f2eae2-13fb-4aae-98cf-a3fc1620a6c2\n\n---\n\n## ✨ Features\n\n- **🎯 Easy integration**\n    - No need for `browser extension` / `python` / `headless browser`.\n    - Just in-page javascript. Everything happens in your web page.\n- **📖 Text-based DOM manipulation**\n    - No screenshots. No multi-modal LLMs or special permissions needed.\n- **🧠 Bring your own LLMs**\n- **🐙 Optional [chrome extension](https://alibaba.github.io/page-agent/docs/features/chrome-extension) for multi-page tasks.**\n    - And an [MCP Server (Beta)](https://alibaba.github.io/page-agent/docs/features/mcp-server) to control it from outside\n\n## 💡 Use Cases\n\n- **SaaS AI Copilot** — Ship an AI copilot in your product in lines of code. No backend rewrite.\n- **Smart Form Filling** — Turn 20-click workflows into one sentence. Perfect for ERP, CRM, and admin systems.\n- **Accessibility** — Make any web app accessible through natural language. Voice commands, screen readers, zero barrier.\n- **Multi-page Agent** — Extend your own web agent's reach across browser tabs [chrome extension](https://alibaba.github.io/page-agent/docs/features/chrome-extension).\n- **MCP** - Allow your agent clients to control your browser.\n\n## 🚀 Quick Start\n\n### One-line integration\n\nFastest way to try PageAgent with our free Demo LLM:\n\n```html\n<script src=\"{URL}\" crossorigin=\"true\"></script>\n```\n\n> **⚠️ For technical evaluation only.** This demo CDN uses our free [testing LLM API](https://alibaba.github.io/page-agent/docs/features/models#free-testing-api). By using it, you agree to its [terms](https://github.com/alibaba/page-agent/blob/main/docs/terms-and-privacy.md).\n\n| Mirrors | URL                                                                                 |\n| ------- | ----------------------------------------------------------------------------------- |\n| Global  | https://cdn.jsdelivr.net/npm/page-agent@1.10.0/dist/iife/page-agent.demo.js         |\n| China   | https://registry.npmmirror.com/page-agent/1.10.0/files/dist/iife/page-agent.demo.js |\n\nAdd `?autoInit=false` to load the script without creating the demo agent automatically. You can then instantiate it with `new window.PageAgent(...)`.\n\n### NPM Installation\n\n```bash\nnpm install page-agent\n```\n\n```javascript\nimport { PageAgent } from 'page-agent'\n\nconst agent = new PageAgent({\n    model: 'qwen3.5-plus',\n    baseURL: 'https://dashscope.aliyuncs.com/compatible-mode/v1',\n    apiKey: 'YOUR_API_KEY',\n    language: 'en-US',\n})\n\nawait agent.execute('Click the login button')\n```\n\nFor more programmatic usage, see [📖 Documentations](https://alibaba.github.io/page-agent/docs/introduction/overview).\n\n## 🌟 Awesome Page Agent\n\nBuilt something cool with PageAgent? Add it here! Open a PR to share your project.\n\n> These are community projects — not maintained or endorsed by us. Use at your own discretion.\n\n| Project  | Description                                                 |\n| -------- | ----------------------------------------------------------- |\n| _Yours?_ | [Open a PR](https://github.com/alibaba/page-agent/pulls) 🙌 |\n\n## 🤝 Contributing\n\nWe welcome contributions from the community! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines and [docs/developer-guide.md](docs/developer-guide.md) for local development workflows.\n\nPlease read the [maintainer's note](https://github.com/alibaba/page-agent/issues/349) on principles and current state.\n\nContributions generated entirely by **bots or AI** without substantial human involvement will **not be accepted**.\n\n## ⚖️ License\n\n[MIT License](LICENSE)\n\n## 👏 Acknowledgments\n\nThis project builds upon the excellent work of **[`browser-use`](https://github.com/browser-use/browser-use)**.\n\n`PageAgent` is designed for **client-side web enhancement**, not server-side automation.\n\n```\nDOM processing components and prompt are derived from browser-use:\n\nBrowser Use <https://github.com/browser-use/browser-use>\nCopyright (c) 2024 Gregor Zunic\nLicensed under the MIT License\n\nWe gratefully acknowledge the browser-use project and its contributors for their\nexcellent work on web automation and DOM interaction patterns that helped make\nthis project possible.\n```\n\n---\n\n**⭐ Star this repo if you find PageAgent helpful!**\n",
    "manifest_file": "package.json",
    "manifest_content": "{\n    \"name\": \"root\",\n    \"private\": true,\n    \"version\": \"1.10.0\",\n    \"type\": \"module\",\n    \"workspaces\": [\n        \"packages/page-controller\",\n        \"packages/ui\",\n        \"packages/llms\",\n        \"packages/core\",\n        \"packages/page-agent\",\n        \"packages/mcp\",\n        \"packages/extension\",\n        \"packages/website\"\n    ],\n    \"description\": \"AI-powered UI agent for web applications\",\n    \"author\": \"Simon<gaomeng1900>\",\n    \"license\": \"MIT\",\n    \"repository\": {\n        \"type\": \"git\",\n        \"url\": \"https://github.com/alibaba/page-agent.git\"\n    },\n    \"homepage\": \"https://alibaba.github.io/page-agent/\",\n    \"engines\": {\n        \"node\": \"^22.22.1 || >=24\",\n        \"npm\": \"^11.6.3\"\n    },\n    \"scripts\": {\n        \"start\": \"npm run dev --workspace=@page-agent/website\",\n        \"dev:ext\": \"npm run dev -w @page-agent/ext\",\n        \"dev:demo\": \"npm run dev:demo --workspace=page-agent\",\n        \"build\": \"node scripts/build.js\",\n        \"build:libs\": \"node scripts/build-libs.js\",\n        \"build:website\": \"npm run build:website --workspace=@page-agent/website\",\n        \"build:ext\": \"npm run zip -w @page-agent/ext\",\n        \"version\": \"node scripts/sync-version.js\",\n        \"postpublish\": \"npm run postpublish --workspaces --if-present\",\n        \"typecheck\": \"tsc --noEmit -p tsconfig.typecheck.json && tsc --noEmit -p packages/extension/tsconfig.json\",\n        \"test\": \"npm test --workspaces --if-present\",\n        \"lint\": \"eslint .\",\n        \"ci\": \"node scripts/ci.js\",\n        \"cleanup\": \"rm -rf packages/*/dist && rm -rf packages/*/.output\",\n        \"prepare\": \"husky || true\"\n    },\n    \"devDependencies\": {\n        \"@commitlint/cli\": \"^21.0.1\",\n        \"@commitlint/config-conventional\": \"^21.0.1\",\n        \"@eslint-react/eslint-plugin\": \"^5.9.2\",\n        \"@eslint/js\": \"^10.0.1\",\n        \"@microsoft/api-extractor\": \"^7.58.9\",\n        \"@tailwindcss/vite\": \"^4.3.1\",\n        \"@trivago/prettier-plugin-sort-imports\": \"^6.0.2\",\n        \"@types/node\": \"^26.0.0\",\n        \"@vitejs/plugin-react\": \"^6.0.2\",\n        \"chalk\": \"^5.6.2\",\n        \"concurrently\": \"^10.0.3\",\n        \"dotenv\": \"^17.4.2\",\n        \"eslint\": \"^10.5.0\",\n        \"globals\": \"^17.7.0\",\n        \"happy-dom\": \"^20.10.6\",\n        \"husky\": \"^9.1.7\",\n        \"lint-staged\": \"^17.0.8\",\n        \"prettier\": \"^3.8.4\",\n        \"typescript\": \"^6.0.3\",\n        \"typescript-eslint\": \"^8.62.0\",\n        \"unplugin-dts\": \"^1.0.1\",\n        \"vite\": \"^8.0.14\",\n        \"vite-plugin-css-injected-by-js\": \"^5.0.1\",\n        \"vitest\": \"^4.1.9\"\n    },\n    \"overrides\": {\n        \"typescript\": \"^6.0.3\",\n        \"node-notifier\": {\n            \"uuid\": \"11.1.1\"\n        },\n        \"web-ext-run\": {\n            \"tmp\": \"0.2.7\"\n        }\n    },\n    \"lint-staged\": {\n        \"*.{js,ts,cjs,cts,mjs,mts}\": [\n            \"npx prettier --write --ignore-unknown\",\n            \"npx eslint --quiet\"\n        ],\n        \"*.{jsx,tsx}\": [\n            \"npx prettier --write --ignore-unknown\",\n            \"npx eslint --quiet\"\n        ],\n        \"*.css\": [\n            \"npx prettier --write --ignore-unknown\"\n        ]\n    },\n    \"commitlint\": {\n        \"extends\": [\n            \"@commitlint/config-conventional\"\n        ],\n        \"rules\": {\n            \"subject-case\": [\n                0,\n                \"never\"\n            ]\n        }\n    },\n    \"prettier\": {\n        \"singleQuote\": true,\n        \"semi\": false,\n        \"useTabs\": true,\n        \"printWidth\": 100,\n        \"trailingComma\": \"es5\",\n        \"plugins\": [\n            \"@trivago/prettier-plugin-sort-imports\"\n        ],\n        \"importOrder\": [\n            \"<THIRD_PARTY_MODULES>\",\n            \"^(@/).*(?<!css)$\",\n            \"^[./].*(?<!css)$\",\n            \".css$\"\n        ],\n        \"importOrderSeparation\": true,\n        \"importOrderSortSpecifiers\": true,\n        \"overrides\": [\n            {\n                \"files\": \"*.md\",\n                \"options\": {\n                    \"useTabs\": false,\n                    \"tabWidth\": 4\n                }\n            },\n            {\n                \"files\": \"*.json\",\n                \"options\": {\n                    \"useTabs\": false,\n                    \"tabWidth\": 4\n                }\n            }\n        ]\n    }\n}\n",
    "strategic_keywords": [
      "agent",
      "agents",
      "mcp",
      "workspace",
      "llm",
      "eval",
      "workflow",
      "automation"
    ],
    "relationship_label": "Skill 来源",
    "data_confidence": "high",
    "evidence_sources": [
      "github_trending",
      "github_repo_api",
      "readme",
      "package.json"
    ],
    "score_breakdown": {
      "heat": 20,
      "relevance": 20,
      "novelty": 15,
      "productize": 14,
      "adoption": 10,
      "relation": 10,
      "risk_signal": 6,
      "total": 95
    },
    "strategic_score": 95
  }
]