GSoC 2025 Project Ideas

Explore our AI-focused project ideas for Google Summer of Code 2025. Build the future of intelligent systems, automation, and secure AI.

15 Projects

90-350 Hours

All Skill Levels

Size:

Difficulty:

Featured Projects

Featured

Large

~350 hours

Advanced

Multi-Agent AI Orchestration System

Design and implement a sophisticated multi-agent AI system that enables autonomous collaboration between specialized AI agents. This system will feature dynamic task delegation, agent communication protocols, and intelligent workflow management for complex problem-solving scenarios.

AI/ML

Infrastructure

TypeScript

Python

LangChain

OpenAI API

Redis

+2 more

Skills Required

AI/ML fundamentals

Distributed systems

API design

Real-time communication

Project Goals

Design agent communication protocol and message passing system

Implement agent registry with capability discovery

Create task decomposition and delegation engine

Build monitoring dashboard for agent interactions

Develop conflict resolution and consensus mechanisms

Implement agent memory and context sharing

Expected Outcome

A production-ready multi-agent framework that can orchestrate 10+ specialized agents for complex tasks with real-time monitoring and fault tolerance.

Mentors

AI Systems Lead

Backend Architecture Lead

Resources

• LangGraph Documentation

• Multi-Agent Systems: A Survey

• OpenAI Function Calling Guide

Featured

Large

~350 hours

Advanced

Hybrid Offline AI Layer (SLM + LLM)

Build a hybrid AI inference layer that intelligently routes requests between local Small Language Models (SLM) and cloud-based Large Language Models (LLM). This ensures privacy-first processing with secure reasoning capabilities, automatic fallback mechanisms, and optimized performance.

AI/ML

Security

Python

Ollama

llama.cpp

ONNX Runtime

FastAPI

+2 more

Skills Required

ML model optimization

Edge computing

API design

Performance tuning

Project Goals

Implement local SLM inference engine using Ollama/llama.cpp

Design intelligent routing logic based on query complexity

Create secure data classification system for PII detection

Build model quantization and optimization pipeline

Implement seamless fallback to cloud LLM when needed

Develop offline-first caching and sync mechanisms

Expected Outcome

A privacy-preserving AI system that processes 70% of requests locally with sub-100ms latency while seamlessly falling back to cloud for complex queries.

Mentors

ML Infrastructure Lead

Security Engineer

Resources

• Ollama Documentation

• llama.cpp Quantization Guide

• ONNX Runtime Optimization

Featured

Large

~350 hours

Intermediate

Visual AI Workflow Automation Engine

Create a no-code/low-code visual workflow builder for AI-powered automation. Users can drag-and-drop AI components, define triggers, and create complex automation pipelines. Includes version control, rollback capabilities, and execution monitoring.

AI/ML

Frontend

Backend

React

TypeScript

React Flow

Node.js

PostgreSQL

+2 more

Skills Required

React/TypeScript

Graph algorithms

Queue systems

UI/UX design

Project Goals

Build visual workflow editor with React Flow

Implement workflow execution engine with Bull Queue

Create AI node library (LLM, Vision, Audio, etc.)

Design version control system for workflows

Build real-time execution monitoring dashboard

Implement conditional branching and error handling

Expected Outcome

A visual workflow platform where users can create and deploy AI automation pipelines in minutes, with full observability and version control.

Mentors

Frontend Lead

DevOps Engineer

Resources

• React Flow Documentation

• Bull Queue Guide

• n8n Architecture Overview

All Project Ideas (15)

Medium

~175 hours

Advanced

Secure AI Reasoning Pipeline with Guardrails

Implement a comprehensive AI safety and security layer that validates, sanitizes, and monitors all AI interactions. Includes prompt injection detection, output filtering, bias detection, and compliance logging for enterprise deployments.

AI/ML

Security

Python

TypeScript

NeMo Guardrails

Presidio

PostgreSQL

+1 more

Skills Required

NLP

Security best practices

Logging/monitoring

Compliance knowledge

Project Goals

Implement prompt injection detection and prevention

Build PII detection and redaction pipeline using Presidio

Create output validation with configurable rules

Design audit logging for compliance (SOC2, GDPR)

Implement rate limiting and abuse detection

Build bias detection and fairness metrics dashboard

Expected Outcome

A plug-and-play security middleware that protects AI applications from common attacks while ensuring compliance and fairness.

Mentors

Security Lead

AI Ethics Researcher

Resources

• NeMo Guardrails Docs

• OWASP AI Security Guide

• Presidio Documentation

Medium

~175 hours

Intermediate

AI-Powered Code Review Assistant

Build an intelligent code review system that analyzes pull requests, suggests improvements, detects potential bugs, and ensures code quality standards. Integrates with GitHub/GitLab and learns from project-specific patterns.

AI/ML

Infrastructure

TypeScript

Python

GitHub API

Tree-sitter

OpenAI API

+1 more

Skills Required

AST parsing

GitHub integrations

Code analysis

ML fine-tuning

Project Goals

Build GitHub/GitLab webhook integration

Implement AST-based code analysis using Tree-sitter

Create intelligent comment generation system

Design learning pipeline from merged PRs

Build customizable rule engine for code standards

Implement auto-fix suggestions for common issues

Expected Outcome

A GitHub App that provides intelligent code review comments, reducing review time by 50% while catching more issues.

Mentors

DevTools Lead

ML Engineer

Resources

• Tree-sitter Documentation

• GitHub Apps Guide

• CodeBERT Paper

Medium

~175 hours

Intermediate

Context-Aware Embedding & RAG System

Design an advanced Retrieval-Augmented Generation (RAG) system with context-aware embeddings, hybrid search capabilities, and intelligent chunking strategies for accurate document retrieval and question answering.

AI/ML

Backend

Python

LangChain

Pinecone/Weaviate

PostgreSQL

FastAPI

+1 more

Skills Required

Vector databases

NLP

Information retrieval

API design

Project Goals

Implement intelligent document chunking strategies

Build hybrid search (semantic + keyword) system

Create context-aware re-ranking pipeline

Design multi-modal embedding support (text, images)

Implement query expansion and reformulation

Build evaluation pipeline with ground truth datasets

Expected Outcome

A production-ready RAG system achieving 90%+ retrieval accuracy with sub-second query latency.

Mentors

ML Engineer

Data Engineer

Resources

• LangChain RAG Guide

• Pinecone Best Practices

• MTEB Benchmark

Medium

~175 hours

Advanced

Real-Time AI Inference Optimization

Optimize AI model inference for real-time applications through techniques like model quantization, batching, caching, and speculative decoding. Focus on reducing latency while maintaining accuracy.

AI/ML

Infrastructure

Python

CUDA

TensorRT

vLLM

Triton Inference Server

+1 more

Skills Required

GPU programming

ML optimization

Systems programming

Benchmarking

Project Goals

Implement model quantization pipeline (INT8, INT4)

Build dynamic batching system for inference

Create speculative decoding implementation

Design KV-cache optimization strategies

Implement request scheduling and prioritization

Build comprehensive benchmarking suite

Expected Outcome

Achieve 3x inference speedup with <5% accuracy loss, documented with comprehensive benchmarks.

Mentors

ML Infrastructure Lead

GPU Engineer

Resources

• vLLM Paper

• TensorRT Optimization Guide

• Flash Attention Paper

Medium

~175 hours

Intermediate

Long-Term AI Memory & Knowledge Graph

Implement a persistent memory system for AI agents that enables long-term context retention, relationship tracking, and knowledge graph construction from conversations and interactions.

AI/ML

Backend

Python

Neo4j

LangChain

PostgreSQL

FastAPI

+1 more

Skills Required

Graph databases

NLP

Knowledge representation

API design

Project Goals

Design memory schema for episodic and semantic memory

Implement automatic entity extraction and linking

Build knowledge graph from conversation history

Create memory consolidation and forgetting mechanisms

Design context retrieval with relevance scoring

Implement memory visualization dashboard

Expected Outcome

A memory system that enables AI agents to recall relevant context from 10,000+ past interactions with 95%+ precision.

Mentors

AI Research Lead

Database Engineer

Resources

• Neo4j Graph Algorithms

• MemGPT Paper

• Entity Linking Survey

Large

~350 hours

Advanced

Privacy-Preserving Federated Learning

Build a federated learning infrastructure that enables model training across distributed data sources without centralizing sensitive data. Implements differential privacy and secure aggregation.

AI/ML

Security

Python

PySyft

TensorFlow Federated

gRPC

Docker

+1 more

Skills Required

Federated learning

Cryptography

Distributed systems

Privacy engineering

Project Goals

Implement federated averaging algorithm

Build secure aggregation protocol

Add differential privacy mechanisms

Create client SDK for edge devices

Design model versioning and deployment pipeline

Build monitoring dashboard for training progress

Expected Outcome

A federated learning platform enabling privacy-preserving model training across 100+ distributed nodes.

Mentors

ML Research Lead

Cryptography Expert

Resources

• TensorFlow Federated Guide

• PySyft Documentation

• Differential Privacy Paper

Medium

~175 hours

Intermediate

AI/LLM Observability & Tracing Platform

Create a comprehensive observability platform for AI applications with request tracing, cost tracking, latency monitoring, and quality metrics. Provides insights for debugging and optimization.

AI/ML

Infrastructure

TypeScript

OpenTelemetry

ClickHouse

Grafana

React

+1 more

Skills Required

Observability

Data visualization

API design

Performance analysis

Project Goals

Implement OpenTelemetry-based request tracing

Build cost tracking and token usage analytics

Create latency breakdown visualization

Design quality metrics (hallucination detection, coherence)

Implement alerting and anomaly detection

Build comparison tools for A/B testing prompts

Expected Outcome

An observability platform providing full visibility into AI application performance with actionable insights.

Mentors

Platform Engineer

Data Analyst

Resources

• OpenTelemetry Docs

• LangSmith Architecture

• ClickHouse Best Practices

Small

~90 hours

Intermediate

AI/LLM Testing & Evaluation Framework

Build a comprehensive testing framework for AI applications with automated evaluation, regression testing, prompt versioning, and benchmark suites for consistent quality assurance.

AI/ML

Infrastructure

Python

TypeScript

pytest

Jest

PostgreSQL

+1 more

Skills Required

Testing methodologies

CI/CD

NLP evaluation

API design

Project Goals

Design evaluation metrics for LLM outputs

Implement prompt regression testing system

Create benchmark suite with standard datasets

Build automated test generation from examples

Implement A/B testing infrastructure

Create CI/CD integration for continuous evaluation

Expected Outcome

A testing framework that catches 95% of prompt regressions before production deployment.

Mentors

QA Lead

ML Engineer

Resources

• DeepEval Documentation

• HELM Benchmark

• LLM Evaluation Survey

Small

~90 hours

Beginner Friendly

Natural Language to API Gateway

Create a system that translates natural language commands into structured API calls, enabling users to interact with complex systems using plain English. Includes intent parsing, parameter extraction, and execution.

AI/ML

Frontend

TypeScript

OpenAI API

React

Node.js

PostgreSQL

Skills Required

NLP basics

API design

React

Prompt engineering

Project Goals

Implement natural language intent classification

Build parameter extraction pipeline

Create API schema registry and matching

Design conversation context management

Implement confirmation and clarification flows

Build user-friendly chat interface

Expected Outcome

A chat interface that accurately translates 85%+ of natural language queries into correct API calls.

Mentors

Frontend Lead

NLP Engineer

Resources

• Function Calling Guide

• NL2Code Survey

• React Chat UI Libraries

Small

~90 hours

Beginner Friendly

AI-Powered Contributor-Issue Matching

Build an intelligent system that matches open source contributors with suitable issues based on their skills, experience, and interests. Uses ML to analyze past contributions and predict good matches.

AI/ML

Backend

Python

TypeScript

GitHub API

PostgreSQL

Sentence Transformers

Skills Required

GitHub API

Basic ML

API design

Database design

Project Goals

Build contributor profile from GitHub activity

Implement issue embedding and similarity search

Create skill extraction from commit history

Design recommendation ranking algorithm

Build API for integration with existing systems

Implement feedback loop for improving matches

Expected Outcome

A matching system that increases first-time contributor success rate by 40%.

Mentors

Backend Lead

ML Engineer

Resources

• GitHub API Documentation

• Recommendation Systems Survey

• Sentence Transformers

Small

~90 hours

Beginner Friendly

AI Documentation Generator & Updater

Create a tool that automatically generates and keeps documentation up-to-date by analyzing code changes, extracting docstrings, and generating human-readable documentation with examples.

AI/ML

Infrastructure

TypeScript

Python

Tree-sitter

OpenAI API

GitHub Actions

+1 more

Skills Required

AST parsing

Documentation tools

GitHub Actions

Technical writing

Project Goals

Implement code analysis using Tree-sitter

Build docstring extraction and enhancement

Create change detection and diff analysis

Generate usage examples from test files

Build GitHub Action for automated updates

Design interactive documentation preview

Expected Outcome

A documentation tool that keeps docs 90%+ in sync with code changes automatically.

Mentors

Developer Experience Lead

Technical Writer

Resources

• Tree-sitter Documentation

• Sphinx/MkDocs Guides

• Technical Writing Best Practices

Small

~90 hours

Beginner Friendly

Prompt Engineering & Versioning Platform

Build a collaborative platform for prompt engineering with version control, A/B testing, performance analytics, and team collaboration features. Think "GitHub for prompts".

AI/ML

Frontend

TypeScript

React

PostgreSQL

Redis

OpenAI API

Skills Required

React

Database design

Version control concepts

UI/UX

Project Goals

Build prompt editor with syntax highlighting

Implement version control with branching

Create A/B testing infrastructure

Design analytics dashboard for prompt performance

Build team collaboration features

Implement prompt templates and variables

Expected Outcome

A platform that enables teams to iterate on prompts 3x faster with full version history and analytics.

Mentors

Frontend Lead

Product Manager

Resources

• PromptLayer Architecture

• Git Internals

• React Query Documentation

Ready to Apply?

Check out our contributor guide to learn how to get started, set up your development environment, and submit a successful GSoC proposal.