← Back to Projects
Featured
2024
AI Vision

Screenshot-to-Code

AI-powered design to code converter using GPT-4 Vision and Claude Sonnet

2024
Year
AI Vision
Core Tech
React
Frontend
Python
Backend
Project Overview

Screenshot-to-Code is a self-hosted fork of the open-source abi/screenshot-to-code project, extended with custom features and deployed on TreeTank infrastructure. It transforms design screenshots into production-ready code — upload an image of any UI design and get clean, functional code in React, Vue, or HTML, powered by GPT-4 Vision and Claude Sonnet.

The fork adds custom deployment via Docker + Traefik with automatic SSL, integrated into the TreeTank CI/CD pipeline for continuous updates.

The Problem

Converting design mockups into code is time-consuming and repetitive. Developers spend hours recreating visual designs in HTML/CSS/JS, often struggling to match exact spacing, colors, and layouts. Designers and non-technical founders face barriers prototyping ideas without coding skills.

Key Challenges:

  • Manual translation from design to code is slow
  • Maintaining design-code consistency requires constant updates
  • Non-developers can't quickly prototype UI ideas
  • Existing tools require structured design files (Figma, Sketch)
The Solution

Screenshot-to-Code leverages AI vision models to analyze screenshots and generate corresponding code. The system understands visual layouts, component hierarchies, styling, and generates clean, semantic code ready for production use.

Core Features:

  • Upload any UI screenshot—from design tools, websites, or hand-drawn sketches
  • Generate code in multiple frameworks: React, Vue, HTML
  • Powered by GPT-4 Vision and Claude Sonnet for accurate interpretation
  • Clean, semantic output with modern CSS and component structure
  • Live preview and instant code generation in seconds
Technical Stack
Technologies powering the platform

AI & Machine Learning

GPT-4 Vision
Claude Sonnet
OpenAI API
Anthropic API

Dual AI model support for maximum accuracy. GPT-4 Vision excels at complex layouts, while Claude Sonnet provides excellent semantic understanding.

Frontend

React
TypeScript
Tailwind CSS
Vite

Modern React application with TypeScript for type safety and Tailwind for rapid UI development.

Backend

Python
FastAPI
WebSockets

Python backend with FastAPI for high-performance API and real-time streaming via WebSockets.

Infrastructure

Docker
Docker Compose
Traefik
Let's Encrypt

Containerized deployment with Traefik reverse proxy and automatic SSL certificates.

Key Learnings
Insights from building with AI vision models
  • Prompt Engineering Matters - Carefully crafted prompts significantly improve code quality and consistency
  • Model Selection is Context-Dependent - Different models excel at different design patterns; GPT-4 for complex layouts, Claude for semantic HTML
  • Real-time Streaming Enhances UX - WebSocket streaming makes AI generation feel instant and engaging
  • Vision Models Understand Design Intent - Modern AI can interpret not just pixels but semantic structure and hierarchy
Results & Impact

Speed & Efficiency

Reduced design-to-code time from hours to seconds. Enables rapid prototyping and iteration cycles.

Accessibility

Empowers non-developers to create functional prototypes from visual designs without coding knowledge.

Production Deployment

Live at s2c.treetank.net with automated CI/CD pipeline

Open Innovation

Demonstrates practical AI vision applications beyond typical use cases