# CleftAI Audio Transcription & Text Summarization API

## Overview
CleftAI provides a production-ready API for audio transcription and text summarization, built on OpenAI's Whisper and GPT-4o models. The service converts audio files and text into formatted notes, generates exactly 3 comma-separated tags for each note, and supports merging notes, appending to notes, and updating notes. All endpoints use API key authentication, and all jobs are processed asynchronously.

## Base URL
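`https://api.cleftai.com/api`

The endpoint paths listed below are written in full and already include the `/api` prefix, so `POST /api/audio/process` resolves to `https://api.cleftai.com/api/audio/process`.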

## Authentication
All endpoints require API key authentication using one of these methods:
- Authorization header: `Authorization: Bearer sk-proj-cleftai-your-api-key-here`
- Custom header: `X-API-Key: sk-proj-cleftai-your-api-key-here`
- Query parameter: `?api_key=sk-proj-cleftai-your-api-key-here`

API keys use an OpenAI-style prefix, `sk-proj-cleftai-*`, and are issued and managed manually for security.
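
The three authentication methods above can be sketched in Python as a small helper. This is only an illustration: the function name `auth_for` and its `method` labels are hypothetical, and the prefix check simply mirrors the `sk-proj-cleftai-*` pattern described above.

```python
from urllib.parse import urlencode

API_KEY_PREFIX = "sk-proj-cleftai-"


def auth_for(api_key: str, method: str = "bearer") -> tuple[dict[str, str], str]:
    """Return (headers, query_string) for one of the three documented auth methods."""
    if not api_key.startswith(API_KEY_PREFIX):
        raise ValueError(f"API key should start with {API_KEY_PREFIX!r}")
    if method == "bearer":
        # Authorization header
        return {"Authorization": f"Bearer {api_key}"}, ""
    if method == "header":
        # Custom X-API-Key header
        return {"X-API-Key": api_key}, ""
    if method == "query":
        # api_key query parameter, URL-encoded
        return {}, "?" + urlencode({"api_key": api_key})
    raise ValueError(f"unknown auth method: {method!r}")
```

Only one method is needed per request; the helper returns an empty dict or empty string for the parts that are not used.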

## API Endpoints

### Audio Processing
**POST /api/audio/process**
- Uploads audio files for transcription using OpenAI Whisper
- Formats transcripts into organized notes with automatic tag generation (exactly 3 tags)
- Supported formats: mp3, wav, m4a, mp4, aac, ogg, webm (max 25MB)
- Parameters: audio_file (required), custom_instructions (optional), language (optional), webhook_url (optional)
- Returns: job_id for asynchronous processing
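
A client can check the documented format and size limits before uploading, which avoids a round trip for files the service would reject. The sketch below is a client-side approximation only: `check_audio_file` is a hypothetical helper, it inspects the file extension rather than the actual audio content, and it assumes "25MB" means 25 × 1024 × 1024 bytes.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {"mp3", "wav", "m4a", "mp4", "aac", "ogg", "webm"}
MAX_AUDIO_BYTES = 25 * 1024 * 1024  # assumption: "25MB" is interpreted as 25 MiB


def check_audio_file(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    # Compare the extension case-insensitively, without the leading dot.
    extension = Path(filename).suffix.lower().lstrip(".")
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'none'}")
    if size_bytes > MAX_AUDIO_BYTES:
        problems.append(f"file is {size_bytes} bytes; limit is {MAX_AUDIO_BYTES}")
    return problems
```

The server-side validation remains authoritative; this check only catches the obvious cases early.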

### Text Processing
**POST /api/text/summarize**
- Processes text directly for voice memo formatting and tag generation (exactly 3 tags)
- Parameters: text (required), custom_instructions (optional), webhook_url (optional)
- Returns: job_id for asynchronous processing
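
As an illustration of the request shape, the following Python sketch builds (but does not send) a request for this endpoint using only the standard library. The function name `build_summarize_request` is hypothetical, and omitting unset optional parameters from the body is an assumption about what the service accepts.

```python
import json
from urllib.request import Request

SUMMARIZE_URL = "https://api.cleftai.com/api/text/summarize"


def build_summarize_request(api_key, text, custom_instructions=None, webhook_url=None):
    """Build (but do not send) a POST request for /api/text/summarize."""
    if not text:
        raise ValueError("text is required")
    payload = {"text": text}
    # Assumption: optional parameters are left out of the body when not provided.
    if custom_instructions is not None:
        payload["custom_instructions"] = custom_instructions
    if webhook_url is not None:
        payload["webhook_url"] = webhook_url
    return Request(
        SUMMARIZE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the response is expected to carry the `job_id` described above.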

### Note Merging
**POST /api/notes/merge**
- Combines multiple existing notes into a single cohesive document
- Parameters: note_ids (array of UUIDs, min 2), custom_instructions (optional), webhook_url (optional)
- Returns: job_id for asynchronous processing
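
Because `note_ids` must be an array of at least 2 UUIDs, it can help to validate the list before submitting a merge job. The helper below is a hypothetical sketch; dropping duplicate IDs before counting is an assumption, since the API's handling of repeated IDs is not documented here.

```python
import uuid


def build_merge_payload(note_ids, custom_instructions=None):
    """Validate note IDs and return the JSON body for /api/notes/merge."""
    # uuid.UUID raises ValueError on malformed input; str() gives the canonical lowercase form.
    normalized = [str(uuid.UUID(note_id)) for note_id in note_ids]
    # Assumption: duplicates are removed (order preserved) before the minimum is checked.
    unique_ids = list(dict.fromkeys(normalized))
    if len(unique_ids) < 2:
        raise ValueError("note_ids must contain at least 2 distinct UUIDs")
    payload = {"note_ids": unique_ids}
    if custom_instructions is not None:
        payload["custom_instructions"] = custom_instructions
    return payload
```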

### Note Appending
**POST /api/notes/append**
- Transcribes an audio file and returns formatted content for the calling application to append to an existing note
- Parameters: target_note_id (UUID), audio_file (required), custom_instructions (optional), language (optional), webhook_url (optional)
- Returns: job_id for asynchronous processing, along with target_note_id so the application can append the result on its side
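
Since appending happens on the application side, the client has to combine the returned content with its own stored note. A minimal sketch of that step follows, assuming the completed job result carries the formatted content in `data.summary` and echoes `target_note_id` inside `data`; both the field placement and the helper name `append_to_note` are assumptions, not confirmed behavior.

```python
def append_to_note(existing_markdown: str, job_result: dict, target_note_id: str) -> str:
    """Append a completed append job's formatted content to a locally stored note."""
    if job_result.get("status") != "completed":
        raise ValueError("job has not completed")
    data = job_result["data"]
    # Assumption: target_note_id is echoed inside "data"; if absent, the check is skipped.
    returned_id = data.get("target_note_id")
    if returned_id is not None and returned_id != target_note_id:
        raise ValueError("job result belongs to a different note")
    # Separate the old and new content with exactly one blank line.
    return existing_markdown.rstrip() + "\n\n" + data["summary"].strip() + "\n"
```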

### Note Updating
**POST /api/notes/update**
- Reprocesses an existing note with fresh AI analysis and custom instructions
- Updates content while preserving the same note UUID
- Parameters: note_id (UUID), custom_instructions (optional), webhook_url (optional)
- Returns: job_id for asynchronous processing; the completed job result includes `"reprocessed": true`

### Job Status Tracking
**GET /api/jobs/{job_id}**
- Retrieves processing status and results for any job
- Returns: job status, completion data, note_id, tags (exactly 3), formatted content
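
Clients that do not use webhooks typically poll this endpoint until the job finishes. The loop below is a sketch under stated assumptions: `fetch_status` is a caller-supplied function that performs the `GET /api/jobs/{job_id}` call and returns the parsed JSON, `"completed"` is the only status value confirmed by this document, and the `"failed"` status name is a guess. The default timeout matches the 10-minute limit for audio jobs.

```python
import time


def wait_for_job(fetch_status, job_id, timeout_s=600.0, interval_s=2.0, sleep=time.sleep):
    """Poll fetch_status(job_id) until the job completes or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_status(job_id)
        status = job.get("status")
        if status == "completed":
            return job
        if status == "failed":  # assumed status name; not confirmed by this document
            raise RuntimeError(f"job {job_id} failed: {job}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout_s}s")
        sleep(interval_s)
```

Injecting `fetch_status` and `sleep` keeps the loop independent of any particular HTTP library and makes it easy to exercise without network access.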

### Authentication Status
**GET /api/auth/status**
- Verifies API key validity and returns authentication status
- Returns: authentication confirmation and truncated API key info

## Response Format
A successfully completed processing job returns a result of this shape:
```json
{
  "success": true,
  "job_id": "unique-job-identifier",
  "status": "completed",
  "data": {
    "note_id": "unique-note-uuid",
    "tags": "meeting, project, deadline",
    "summary": "# Formatted Notes\n\n## Key Points\n- Point 1\n- Point 2\n\n## Action Items\n- [ ] Task 1\n- [ ] Task 2",
    "transcription": "Full transcription (audio jobs only)",
    "processing_info": {
      "whisper_model": "whisper-1",
      "gpt_model": "gpt-4o",
      "processing_time": 4.2,
      "word_count": 150
    },
    "reprocessed": false
  }
}
```

For update operations, the response includes `"reprocessed": true` to indicate the note was reprocessed with fresh AI analysis.
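
To show how a client might consume this response, the following sketch converts it into a typed object and splits the comma-separated `tags` string into a list. The `NoteResult` class and `parse_job_result` function are hypothetical names; the field names come from the example response above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NoteResult:
    note_id: str
    tags: list[str]
    summary: str
    transcription: Optional[str]
    reprocessed: bool


def parse_job_result(response: dict) -> NoteResult:
    """Convert a completed job response into a NoteResult."""
    if not response.get("success") or response.get("status") != "completed":
        raise ValueError("job did not complete successfully")
    data = response["data"]
    # Tags arrive as one comma-separated string, e.g. "meeting, project, deadline".
    tags = [tag.strip() for tag in data["tags"].split(",")]
    if len(tags) != 3:
        raise ValueError(f"expected exactly 3 tags, got {len(tags)}")
    return NoteResult(
        note_id=data["note_id"],
        tags=tags,
        summary=data["summary"],
        transcription=data.get("transcription"),  # present for audio jobs only
        reprocessed=bool(data.get("reprocessed", False)),
    )
```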

## Key Features
- **UUID Note System**: Every processed note receives a unique UUID for identification and future operations
- **Asynchronous Processing**: All jobs are processed asynchronously, with status and results available from `GET /api/jobs/{job_id}`
- **Custom Instructions**: Users can provide specific formatting instructions for tailored output
- **Language Support**: Optional language parameter for improved Whisper transcription accuracy (e.g., 'en', 'es', 'fr', 'de')
- **Webhook Support**: Optional webhook notifications for job completion
- **Voice Memo Formatting**: Specialized prompts convert transcripts to first-person note format
- **Automatic Tag Generation**: Creates exactly 3 relevant tags in comma-separated format for content categorization
- **Markdown Output**: Proper formatting with headings, checkboxes, and bullet points
- **Note Operations**: Merge multiple notes, append content to existing notes, or update notes with fresh AI processing
- **File Validation**: Audio format and size validation with detailed error messages and format detection

## Rate Limits & Restrictions
- Audio files: 25MB maximum size
- Processing timeout: 10 minutes for audio jobs, 2 minutes for text jobs
- Supported audio formats: mp3, wav, m4a, mp4, aac, ogg, webm
- Rate limits may apply based on API key tier

## Example Usage

### Audio Processing
```bash
curl -X POST https://api.cleftai.com/api/audio/process \
  -H "Authorization: Bearer sk-proj-cleftai-your-api-key-here" \
  -F "audio_file=@meeting.mp3" \
  -F "custom_instructions=Focus on action items and decisions" \
  -F "language=en"
```

### Text Processing
```bash
curl -X POST https://api.cleftai.com/api/text/summarize \
  -H "Authorization: Bearer sk-proj-cleftai-your-api-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Meeting discussion about project timeline and budget",
    "custom_instructions": "Create organized notes with checkboxes"
  }'
```

### Note Merging
```bash
curl -X POST https://api.cleftai.com/api/notes/merge \
  -H "Authorization: Bearer sk-proj-cleftai-your-api-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "note_ids": ["uuid1", "uuid2", "uuid3"],
    "custom_instructions": "Combine into cohesive summary"
  }'
```

### Note Appending
```bash
curl -X POST https://api.cleftai.com/api/notes/append \
  -H "Authorization: Bearer sk-proj-cleftai-your-api-key-here" \
  -F "audio_file=@additional_content.mp3" \
  -F "target_note_id=existing-note-uuid" \
  -F "custom_instructions=Format as bullet points" \
  -F "language=es"
```

### Job Status Check
```bash
curl -X GET https://api.cleftai.com/api/jobs/job-id-here \
  -H "Authorization: Bearer sk-proj-cleftai-your-api-key-here"
```

### Note Updating
```bash
curl -X POST https://api.cleftai.com/api/notes/update \
  -H "Authorization: Bearer sk-proj-cleftai-your-api-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "note_id": "existing-note-uuid",
    "custom_instructions": "Focus more on actionable insights"
  }'
```

## Technical Implementation
- **Backend**: Express.js with TypeScript
- **AI Models**: OpenAI Whisper-1 for transcription, GPT-4o for text processing
- **Database**: PostgreSQL with Drizzle ORM
- **Authentication**: API key-based with multiple authentication methods
- **File Processing**: Multer for multipart uploads with validation
- **Error Handling**: Comprehensive error responses with detailed messages

## Use Cases
- Voice memo transcription and formatting
- Meeting notes automation
- Interview transcript processing
- Lecture recording summarization
- Note organization and management
- Content consolidation from multiple sources
- Research note compilation

This API is designed for developers building applications that need reliable audio transcription and intelligent text formatting, with API key authentication and asynchronous job processing built in.