Speak Smart Flask App

Real-Time Speech Recognition and Feedback Tool

Overview

Developed a Flask web application to improve public speaking by providing real-time speech transcription and feedback. Integrated Google’s Speech-to-Text API for real-time audio transcription and leveraged GPT-based models for detailed feedback on grammar, clarity, and delivery.

Real-Time Transcription: Implemented live audio streaming and transcription with confidence score updates using Google’s Speech-to-Text API.
AI-Driven Feedback: Integrated GPT models and LanguageTool for real-time feedback, including grammar correction and improvement suggestions.
User-Centric Design: Built a user-friendly interface with Flask, enabling users to record, review, and improve their speaking performance with a seamless feedback loop.
Real-World Applications: Targeted use cases include public speaking practice, educational support for language learners, meeting transcriptions, and accessibility features for individuals with hearing impairments.

Data Processing Pipeline

Audio Capture:

Recorded and streamed in real-time.
Segmented into chunks for better processing.

Speech Recognition:

Audio chunks are sent to Google Cloud Speech-to-Text API.
Responses include transcripts and confidence scores.

Feedback Analysis:

Text analyzed for grammatical issues and contextual accuracy.
Feedback generated using OpenAI and LanguageTool.

Speak Smart Flask App

Real-Time Speech Recognition and Feedback Tool

Overview

Data Processing Pipeline

Result