Speak Smart Flask App
Real-Time Speech Recognition and Feedback Tool
Overview
Developed a Flask web application to improve public speaking by providing real-time speech transcription and feedback. Integrated Google’s Speech-to-Text API for real-time audio transcription and leveraged GPT-based models for detailed feedback on grammar, clarity, and delivery.
- Real-Time Transcription: Implemented live audio streaming and transcription with confidence score updates using Google’s Speech-to-Text API.
- AI-Driven Feedback: Integrated GPT models and LanguageTool for real-time feedback, including grammar correction and improvement suggestions.
- User-Centric Design: Built a user-friendly interface with Flask, enabling users to record, review, and improve their speaking performance with a seamless feedback loop.
- Real-World Applications: Targeted use cases include public speaking practice, educational support for language learners, meeting transcriptions, and accessibility features for individuals with hearing impairments.
Data Processing Pipeline
Audio Capture:
- Recorded and streamed in real-time.
- Segmented into chunks for better processing.
Speech Recognition:
- Audio chunks are sent to Google Cloud Speech-to-Text API.
- Responses include transcripts and confidence scores.
Feedback Analysis:
- Text analyzed for grammatical issues and contextual accuracy.
- Feedback generated using OpenAI and LanguageTool.
Result