370K Downloads in 14 Weeks: How We Built Arré Voice on Flutter
Every article about building social audio apps is theoretical. "Choose a real-time SDK." "Consider cross-platform." "Plan for moderation." Great. But nobody shows you what it actually looks like when a funded media company hands you a product spec, a 14-week window, and tells you the launch is already announced.
This is that story.
Arré (U Digital Content Pvt. Ltd.) is one of India's serious digital media companies — $10.2M in funding from Enam Holdings, a catalogue of original content, and a team that understood the social audio wave before most agencies in India were even Googling "Clubhouse alternative." In 2022, they identified a gap: no Indian platform was building social audio specifically for women creators. The opportunity was real. The deadline was fixed.
The Brief We Received
Vimal Kumar, VP Product at Arré, laid out the vision clearly: a social audio platform with two core content formats. First, Voicepods — 30-second audio clips with creator-recorded background music, voice effects, and one-tap publishing. Second, Jampods — live group audio rooms where up to 50 speakers could participate with thousands of listeners. Layer on top: a women-first moderation system, a personalised content discovery feed, and infrastructure capable of handling a celebrity launch (Ankur Tewari from Jam 8 / Pritam's studio was already lined up).
The hard constraint: 14 weeks from wireframes to production on both iOS and Android.
Building two separate native apps — Swift for iOS, Kotlin for Android — was never a realistic option under this timeline. We needed a single codebase that delivered native performance on both platforms. The answer was Flutter.
Why Flutter, and What That Decision Actually Costs You
We've shipped apps in both React Native and Flutter. The honest comparison: React Native gets you moving faster if your team already lives in JavaScript, but you pay for it in the audio stack. When we've used React Native for audio-heavy features in other projects, we've consistently run into bridge latency: the JS thread and the native thread communicate asynchronously, which shows up as subtle jank in real-time audio UI.
Flutter compiles Dart ahead of time to native machine code for release builds and reaches native SDKs through platform channels. For real-time audio, this matters. The audio itself is captured, encoded, and played inside the native Agora SDK, and the UI is not pushing data across a JS bridge every frame.
For Arré Voice, we evaluated Agora SDK integration on both frameworks. The Flutter path was cleaner. Agora's agora_rtc_engine Flutter plugin had already reached production maturity by 2022, with documented support for multiple concurrent broadcasters in a single channel. The decision was straightforward.
The Flutter codebase also reduced our effective timeline by roughly 30% versus two native builds — which is what made 14 weeks possible at all.
The Architecture We Built
Real-Time Audio: Agora + WebSocket
The core of Jampods runs on Agora SDK. Each live room is an Agora RTC channel. We handled the 50-concurrent-speaker ceiling by building a channel-joining queue — speakers request to join, hosts approve via a hand-raise UI, the system moves them from audience to broadcaster role in the Agora channel in real time.
Room presence data (who's online, who's speaking, live reactions, hand-raise queue position) runs on WebSocket connections maintained separately from the Agora audio stream. This separation was intentional: audio quality should never degrade because the UI is updating. Two independent channels, two independent failure domains.
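```dart
// Simplified role management for an Agora channel.
// A user stays in the audience while their hand-raise is pending and
// becomes a broadcaster once the host approves.
await _agoraEngine.setClientRole(
  role: pendingApproval
      ? ClientRoleType.clientRoleAudience
      : ClientRoleType.clientRoleBroadcaster,
);
```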
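The hand-raise flow that feeds this role switch can be sketched as a small in-memory queue. This is a minimal illustration, not the production code: `SpeakerQueue`, `raiseHand`, `approveNext`, and `removeSpeaker` are hypothetical names, and only the 50-speaker ceiling comes from the brief.

```dart
import 'dart:collection';

/// Hypothetical hand-raise queue for one live room (illustrative only).
class SpeakerQueue {
  SpeakerQueue({this.maxSpeakers = 50});

  /// Speaker ceiling per room; 50 is the figure from the product brief.
  final int maxSpeakers;

  final Queue<String> _pending = Queue<String>(); // FIFO of user IDs
  final Set<String> _speakers = <String>{};

  /// Adds a listener to the queue and returns their 1-based position.
  /// Returns 0 if the user is already a speaker.
  int raiseHand(String userId) {
    if (_speakers.contains(userId)) return 0;
    if (!_pending.contains(userId)) _pending.addLast(userId);
    return _pending.toList().indexOf(userId) + 1;
  }

  /// Host approval: promotes the next user in line, or returns null when
  /// the queue is empty or the room is at its speaker ceiling.
  String? approveNext() {
    if (_pending.isEmpty || _speakers.length >= maxSpeakers) return null;
    final userId = _pending.removeFirst();
    _speakers.add(userId);
    return userId;
  }

  /// Frees a speaker slot when someone leaves or returns to the audience.
  void removeSpeaker(String userId) => _speakers.remove(userId);

  int get speakerCount => _speakers.length;
  int get pendingCount => _pending.length;
}
```

In a real room, the ID returned by `approveNext` would trigger the role change on that user's device, and the queue position returned by `raiseHand` would be broadcast over the presence connection rather than held only in memory.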
Creator Studio: The Feature Users Actually Loved
The Voicepod creator flow was technically the most complex piece, and it's the feature that drove most of the early organic downloads.
We built a mobile-first recording/editing pipeline with four layers: raw voice capture, background music from Jam 8's licensed track catalogue, voice effects (reverb, pitch shift), and volume mixing with visual waveform feedback. The 30-second countdown timer with live visual feedback was a deliberate UX decision — short-form audio needed the same creator guardrails that TikTok built for video.
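To make the volume-mixing layer concrete, here is a minimal sketch of combining a voice track and a music track. It assumes both tracks are mono 16-bit PCM at the same sample rate. `mixTracks` and its default gain values are hypothetical, and the shipped pipeline may well do this work in native audio code rather than in Dart.

```dart
import 'dart:math' as math;
import 'dart:typed_data';

/// Mixes two 16-bit PCM buffers (mono, same sample rate) with per-track gain.
/// Hypothetical helper for illustration; not the production pipeline.
Int16List mixTracks(
  Int16List voice,
  Int16List music, {
  double voiceGain = 1.0,
  double musicGain = 0.4,
}) {
  final length = math.max(voice.length, music.length);
  final out = Int16List(length);
  for (var i = 0; i < length; i++) {
    // Treat the shorter track as silence once it runs out.
    final v = i < voice.length ? voice[i] * voiceGain : 0.0;
    final m = i < music.length ? music[i] * musicGain : 0.0;
    // Clamp to the signed 16-bit range so loud passages clip instead of
    // wrapping around to the opposite sign.
    out[i] = (v + m).round().clamp(-32768, 32767).toInt();
  }
  return out;
}
```

Hard clamping is the simplest way to keep the sum in range. A production mixer would more likely apply a limiter or normalise gains to avoid audible clipping.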
Publishing is one tap from the studio straight to the feed, and our backend automatically generates transcription tags that feed the recommendation engine.
Discovery: Making the First Session Feel Personalised
Most recommendation systems only get useful after you've accumulated user data. We needed Arré Voice to feel personalised from the first session.
During onboarding, we collect creator category preferences and audio mood preferences. These seed a content-based filter that runs before any collaborative filtering data exists. Once users generate engagement signals (listens, reactions, follows), the system shifts weight toward collaborative filtering — what users with similar early taste profiles engaged with.
This hybrid approach gave new users a relevant feed in their very first session, which we credit for stronger Day 1 retention than a pure collaborative model facing a cold start would have delivered.
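The shift from content-based to collaborative scoring can be sketched as a weighted blend. This is an illustration under stated assumptions, not the production ranking code. The function names, the linear ramp, the `rampSignals` threshold of 20, and the `maxWeight` cap of 0.8 are all hypothetical, and both input scores are assumed to be normalised to the range 0 to 1.

```dart
/// Weight given to collaborative filtering, growing linearly with the
/// number of engagement signals (listens, reactions, follows) observed.
/// `rampSignals` and `maxWeight` are assumed tuning values.
double collaborativeWeight(
  int engagementSignals, {
  int rampSignals = 20,
  double maxWeight = 0.8,
}) {
  if (engagementSignals <= 0) return 0.0;
  if (engagementSignals >= rampSignals) return maxWeight;
  return maxWeight * engagementSignals / rampSignals;
}

/// Blends a content-based score (seeded by onboarding preferences) with a
/// collaborative score (from users with similar taste profiles).
double hybridScore({
  required double contentScore,
  required double collaborativeScore,
  required int engagementSignals,
}) {
  final w = collaborativeWeight(engagementSignals);
  return (1 - w) * contentScore + w * collaborativeScore;
}
```

With zero signals the score is purely content-based, so the onboarding answers fully determine the first feed. Capping the collaborative weight below 1 keeps stated preferences in play even for active users.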
Moderation: Building for Women-First Safety
The moderation architecture had three layers, each with different response times and precision:
Layer 1 — Automated audio screening: Background process running on submitted Voicepods, flagging content via keyword detection in transcriptions and audio fingerprinting against known problematic content signatures. Fast, low-precision, high-recall.
Layer 2 — Community reporting with priority queues: User-initiated reports routed through a queue weighted by reporter reputation score. High-reputation reporters' flags get reviewed faster.
Layer 3 — Admin dashboard with live room monitoring: Real-time view of active Jampods with host-level controls (mute, remove, end session) and escalation paths for in-progress incidents. This was the layer that mattered most for launch — Arré's team needed direct control over the live room environment during the celebrity launch event.
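The reputation-weighted queue in Layer 2 can be sketched as a simple ordering rule. The `Report` shape and the `reviewOrder` function are hypothetical, and the article does not say how reputation scores are computed, so the score is treated here as an input between 0 and 1.

```dart
/// A user report awaiting moderator review (hypothetical shape).
class Report {
  Report({
    required this.contentId,
    required this.reporterReputation,
    required this.submittedAt,
  });

  final String contentId;

  /// Reporter reputation, assumed to lie in the range 0 to 1.
  final double reporterReputation;
  final DateTime submittedAt;
}

/// Orders reports for review: higher reporter reputation first, and
/// older reports first when reputations are equal.
List<Report> reviewOrder(Iterable<Report> reports) {
  final sorted = reports.toList();
  sorted.sort((a, b) {
    final byReputation = b.reporterReputation.compareTo(a.reporterReputation);
    if (byReputation != 0) return byReputation;
    return a.submittedAt.compareTo(b.submittedAt);
  });
  return sorted;
}
```

A strict reputation ordering like this can leave low-reputation reports waiting indefinitely under load, so a real queue would probably also boost priority as a report ages.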
Infrastructure and the Launch Spike
As an AWS Partner, we deployed the backend on EC2 with Auto Scaling Groups configured for the traffic patterns we modeled from the Ankur Tewari launch announcement. S3 for audio asset storage. CloudFront CDN for delivery. Redis for session state and recommendation caching. Firebase for push notifications and analytics.
The GitHub Actions CI/CD pipeline had a two-environment setup: staging received every commit, production received tagged releases only. This let Arré's product team test in staging throughout the build without touching production stability.
The launch spike, when Ankur Tewari posted to his audience, reached the peak concurrent-user level we had modeled within the first two hours. EC2 Auto Scaling handled the load, with zero dropped Agora connections during the event.
This is the infrastructure outcome we're proudest of. Social audio lives or dies on launch-day reliability. If the audio breaks during the moment the first celebrity goes live, you don't recover.
The Results
| Metric | Result |
|---|---|
| Time from wireframes to production | 14 weeks |
| Total app downloads | 370,000+ |
| Google Play Store rating | 4.5 ★ |
| Peak daily active users | 3,000+ |
| Dropped connections at launch | Zero |
"We needed a technical partner who could move at startup speed without cutting corners on audio quality. Innovatrix shipped a production-ready social audio app in 14 weeks — real-time audio rooms, a full creator studio, content discovery — all working flawlessly. The app handled our launch traffic spike without a single dropped connection."
— Vimal Kumar, VP Product, Arré
What We'd Do Differently
Honest reflection matters more than polished retrospectives.
The recommendation engine cold-start was still imperfect. Our onboarding preference collection got us further than most apps at Day 1, but we'd invest more in the audio transcription pipeline earlier in the build — better semantic tags from Day 1 would have improved preference seeding accuracy.
We'd propose a longer QA runway for the Creator Studio. The audio mixing pipeline had two edge cases in voice effects that surfaced in beta testing (not in production, thankfully), and we only managed to fix them by compressing the QA cycle at the end of the sprint. A 16-week timeline with the same scope would have been healthier for the team.
WebSocket reconnection logic needed more hardening for 2G/3G connections. Indian users on lower-bandwidth connections experienced occasional presence data lag that didn't affect audio quality but did affect the real-time hand-raise UI. We patched this post-launch, but it should have been in the original build.
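One common way to harden reconnection on flaky mobile networks is exponential backoff with jitter. The sketch below illustrates that general technique and is not necessarily what the post-launch patch did. `reconnectDelay`, the 500 ms base, and the 30-second cap are assumed values.

```dart
import 'dart:math';

/// Delay before reconnect attempt number `attempt` (0-based), using
/// exponential backoff with full jitter. Base and cap are assumed values.
Duration reconnectDelay(
  int attempt, {
  Duration base = const Duration(milliseconds: 500),
  Duration cap = const Duration(seconds: 30),
  Random? random,
}) {
  final rng = random ?? Random();
  // Bound the exponent so the shift below cannot overflow.
  final exponent = attempt.clamp(0, 16).toInt();
  // Double the ceiling on each attempt, but never exceed the cap.
  final ceilingMs =
      min(cap.inMilliseconds, base.inMilliseconds * (1 << exponent));
  // Full jitter: pick uniformly in [0, ceiling] so clients that dropped
  // together do not all retry at the same instant.
  return Duration(milliseconds: rng.nextInt(ceilingMs + 1));
}
```

A client would wait `reconnectDelay(attempt)` before each retry and reset `attempt` to zero after a successful connection. Because presence runs separately from the Agora stream, these retries do not touch the audio path.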
What This Means for Your App Project
The social audio market is projected to reach $17.71B by 2030, growing at 24.5% CAGR. In India specifically, the opportunity is still early — most serious social audio platforms targeting Indian creators are self-built by well-funded teams, not agency-built products for funded media companies.
If you're building a social or community app that requires real-time audio, video, or presence features, the Flutter + Agora SDK combination is mature enough for production in 2026. Development costs in India for a full-featured social audio app run between ₹50L and ₹1.2Cr depending on feature scope — significantly less than equivalent builds in the US or UK, with no compromise on technical quality when you have the right team.
Our app development service covers Flutter-first cross-platform builds, Swift for iOS-native projects, and Kotlin for Android-native when the use case demands it. For a detailed breakdown of what your specific app might cost, read our app development cost guide for India 2026.
Written by Rishabh Sethia, Founder & CEO
Rishabh Sethia is the founder and CEO of Innovatrix Infotech, a Kolkata-based digital engineering agency. He leads a team that delivers web development, mobile apps, Shopify stores, and AI automation for startups and SMBs across India and beyond.