Pre-Requirements
Before starting the ElevenLabs research, the client must provide specific information and access credentials. This page details everything needed to begin the 12-hour research period.
⚠️ Client Checklist (Confirm Before Day 1)
IMPORTANT: Before the backend developer begins Day 1, confirm all of the following:
- API key is active and tested
  - Log into ElevenLabs account
  - Generate or locate existing API key
  - Test the API key with a simple request
- Voice IDs are valid and accessible
  - All voice IDs are copied correctly (no typos)
  - Voice IDs are accessible with the provided API key
  - Voice IDs are in correct format (e.g., 21m00Tcm4TlvDq8ikWAM)
- Account has sufficient credits/quota for testing
  - Check account balance/credits
  - Research will generate ~18 audio samples (2 models × 3 voices × 3 samples)
  - Ensure sufficient quota to complete all testing
- Client is available for questions during the research period
  - Developer may have questions during Day 1 or Day 2
  - Quick responses help keep research on schedule
  - Availability via email, Slack, or preferred communication channel
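The voice ID format and sample-count items in the checklist can be pre-checked offline, before any API call is made. The following is a minimal sketch, assuming voice IDs are 20 ASCII letters or digits, which is inferred only from the example ID on this page and not from ElevenLabs documentation. The helper names `looks_like_voice_id` and `planned_sample_count` are hypothetical.

```python
import re

# Assumption: voice IDs are exactly 20 ASCII letters/digits.
# This shape is inferred from the example ID above, not from official docs.
VOICE_ID_PATTERN = re.compile(r"[A-Za-z0-9]{20}")


def looks_like_voice_id(candidate: str) -> bool:
    """Return True if the string matches the assumed voice ID shape."""
    return VOICE_ID_PATTERN.fullmatch(candidate) is not None


def planned_sample_count(models: int = 2, voices: int = 3, samples_per_pair: int = 3) -> int:
    """Number of audio files the research plans to generate."""
    return models * voices * samples_per_pair


if __name__ == "__main__":
    # The second and third candidates simulate copy-paste errors
    # (trailing space, missing last character).
    for voice_id in ["21m00Tcm4TlvDq8ikWAM", "21m00Tcm4TlvDq8ikWAM ", "21m00Tcm4TlvDq8ikWA"]:
        print(repr(voice_id), looks_like_voice_id(voice_id))
    print("planned samples:", planned_sample_count())
```

A passing format check only catches copy-paste mistakes. It does not show that the voice is accessible with the provided API key, which still needs a real request.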
Required from Client
1. ElevenLabs Account Information
What we need:
- Account email or account ID
- Access level/subscription tier (Free, Starter, Creator, Pro, etc.)
- Usage limits and quotas
Why we need it: Understanding the account tier and limits helps us plan testing within available quotas and identify any constraints that may affect production implementation.
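To illustrate how usage limits feed into test planning, here is a minimal sketch that compares planned usage against remaining quota. It assumes the account's subscription data exposes `character_count` (used) and `character_limit` (allowed) fields, and that one character costs one quota unit. Both should be confirmed against the actual account and model pricing. The example values and the 500-character sample length are made up for illustration.

```python
def remaining_characters(subscription: dict) -> int:
    """Characters left in the current period (never negative)."""
    used = int(subscription["character_count"])
    limit = int(subscription["character_limit"])
    return max(limit - used, 0)


def fits_in_quota(subscription: dict, samples: int, chars_per_sample: int) -> bool:
    """True if the planned samples fit in the remaining quota.

    Assumption: one character consumes one quota unit.
    """
    return samples * chars_per_sample <= remaining_characters(subscription)


if __name__ == "__main__":
    # Made-up subscription data, for illustration only.
    example = {"tier": "starter", "character_count": 12_000, "character_limit": 30_000}
    print("remaining:", remaining_characters(example))
    print("18 samples x 500 chars fit:", fits_in_quota(example, samples=18, chars_per_sample=500))
```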
2. API Credentials
What we need:
- ElevenLabs API Key with appropriate permissions
Required API access:
- Text-to-Speech endpoint
- Voice listing/details endpoint
- Voice validation capabilities
Why we need it: The API key is essential for all testing and validation work. It must have sufficient permissions to test text-to-speech generation, list available voices, and validate voice IDs.
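The three access requirements above can be exercised with one request each. The sketch below only builds the requests; sending them is left to the hypothetical `send` helper. The base URL, the `xi-api-key` header, and the endpoint paths reflect the public ElevenLabs REST API as we understand it and should be verified against the current API reference. The model ID passed in is likewise an assumption.

```python
import json
import urllib.request
from typing import Optional

# Assumption: base URL of the ElevenLabs REST API; verify before use.
BASE_URL = "https://api.elevenlabs.io/v1"


def build_request(api_key: str, path: str, body: Optional[dict] = None) -> urllib.request.Request:
    """Build a GET request, or a JSON POST request when a body is given."""
    headers = {"xi-api-key": api_key}
    data = None
    if body is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers)


def smoke_test_requests(api_key: str, voice_id: str, model_id: str) -> dict:
    """One request per required capability: list, validate, generate."""
    return {
        "list_voices": build_request(api_key, "/voices"),
        "voice_details": build_request(api_key, f"/voices/{voice_id}"),
        # Note: sending this request generates audio and consumes quota.
        "text_to_speech": build_request(
            api_key, f"/text-to-speech/{voice_id}", {"text": "Hello.", "model_id": model_id}
        ),
    }


def send(request: urllib.request.Request) -> int:
    """Send a request and return the HTTP status (raises HTTPError on 4xx/5xx)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```

For example, `send(smoke_test_requests(key, "21m00Tcm4TlvDq8ikWAM", model_id)["list_voices"])` returning 200 would suggest the key is active, while an `HTTPError` with status 401 would suggest it is not.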
3. Use Case Specifications
What we need:
- Target audience (affects voice selection)
- Any specific voice characteristics required (accent, tone, age, etc.)
Why we need it: Understanding the target audience and voice requirements helps us provide more relevant recommendations during the research phase.
Examples:
- "Target audience: Young adults aged 18-35, prefer energetic and friendly tone"
- "Need professional, neutral accent for business communications"
- "Require warm, calming voice for meditation content"
Optional but Helpful
4. Voice IDs for Testing (Optional)
What we need (if the client provides them):
- 3 voice IDs that the client wants to use
- Mix of male and female voices recommended
- Example format: 21m00Tcm4TlvDq8ikWAM (Rachel)
- If using custom cloned voices, provide those IDs as well
If the client doesn't provide voice IDs:
We'll use default voices for testing:
- 1 Male voice - Professional, clear
- 1 Female voice - Natural, engaging
- 1 Male British voice - Accent variety
Why this helps:
If the client provides their preferred voice IDs, we test with voices they actually want to use in production. This makes the research more relevant and actionable.
If not provided, we'll test with diverse default voices to demonstrate capabilities.
Where to find voice IDs:
- Go to ElevenLabs Voice Library
- Select voices and copy their IDs
- Or use custom cloned voices from your account
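Once voice IDs are collected, they can be cross-checked against the voices the API key can actually see. The following sketch assumes the voice listing response is a JSON object with a top-level `voices` list whose entries carry `voice_id` and `name`. That shape is an assumption to confirm against a real response. The second ID in the example is a made-up placeholder.

```python
def voice_index(voices_payload: dict) -> dict:
    """Map voice_id -> name for every voice in a listing payload."""
    return {
        voice["voice_id"]: voice.get("name", "")
        for voice in voices_payload.get("voices", [])
    }


def missing_voice_ids(requested: list, voices_payload: dict) -> list:
    """Requested IDs that do not appear in the account's voice listing."""
    available = voice_index(voices_payload)
    return [voice_id for voice_id in requested if voice_id not in available]


if __name__ == "__main__":
    # Illustrative payload; only the Rachel ID comes from this page.
    payload = {"voices": [{"voice_id": "21m00Tcm4TlvDq8ikWAM", "name": "Rachel"}]}
    requested = ["21m00Tcm4TlvDq8ikWAM", "AAAAAAAAAAAAAAAAAAAA"]  # second ID is a placeholder
    print("missing:", missing_voice_ids(requested, payload))
```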
Future requirement: The full implementation will need 8 voices selected by the client.
5. Budget Constraints (Optional)
What helps us:
- Maximum acceptable cost per generation
- Monthly budget allocation for voice generation
Why it helps: Knowing budget constraints upfront lets us weigh quality against cost when comparing models, and helps the client make an informed choice between them.
Note: Cost tracking is integrated into model testing, so you'll see actual costs per sample in the deliverable.
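As a rough illustration of how the two budget figures relate, the sketch below converts a per-character price into a cost per generation and a monthly generation ceiling. The price and budget values are placeholders, not ElevenLabs pricing, and the per-1000-characters pricing model is itself an assumption. Actual costs come from the measurements in the deliverable.

```python
def cost_per_sample(characters: int, price_per_1000_chars: float) -> float:
    """Cost of one generation under an assumed per-1000-characters price."""
    return characters / 1000 * price_per_1000_chars


def max_generations_per_month(monthly_budget: float, characters: int, price_per_1000_chars: float) -> int:
    """How many generations of a given length fit in a monthly budget."""
    per_sample = cost_per_sample(characters, price_per_1000_chars)
    if per_sample <= 0:
        raise ValueError("cost per sample must be positive")
    return int(monthly_budget // per_sample)


if __name__ == "__main__":
    # Placeholder numbers for illustration only.
    print(cost_per_sample(characters=1000, price_per_1000_chars=0.25))
    print(max_generations_per_month(monthly_budget=10.0, characters=1000, price_per_1000_chars=0.25))
```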
6. Quality Expectations (Optional)
What helps us:
- Audio quality standards (professional, good, acceptable)
- Clarity and naturalness requirements
- Specific quality concerns or priorities
Why it helps: Understanding your quality expectations helps us evaluate the models more effectively. If you have specific quality standards (e.g., "must sound professional for corporate use" or "clarity is more important than naturalness"), we can focus our testing and recommendations accordingly.
Examples:
- "Audio must be professional quality suitable for customer-facing content"
- "Prioritize clarity over naturalness - every word must be understood"
- "Natural, conversational tone is most important for our use case"
Audio Format Requirements (Pre-defined)
These settings are already defined for the MicDots platform:
- Format: MP3
- Bitrate: 128 kbps
- Channels: Mono (single channel)
- Purpose: Optimized for speech and slow internet connections
Why these settings:
- MP3: Universal compatibility, good compression
- 128 kbps: Sweet spot for voice quality vs. file size
- Mono: Voice content doesn't need stereo; a mono stream needs roughly half the data of a stereo stream at comparable per-channel quality
- Result: Fast downloads on slow connections, good quality for the QR code use case
Note: These settings are fixed and should be used for all testing. No client input needed.
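The effect of the 128 kbps setting on file size and download time can be estimated with simple arithmetic. This is a minimal sketch, assuming constant bitrate, 1 kbps = 1000 bits per second, and ignoring container headers and metadata. The 30-second clip and the 256 kbps link speed are illustrative values.

```python
def mp3_size_bytes(duration_seconds: float, bitrate_kbps: int = 128) -> int:
    """Approximate size of a constant-bitrate MP3 (headers/metadata ignored)."""
    return int(duration_seconds * bitrate_kbps * 1000 / 8)


def download_seconds(size_bytes: int, link_kbps: float) -> float:
    """Approximate transfer time over a link of the given speed (no overhead)."""
    return size_bytes * 8 / (link_kbps * 1000)


if __name__ == "__main__":
    size = mp3_size_bytes(30)  # 30-second clip at 128 kbps
    print("bytes:", size)
    print("seconds on a 256 kbps link:", download_seconds(size, 256))
```

At 128 kbps each second of audio takes 16,000 bytes, so a 30-second clip is about 480,000 bytes and would take about 15 seconds on a 256 kbps link.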
What Happens Next
Once all pre-requirements are confirmed:
- Day 1 (6 hours): Developer will test both models (Turbo v2.5 and Eleven Flash) with all provided voice IDs, validate voices, and gather technical metrics
- Day 2 (6 hours): Developer will create code samples, organize audio files, and complete the deliverable template
- Deliverable: Client receives comprehensive analysis with audio samples, technical metrics, and developer recommendation
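The Day 1 test plan can be laid out as a matrix of models, voices, and samples, which also gives a consistent naming scheme for organizing audio files on Day 2. A minimal sketch follows. The model IDs are assumed identifiers for Turbo v2.5 and Eleven Flash and must be confirmed against the account's model list. The voice IDs and the file naming scheme are hypothetical.

```python
from itertools import product

# Assumed model identifiers; confirm against the account's available models.
MODELS = ["eleven_turbo_v2_5", "eleven_flash_v2_5"]


def build_test_matrix(models: list, voice_ids: list, samples_per_pair: int = 3) -> list:
    """One entry per (model, voice, sample) combination, with a unique filename."""
    return [
        {
            "model": model,
            "voice_id": voice_id,
            "sample": sample,
            "filename": f"{model}__{voice_id}__sample{sample}.mp3",
        }
        for model, voice_id, sample in product(models, voice_ids, range(1, samples_per_pair + 1))
    ]


if __name__ == "__main__":
    # Placeholder voice IDs for illustration.
    matrix = build_test_matrix(MODELS, ["voice_a", "voice_b", "voice_c"])
    print("planned generations:", len(matrix))
    print("first file:", matrix[0]["filename"])
```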
Questions?
If any pre-requirements are unclear or you need help gathering this information, please contact the development team before Day 1 begins.