1. Executive Summary
Prepp is an AI-powered sales onboarding platform that helps sales teams reach competency faster through guided practice, roleplay, and manager-visible progress insights.
| Item | Details |
| --- | --- |
| Deployment | SaaS (multi-tenant) |
| Hosting | Google Cloud Platform (GCP) |
| Data Residency | Israel region (me-west1) |
| Identity | Auth0 with enterprise SSO support |
| Compliance Roadmap | SOC 2 Type I (2-3 months), pen test |
2. Product Overview
What Prepp Does
- Content onboarding: Admins upload approved product playbooks and training materials
- AI coaching: Sales reps complete guided practice and roleplays with an AI coach
- Manager insights: Managers review aggregated progress and coaching outcomes
Key Workflows
Admin uploads content → Rep practices with AI coach → Manager reviews progress
3. Architecture & Data Flow
High-Level Architecture
Infrastructure Components
| Component | Service | Details |
| --- | --- | --- |
| Compute | Cloud Run | Managed containers, auto-scaling |
| Database | Cloud SQL | Managed PostgreSQL with automated backups |
| Storage | GCS | Object storage for assets and recordings |
| Secrets | Secret Manager | All secrets stored securely, not in code |
| Workflow Engine | GCE VM | Hatchet Lite + RabbitMQ, internal-only, no public IP |
| Identity | Auth0 | OIDC with enterprise SSO connections |
Data Residency
- Primary region: GCP Israel (me-west1)
- Data stored in Israel: Application data, user profiles, coaching logs, recordings
- Data leaving Israel: Beyond core hosting in Israel, certain features process content in other regions or subprocessor regions as configured:
  - Ongoing coaching / external call import: Call audio may be sent to ElevenLabs for speech-to-text transcription. Transcripts and related context (seller names, companies, counterparty fields, and similar metadata supplied by integrations or admins) may be sent to LLM providers (OpenAI, Anthropic, Google Gemini, etc.) and to the agent-service (application logic for scoring, coaching, and related workflows). These flows are not limited to “generic prompts”; they can include real call content where customers enable those features.
  - LLM features: Text prompts and model responses are typically processed in provider regions (often the United States) according to each provider’s terms and DPA.
  - Optional tracing: When LangSmith (or similar) tracing is enabled in agent-service configuration, trace metadata and LLM inputs/outputs (as configured) may be sent to that subprocessor for observability.
For a technical walkthrough of the ongoing coaching import pipeline, see docs/ongoing-coaching/call-import-pipeline-ELI5.md.
4. Authentication & Access Control
Identity & SSO
| Feature | Status | Details |
| --- | --- | --- |
| SSO (SAML/OIDC) | ✅ Supported | Via Auth0 enterprise connections |
| MFA | ✅ Supported | Enforced via customer IdP; Auth0 MFA available |
| SCIM Provisioning | ❌ Not yet | On roadmap |
| OAuth 2.0 | ✅ Supported | Standard OIDC flow |
Role-Based Access Control (RBAC)
| Role | Permissions |
| --- | --- |
| Owner | Full organization control |
| Admin | Manage users, content, settings |
| Member | Access coaching features |
| System Admin | Platform-level administration (Prepp staff only) |
Internal Service-to-Service Authentication
Internal communication is authenticated at two layers:
- Backend webhooks (worker -> backend): A single shared secret (BACKEND_INTERNAL_WEBHOOK_SECRET) is sent in the X-Prepp-Internal-Webhook-Secret header and verified by the InternalWebhookSecretGuard on all webhook controllers (call-import, coaching-snapshot, worker progress). On GCP Cloud Run, requests additionally carry Google-issued OIDC identity tokens.
- Agent-service: Callers send either a Google OIDC ID token (verified against AGENT_SERVICE_AUDIENCE on Cloud Run) or a shared AGENT_SERVICE_API_KEY as a Bearer token for local/dev environments.
| Control | Details |
| --- | --- |
| Shared secret comparison | Constant-time (timingSafeEqual / SHA-256 digest on backend; hmac.compare_digest on agent-service) to prevent timing attacks |
| Fail-closed | Backend refuses to start in production/staging if BACKEND_INTERNAL_WEBHOOK_SECRET is missing; agent-service refuses to start if neither AGENT_SERVICE_API_KEY nor AGENT_SERVICE_AUDIENCE is configured |
| Dev bypass | Optional escape hatches (ALLOW_UNAUTHENTICATED_INTERNAL_WEBHOOKS, AGENT_SERVICE_ALLOW_UNAUTHENTICATED) for local development only; cannot be enabled in staging or production |
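The constant-time comparison and fail-closed startup behavior described above could be sketched in Python roughly as follows. This is an illustration, not Prepp's code: the backend is described as using timingSafeEqual, so Python's hmac.compare_digest stands in for it here, and the function names, the use of an ENVIRONMENT variable on the backend, and the "true" convention for the bypass flag are all assumptions. Only the two variable names BACKEND_INTERNAL_WEBHOOK_SECRET and ALLOW_UNAUTHENTICATED_INTERNAL_WEBHOOKS come from this document.

```python
import hashlib
import hmac
from typing import Mapping, Optional

# Assumed environment names; the document only says "production/staging".
PROTECTED_ENVIRONMENTS = {"production", "staging"}


def secrets_match(presented: str, expected: str) -> bool:
    """Compare two secrets without leaking where they first differ.

    Both values are hashed with SHA-256 first so the compared inputs always
    have the same length, then compared with a constant-time primitive.
    """
    presented_digest = hashlib.sha256(presented.encode("utf-8")).digest()
    expected_digest = hashlib.sha256(expected.encode("utf-8")).digest()
    return hmac.compare_digest(presented_digest, expected_digest)


def load_webhook_secret(env: Mapping[str, str]) -> Optional[str]:
    """Return the configured secret, raising at startup if config is unsafe.

    Hypothetical helper: an unset ENVIRONMENT is treated as production so
    that a missing variable never relaxes the checks.
    """
    environment = env.get("ENVIRONMENT", "production").strip().lower()
    secret = env.get("BACKEND_INTERNAL_WEBHOOK_SECRET", "")
    bypass = env.get("ALLOW_UNAUTHENTICATED_INTERNAL_WEBHOOKS", "").lower() == "true"

    if environment in PROTECTED_ENVIRONMENTS:
        if bypass:
            raise RuntimeError("unauthenticated bypass is not allowed here")
        if not secret:
            raise RuntimeError("BACKEND_INTERNAL_WEBHOOK_SECRET is required")
    return secret or None
```

A guard would then call secrets_match(header_value, configured_secret) on each request and reject the request when it returns False.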
Audit Logging
- Sign-in events
- Administrative actions
- Content changes
- Organization-scope mismatch warnings on call-review webhook payloads
- Export available on request
5. Data Handling
Data We Collect
| Data Type | Required | Details |
| --- | --- | --- |
| User profile | Yes | Name, email, role/team |
| Organization config | Yes | Settings, preferences |
| Training content | Yes | Admin-provided materials |
| Coaching sessions | Configurable | Logs and analytics |
| Voice recordings | Optional | Only if voice features enabled |
Sensitive Data Types We Do Not Require as Standalone Onboarding Fields
- ❌ Government identifiers (social security, national ID numbers, etc.) as a required product field
- ❌ Sensitive customer records outside normal business use of the product
- ❌ Call recordings (unless explicitly enabled)
- ❌ Integration with CRM/HR systems (unless explicitly connected)
Calls and transcripts: Sales calls and coaching workflows may naturally include personally identifiable information (names, companies, phone numbers, and other content spoken or written in transcripts). Ongoing coaching and call import features process that content as part of the service. Processing is governed by the DPA and subprocessor disclosures in this document and by the customer agreement, not by an assumption that calls are “generic” or PII-free.
Tenant Isolation
Multi-tenant SaaS with logical tenant separation. All records are scoped by organization_id with org-scoped queries enforced at the service layer. Ongoing-coaching review results derive their organization scope from the stored call record (database-authoritative), not from the inbound webhook payload; mismatches are logged and the payload scope is ignored.
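The database-authoritative scoping rule can be illustrated with a small Python sketch. It assumes a hypothetical CallRecord type and a payload that carries an organization_id field; neither name nor shape is confirmed by this document, and the real service-layer code may differ.

```python
import logging
from dataclasses import dataclass
from typing import Mapping, Optional

logger = logging.getLogger("tenant_scope")


@dataclass(frozen=True)
class CallRecord:
    """Hypothetical stored call row; only the two fields used here."""
    call_id: str
    organization_id: str


def resolve_organization_id(record: CallRecord, payload: Mapping[str, str]) -> str:
    """Return the organization scope to use for a review result.

    The stored record always wins. A differing value in the inbound
    payload is logged as a warning and otherwise ignored.
    """
    claimed: Optional[str] = payload.get("organization_id")
    if claimed is not None and claimed != record.organization_id:
        logger.warning(
            "organization scope mismatch for call %s: payload=%s stored=%s",
            record.call_id,
            claimed,
            record.organization_id,
        )
    return record.organization_id
```

The point of the design is that a forged or buggy webhook payload cannot move a result into another tenant, because the payload value is never used to select the scope.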
6. Encryption
In Transit
| Protocol | Coverage |
| --- | --- |
| TLS 1.2+ | All public endpoints |
| HTTPS | All API and web traffic |
At Rest
| Component | Encryption |
| --- | --- |
| Cloud SQL | GCP-managed encryption (AES-256) |
| GCS | GCP-managed encryption (AES-256) |
| Secrets | GCP Secret Manager (envelope encryption) |
Key Management
- Encryption keys managed by GCP
- Customer-managed encryption keys (CMEK) available on request
7. Security Controls
Network Security
- Cloud Run services deployed in GCP Israel region
- Private VPC connector for database access
- HTTPS-only ingress with TLS 1.2+
- Hatchet workflow engine runs on an internal-only GCE VM with firewall rules restricting access to the VPC connector IP range
- Hatchet dashboard access via HTTPS load balancer with Google Identity-Aware Proxy (IAP); only authorized Google accounts can reach the UI
- Cloud Armor available for WAF/DDoS protection on request
SSRF Protection
- Outbound audio-fetch requests are validated before execution: URL scheme (HTTPS required), port restriction, DNS resolution checked against private/reserved IP ranges (RFC 1918, link-local, loopback, metadata endpoint), and redirect targets re-validated per hop (max 3 redirects)
- GCP metadata endpoint (metadata.google.internal) is explicitly blocked
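The validation steps listed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the production implementation: the function names are hypothetical, the allowed port set (443 only) is a guess at what "port restriction" means, and the resolver is injectable so the logic can be exercised without network access.

```python
import ipaddress
import socket
from typing import Callable, List, Sequence
from urllib.parse import urlsplit

ALLOWED_PORTS = {443}  # assumption: the document does not list the ports
BLOCKED_HOSTS = {"metadata.google.internal"}
MAX_REDIRECTS = 3


def resolve_host(host: str) -> List[str]:
    """Resolve a hostname to its IP addresses using the system resolver."""
    infos = socket.getaddrinfo(host, None)
    return sorted({info[4][0] for info in infos})


def is_public_address(address: str) -> bool:
    """Reject private (RFC 1918), loopback, link-local and reserved ranges."""
    ip = ipaddress.ip_address(address)
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
    )


def validate_fetch_url(
    url: str, resolver: Callable[[str], List[str]] = resolve_host
) -> None:
    """Raise ValueError if the URL must not be fetched."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError("only https URLs are allowed")
    host = (parts.hostname or "").rstrip(".")
    if not host:
        raise ValueError("URL has no host")
    if host in BLOCKED_HOSTS:
        raise ValueError("host is explicitly blocked")
    if (parts.port or 443) not in ALLOWED_PORTS:
        raise ValueError("port is not allowed")
    addresses = resolver(host)
    if not addresses:
        raise ValueError("host did not resolve")
    for address in addresses:
        if not is_public_address(address):
            raise ValueError("host resolves to a non-public address")


def validate_redirect_chain(
    urls: Sequence[str], resolver: Callable[[str], List[str]] = resolve_host
) -> None:
    """Validate the initial URL and every redirect target, hop by hop."""
    if len(urls) - 1 > MAX_REDIRECTS:
        raise ValueError("too many redirects")
    for url in urls:
        validate_fetch_url(url, resolver)
```

A check like this is only effective if the HTTP client then connects to the same address that was validated; otherwise a second DNS lookup could return a different, internal address. How the real service handles that is not described here.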
Secure Development Lifecycle (SDLC)
- Code review required for all changes (PR-based workflow)
- CI pipeline with automated tests and linting
- Dependency scanning via Dependabot
- Secret scanning in CI
- Staging environment for pre-production validation
Debug & Documentation Exposure
- Agent-service /docs, /redoc, and debug/test routes are disabled in staging and production environments by default
- Controlled via ENVIRONMENT env var; opt-in override requires explicit ENABLE_AGENT_DEBUG_ROUTES flag
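The gating rule above might be expressed as a small Python helper like the one below. The accepted values ("staging", "production", "true") and the choice to treat an unset ENVIRONMENT as production are assumptions made for this sketch. The docs_url and redoc_url keys mirror the constructor arguments FastAPI uses to disable its documentation pages; whether agent-service actually uses FastAPI is not stated in this document.

```python
from typing import Dict, Mapping, Optional

PROTECTED_ENVIRONMENTS = {"staging", "production"}  # assumed value names


def debug_routes_enabled(env: Mapping[str, str]) -> bool:
    """Decide whether docs and debug routes should be mounted.

    In protected environments the routes stay off unless the explicit
    override flag is set. An unset ENVIRONMENT is treated as production.
    """
    environment = env.get("ENVIRONMENT", "production").strip().lower()
    override = env.get("ENABLE_AGENT_DEBUG_ROUTES", "").strip().lower() == "true"
    if environment in PROTECTED_ENVIRONMENTS:
        return override
    return True


def docs_settings(env: Mapping[str, str]) -> Dict[str, Optional[str]]:
    """Build keyword arguments for the web framework's app constructor."""
    enabled = debug_routes_enabled(env)
    return {
        "docs_url": "/docs" if enabled else None,
        "redoc_url": "/redoc" if enabled else None,
    }
```

Under these assumptions, docs_settings({"ENVIRONMENT": "production"}) yields None for both URLs, so the documentation pages are never registered rather than merely hidden.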
LLM Trace Privacy
- Sensitive ongoing-coaching LLM invocations (call evaluation, coaching snapshots, seller context) suppress content tracing to LangSmith in staging and production
- Opt-in re-enablement via ENABLE_LANGSMITH_CONTENT_TRACING for authorized debugging
Logging & Monitoring
- Application logs in GCP Cloud Logging
- Audit logs for sign-ins, admin actions, content changes
- Organization-scope mismatch detection and logging on webhook payloads
- Alerting via GCP Cloud Monitoring
- SIEM integration available on request (log export supported)
Backup & Recovery Testing
- Automated daily backups for Cloud SQL
- Quarterly restore testing schedule
- RPO 24 hours; RTO 48 hours
8. AI Governance
LLM Providers
| Provider | Purpose | Status |
| --- | --- | --- |
| OpenAI | Text generation, voice (Realtime) | Active |
| Anthropic | Text generation | Active |
| Gemini | Text generation | Configurable |
| ElevenLabs | Transcription (including ongoing coaching call audio when that product path is used) | Active |
Data Usage Policy
| Question | Answer |
| --- | --- |
| Is customer data used to train models? | ❌ No. Customer data is never used to train foundation models |
| Are AI outputs grounded? | ✅ Yes, constrained to approved onboarding content |
| Is there human oversight? | ✅ Yes, admins control and approve all training content |
LLM Data Retention (Provider Side)
| Provider | Audio/Content Stored? | Abuse Logs | Zero-Retention Option |
| --- | --- | --- | --- |
| OpenAI | No (not used for training) | Up to 30 days | Available on request |
| Anthropic | No (not used for training) | Up to 30 days | Available on request |
| ElevenLabs | No (stateless transcription) | None | N/A (zero retention) |
Note: OpenAI explicitly states: "Data sent to the OpenAI API is not used to train or improve OpenAI models."
9. Data Retention & Deletion
Default Retention
| Data Type | Retention |
| --- | --- |
| Workflow state records | 7 days (automated cleanup) |
| Coaching artifacts (audio, transcripts, review signals, snapshots) | 90 days (configurable via ONGOING_COACHING_ARTIFACT_RETENTION_DAYS), then GCS objects deleted and DB paths nullified |
| Application data | While account is active |
| Voice recordings | Per customer agreement |
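As an illustration of how the coaching-artifact retention window could be evaluated, the Python sketch below reads ONGOING_COACHING_ARTIFACT_RETENTION_DAYS with the documented 90-day default and decides whether an artifact is past its window. The function names and the rejection of non-positive values are assumptions; the actual cleanup job (deleting GCS objects and nullifying database paths) is not shown.

```python
from datetime import datetime, timedelta, timezone
from typing import Mapping

DEFAULT_RETENTION_DAYS = 90  # default stated in the retention table


def retention_days(env: Mapping[str, str]) -> int:
    """Read the retention window, falling back to the 90-day default."""
    raw = env.get("ONGOING_COACHING_ARTIFACT_RETENTION_DAYS", "").strip()
    days = int(raw) if raw else DEFAULT_RETENTION_DAYS
    if days <= 0:
        # Assumption: a zero or negative window is treated as a config error.
        raise ValueError("retention window must be a positive number of days")
    return days


def is_expired(created_at: datetime, now: datetime, days: int) -> bool:
    """True when the artifact is older than the retention window."""
    return created_at < now - timedelta(days=days)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    window = retention_days({})
    sample = now - timedelta(days=120)
    print(f"window={window} days, 120-day-old artifact expired: "
          f"{is_expired(sample, now, window)}")
```

Using timezone-aware timestamps throughout avoids off-by-hours errors when the cleanup job and the database run in different time zones.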
Configurable Options
- Retention windows adjustable per customer contract
- Ongoing-coaching artifact retention configurable per deployment (default 90 days)
- Opt-out of storing prompts/responses available
- Voice recordings can be disabled entirely
Deletion Process
| Request Type | Timeline |
| --- | --- |
| Account-level deletion | Within 30 days |
| Tenant-level purge | Via operational process |
| Right to erasure (GDPR/Israel) | Supported |
10. Security Operations
Vulnerability Management
| Severity | Target SLA |
| --- | --- |
| Critical | 7 days |
| High | 14 days |
| Medium | 30 days |
| Low | 90 days |
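For tracking purposes, the SLA targets above translate directly into remediation due dates. The short Python sketch below shows one way to compute them; the function name and the choice to count calendar days from the report date are assumptions, since the document does not say when the clock starts.

```python
from datetime import date, timedelta

# Target remediation windows in days, taken from the table above.
SLA_DAYS = {"critical": 7, "high": 14, "medium": 30, "low": 90}


def remediation_due(severity: str, reported_on: date) -> date:
    """Return the target remediation date for a finding.

    Raises KeyError for an unknown severity so typos are not silently
    mapped to a lenient window.
    """
    return reported_on + timedelta(days=SLA_DAYS[severity.strip().lower()])
```

For example, a critical finding reported on 1 March 2026 would be due on 8 March 2026 under these assumptions.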
Detection Methods
- Weekly dependency scans (CI)
- Dependabot alerts and automated PRs
- Cloud provider security alerts
- Manual review of major advisories
Penetration Testing
| Item | Status |
| --- | --- |
| External pen test | Planned |
Incident Response
| Phase | Target Timeline |
| --- | --- |
| Acknowledge | Within 24 hours |
| Triage | Within 72 hours |
| Customer notification | Per contract |
Severity Levels
| Level | Definition |
| --- | --- |
| Sev 1 | Active compromise, widespread outage, confirmed data exposure |
| Sev 2 | Limited impact security incident |
| Sev 3 | Suspicious activity, no confirmed impact |
| Sev 4 | Low-risk issues, false positives |
Security Contact
11. Business Continuity
Backup Strategy
| Component | Frequency | Details |
| --- | --- | --- |
| PostgreSQL | Daily | Automated backups with point-in-time recovery |
| Hatchet State | Daily | Stored in Cloud SQL (same managed Postgres, backed up automatically) |
| RabbitMQ | N/A | Transient message queue; no persistent data requiring backup |
| GCS | Continuous | Provider-managed durability (11 9's) |
Recovery Objectives
| Metric | Target |
| --- | --- |
| RPO (Recovery Point Objective) | 24 hours |
| RTO (Recovery Time Objective) | 48 hours |
Disaster Recovery
- Zonal Cloud SQL in GCP Israel with automated backups + PITR
- HA/failover available when regional Cloud SQL is enabled
- Quarterly restore testing
Service Availability
| Metric | Target |
| --- | --- |
| Uptime | 99.5% |
| Planned maintenance | Off-peak hours with advance notice |
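To make the uptime target concrete, the following Python sketch converts a percentage into an allowed-downtime budget. The function name is illustrative, and whether planned maintenance counts against the 99.5% target is not stated in this document, so the figures below are simple arithmetic only.

```python
def allowed_downtime_hours(uptime_percent: float, period_days: int) -> float:
    """Hours of downtime permitted by an uptime target over a period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return (100 - uptime_percent) / 100 * period_days * 24


if __name__ == "__main__":
    # 99.5% works out to roughly 3.6 hours per 30-day month
    # and roughly 43.8 hours per 365-day year.
    print(round(allowed_downtime_hours(99.5, 30), 2))
    print(round(allowed_downtime_hours(99.5, 365), 2))
```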
12. Compliance Status
Current Status
| Item | Status |
| --- | --- |
| Privacy Policy | ✅ Published (https://prepp.tech/privacy) |
| DPA | ✅ Published (https://prepp.tech/dpa) |
| HTTPS/TLS | ✅ All endpoints |
| Encryption at rest | ✅ GCP-managed |
| Audit logging | ✅ Implemented |
| RBAC | ✅ Implemented |
| Tenant isolation | ✅ Implemented |
| Dependency scanning | ✅ Automated |
Compliance Roadmap
| Item | Status | Target |
| --- | --- | --- |
| SOC 2 Type I | Not started | 2-3 months |
| Penetration test | Not started | 2-3 months |
| ISO 27001 | In progress | H2 2026 |
| SCIM provisioning | Not started | On roadmap |
Israel Privacy Law (חוק הגנת הפרטיות)
| Requirement | Status |
| --- | --- |
| Data minimization | ✅ Minimal data collected |
| Purpose limitation | ✅ Processing for agreed purposes only |
| Access control | ✅ RBAC implemented |
| Data retention | ✅ Configurable per tenant |
| Deletion rights | ✅ Supported |
| Database registration (Section 17) | ✅ Not required (<10,000 users) |
13. Subprocessors
| Subprocessor | Purpose | Data Processed | When Used |
| --- | --- | --- | --- |
| Google Cloud Platform | Hosting, storage, compute | All application data | Always |
| Auth0 | Identity, authentication | User identity, sessions | Always |
| OpenAI | LLM processing, voice | Prompts, responses, audio | Configurable |
| Anthropic | LLM processing | Prompts, responses | Configurable |
| Google (Gemini) | LLM processing | Prompts, responses | Configurable |
| ElevenLabs | Transcription | Audio | Configurable |
| LangSmith | LLM observability / tracing (optional) | Trace metadata, prompts, and model outputs when tracing is enabled | Configurable (off by default; depends on agent-service deployment settings) |
14. Contact Information
This document consolidates Prepp's security and compliance information for enterprise due diligence. For additional questions, contact security@prepp.tech.