Configuration
Social Scout reads its credentials and application settings from environment variables, so no configuration file with hardcoded secrets has to live in the project.
Credentials file (recommended)
Social Scout automatically loads ~/.config/social-scout/.env at startup. Shell-exported variables always take precedence over the file.
# Set up once
mkdir -p ~/.config/social-scout
cp .env.example ~/.config/social-scout/.env
$EDITOR ~/.config/social-scout/.env
Example ~/.config/social-scout/.env:
# Required for data collection
APIFY_API_TOKEN=apify_api_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# At least one LLM key required for analysis
GEMINI_API_KEY=AIzaSy_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
ANTHROPIC_API_KEY=sk-ant-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# Optional
SCOUT_LOG_LEVEL=INFO
SCOUT_PROJECTS_DIR=~/research/social-scout-projects
Never commit your credentials
The file is stored in ~/.config/, not inside the project directory, so it is never accidentally committed to git. The .gitignore also excludes all .env files from the project root.
Environment variables reference
API credentials
| Variable | Required | Description |
|---|---|---|
| APIFY_API_TOKEN | For collect step | Apify API token. Get yours at console.apify.com |
| GEMINI_API_KEY | For analyze step (one of two) | Google Gemini API key. Get yours at aistudio.google.com |
| ANTHROPIC_API_KEY | For analyze step (one of two) | Anthropic Claude API key. Get yours at console.anthropic.com |
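The requirements in the table can be checked before a run with a few lines of Python. This is a minimal sketch, not Social Scout's own validation code: `missing_credentials` is a hypothetical helper, and the rule it encodes (one Apify token for collect, at least one of the two LLM keys for analyze) is taken directly from the table.

```python
from collections.abc import Mapping

# The two LLM keys from the table; the analyze step needs at least one of them.
LLM_KEYS = ("GEMINI_API_KEY", "ANTHROPIC_API_KEY")


def missing_credentials(step: str, environ: Mapping[str, str]) -> list[str]:
    """Return the credentials that `step` still needs (hypothetical helper)."""
    if step == "collect" and not environ.get("APIFY_API_TOKEN"):
        return ["APIFY_API_TOKEN"]
    if step == "analyze" and not any(environ.get(key) for key in LLM_KEYS):
        return [" or ".join(LLM_KEYS)]
    return []


# Example environment with only a Gemini key set (placeholder value).
env = {"GEMINI_API_KEY": "AIzaSy_example"}
print(missing_credentials("collect", env))  # ['APIFY_API_TOKEN']
print(missing_credentials("analyze", env))  # []
```

In a real session, pass `os.environ` instead of the example dictionary.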
Application settings
| Variable | Default | Description |
|---|---|---|
| SCOUT_LOG_LEVEL | INFO | Logging verbosity: DEBUG, INFO, WARNING, ERROR |
| SCOUT_PROJECTS_DIR | ./projects | Root directory where project folders are stored |
Shell override
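Any variable set in the shell environment takes precedence over the .env file. For example, exporting SCOUT_LOG_LEVEL=DEBUG in the shell overrides a SCOUT_LOG_LEVEL=INFO entry in the file.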
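The precedence rule can be sketched in Python as "fill in only what the shell has not already set". This is an illustration under stated assumptions, not the loader Social Scout ships: `merge_env` is a hypothetical function, and it handles only plain `KEY=VALUE` lines (no quoting, no `export` prefix).

```python
def merge_env(file_text: str, shell_env: dict[str, str]) -> dict[str, str]:
    """Combine .env file contents with shell variables; the shell wins."""
    merged = dict(shell_env)
    for raw_line in file_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # setdefault leaves shell-exported values untouched.
        merged.setdefault(key.strip(), value.strip())
    return merged


env_file = """
# Optional
SCOUT_LOG_LEVEL=INFO
SCOUT_PROJECTS_DIR=./projects
"""
shell = {"SCOUT_LOG_LEVEL": "DEBUG"}

merged = merge_env(env_file, shell)
print(merged["SCOUT_LOG_LEVEL"])     # DEBUG (from the shell)
print(merged["SCOUT_PROJECTS_DIR"])  # ./projects (from the file)
```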
LLM model assignment
Each of the five agent personas can use a different LLM. The defaults are:
| Agent | Default model |
|---|---|
| Data Analyst | claude-sonnet-4-6 |
| Consumer Psychologist | gemini/gemini-3.0-pro-preview |
| Strategy Advisor | claude-sonnet-4-6 |
| Critical Reviewer | claude-sonnet-4-6 |
| Chief Theorist | gemini/gemini-3.0-pro-preview |
Model identifiers use LiteLLM format, so any provider LiteLLM supports can be used, including OpenAI, Google, Anthropic, Groq, and local Ollama.
To change models, pass --llm-config to scout analyze (see scout analyze --help for details).
Use --llm ensemble (default in v0.6.0) to let each agent use its own default LLM (3 Claude + 2 Gemini).
Use --llm claude or --llm gemini to route all agents to a single provider.
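The three `--llm` modes can be pictured as a lookup from agent to model identifier. The sketch below is illustrative only: `resolve_models` is a hypothetical function, and it assumes that the single-provider modes reuse the same model names that appear in the defaults table, which this page does not confirm.

```python
from collections import Counter

# Per-agent defaults, copied from the table above.
DEFAULT_MODELS = {
    "Data Analyst": "claude-sonnet-4-6",
    "Consumer Psychologist": "gemini/gemini-3.0-pro-preview",
    "Strategy Advisor": "claude-sonnet-4-6",
    "Critical Reviewer": "claude-sonnet-4-6",
    "Chief Theorist": "gemini/gemini-3.0-pro-preview",
}

# Assumption: single-provider modes use the model names from the defaults table.
SINGLE_PROVIDER_MODELS = {
    "claude": "claude-sonnet-4-6",
    "gemini": "gemini/gemini-3.0-pro-preview",
}


def resolve_models(llm: str = "ensemble") -> dict[str, str]:
    """Map each agent to a model identifier for the chosen --llm mode."""
    if llm == "ensemble":
        return dict(DEFAULT_MODELS)
    if llm in SINGLE_PROVIDER_MODELS:
        return {agent: SINGLE_PROVIDER_MODELS[llm] for agent in DEFAULT_MODELS}
    raise ValueError(f"unknown llm mode: {llm!r}")


def provider_counts(models: dict[str, str]) -> Counter:
    """Count agents per provider, based on the identifier prefix."""
    return Counter(
        "gemini" if model.startswith("gemini/") else "claude"
        for model in models.values()
    )


print(provider_counts(resolve_models("ensemble")))  # 3 Claude agents, 2 Gemini agents
print(provider_counts(resolve_models("gemini")))    # all 5 agents on Gemini
```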
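objectives.toml
Each project has an objectives.toml that persists the settings passed on the command line. When a later command repeats an option, the command-line value always takes precedence over the stored one.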
Example projects/agentic-commerce/objectives.toml:
[collect]
topic = "agentic commerce"
keywords = ["agentic commerce", "AI shopping", "autonomous purchasing"]
communities = ["technology", "futurology"]
sort = "top"
time = "year"
max_items = 5000
[model]
language = "multilingual" # multilingual | english | korean
all_techniques = true
[analyze]
llm = "ensemble" # claude | gemini | ensemble (default: ensemble)
max_docs_per_topic = 50
report_language = "english" # english | korean
New in v0.6.0: report_language field under [analyze] and ensemble as the default llm.
Project storage
By default, all project data is stored in ./projects/ relative to the current working directory.
# Use a custom location
export SCOUT_PROJECTS_DIR=/data/social-scout-projects
# Or set per-command
SCOUT_PROJECTS_DIR=/data/projects scout project create my-study
The project directory structure:
projects/
└── <project-name>/
    ├── project.json                    # Project metadata
    ├── raw/
    │   ├── raw_data.ndjson             # Raw scraped records
    │   └── collection_meta.json        # Collection job details
    ├── cleaned/
    │   ├── cleaned_data.parquet        # Preprocessed records
    │   └── preprocess_meta.json        # Preprocessing stats
    ├── topics/
    │   ├── topics.parquet              # Topic assignments
    │   ├── model_meta.json             # Model parameters
    │   └── embeddings.npy              # Cached embeddings (speeds up re-runs)
    ├── reports/
    │   ├── report.md                   # Final qualitative report
    │   ├── findings.json               # Machine-readable findings
    │   └── analysis_meta.json          # Agent run metadata
    └── visualizations/
        ├── dashboard.html              # Interactive HTML dashboard
        ├── interpretations.json        # LLM section interpretations (--interpret)
        ├── visualization_report.md     # Markdown report with PNG links (--export-png)
        └── charts/                     # High-res PNGs (--export-png + kaleido)
            ├── pipeline-overview/
            ├── collection/
            ├── preprocessing/
            ├── topic-modeling/
            ├── analysis/
            ├── sentiment-perception/
            └── cross-stage/
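Given this layout, a short script can resolve the projects root and report which folders already hold their main outputs. The sketch below is an assumption-laden illustration: `projects_root` and `completed_outputs` are hypothetical helpers, and the file list is copied from the tree above (optional files such as embeddings.npy are left out).

```python
import tempfile
from collections.abc import Mapping
from pathlib import Path

# Main output files per folder, taken from the directory tree above.
EXPECTED_FILES = {
    "raw": ["raw_data.ndjson", "collection_meta.json"],
    "cleaned": ["cleaned_data.parquet", "preprocess_meta.json"],
    "topics": ["topics.parquet", "model_meta.json"],
    "reports": ["report.md", "findings.json", "analysis_meta.json"],
    "visualizations": ["dashboard.html"],
}


def projects_root(environ: Mapping[str, str]) -> Path:
    """Use SCOUT_PROJECTS_DIR if set, else ./projects under the working directory."""
    return Path(environ.get("SCOUT_PROJECTS_DIR", "./projects")).expanduser()


def completed_outputs(project_dir: Path) -> dict[str, bool]:
    """Report, per folder, whether all of its expected files exist."""
    return {
        folder: all((project_dir / folder / name).is_file() for name in names)
        for folder, names in EXPECTED_FILES.items()
    }


# Demo in a throwaway directory: only the raw/ outputs are present.
with tempfile.TemporaryDirectory() as tmp:
    project = projects_root({"SCOUT_PROJECTS_DIR": tmp}) / "my-study"
    (project / "raw").mkdir(parents=True)
    for name in EXPECTED_FILES["raw"]:
        (project / "raw" / name).touch()
    status = completed_outputs(project)

print(status["raw"], status["cleaned"])  # True False
```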
Logging
# Enable debug output for a single command
SCOUT_LOG_LEVEL=DEBUG scout collect my-project
# Quiet mode (suppress INFO messages)
scout --quiet model my-project
# Verbose mode (enable DEBUG)
scout --verbose analyze my-project
Log format: YYYY-MM-DDTHH:MM:SS LEVEL module: message
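A log line in this shape can be reproduced with Python's standard `logging` module. This is a sketch rather than Social Scout's actual logging setup: `build_logger` is a hypothetical helper, and it assumes that the "module" field corresponds to the logger name (`%(name)s`).

```python
import io
import logging
import re

# Assumed mapping of the documented format onto logging format fields.
LOG_FORMAT = "%(asctime)s %(levelname)s %(name)s: %(message)s"
DATE_FORMAT = "%Y-%m-%dT%H:%M:%S"


def build_logger(name: str, level_name: str, stream: io.TextIOBase) -> logging.Logger:
    """Create a logger that writes the documented format to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(getattr(logging, level_name))
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT, datefmt=DATE_FORMAT))
    logger.addHandler(handler)
    logger.propagate = False  # keep the demo output out of the root logger
    return logger


buffer = io.StringIO()
log = build_logger("scout.collect", "INFO", buffer)
log.debug("not shown at INFO level")
log.info("collection started")

print(buffer.getvalue(), end="")
pattern = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2} INFO scout\.collect: collection started$")
```

Changing the level name to `"DEBUG"` lets the first message through as well, which mirrors what SCOUT_LOG_LEVEL=DEBUG is described as doing.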
Proxy / network settings
Social Scout honours the standard HTTP proxy environment variables:
export HTTPS_PROXY=http://proxy.example.com:8080
export HTTP_PROXY=http://proxy.example.com:8080
export NO_PROXY=localhost,127.0.0.1
These affect both the Apify client (data collection) and LiteLLM API calls (analysis).
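To confirm that the variables are visible to a Python process, the standard library can report what it detects. This only shows what `urllib.request.getproxies()` picks up from the environment; it is a sanity check under that assumption and says nothing about how the Apify client or LiteLLM configure their own HTTP stacks.

```python
import os
import urllib.request
from unittest import mock

proxy_env = {
    "HTTPS_PROXY": "http://proxy.example.com:8080",
    "HTTP_PROXY": "http://proxy.example.com:8080",
    "NO_PROXY": "localhost,127.0.0.1",
}

# patch.dict swaps in the example variables and restores os.environ afterwards,
# so the demo does not leak settings into the rest of the process.
with mock.patch.dict(os.environ, proxy_env, clear=True):
    detected = urllib.request.getproxies()

# Keys are lower-cased scheme names; NO_PROXY appears under the key "no".
for scheme, value in sorted(detected.items()):
    print(f"{scheme}: {value}")
```

In a real shell session, drop the `mock.patch.dict` block and call `urllib.request.getproxies()` directly after exporting the variables.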