Social Scout
Transform social media text into actionable qualitative insights using AI-powered multi-agent analysis.
What is Social Scout?
Social Scout is a command-line pipeline that automates qualitative research on social media text. It handles the full journey from raw data collection to a structured, citation-backed analysis report — without requiring any coding.
The pipeline has five independent steps, each persisting its output so you can re-run any stage without redoing earlier work:
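scout collect → scout preprocess → scout model → scout analyze → scout visualize
| Step | Tooling |
|---|---|
| scout collect | Apify scraper |
| scout preprocess | Polars + GLiNER + Sentiment |
| scout model | BERTopic (7 techniques) |
| scout analyze | CrewAI (5 agents) |
| scout visualize | 30+ Plotly charts + PNG export |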
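The re-run behaviour described above (each stage persists its output, so later runs can skip completed work) can be pictured with a small sketch. This is not Social Scout's actual code: run_stage, fake_collect, and the record fields are hypothetical names chosen for illustration, and only the raw_data.ndjson file name comes from this page.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def run_stage(output: Path, compute: Callable[[], list[dict]], force: bool = False) -> list[dict]:
    """Return the stage's records, recomputing only when no saved output exists."""
    if output.exists() and not force:
        # A previous run already persisted this stage: load it instead of recomputing.
        with output.open(encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]
    records = compute()
    output.parent.mkdir(parents=True, exist_ok=True)
    with output.open("w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return records


calls = []


def fake_collect() -> list[dict]:
    # Hypothetical stand-in for an expensive collection step.
    calls.append("collect")
    return [{"id": "t1", "text": "AI shopping agents are here"}]


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "raw_data.ndjson"
    first = run_stage(out, fake_collect)
    second = run_stage(out, fake_collect)  # served from disk, no recompute
```

Passing force=True in this sketch would recompute the stage and overwrite the saved file, which is one plausible way to model an explicit re-run.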
Or run everything in one command with scout run, as shown in the 5-Minute Quick Start.
Key Features
- Collect Reddit posts and comments via Apify's managed scraping infrastructure
- Filter by keywords, subreddits, sort order, and time range
- Output: newline-delimited JSON (raw_data.ndjson)
- High-speed text cleaning with Polars (Rust-based DataFrame library)
- Named entity recognition with GLiNER (brand names, products, people, locations)
- Sentiment scoring with --sentiment: adds sentiment_label and sentiment_score columns using a Twitter-fine-tuned RoBERTa model
- Configurable minimum text length and entity confidence thresholds
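As a rough picture of what the cleaning step and the minimum-text-length threshold do to newline-delimited JSON records, here is a minimal sketch in plain Python. It deliberately does not use Polars and is not the tool's implementation; the id and text field names, the URL-stripping rule, and the default threshold of 10 characters are all assumptions made for the example.

```python
import io
import json
import re

URL_RE = re.compile(r"https?://\S+")


def clean_text(text: str) -> str:
    """Strip URLs and collapse runs of whitespace."""
    return " ".join(URL_RE.sub(" ", text).split())


def preprocess(lines, min_length: int = 10):
    """Yield cleaned records whose text has at least min_length characters."""
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines in the NDJSON stream
        record = json.loads(line)
        text = clean_text(record.get("text", ""))
        if len(text) >= min_length:
            yield {**record, "text": text}


# Two hypothetical records: one long enough to keep, one too short.
sample = io.StringIO(
    '{"id": "a1", "text": "Agentic   commerce is coming https://example.com fast"}\n'
    '{"id": "a2", "text": "ok"}\n'
)
kept = list(preprocess(sample))
```

Here only the first record survives, with its URL removed and its whitespace normalised; the second falls below the length threshold and is dropped.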
Seven BERTopic techniques available:
| Technique | Use case |
|---|---|
| basic | Standard topic clustering |
| dynamic | Topic evolution over time |
| hierarchical | Macro/micro topic trees |
| class-based | Topics by metadata group |
| sentiment-topic | Sentiment distribution per topic |
| network | Topic co-occurrence patterns |
| zero-shot | Hypothesis validation |
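To make the sentiment-topic row concrete, the sketch below computes a sentiment distribution per topic from already-labelled records. It is an illustration only, not BERTopic or Social Scout code: the topic field and the label values are assumptions, and only the sentiment_label column name comes from this page.

```python
from collections import Counter, defaultdict


def sentiment_by_topic(records):
    """Return {topic: {label: share}} with shares summing to 1 within each topic."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record["topic"]][record["sentiment_label"]] += 1
    return {
        topic: {label: n / sum(labels.values()) for label, n in labels.items()}
        for topic, labels in counts.items()
    }


# Hypothetical records as they might look after preprocessing and topic assignment.
records = [
    {"topic": 0, "sentiment_label": "positive"},
    {"topic": 0, "sentiment_label": "negative"},
    {"topic": 0, "sentiment_label": "positive"},
    {"topic": 1, "sentiment_label": "neutral"},
]
shares = sentiment_by_topic(records)
```

In this toy input, topic 0 is two-thirds positive and one-third negative, while topic 1 is entirely neutral.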
Five CrewAI agent personas analyze and debate the findings:
| Agent | Role |
|---|---|
| Data Analyst | Quantitative patterns from topic model |
| Consumer Psychologist | Emotional and cognitive drivers |
| Strategy Advisor | Stakeholder implications |
| Critical Reviewer | Validates claims, flags speculation |
| Chief Theorist | Synthesizes a unified framework |
Hallucination control: every finding must cite a source record. Uncited claims are blocked from the report.
Use --report-language korean to produce all findings in Korean.
Use --llm ensemble (default) to mix Claude and Gemini for diverse perspectives.
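The hallucination-control rule (every finding must cite a source record, and uncited claims are blocked) could be enforced by a gate along these lines. This is a sketch under assumptions, not the project's implementation: the Finding class, the gate_findings function, and the idea of checking citations against a set of known record IDs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    claim: str
    citations: list[str] = field(default_factory=list)  # hypothetical: IDs of source records


def gate_findings(findings, known_ids):
    """Split findings into (accepted, blocked) based on citation validity."""
    accepted, blocked = [], []
    for finding in findings:
        # A finding passes only if it cites at least one record and every cited ID exists.
        valid = bool(finding.citations) and all(c in known_ids for c in finding.citations)
        (accepted if valid else blocked).append(finding)
    return accepted, blocked


known_ids = {"a1", "a2"}
findings = [
    Finding("Placeholder claim backed by a record", ["a1"]),
    Finding("Placeholder claim with no citation"),
    Finding("Placeholder claim citing an unknown record", ["zz"]),
]
accepted, blocked = gate_findings(findings, known_ids)
```

Treating a citation to an unknown record as a failure is a design choice in this sketch; it keeps a fabricated ID from slipping through as if it were evidence.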
- scout visualize <project> generates a standalone HTML dashboard with 30+ Plotly charts
- Sections: Pipeline Overview · Collection · Preprocessing · Topic Modeling · Analysis · Sentiment & Perception · Cross-Stage
- The --sentiment flag (preprocess step) unlocks the Sentiment & Perception section: donut, heatmaps, controversy bar, time series, community heatmap, perception scatter
- --export-png exports all charts as 300 DPI PNG files via kaleido
- --interpret generates LLM-written section interpretations shown in the dashboard
- --export-png --interpret writes visualization_report.md (section heading + blockquote + PNG links), ready for academic papers
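The visualization_report.md layout mentioned above (section heading, blockquote, PNG links) might look roughly like the output of the sketch below. The heading level, the use of image-style links, the render_section name, and the example file path are assumptions; the real report format may differ.

```python
def render_section(title: str, interpretation: str, png_paths: list[str]) -> str:
    """Render one report section: heading, blockquoted interpretation, PNG links."""
    lines = [f"## {title}", ""]
    # Each line of the interpretation becomes a blockquote line.
    lines += [f"> {line}" for line in interpretation.splitlines()]
    lines.append("")
    lines += [f"![{title}]({path})" for path in png_paths]
    return "\n".join(lines)


section = render_section(
    "Sentiment & Perception",
    "Placeholder interpretation text.",
    ["charts/sentiment_donut.png"],  # hypothetical path to an exported chart
)
```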
5-Minute Quick Start
# Install (user)
uv tool install git+https://github.com/ecoinfoai/social-scout.git
# Configure credentials
mkdir -p ~/.config/social-scout
cp .env.example ~/.config/social-scout/.env
# → edit ~/.config/social-scout/.env with your API keys
# Create a project and run the pipeline
scout project create agentic-commerce
scout run agentic-commerce \
--keywords "agentic commerce,AI shopping" \
--communities technology,futurology \
--all-techniques
# Read the report
cat projects/agentic-commerce/reports/report.md
See the Quick Start guide for a step-by-step walkthrough.
Documentation
- Installation: system requirements, virtual environment, API keys
- Quick Start: get running in 5 minutes
- Tutorial: full walkthrough with a real research example
- Configuration: environment variables, .env file, advanced settings
Project
- Source code: github.com/ecoinfoai/social-scout
- Issues: github.com/ecoinfoai/social-scout/issues
- License: MIT