Competitor Network
A three-dimensional neural map of 33,000+ accounts, built from the intersection of semantic intelligence, government contract competition, and behavioral signals. Every account with a BI profile or award history is placed in this space — sponsors, prospects, and net-new logos alike.
Core question this answers: "Which Marketing-owned baseline account shares the same shape as a multi-portfolio Strategic whale — and what's the graduation pathway to get there?"
The Three Dimensions
1. Semantic Similarity — "What they do"
Built from BI narrative embeddings (1,536-dimensional vectors) generated from each account's business summary, products/services, SLED use cases, and target customers. Two accounts are semantically similar when their capabilities and market positioning overlap. Covers 33,112 accounts.
2. Competitive Geography — "Where they fight"
Built from award embeddings (solicitation titles from Navigator bids matched through entity resolution) and state co-occurrence (accounts bidding in the same jurisdictions). Two accounts are competitively linked when they chase the same government contracts. Covers 18,316 accounts.
3. Behavioral Archetype (M3) — "How they behave"
The ML pipeline's Account Segmentation model scores each account on six behavioral dimensions: subscription depth, platform usage, pipeline activity, revenue growth, engagement patterns, and partnership tenure. This produces archetypes like Power User — Growing, Active Subscriber, Pipeline Building, or Baseline. Covers 40,323 accounts in the ML feature set.
Five Edge Signals
Every edge between two accounts is a weighted composite of five distinct signals:
| Signal | Weight | Source | What it captures |
|---|---|---|---|
| BI Similarity | 40% | bi_embeddings.npy cosine | Overlapping capabilities, products, SLED relevance |
| Award Similarity | 25% | award_embeddings.npy cosine | Competing for same contract domains |
| Tag Co-occurrence | 20% | index_account_bi_tag Jaccard | Shared topic areas (Cybersecurity, AI, Cloud, etc.) |
| State Co-occurrence | 10% | index_account_award_state Jaccard | Competing in the same geographic markets |
| Category Overlap | 5% | index_account_categories Jaccard | Same engagement formats (events, webinars, papers) |
Edges below a 0.15 composite threshold are pruned. The final graph contains 1.36M edges across 33,253 accounts.
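As a rough illustration of how the weights in the table combine, the sketch below scores one pair of accounts and applies the 0.15 pruning threshold. Only the weights and the threshold come from the table above; the key names, helper functions, and sample values are hypothetical and are not taken from etl/competitor_network.py.

```python
# Weights from the Five Edge Signals table; the key names are illustrative.
SIGNAL_WEIGHTS = {
    "bi": 0.40,
    "award": 0.25,
    "tag": 0.20,
    "state": 0.10,
    "category": 0.05,
}
PRUNE_THRESHOLD = 0.15  # edges scoring below this are dropped


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def composite_weight(signals: dict) -> float:
    """Weighted sum of the five signals; a missing signal counts as 0."""
    return sum(w * signals.get(name, 0.0) for name, w in SIGNAL_WEIGHTS.items())


# Hypothetical signal values for one pair of accounts.
signals = {
    "bi": 0.62,     # cosine similarity of BI embeddings (assumed value)
    "award": 0.30,  # cosine similarity of award embeddings (assumed value)
    "tag": jaccard({"Cybersecurity", "Cloud", "AI"}, {"Cybersecurity", "Cloud"}),
    "state": jaccard({"CA", "TX"}, {"TX", "NY", "FL"}),
    "category": 0.0,
}
score = composite_weight(signals)
keep_edge = score >= PRUNE_THRESHOLD
```

Because the weights sum to 1 and each signal lies in [0, 1] (assuming non-negative cosine values), the composite also stays in [0, 1], which is what lets it be read as a percentage match.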
Semantic Clustering
Louvain community detection runs on the weighted graph to discover natural competitive clusters. No labels are prescribed — the algorithm finds them organically from the data. Each cluster is then auto-labeled by its dominant BI tags.
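A minimal sketch of this step on a toy graph, assuming NetworkX's built-in Louvain implementation and a simple "most frequent tags" labeling rule. The account names, tags, seed, and the label_cluster helper are hypothetical; the actual labeling logic in the ETL may differ.

```python
from collections import Counter

import networkx as nx

# Toy weighted graph: two tight groups joined by one weak edge.
G = nx.Graph()
G.add_weighted_edges_from([
    ("a1", "a2", 0.90), ("a2", "a3", 0.80), ("a1", "a3", 0.85),
    ("b1", "b2", 0.90), ("b2", "b3", 0.80), ("b1", "b3", 0.85),
    ("a3", "b1", 0.16),
])

# Hypothetical BI tags per account.
tags = {
    "a1": ["Cloud", "Cybersecurity"], "a2": ["Cloud"], "a3": ["Cloud", "AI"],
    "b1": ["K-12"], "b2": ["K-12", "Higher Education"], "b3": ["K-12"],
}

# Louvain on the weighted graph; the seed is fixed only for reproducibility.
communities = nx.community.louvain_communities(
    G, weight="weight", resolution=1.0, seed=42
)


def label_cluster(members, top_n=2):
    """Label a cluster by its most frequent tags (assumed labeling rule)."""
    counts = Counter(t for m in members for t in tags.get(m, []))
    return " / ".join(t for t, _ in counts.most_common(top_n))


labels = {frozenset(c): label_cluster(c) for c in communities}
```

Louvain is randomized, so cluster counts and membership can shift between builds unless a seed is fixed.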
The current build discovered 18 clusters, including:
- Cloud / IT Infra / Cybersecurity — the mega-cluster (8,700+ accounts, 55 whales)
- Cybersecurity / Identity & Access Management — specialist security (2,200 accounts, 33 whales)
- K-12 / Higher Education — education vertical self-organized (2,100 accounts)
- Health & Human Services — health vertical emerged naturally (770 accounts)
- Election Technology — micro-cluster, niche market (46 accounts)
Whale Detection & Graduation Pathways
What is a whale?
An account with 4+ product categories (Event Sponsorship, Webinar, Paper, Special Event, Event Session). These are multi-portfolio strategic accounts like AWS, Palo Alto Networks, IBM, Cisco, and Dell. The network contains 176 whales.
Graduation scoring
For every non-whale in the graph, the ETL computes:
- Nearest whale: the whale with the highest composite edge weight to the account
- Whale distance: the composite similarity score to that whale (despite the name, higher = closer match)
- Category gap: how many product categories separate the account from whale status
- Primary signal: which of the five edge signals most strongly connects the account to that whale (BI, award, tag, state, or category)
The graduation insight: A Marketing-owned account at $0 revenue that sits right next to Palo Alto Networks in the semantic space, competes in the same states, but has 0 product categories — that's not a cold prospect. That's a whale in disguise. The category gap tells you exactly how many products away they are from the template.
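The scoring above can be sketched roughly as follows. This is an illustration under stated assumptions, not the ETL's actual code: the category gap is taken as 4 minus the account's category count, the primary signal is taken as the largest weighted contribution to the composite, and all account names, field names, and values are hypothetical.

```python
WHALE_MIN_CATEGORIES = 4  # "4+ product categories" per the whale definition
WEIGHTS = {"bi": 0.40, "award": 0.25, "tag": 0.20, "state": 0.10, "category": 0.05}


def composite(signals):
    """Weighted sum of the five edge signals."""
    return sum(w * signals.get(name, 0.0) for name, w in WEIGHTS.items())


def graduation_scores(category_counts, edges):
    """For each non-whale, keep its strongest edge to any whale."""
    whales = {a for a, n in category_counts.items() if n >= WHALE_MIN_CATEGORIES}
    best = {}
    for (u, v), signals in edges.items():
        for account, other in ((u, v), (v, u)):
            # Only score non-whale -> whale connections.
            if account in whales or other not in whales:
                continue
            score = composite(signals)
            if account in best and score <= best[account]["whale_similarity"]:
                continue
            best[account] = {
                "nearest_whale": other,
                "whale_similarity": score,
                # Assumption: gap is measured against the 4-category threshold.
                "category_gap": WHALE_MIN_CATEGORIES - category_counts.get(account, 0),
                # Assumption: primary signal = largest weighted contribution.
                "primary_signal": max(
                    WEIGHTS, key=lambda s: WEIGHTS[s] * signals.get(s, 0.0)
                ),
            }
    return best


# Hypothetical data.
category_counts = {"WhaleCo": 5, "BigVendor": 4, "Prospect A": 0, "Prospect B": 2}
edges = {
    ("Prospect A", "WhaleCo"): {"bi": 0.72, "award": 0.40, "tag": 0.50, "state": 0.30, "category": 0.0},
    ("Prospect A", "BigVendor"): {"bi": 0.30, "award": 0.20, "tag": 0.20, "state": 0.50, "category": 0.0},
    ("Prospect B", "BigVendor"): {"bi": 0.35, "award": 0.80, "tag": 0.40, "state": 0.60, "category": 0.5},
}
result = graduation_scores(category_counts, edges)
```

In this toy data, Prospect A lands next to WhaleCo mainly through BI similarity and sits four categories away, while Prospect B connects to BigVendor mainly through award similarity and sits two categories away.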
Reading the graduation candidates
The Graduation tab ranks accounts by whale similarity and can be filtered by sales team. Read a candidate's match as follows:
- 72% whale match via BI = semantically very close to a whale, just not activated yet
- 58% whale match via award = competing for the same types of contracts, could be cross-sold into events
- 55% whale match via tag = topic overlap is strong, the competitive context is there
Using the Views
Macro view
The top 300 accounts by PageRank, colored by cluster. Gives you the bird's-eye topology: which clusters are dense, which are isolated, where the whales concentrate. Use the Whales toggle to dim everything except whales and see the strategic landscape.
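A small sketch of how a top-N selection by PageRank might look with NetworkX, shown here on a toy graph with N = 3 instead of 300. The node names and weights are made up, and using edge weights in PageRank is an assumption. Note that NetworkX 3.x computes PageRank through SciPy, so SciPy needs to be installed.

```python
import networkx as nx

# Toy weighted graph; "hub" is connected to most other nodes.
G = nx.Graph()
G.add_weighted_edges_from([
    ("hub", "a", 0.9), ("hub", "b", 0.8), ("hub", "c", 0.7),
    ("a", "b", 0.3), ("c", "d", 0.2),
])

# Weighted PageRank; scores sum to 1 across all nodes.
scores = nx.pagerank(G, weight="weight")

# Keep the top-N accounts and the edges among them for the macro view.
top_n = 3
top_accounts = sorted(scores, key=scores.get, reverse=True)[:top_n]
macro = G.subgraph(top_accounts)
```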
Clusters view
Click a cluster in the side panel to load its members and see how it is organized internally: which whales anchor the cluster and which prospects orbit them. Use the Labels toggle to show account names.
Graduation view
Shows the macro graph alongside ranked graduation candidates in the panel. Filter by sales team to find Marketing-owned prospects that match Strategic whale shapes. Click any candidate to load its ego network and see the specific whale connection.
Ego network
Search for any account or click a node. Loads the 1-hop neighborhood: the account + its strongest connections + edges between them. The side panel shows full account detail, BI tags, and if applicable, the graduation pathway card.
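One way to picture the 1-hop neighborhood is as an induced subgraph over the account and its strongest neighbors. The sketch below assumes a NetworkX graph with a "weight" attribute on each edge; the ego_network helper, its max_neighbors cutoff, and the sample data are hypothetical rather than the service's real interface.

```python
import networkx as nx


def ego_network(G, account, max_neighbors=2):
    """Return the account, its strongest neighbors, and the edges among them."""
    neighbors = sorted(
        G[account], key=lambda n: G[account][n]["weight"], reverse=True
    )[:max_neighbors]
    # The induced subgraph keeps neighbor-to-neighbor edges as well.
    return G.subgraph([account, *neighbors]).copy()


# Toy graph with made-up accounts and weights.
G = nx.Graph()
G.add_weighted_edges_from([
    ("focus", "x", 0.9), ("focus", "y", 0.6), ("focus", "z", 0.2),
    ("x", "y", 0.5), ("z", "w", 0.4),
])
ego = ego_network(G, "focus", max_neighbors=2)
```

Here the weakest neighbor ("z") and the 2-hop node ("w") are left out, while the x–y edge between two neighbors is kept.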
Technical Details
| Component | Detail |
|---|---|
| ETL module | etl/competitor_network.py — 10-phase pipeline, ~135s runtime |
| Service | services/competitor_network.py — lazy-loaded singleton, sub-second queries |
| API | api/competitor_network.py — 8 endpoints under /api/competitor-network/ |
| Graph library | NetworkX 3.3 (Louvain, PageRank) |
| Visualization | D3.js v7 force simulation |
| Similarity | Batched cosine similarity (500 rows/batch, numpy), top-30 neighbors per account |
| Clustering | Louvain community detection (resolution=1.0, weighted edges) |
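The batched similarity step in the table can be sketched as below: rows are L2-normalized once, then each batch is multiplied against the full matrix so that only a batch-sized slice of the similarity matrix is in memory at a time. The function name, signature, and toy embeddings are assumptions; only the defaults (500 rows per batch, top-30 neighbors) come from the table.

```python
import numpy as np


def top_k_cosine(embeddings, k=30, batch_size=500):
    """Return (indices, similarities) of the top-k cosine neighbors per row."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)  # avoid divide-by-zero
    n = unit.shape[0]
    k = min(k, n - 1)  # a row cannot be its own neighbor
    idx_out = np.empty((n, k), dtype=np.int64)
    sim_out = np.empty((n, k), dtype=unit.dtype)
    for start in range(0, n, batch_size):
        stop = min(start + batch_size, n)
        sims = unit[start:stop] @ unit.T  # (batch, n) cosine similarities
        # Exclude self-matches by masking the diagonal entries of this batch.
        sims[np.arange(stop - start), np.arange(start, stop)] = -np.inf
        # argpartition finds the k largest without a full sort, then sort those k.
        part = np.argpartition(-sims, k - 1, axis=1)[:, :k]
        part_sims = np.take_along_axis(sims, part, axis=1)
        order = np.argsort(-part_sims, axis=1)
        idx_out[start:stop] = np.take_along_axis(part, order, axis=1)
        sim_out[start:stop] = np.take_along_axis(part_sims, order, axis=1)
    return idx_out, sim_out


# Toy 2-D embeddings: rows 0/1 point one way, rows 2/3 another.
embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
neighbor_idx, neighbor_sims = top_k_cosine(embeddings, k=1, batch_size=2)
```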
Data outputs
| File | Content |
|---|---|
| data/indexes/competitor_edges.parquet | 1.36M edges with 5-signal decomposition |
| data/indexes/competitor_nodes.parquet | 33,253 nodes with cluster IDs, PageRank, graduation scores |
| data/indexes/competitor_clusters.parquet | 18 clusters with labels, whale counts, tag distributions |
Rebuild
To regenerate the network after new BI profiles or award data:
python -m etl.competitor_network --top-k 30