Competitor Network


A three-dimensional neural map of 33,000+ accounts, built from the intersection of semantic intelligence, government contract competition, and behavioral signals. Every account with a BI profile or award history is placed in this space — sponsors, prospects, and net-new logos alike.

Core question this answers: "Which Marketing-owned baseline account shares the same shape as a multi-portfolio Strategic whale — and what's the graduation pathway to get there?"

The Three Dimensions

1. Semantic Similarity — "What they do"

Built from BI narrative embeddings (1,536-dimensional vectors) generated from each account's business summary, products/services, SLED use cases, and target customers. Two accounts are semantically similar when their capabilities and market positioning overlap. Covers 33,112 accounts.
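
As a minimal sketch of what "semantically similar" means here, cosine similarity between two embedding vectors can be computed as follows. The function name and the toy 4-dimensional vectors are illustrative assumptions; the real vectors are 1,536-dimensional and the actual build uses numpy.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an all-zero embedding matches nothing
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, not real account data.
cloud_vendor = [0.9, 0.1, 0.0, 0.4]
security_vendor = [0.8, 0.2, 0.1, 0.5]
edtech_vendor = [0.0, 0.9, 0.4, 0.0]

sim_close = cosine_similarity(cloud_vendor, security_vendor)
sim_far = cosine_similarity(cloud_vendor, edtech_vendor)
```

Vectors pointing in nearly the same direction score close to 1, so the two infrastructure-style vendors end up far more similar to each other than either is to the education vendor.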

2. Competitive Geography — "Where they fight"

Built from award embeddings (solicitation titles from Navigator bids matched through entity resolution) and state co-occurrence (accounts bidding in the same jurisdictions). Two accounts are competitively linked when they chase the same government contracts. Covers 18,316 accounts.
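
State co-occurrence is a set-overlap measure. A minimal sketch of Jaccard similarity over the states two accounts bid in, using a hypothetical helper name and made-up state sets:

```python
def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty sets share nothing
    return len(a & b) / len(union)

# Hypothetical bidding footprints.
states_a = {"CA", "TX", "NY", "FL"}
states_b = {"CA", "TX", "WA"}

state_overlap = jaccard(states_a, states_b)  # 2 shared out of 5 distinct states
```

The same function would apply to any of the set-based signals, since it only needs two collections of labels.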

3. Behavioral Archetype (M3) — "How they behave"

The ML pipeline's Account Segmentation model scores each account on six behavioral dimensions: subscription depth, platform usage, pipeline activity, revenue growth, engagement patterns, and partnership tenure. This produces archetypes like Power User — Growing, Active Subscriber, Pipeline Building, or Baseline. Covers 40,323 accounts in the ML feature set.

Five Edge Signals

Every edge between two accounts is a weighted composite of five distinct signals:

Signal | Weight | Source | What it captures
BI Similarity | 40% | bi_embeddings.npy (cosine) | Overlapping capabilities, products, SLED relevance
Award Similarity | 25% | award_embeddings.npy (cosine) | Competing for same contract domains
Tag Co-occurrence | 20% | index_account_bi_tag (Jaccard) | Shared topic areas (Cybersecurity, AI, Cloud, etc.)
State Co-occurrence | 10% | index_account_award_state (Jaccard) | Competing in the same geographic markets
Category Overlap | 5% | index_account_categories (Jaccard) | Same engagement formats (events, webinars, papers)

Edges below a 0.15 composite threshold are pruned. The final graph contains 1.36M edges across 33,253 accounts.
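
The weighting and pruning step can be sketched as below. The weights and the 0.15 threshold come from this section; the function names, the edge tuple layout, and the choice to treat a missing signal as 0.0 are assumptions made for illustration.

```python
# Signal weights as listed under Five Edge Signals.
WEIGHTS = {"bi": 0.40, "award": 0.25, "tag": 0.20, "state": 0.10, "category": 0.05}
THRESHOLD = 0.15

def composite_weight(signals):
    """Weighted sum of the five signals; a missing signal counts as 0.0 (assumption)."""
    return sum(weight * signals.get(name, 0.0) for name, weight in WEIGHTS.items())

def prune(edges):
    """Keep only edges whose composite weight reaches the threshold."""
    kept = []
    for src, dst, signals in edges:
        weight = composite_weight(signals)
        if weight >= THRESHOLD:
            kept.append((src, dst, weight))
    return kept

# Hypothetical edges between placeholder accounts.
edges = [
    ("A", "B", {"bi": 0.6, "award": 0.3, "tag": 0.25, "state": 0.5, "category": 1.0}),
    ("A", "C", {"bi": 0.2, "tag": 0.1}),  # composite 0.10, falls below the threshold
]
kept = prune(edges)
```

Because the weights sum to 1, the composite stays in the same 0 to 1 range as the individual signals, which is what makes a single fixed threshold meaningful.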

Semantic Clustering

Louvain community detection runs on the weighted graph to discover natural competitive clusters. No labels are prescribed — the algorithm finds them organically from the data. Each cluster is then auto-labeled by its dominant BI tags.
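
The exact labeling rule is not spelled out here. One plausible sketch, assuming the label is the three most common BI tags among a cluster's members joined with " / " (modeled on the cluster names listed in this section):

```python
from collections import Counter

def label_cluster(member_tags, top_n=3):
    """Join the most common tags across a cluster's members into a label."""
    # Count each tag at most once per account.
    counts = Counter(tag for tags in member_tags for tag in set(tags))
    # Sort by frequency, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return " / ".join(tag for tag, _ in ranked[:top_n])

# Hypothetical tag lists for four accounts in one cluster.
members = [
    ["Cloud", "Cybersecurity"],
    ["Cloud", "IT Infra"],
    ["Cloud", "Cybersecurity", "IT Infra"],
    ["AI"],
]
label = label_cluster(members)
```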

The current build discovered 18 clusters, including:

  • Cloud / IT Infra / Cybersecurity — the mega-cluster (8,700+ accounts, 55 whales)
  • Cybersecurity / Identity & Access Management — specialist security (2,200 accounts, 33 whales)
  • K-12 / Higher Education — education vertical self-organized (2,100 accounts)
  • Health & Human Services — health vertical emerged naturally (770 accounts)
  • Election Technology — micro-cluster, niche market (46 accounts)

Whale Detection & Graduation Pathways

What is a whale?

A whale is an account with four or more of the five product categories (Event Sponsorship, Webinar, Paper, Special Event, Event Session). These are multi-portfolio strategic accounts such as AWS, Palo Alto Networks, IBM, Cisco, and Dell. The network contains 176 whales.
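
Expressed as a check, assuming each account carries a list of category names (the function and constant names are hypothetical):

```python
PRODUCT_CATEGORIES = {
    "Event Sponsorship", "Webinar", "Paper", "Special Event", "Event Session",
}
WHALE_MIN_CATEGORIES = 4

def is_whale(categories):
    """True when the account holds at least four distinct known product categories."""
    return len(set(categories) & PRODUCT_CATEGORIES) >= WHALE_MIN_CATEGORIES

whale = is_whale(["Event Sponsorship", "Webinar", "Paper", "Event Session"])
not_whale = is_whale(["Webinar", "Webinar", "Paper"])  # duplicates count once
```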

Graduation scoring

For every non-whale in the graph, the ETL computes:

  1. Nearest whale — which whale is most similar via composite edge weight
  2. Whale distance — the composite similarity score (higher = closer match)
  3. Category gap — how many product categories separate them from whale status
  4. Primary signal — which of the five edge signals most strongly connects them (BI, award, tag, state, or category)
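
The four outputs above could be derived roughly as follows. This is a sketch under stated assumptions, not the ETL's actual code: the function names and input shapes are hypothetical, the category gap is taken as the distance to the four-category whale threshold, and the primary signal is taken as the one with the largest weighted contribution.

```python
SIGNAL_WEIGHTS = {"bi": 0.40, "award": 0.25, "tag": 0.20, "state": 0.10, "category": 0.05}
WHALE_MIN_CATEGORIES = 4  # whale threshold from the definition above

def weighted_parts(signals):
    """Per-signal contribution to the composite edge weight."""
    return {name: w * signals.get(name, 0.0) for name, w in SIGNAL_WEIGHTS.items()}

def graduation_score(category_count, whale_edges):
    """whale_edges maps a whale's name to the signal scores on its edge to this account."""
    if not whale_edges:
        return None  # account has no surviving edge to any whale
    scored = {whale: weighted_parts(sig) for whale, sig in whale_edges.items()}
    nearest = max(scored, key=lambda whale: sum(scored[whale].values()))
    parts = scored[nearest]
    return {
        "nearest_whale": nearest,
        "whale_similarity": sum(parts.values()),
        "category_gap": max(0, WHALE_MIN_CATEGORIES - category_count),
        "primary_signal": max(parts, key=parts.get),
    }

# Hypothetical account with no product categories and edges to two placeholder whales.
result = graduation_score(0, {
    "Whale A": {"bi": 0.9, "award": 0.4, "tag": 0.5, "state": 0.6},
    "Whale B": {"bi": 0.5, "award": 0.7, "tag": 0.3, "state": 0.2},
})
```

In this example the account matches Whale A at roughly 0.62, driven mainly by BI similarity, and sits four categories away from whale status.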

The graduation insight: A Marketing-owned account at $0 revenue that sits right next to Palo Alto Networks in the semantic space, competes in the same states, but has 0 product categories — that's not a cold prospect. That's a whale in disguise. The category gap tells you exactly how many products away they are from the template.

Reading the graduation candidates

The Graduation tab shows accounts ranked by whale similarity, with a filter for sales team. How to read a candidate's match:

  • 72% whale match via BI = semantically identical to a whale, just not activated yet
  • 58% whale match via award = winning the same types of contracts, could be cross-sold into events
  • 55% whale match via tag = topic overlap is strong, the competitive context is there

Using the Views

Macro view

The top 300 accounts by PageRank, colored by cluster. Gives you the bird's-eye topology: which clusters are dense, which are isolated, where the whales concentrate. Use the Whales toggle to dim everything except whales and see the strategic landscape.
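
To make the ranking concrete, here is a simplified power-iteration PageRank over a weighted undirected edge list, followed by a top-N selection. The build itself uses NetworkX; this stdlib version is only an illustrative stand-in, and the damping factor, iteration count, and toy edges are assumptions.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Simplified weighted PageRank for an undirected graph given as (src, dst, weight)."""
    adj = {}
    for src, dst, weight in edges:
        adj.setdefault(src, {})[dst] = weight
        adj.setdefault(dst, {})[src] = weight
    nodes = list(adj)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    # Total edge weight per node, used to split its rank among neighbors.
    strength = {node: sum(adj[node].values()) for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            for neighbor, weight in adj[node].items():
                new_rank[neighbor] += damping * rank[node] * weight / strength[node]
        rank = new_rank
    return rank

# Hypothetical graph: H is a hub connected to everyone else.
edges = [("H", "A", 1.0), ("H", "B", 1.0), ("H", "C", 1.0), ("A", "B", 1.0)]
ranks = pagerank(edges)
top_accounts = sorted(ranks, key=ranks.get, reverse=True)[:2]  # the view uses 300
```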

Clusters view

Click a cluster in the side panel to load its members and inspect its internal structure: which whales anchor it and which prospects orbit them. Use the Labels toggle to show account names.

Graduation view

Shows the macro graph alongside ranked graduation candidates in the panel. Filter by sales team to find Marketing-owned prospects that match Strategic whale shapes. Click any candidate to load its ego network and see the specific whale connection.

Ego network

Search for any account or click a node to load its 1-hop neighborhood: the account, its strongest connections, and the edges between them. The side panel shows full account detail, BI tags, and, if applicable, the graduation pathway card.
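
A minimal sketch of extracting such an ego network from an edge list. The function name and the max_neighbors cap are hypothetical; the number of connections the service actually returns is not stated here.

```python
def ego_network(edges, center, max_neighbors=25):
    """Return the center, its strongest neighbors, and all edges among those nodes.

    edges is a list of undirected (src, dst, weight) tuples.
    """
    incident = []
    for src, dst, weight in edges:
        if src == center and dst != center:
            incident.append((dst, weight))
        elif dst == center and src != center:
            incident.append((src, weight))
    # Strongest connections first.
    incident.sort(key=lambda pair: pair[1], reverse=True)
    nodes = {center} | {name for name, _ in incident[:max_neighbors]}
    # Keep every edge whose endpoints are both in the neighborhood.
    kept = [(s, d, w) for s, d, w in edges if s in nodes and d in nodes]
    return nodes, kept

# Hypothetical edges between placeholder accounts.
edges = [
    ("A", "B", 0.6), ("A", "C", 0.4), ("A", "D", 0.2),
    ("B", "C", 0.3), ("C", "D", 0.5), ("E", "B", 0.7),
]
nodes, kept = ego_network(edges, "A", max_neighbors=2)
```

Including the neighbor-to-neighbor edges (B to C here) is what lets the view show whether an account's closest matches also compete with one another.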

Technical Details

Component | Detail
ETL module | etl/competitor_network.py (10-phase pipeline, ~135s runtime)
Service | services/competitor_network.py (lazy-loaded singleton, sub-second queries)
API | api/competitor_network.py (8 endpoints under /api/competitor-network/)
Graph library | NetworkX 3.3 (Louvain, PageRank)
Visualization | D3.js v7 force simulation
Similarity | Batched cosine similarity (500 rows/batch, numpy), top-30 neighbors per account
Clustering | Louvain community detection (resolution=1.0, weighted edges)

Data outputs

File | Content
data/indexes/competitor_edges.parquet | 1.36M edges with 5-signal decomposition
data/indexes/competitor_nodes.parquet | 33,253 nodes with cluster IDs, PageRank, graduation scores
data/indexes/competitor_clusters.parquet | 18 clusters with labels, whale counts, tag distributions

Rebuild

To regenerate the network after new BI profiles or award data:

python -m etl.competitor_network --top-k 30
