
Data Pipeline

Freshness: live, fresh, stale, expired, missing
Local-Sync Daemon
Status
Phase
Cycles
Last Run
Next Run
Cloud Run Slots (claimed by local while green)
EF3 / Redshift Last Refreshed (only the laptop pulls Redshift)
Status
Last Pull
Rows
Host

Local-Sync Events

Data Health
Files
Size
Stale
Missing
Cloud Jobs
Local Data Gap

Mode & Sources

Data Sources

Enhancements & Build

Processing (always on)
Privacy Filter
Attribution Enrichment
Pre-compute Attribution Stats
Inverted indexes after enrichment
Venue AI categorization (processed_records)
Enrichment Chain
Tier‑2 parquet (optional, on by default)
Build & Deploy
Cache Reload

This path sets the pipeline_operator flag: venue categorization runs after extraction, and the inverted-index, embedding, and competitor-finalize steps always run afterward. Deep-research disk export and the Phase 2.7 research_priority / account_signals builds are omitted (use research jobs for those). ML inference parquet export stays off unless you POST auto_ml_intelligence: true without pipeline_operator from the API. ML Enrichment and Abstracts are forced on server-side so that Activities/Campaigns and abstract-gated venue tagging always participate. When categorization finds zero venues queued for LLM batches (using the same junk/title rules as production), processed_records is still touched so that the “Updated” clock in the Pipeline table reflects that verification pass.
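The flag interaction described above can be sketched as follows. This is a minimal illustration, not the server's actual code: only pipeline_operator and auto_ml_intelligence are names taken from the text, and every other key (ml_enrichment, abstracts, deep_research_export, phase_2_7_builds, ml_inference_parquet_export) is a hypothetical name chosen for the example.

```python
def ml_parquet_export_enabled(payload: dict) -> bool:
    """Export is on only when auto_ml_intelligence is true AND pipeline_operator is not set."""
    return bool(payload.get("auto_ml_intelligence")) and not bool(payload.get("pipeline_operator"))


def resolve_flags(payload: dict) -> dict:
    """Return a copy of the request payload with the server-side overrides applied.

    All keys other than pipeline_operator and auto_ml_intelligence are
    hypothetical names used for illustration.
    """
    flags = dict(payload)
    # Forced on server-side regardless of what the client sent.
    flags["ml_enrichment"] = True
    flags["abstracts"] = True
    if flags.get("pipeline_operator"):
        # The operator path omits these builds; research jobs cover them instead.
        flags["deep_research_export"] = False
        flags["phase_2_7_builds"] = False
    flags["ml_inference_parquet_export"] = ml_parquet_export_enabled(payload)
    return flags


if __name__ == "__main__":
    print(resolve_flags({"pipeline_operator": True, "auto_ml_intelligence": True}))
```

Under these assumptions, sending both flags together leaves the parquet export off, which matches the "without pipeline_operator" condition in the text.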

Advanced: Individual ETL Operations

Source connectivity

Probes run in this runtime (laptop or Cloud Run web service), not in a job pod. Allowlist the egress IP below on Navigator and Redshift; it may differ from the job NAT IP. Source mapping: Venues = Salesforce only; Redshift = EF3 abstracts; Navigator = bids/RFP.
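A connectivity probe of this kind can be approximated with a plain TCP connection attempt made from the current runtime. The sketch below is an assumption about what such a probe might look like, not the dashboard's implementation; probe_tcp and the example host are hypothetical, and a successful connect only shows that the port is reachable from this runtime's egress IP, not that credentials are valid.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical endpoint; replace with the real source host and port.
    print(probe_tcp("redshift.example.internal", 5439))
```

Because the check runs wherever the web service runs, a probe that fails here while the job succeeds (or the reverse) would be consistent with the two runtimes leaving from different egress IPs.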


Execution History

Job | Execution | Status | Started | Duration

Hot-Reload Production

Pull fresh data from GCS into the Cloud Run serving container and reload all caches.

Current State


Deploy Pipeline

Runs deploy.ps1 locally: test → build data bundle → docker build → push → Cloud Run rollout. Docker Desktop must be running.
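The stage order above can be modelled as a simple sequential runner. This sketch is illustrative only: it does not reproduce deploy.ps1, and the stop-at-first-failure behaviour is an assumption rather than something the text states. run_stages and the placeholder stage callables are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_stages(stages: List[Stage]) -> Tuple[List[str], Optional[str]]:
    """Run stages in order; return (completed stage names, name of the failed stage or None)."""
    completed: List[str] = []
    for name, step in stages:
        if not step():
            # Assumed behaviour: later stages are skipped once one fails.
            return completed, name
        completed.append(name)
    return completed, None


if __name__ == "__main__":
    # Placeholder callables standing in for the real commands.
    pipeline: List[Stage] = [
        ("test", lambda: True),
        ("build data bundle", lambda: True),
        ("docker build", lambda: True),
        ("push", lambda: True),
        ("Cloud Run rollout", lambda: True),
    ]
    print(run_stages(pipeline))
```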

Revisions

Revision | Created | CPU | Memory | Instances

GCS Operations

Run History

Job | Type | Status | Started | Duration | Log Lines

Live Console

--:--:-- Ready. Select a tab and run a pipeline.
