The Complete Guide to Autobound's Signal Database: Real-Time Intelligence for Revenue Teams

The best sales teams don't prospect harder — they prospect smarter. When a CFO posts about digital transformation challenges, when a 10-K reveals an AI investment initiative, when Reddit threads surface competitive displacement conversations — these are the moments that convert. Autobound's Signal Database captures these windows systematically across 250M+ contacts and 75M+ companies, delivering schema-validated, entity-resolved signals directly to your pipeline. Here's exactly what we track, how it works, and why it matters for revenue.

Daniel Wiener

Oracle and USC Alum, Building the ChatGPT for Sales.

Static firmographics tell you who might buy. Signals tell you when. Autobound's Signal Database monitors 250+ million contacts and 75+ million companies across 17 signal categories, from SEC filings and hiring velocity to G2 reviews and LinkedIn posts, delivering the triggers that actually convert. SEC data refreshes weekly and LinkedIn activity bi-weekly, with enterprise-grade GCS delivery in JSONL and Parquet formats. This is the complete technical guide to what we track, how it's structured, and how to integrate it into your pipeline.

Why signals beat static data for pipeline generation

Traditional B2B prospecting relies on firmographics such as company size and industry, which tell you who might buy but nothing about when. Signals flip this model. When a CFO posts about digital transformation challenges, when a company's 10-K reveals an AI investment initiative, or when Reddit threads surface competitive displacement conversations, these moments create windows where outreach actually converts.

The Autobound Signal Database captures these windows systematically. Rather than building dozens of scrapers and maintaining complex ETL pipelines, teams get a unified feed of schema-validated, entity-resolved signals delivered directly to GCS buckets or via API. Each signal comes with confidence scores, LLM-generated summaries, and the raw evidence needed for downstream processing.

Three ways teams use signals today:

  • Sales triggers: Route high-intent signals (job changes, funding, hiring velocity spikes) to reps in real time
  • Account scoring: Feed signal density and recency into predictive models for prioritization
  • Content personalization: Use extracted pain points, initiatives, and technologies to generate relevant messaging at scale
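
As a rough illustration of the account-scoring use case, signal density and recency can be folded into a single number by letting each signal's weight decay with age. This is only a sketch: the 30-day half-life and the `account_score` helper are our own assumptions for the example, not part of the Signal Database; only the `detected_at` field comes from the schema described later in this guide.

```python
from datetime import datetime, timezone

HALF_LIFE_DAYS = 30.0  # assumption: illustrative decay rate, not an Autobound default


def account_score(signals, now):
    """Sum recency-weighted contributions of all signals for one account."""
    score = 0.0
    for s in signals:
        detected = datetime.fromisoformat(s["detected_at"])
        age_days = (now - detected).total_seconds() / 86400
        # A signal loses half its weight every HALF_LIFE_DAYS.
        score += 0.5 ** (age_days / HALF_LIFE_DAYS)
    return score


now = datetime(2026, 2, 3, tzinfo=timezone.utc)
signals = [
    {"detected_at": "2026-02-03T00:00:00+00:00"},  # detected today
    {"detected_at": "2026-01-04T00:00:00+00:00"},  # 30 days old
]
print(account_score(signals, now))
```

More signals and fresher signals both push the score up, which is usually the behavior a prioritization model wants as an input feature.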

Contact-Level Signals

Contact signals capture what individual people are doing — job moves, public posts, communication style, shared connections. These are the signals that let you know who specifically is worth reaching out to and why now.

Job Changes

When someone changes jobs, there's a window where they're evaluating new tools, building new processes, and open to conversations they'd ignore six months later. The job change signal captures these transitions within a 90-day window.

You get the full career context — where they came from, where they landed, what they were doing before, and what they're doing now. The founded_new_company flag identifies entrepreneurial transitions separately from lateral moves or promotions.
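
A minimal sketch of how a team might act on this, assuming each record carries a `start_date` (a hypothetical field name chosen for illustration) alongside the `founded_new_company` flag mentioned above. It keeps only moves inside the 90-day window and separates founders from regular hires.

```python
from datetime import date, timedelta

WINDOW_DAYS = 90  # the window stated in the text above


def split_recent_job_changes(records, today):
    """Return (new_hires, founders) whose move falls inside the window."""
    cutoff = today - timedelta(days=WINDOW_DAYS)
    # "start_date" is a hypothetical field name; check the real schema.
    recent = [r for r in records
              if date.fromisoformat(r["start_date"]) >= cutoff]
    founders = [r for r in recent if r["founded_new_company"]]
    new_hires = [r for r in recent if not r["founded_new_company"]]
    return new_hires, founders


records = [
    {"name": "A", "start_date": "2026-01-10", "founded_new_company": False},
    {"name": "B", "start_date": "2025-09-01", "founded_new_company": False},
    {"name": "C", "start_date": "2026-01-20", "founded_new_company": True},
]
new_hires, founders = split_recent_job_changes(records, date(2026, 2, 3))
print([r["name"] for r in new_hires], [r["name"] for r in founders])
```

Founders and new hires typically warrant different messaging, so splitting them early keeps downstream routing simple.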

Coverage spans 10-25% of monitored contacts with weekly refresh. Full job change schema →

LinkedIn Posts

People tell you what they care about through their posts. The challenge is extracting structure from free text at scale. The LinkedIn post signal does this automatically, parsing pain points, initiatives, technologies, and competitors from each post.

The intensity and urgency scores (0-1) let you prioritize — a pain point at 0.9 intensity is more pressing than one at 0.3. The status field on technologies tells you where they are in the adoption cycle: evaluating, using, migrating_from, considering, etc.
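
To make the prioritization concrete, here is a small sketch that ranks pain points by intensity and picks out technologies that appear to be in motion. The record layout (`pain_points`, `technologies`, `text`, `name`) and the sample values are assumptions for illustration; only the `intensity`, `urgency`, and `status` fields and the status values are taken from the description above.

```python
# Hypothetical record shape; the real schema may nest or name fields differently.
post_signal = {
    "pain_points": [
        {"text": "manual reporting", "intensity": 0.3, "urgency": 0.2},
        {"text": "CRM data quality", "intensity": 0.9, "urgency": 0.7},
    ],
    "technologies": [
        {"name": "ToolA", "status": "using"},
        {"name": "ToolB", "status": "evaluating"},
        {"name": "ToolC", "status": "migrating_from"},
    ],
}

# Statuses that suggest an open buying window (our own grouping).
OPEN_STATUSES = {"evaluating", "considering", "migrating_from"}


def top_pain_points(signal, min_intensity=0.5):
    """Pain points at or above the threshold, most intense first."""
    hits = [p for p in signal["pain_points"] if p["intensity"] >= min_intensity]
    return sorted(hits, key=lambda p: p["intensity"], reverse=True)


def techs_in_motion(signal):
    """Names of technologies whose status is in OPEN_STATUSES."""
    return [t["name"] for t in signal["technologies"]
            if t["status"] in OPEN_STATUSES]


print(top_pain_points(post_signal))
print(techs_in_motion(post_signal))
```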

Refresh is bi-weekly with 25-50% coverage. Full LinkedIn posts schema →

Behavioral Profiles

How someone prefers to communicate matters as much as what you say to them. The behavioral profile signal infers DISC personality dimensions from a contact's digital footprint, providing guidance on tone, structure, and persuasion approach. Coverage reaches 50-75% of contacts with weekly refresh. Full behavioral profile schema →

Shared Experiences

Common ground creates instant rapport. The shared experience signals detect previous employers where both parties worked, alma mater connections, and overlapping professional networks — structured for easy matching. Full shared experiences schema →

Company-Level Signals

Company signals monitor public filings, digital footprints, hiring behavior, and market sentiment to identify organizational buying readiness.

SEC Filings

Autobound processes every 10-K, 10-Q, 8-K, 20-F, and 6-K filing using LLMs trained on SEC document structure. Rather than parsing 200-page annual reports yourself, you get structured signals classified into 70+ subtypes — aiInvestment, digitalTransformation, costReduction, internationalExpansion, ceoChange, and so on.

The metrics object provides structured numerical data when available — dollar amounts, percentages, timeframes — so you can filter signals by magnitude without parsing the summary text. All SEC signals refresh weekly (upgraded from monthly in January 2026), with cross-signal deduplication ensuring the same executive change mentioned in a 10-Q and earnings call generates only one signal. Full 10-K schema →

Glassdoor Reviews

The glassdoor signal aggregates employee sentiment across rating categories, curated feedback excerpts, and competitor mentions. You get the overall rating, breakdown by dimension (culture, compensation, work-life balance, management, career opportunities), and actual review snippets organized by sentiment.

The competitors array surfaces companies mentioned by employees — often revealing competitive dynamics not visible elsewhere. Subtypes like glassdoorConsistentLeadershipComplaints and glassdoorTalentRetentionConcerns enable filtering for specific internal challenges. Full Glassdoor schema →

Employee Growth & Departmental Trends

The employee breakdown signal provides headcount distribution and growth rates by department, enabling detection of organizational priorities without manual research.

The 1yearGrowthByDepartment object lets you filter for companies investing in specific functions — engineering headcount up 40% signals product investment, sales hiring spikes indicate GTM expansion. Full employee breakdown schema →
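
A filter along these lines might look like the sketch below. It assumes `1yearGrowthByDepartment` maps department names to percentage growth (so 40.0 means 40%) and that each record exposes a `domain`; both the value format and the sample companies are illustrative assumptions.

```python
def investing_in(companies, department, min_growth_pct):
    """Return domains whose 1-year growth in `department` meets the threshold."""
    matches = []
    for c in companies:
        # Assumption: values are percentages keyed by department name.
        growth = c["1yearGrowthByDepartment"].get(department)
        if growth is not None and growth >= min_growth_pct:
            matches.append(c["domain"])
    return matches


companies = [
    {"domain": "example-a.com",
     "1yearGrowthByDepartment": {"engineering": 40.0, "sales": 5.0}},
    {"domain": "example-b.com",
     "1yearGrowthByDepartment": {"engineering": 8.0, "sales": 35.0}},
    {"domain": "example-c.com",
     "1yearGrowthByDepartment": {"sales": 12.0}},
]
print(investing_in(companies, "engineering", 25))
print(investing_in(companies, "sales", 30))
```

Companies with no entry for a department are skipped rather than treated as zero growth, since missing data and flat headcount are different situations.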

News Events

The news signal captures leadership changes, funding announcements, partnerships, and other public events with structured metadata.

The insightCategories array provides machine-readable event classification. News signals refresh weekly, with daily delivery under evaluation. Full news schema →

Product Hunt Launches

Track when companies launch new products on Product Hunt — useful for identifying GTM activity and product development priorities.

The topics array enables filtering by product category. Full Product Hunt schema →

Hiring Velocity

Two complementary signals track hiring behavior across 21+ million domains. hiring-velocity measures pace with accelerating, steady, or decelerating trend indicators, comparing current openings to 60 days prior. hiring-trends provides department-level snapshots. Both refresh weekly. Full hiring velocity schema →
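
The trend indicator can be thought of as a comparison between two snapshots. The sketch below shows one plausible way to derive it; the 10% tolerance band and the handling of a zero baseline are our own assumptions, and the actual classification rules behind hiring-velocity may differ.

```python
def classify_trend(current_openings, openings_60d_ago, tolerance=0.10):
    """Label hiring pace by relative change versus the count 60 days prior."""
    if openings_60d_ago == 0:
        # No baseline: any openings at all count as acceleration.
        return "accelerating" if current_openings > 0 else "steady"
    change = (current_openings - openings_60d_ago) / openings_60d_ago
    if change > tolerance:
        return "accelerating"
    if change < -tolerance:
        return "decelerating"
    return "steady"


print(classify_trend(30, 20))
print(classify_trend(20, 20))
print(classify_trend(10, 20))
```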

Reddit Mentions

The reddit-company signal aggregates discussions from B2B-relevant subreddits (r/sysadmin, r/devops, r/saas, r/msp) with structured subtypes: churnRisk, buyingIntent, competitorMention, pricingConcern. Each signal includes a moderation score for brand safety (recommended filtering: confidence_score >= 0.8). Sample distribution from recent batches: 925 buying intent signals, 321 churn risk signals, 192 competitive intelligence signals. Full Reddit schema →
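
Applying the recommended threshold is a one-line filter. In this sketch the record layout is an assumption (we place `confidence_score` and `signal_subtype` at the top level for simplicity), and the sample rows are made up.

```python
MIN_CONFIDENCE = 0.8  # the recommended threshold quoted above


def brand_safe(signals, subtype=None):
    """Keep signals at or above the threshold, optionally for one subtype."""
    return [s for s in signals
            if s["confidence_score"] >= MIN_CONFIDENCE
            and (subtype is None or s["signal_subtype"] == subtype)]


signals = [
    {"signal_id": "r1", "signal_subtype": "buyingIntent", "confidence_score": 0.92},
    {"signal_id": "r2", "signal_subtype": "churnRisk", "confidence_score": 0.85},
    {"signal_id": "r3", "signal_subtype": "buyingIntent", "confidence_score": 0.55},
]
print([s["signal_id"] for s in brand_safe(signals)])
print([s["signal_id"] for s in brand_safe(signals, "buyingIntent")])
```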

Technographics

The tech-stack signal covers 218M+ companies with three core subtypes: techUsedProspectUsesCompetitor (companies currently on a competing product), techUsedProspectRecentlyAdoptedCompetitor (switched in the last 90 days), and techUsedProspectUsesComplementaryTech (good fit based on adjacent tools). Full tech stack schema →

Website Intelligence

The website-intelligence signal monitors ~2 million company websites for approximately 3,600 distinct change types: pricing page updates, new product launches, partnership announcements, customer logo additions, security certifications, messaging changes. Monthly refresh. Full website intelligence schema →

Universal Schema

Every signal follows a normalized structure regardless of source. The outer envelope is consistent — signal_id, signal_type, signal_subtype, detected_at, association (company or contact), plus entity resolution fields. The data object varies by signal type but always includes an LLM-generated summary.

Entity resolution coverage:

  • company.domain: 99%+ coverage (primary join key)
  • company.linkedin_url: 95%+ coverage
  • contact.email: 85-95% coverage
  • contact.linkedin_url: 90-98% coverage

The association field indicates whether a signal is company-level or contact-level, enabling clean routing logic. Full schema documentation →
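
As a sketch of what that routing logic could look like, the envelope fields listed above can be modeled as a small dataclass and dispatched on `association`. The field types, the `route` helper, and the sample IDs are assumptions for illustration; entity resolution fields are omitted for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Signal:
    # Envelope fields named in the text; types are assumed.
    signal_id: str
    signal_type: str
    signal_subtype: str
    detected_at: str
    association: str  # "company" or "contact"
    data: dict = field(default_factory=dict)


def route(signals):
    """Group signal IDs into per-association queues."""
    queues = defaultdict(list)
    for s in signals:
        if s.association not in ("company", "contact"):
            raise ValueError(f"unexpected association: {s.association!r}")
        queues[s.association].append(s.signal_id)
    return dict(queues)


batch = [
    Signal("s1", "10k", "aiInvestment", "2026-02-03T12:00:00Z", "company"),
    Signal("s2", "job-change", "jobChange", "2026-02-03T12:00:00Z", "contact"),
    Signal("s3", "news", "funding", "2026-02-03T12:00:00Z", "company"),
]
print(route(batch))
```

Failing loudly on an unknown association value is a deliberate choice here: it surfaces schema changes early instead of silently dropping signals.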

GCS Delivery Infrastructure

Bucket Structure

Each signal type has a dedicated bucket with timestamped folders:

gs://autobound-10k/
├── 2026-02-03T12-00-00Z/
│   ├── output.jsonl    ← Streaming, human-readable
│   └── output.parquet  ← Analytics-optimized

Both formats contain identical data. JSONL suits streaming ingestion and debugging; Parquet integrates directly with BigQuery, Snowflake, and Spark.
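
For the JSONL side, ingestion can be as simple as parsing one JSON object per line. The sketch below reads from an in-memory stream so it runs anywhere; in practice the stream would be an output.jsonl file downloaded from the bucket, and the sample records here are invented.

```python
import io
import json


def iter_jsonl(stream):
    """Yield one parsed record per non-empty line."""
    for line in stream:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield json.loads(line)


# Stand-in for an output.jsonl file; contents are made up.
sample = io.StringIO(
    '{"signal_id": "s1", "signal_type": "10k"}\n'
    '\n'
    '{"signal_id": "s2", "signal_type": "news"}\n'
)
ids = [rec["signal_id"] for rec in iter_jsonl(sample)]
print(ids)
```

Because the function is a generator, records are processed one at a time, which keeps memory flat even for large drops.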

Manifest Files

A /manifest/ folder in each bucket provides per-drop JSON manifests with run timing, record counts, and completion status — enabling downstream processing triggers without polling.
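
A downstream job might gate on the manifest before loading a drop. The sketch below is illustrative only: the key names `status` and `record_count` and the value "complete" are assumptions standing in for whatever the real manifest uses, so check the delivery documentation for the actual fields.

```python
import json


def drop_is_ready(manifest_text, expected_min_records=1):
    """Return True when a manifest reports a finished, non-empty drop."""
    manifest = json.loads(manifest_text)
    # Key names below are hypothetical placeholders.
    return (manifest.get("status") == "complete"
            and manifest.get("record_count", 0) >= expected_min_records)


done = '{"status": "complete", "record_count": 1200}'
running = '{"status": "running", "record_count": 300}'
print(drop_is_ready(done), drop_is_ready(running))
```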

Refresh Cadences

Weekly: SEC filings (all types), news, hiring velocity/trends

Bi-weekly: LinkedIn posts (contact)

Monthly: Reddit, G2, Glassdoor, GitHub, website intelligence

Full delivery documentation →

2026 Roadmap

New sources: Capterra and TrustRadius product reviews, Quora Q&A signals.

Infrastructure: Direct API access for signal fetch with age parameters, company connections graph (vendors, competitors, customers, investors), competitor URLs/domains in mentions, hiring velocity percentage change scores, daily news delivery.

Coverage: +25% audience expansion near-term, ~2× monitoring pool within 1-2 months, geographic expansion beyond North America.

Full roadmap →

Get Started

The Autobound Signal Database removes the infrastructure burden from signal-based data products. Instead of building scrapers, maintaining ETL pipelines, and normalizing disparate data sources, teams get unified, schema-validated intelligence delivered on schedule.

For data teams: Consistent schemas, enterprise authentication, Parquet for analytics, JSONL for streaming — integrate once, scale with your product.
