Competitor Scraping Quality and URL Handling
✨ What's Improved
The competitor scraping engine now applies six data quality fixes that make scraped results cleaner and more accurate without any extra configuration.
🔧 Key Fixes
Auto-Prepend HTTPS
Enter a competitor URL as a bare domain, such as competitor.com, and the system automatically prepends https:// so the input no longer fails validation.
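To illustrate the behavior, here is a minimal sketch of how a bare domain could be normalized. The function name `normalize_competitor_url` and the choice to leave any existing scheme untouched are assumptions for illustration, not the product's actual implementation.

```python
import re

# Matches a leading URI scheme such as "http://" or "https://".
_SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")


def normalize_competitor_url(raw: str) -> str:
    """Return the input with https:// prepended when no scheme is present.

    Hypothetical helper: the real validation rules are not documented here.
    """
    url = raw.strip()
    if not url:
        raise ValueError("URL must not be empty")
    if _SCHEME_RE.match(url):
        # A scheme is already present, so the input is kept as entered.
        return url
    return "https://" + url
```

With this sketch, `normalize_competitor_url("competitor.com")` returns `"https://competitor.com"`, while an input that already starts with `http://` or `https://` passes through unchanged.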
Brand Name Cleaning
Scraped page titles like "The Brand Toolkit Platform | Brandkit - Home for your brand" are now cleaned to extract just the core brand name (e.g., "Brandkit"). SEO suffixes, taglines, and filler text are stripped automatically.
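One plausible way to get from a full page title to a core brand name is to split the title on common separators and keep the shortest segment. The sketch below assumes exactly that heuristic. The function name `clean_brand_name` and the separator list are hypothetical, and the actual cleaning rules may differ.

```python
import re

# Title separators: a pipe, or a hyphen / en dash / em dash / middle dot
# surrounded by whitespace (so hyphenated words are not split).
_SEPARATOR_RE = re.compile(r"\s*\|\s*|\s+[-\u2013\u2014\u00b7]\s+")


def clean_brand_name(title: str) -> str:
    """Pick the title segment with the fewest words as the brand name.

    Assumed heuristic: taglines and SEO suffixes tend to be longer than
    the brand name itself. Ties go to the earliest segment.
    """
    segments = [part.strip() for part in _SEPARATOR_RE.split(title)]
    segments = [part for part in segments if part]
    if not segments:
        return title.strip()
    best = min(enumerate(segments), key=lambda pair: (len(pair[1].split()), pair[0]))
    return best[1]
```

Applied to the title quoted above, the segments are "The Brand Toolkit Platform", "Brandkit", and "Home for your brand", and the one-word segment "Brandkit" is selected.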
Color Deduplication
Duplicate hex values are removed from scraped brand color palettes, giving you a clean, unique set of brand colors.
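As a rough sketch of the idea, deduplication can preserve the order in which colors were scraped while treating equivalent spellings as the same value. Treating `#FFF` and `#ffffff` as duplicates is an assumption made here for illustration, and `dedupe_hex_colors` is a hypothetical name.

```python
from typing import Iterable, List


def _normalize_hex(color: str) -> str:
    """Lowercase a hex color and expand 3-digit shorthand to 6 digits."""
    value = color.strip().lstrip("#").lower()
    if len(value) == 3:
        value = "".join(ch * 2 for ch in value)
    return "#" + value


def dedupe_hex_colors(colors: Iterable[str]) -> List[str]:
    """Return unique colors in first-seen order, in normalized form."""
    seen = set()
    unique = []
    for color in colors:
        key = _normalize_hex(color)
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique
```

For example, `dedupe_hex_colors(["#FFF", "#ffffff", "#1A2B3C", "#1a2b3c"])` returns `["#ffffff", "#1a2b3c"]`.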
Social Link Fallback
When the page metadata doesn't contain social profiles, the scraper now falls back to parsing the page's markdown content for links to Instagram, LinkedIn, Twitter/X, Facebook, YouTube, TikTok, and Reddit.
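The fallback can be pictured as a scan over every URL in the markdown, keeping the first link found for each known platform. The sketch below is an illustration under assumptions: the host-to-platform table, the function name `find_social_links`, and the "first match wins" rule are all hypothetical.

```python
import re
from typing import Dict
from urllib.parse import urlparse

# Assumed mapping from hostname to platform key; x.com is folded into twitter.
SOCIAL_HOSTS = {
    "instagram.com": "instagram",
    "linkedin.com": "linkedin",
    "twitter.com": "twitter",
    "x.com": "twitter",
    "facebook.com": "facebook",
    "youtube.com": "youtube",
    "tiktok.com": "tiktok",
    "reddit.com": "reddit",
}

# Grab http(s) URLs, stopping at whitespace or markdown/HTML delimiters.
_URL_RE = re.compile(r"https?://[^\s)\]>\"']+")


def find_social_links(markdown: str) -> Dict[str, str]:
    """Return the first link found per platform in the markdown text."""
    found: Dict[str, str] = {}
    for match in _URL_RE.findall(markdown):
        url = match.rstrip(".,;")  # drop trailing sentence punctuation
        host = (urlparse(url).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        platform = SOCIAL_HOSTS.get(host)
        if platform and platform not in found:
            found[platform] = url
    return found
```

Matching on the exact hostname rather than a substring avoids false positives such as a CDN path that happens to contain "facebook".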
Value Proposition Filtering
Extracted value propositions are now filtered to remove image markdown tags and raw URLs, keeping only meaningful text content.
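A minimal sketch of this kind of filtering follows. It assumes that image tags and raw URLs are stripped from each entry and that entries left without any letters are dropped. The function name `filter_value_propositions` and both regular expressions are illustrative assumptions.

```python
import re
from typing import Iterable, List

# Markdown image syntax: ![alt text](target)
_IMAGE_RE = re.compile(r"!\[[^\]]*\]\([^)]*\)")
# Raw http(s) URLs up to the next whitespace.
_URL_RE = re.compile(r"https?://\S+")


def filter_value_propositions(items: Iterable[str]) -> List[str]:
    """Strip image tags and raw URLs, keeping entries that still have text."""
    cleaned = []
    for item in items:
        text = _IMAGE_RE.sub("", item)   # remove images first, URL included
        text = _URL_RE.sub("", text)     # then any remaining bare URLs
        text = " ".join(text.split())    # collapse leftover whitespace
        if any(ch.isalpha() for ch in text):
            cleaned.append(text)
    return cleaned
```

An entry that consists only of an image tag or a bare URL disappears entirely, while an entry such as "Ship brand assets faster" followed by an icon image is kept with the image removed.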
Rich Data Backfill
Existing competitor records have been backfilled with typography, button styles, spacing, brand personality, and design framework data from previously captured raw scrape data.
📍 How to Use
These improvements apply automatically to all new competitor scrapes. Existing competitors have already been backfilled with the enriched data. Click any competitor card to see the full details.