Building an in-house Shopify price monitoring tool is one of the most common engineering projects in commerce teams: agencies tracking client competitors, brands watching pricing pressure, dropshippers spotting margin opportunities. This guide walks through the architecture that actually works in production, the decisions that matter, and where to draw the build-vs-buy line.
Why teams build their own price monitor
Off-the-shelf monitoring tools exist, but most engineering teams hit one of three walls: the tools don't track the specific stores you care about, they don't fire alerts where you need them (Slack, your internal dashboard, your CRM), or their pricing grows faster than the cost of running the data layer yourself. Once you cross a few hundred SKUs across a dozen competitors, building becomes the more practical option.
A price monitor at its core is just four moving parts: data ingestion, storage, scheduling, and alerting. Each can be made trivial or complicated depending on how seriously you take the use case. The goal is to spend your engineering time on the parts that actually differentiate your tool, not on rebuilding the data layer.
Architecture overview
A working price monitor needs four components with clear boundaries between them. The data source fetches current product data from each store you watch. Storage persists snapshots so you can detect changes over time. The scheduler triggers the fetch at the cadence you need (hourly, daily, on-demand). The alerting layer routes meaningful changes to the right channel.
Done minimally, that's a single cron job, a Postgres table, and a Slack webhook. Done seriously, it's a queue-based fetcher with a deduplication layer, a time-series store, and a multi-channel router with per-team subscription rules.
The data layer: the most important decision
Everything else cascades from the data layer. Your options break down into three buckets.
Build your own scraper. Cheapest in theory, expensive in practice. You'll spend a quarter on store URL validation, retry logic, IP rotation, rate limit handling, schema drift, and authentication boundaries. Two months in, you're maintaining infrastructure instead of building your monitoring tool.
Pay for a managed scraping API. The right trade-off for most teams. You get a stable JSON contract, predictable rate limits, and someone else's problem to maintain. Pricing typically lands between $50 and $300 per month for serious use, which is almost always cheaper than the engineering time to replicate it.
Hybrid. For some very high-volume use cases, teams use a managed API for breadth and a custom in-house pipeline for a small set of high-value stores. Worth considering once you're past 10,000 price points per day.
Our Developer API is built for exactly this layer: a versioned HTTP endpoint, personal authentication, JSON in and JSON out. The full plan is $59.99/month and includes the no-code scraping UI for ad-hoc exports if your team also needs them.
Storage: schema design that won't fight you
The simplest workable schema is two tables. products holds the canonical metadata: store, handle, title, vendor, type, and a unique constraint on (store, handle). price_snapshots holds the time-series data: product_id, variant_id, price, compare_at_price, available, captured_at.
Index price_snapshots on (product_id, captured_at) for fast time-window queries. Don't store the full product object in each snapshot -- that's how you end up with multi-gigabyte tables six months in. Only persist what changes: price, compare-at-price, availability.
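The two-table layout and index described above can be sketched as follows. This is a minimal illustration, not a production schema: it uses SQLite through Python's standard sqlite3 module so it runs anywhere, whereas the article assumes Postgres. The column types, the index name, and the price_history helper are assumptions made for the example.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS products (
    id      INTEGER PRIMARY KEY,
    store   TEXT NOT NULL,
    handle  TEXT NOT NULL,
    title   TEXT NOT NULL,
    vendor  TEXT,
    type    TEXT,
    UNIQUE (store, handle)
);

CREATE TABLE IF NOT EXISTS price_snapshots (
    id               INTEGER PRIMARY KEY,
    product_id       INTEGER NOT NULL REFERENCES products(id),
    variant_id       INTEGER NOT NULL,
    price            TEXT NOT NULL,     -- decimal string, e.g. '19.99'
    compare_at_price TEXT,              -- NULL when no compare-at price is set
    available        INTEGER NOT NULL,  -- 0 or 1
    captured_at      TEXT NOT NULL      -- ISO 8601 UTC timestamp
);

-- Supports "all snapshots for this product in a time window" queries.
CREATE INDEX IF NOT EXISTS idx_snapshots_product_time
    ON price_snapshots (product_id, captured_at);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create both tables and the time-window index if they are missing."""
    conn.executescript(SCHEMA)


def price_history(conn: sqlite3.Connection, product_id: int, since_iso: str) -> list:
    """Return (variant_id, price, captured_at) rows for one product since a timestamp."""
    cur = conn.execute(
        "SELECT variant_id, price, captured_at FROM price_snapshots "
        "WHERE product_id = ? AND captured_at >= ? ORDER BY captured_at",
        (product_id, since_iso),
    )
    return cur.fetchall()
```

In Postgres you would likely swap the TEXT price columns for NUMERIC and the timestamp for TIMESTAMPTZ; the table shape and the (product_id, captured_at) index stay the same.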
For deduplication, hash the relevant fields on each fetch and skip the insert if nothing changed. This keeps the table size sane and makes "detect a change" queries trivial: any new snapshot row is by definition a change.
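One way to implement the hash-and-skip step is shown below. It is a sketch under assumptions: the SnapshotDeduper class and its in-memory dictionary are hypothetical stand-ins for whatever store you use to remember the last fingerprint (a column on the variant row would work too), and SHA-256 is just one reasonable choice of hash.

```python
import hashlib
import json


def snapshot_fingerprint(price, compare_at_price, available) -> str:
    """Hash only the fields that count as a change."""
    payload = json.dumps(
        {
            "price": str(price),
            "compare_at_price": None if compare_at_price is None else str(compare_at_price),
            "available": bool(available),
        },
        sort_keys=True,  # stable key order gives a stable hash
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class SnapshotDeduper:
    """Remembers the last fingerprint per variant (in memory, for illustration)."""

    def __init__(self):
        self._last = {}

    def is_change(self, variant_id, price, compare_at_price, available) -> bool:
        """Return True and remember the fingerprint when the variant changed."""
        fingerprint = snapshot_fingerprint(price, compare_at_price, available)
        if self._last.get(variant_id) == fingerprint:
            return False  # identical to the previous fetch: skip the insert
        self._last[variant_id] = fingerprint
        return True
```

The first fetch of a variant counts as a change, which gives you a baseline row. Prices are compared as strings here, so normalize them before hashing if your source can return both "19.9" and "19.90".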
Scheduling: cron is fine until it isn't
Start with cron. A single hourly job that loops over your watched stores, fetches their catalogs, and inserts snapshots will get you through the first few hundred SKUs without thinking. Use it. Optimize later.
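The hourly loop can be as small as the sketch below. The fetch_catalog and save_snapshots callables are hypothetical placeholders for your data source and storage code, and the crontab line in the comment uses made-up paths. The one design choice worth copying is that a failure in one store is logged and does not stop the remaining stores.

```python
import logging
from typing import Callable, Iterable

# Hypothetical crontab entry (paths are placeholders):
#   0 * * * * /usr/bin/python3 /opt/price_monitor/run_once.py

log = logging.getLogger("price_monitor")


def run_once(
    stores: Iterable[str],
    fetch_catalog: Callable[[str], list],
    save_snapshots: Callable[[str, list], int],
) -> dict:
    """Fetch every watched store once; one failing store never stops the loop."""
    summary = {"ok": [], "failed": [], "rows_written": 0}
    for store in stores:
        try:
            rows = fetch_catalog(store)
            summary["rows_written"] += save_snapshots(store, rows)
            summary["ok"].append(store)
        except Exception:
            # Log with traceback and move on to the next store.
            log.exception("fetch failed for %s", store)
            summary["failed"].append(store)
    return summary
```

Returning a summary makes it easy to log one line per run and to notice when the failed list stops being empty.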
You'll outgrow cron when one of three things happens: fetches start taking longer than your interval (a job scheduled every 30 minutes that takes 45 minutes to finish overlaps with the next run), some stores need different frequencies (luxury brands change prices weekly, flash-sale brands change them hourly), or failures need retries with backoff.
At that point, move to a queue: a worker pool pulling per-store fetch jobs, with per-job retry logic and per-store cadence configuration. Redis-backed queues like BullMQ (Node), Celery (Python), or Sidekiq (Ruby) all work fine. Don't over-engineer the queue itself -- this is plumbing, not your product.
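Whichever queue you pick, the two pieces of logic it needs from you are "which stores are due" and "how long to wait before retrying". The sketch below is queue-agnostic and does not use any BullMQ, Celery, or Sidekiq API; StoreSchedule, due_stores, and retry_delay are hypothetical names, and the default base and cap values are arbitrary.

```python
import random
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class StoreSchedule:
    store: str
    interval: timedelta                  # per-store cadence
    last_run: Optional[datetime] = None  # None means never fetched


def due_stores(schedules: List[StoreSchedule], now: datetime) -> List[str]:
    """Return the stores whose interval has elapsed (or that never ran)."""
    return [
        s.store
        for s in schedules
        if s.last_run is None or now - s.last_run >= s.interval
    ]


def retry_delay(
    attempt: int,
    base_seconds: float = 30.0,
    cap_seconds: float = 3600.0,
    jitter: float = 0.0,
) -> float:
    """Exponential backoff: base * 2**attempt, capped, then optional +/- jitter."""
    delay = min(cap_seconds, base_seconds * (2 ** attempt))
    if jitter:
        # Spread retries out so failed jobs do not all wake up together.
        delay *= 1 + random.uniform(-jitter, jitter)
    return delay
```

A dispatcher would call due_stores on a short tick and enqueue one fetch job per store; a worker that catches a failure would re-enqueue the job after retry_delay(attempt) seconds. Most queue libraries have their own retry settings, so check those before hand-rolling this part.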
Alerting: route to where decisions get made
The most common mistake in price monitoring tools is alerting on every change. A 1% price drop on a $200 SKU isn't actionable; a 30% drop is. Define thresholds that match the decision the alert should trigger.
A solid baseline: absolute drop > X dollars, relative drop > Y percent, or change in availability (out of stock / back in stock). Combine the conditions to reduce noise, for example by requiring a price drop to clear both the absolute and the relative threshold. Route alerts to Slack channels for ongoing visibility, email for daily digests, and webhooks for downstream automation (auto-create a Linear ticket, auto-push to a Notion database, trigger a Zapier flow).
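The threshold logic might look like the sketch below. It encodes one interpretation of the baseline: a price drop must exceed both the absolute and the relative threshold, while any availability change alerts on its own. The function names and the default thresholds (10.00 and 10%) are assumptions for illustration. The payload helper builds the simple "text" body that Slack incoming webhooks accept; posting it is left out.

```python
from decimal import Decimal
from typing import Optional


def alert_reason(
    old_price: Decimal,
    new_price: Decimal,
    old_available: bool,
    new_available: bool,
    min_abs_drop: Decimal = Decimal("10.00"),  # hypothetical default for X
    min_rel_drop: Decimal = Decimal("0.10"),   # hypothetical default for Y
) -> Optional[str]:
    """Return a short reason string when a change is worth alerting on, else None."""
    if old_available != new_available:
        return "back in stock" if new_available else "out of stock"
    drop = old_price - new_price
    if drop <= 0 or old_price <= 0:
        return None  # price rose, stayed flat, or the old price is unusable
    relative = drop / old_price
    # Require BOTH thresholds so cheap items and tiny percentages stay quiet.
    if drop > min_abs_drop and relative > min_rel_drop:
        percent = (relative * 100).quantize(Decimal("1"))
        return f"price drop of {drop:.2f} ({percent}%)"
    return None


def slack_payload(store: str, title: str, reason: str) -> dict:
    """Build the JSON body for a Slack incoming webhook (a plain 'text' message)."""
    return {"text": f"[{store}] {title}: {reason}"}
```

With these defaults, a $200 item dropping to $198 stays silent, while a drop to $140 produces an alert. Tune the thresholds per channel: a digest email can tolerate lower ones than a Slack channel people actually read.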
Frequently asked questions
How often should I fetch each store?
For most use cases, once an hour is plenty -- prices rarely change more often than that, and intra-hour churn rarely drives a useful decision. For flash-sale or breaking-news verticals, every 5-10 minutes can be justified.
Do I need historical data forever?
No. For most teams, 90 days of full granularity plus weekly aggregates beyond that covers every realistic query. Aggressive retention saves storage and keeps queries fast.
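A retention pass along these lines could be sketched as follows. The function name, the tuple layout, and the choice of min, max, average, and count as the weekly aggregate are assumptions; the 90-day default comes from the answer above. Weeks are ISO calendar weeks.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from decimal import Decimal


def split_for_retention(snapshots, now: datetime, keep_days: int = 90):
    """Split snapshots into recent rows (kept as-is) and weekly aggregates.

    Each snapshot is a (variant_id, price, captured_at) tuple, where price is
    a Decimal and captured_at is a datetime comparable with `now`.
    """
    cutoff = now - timedelta(days=keep_days)
    recent = []
    buckets = defaultdict(list)
    for variant_id, price, captured_at in snapshots:
        if captured_at >= cutoff:
            recent.append((variant_id, price, captured_at))
        else:
            iso = captured_at.isocalendar()  # (ISO year, ISO week, weekday)
            buckets[(variant_id, iso[0], iso[1])].append(price)
    weekly = {
        key: {
            "min": min(prices),
            "max": max(prices),
            "avg": sum(prices) / len(prices),
            "count": len(prices),
        }
        for key, prices in buckets.items()
    }
    return recent, weekly
```

If you deduplicate on insert, each row is a change event rather than a regular sample, so the average here is an average over changes, not a time-weighted price. Min and max are usually the more useful weekly figures.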
What about variant-level vs product-level prices?
Always track at the variant level. Product-level prices are misleading when sizes or colours are priced differently, and "the product is on sale" can mean any number of variant configurations.
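To make the variant-level point concrete, here is a sketch that flattens one product into one row per variant. It assumes a payload with a handle and a variants list whose items carry id, title, price, compare_at_price, and available, which resembles the product JSON Shopify storefronts commonly expose; your data source's field names may differ. Both function names are hypothetical.

```python
from decimal import Decimal


def variant_rows(product: dict) -> list:
    """Flatten one product dict into one row per variant."""
    rows = []
    for variant in product.get("variants", []):
        compare_at = variant.get("compare_at_price")
        rows.append({
            "handle": product["handle"],
            "variant_id": variant["id"],
            "variant_title": variant.get("title"),
            # Parse via str() so float inputs do not carry binary rounding noise.
            "price": Decimal(str(variant["price"])),
            "compare_at_price": (
                None if compare_at in (None, "") else Decimal(str(compare_at))
            ),
            "available": bool(variant.get("available", False)),
        })
    return rows


def price_range(rows: list) -> tuple:
    """Show why one product-level price misleads: return (lowest, highest)."""
    prices = [row["price"] for row in rows]
    return min(prices), max(prices)
```

A product whose variants span a range of prices has no single correct "product price", which is why snapshots are keyed on variant_id.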
Should I integrate this into a BI tool?
Yes, if your team already uses one. Snapshot tables in Postgres connect cleanly to Metabase, Hex, Mode, or any dashboard tool. Build the data layer right and the BI piece becomes a one-day add-on.