CLIENT STORY
A Middle East-based SaaS platform helps restaurants across the region win more orders on food-delivery apps. Its product reads how each restaurant performs on Talabat, HungerStation, Jahez, Deliveroo, and Uber Eats: funnel conversion, ad spend, vendor metadata, and reviews from end customers. All of it turns into recommendations that the operator can act on the same morning. The whole product depends on one thing working every day: the scrape.
| Industry: | Food Tech |
|---|---|
| Period: | since 2024 |
| Location: | MENA |
"We were paying six hundred a month for scraping that didn't work. Half the time, data was missing, fixes took days, and one of the largest food-delivery platforms in our region had gone completely dark for the previous team. Our entire analytics product runs on yesterday's data. There is no version of this where we shrug at a missed morning."
— Lead Software Engineer, Middle East Restaurant Analytics SaaS
"The moment scraping stopped being a vendor problem and became part of the product, things changed. We moved to time-and-materials, scoped each platform on its own, and the conversations stopped being defensive. My engineers don't open the morning asking 'is data flowing?' anymore. They open it asking what's next."
— Co-Founder & CEO, Middle East Restaurant Analytics SaaS
The Challenge: Five Platforms, Four Anti-Bot Systems
By mid-2024, the SaaS team was running on a scraping vendor it could no longer trust. Bug reports sat in a queue. When one of the region’s largest food-delivery platforms turned on stronger anti-bot defenses, the vendor stopped delivering data from that platform altogether.
Each of the five platforms guards its data differently. One sits behind a commercial bot-protection service and later added two-factor login for a subset of accounts. Another applies an interactive login challenge. A third relies on browser-level signals. A fourth combines cookie-based login with anti-forgery tokens against its internal data API. The fifth runs on the same stack as the first, with a separate account graph.
The margin for error was narrow. The product’s analytics value to restaurant operators is daily: yesterday’s funnel, yesterday’s reviews. Miss a single morning of data, and an entire cohort of restaurant clients opens the dashboard to a gap. In food delivery, that kind of trust degrades fast.
Resilient Five-Platform Scraping for Daily Restaurant Analytics
Pilot Launch
Before writing scraper code, GroupBWT mapped each platform’s login flow and known defenses. The team and the client agreed to start small: two platforms, on a 34–54 hour estimate. Three weeks later, production scrapers for the customer-funnel and conversion-funnel datasets were live. Slow ramp, predictable engineering, honest billing.
Platform Challenges
Each platform needed its own answer, and the previous vendor had stalled on the hardest one. The largest regional platform was where the prior team had gone dark. It was the first gap GroupBWT closed, with pilot scrapers live within three weeks. A second platform runs on the same stack, so the same auth path covered both. When that platform added two-factor login for a subset of accounts, the team integrated with the client’s own authentication flow rather than intercepting one-time codes directly: the client exposed an endpoint that returns a valid token with the login step already cleared.
The remaining three platforms each required a different approach. One had been the second platform the previous vendor stalled on; GroupBWT restored daily collection with five static IPs at a fixed monthly rate and a browser session the team owns and controls — replacing a rented service that kept failing. A second reads browser-level signals, satisfied by a real automated browser session. The third pairs cookie-based login with anti-forgery tokens against its internal data API.
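The cookie-plus-anti-forgery-token pattern mentioned for the third platform can be illustrated with a minimal sketch. The source does not say how that platform delivers its token, so everything specific here is an assumption: the hidden form field name `csrf_token`, the `session` cookie name, and the `X-CSRF-Token` header are hypothetical placeholders, not the platform's real interface.

```python
from html.parser import HTMLParser
from typing import Dict, Optional


class _HiddenInputFinder(HTMLParser):
    """Finds the value of the first <input> whose name matches field_name."""

    def __init__(self, field_name: str) -> None:
        super().__init__()
        self.field_name = field_name
        self.value: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "input" or self.value is not None:
            return
        attributes = dict(attrs)
        if attributes.get("name") == self.field_name:
            self.value = attributes.get("value")


def extract_csrf_token(html: str, field_name: str = "csrf_token") -> Optional[str]:
    """Return the anti-forgery token embedded in a page, or None if absent.

    The default field name is a hypothetical placeholder.
    """
    finder = _HiddenInputFinder(field_name)
    finder.feed(html)
    return finder.value


def api_headers(session_cookie: str, csrf_token: str) -> Dict[str, str]:
    """Pair the login cookie with the token for a call to the data API.

    Cookie and header names are assumptions for illustration only.
    """
    return {
        "Cookie": f"session={session_cookie}",
        "X-CSRF-Token": csrf_token,
    }
```

The point of the sketch is the pairing: the cookie alone is not enough, and the token has to be re-read whenever the session is renewed.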
"We tried three commercial services before the cheapest answer turned out to be the most stable one: five static IPs at ten dollars a month and a browser session we run and maintain ourselves. We stopped depending on an outside vendor for the layer that has to work every morning. Sometimes owning the boring layer beats renting the clever one."
Architecture
The pipeline runs on Scrapy with a Producer/Consumer queue over RabbitMQ. Each morning, the Producer fans out one task per account, between 90 and 120 accounts daily, expanding into 2,500–3,600 vendor tasks. Session state lives in MySQL; Redis caches auth tokens for 48 hours; the daily CSV outputs land in Google Cloud Storage. A lightweight FastAPI service hands fresh auth tokens to the client’s own backend in real time.
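The 48-hour token cache can be pictured with a short sketch. This is not the production code: it uses an in-memory dictionary as a stand-in for a Redis key with an expiry, and the names `TokenCache`, `get_or_refresh`, and the `login` callable are hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple

TOKEN_TTL_SECONDS = 48 * 60 * 60  # 48 hours, the cache lifetime described above


class TokenCache:
    """In-memory stand-in for a Redis key with an expiry (hypothetical helper)."""

    def __init__(self, ttl_seconds: int = TOKEN_TTL_SECONDS,
                 clock: Callable[[], float] = time.time) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[str, float]] = {}

    def put(self, account_id: str, token: str) -> None:
        # Store the token together with its absolute expiry time.
        self._store[account_id] = (token, self._clock() + self._ttl)

    def get(self, account_id: str) -> Optional[str]:
        entry = self._store.get(account_id)
        if entry is None:
            return None
        token, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped so the caller logs in again.
            del self._store[account_id]
            return None
        return token


def get_or_refresh(cache: TokenCache, account_id: str,
                   login: Callable[[str], str]) -> str:
    """Return a cached token, or run the login step once and cache the result."""
    token = cache.get(account_id)
    if token is None:
        token = login(account_id)
        cache.put(account_id, token)
    return token
```

Under these assumptions, an account logs in at most once per 48-hour window, which keeps login traffic (the part most likely to trip a platform's defenses) to a minimum.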
In practice, this means the restaurant team’s morning dashboard refreshes from one queue rather than five separate scrapers. If one platform stalls, the others still deliver, and the on-call engineer sees the single failed job, not a fleet of unrelated incidents.
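A simplified sketch of that fan-out and failure isolation follows. It uses Python's standard `queue.Queue` as a stand-in for RabbitMQ, and the `VendorTask` fields, the `accounts` layout, and the `scrape` callable are all assumptions made for illustration, not the real message schema.

```python
import queue
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class VendorTask:
    platform: str
    account_id: str
    vendor_id: str


def produce(accounts: Dict[str, dict], tasks: "queue.Queue[VendorTask]") -> int:
    """Fan out each account into one task per vendor; return the task count."""
    count = 0
    for account_id, info in accounts.items():
        for vendor_id in info["vendors"]:
            tasks.put(VendorTask(info["platform"], account_id, vendor_id))
            count += 1
    return count


def consume(tasks: "queue.Queue[VendorTask]",
            scrape: Callable[[VendorTask], None]) -> Dict[str, List[VendorTask]]:
    """Drain the queue; a failing task is recorded and does not stop the rest."""
    results: Dict[str, List[VendorTask]] = {"done": [], "failed": []}
    while True:
        try:
            task = tasks.get_nowait()
        except queue.Empty:
            break
        try:
            scrape(task)
            results["done"].append(task)
        except Exception:
            # The failure stays attached to this one task; the loop continues.
            results["failed"].append(task)
    return results
```

Because every platform's tasks pass through the same loop, a stalled platform shows up as entries in `failed` while the other platforms' tasks still complete.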
Monitoring
Every scraping session is tracked from dispatch to completion. Metabase dashboards show the status of each job — per-account session state, per-platform task completion, and any run that misses its schedule. Email alerts fire the moment a job stays uncompleted past its window, so the on-call engineer is paged before a client opens the dashboard to a gap. By July 2025, the system had stabilized into a low-touch maintenance mode, with engineering capacity available on demand whenever a platform pushed a new defense.
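The "uncompleted past its window" rule can be expressed in a few lines. The sketch below is illustrative only: the `Job` fields and the `overdue_jobs` helper are hypothetical, and the real system reads job state from its own tables and sends email rather than returning a list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Job:
    name: str
    dispatched_at: datetime
    window: timedelta                      # how long the job is allowed to take
    completed_at: Optional[datetime] = None


def overdue_jobs(jobs: List[Job], now: datetime) -> List[Job]:
    """Return jobs that are still incomplete after their window has closed."""
    return [
        job for job in jobs
        if job.completed_at is None and now > job.dispatched_at + job.window
    ]
```

A scheduler could call `overdue_jobs` every few minutes and alert on any non-empty result, which is what lets the on-call engineer act before a client sees a gap.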
Tech stack: Scrapy, RabbitMQ 3.12+, MySQL 8.0, Redis, FastAPI, Playwright, Metabase, Kubernetes (ArgoCD + Helm), HAProxy, StormProxies, Google Cloud Storage.
17 Months Live, Zero Data Gaps, a Referral on the Books
Business outcomes
- The daily scrape is the product’s spine. Every morning, the SaaS platform’s restaurant clients across the region open their dashboard to yesterday’s funnel, ad spend, and reviews. Across 17 months of GroupBWT-managed runs, daily data has been delivered without client-visible gaps — a shift from the intermittent delivery under the previous vendor.
- Stable collection at low operating cost. A flat-rate setup — five static IPs at a fixed monthly cost — replaced the per-request proxy fees that had been eating into the previous vendor’s margin, and the channel has run stably since launch.
- Predictable maintenance. After stabilization in July 2025, the engagement settled into a low-touch monthly cadence — engineering capacity remains on call whenever a platform pushes a new defense, without standing engineering overhead on the client side.
- Flexible engagement. The work runs on a time-and-materials model with each new platform scoped as its own commitment. A typical pilot runs $2–3K and reaches production data within three weeks.
- A direct referral in early 2026. GroupBWT was introduced to a second MENA delivery-analytics team on a direct referral from this client — five-platform scope, same operating model.
Looking to Replace a Scraping Vendor That Stopped Delivering?
If your analytics product runs on yesterday's data and your current scraper has been missing half of it, GroupBWT can scope a pilot in a week.