
Integration Guide

ScraperCity + Pipedream

Build lead scraping automations on Pipedream with full code access. Pull Apollo contacts, Google Maps businesses, validate emails, find phone numbers, and route results to your CRM, spreadsheet, or cold email tool - all running in the cloud on a schedule.

What you can build

Pipedream workflows connect ScraperCity's B2B data API to any downstream tool. Every workflow runs serverlessly in the cloud - no infrastructure to manage. Here are the most common pipelines teams build.

Daily Apollo lead pipeline

Cron trigger fires every morning. ScraperCity scrapes Apollo for contacts matching your ICP filters. Pagination code step collects all pages. Leads route to HubSpot, Pipedrive, or a Google Sheet automatically.

Google Maps local business prospecting

Trigger on webhook or schedule. ScraperCity pulls Google Maps listings by keyword and city - including phone, email, reviews, and website. Send new businesses to your outreach tool or Airtable CRM.

Email validation before outreach

After any lead enrichment step, POST each address to ScraperCity's email validation API at $0.0036/email. Filter deliverable-only leads to a separate Google Sheet or CRM list. Reduce bounces before you hit send.

Lead enrichment pipeline

Receive a webhook from your sign-up form. Pass the company domain to ScraperCity's Email Finder or Website Finder. Append phone numbers with Mobile Finder. Push the enriched lead record back to your CRM automatically.

Shopify/WooCommerce store prospecting

Scrape e-commerce stores by niche using the Store Leads endpoint. Filter by platform, revenue signals, or technology. Route qualified merchants to a Slack notification channel and a CRM pipeline stage.

Sales prospecting automation with email finding

Trigger from a new row added to Google Sheets. For each company name + person name, call ScraperCity's Email Finder API. Write the discovered email back to the same row. A complete no-touch enrichment loop.

Setup

Follow these steps to connect ScraperCity to Pipedream. The full workflow takes about 10 minutes.

1. Get your ScraperCity API key

Log in to ScraperCity and go to app.scrapercity.com/dashboard/api-docs to copy your API key. Then in Pipedream, navigate to Settings > Environment Variables and create a new secret variable named SCRAPERCITY_API_KEY. Storing the key as an environment variable keeps it out of your workflow code and prevents accidental exposure in Pipedream's execution logs.

2. Create a Pipedream workflow with a trigger

Create a new workflow inside a Pipedream project. Choose the trigger type that matches your use case:

  • Cron Schedule - runs the scrape automatically on a fixed interval. Use the "Every" option (e.g. every 1 day) or a custom cron expression like 0 8 * * 1-5 to run at 8 AM on weekdays.
  • HTTP / Webhook - fires when you send a POST request to the workflow's unique URL. Useful for triggering a scrape from an external app, a form submission, or another automation tool.
  • Manual - click "Run Now" in the Inspector. Good for testing before enabling the schedule.

Note: Pipedream manages the servers for scheduled workflows, so there is no server or cron daemon to operate yourself.
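If you use the HTTP / Webhook trigger, downstream steps read the incoming payload from steps.trigger.event.body. A minimal sketch of normalizing that payload before scraping is shown below; the field names (keyword, city) and the default city are assumptions for illustration, not a documented schema:

```javascript
// Hypothetical sketch: normalize a webhook payload before scraping.
// The keyword/city field names are assumptions - adjust to your form's schema.
function normalizeTriggerBody(body) {
  const keyword = (body?.keyword ?? "").trim();
  const city = (body?.city ?? "").trim();
  if (!keyword) {
    // Fail fast so a malformed POST doesn't burn scraping credits
    throw new Error("Webhook payload is missing a keyword to scrape");
  }
  // Fall back to a default city so a partial payload still runs
  return { keyword, city: city || "New York" };
}

// Inside a Pipedream code step you would call:
//   const params = normalizeTriggerBody(steps.trigger.event.body);
```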

3. Add an HTTP request step for ScraperCity

Add a step and select HTTP Request (or use a Node.js code step with axios for more control). Configure the request as shown below. This example scrapes Apollo for Director-level contacts in SaaS companies with a verified email address.

Method: GET
URL: https://app.scrapercity.com/api/v1/database/leads
Headers:
  Authorization: Bearer YOUR_SCRAPERCITY_KEY
Query Parameters:
  title: Director of Sales
  industry: computer software
  hasEmail: true
  limit: 100
  page: 1

Replace YOUR_SCRAPERCITY_KEY with {{process.env.SCRAPERCITY_API_KEY}} in the HTTP Request step's header field, or reference process.env.SCRAPERCITY_API_KEY directly inside a code step. The Lead Database endpoint requires the $649/mo plan; all other scraper endpoints work on any plan.
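If you prefer a code step over the HTTP Request action, the same request can first be expressed as a plain config object and passed to axios. This is a sketch that mirrors the parameters listed above; the helper name is our own:

```javascript
// Build the axios request config for the Lead Database query shown above.
// In a real Pipedream code step, apiKey comes from process.env.SCRAPERCITY_API_KEY.
function buildLeadRequest(apiKey, page = 1) {
  return {
    method: "GET",
    url: "https://app.scrapercity.com/api/v1/database/leads",
    headers: { Authorization: `Bearer ${apiKey}` },
    params: {
      title: "Director of Sales",
      industry: "computer software",
      hasEmail: "true",
      limit: "100",
      page: String(page), // query params are sent as strings
    },
  };
}

// Usage in a code step:
//   const res = await axios(buildLeadRequest(process.env.SCRAPERCITY_API_KEY, 1));
```

Keeping the config in one function makes the pagination loop in the next step a thin wrapper that only varies the page argument.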

4. Handle pagination with a code step

The ScraperCity API paginates at 100 leads per page with a maximum of 100,000 leads per day. Add a Node.js code step to loop through all pages and return a flat array of leads for downstream steps:

import axios from "axios";

export default defineComponent({
  async run({ steps, $ }) {
    const allLeads = [];
    let page = 1;
    let totalPages = 1;

    // Request pages until the API's reported totalPages is reached
    do {
      const response = await axios.get(
        "https://app.scrapercity.com/api/v1/database/leads",
        {
          headers: {
            Authorization: `Bearer ${process.env.SCRAPERCITY_API_KEY}`,
          },
          params: {
            title: "Director of Sales",
            industry: "computer software",
            hasEmail: "true",
            limit: "100",
            page: String(page),
          },
        }
      );
      allLeads.push(...response.data.data);
      totalPages = response.data.pagination.totalPages;
      page++;
    } while (page <= totalPages);

    return allLeads;
  },
});

Using axios (rather than $.send.http()) is recommended here because you need to read the response body to extract pagination metadata and pass the full lead array to the next step. Pipedream makes any value you return from a code step available to all downstream steps via the steps object.

5. (Optional) Validate emails before routing

Add an optional email validation step after the pagination loop. For each lead returned, POST the email address to the ScraperCity Email Validator at $0.0036/email. This filters catch-all addresses and undeliverable contacts before they reach your CRM or outreach tool - keeping your sender reputation clean.

import axios from "axios";

export default defineComponent({
  async run({ steps, $ }) {
    const leads = steps.fetch_leads.$return_value; // from previous step
    const validatedLeads = [];

    for (const lead of leads) {
      if (!lead.email) continue;
      try {
        const res = await axios.post(
          "https://app.scrapercity.com/api/v1/email-validator",
          { email: lead.email },
          {
            headers: {
              Authorization: `Bearer ${process.env.SCRAPERCITY_API_KEY}`,
              "Content-Type": "application/json",
            },
          }
        );
        if (res.data.deliverable === true) {
          validatedLeads.push(lead);
        }
      } catch (err) {
        console.log(`Validation error for ${lead.email}`, err.message);
      }
    }

    return validatedLeads;
  },
});

6. Route leads to your destination

Add a destination step after your data is collected and optionally validated. Pipedream has pre-built actions for common destinations - click the + button and search by app name:

  • Google Sheets - use the "Add Single Row" action to append each lead. Map fields from the ScraperCity response (name, email, title, company, phone) to your sheet columns.
  • HubSpot - use "Create Contact" to push validated leads into your CRM pipeline with custom properties.
  • Pipedrive - create Persons and Deals from scraped contacts with a single action step.
  • Airtable - insert leads into a base for review and tagging before outreach.
  • Slack - post a summary message to a channel when a scrape completes, with lead count and a link to the results sheet.
  • Instantly / Smartlead / Lemlist - POST validated leads directly to your cold email tool's API using an HTTP Request step.
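Whichever destination you choose, it helps to flatten each lead into the shape that destination expects before the action step. The sketch below maps a lead object to a Google Sheets row; the field names (name, email, title, company, phone) are assumptions based on the fields listed above, so adjust them to the actual response shape:

```javascript
// Map a ScraperCity lead object to a flat row for the
// Google Sheets "Add Single Row" action.
// Field names are assumptions - verify against a real response
// in the Pipedream Inspector before wiring up the sheet.
function leadToRow(lead) {
  return [
    lead.name ?? "",
    lead.email ?? "",
    lead.title ?? "",
    lead.company ?? "",
    lead.phone ?? "",
  ];
}

// Usage in a code step, before the Sheets action:
//   return steps.validate_emails.$return_value.map(leadToRow);
```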

ScraperCity API endpoints for Pipedream

Every ScraperCity scraper is accessible via the same base URL (https://app.scrapercity.com/api/v1) with Bearer token authentication. The table below shows the endpoints most commonly used in Pipedream workflows.

| Endpoint | What it returns | Cost | Delivery |
| --- | --- | --- | --- |
| GET /apollo | B2B contacts from Apollo by title, industry, location | $0.0039/lead | 11-48+ hrs |
| GET /database/leads | 3M+ B2B contacts, instant query (requires $649/mo plan) | Included in plan | Instant |
| GET /google-maps | Local businesses with phone, email, reviews, website | $0.01/place | 5-30 min |
| POST /email-validator | Deliverability, MX records, catch-all detection | $0.0036/email | 1-10 min |
| POST /email-finder | Business email from name + company domain | $0.05/contact | 1-10 min |
| POST /mobile-finder | Phone numbers from LinkedIn URL or email | $0.25/input | 1-5 min |
| GET /store-leads | Shopify/WooCommerce stores with contacts | $0.0039/lead | Instant |
| GET /status/:runId | Poll the status of an async scrape job | Free | Instant |
| GET /download/:runId | Download CSV results for a completed scrape | Free | Instant |

All endpoints use Authorization: Bearer YOUR_API_KEY in the request header. Apollo scrapes are asynchronous and delivered in 11-48+ hours. For async scrapers, use the Status endpoint to poll job completion or configure a webhook at app.scrapercity.com/dashboard/webhooks to receive a POST notification when results are ready.

Handling async scrapes (Apollo and long-running jobs)

Some ScraperCity scrapers - most notably Apollo - are asynchronous. When you POST a scrape request, the API returns a runId immediately but results are not available for 11-48+ hours. Pipedream cron-triggered workflows have a maximum execution time, so you cannot poll inside a single workflow run for an async scrape. There are two reliable patterns for handling this:

Pattern 1 - Webhook notification (recommended)

Configure a webhook URL in ScraperCity's dashboard (app.scrapercity.com/dashboard/webhooks). Set the URL to a Pipedream HTTP-triggered workflow. When your scrape completes, ScraperCity POSTs the results to that URL and Pipedream fires the workflow automatically - no polling needed.

// Workflow A: Trigger the scrape (Cron trigger)
import axios from "axios";

export default defineComponent({
  async run({ steps, $ }) {
    const res = await axios.post(
      "https://app.scrapercity.com/api/v1/apollo",
      {
        title: "VP of Engineering",
        industry: "saas",
        limit: 500,
      },
      {
        headers: {
          Authorization: `Bearer ${process.env.SCRAPERCITY_API_KEY}`,
          "Content-Type": "application/json",
        },
      }
    );
    // Store the runId to track status if needed
    return { runId: res.data.runId };
  },
});

// Workflow B: Receive webhook when complete (HTTP trigger)
// ScraperCity POSTs results to this workflow's URL
// steps.trigger.event.body contains the lead data

Pattern 2 - Polling with Status endpoint

For shorter async scrapers (Google Maps, Email Finder - typically 1-30 minutes), you can use a second scheduled workflow that polls the GET /api/v1/status/:runId endpoint every few minutes. When the status returns completed, call the Download endpoint to retrieve results and route them to your destination.

import axios from "axios";

export default defineComponent({
  async run({ steps, $ }) {
    const runId = process.env.PENDING_RUN_ID; // store this after triggering

    const status = await axios.get(
      `https://app.scrapercity.com/api/v1/status/${runId}`,
      {
        headers: { Authorization: `Bearer ${process.env.SCRAPERCITY_API_KEY}` },
      }
    );

    if (status.data.status !== "completed") {
      return $.flow.exit("Scrape not ready yet - will retry on next cron tick");
    }

    const results = await axios.get(
      `https://app.scrapercity.com/api/v1/download/${runId}`,
      {
        headers: { Authorization: `Bearer ${process.env.SCRAPERCITY_API_KEY}` },
      }
    );

    return results.data;
  },
});

Troubleshooting common errors

These are the most common issues when integrating ScraperCity with Pipedream, and how to fix each one.

401 Unauthorized

Why it happens: The Authorization header is missing or the API key is wrong.

Fix: Confirm the header is set to Authorization: Bearer YOUR_KEY with no typos. In a code step, verify process.env.SCRAPERCITY_API_KEY returns the correct value by logging it once (then remove the log - do not leave API keys printing to Pipedream's Inspector logs).

429 Too Many Requests

Why it happens: You are sending requests faster than the allowed rate.

Fix: The ScraperCity Lead Database endpoint allows up to 100,000 leads per day at 100 per page. Add a short delay between page requests in your loop if you are paginating at very high speed. Pipedream itself rate-limits HTTP triggers to an average of 10 requests per second - use throttle controls in Workflow Settings if fanning out large batches.
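If you do need to pace or retry requests, a small delay/backoff helper can be dropped into the pagination loop from step 4. This is a generic sketch, not part of the ScraperCity API:

```javascript
// Exponential backoff schedule for retries: 500 ms, 1 s, 2 s, 4 s, ...
function backoffMs(attempt, baseMs = 500) {
  return baseMs * 2 ** attempt;
}

// Promise-based delay, usable with await inside a code step
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Inside the pagination loop you would add, for example:
//   await sleep(500);            // fixed pause between page requests
//   await sleep(backoffMs(n));   // growing pause after the nth retry
```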

Workflow timeout error (red in Inspector)

Why it happens: Your pagination loop takes longer than Pipedream's execution limit. Cron workflows default to 60 seconds.

Fix: Split the work across multiple workflow runs. Scrape one page range per cron tick, storing the current page in an external state store (e.g. a single-cell Google Sheet or a Pipedream data store). Alternatively, use the webhook pattern for async scrapers so no polling loop is needed inside a single run.
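The "one page range per cron tick" approach boils down to reading the last scraped page from your state store, computing the next window, and writing the new high-water mark back. The window math can be kept as a pure function (a sketch; the data-store calls in the comments are illustrative):

```javascript
// Compute the next window of pages to scrape on this cron tick.
// lastPage is read from external state (e.g. a Pipedream data store or
// a single-cell Google Sheet) before the run and written back after it.
function nextPageWindow(lastPage, pagesPerRun, totalPages) {
  const start = lastPage + 1;
  if (start > totalPages) return null; // nothing left to scrape
  const end = Math.min(start + pagesPerRun - 1, totalPages);
  return { start, end };
}

// Illustrative shape of a run (state-store calls are pseudocode):
//   const lastPage = (await readState("lastPage")) ?? 0;
//   const win = nextPageWindow(lastPage, 5, totalPages);
//   if (!win) return $.flow.exit("All pages scraped");
//   // ...scrape pages win.start..win.end...
//   await writeState("lastPage", win.end);
```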

Duplicate request blocked (ScraperCity 30-second dedup)

Why it happens: ScraperCity blocks identical requests made within 30 seconds to prevent accidental double charges.

Fix: This is expected behavior. If you are retrying a failed step, wait at least 30 seconds before resending. Vary at least one query parameter (e.g. page number) if you need to send multiple requests quickly.
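A retry step can encode the 30-second wait so a transient dedup rejection heals itself. This sketch assumes the dedup block surfaces as a request error; the wait is configurable so the pattern is testable with a short delay:

```javascript
// Retry a request once after ScraperCity's 30-second dedup window.
// Assumption: a dedup rejection surfaces as a thrown/rejected request error.
async function retryAfterDedup(requestFn, waitMs = 31000) {
  try {
    return await requestFn();
  } catch (err) {
    // Wait past the 30 s dedup window, then send the identical request once more
    await new Promise((resolve) => setTimeout(resolve, waitMs));
    return await requestFn();
  }
}

// Usage in a code step:
//   const res = await retryAfterDedup(() => axios(buildConfig()));
```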

process.env.SCRAPERCITY_API_KEY returns undefined

Why it happens: The environment variable is being referenced outside of the defineComponent export function, or it was not saved correctly.

Fix: Confirm the variable was saved at Settings > Environment Variables in Pipedream. Ensure your code references process.env inside the run function body. process.env returns undefined when called at the module level outside of defineComponent.

Empty data array returned

Why it happens: The filter parameters returned no matching contacts, or the scrape is still processing.

Fix: For async scrapers (Apollo), the data will not be available until the scrape completes (11-48+ hours). Check the run status using GET /api/v1/status/:runId. For synchronous scrapers, loosen your filter criteria - try removing one filter at a time to identify which constraint is too narrow.


Pipedream vs other automation platforms for ScraperCity

ScraperCity's API works with any HTTP-capable automation tool. Here is how the main options compare for lead scraping workflows specifically.

| Platform | Code steps | Pagination support | Hosting | Best for |
| --- | --- | --- | --- | --- |
| Pipedream | Node.js + Python | Full loop control | Cloud (managed) | Devs who want code + 2,000+ integrations |
| n8n | JavaScript function node | Full loop control | Self-hosted or cloud | Teams wanting self-hosted control + visual builder |
| Zapier | Code step (limited) | No native pagination | Cloud (managed) | No-code single-step triggers, simple routing |
| Make (Integromat) | Limited | Iterator module | Cloud (managed) | Visual scenario builder, moderate complexity |

For bulk lead scraping with pagination, data transformation, and conditional routing, Pipedream or n8n are the strongest choices. Both give you the code access needed to loop through ScraperCity's paginated API responses and handle async scrape jobs correctly.
