Reference

Parameters

Query parameters and their values are passed to the API in the querystring, for example:

sourcestack-api.com/companies?category=Fashion&export=gsheet

Each query must contain exactly one of the following search_params:

['name', 'url', 'category', 'uses_product', 'uses_category', 'parent', 'filters']

To query on more than one of these at once, use filters.
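
As a minimal sketch of the rule above, the helper below builds a request URL and rejects calls that do not carry exactly one search_param. The `https://` scheme, the `BASE_URL` constant, and the `build_query_url` name are assumptions for illustration; the docs show the host without a scheme.

```python
from urllib.parse import urlencode

# Assumption: the API is served over HTTPS at this host.
BASE_URL = "https://sourcestack-api.com"

# The search_params listed above; exactly one may appear per query.
SEARCH_PARAMS = {"name", "url", "category", "uses_product",
                 "uses_category", "parent", "filters"}


def build_query_url(endpoint: str, **params) -> str:
    """Return a request URL, enforcing the one-search_param rule client-side."""
    used = sorted(SEARCH_PARAMS.intersection(params))
    if len(used) != 1:
        raise ValueError(f"expected exactly one search_param, got {used}")
    return f"{BASE_URL}/{endpoint.strip('/')}?{urlencode(params)}"


print(build_query_url("companies", category="Fashion", export="gsheet"))
```

Checking this before sending the request is only a convenience; the API remains the authority on what it accepts.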


| param name | description | accepted values | default | required |
| --- | --- | --- | --- | --- |
| name | Query on the name of a company, product, SKU, or job | str | | one of |
| url | Query on the url of a company, product, SKU, or job | str or comma,sep,str | | one of |
| category | Query on the category of a company, product, SKU, or job | accepted values | | one of |
| uses_product | Query on companies or jobs that use a specific product | str | | one of |
| uses_category | Query on companies or jobs that use a specific category of product (e.g. if they use a CRM) | accepted values | | one of |
| parent | Query on the parent company of a product, SKU, or job | str | | one of |
| filters | Allows querying on any field - see Advanced Filtering docs | accepted values | | one of |
| exact | Whether the results must exactly match the search_param | True or False | False | optional |
| fields | Specify which data fields to include in the response | accepted values | default fields | optional |
| order_by | Order the response by the values in one or more data fields | accepted values | None | optional |
| count_only | If set to True, the output will be the number of entries, rather than the entries themselves. Setting this to True overrides fields, order_by, export, and limit | True or False | False | optional |
| export | If set, the output will be written to the specified destination instead of returned to the caller | ['s3_csv', 'gsheet', 'airtable', 'caller'] | caller | optional |
| limit | Maximum number of entries to return | int | 10 | optional |
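
Because count_only=True overrides fields, order_by, export, and limit, a client can drop those parameters before sending the request. The sketch below illustrates this; the `effective_params` helper is hypothetical and not part of the API.

```python
# Parameters that, per the table above, count_only=True overrides.
OVERRIDDEN_BY_COUNT_ONLY = ("fields", "order_by", "export", "limit")


def effective_params(params: dict) -> dict:
    """Drop parameters that would have no effect when count_only is True."""
    if str(params.get("count_only", "False")).lower() != "true":
        return dict(params)
    return {k: v for k, v in params.items() if k not in OVERRIDDEN_BY_COUNT_ONLY}


print(effective_params({"category": "Fashion", "count_only": True, "limit": 50}))
```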


Data Fields

You can specify which data fields are returned in a request by providing fields=comma,separated,values:

sourcestack-api.com/jobs?name=Engineer&fields=job_name,hours,post_url


If the fields= querystring is not included in a request, each endpoint will return its default fields:

| endpoint | default fields |
| --- | --- |
| /companies | url, company_name, title, categories, tags_matched, tag_categories, last_indexed, status_code |
| /products | product_url, company_url, company_name, product_name, categories |
| /skus | sku_url, domain, sku_name, sku_vendor, company_name, categories, sku_price, variant_count, sku_created_at, last_indexed |
| /jobs | post_url, company_url, job_name, company_name, job_location, hours, department, seniority, remote, tags_matched, tag_categories, last_indexed |

Specifying only needed fields will substantially speed up requests and reduce the likelihood of very large requests failing.
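
A small sketch of assembling the fields= value from a list, reproducing the /jobs example above. The `https://` scheme and the `jobs_url` helper are assumptions for illustration. Commas are left unencoded so the URL reads the same as in these docs.

```python
from urllib.parse import urlencode

# Assumption: the API is served over HTTPS at this host.
BASE_URL = "https://sourcestack-api.com"


def jobs_url(name: str, fields: list[str]) -> str:
    """Build a /jobs query that requests only the given data fields."""
    params = {"name": name, "fields": ",".join(fields)}
    # safe="," keeps the comma separators readable instead of %2C.
    return f"{BASE_URL}/jobs?{urlencode(params, safe=',')}"


print(jobs_url("Engineer", ["job_name", "hours", "post_url"]))
```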


Order By

Info

By default, query results will be sorted (but not ordered) by domain, alphabetically.

SourceStack order_by is similar to SQL ORDER BY; it will apply before the limit, and you can set multiple orderings separated by commas:

sourcestack-api.com/companies?name=bakery&exact=False&order_by=Alexa_Rank ASC,Description DESC

As with SQL, you can specify ASC (ascending) or DESC (descending); if neither is provided, the default is DESC.

Ordering with DESC, the default, is as follows:

| field type | example field | ordering |
| --- | --- | --- |
| text | description | Z -> A |
| number | alexa_rank | 1000000 -> 1 |
| array | categories | [[y,a], [a,z]] -> [[a,z], [y,a]] (sorts on first item) |

Please note that order_by on very large requests may add unacceptably long latency, as ordering is not parallelizable in our system.
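
The order_by value can be assembled from (field, direction) pairs, as in the sketch below. The `order_by_value` helper is hypothetical; the field names are taken from the table above. A direction of None leaves the choice to the API, which defaults to DESC.

```python
from urllib.parse import quote


def order_by_value(orderings) -> str:
    """Join (field, direction) pairs into an order_by value.

    direction may be "ASC", "DESC", or None (API default, DESC).
    """
    parts = []
    for field, direction in orderings:
        if direction is None:
            parts.append(field)
        elif direction.upper() in ("ASC", "DESC"):
            parts.append(f"{field} {direction.upper()}")
        else:
            raise ValueError(f"unknown direction: {direction!r}")
    return ",".join(parts)


value = order_by_value([("alexa_rank", "ASC"), ("description", "DESC")])
print(value)
# URL-encode the spaces if your client does not do it for you.
print(quote(value, safe=","))
```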

Tip

Order By also supports fully randomized ordering - order_by=RANDOM


Multiple URLs

The url search parameter will accept both a single value (url=sourcestack.co), and multiple values separated by commas:

sourcestack-api.com/companies?url=chess.com,sudoku.com

Aim to keep queries below 400 URLs per request. If too many values are provided with the url= param, the request will return HTTP status code 413 or 414.
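
One way to stay under that limit is to split a long list into batches and send one request per batch. This is a minimal sketch; the `url_batches` helper and the default batch size of 400 are illustrative choices based on the guidance above, not a documented API feature.

```python
def url_batches(urls: list[str], size: int = 400):
    """Yield comma-separated url= values holding at most `size` URLs each."""
    for start in range(0, len(urls), size):
        yield ",".join(urls[start:start + size])


# Hypothetical input: 1000 placeholder domains.
domains = [f"site{i}.tld" for i in range(1000)]
batches = list(url_batches(domains))
print(len(batches))  # 3 batches: 400 + 400 + 200 URLs
```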


In-Data Status Codes

SourceStack data entries have a status_code field which is similar to - but not exactly the same as - standard HTTP status codes.

Each entry's status_code value provides additional context on the status of that particular entry.

The following status_code values are present in the queryable datasets:

| status_code | datasets | description | examples |
| --- | --- | --- | --- |
| 200 | all | Standard success | |
| 310 | companies | SaaS specific Holding Page | Shopify Holding Page, BigCommerce Holding Page |
| 311 | companies | URL redirects to a social network page | Twitter, Twitch, Calendly |
| 312 | companies | URL redirects to an eCom marketplace page | Amazon, Gumroad, Etsy |
| 313 | companies | URL redirects to a standalone Landing Page builder page | Carrd, Notion |
| 410 | companies | Domain Registrar specific Holding Page | GoDaddy Holding Page, HugeDomains Holding Page |
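
A client might keep these values in a lookup table to label or filter returned entries. The sketch below assumes each entry is a dict with a status_code key; the sample entries and the `describe_status` helper are hypothetical.

```python
# Descriptions copied from the table above.
STATUS_CODE_MEANINGS = {
    200: "Standard success",
    310: "SaaS specific Holding Page",
    311: "URL redirects to a social network page",
    312: "URL redirects to an eCom marketplace page",
    313: "URL redirects to a standalone Landing Page builder page",
    410: "Domain Registrar specific Holding Page",
}


def describe_status(entry: dict) -> str:
    """Return the documented meaning of an entry's status_code."""
    code = entry.get("status_code")
    return STATUS_CODE_MEANINGS.get(code, f"unlisted status_code: {code}")


# Hypothetical entries, shaped like trimmed /companies results.
entries = [
    {"url": "example-shop.tld", "status_code": 200},
    {"url": "example-parked.tld", "status_code": 410},
]
live = [e["url"] for e in entries if e["status_code"] == 200]
print(live)
print(describe_status(entries[1]))
```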

Failed entries are not included in the queryable datasets, so you will never be charged for them. Examples of failed entries include:

  • websites that respond with 404
  • job posts that have been taken down
  • SKUs that are no longer available for sale


Formatting

Querystring parameter values with spaces in them are displayed with spaces for readability. Your API client will (generally) URL encode them for you. If not, find and replace spaces with %20.
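
If you build querystrings by hand in Python, the standard library can do the encoding. The sketch below uses `urllib.parse.quote` so that spaces become %20 (the default `quote_plus` would emit + instead); the parameter values are only examples.

```python
from urllib.parse import quote, urlencode

params = {"name": "Big Query", "exact": "False"}

# quote_via=quote encodes spaces as %20 rather than "+".
querystring = urlencode(params, quote_via=quote)
print(querystring)
```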

Query search_param values are case-insensitive and have leading and trailing whitespace stripped, even with exact=True provided. With exact=False set, internal spaces are also optional - e.g. name=Big Query will correctly return the product BigQuery.

You can pass a url value as any of the following:

  • https://domain.tld
  • https://www.domain.tld
  • www.domain.tld
  • domain.tld
  • domain.tld?some_utm=some_specific_thing_that_will_be_discarded
  • domain1.com,domain2.com,domain3.com

Do note: subsites (domain.tld/specific-subsite) will be included.
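
When collecting URLs from mixed sources, it can help to reduce them to one comparable form and deduplicate before querying. The sketch below is a client-side approximation only: it assumes the scheme, a leading www., and the querystring can be dropped while any path is kept. The `normalize_url` helper is hypothetical and is not a description of how the API itself processes values.

```python
from urllib.parse import urlsplit


def normalize_url(value: str) -> str:
    """Reduce a URL to lowercase host plus path, for client-side deduplication."""
    value = value.strip()
    # Without a scheme, urlsplit would treat the domain as a path,
    # so prefix "//" to mark it as the network location.
    if "://" not in value.split("?", 1)[0]:
        value = "//" + value
    parts = urlsplit(value)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    return host + parts.path.rstrip("/")


raw = ["https://www.domain.tld", "domain.tld?some_utm=x", "domain.tld/specific-subsite"]
unique = sorted({normalize_url(u) for u in raw})
print(unique)
```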