Reference
Overview¶
Simple query parameters (and corresponding values) are passed to the API via querystrings (GET), as such:
sourcestack-api.com/companies?category=Fashion&fields=company_name,url&export=gsheet
Each query must contain exactly one of the following search_params:
['name', 'url', 'category', 'uses_product', 'uses_category', 'parent', 'filters']
To combine the logic of more than one of the above search_params, use filters (POST).
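As a rough sketch of the one-search_param rule above, the following Python helper assembles a GET querystring and rejects calls that do not carry exactly one search_param. The helper name `build_query_url`, the `https://` scheme, and the omission of any authentication are assumptions made for illustration; none of them is confirmed by this reference.

```python
from urllib.parse import quote, urlencode

# Assumption: the API is served over https at this host.
BASE_URL = "https://sourcestack-api.com"

# The documented search_params. "filters" is normally sent via POST; it is
# listed here only so the one-of check matches the documented list.
SEARCH_PARAMS = ("name", "url", "category", "uses_product",
                 "uses_category", "parent", "filters")


def build_query_url(endpoint, **params):
    """Hypothetical helper: build a GET URL carrying exactly one search_param."""
    used = [p for p in SEARCH_PARAMS if p in params]
    if len(used) != 1:
        raise ValueError(f"expected exactly one search_param, got {used}")
    # Keep commas literal and encode spaces as %20 rather than '+'.
    query = urlencode(params, safe=",", quote_via=quote)
    return f"{BASE_URL}/{endpoint.strip('/')}?{query}"


print(build_query_url("companies", category="Fashion",
                      fields="company_name,url", export="gsheet"))
```

Passing two search_params (for example both `name` and `category`) raises a `ValueError` in this sketch, which mirrors the documented advice to switch to filters in that case.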
Parameters¶
param name | description | accepted values | default | required |
---|---|---|---|---|
filters | Allows querying on any field - see Advanced Filtering docs | accepted values | | one of |
name | Query on the name of a company, product, SKU, or job | str | | one of |
url | Query on the url of a company, product, SKU, or job | str or comma,sep,str | | one of |
category | Query on the category of a company, product, SKU, or job | accepted values | | one of |
uses_product | Query on companies or jobs that use a specific product | str | | one of |
uses_category | Query on companies or jobs that use a specific category of product (e.g. if they use a CRM) | accepted values | | one of |
parent | Query on the parent company of a product, SKU, or job | str | | one of |
exact | Whether the results must exactly match the search_param; cannot be combined with filters (instead, use Equals and Contains Any operators) | True or False | False | optional |
fields | Specify which data fields to include in the response | accepted values | default fields | optional |
count_only | If set to True, the output will be the number of entries, rather than the entries themselves. Setting this to True overrides fields, order_by, export, and limit | True or False | False | optional |
export | If set, the output will be written to the specified destination instead of returned to the caller | csv, gsheet, airtable, xml, json_gzip, or caller | caller | optional |
limit | Maximum number of entries to return | int | 10 | optional |
order_by | Order the response by the values in one or more data fields | accepted values | None | optional |
max_per_field | See the Max Per section below | accepted values | None | optional |
max_per_value | See the Max Per section below | int | None | optional |
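Two interactions in the table are easy to miss: count_only=True overrides fields, order_by, export, and limit, and exact cannot be combined with filters. A minimal client-side sketch of those two rules follows; the function name `effective_params` is hypothetical, and the server may handle these cases differently from what is shown here.

```python
# Parameters that the table says are overridden when count_only is True.
OVERRIDDEN_BY_COUNT_ONLY = ("fields", "order_by", "export", "limit")


def effective_params(params):
    """Hypothetical helper: return the params that still matter for a request."""
    if "filters" in params and "exact" in params:
        raise ValueError("exact cannot be combined with filters")
    result = dict(params)
    # Accept both the boolean True and the querystring spelling "True".
    if str(result.get("count_only", "False")).lower() == "true":
        for key in OVERRIDDEN_BY_COUNT_ONLY:
            result.pop(key, None)
    return result


print(effective_params({"name": "bakery", "count_only": True, "limit": 50}))
```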
Data Fields¶
You can specify which data fields are returned in a request by providing fields=comma,separated,values:
sourcestack-api.com/jobs?name=Engineer&fields=job_name,hours,post_url
If the fields= querystring is not included in a request, each endpoint will return its default fields:
endpoint | default fields |
---|---|
/companies | url, company_name, title, categories, tags_matched, tag_categories, last_indexed, status_code |
/products | product_url, company_url, company_name, product_name, categories |
/skus | sku_url, domain, sku_name, sku_vendor, company_name, categories, sku_price, variant_count, sku_created_at, last_indexed |
/jobs | job_name, hours, department, seniority, remote, company_name, company_url, post_url, tags_matched, tag_categories, job_location, city, region, country, last_indexed |
Specifying only needed fields will substantially speed up requests and reduce the likelihood of very large requests failing.
Order By¶
Info
By default, query results will be sorted (but not ordered) by domain, alphabetically.
SourceStack order_by is similar to SQL ORDER BY; it is applied before the limit, and you can set multiple orderings separated by commas:
sourcestack-api.com/companies?name=bakery&exact=False&order_by=Alexa_Rank ASC,Description DESC
Ordering with DESC, the default, is as follows:
field type | example field | ordering |
---|---|---|
text | description | Z -> A |
number | alexa_rank | 1000000 -> 1 |
array | categories | [[y,a], [a,z]] -> [[a,z], [y,a]] (sorts on first item) |
Please note that order_by on very large requests may add unacceptably long latency, as ordering is not parallelizable in our system.
Tip
Order By also supports fully randomized ordering - order_by=RANDOM
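For clients that assemble order_by programmatically, a small sketch of building the comma-separated value is shown here. The helper name `order_by_value` is hypothetical; it writes the direction out explicitly and falls back to DESC, which the section above names as the default.

```python
def order_by_value(orderings):
    """Hypothetical helper: join orderings into an order_by querystring value.

    Each item is either a field name (direction defaults to DESC) or a
    (field, direction) tuple.
    """
    parts = []
    for item in orderings:
        field, direction = (item, "DESC") if isinstance(item, str) else item
        direction = direction.upper()
        if direction not in ("ASC", "DESC"):
            raise ValueError(f"unknown direction: {direction}")
        parts.append(f"{field} {direction}")
    return ",".join(parts)


print(order_by_value([("Alexa_Rank", "ASC"), "Description"]))
```

Randomized ordering is not covered by this helper; pass order_by=RANDOM directly in that case.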
Max Per¶
To further refine queries, you can cap the number of entries returned per distinct value of a given field with Max_Per_Field and Max_Per_Limit.
For example, to find up to 10 plant SKUs per store that sells them:
https://sourcestack-api.com/skus?name=plant&max_per_field=domain&max_per_limit=10
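To make the "maximum entries per value" idea concrete, here is a local emulation in Python. It only illustrates the semantics described above on an in-memory list; the function name `cap_per_value` and the dict-shaped entries are assumptions, and this reference does not state which entries the API keeps when a value exceeds the cap.

```python
from collections import defaultdict


def cap_per_value(entries, field, max_per):
    """Keep at most max_per entries for each distinct value of field."""
    counts = defaultdict(int)
    kept = []
    for entry in entries:
        key = entry.get(field)
        if counts[key] < max_per:
            counts[key] += 1
            kept.append(entry)
    return kept


skus = [
    {"domain": "a.com", "sku_name": "fern"},
    {"domain": "a.com", "sku_name": "ivy"},
    {"domain": "b.com", "sku_name": "moss"},
]
print(cap_per_value(skus, "domain", 1))
```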
Count_Only mode.
Querying Multiple URLs¶
The url search parameter will accept both a single value (url=sourcestack.co), and multiple values separated by commas:
sourcestack-api.com/companies?url=chess.com,sudoku.com
If you wish to send more than 400 urls per request, use Advanced Filtering. If too many values are provided with the url= param, the request will return status code 413 or 414.
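If Advanced Filtering is not an option, one possible workaround is to split a long list into several requests of at most 400 urls each. The sketch below only builds the comma-separated url= values; the helper name `url_batches` is hypothetical, and very long individual urls could presumably still trigger a 413 or 414 below the 400 mark.

```python
def url_batches(urls, batch_size=400):
    """Yield comma-separated url= values with at most batch_size urls each."""
    for start in range(0, len(urls), batch_size):
        yield ",".join(urls[start:start + batch_size])


for value in url_batches(["chess.com", "sudoku.com", "go.com"], batch_size=2):
    print(value)
```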
Formatting¶
Querystring parameter values with spaces in them are displayed with spaces for readability. Your API client will (generally) URL encode them for you. If not, find and replace spaces with %20.
Query search_param values are case-insensitive and whitespace stripped, even with exact=True provided. With exact=False set, internal spaces in search_param values are additionally treated as optional, e.g. name=Big Query will correctly return the product BigQuery.
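The two formatting rules above can be sketched in Python as follows. `encode_value` uses the standard library's percent-encoding; `normalize_search_value` is a hypothetical, client-side approximation of the case and whitespace handling described here, not the server's actual matching logic (which this page does not document beyond these rules).

```python
from urllib.parse import quote


def encode_value(value):
    """Percent-encode a querystring value, keeping commas literal."""
    return quote(value, safe=",")


def normalize_search_value(value, exact=False):
    """Approximate the documented rules: case-insensitive, whitespace
    stripped, and internal spaces optional when exact is False."""
    value = value.strip().lower()
    return value if exact else value.replace(" ", "")


print(encode_value("Big Query"))
print(normalize_search_value(" Big Query "))
```

Under this approximation, "Big Query" and "BigQuery" normalize to the same string when exact=False, which is consistent with the example in the paragraph above.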
You can pass a url value as any of the following:
- https://domain.tld
- https://www.domain.tld
- www.domain.tld
- domain.tld
- domain.tld?some_utm=some_specific_thing_that_will_be_discarded
- domain1.com,domain2.com,domain3.com
Do note: subsites (domain.tld/specific-subsite) will be included.
In-Data Status Codes¶
SourceStack data entries have a status_code field which is similar to - but not exactly the same as - standard HTTP status codes.
Entries' status_code value provides additional context on the status of that particular entry.
The following status_code values are present in the queryable datasets:
status_code | datasets | description | examples |
---|---|---|---|
200 | all | Standard success | |
310 | companies | SaaS specific Holding Page | Shopify Holding Page, BigCommerce Holding Page |
311 | companies | URL redirects to a social network page | Twitter, Twitch, Calendly |
312 | companies | URL redirects to an eCom marketplace page | Amazon, Gumroad, Etsy |
313 | companies | URL redirects to a standalone Landing Page builder page | Carrd, Notion |
410 | companies | Domain Registrar specific Holding Page | GoDaddy Holding Page, HugeDomains Holding Page |
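When post-processing results, it can help to group entries by status_code, for instance to separate holding pages from live sites. The sketch below copies the descriptions from the table above; the helper name `split_by_status` and the assumption that entries arrive as dicts with a status_code key are illustrative only.

```python
# Descriptions copied from the status_code table.
STATUS_CODES = {
    200: "Standard success",
    310: "SaaS specific Holding Page",
    311: "URL redirects to a social network page",
    312: "URL redirects to an eCom marketplace page",
    313: "URL redirects to a standalone Landing Page builder page",
    410: "Domain Registrar specific Holding Page",
}


def split_by_status(entries):
    """Group entries (assumed to be dicts) by their status_code value."""
    groups = {}
    for entry in entries:
        groups.setdefault(entry.get("status_code"), []).append(entry)
    return groups


companies = [
    {"url": "a.com", "status_code": 200},
    {"url": "b.com", "status_code": 310},
    {"url": "c.com", "status_code": 200},
]
for code, group in split_by_status(companies).items():
    print(code, STATUS_CODES.get(code, "Unknown"), [e["url"] for e in group])
```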
Failed entries are not included in the queryable datasets, so you will never be charged for them. Examples of failed entries include:
- websites that respond with 404
- job posts that have been taken down
- SKUs that are no longer available for sale