Reference
Overview¶
Simple query parameters and their values are passed to the API in the querystring of a GET request:
sourcestack-api.com/jobs?category=Sales&fields=post_url,job_name&export=gsheet
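As a minimal sketch of how such a querystring could be assembled in Python, the helper below uses only the standard library. The name `build_get_url` is hypothetical and not part of the API; the `https://` scheme is taken from the POST example in this section.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://sourcestack-api.com"


def build_get_url(endpoint, **params):
    """Return a GET URL for `endpoint` with `params` encoded as a querystring.

    Hypothetical helper for illustration only; it does not send a request.
    """
    # quote_via=quote encodes spaces as %20 (rather than "+"), and safe=","
    # keeps comma-separated lists readable instead of encoding them as %2C.
    query = urlencode(params, safe=",", quote_via=quote)
    return f"{BASE_URL}/{endpoint}?{query}"


url = build_get_url("jobs", category="Sales", fields="post_url,job_name", export="gsheet")
spaced = build_get_url("companies", name="Big Query")
```

Here `url` reproduces the example querystring above, and `spaced` shows a value containing a space being encoded as `Big%20Query`.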
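Queries can also be sent as a JSON body in a POST request:
import json
import os
import requests

# The API key is read from the SOURCESTACK_KEY environment variable.
response = requests.post(
    "https://sourcestack-api.com/jobs",
    headers={
        "X-API-KEY": os.environ["SOURCESTACK_KEY"],
        "Content-Type": "application/json",
    },
    # The query itself is serialized as the JSON request body.
    data=json.dumps({
        "export": "gsheet",
        "fields": ["post_url", "company_name", "company_url", "categories"],
        "filters": [{"field": "categories", "operator": "CONTAINS", "value": "Big Data"}],
    }),
)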
Parameters¶
| param name | description | accepted values | default | required |
|---|---|---|---|---|
| filters | Allows querying on any field - see Advanced Filtering docs | accepted values | | one of |
| name | Query on the name of a company, product, SKU, or job | str | | one of |
| url | Query on the url of a company, product, SKU, or job | str or comma,sep,str | | one of |
| category | Query on the category of a company, product, SKU, or job | accepted values | | one of |
| uses_product | Query on companies or jobs that use a specific product | str | | one of |
| uses_category | Query on companies or jobs that use a specific category of product (e.g. if they use a CRM) | accepted values | | one of |
| parent | Query on the parent company of a product, SKU, or job | str | | one of |
| exact | Whether the results must exactly match the search_param; cannot be combined with filters (instead, use Equals and Contains Any operators) | True or False | False | optional |
| fields | Specify which data fields to include in the response | accepted values | default fields | optional |
| count_only | If set to True, the output will be the number of entries, rather than the entries themselves. Setting this to True overrides fields, order_by, export, and limit | True or False | False | optional |
| export | If set, the output will be written to the specified destination instead of returned to the caller | csv, gsheet, airtable, xml, json_gzip, or caller | caller | optional |
| limit | Maximum number of entries to return | int | 10 | optional |
| order_by | Order the response by the values in one or more data fields | accepted values | None | optional |
| max_per_field | See the Max Per section below | accepted values | None | optional |
| max_per_value | See the Max Per section below | int | None | optional |
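The constraints in the table can be checked on the client before a request is sent. The sketch below is illustrative only: `check_params` is a hypothetical helper, and it assumes that "one of" means at least one of the search parameters must be present, which the table does not state explicitly.

```python
# Search parameters marked "one of" in the table above.
SEARCH_PARAMS = {"filters", "name", "url", "category", "uses_product", "uses_category", "parent"}
# Parameters that count_only=True overrides, per the table.
OVERRIDDEN_BY_COUNT_ONLY = {"fields", "order_by", "export", "limit"}


def check_params(params):
    """Return a list of human-readable problems found in a params dict."""
    problems = []
    # Assumption: at least one search parameter is required.
    if not params.keys() & SEARCH_PARAMS:
        problems.append("no search parameter given")
    if "exact" in params and "filters" in params:
        problems.append("exact cannot be combined with filters")
    if params.get("count_only"):
        ignored = sorted(params.keys() & OVERRIDDEN_BY_COUNT_ONLY)
        if ignored:
            problems.append(f"count_only overrides: {', '.join(ignored)}")
    return problems


issues = check_params({"name": "Engineer", "count_only": True, "limit": 5, "fields": "job_name"})
```

An empty list means none of these three checks failed; it does not guarantee that the API will accept the request.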
Data Fields¶
You can specify which data fields are returned in a request by providing fields=comma,separated,values:
sourcestack-api.com/jobs?name=Engineer&fields=job_name,hours,post_url
If the fields= querystring is not included in a request, each endpoint will return its default fields:
| endpoint | default fields |
|---|---|
| /companies | url, company_name, title, categories, tags_matched, tag_categories, last_indexed, status_code |
| /products | product_url, company_url, company_name, product_name, categories |
| /skus | sku_url, domain, sku_name, sku_vendor, company_name, categories, sku_price, variant_count, sku_created_at, last_indexed |
| /jobs | job_name, hours, department, seniority, remote, company_name, company_url, post_url, tags_matched, tag_categories, job_location, city, region, country, last_indexed |
Specifying only needed fields will substantially speed up requests and reduce the likelihood of very large requests failing.
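The fallback to default fields can be mirrored on the client, for example to know which keys to expect in a response. This is a sketch under the assumption that the defaults match the table above; `fields_param` and `expected_fields` are hypothetical helper names.

```python
# Default fields per endpoint, copied from the table above.
DEFAULT_FIELDS = {
    "companies": ["url", "company_name", "title", "categories", "tags_matched",
                  "tag_categories", "last_indexed", "status_code"],
    "products": ["product_url", "company_url", "company_name", "product_name", "categories"],
    "skus": ["sku_url", "domain", "sku_name", "sku_vendor", "company_name", "categories",
             "sku_price", "variant_count", "sku_created_at", "last_indexed"],
    "jobs": ["job_name", "hours", "department", "seniority", "remote", "company_name",
             "company_url", "post_url", "tags_matched", "tag_categories", "job_location",
             "city", "region", "country", "last_indexed"],
}


def fields_param(fields):
    """Join field names into the comma-separated value used by fields=."""
    return ",".join(fields)


def expected_fields(endpoint, fields=None):
    """Return the requested fields, or the endpoint's defaults if none were given."""
    return list(fields) if fields else DEFAULT_FIELDS[endpoint]


value = fields_param(["job_name", "hours", "post_url"])
```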
Order By¶
Info
By default, query results will be sorted (but not ordered) by domain, alphabetically.
SourceStack order_by is similar to SQL ORDER BY; it will apply before the limit, and you can set multiple orderings separated by commas:
sourcestack-api.com/companies?name=bakery&exact=False&order_by=Alexa_Rank ASC,Description DESC
Ordering with DESC, the default, is as follows:
| field type | example field | ordering |
|---|---|---|
| text | description | Z -> A |
| number | alexa_rank | 1000000 -> 1 |
| array | categories | [[y,a], [a,z]] -> [[a,z], [y,a]] (sorts on first item) |
Please note that order_by on very large requests may add unacceptably long latency, as ordering is not parallelizable in our system.
Tip
Order By also supports fully randomized ordering - order_by=RANDOM
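A small sketch of how an order_by value might be assembled from (field, direction) pairs. `build_order_by` is a hypothetical helper; the only values it emits are the forms shown in this section (`field ASC`, `field DESC`, and `RANDOM`).

```python
def build_order_by(orderings):
    """Build an order_by value from (field, direction) pairs, or pass "RANDOM" through."""
    if orderings == "RANDOM":
        return "RANDOM"
    parts = []
    for field, direction in orderings:
        direction = direction.upper()
        if direction not in ("ASC", "DESC"):
            raise ValueError(f"unknown direction: {direction!r}")
        parts.append(f"{field} {direction}")
    # Multiple orderings are separated by commas.
    return ",".join(parts)


order_by = build_order_by([("Alexa_Rank", "asc"), ("Description", "DESC")])
```

The resulting string contains spaces, so it needs URL encoding before being placed in a querystring if the HTTP client does not do that automatically.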
Max Per¶
To further refine queries, you can cap the number of entries returned for each distinct value of a given field, using max_per_field and max_per_limit.
For example, to find up to 10 plant SKUs per store that sells them:
https://sourcestack-api.com/skus?name=plant&max_per_field=domain&max_per_limit=10
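To illustrate the intended effect, the sketch below applies the same kind of cap to a local list of entries. It is not the server's implementation, and which entries the API keeps when a value exceeds the cap is not specified here; this version simply keeps the first ones it encounters. The sample data is made up.

```python
from collections import defaultdict


def cap_per_value(entries, field, limit):
    """Keep at most `limit` entries for each distinct value of `field`."""
    seen = defaultdict(int)
    kept = []
    for entry in entries:
        key = entry.get(field)
        if seen[key] < limit:
            seen[key] += 1
            kept.append(entry)
    return kept


# Made-up sample entries for illustration.
skus = [
    {"domain": "a.com", "sku_name": "fern"},
    {"domain": "a.com", "sku_name": "cactus"},
    {"domain": "b.com", "sku_name": "ivy"},
    {"domain": "a.com", "sku_name": "palm"},
]
capped = cap_per_value(skus, field="domain", limit=2)
```

With a limit of 2 per domain, the third a.com entry is dropped while the single b.com entry is kept.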
Count_Only mode.
Querying Multiple URLs¶
The url search parameter will accept both a single value (url=sourcestack.co), and multiple values separated by commas:
sourcestack-api.com/companies?url=chess.com,sudoku.com
If you wish to send more than 400 urls per request, use Advanced Filtering. If too many values are provided with the url= param, the request will return status code 413 or 414.
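One client-side way to stay under this limit is to split a long list into several requests. The sketch below shows only the batching step and is an alternative to Advanced Filtering, not a replacement for it; `chunk_urls` is a hypothetical helper, and the domains are generated placeholders.

```python
def chunk_urls(urls, size=400):
    """Split `urls` into comma-joined batches of at most `size` entries each."""
    return [",".join(urls[i:i + size]) for i in range(0, len(urls), size)]


# Placeholder domains for illustration.
domains = [f"example{i}.com" for i in range(1000)]
batches = chunk_urls(domains)
```

Each element of `batches` can then be passed as the value of url= in a separate request.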
Formatting¶
Querystring parameter values with spaces in them are displayed with spaces for readability. Your API client will (generally) URL encode them for you. If not, find and replace spaces with %20.
Search parameter values are case-insensitive and have leading and trailing whitespace stripped, even when exact=True is provided. With exact=False, internal spaces are also optional - e.g. name=Big Query will correctly return the product BigQuery.
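The normalization described above can be pictured with the following sketch. It models only the case and whitespace handling, not how the API actually matches values, and `normalize_search_value` is a hypothetical name.

```python
def normalize_search_value(value, exact=False):
    """Lowercase and strip a search value; drop internal spaces when exact is False."""
    cleaned = value.strip().lower()
    if not exact:
        # With exact=False, internal whitespace is treated as optional.
        cleaned = "".join(cleaned.split())
    return cleaned


loose = normalize_search_value("Big Query")
strict = normalize_search_value("  Big Query ", exact=True)
```

Under this model, `Big Query` and `BigQuery` normalize to the same string when exact=False, while exact=True preserves the internal space.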
You can pass a url value as any of the following:
- https://domain.tld
- https://www.domain.tld
- www.domain.tld
- domain.tld
- domain.tld?some_utm=will_be_discarded
- domain1.com,domain2.com,domain3.com
Do note: subsites (domain.tld/specific-subsite) will be included.
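As a rough illustration of why the single-value forms above are interchangeable, the sketch below reduces each of them to the same string. It is an assumption about the effect, not the API's actual logic; `normalize_url` is a hypothetical helper, and handling of `http://` is included as an extra assumption.

```python
def normalize_url(value):
    """Reduce a url value to a bare domain (plus path, if any)."""
    cleaned = value.strip().lower()
    # Discard the querystring, e.g. UTM parameters.
    cleaned = cleaned.split("?", 1)[0]
    # Drop the scheme; http:// is handled as an assumption.
    for prefix in ("https://", "http://"):
        if cleaned.startswith(prefix):
            cleaned = cleaned[len(prefix):]
    if cleaned.startswith("www."):
        cleaned = cleaned[len("www."):]
    return cleaned.rstrip("/")


forms = [
    "https://domain.tld",
    "https://www.domain.tld",
    "www.domain.tld",
    "domain.tld",
    "domain.tld?some_utm=will_be_discarded",
]
normalized = {normalize_url(form) for form in forms}
```

All five forms collapse to a single value, while a path such as `domain.tld/specific-subsite` is left in place.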