API documentation
Datasets
Get datasets
GET
/datasets
Retrieves a list of all created datasets that the user is permitted to view.
Headers
Name | Type | Description |
---|---|---|
Token | string | Personal access token |
Retrieves a list of all datasets within a user's AWS S3 bucket.
Example:
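A minimal Python sketch of this request, using only the standard library. The base URL and token value are placeholders rather than values from this documentation, and the request is built but not sent.

```python
import urllib.request

API_URL = "https://api.example.com"  # hypothetical base URL (assumption)
TOKEN = "my-personal-access-token"   # placeholder token (assumption)

# Build a GET /datasets request carrying the Token header.
req = urllib.request.Request(
    f"{API_URL}/datasets",
    headers={"Token": TOKEN},
    method="GET",
)

# Sending is left commented out so the sketch runs without network access:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```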
Post dataset
POST
/datasets
Create a single new dataset resource.
If schema is not provided, then all of file, upload_format, file_format and file_id are required.
Headers
Name | Type | Description |
---|---|---|
Token | string | Personal access token |
Request Body
Name | Type | Description |
---|---|---|
name | string | Dataset name |
tags | string | Key-value pairs in the format: {[string]: [string]} |
schema | string | Schema of data in the format: [{"name": [string], "type": [string]}] |
file | string | File |
upload_format | string | Format of upload, one of: base64, link |
file_format | string | Format of file, one of: parquet, csv, xls, xlsx |
file_id | string | ID of file |
Creates a new dataset and generates a new UUID.
Example (with schema):
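The following sketch builds a request that creates a dataset from a schema. It assumes the body is sent as JSON and that column types use pyarrow-style names; the base URL, token, dataset name, tags, and columns are all illustrative placeholders.

```python
import json
import urllib.request

API_URL = "https://api.example.com"  # hypothetical base URL (assumption)
TOKEN = "my-personal-access-token"   # placeholder token (assumption)

# Field names come from the Request Body table; the values are illustrative.
body = {
    "name": "sales",
    "tags": {"team": "analytics"},
    "schema": [
        {"name": "order_id", "type": "int64"},   # assumed pyarrow-style type name
        {"name": "customer", "type": "string"},
    ],
}

# Assumption: the API accepts a JSON-encoded body.
req = urllib.request.Request(
    f"{API_URL}/datasets",
    data=json.dumps(body).encode("utf-8"),
    headers={"Token": TOKEN, "Content-Type": "application/json"},
    method="POST",
)
```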
Note: If a schema is not provided with the request, users must include values for file, upload_format, file_format, and file_id.
Example (without schema):
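A sketch of the schema-less variant, with a client-side check of the required-field rule stated above. The helper name `build_create_request`, the base URL, the pre-signed link, and the file ID are hypothetical, and a JSON body is assumed.

```python
import json
import urllib.request

API_URL = "https://api.example.com"  # hypothetical base URL (assumption)
TOKEN = "my-personal-access-token"   # placeholder token (assumption)

REQUIRED_WITHOUT_SCHEMA = ("file", "upload_format", "file_format", "file_id")


def build_create_request(body: dict) -> urllib.request.Request:
    """Build POST /datasets, rejecting bodies that lack both a schema and file fields."""
    if "schema" not in body:
        missing = [key for key in REQUIRED_WITHOUT_SCHEMA if key not in body]
        if missing:
            raise ValueError(f"fields required without a schema: {missing}")
    return urllib.request.Request(
        f"{API_URL}/datasets",
        data=json.dumps(body).encode("utf-8"),  # assumption: JSON body
        headers={"Token": TOKEN, "Content-Type": "application/json"},
        method="POST",
    )


req = build_create_request({
    "name": "sales",
    "file": "https://example.com/presigned/sales.csv",  # hypothetical pre-signed link
    "upload_format": "link",
    "file_format": "csv",
    "file_id": "sales-2024",  # hypothetical file ID
})
```

Checking the rule before sending avoids a round trip for a request the server would presumably reject.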
Notes:
A UUID for the new dataset is randomly generated and returned in the response body.
A schema can be provided in the request body in two ways:
1. As an array of objects with name and type fields, each representing a single column in the dataset. An attempt will be made to parse the schema into a pyarrow table.
2. As an instruction to download a file and parse it into a pyarrow table.
Schemas are currently unenforced. Files with different schemas may be uploaded to the same dataset. Although these individual files will be accessible, the entire dataset will be unreadable.
Although datasets may share the same name, IDs must be unique.
Get dataset
GET
/dataset/<dataset_id>
Get details for a single dataset resource.
Path Parameters
Name | Type | Description |
---|---|---|
dataset_id | string | Dataset ID |
Headers
Name | Type | Description |
---|---|---|
Token | string | Personal access token |
Retrieves a list of all available file IDs and pre-signed download links for a specific dataset.
Example:
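A short sketch of fetching a single dataset, using the path exactly as documented for this endpoint. The base URL, token, and dataset ID are placeholders.

```python
import urllib.parse
import urllib.request

API_URL = "https://api.example.com"  # hypothetical base URL (assumption)
TOKEN = "my-personal-access-token"   # placeholder token (assumption)
DATASET_ID = "123e4567-e89b-12d3-a456-426614174000"  # hypothetical dataset UUID

# Percent-encode the ID so it is safe to place in the URL path.
path = f"/dataset/{urllib.parse.quote(DATASET_ID, safe='')}"
req = urllib.request.Request(API_URL + path, headers={"Token": TOKEN}, method="GET")
```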
Put dataset
PUT
/datasets/<dataset_id>
Create or replace a single dataset resource.
If schema is not provided, then all of file, upload_format, file_format and file_id are required.
Path Parameters
Name | Type | Description |
---|---|---|
api_url | string | Domain |
dataset_id | string | Dataset ID |
Headers
Name | Type | Description |
---|---|---|
Token | string | Personal access token |
Request Body
Name | Type | Description |
---|---|---|
name | string | Dataset name |
tags | string | Key-value pairs in the format: {[string]: [string]} |
schema | string | Schema of data in the format: [{"name": [string], "type": [string]}] |
file | string | File |
upload_format | string | Format of upload, one of: base64, link |
file_format | string | Format of file, one of: parquet, csv, xls, xlsx |
file_id | string | ID of file |
Creates or replaces a specific dataset.
Note: If a schema is not provided with the request, users must include values for file, upload_format, file_format, and file_id.
Example (with schema):
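A sketch of creating or replacing a dataset at a caller-chosen ID with a schema. A JSON body and pyarrow-style type names are assumed; the base URL, token, dataset ID, and column definitions are illustrative.

```python
import json
import urllib.request

API_URL = "https://api.example.com"  # hypothetical base URL (assumption)
TOKEN = "my-personal-access-token"   # placeholder token (assumption)
DATASET_ID = "123e4567-e89b-12d3-a456-426614174000"  # hypothetical dataset UUID

body = {
    "name": "sales",
    "schema": [{"name": "order_id", "type": "int64"}],  # assumed type name
}

# PUT targets the dataset ID in the path, so repeating the call replaces the resource.
req = urllib.request.Request(
    f"{API_URL}/datasets/{DATASET_ID}",
    data=json.dumps(body).encode("utf-8"),  # assumption: JSON body
    headers={"Token": TOKEN, "Content-Type": "application/json"},
    method="PUT",
)
```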
Example (without schema):
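A sketch of the schema-less variant using the base64 upload format. It assumes standard base64 encoding and a JSON body; the base URL, token, IDs, and CSV content are made up for illustration.

```python
import base64
import json
import urllib.request

API_URL = "https://api.example.com"  # hypothetical base URL (assumption)
TOKEN = "my-personal-access-token"   # placeholder token (assumption)
DATASET_ID = "123e4567-e89b-12d3-a456-426614174000"  # hypothetical dataset UUID

raw_csv = b"order_id,customer\n1,alice\n2,bob\n"  # illustrative file content

body = {
    "name": "sales",
    "file": base64.b64encode(raw_csv).decode("ascii"),  # assumed standard base64
    "upload_format": "base64",
    "file_format": "csv",
    "file_id": "sales-2024",  # hypothetical file ID
}

req = urllib.request.Request(
    f"{API_URL}/datasets/{DATASET_ID}",
    data=json.dumps(body).encode("utf-8"),  # assumption: JSON body
    headers={"Token": TOKEN, "Content-Type": "application/json"},
    method="PUT",
)
```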
The notes for Post dataset also apply here.
Delete dataset
DELETE
/dataset/<dataset_id>
Delete a single dataset resource.
Path Parameters
Name | Type | Description |
---|---|---|
api_url | string | Domain |
dataset_id | string | Dataset ID |
Headers
Name | Type | Description |
---|---|---|
Token | string | Personal access token |
Deletes a specific dataset.
Example:
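A sketch of the delete call, using the path as documented for this endpoint. The base URL, token, and dataset ID are placeholders.

```python
import urllib.request

API_URL = "https://api.example.com"  # hypothetical base URL (assumption)
TOKEN = "my-personal-access-token"   # placeholder token (assumption)
DATASET_ID = "123e4567-e89b-12d3-a456-426614174000"  # hypothetical dataset UUID

# DELETE carries no request body; only the Token header is needed.
req = urllib.request.Request(
    f"{API_URL}/dataset/{DATASET_ID}",
    headers={"Token": TOKEN},
    method="DELETE",
)
```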
Files
Get dataset files
GET
/datasets/<dataset_id>/files
Get a collection of pre-signed download links for each file uploaded to a single dataset resource.
Path Parameters
Name | Type | Description |
---|---|---|
api_url | string | Domain |
dataset_id | string | Dataset ID |
Headers
Name | Type | Description |
---|---|---|
Token | string | Personal access token |
Retrieves a list of files uploaded to a specific dataset.
Example:
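A sketch of listing the files in a dataset. The helper name `dataset_files_url`, the base URL, the token, and the dataset ID are hypothetical.

```python
import urllib.parse
import urllib.request

API_URL = "https://api.example.com"  # hypothetical base URL (assumption)
TOKEN = "my-personal-access-token"   # placeholder token (assumption)


def dataset_files_url(api_url: str, dataset_id: str) -> str:
    """Return the files collection URL for one dataset, with the ID percent-encoded."""
    return f"{api_url}/datasets/{urllib.parse.quote(dataset_id, safe='')}/files"


req = urllib.request.Request(
    dataset_files_url(API_URL, "123e4567-e89b-12d3-a456-426614174000"),
    headers={"Token": TOKEN},
    method="GET",
)
```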
Post dataset file
POST
/datasets/<dataset_id>/files
Create a single new file within a dataset resource.
Path Parameters
Name | Type | Description |
---|---|---|
api_url | string | Domain |
dataset_id | string | Dataset ID |
Headers
Name | Type | Description |
---|---|---|
Token | string | Personal access token |
Request Body
Name | Type | Description |
---|---|---|
file | string | File |
upload_format | string | Format of upload, one of: base64, link |
file_format | object | Format of file, one of: parquet, csv, xls, xlsx |
file_id | string | ID of file |
Creates a new parquet file within a specific dataset.
Example:
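A sketch of a base64 file upload with a client-side size check based on the 10 MB limit mentioned in the notes for this endpoint. Several details are assumptions: the limit is read as 10 MiB of raw bytes, the body is JSON, and file_format is passed as a plain format name even though the table lists its type as object. The helper name, base URL, token, and IDs are hypothetical.

```python
import base64
import json
import urllib.request

API_URL = "https://api.example.com"  # hypothetical base URL (assumption)
TOKEN = "my-personal-access-token"   # placeholder token (assumption)
MAX_BASE64_BYTES = 10 * 1024 * 1024  # assumption: "10MB" read as 10 MiB of raw bytes


def build_file_upload_request(dataset_id: str, file_id: str, raw: bytes,
                              file_format: str) -> urllib.request.Request:
    """Build POST /datasets/<dataset_id>/files for a base64 upload."""
    if len(raw) > MAX_BASE64_BYTES:
        raise ValueError("file too large for base64; use upload_format 'link' instead")
    body = {
        "file": base64.b64encode(raw).decode("ascii"),
        "upload_format": "base64",
        "file_format": file_format,  # assumption: plain format name
        "file_id": file_id,
    }
    return urllib.request.Request(
        f"{API_URL}/datasets/{dataset_id}/files",
        data=json.dumps(body).encode("utf-8"),  # assumption: JSON body
        headers={"Token": TOKEN, "Content-Type": "application/json"},
        method="POST",
    )


req = build_file_upload_request(
    "123e4567-e89b-12d3-a456-426614174000",  # hypothetical dataset UUID
    "sales-2024",                            # hypothetical file ID
    b"order_id,customer\n1,alice\n",
    "csv",
)
```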
Notes:
A dataset file must be provided in a format parsable into a pyarrow table.
This pyarrow table is exported as a parquet file into S3. Download links for these files are retrievable via GET requests.
The raw file provided will be uploaded to S3 and is also retrievable via a GET request.
The dataset file must be provided in a form corresponding to one of the available upload_format values:
link: Provide a pre-signed download link for the file.
base64: Provide the file in the form of base64-encoded bytes, with a maximum size of 10 MB.
The file formats supported are parquet, csv, and Excel files (xls, xlsx).
Additional parsing parameters are available depending on format:
CSV: Optional arguments such as delimiters can be specified in one of the following: read_options, parse_options, convert_options.
Excel (xls/xlsx): Alternative parameters for io or engine cannot be passed.
Parquet: No additional arguments are available.
Delete dataset files
DELETE
/dataset/<dataset_id>/files
Delete all files in a dataset.
Path Parameters
Name | Type | Description |
---|---|---|
api_url | string | Domain |
dataset_id | string | Dataset ID |
Headers
Name | Type | Description |
---|---|---|
Token | string | Personal access token |
Deletes all files from a specific dataset.
Example:
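A sketch of deleting every file in a dataset, using the path as documented for this endpoint. The base URL, token, and dataset ID are placeholders.

```python
import urllib.request

API_URL = "https://api.example.com"  # hypothetical base URL (assumption)
TOKEN = "my-personal-access-token"   # placeholder token (assumption)
DATASET_ID = "123e4567-e89b-12d3-a456-426614174000"  # hypothetical dataset UUID

# One DELETE on the files collection removes all files in the dataset.
req = urllib.request.Request(
    f"{API_URL}/dataset/{DATASET_ID}/files",
    headers={"Token": TOKEN},
    method="DELETE",
)
```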
Notes:
Even if the dataset no longer exists, files that once resided in that dataset will be deleted without error.
This request deletes both parsed and raw files within a dataset.
Get dataset file
GET
/dataset/<dataset_id>/<file_id>
Get download links for a single dataset file.
Path Parameters
Name | Type | Description |
---|---|---|
api_url | string | Domain |
dataset_id | string | Dataset ID |
file_id | string | File ID |
Headers
Name | Type | Description |
---|---|---|
Token | string | Personal access token |
Retrieves a list of details for a specific file within a specific dataset.
Example:
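A sketch of fetching one file's details, following the path shown for this endpoint. The base URL, token, dataset ID, and file ID are placeholders.

```python
import urllib.parse
import urllib.request

API_URL = "https://api.example.com"  # hypothetical base URL (assumption)
TOKEN = "my-personal-access-token"   # placeholder token (assumption)
DATASET_ID = "123e4567-e89b-12d3-a456-426614174000"  # hypothetical dataset UUID
FILE_ID = "sales 2024"  # hypothetical file ID containing a space

# Percent-encode both path segments so IDs with special characters stay valid.
segments = [urllib.parse.quote(part, safe="") for part in (DATASET_ID, FILE_ID)]
req = urllib.request.Request(
    f"{API_URL}/dataset/{segments[0]}/{segments[1]}",
    headers={"Token": TOKEN},
    method="GET",
)
```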
Put dataset file
PUT
/datasets/<dataset_id>/files/<file_id>
Create or replace a single dataset file.
Headers
Name | Type | Description |
---|---|---|
Token | string | Personal access token |
Request Body
Name | Type | Description |
---|---|---|
file | string | File |
upload_format | string | Format of upload, one of: base64, link |
file_format | object | Format of file, one of: parquet, csv, xls, xlsx |
file_id | string | ID of file |
Creates or replaces a dataset file.
Example:
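A sketch of creating or replacing one file through a pre-signed link. It assumes a JSON body, passes file_format as a plain format name, and repeats the file ID in the body because the table lists it there; the base URL, token, IDs, and link are hypothetical.

```python
import json
import urllib.request

API_URL = "https://api.example.com"  # hypothetical base URL (assumption)
TOKEN = "my-personal-access-token"   # placeholder token (assumption)
DATASET_ID = "123e4567-e89b-12d3-a456-426614174000"  # hypothetical dataset UUID
FILE_ID = "sales-2024"  # hypothetical file ID

body = {
    "file": "https://example.com/presigned/sales.parquet",  # hypothetical pre-signed link
    "upload_format": "link",
    "file_format": "parquet",  # assumption: plain format name
    "file_id": FILE_ID,        # kept identical to the ID in the path
}

req = urllib.request.Request(
    f"{API_URL}/datasets/{DATASET_ID}/files/{FILE_ID}",
    data=json.dumps(body).encode("utf-8"),  # assumption: JSON body
    headers={"Token": TOKEN, "Content-Type": "application/json"},
    method="PUT",
)
```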
Note: Optional parameters can also be provided for file parsing logic. To learn more, see the notes included for Post Dataset File.
Delete dataset file
DELETE
/dataset/<dataset_id>/files/<file_id>
Delete a file within a dataset.
Headers
Name | Type | Description |
---|---|---|
Token | string | Personal access token |
Deletes a specific dataset file.
Example:
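A sketch of deleting a single file, using the path as documented for this endpoint. The base URL, token, dataset ID, and file ID are placeholders.

```python
import urllib.request

API_URL = "https://api.example.com"  # hypothetical base URL (assumption)
TOKEN = "my-personal-access-token"   # placeholder token (assumption)
DATASET_ID = "123e4567-e89b-12d3-a456-426614174000"  # hypothetical dataset UUID
FILE_ID = "sales-2024"  # hypothetical file ID

# DELETE on the file path; no request body is needed.
req = urllib.request.Request(
    f"{API_URL}/dataset/{DATASET_ID}/files/{FILE_ID}",
    headers={"Token": TOKEN},
    method="DELETE",
)
```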
Notes:
Even if the dataset no longer exists, files that once resided in that dataset will be deleted without error.
This request deletes both parsed and raw files within a dataset.