Get started
Introduction
The Data Ingestion API enables you to ingest custom data.
This article covers a range of use cases and data types, from working with signal and market price data to using custom market price data to replace existing tradable instruments or create new ones.
Concepts
File: A portion of custom data uploaded to the SigTech platform and added to an existing dataset. To successfully add a file to a dataset, the file's schema must match the dataset's schema.
Dataset: Data that follows a specified schema. A dataset can ingest additional data, provided that the new data adheres to the specified schema. Each additional portion of data added to a dataset is called a file.
Schema: Every dataset uploaded to the SigTech platform must have a schema. A schema defines how the columns of a table are organised: the column names and the type of data each column holds. Once a dataset has been created with a schema, files that follow the same schema can be uploaded and added to it.
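The schema-matching rule described above can be sketched in a few lines of plain Python. This is an illustration only: representing a schema as a mapping of column name to type name is an assumption made for the example, and the helper names `infer_schema` and `schema_mismatches` are hypothetical, not part of the SigTech API.

```python
# Assumption: a schema is modelled as {column name: type name}.
# This is an illustrative representation, not the platform's own format.

def infer_schema(rows):
    """Infer {column: type name} from the first record in a list of dicts."""
    first = rows[0]
    return {name: type(value).__name__ for name, value in first.items()}


def schema_mismatches(dataset_schema, file_schema):
    """Return a list of differences; an empty list means the schemas match."""
    problems = []
    for name, expected in dataset_schema.items():
        if name not in file_schema:
            problems.append(f"missing column: {name}")
        elif file_schema[name] != expected:
            problems.append(
                f"column {name}: expected {expected}, got {file_schema[name]}"
            )
    for name in file_schema:
        if name not in dataset_schema:
            problems.append(f"unexpected column: {name}")
    return problems


dataset_schema = {"date": "str", "ticker": "str", "signal": "float"}

good_file = [{"date": "2024-01-02", "ticker": "ABC", "signal": 0.25}]
bad_file = [{"date": "2024-01-02", "ticker": "ABC", "signal": "high", "note": "x"}]

print(schema_mismatches(dataset_schema, infer_schema(good_file)))
print(schema_mismatches(dataset_schema, infer_schema(bad_file)))
```

Running a check like this locally before uploading gives earlier feedback than waiting for the platform to reject a file whose columns or types differ from the dataset's.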
Prerequisites
Personal access token
To access the Data Ingestion API you need to generate a personal access token. This token is passed in an authorisation header.
To generate the token, click on your user profile in the top right corner of the SigTech platform > Access Tokens > GENERATE TOKEN.
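Once generated, the token can be attached to requests as a header. The sketch below assumes the token is sent using the Bearer scheme and is read from an environment variable named `SIGTECH_ACCESS_TOKEN`; both the scheme and the variable name are assumptions for illustration, so check them against the platform's API reference.

```python
import os


def build_auth_headers(token):
    """Return HTTP headers carrying the personal access token."""
    if not token:
        raise ValueError("a personal access token is required")
    # Assumption: the API expects the Bearer authorisation scheme.
    return {"Authorization": f"Bearer {token}"}


# Hypothetical environment variable name; reading the token from the
# environment keeps it out of source code. The fallback is a placeholder
# so the example runs without a real token.
token = os.environ.get("SIGTECH_ACCESS_TOKEN") or "example-token"
headers = build_auth_headers(token)
```

The resulting `headers` dictionary can then be passed to whichever HTTP client is used to call the API.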
Upload data from local memory or local files
Upload from local memory
There are two available processes:
Creating a new dataset from an existing pandas dataframe.
Adding a pandas dataframe to an existing dataset.
Upload data
Note:
If no dataset_id is provided, the script will generate a new dataset based on the pandas dataframe. The schema of the new dataset will match the column structure of the dataframe.
If a dataset_id is provided, the script will add the pandas dataframe to the existing dataset. This will only succeed if the dataframe's schema matches the existing dataset's schema.
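The branching on dataset_id can be illustrated without any network calls. The following is a minimal sketch, not the platform's upload script: the function names and the action labels `create_dataset` and `add_file` are hypothetical, and plain dict records stand in for a dataframe so the example has no third-party dependencies (with pandas, `DataFrame.to_csv(index=False)` would produce an equivalent CSV string).

```python
import csv
import io


def to_csv_bytes(records):
    """Serialise a list of dict records to UTF-8 CSV bytes, header row first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0]), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue().encode("utf-8")


def plan_upload(records, dataset_id=None):
    """Describe the upload implied by dataset_id without sending anything."""
    # Hypothetical action names; the real API's operations may differ.
    action = "create_dataset" if dataset_id is None else "add_file"
    return {
        "action": action,
        "dataset_id": dataset_id,
        "columns": list(records[0]),  # becomes the schema of a new dataset
        "payload": to_csv_bytes(records),
    }


records = [
    {"date": "2024-01-02", "signal": 0.25},
    {"date": "2024-01-03", "signal": -0.1},
]

new_dataset = plan_upload(records)  # no dataset_id: create
append = plan_upload(records, dataset_id="my-dataset-id")  # dataset_id: add
```

Separating the decision from the request itself makes it easy to verify which path an upload will take before any data leaves the machine.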
Create a new dataset
To create a new dataset, the following code should be added to the above script:
Add data to an existing dataset
Note: Adding data to an existing dataset requires that the dataset_id is provided.
To add data to an existing dataset, the following code can be used together with the script for creating a new dataset:
If a dataset hasn't been created, see Create a new dataset.
Upload from local files
There are two available processes:
Creating a new dataset from an existing local file.
Adding the contents from a local file to an existing dataset.
The API currently supports the upload of parquet, csv, xls, and xlsx file formats.
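A local check of the file extension can catch unsupported formats before an upload is attempted. This is a minimal sketch: the helper name `check_upload_format` is hypothetical, and it inspects only the file name, not the file contents.

```python
from pathlib import Path

# File formats listed as supported for upload.
SUPPORTED_EXTENSIONS = {".parquet", ".csv", ".xls", ".xlsx"}


def check_upload_format(path):
    """Return the lower-cased extension, or raise if it is not supported."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        supported = ", ".join(sorted(SUPPORTED_EXTENSIONS))
        raise ValueError(
            f"unsupported file format {suffix!r}; expected one of: {supported}"
        )
    return suffix


print(check_upload_format("data/signals.parquet"))
print(check_upload_format("prices.CSV"))
```

The comparison is case-insensitive, so a file named `prices.CSV` is accepted in the same way as `prices.csv`.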
Upload data
If no dataset_id is provided, the script will generate a new dataset.
Create a new dataset
To create a new dataset, the following code should be added to the script for uploading data:
Add data to an existing dataset
Note: Adding data to an existing dataset requires that the dataset_id is provided.
To add data to an existing dataset, the following code can be used together with the script for creating a new dataset:
If a dataset hasn't been created, see Create a new dataset.