Platform tools

This page is a collection of methods used to save and retrieve custom user data inside the SigTech platform.

The following statement imports the Python module:

from sigtech import platform_tools

Saving DataFrames and series

Pandas DataFrames and series can be saved to a file in a user’s designated AWS S3 bucket and accessed later using the same workspace.

The following code block demonstrates how to save a Pandas DataFrame into a .csv file and store it in the AWS S3 bucket:

from sigtech import platform_tools as pt
import pandas as pd

hist = pd.DataFrame({'data': [1, 2, 3, 4]})
pt.save_raw(hist, 'signal_data.csv')

Once saved, the file can be loaded with the get_raw function:

from sigtech import platform_tools as pt
hist = pt.get_raw('signal_data.csv')

Saving other file formats

Other file types can be saved to S3 using the save_file function and retrieved using the open_file function within Platform Tools.

Example: a NumPy array can be written as bytes to a file-like object, which is then passed to the save_file function:

import io
import numpy as np
from sigtech import platform_tools as pt

results = np.array([])

# Serialize the array into an in-memory file-like object
file_obj = io.BytesIO()
np.save(file_obj, results, allow_pickle=True)

# We place the pointer back at the beginning of the file-like object
file_obj.seek(0)
pt.save_file(file_obj, 'results')
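
The reason for rewinding the pointer can be seen with the standard library alone. The following minimal sketch does not call the platform; it only shows that, after a write, the position of a BytesIO object sits at the end of the data, so a reader sees nothing until the position is reset. The variable names are illustrative.

```python
import io

buffer = io.BytesIO()
buffer.write(b"payload")

# After writing, the position is at the end of the 7 bytes just written.
position_after_write = buffer.tell()

# Reading from the end returns no data.
unread = buffer.read()

# Rewinding to the start makes the full content readable again.
buffer.seek(0)
content = buffer.read()
```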

The file can later be retrieved with the open_file function, which works similarly to Python’s built-in open function:

import numpy as np
from sigtech import platform_tools as pt

with pt.open_file('results') as f:
    results = np.load(f, allow_pickle=True)
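
The same pattern may apply to other formats that can be written to a file-like object. The following is a minimal sketch for JSON, assuming that save_file accepts any binary file-like object and that open_file returns a binary stream (the NumPy example suggests this, but it is not confirmed here). The helpers to_json_buffer and from_json_stream, the file name params.json, and the sample values are hypothetical; the platform calls appear only in comments so that the block runs locally.

```python
import io
import json


def to_json_buffer(obj):
    """Serialize obj as JSON into an in-memory binary buffer, rewound to the start."""
    buffer = io.BytesIO()
    buffer.write(json.dumps(obj).encode("utf-8"))
    buffer.seek(0)  # rewind so the next reader starts at the first byte
    return buffer


def from_json_stream(stream):
    """Parse JSON from a binary file-like object."""
    return json.loads(stream.read().decode("utf-8"))


# Illustrative values only
params = {"lookback": 20, "assets": ["A", "B"]}
buffer = to_json_buffer(params)

# On the platform, the buffer would be handed to save_file and read back
# with open_file (assumed usage, not run here):
#   pt.save_file(buffer, 'params.json')
#   with pt.open_file('params.json') as f:
#       restored = from_json_stream(f)

# Local round trip standing in for the platform calls
restored = from_json_stream(buffer)
```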

Uploading in the workspace

Data can also be uploaded through the Custom Data pathway in your workspace.

To upload a file, select the workspace. Then, select Custom Data. Follow the onscreen instructions to upload your file.

Once uploaded, the file can be accessed with the open_file function. The following example uses the file name "dummy_data.csv"; change it to match the name of the file you uploaded before running the code block:

from sigtech import platform_tools
import pandas as pd

# assuming the file name is dummy_data.csv
# the with statement closes the file once the data has been read
with platform_tools.open_file("dummy_data.csv") as f:
    df = pd.read_csv(f)

Data Ingestion API

The functions get_dataset and get_dataset_file can be used to retrieve datasets and individual files previously uploaded using the data ingestion API.

Results are returned as a Pandas DataFrame. In the following example, '<id>' is a placeholder; replace it with the IDs of your own dataset and file.

Learn more: Data Ingestion API

from sigtech import platform_tools as pt

dataset_id = '<id>'
file_id = '<id>'

full_dataset = pt.get_dataset(dataset_id)
individual_file_df = pt.get_dataset_file(dataset_id, file_id)