Data Retrieval#
Introduction#
The purpose of this primer is to describe the various CLS datasets available and demonstrate how to access this data.
CLS operates the largest multi-currency cash settlement system.
CLS Market Data is a comprehensive suite of FX alternative data products designed to provide quality insight for financial efficiency, visibility and control.
These data sets can be accessed and analysed in the SigTech platform.
A notebook containing all the code used in this page can be accessed in the research environment: Example notebooks.
Environment#
This section imports the relevant internal and external libraries and sets up the platform environment. For further information on setting up the environment, see Environment setup.
import sigtech.framework as sig
from sigtech.framework.analytics.cls_data import ClsData, CLS_DATASETS, CLS_FREQ, CLS_INSTRUMENTS, CLS_PUBLICATION_FREQ, resample_reduced, remove_weekends, CLS_CCY_CROSSES
if not sig.config.is_initialised():
    env = sig.config.init()
An instance of the CLS data class (ClsData) is created. This instance provides access to the various data sets and parameters.
cls_data = ClsData()
Python:
CLS_DATASETS??
Output:
Init signature: CLS_DATASETS()
Docstring: <no docstring>
Source:
class CLS_DATASETS:
    VOLUME = 'fx.volume'
    FLOW = 'fx.flow'
    VOLUME_FORECAST = 'fx.forecast'
    PRICING = 'fx.pricing'
File: ~/.pyenv/versions/3.8.8/envs/signotebook/lib/python3.8/site-packages/sigtech/framework/analytics/cls_data.py
Type: type
Python:
CLS_INSTRUMENTS??
Output:
Init signature: CLS_INSTRUMENTS()
Docstring: <no docstring>
Source:
class CLS_INSTRUMENTS:
    SPOT = 'SPT'
    FORWARDS = 'ORF'
    SWAPS = 'SWP'
File: ~/.pyenv/versions/3.8.8/envs/signotebook/lib/python3.8/site-packages/sigtech/framework/analytics/cls_data.py
Type: type
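Outside an IPython session the `??` introspection shown above is not available. A minimal sketch of listing the same constants in plain Python follows. The two classes are stand-ins copied from the source printed above so that the snippet runs on its own (in the platform you would import the real ones), and `list_constants` is a hypothetical helper, not part of the SigTech framework.

```python
# Stand-in classes mirroring the source shown above; in the platform these
# would be imported from sigtech.framework.analytics.cls_data instead.
class CLS_DATASETS:
    VOLUME = 'fx.volume'
    FLOW = 'fx.flow'
    VOLUME_FORECAST = 'fx.forecast'
    PRICING = 'fx.pricing'


class CLS_INSTRUMENTS:
    SPOT = 'SPT'
    FORWARDS = 'ORF'
    SWAPS = 'SWP'


def list_constants(cls):
    """Hypothetical helper: map each public class attribute name to its value."""
    return {name: value for name, value in vars(cls).items()
            if not name.startswith('_')}


datasets = list_constants(CLS_DATASETS)
instruments = list_constants(CLS_INSTRUMENTS)
print(datasets)
print(instruments)
```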
Data retrieval#
The ClsData instance provides various helper methods to retrieve the data. The full dataset for a given cross can be directly loaded using the retrieve_cls_data method. This method takes the currency cross, data set, instrument type, publication frequency and data frequency as inputs.
In the example below we obtain daily volume data for EURUSD.
Python:
cls_data.retrieve_cls_data('EURUSD', CLS_DATASETS.VOLUME, CLS_INSTRUMENTS.SPOT, CLS_PUBLICATION_FREQ.DAILY, CLS_FREQ.DAILY).head()
Output:
Specific configurations of data can be loaded with the load_data_for_cross method. This method caches each query and returns formatted output containing the primary columns.
Python:
cls_data.load_data_for_cross(cls_data.SPOT_VOLUME_DAILY_DAILY, 'EURUSD').head()
Output:
A selection of currency crosses can be queried together using the data method.
Python:
cls_data.data(cls_data.SPOT_VOLUME_DAILY_DAILY, cross_list=['EURUSD'])['EURUSD'].head()
Output:
Data Sets #
Spot Pricing #
The CLS FX prices dataset provides information on the executed trade volume that is submitted to the CLS Settlement and Aggregation services.
In determining the time of submission, CLS receives and matches two sides for each trade, one per counterparty. CLS uses the earlier of the two submission times as the trade time proxy. CLS receives confirmation on the majority of trades from Settlement Members within two minutes of trade execution.
The intraday FX prices dataset includes both matched and unmatched data. The unmatched data represents the trades where CLS is still awaiting the remaining side of the trade details from the other trading party before completing the match. Both data points are included to provide subscribers with the complete picture of activity submitted to CLS at the conclusion of the hour.
The daily FX prices dataset is based on matched data. The matched data represents the trades where CLS has received both sides of the trade details from each of the trading parties, thereby completing the match.
Python:
cls_data.retrieve_cls_data('EURUSD', dataset=CLS_DATASETS.PRICING, instrument=CLS_INSTRUMENTS.SPOT,
                           frequency=CLS_FREQ.FIVE_MINUTE, publication_frequency=CLS_PUBLICATION_FREQ.DAILY).head()
Output:
Formatted retrieval example:
Python:
cls_data.load_data_for_cross(cls_data.SPOT_PRICE_5MIN_DAILY, 'EURUSD').head()
Output:
Volume #
This dataset provides accurate real-time FX volume data from CLS Group, which settles 50% of global FX transaction activity. It is available for spot, FX swaps and forwards across 33 currency pairs.
In determining the time of submission, CLS receives and matches two sides for each trade, one per counterparty. CLS uses the earlier of the two submission times as the trade time proxy. CLS receives confirmation on the majority of trades from Settlement Members within two minutes of trade execution.
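The trade time proxy rule described above can be sketched in a few lines. This is an illustration only: `trade_time_proxy` is a hypothetical helper and the two timestamps are made-up values, not CLS data.

```python
from datetime import datetime


def trade_time_proxy(side_a_submitted, side_b_submitted):
    """Return the earlier of the two submission times for one trade."""
    return min(side_a_submitted, side_b_submitted)


# Made-up submission times for the two sides of a single trade.
side_a = datetime(2021, 1, 4, 10, 0, 45)
side_b = datetime(2021, 1, 4, 10, 1, 30)

proxy = trade_time_proxy(side_a, side_b)
# Gap between the two submissions, i.e. how long the match had to wait.
gap = max(side_a, side_b) - proxy
print(proxy, gap)
```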
The Intraday Hourly FX Volume includes both matched and unmatched data. The unmatched data represents the trades where CLS is still awaiting the remaining side of the trade details from the other trading party before completing the match. Both data points are included to provide subscribers with the complete picture of activity submitted to CLS at the conclusion of the hour.
The volume provided is quoted in USD.
There are three different trade instruments available:
FX Spot
Outright Forwards - Transactions involving the exchange of two currencies at a rate agreed on the date of the contract for value or delivery (cash settlement) on a date other than the spot date.
FX Swaps - Transactions involving the exchange of two currencies for delivery on a specific date at an agreed rate (the near leg), and a reverse exchange of the same two currencies at a date further in the future at an agreed rate (the far leg). The delivery dates and rates of both legs are agreed on the date of the contract.
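The near leg and far leg structure of an FX swap defined above can be sketched as a small data structure. The class names, the example dates, rates and notional are all hypothetical; the sketch takes the point of view of the party that buys the base currency on the near leg.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Leg:
    value_date: date  # delivery date agreed on the contract date
    rate: float       # exchange rate agreed on the contract date


@dataclass(frozen=True)
class FxSwap:
    pair: str
    base_notional: float
    near: Leg
    far: Leg

    def __post_init__(self):
        # The far leg is delivered at a date further in the future.
        if self.far.value_date <= self.near.value_date:
            raise ValueError('far leg must settle after the near leg')

    def cash_flows(self):
        """List of (value date, base amount, quote amount) for both legs."""
        n = self.base_notional
        return [
            (self.near.value_date, +n, -n * self.near.rate),  # initial exchange
            (self.far.value_date, -n, +n * self.far.rate),    # reverse exchange
        ]


swap = FxSwap('EURUSD', 1_000_000,
              near=Leg(date(2021, 1, 6), 1.2300),
              far=Leg(date(2021, 2, 8), 1.2310))
flows = swap.cash_flows()
print(flows)
```

An outright forward corresponds to a single such leg with a value date other than the spot date.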
Spot
Python:
cls_data.retrieve_cls_data('EURUSD', dataset=CLS_DATASETS.VOLUME, instrument=CLS_INSTRUMENTS.SPOT,
                           frequency=CLS_FREQ.HOURLY, publication_frequency=CLS_PUBLICATION_FREQ.DAILY).head()
Output:
Formatted retrieval example:
Python:
cls_data.load_data_for_cross(cls_data.SPOT_VOLUME_HOURLY_DAILY, 'EURUSD').head()
Output:
Forwards
Python:
cls_data.retrieve_cls_data('EURUSD', dataset=CLS_DATASETS.VOLUME, instrument=CLS_INSTRUMENTS.FORWARDS,
                           frequency=CLS_FREQ.HOURLY, publication_frequency=CLS_PUBLICATION_FREQ.DAILY).head()
Output:
Formatted retrieval example:
Python:
cls_data.load_data_for_cross(cls_data.FWD_VOLUME_DAILY_DAILY, 'EURUSD')
Output:
FX Swaps
Python:
cls_data.retrieve_cls_data('EURUSD', dataset=CLS_DATASETS.VOLUME, instrument=CLS_INSTRUMENTS.SWAPS,
                           frequency=CLS_FREQ.HOURLY, publication_frequency=CLS_PUBLICATION_FREQ.DAILY).head()
Output:
Formatted retrieval example:
Python:
cls_data.load_data_for_cross(cls_data.SWP_VOLUME_DAILY_DAILY, 'EURUSD')
Output:
Spot Volume Forecasts #
The CLS FX forecast data provides forecasted volumes, over 120 hours in eight currency pairs, based on historical executed trade volume that is submitted to the CLS Settlement and Aggregation services.
Python:
cls_data.retrieve_cls_data('EURUSD', dataset=CLS_DATASETS.VOLUME_FORECAST, instrument=CLS_INSTRUMENTS.SPOT,
                           frequency=CLS_FREQ.HOURLY, publication_frequency=CLS_PUBLICATION_FREQ.DAILY).head()
Output:
Formatted retrieval example:
Python:
cls_data.load_data_for_cross(cls_data.SPOT_FORECAST_DAILY_DAILY, 'EURUSD').head()
Output:
Flow #
This new dataset offers a real-time view of FX flows by type of participant and direction of transaction.
CLS sorts FX market participants into four distinct categories based on their static identifying information: “banks”, “funds”, “corporates” and “non-bank financial firms”.
In addition, CLS uses historical transaction patterns to identify market participants as price-takers and market-makers. This identification is done separately for each FX pair in the dataset; thus a bank may be a market-maker in one FX pair and a price-taker in another FX pair.
The dataset includes two types of records. First, it includes the aggregate behaviour of all price-takers and market-makers, for each FX pair and hourly time window. Second, it includes the aggregate behaviour of non-bank price-takers grouped by their specific category, for each FX pair and hourly time window. The second type thus has separate records for fund<>bank, corporate<>bank, and non-bank-financial<>bank transactions.
The volumes are quoted in the base currency.
Flow data is available for spot, forwards and swaps.
We note the following facts:
The term “buy-side” is used interchangeably with “price-taker”, and the term “sell-side” interchangeably with “market-maker”. The classification of market participants into price-taker and market-maker categories is based on their aggregated historical transaction patterns, not on the side taken or behaviour observed in any particular trade.
Transactions between two market-makers are excluded from this dataset.
Python:
cls_data.retrieve_cls_data('EURUSD', CLS_DATASETS.FLOW, CLS_INSTRUMENTS.SPOT, CLS_PUBLICATION_FREQ.DAILY, CLS_FREQ.HOURLY).head()
Output:
Formatted retrieval example:
Python:
cls_data.load_data_for_cross(cls_data.SPOT_FLOW_HOURLY_DAILY, 'EURUSD').head()
Output:
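Two of the rules described in this section, namely that the price-taker or market-maker role is assigned per FX pair and that trades between two market-makers are excluded, can be sketched as follows. The participants, roles and trades are invented for illustration, and `in_flow_dataset` is a hypothetical helper that applies only the exclusion rule stated above; it does not reproduce how CLS builds the dataset.

```python
# Hypothetical role table: a role is assigned per (participant, FX pair).
ROLES = {
    ('Bank A', 'EURUSD'): 'market-maker',
    ('Bank A', 'USDJPY'): 'price-taker',   # same bank, different role
    ('Bank B', 'EURUSD'): 'market-maker',
    ('Bank B', 'USDJPY'): 'market-maker',
    ('Fund C', 'EURUSD'): 'price-taker',
}

# Invented trades: (pair, buyer, seller, volume in base currency).
TRADES = [
    ('EURUSD', 'Fund C', 'Bank A', 5.0),
    ('EURUSD', 'Bank A', 'Bank B', 8.0),   # two market-makers in EURUSD
    ('USDJPY', 'Bank A', 'Bank B', 3.0),   # Bank A is a price-taker in USDJPY
]


def in_flow_dataset(trade):
    """Keep a trade unless both counterparties are market-makers in that pair."""
    pair, buyer, seller, _volume = trade
    roles = {ROLES[(buyer, pair)], ROLES[(seller, pair)]}
    return roles != {'market-maker'}


kept = [trade for trade in TRADES if in_flow_dataset(trade)]
print(kept)
```

The same two banks trade in both pairs, but only the EURUSD trade between them is dropped, because Bank A acts as a price-taker in USDJPY.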
Data retrieval helper methods #
The ClsData class provides helper methods for retrieving and formatting data for multiple currency crosses.
ccy_list = ['EURUSD', 'GBPUSD', 'AUDUSD', 'EURGBP']
Spot volume:
Python:
vol_usd = cls_data.spot_hourly_volume_df(cross_list=ccy_list, add_inverse=False)
vol_usd.head()
Output:
Spot prices:
Python:
prices = cls_data.spot_prices_df(cross_list=ccy_list)
prices.head()
Output:
Spot flow:
Here, the direction is the buy volume minus the sell volume, and the size is the buy volume plus the sell volume.
Python:
flow_direction_base, flow_size_base = cls_data.spot_hourly_flow_df(cross_list=ccy_list)
flow_direction_base.head()
Output:
Python:
flow_size_base.head()
Output:
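The relationship between buy volume, sell volume, flow direction and flow size used above can be checked with a short worked example. The hourly figures are made up, and both helper functions are hypothetical illustrations rather than part of the ClsData class.

```python
def direction_and_size(buy_volume, sell_volume):
    """Direction is buy minus sell; size is buy plus sell."""
    return buy_volume - sell_volume, buy_volume + sell_volume


def buy_and_sell(direction, size):
    """Invert the definitions to recover the buy and sell volumes."""
    return (size + direction) / 2, (size - direction) / 2


# Made-up hourly (buy, sell) volumes in base currency.
hourly_flows = {
    '10:00': (120.0, 80.0),
    '11:00': (50.0, 90.0),
}

summary = {hour: direction_and_size(buy, sell)
           for hour, (buy, sell) in hourly_flows.items()}
print(summary)
```

A positive direction indicates net buying over the hour and a negative direction net selling, while the size is never negative.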