FEATURES >>>
Jim Foster
“Occasionally high-frequency algo traders want our historical market data to benchmark the system speed, to monitor how fast we receive and distribute movements in the market in comparison to other data they are monitoring”
participants that are handling significant volumes of such data are “quite savvy” in being able to manipulate databases. “You couldn’t exactly give this to a total neophyte [a beginner] as they would not know what to do with it. Algo traders and those who have super sophisticated strategies typically know what to do with this type of data. And, these are the drivers for the offering.”
Minor Huffman, CTO at FXall, explains that there are a number of items that need to be considered in this regard, including: (1) data quality; (2) data coverage; (3) analytical tools; and (4) hierarchical or other customised data models.
Data quality comes into play in respect of the number of contributing banks or other rate sources, tradable versus indicative data, as well as the error-correction techniques used to cleanse erroneous or off-market data from the set. Data coverage revolves around currency pairs, spot prices and forward tenors, bid/offer rates versus mid rates, time intervals and historical coverage.
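The quality and coverage dimensions listed above can be pictured as the fields of a single historical data point. The sketch below is purely illustrative, assuming a simplified record layout; the field names are not any vendor's actual schema:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class FxTick:
    """One historical FX data point, capturing the quality and coverage
    dimensions discussed in the article. Illustrative only."""
    pair: str            # currency pair, e.g. "EUR/USD"
    tenor: str           # "SPOT" or a forward tenor such as "1M"
    bid: float
    offer: float
    timestamp: datetime  # when the rate was captured
    tradable: bool       # tradable quote vs. merely indicative
    source: str          # contributing bank or other rate source

    @property
    def mid(self) -> float:
        """Mid rate derived from bid/offer, for users who only need mids."""
        return (self.bid + self.offer) / 2.0

tick = FxTick("EUR/USD", "SPOT", 1.4301, 1.4303,
              datetime(2010, 1, 4, 9, 30, tzinfo=timezone.utc),
              tradable=True, source="BANK_A")
```

Storing bid and offer separately (rather than only a mid) preserves the bid/offer-versus-mid coverage distinction the article raises, since the mid can always be derived but not the reverse.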
Huffman says the analytical tools required for modelling will create specific requirements for data availability (e.g. operating system, data format).
“Many clients will want access in order to verify and back test algorithms, which may drive the choice of technology used to maintain the data sets. Traditional relational database models don’t handle time-series data efficiently,” he adds.
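Huffman's point about relational models can be illustrated with a minimal columnar sketch: instead of one row per tick, each field lives in its own append-only array, so a time-window query is two binary searches on the timestamp column plus contiguous slices. This is a toy assumption-laden sketch, not FXall's actual storage:

```python
from array import array
from bisect import bisect_left

class TickColumnStore:
    """Toy columnar store for one currency pair. Each field is kept in
    its own contiguous array, appended in timestamp order; a time-range
    query becomes binary search plus cheap slicing -- the access pattern
    that row-oriented relational tables tend to handle poorly."""

    def __init__(self):
        self.ts = array("d")     # epoch seconds, ascending
        self.bid = array("d")
        self.offer = array("d")

    def append(self, ts: float, bid: float, offer: float) -> None:
        assert not self.ts or ts >= self.ts[-1], "ticks must arrive in order"
        self.ts.append(ts)
        self.bid.append(bid)
        self.offer.append(offer)

    def window(self, start: float, end: float):
        """Return (timestamps, bids, offers) for start <= ts < end."""
        lo = bisect_left(self.ts, start)
        hi = bisect_left(self.ts, end)
        return self.ts[lo:hi], self.bid[lo:hi], self.offer[lo:hi]

store = TickColumnStore()
for i in range(10):
    store.append(1000.0 + i, 1.4300 + i * 1e-4, 1.4302 + i * 1e-4)
ts, bids, offers = store.window(1002.0, 1005.0)
```

Purpose-built time-series databases and kdb-style column stores generalise this idea; the point here is only why the access pattern differs from a normalised relational layout.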
70 | january 2010 e-FOREX
In terms of hierarchical or other customised data models, the key issues are the tradability of the data and time stamps. “If a system has a high miss rate for trading, then the value of its data is reduced, both for back testing of models and measuring a system,” notes Huffman.
And in order for the data to be meaningful, it has to represent tradable data. For example, is the data time-stamped at receipt by the platform, in the matching engine, or at distribution? “Models will require improvements in data access speed and increases in data storage,” he says.
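The time-stamping question matters because each stage adds latency, so a back-test run against distribution-time stamps sees the market slightly later than one run against matching-engine stamps. A small sketch with hypothetical (invented) stamps for one tick makes the per-hop gaps concrete:

```python
from datetime import datetime, timedelta, timezone

utc = timezone.utc
# Hypothetical stamps for one tick at the three stages named above:
# receipt by the platform, the matching engine, and distribution.
received    = datetime(2010, 1, 4, 9, 30, 0, 120_000, tzinfo=utc)
matched     = datetime(2010, 1, 4, 9, 30, 0, 120_850, tzinfo=utc)
distributed = datetime(2010, 1, 4, 9, 30, 0, 123_400, tzinfo=utc)

def micros(delta: timedelta) -> int:
    """Express a timedelta in whole microseconds."""
    return round(delta.total_seconds() * 1_000_000)

hop_latencies = {
    "receipt->matching": micros(matched - received),
    "matching->distribution": micros(distributed - matched),
}
```

A data set that records which stage produced each stamp lets users like the algo traders in Foster's quote benchmark exactly this kind of hop-by-hop delay.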
Thomson Reuters’ Doe says that, in terms of storing or sourcing the data, firms that store it themselves often encounter issues around “gaps” in the data. And the problem is not isolated to just a few firms. It is largely due to the firms’ own collection mechanisms and where they are storing the data, he explains. That is why firms come to a source such as Thomson Reuters to obtain this data.
Cleanliness<br />
Bloomberg’s Brittan asserts that data cleanliness is the “most important factor” in this regard and is pertinent to all types of users. The questions here are:
• Is the data free from ‘spikes’?; and
• Is the frequency of updates acceptable?
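Brittan's “spikes” test can be made concrete with a rolling-median filter, one common cleansing technique (not Bloomberg's actual method): a tick is dropped when it deviates from the median of its neighbours by more than a threshold.

```python
from statistics import median

def remove_spikes(prices, window=5, max_dev=0.001):
    """Drop ticks that deviate from the rolling median of the surrounding
    `window` ticks by more than `max_dev` (absolute). A deliberately
    simple cleansing rule, for illustration only."""
    half = window // 2
    clean = []
    for i, p in enumerate(prices):
        lo, hi = max(0, i - half), min(len(prices), i + half + 1)
        if abs(p - median(prices[lo:hi])) <= max_dev:
            clean.append(p)
    return clean

# One fat-finger spike amid otherwise tight EUR/USD ticks:
ticks = [1.4301, 1.4302, 1.4300, 1.5900, 1.4303, 1.4302]
clean = remove_spikes(ticks)
```

The median (rather than the mean) keeps the filter itself robust to the spike it is trying to remove; production cleansing would also consider off-market but non-spiky quotes, which this rule does not catch.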