Conference Day One: May 19, 2019
Sunday, May 19th, 2019
2:40 PM Big Data's Dirty Secret
"Let the data speak for themselves."
"We apply machine learning to the problem of..."
These two phrases are heard often these days. But which data exactly are we talking about, and what do we intend to do with them? Too often ignored is the quality of the data being used and how it affects the analyses built on them. Are there holes in the data? Are there anomalies? Given how dirty data can be, a more apt phrase might be "Garbage in, garbage out."
In this talk, we will discuss some of the problems we've encountered in financial data and approaches that can be used to address them. Our particular focus will be on techniques we've employed to handle missing data and bad data in credit default swap (CDS) spread histories.
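
To make "holes" and "anomalies" concrete, here is a minimal sketch of how one might flag both in a spread history. It is not the speakers' method: the function name `flag_spread_issues`, the median-absolute-deviation rule, the threshold of 5, and the sample values are all assumptions chosen for illustration.

```python
from statistics import median


def flag_spread_issues(spreads, threshold=5.0):
    """Return (missing_idx, outlier_idx) for a list of daily spreads.

    None marks a missing observation. An observed value is flagged as an
    outlier when its distance from the median exceeds `threshold` times
    the median absolute deviation (MAD). Both the rule and the default
    threshold are illustrative assumptions.
    """
    missing = [i for i, s in enumerate(spreads) if s is None]
    observed = [s for s in spreads if s is not None]
    if not observed:
        return missing, []

    med = median(observed)
    mad = median(abs(s - med) for s in observed)
    if mad == 0:
        # No spread in the observed values, so the ratio test is undefined.
        return missing, []

    outliers = [
        i
        for i, s in enumerate(spreads)
        if s is not None and abs(s - med) / mad > threshold
    ]
    return missing, outliers


# Hypothetical spreads in basis points, invented for this example.
history = [102.0, 101.5, None, 103.0, 1030.0, 102.5, None, 101.0]
missing, outliers = flag_spread_issues(history)
print("missing at:", missing)
print("suspect at:", outliers)
```

A median-based rule is used here because a single bad quote can distort a mean and standard deviation enough to hide itself. Real spread histories trend and jump over time, so a rolling window or a comparison against related names would likely be needed in practice.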