
SAX requires a different type of storage. You can store SAXified time series in Elasticsearch or Solr, but a time-series database isn't a good fit for this. Time-series databases should be able to generate the SAX representation, though.



> You can store SAXified time series in Elasticsearch or Solr, but a time-series database isn't a good fit for this.

That's making some presumptions about the time-series database use case. SAX is convenient for storing and retrieving the data as well as identifying trends or recurring behaviour. What more do you need?


You can't retrieve the original time-series data from SAX storage, because the normalization step is lossy. To query time-series data by content (approximate 1-NN, motif discovery, etc.) you need an inverted index.
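
To make the normalization point concrete, here is a minimal sketch of a SAX transform (z-normalize, piecewise aggregate approximation, then discretize). It assumes an alphabet of 4 symbols, a series length divisible by the number of segments, and made-up sample data; `sax_word` is a hypothetical helper name, not something from the SAX papers. Two series that differ only in scale and offset produce the same word, so the raw values cannot be recovered from the word alone.

```python
import bisect
import statistics

# Breakpoints that split a standard normal distribution into
# 4 equiprobable regions (assumed alphabet size: 4).
BREAKPOINTS = [-0.6745, 0.0, 0.6745]
ALPHABET = "abcd"


def sax_word(series, segments):
    """Return the SAX word for `series` using `segments` PAA frames.

    Assumes len(series) is divisible by `segments`.
    """
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    # z-normalization: offset and scale of the raw data are discarded here.
    if std == 0:
        normalized = [0.0] * len(series)
    else:
        normalized = [(x - mean) / std for x in series]

    # Piecewise aggregate approximation: mean of each equal-width frame.
    frame = len(normalized) // segments
    paa = [
        statistics.fmean(normalized[i * frame:(i + 1) * frame])
        for i in range(segments)
    ]

    # Discretization: map each frame mean to a symbol via the breakpoints.
    return "".join(ALPHABET[bisect.bisect_left(BREAKPOINTS, v)] for v in paa)


raw = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
scaled = [100.0 + 10.0 * x for x in raw]  # same shape, different units

print(sax_word(raw, 4))     # abcd
print(sax_word(scaled, 4))  # abcd
```

Both inputs collapse to the same word, which is why the word works as a search key but not as a replacement for the stored values.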


The normalized data is the index. It can still be pointing to the raw data. The nice thing is that the index organizes the data in a way that makes it easily (and losslessly) compressible.


> It can still be pointing to the raw data.

Yep. Each SAX word should be mapped to a list of seriesid:timestamp pairs. This list is often referred to as a postings list in information retrieval. The resulting data structure is an inverted index. The SAX and iSAX papers describe an inverted index variant (a really bad one) based on a folder structure, but one can use convenient IR tools for this.
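
A rough in-memory sketch of that mapping, assuming the SAX words have already been computed elsewhere. The class name `SaxInvertedIndex`, the series IDs, and the timestamps are all made up for illustration; a real deployment would more likely hand this to an IR engine than to a Python dict.

```python
from collections import defaultdict


class SaxInvertedIndex:
    """Maps each SAX word to its postings list of (series_id, timestamp)."""

    def __init__(self):
        self._postings = defaultdict(list)

    def add(self, word, series_id, timestamp):
        # Append a pointer back to where the raw window lives.
        self._postings[word].append((series_id, timestamp))

    def lookup(self, word):
        # Return a copy so callers cannot mutate the index;
        # .get avoids creating empty entries for unknown words.
        return list(self._postings.get(word, []))


index = SaxInvertedIndex()
index.add("abcd", "cpu.host1", 1000)  # hypothetical series and timestamps
index.add("abcd", "cpu.host2", 4000)
index.add("dcba", "cpu.host1", 2000)

print(index.lookup("abcd"))  # [('cpu.host1', 1000), ('cpu.host2', 4000)]
```

The postings only point at the raw windows, so the original data still has to live somewhere else, keyed by series ID and timestamp.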


iSAX2 yields a much better inverted index style model.

The thing is, particularly with time-series data, it is often sufficient to at least start with the summary data in the index.



