
Are these records assumed to be in order?



Yes. That would of course be included in the problem statement.


That’s not obvious. If you are including “gotchas” this may be another one.


It's only a gotcha to anyone who has never looked through a log file.


I have seen a lot of log files, never one in CSV format or without timestamps.


Since there are no timestamps, the records being in order is a requirement, because otherwise the question is unanswerable. Since chronological order is virtually universal for any sort of log file, it's also a fairly safe assumption, but sure, if you want to double-check assumptions then it's a valid question to ask. I do think it was obvious enough, though, and the question that came to my mind was rather about scale, like: can I assume the number of users and unique paths will both fit in RAM?

Btw, if you want CSV log files, look no further, and not all my data logs have timestamps either! :D The particular timestampless case I'm thinking of: I wanted to log page load times for a particular service, so it logs the URI (anonymized) and the loading time. Though I think that one's not CSV but just space-separated, one entry per line.
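
For illustration, a minimal sketch of reading a timestampless log like the one described above. Everything specific here is an assumption: that each line is exactly `<uri> <load_time>` separated by whitespace, that the load time parses as a number, and the function name `mean_load_times` is made up for this example. It is not the commenter's actual code.

```python
from collections import defaultdict
from typing import Dict, Iterable


def mean_load_times(lines: Iterable[str]) -> Dict[str, float]:
    """Average the logged load time per URI.

    Assumed (hypothetical) line format: "<uri> <load_time>".
    Lines that do not match are skipped rather than raising.
    """
    # uri -> [running total, number of entries]
    totals = defaultdict(lambda: [0.0, 0])
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # blank or malformed line
        uri, raw_value = parts
        try:
            value = float(raw_value)
        except ValueError:
            continue  # second field is not a number
        totals[uri][0] += value
        totals[uri][1] += 1
    return {uri: total / count for uri, (total, count) in totals.items()}
```

Note that nothing here needs the entries to be in order: an average per URI is order-independent, which is one reason a log like this can get away without timestamps.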


Or, citing the previous “gotcha”, this is a trick question and I am meant to describe a change to the system so that useful logs can be captured.


Candidates that handle this in a streaming fashion get extra points, but it’s not required.
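
A rough sketch of what "streaming" could look like here. The thread does not spell out the actual problem, so this assumes a hypothetical one: each CSV line is `user,path`, lines are in chronological order, and the goal is to count how often one path is followed by another for the same user. The function name `count_transitions` and the column layout are assumptions for illustration only.

```python
import csv
from collections import Counter
from typing import Dict, Iterable


def count_transitions(lines: Iterable[str]) -> Counter:
    """Count (previous_path, path) pairs per user, one record at a time.

    Assumes (hypothetically) "user,path" rows in chronological order.
    Only the last path per user and the pair counts are kept in memory,
    never the whole log.
    """
    last_path: Dict[str, str] = {}  # user -> most recent path seen
    transitions: Counter = Counter()
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        user, path = row[0].strip(), row[1].strip()
        previous = last_path.get(user)
        if previous is not None:
            transitions[(previous, path)] += 1
        last_path[user] = path
    return transitions
```

Because `csv.reader` accepts any iterable of strings, an open file object can be passed directly and is consumed line by line. The state still grows with the number of users and distinct path pairs, so the "does it fit in RAM?" question raised earlier in the thread applies to this approach too, and the result is only meaningful if the records really are in order.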



