What data on myself I collect and why (2020) (beepb00p.xyz)
187 points by karlicoss 11 months ago | hide | past | favorite | 72 comments

I got heavy into tracking personal data after spending a week in a cardiac unit in July 2017. My own tracking efforts have led me to the absolute best physical shape I have ever been in.

My weight went from 340 pounds (hence the cardiac unit) to 160. I have gotten into ultramarathons and lately, triathlons. Strength wise, I managed to drop 180 pounds while increasing my strength.

It is amazing what the mind can do if you prime it with the right kinds of metrics.

> My weight went from 340 pounds (hence the cardiac unit) to 160.

From one Internet Stranger to another, DAMN, good job! OCD or not, losing close to 200 pounds is a hell of an accomplishment. I'm down just a bit over 50 since May 2020, with an eventual goal of <185. It is also heart issues that inspired me (though my cardiologist says my symptoms are actually benign, I get PVCs so often that they make me anxious, which causes more PVCs, and so on). Being in my mid-40s with a BMI closing in on 40 and anticipating an early demise was very motivational. Still hard sometimes to be good, but every day is a new effort.

I'm just seriously impressed to hear of your success. It's inspiring!

Friend, dropping 50 since May 2020 is amazing! You're kicking ass. PVCs are a damned scary feeling - that fluttering is something that I would have gladly gone my entire life without experiencing.

I was basically where you are, BMI and all, and friend, I believe in you completely. However, you're going through PVCs and dealing with that anxiety so I'm going to share something a little dark. I really firmly believe that whoever coined the term 'a broken heart' to describe the loss of love had had a heart condition too. They are so close, emotionally and even physically.

Point being, you're going through a lot and if you need a friend to talk to, my email is in my profile. If you're anything like me, waking up in the middle of the night (4am was my hour) might be pretty rough. I'll give you my phone number and if you need, call me at 4am and we'll have a talk.

I believe in you so much that if you're interested, I'll hold off on qualifying for the Boston marathon and wait for you. Even if you don't want to run it with me, come to Boston. We'll celebrate how far we have both come.

Just wanted to say congrats- I've lost around 15 pounds over the course of the year and found that very difficult. I'm sure I'm not the only one who has told you this, but you should take incredible pride in that.

This question might miss the point because I'm sure a lot of the data interacts, but if you had to pick one metric/data collection system to track- what would you choose?

Edit: Also- what baseball team :)

First off friend, 15 pounds is amazing and life changing. It's incredibly hard. Every single pound is an accomplishment worth celebrating because let's face it, it's a hell of a lot more fun gaining weight than losing weight.

As for your question, that has actually changed quite a bit over 3.5 years. In the beginning, it was purely about my heart because I didn't want to die. Now if I could only track one thing, it would be protein volume and amino acid profiles versus the intersection of raw athletic performance and recovery time. If I want to set a PB, I can usually hack that with nutrition. If I want to increase my weekly mileage, I can always hack that with nutrition. But at this point, I can only do one of the two (or else I poop my pants).

Edit - I forgot the most important question. My two favourite baseball teams are the New York Yankees and whoever is playing the Boston Red Sox. :)

Edit 2 - I wish that I was joking when I said 'or else I poop my pants' but I'm not.

Edit 3 - If I just wanted to create a baseline for how fit I am, I would likely use either burpees per minute or perhaps how far I can walk on my hands. Walking on my hands is a very good predictor of everything from swimming to running performance. Or maybe I'd use how many pukers I can bang out in two minutes? A single puker is one chin up followed by one burpee, both with perfect form. If I ever want to train people, I should come up with a better name than a puker but hey, I talk about pooping my pants on HN so what have I got to hide?? :)

Could you expound on how different amino acid profiles impact your performance, and how that knowledge impacts the day to day food choices you make?


My cholesterol was extraordinarily high when I was in the hospital so lowering that was a major priority. Consequently, I started going more plant based. At this point, my diet is primarily plant based.

This has worked very well for me, but I've had to seriously up my knowledge of nutrition, particularly around the concept of whole proteins. A whole protein contains all nine essential amino acids. My body performs and recovers best if I plan my daily nutrition around between 0.75 and 1.0 g of whole protein per pound of body weight (120 to 160 g at my current weight).
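For anyone who wants to redo that arithmetic for their own weight, here is a minimal sketch. The function name is made up for illustration, and the 0.75 to 1.0 g per pound range is simply the one quoted above, not a general recommendation.

```python
def protein_range_g(weight_lb: float, low: float = 0.75, high: float = 1.0) -> tuple[float, float]:
    """Return the (low, high) daily protein target in grams.

    The default factors are the grams-per-pound range mentioned in the
    comment above; they are that poster's personal numbers.
    """
    return (weight_lb * low, weight_lb * high)


if __name__ == "__main__":
    lo, hi = protein_range_g(160)
    print(f"{lo:.0f} to {hi:.0f} g per day")
```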

As I get deeper into training, I'm starting to see some data about how timing of proteins impacts my performance. I used to use raw oats by themselves as a meal before short (up to 12k) runs and speed training. Raw oats are amazing but they don't have enough lysine to be considered a whole protein. Adding dried mangos to my raw oats or carrying some in my run bag completes that protein. I don't see immediate results from completing the protein, but I do see long term results from consistently not completing it. I don't have to complete a protein every single meal - in general, as long as I complete it within that day's nutrition, I'll be okay. But my body seems to recover faster if I complete proteins between pre-run and the end of my run.

Now, let's talk about branch chain amino acids. If I want to go longer than 12k or make significant gains in my weekly base, I perform and recover better if I add sources of leucine, isoleucine and valine before and after I exercise. Meals like brown rice, beans and tomatoes are excellent before/after a workout.

As I train my gut for longer races, I keep learning more about what works and what doesn't. With some luck, by this summer, my gut will be trained enough that I have more personal bests and no more personal bursts. To put it euphemistically, bad bad bad things happen if I'm running a long (90+ minute) hilly course, pre-load with rice and beans and then add glycogen via gels.

If possible, can you share how the non-mass metrics have trended? RHR, HRV, VO2 Max, respiratory rate, etc, whatever you got.

Had a smaller health concern as well and did about half the work you have...wondering how much value comes from going from being able to jog a few miles to completing a distance event. Empirically, of course, and ignoring any psychological / self-worth principles: on an effort per heart improvement metric, how much is that juice worth the squeeze, because running a triathlon seems very very hard.

First off, triathlons kind of suck. Everything about them is bad and if given a choice, I would only run. Thing is, I have a very small build and when I was over 300 pounds, even walking was a lot of impact. Running was impossible - I simply did not have the dynamic strength to run with any kind of form. So, I got into biking and swimming as ways to improve my cardiac health without the constant pounding. Problem is, I hate swimming or more accurately, I don't like drowning. So if I don't have events to claim, I won't do it.

I'm still a significantly better runner and don't think I'll ever graduate up to a full Ironman. I have never even done an official triathlon, only self measured events. But having biking and swimming as part of my fitness package has been incredibly valuable. Swimming, for example, has done amazing things for my lung capacity (though again, it's not the swimming, it's my aversion to drowning). :)

Now I'll give you some raw numbers.

My resting heart rate has gone from a weekly average of over 90 to a weekly average of under 50. Simultaneously, my heart rate variability increased significantly. Raw numbers with HRV are a little misleading since it varies so much, but I am 43 years old. At the beginning of my odyssey (I had just turned 40), I had the HRV of the average 70 year old. Today, my HRV is way above average for a 20 year old.

Oxygen saturation is interesting. I went through a battle with covid in December and the nurse practitioner commented that she had never seen anyone as sick as me with such a high oxygen saturation level. When she asked me how I cared for myself, I told her that I had been working on speed all fall, doing sprints and running hills. Her reply was that she was going to start running up hills everyday on her way home from work.

My cardiologist talked me out of trying to actively track and monitor VO2 max. He explained that while it's a decent metric, I'm unlikely to be accurate enough at home to gain anything from it. Back when I used to get regular stress tests, I had a good number every 3, 6 and then 12 months. At my current fitness level, there isn't a particularly good reason to give me a stress test other than personal curiosity and the really fit, incredibly competitive health care worker who actually chirps me out after my stress tests. Next official marathon, I'm beating his ass by thirty minutes.

Mind if I share something that both my cardiologist and psychiatrist have told me? I'm not the typical cardiac patient. Instead, I'm deeply obsessive to the point that after my OCD screening, I went home to rewrite my resume as the screening did a far better job of explaining my talents than anything I had ever written. There is no particular need to take things as far as I have. I'm just 'special' and not necessarily in good ways. A lot of my fitness has come out of fear and if I could spare you anything, it would be to spare you that fear.

Heart problems are emotionally hard. If you want to talk to someone relatively young who has been there too, my email address is on my profile. We can keep everything anonymous if you'd like to feel more open to talk. I've been through a lot and friend, I've seriously got your back.

Regarding the accuracy of home VO2 max measurements, Garmin claims to have a watch that's 95% as accurate as a formal VO2 max test (with oxygen mask on a treadmill). I would be surprised if other good brands (e.g. Apple watch) are much worse.

If you want to go low-tech, max distance run in 12 minutes is a good proxy; see https://en.wikipedia.org/wiki/Cooper_test.
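As a rough sketch of how that proxy works: the formula usually quoted for the Cooper test converts the distance covered in 12 minutes (in metres) into an estimated VO2 max in ml/kg/min. The function name below is made up, and the result is only an estimate, not a substitute for a lab test.

```python
def cooper_vo2max(distance_m: float) -> float:
    """Estimate VO2 max (ml/kg/min) from the distance run in 12 minutes.

    Uses the commonly quoted Cooper test formula:
        VO2max = (distance in metres - 504.9) / 44.73
    """
    return (distance_m - 504.9) / 44.73


if __name__ == "__main__":
    # Example distances are arbitrary illustrations, not anyone's real results.
    for d in (1600, 2400, 3000):
        print(f"{d} m in 12 min: about {cooper_vo2max(d):.1f} ml/kg/min")
```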

It has been a few years so maybe the technology has improved. My cardiologist explained that the Garmin technology only gets that level of accuracy if their calculation of max heart rate is accurate. With my conditions, he felt that my max heart rate calculation would be off by far enough to make it next to impossible to optimize based off of that data.

I'm in significantly better shape now. The next time I see him, I'll ask about VO2 max again - it's the only metric that my metric obsessed triathlon/running friends talk about that I don't have a timeline for.

Now all I need are wrists big enough to wear an actual adult sized watch.

For those browsing here, there is a common term for "tracking personal data": it is called "Quantified Self". If you are interested in it as a n00b like I am, this will greatly help start your search into the hobby. There are a lot of people doing it out there and a bunch of outdated or abandoned projects. I have yet to find a modern one that does for QS what Home Assistant has done for home automation and data collection.

There is an excellent subreddit dedicated to QS - /r/QuantifiedSelf/. I learned one heck of a lot from that sub when I started out and there are some seriously intelligent people there. I've stood on the shoulders of those giants for 3.5 years.

You note "How: sensor syncs with phone app via Bluetooth"

Did you write this app custom? Are you selling the app? Any open-sources suggestions for a "concentrator app" like this?

I don't want to make money off of this. Rather, it's more of a service to other people. When I was in a dark place, a lot of amazing, beautiful, selfless people brought me into their lives and made me feel like a human again. I have trouble with the thought of selling anything.

I did write the iphone app but it is quite simply the worst fucking mess of code I have ever written. I wasn't doing very well in the brain when I wrote it and holy hell, it needs a lot of love. I just wanted to be able to dump raw numbers and work with them on my own, but somewhere along the lines, I became really really really dumb. I'll fix it and let you have it if you want. You can do whatever the heck you want with it.

I would love that! I'm happy to clean it up. I think the "concentration" is the worst problem in everything. There is so much sensor data around the house. Sometimes I use an RPI as a concentrator, but that obviously limits the audience.

A smartphone app (iphone or android) would at least reach half the world. If you have a github/etc link (now or later) for others to fork off, that would be awesome! You could post it on this thread.

Care to describe what devices/programs you're using?

Sure, I would love to but I'll warn you that this sounds a little fucked up (because it is). I was writing code and was two days from launching a startup when my heart went to hell. I called our local Healthline and was told to call 911 because "you are having a heart attack" but I'm a moron so decided to drive myself to the hospital.

(I want to start with that story because I got myself into that mess by being a moron and figure that it's important you don't think this came from a place of intellect - it was 99% fear, trauma and trying to overcome the stupid defaults that got me into that mess in the first place.)

The first day that I got out of the hospital, I had to move back in with my Mom. I decided to go out for a walk and was in such a bad place that two little old ladies (one of whom had a walker) actually smoked me on the path back to my Mom's area. I was remarkably close to killing myself after that experience and in all truth, if it wasn't for my daughter (she was 18 months old at the time), I would not be here to write this. My GP referred me to a psychiatrist.

My psychiatrist is an amazing man and an incredible athlete. He suggested two tools that could help me with my cardio recovery but that have also been extraordinarily useful for my mental health. He recommended a Masimo fingertip oximeter and a training band by Polar.

The Polar training band is/was amazing. My first Polar was an H7 with an XXL band. Now I use an H10 with a small band. I'm not sure that I would choose the Polar again if I was starting out now - at the time, my psychiatrist suggested it because I could find a band that would fit in between my massive gut and the band of fat around my chest that used to lead coworkers to tease me about my bra size (I was assigned male at birth and continue to identify as male so that was not complimentary. It was funny at the time but quite sad to think about now.)

Program wise is where this story starts to get a little bit fucked up. Remember the first paragraph where I told you that I was writing code when my heart went to hell? Well, that really fucked me up and I quite literally could not write code or even sit in front of a computer. Computers have been my biggest love and practically my sole hobby since I was a 7 year old with a bad stutter. Suddenly, that was replaced with the most unbelievable fear whenever I would open up a text editor or even sit down at a keyboard.

Consequently, at first I started using mobile apps and ran everything through my iPhone. But as I started to recover and as my psych meds, therapy and other psychiatric help started to kick in a little more, my brain started to come back.

I have a marketing degree and really like data so as my brain started to come back, I started noticing correlations between cardiac metrics and purchasing activity.

That was strange. In a sense, my capitalism came back and I saw an opportunity. But, at the same time, I was a patient and doing cardiac rehab/support with ten older adults from my Mom's community. This sounds weird to anyone with a healthy heart but I'm a firm believer that whoever coined the term 'a broken heart' had had a cardiac intervention. It feels exactly the same and I had this group of older adults who cared enough about me to check in on me, come by and make me go for walks or even just sit and watch baseball with me. Consequently, I was never able to turn that opportunity into a business, but it gave me an idea about some code to write.

So, I started writing my own fitness tracker. At some point I might release it because it's been damned good for me. The first version is unbelievably sad and I have one hell of a lot of trouble reading that code over now. It was somewhere between a suicide note and a eulogy for my former self but that minimizes how brutally sad some of my variable and method names were. My favourite example is fatFuck.name = Greg.

That project turned into rehab and through the years I have kept adding to it. At first, it was purely about rehabbing my brain so I could write code again and maybe graduate to a point where I could comfortably refer to myself as user.name and heck, maybe even remove the word 'dead' from all my method names. But as it has grown, it has turned into a really good tool for teaching myself how to think in terms of healthy defaults.

Here is where stuff gets really fucked up sounding so if you've made it this far, good for you! I don't know if the app I wrote has actually helped me or if I hacked myself. I am deeply obsessive, to the point that when my psychiatrist did my OCD screening, I went home and rewrote my resume because the screening materials described my talents better than anything else I had ever written. When I build anything, I test it obsessively. In this case, to test it, I had to live it and 3.5 years later, I have a six pack, no love handles, two marathons down and an ultra in my imminent future.

Tracking nutrition against performance metrics yielded the most powerful results of all. At first, I started by writing down what I ate and then tracking my macros. I learned that what I thought was 2000 calories was actually closer to 4000 calories. As I started paying more attention to macros like protein and carbohydrates, I found a great feedback loop where eating healthier foods resulted in improved athletic performance. As my athleticism increased, I was able to work harder and then started tracking my moods against that data. That created another feedback loop where the better I ate, the harder I could exercise and the better I would feel.

At this point, 3.5 years later, I actually have healthy defaults. It sounds insane, but I genuinely have healthy defaults. The way I eat, the foods I eat and even the textures I enjoy have completely shifted around.

The only downside at this point is that I'm rather skeptical of most of the wearable industry. I've seen way too many metrics that make my capitalism tingle in funny ways. I know that I could enhance tracking with other tools but at this point, I'm pretty obsessed with reading their privacy policies and speaking as a patient instead of an entrepreneur, I don't think the marketing industry needs (or deserves) this type of data.

I feel like I've missed a lot. At some point, I'll put everything together into written form and do a better job of this reply. Sorry if this was disjointed!!

I read your skepticism of the wearable industry but I'm curious what you think about the generic smart watches? I've been similarly burnt by numerous wearables but the Apple Watch continues to impress, especially the latest one. I noticed you don't have mentions of the Apple Watch on your site -- is that a personal preference? Did you evaluate it as a sensor set?

My only qualm with the Apple Watch is the lack of an add-on or API to actually allow getting data from it in a non-batch manner. Any suggestions?

I'm a really bad person to ask about watches. At my current weight/body fat, my wrist circumference is less than 6 inches. Watches, even a woman's size, look absolutely goofy on me. Imagine a little kid playing dressup and wearing his dad's watch. That's me in a watch!! :)

And yes, I am exactly that vain. It's another problem. :)

> I'm pretty obsessed with reading their privacy policies and speaking as a patient instead of an entrepreneur, I don't think the marketing industry needs (or deserves) this type of data.

Perhaps you might like PineTime, the open-source smartwatch.

I really like PineTime and a friend of mine has one. Unfortunately, when I tried it on, it's just way too big for my wee tiny little wrists. I wish that I had a picture to show you - it is genuinely funny.

I have nothing worth adding, just wanted to say thank you for taking the time to write and share your story.

This was a heavy read but, thanks for sharing.

If you want to talk to someone, my email is in my profile. We can keep it totally anonymous if you would like or I can give you my phone number if you want to talk to a human at 4am.

I'm used to a cold harsh internet. Your kindness has meant a lot to me. Thank you!

You know friend, I have seen a lot of software developers die before their time. While drugs and suicide claim more than I want to recall, heart health has claimed a lot of amazing people.

If I can be of any help, even the smallest, most fleeting bit of help, it’s paying back a lot of amazing people who looked out for me. Worst case scenario, laugh at me. Best case scenario, maybe we can prevent a few premature deaths.

My email is in my profile. Feel free to reach out if you ever need a human and thanks for your kind words. It’s scary to talk about this stuff (so I don’t do it more often) but feedback like this is really encouraging. :)

That is incredible. Good job on you, truly!

I'm usually quite skeptical on the whole quantified self thing, but if it fits your personality type it can be a big help to get you started at least. I'm old fashioned, I track my weight and workouts in a paper notebook and periodically enter my weight into a spreadsheet so I can spot any upward or downward trends.

Do you think you could continue your new lifestyle if you ceased tracking all that data?

I was very worried about that when I started but strangely, over the last 3.5 years, I have developed what I call 'healthy defaults'. I notice this the most with how I shop for groceries and eat in restaurants. Through tracking things, I have been able to form some very strong habits. Ultimately (for me at least), getting into shape and staying fit is about constantly keeping the habit and maintaining healthy defaults.

Exercise is another area that has just become habitual. It started off as something I really hated but as I got more fit and started optimizing my nutrition, I got into an incredible feedback loop. The better I eat, the harder I can exercise, the more fun I have and the better I feel after. That creates a really strong feedback loop with my productivity - I can get more done now in six hours fit than I could accomplish in 10 hours fat.

I think it would be very hard for me to go back but it's worth trying out. I might give up the app for a few months and see what I can do without it. If I've merely swapped addictions, it's positive (for me) but my tracking regimen is bad.

I've found that forming habits is the key as well. At the start you need all the help you can get until whatever you do becomes part of you. I didn't have such a long way to go as you did, but having had a BMI of well over 25 I know it can take you a while to get to where you want to be.

If you told me three years ago I would be able to lift almost two and a half times my body weight, I wouldn't have believed you. I'm no star athlete, but I'm having fun doing it. That's what matters.

Habits and enjoyment.

Two and a half times your body weight?? Holy crap friend, from a BMI of over 25 to that??? My hat is off to you - that's absolutely amazing.

Is that a deadlift?

I'm 100% with you - it is all habits and enjoyment. It's so hard to grasp that until you've been through it and somehow I feel like we lose a lot of amazing athletes every year because of that disconnect. I don't know but when I see someone with a BMI of over 25, I see incredible athleticism. Heck, at that point, life is exercise.

Yeah, that's a deadlift, by far my best lift.

I forgot who said it but the quote says "I've never seen a fat guy with small calves". I've tried to see how many times I could squat 60kg, I stopped at 28 because the burn was just too much. And that's just one set, I can rack it and walk away, you can't if that weight is you.

You're a really good person and I am enjoying this conversation. Thanks for sharing this - it's really cool how much we have in common.

My former fat boy calves are my biggest ally as a runner. Running hilly courses is particularly fun - traditionally fit people can usually beat me on flats but I am very fast on hills. Heck, I was once big enough that every single day was hill day.

28 reps at 60kg is amazing. I'm going to try that out - I like deadlifts too and that sounds like one heck of a lot of fun.

Can you go into more detail? This is really fascinating

Congratulations! What you accomplished takes some serious discipline. Even with metrics it couldn't have been easy.

I love the idea of personal data analysis. In fact, I recently wrote a book with that premise called "Everyday Data Science".

I just wrote about it in another thread, but it seems relevant to the discussion here.

I definitely agree that you can really take control of your life with a bit of data and some not so fancy analysis.

In the book I shared the story of an old professor of mine that tracked a personal health marker from yearly screenings. The threshold for concern was "4 ng/ml". His numbers were always right around 1. However, there was a super noticeable trend up to 2.5 and 3. The doctors all waved it off because it was still below the magic "4" number.

However, when he showed them the graph of historic data, they rushed him to a specialist and detected (and removed) cancer.
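To make the "trend versus threshold" point concrete, here is a minimal sketch that fits a least-squares slope to yearly readings and flags a series that is still under the threshold but clearly rising. The function names, the `min_slope` cutoff, and the sample numbers are all made up for illustration (only the 4 ng/ml threshold comes from the story), and none of this is medical advice.

```python
def slope(xs: list[float], ys: list[float]) -> float:
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def rising_below_threshold(years, values, threshold=4.0, min_slope=0.2):
    """True if the latest value is under the threshold but the trend is upward.

    min_slope (units per year) is an arbitrary illustrative cutoff.
    """
    return values[-1] < threshold and slope(years, values) >= min_slope


if __name__ == "__main__":
    # Hypothetical readings: flat around 1, then climbing to 2.5 and 3.
    years = [2010, 2011, 2012, 2013, 2014, 2015]
    values = [1.0, 1.1, 0.9, 1.0, 2.5, 3.0]
    print(rising_below_threshold(years, values))
```

A single reading of 3.0 looks fine against a cutoff of 4; the slope over the whole series is what makes it stand out.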

I think data collection and personal analysis for day-to-day stuff is vital and fascinating. Thanks for sharing :)


Yep, one of the motivations for me is learning & establishing health baseline. For example, with blood tests it's unlikely that I'll be able to infer any interesting correlations (at the very least because you can't do them every day, with the exception of glucose/ketones perhaps). However I'll know what is 'normal' for my body and if something breaks, hopefully will have some clues.

Over the past few years I’ve thought about building something similar to Home Assistant, but for human centered data and APIs vs home centered. Essentially a shared bus on which different data sources could be plugged into a standardized format, on which people could build dashboards and automations. Never got around to doing so, because I got sidetracked by other projects, but if others are interested in teaming up I’d love to work on something like that (email in profile if anyone wants to start this).

For the past ten years, I have been recording my daily activities in Moleskine daily planners. I use the small daily planner with one page per day, and I always carry it with me. The main motivation for starting was to fight the feeling that time is going faster and faster. It takes me about a quarter of an hour per day. I do not record everything, just the things I think are worth recording. I do track all of my movements. I have thought about recording everything in a database, but never got around to it. For some time, I tracked my travels with Google Earth. Last year, I started doing this again for the walking and biking 'trips' I make. I do not do it for any purpose, but now that I have started this habit, I am afraid I will stick to it. On some occasions, my notes have helped resolve some questions, but in other instances they were not detailed enough. I do use my personal website to track certain events. I wonder whether it is worth collecting large amounts of data on the assumption that it might be of some benefit in the future. If it is worth investing some energy now to record it in an organized manner, it might be of some use in the future.

> How: many shells support keeping timestamps along your commands in history.

> E.g. "Remember all your bash history forever".

It would be far more important to get $? to get the rate of error of the commands you type (ex: typo) and use that as an indicator of poor mental acuity (ex: bad sleep, stress, ...)

Interesting idea! I suspect I'd need to use shell far more often than I do currently to get anything conclusive though.

But I guess it's possible to infer error rate even from past data by looking at closely spaced commands with short Levenshtein distance.
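A minimal sketch of that idea, assuming the history is already available as `(unix_timestamp, command)` pairs. The function names and the `max_gap_s` / `max_dist` cutoffs are hypothetical choices for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def likely_retypes(history, max_gap_s=10, max_dist=2):
    """Return consecutive command pairs that look like a typo followed by a fix.

    history: list of (unix_timestamp, command) tuples in chronological order.
    """
    pairs = []
    for (t1, c1), (t2, c2) in zip(history, history[1:]):
        if c1 != c2 and t2 - t1 <= max_gap_s and levenshtein(c1, c2) <= max_dist:
            pairs.append((c1, c2))
    return pairs


if __name__ == "__main__":
    sample = [(0, "gti status"), (3, "git status"), (60, "ls -la"), (62, "cd /tmp")]
    print(likely_retypes(sample))
```

This only guesses at errors from how the commands look; it says nothing about whether either command actually succeeded.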

It depends on your workflow. I'd say your approach is interesting, but it's mostly driven by availability of the data ("searching under a streetlight") except for your sleep monitor (however, being an under pillow solution, it's not essentially different from the accelerometer data from a smart watch)

I would suggest you invert the logic: design objective (ex: mental acuity), conclude on what things you want to track (ex: time slept, cups of coffee per day, time you drink it, errors in shell) and get the data accordingly, spending as necessary (ex: smart coffee maker? or maybe a smart plug that can let you infer when the machine is being used from the wattage drawn)

As for data tricks, L distance is a good one, yet not applicable for shell, as it's sensitive to the string length so you would need correction. Also, it's missing the essential metric: did the command work? Only the error return code will give you that.
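A rough sketch of the exit-code approach, assuming the shell has been set up to append one line per command to a log in a made-up format: `epoch_seconds<TAB>exit_code<TAB>command`. How that log gets written is left open (in bash, a `PROMPT_COMMAND` hook that records `$?` would be one option); the function name and format here are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone


def daily_error_rate(lines):
    """Return {ISO date (UTC): fraction of commands with a non-zero exit code}.

    Each line is expected to look like 'epoch_seconds<TAB>exit_code<TAB>command'
    (an assumed log format). Malformed lines are skipped.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for line in lines:
        parts = line.rstrip("\n").split("\t", 2)
        if len(parts) != 3:
            continue
        ts, code, _command = parts
        try:
            day = datetime.fromtimestamp(int(ts), tz=timezone.utc).date().isoformat()
            failed = int(code) != 0
        except ValueError:
            continue
        totals[day] += 1
        failures[day] += failed
    return {day: failures[day] / totals[day] for day in totals}


if __name__ == "__main__":
    log = ["0\t0\tls", "60\t127\tgti status", "86400\t0\tgit status"]
    print(daily_error_rate(log))
```

A daily rate like this could then be lined up against sleep or caffeine data, which is the kind of objective-first tracking suggested above.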

TLDR: think about what you want, but before that think about why you want it. Collecting useless metrics is another form of data hoarding.

Yeah, it's a good point, and I'm indeed a bit guilty of hoarding data I can't immediately process.

However, the problem with the 'objective first' approach is that it's gonna require a lot of data to draw meaningful rigorous conclusions from small interventions, so I'm making sure to 'secure' the data first, and then gradually process it.

But also, it's a challenge in itself -- I'm automating a lot, sharing my system and trying to interoperate with existing tools, in the hope that my work can be useful to other people and make quantified self easier for them.

> in the hope that my work can be useful to other people and make quantified self easier for them

Careful, you risk making a mistake driven by "warm-fuzzy-feelings"!


Anything you do, do it only for yourself.

Have you considered loading this stuff into SQLite in addition to keeping the raw exports in files on disk?

Being able to query your personal data with SQL can get really interesting. I've been using it for my own version of a personal data warehouse, described here: https://simonwillison.net/2020/Nov/14/personal-data-warehous...
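As a small illustration of the idea (not the setup described in that link), here is a sketch using Python's built-in `sqlite3` module. The `weight_log` table, its columns, and the sample rows are all hypothetical.

```python
import sqlite3


def load_weights(conn: sqlite3.Connection, rows) -> None:
    """Insert (ISO date, kg) rows into a hypothetical weight_log table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS weight_log (day TEXT PRIMARY KEY, kg REAL NOT NULL)"
    )
    # INSERT OR REPLACE makes re-importing the same export idempotent.
    conn.executemany("INSERT OR REPLACE INTO weight_log (day, kg) VALUES (?, ?)", rows)
    conn.commit()


def monthly_average(conn: sqlite3.Connection):
    """Return [(YYYY-MM, average kg rounded to one decimal), ...] in date order."""
    return conn.execute(
        "SELECT substr(day, 1, 7) AS month, ROUND(AVG(kg), 1) "
        "FROM weight_log GROUP BY month ORDER BY month"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # use a file path to persist the data
    load_weights(conn, [("2021-01-01", 80.0), ("2021-01-15", 79.0), ("2021-02-01", 78.0)])
    print(monthly_average(conn))
```

The raw export files can stay on disk untouched; the database is just a queryable copy that can be rebuilt at any time.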

Yep! In fact I've tried interoperating with Datasette (e.g. shared here https://news.ycombinator.com/item?id=25090643 )

One secret sauce is using 'automatic' caching of data in sqlite -- this allows both for faster access and having an additional interface for the data as a collateral https://github.com/karlicoss/cachew#readme

Still need to polish this a bit, but ultimately hoping to properly plug into Datasette, I'm impressed by its data exploration capabilities!

What we need is a law that obligates companies to enable scheduled auto export to a given URL https://github.com/tomaszs/RightToBeRemembered/blob/main/REA...

> this is arguably the most important thing you should export considering how heavily everything relies on email

I found this one interesting because I practice inbox zero so I delete 99.9% of the emails I get.

Regardless, I do have some I keep just in case, so I should back those up.

I want to do something similar, and load it in a place like Snowflake. I say "like" Snowflake because it's annoying that they have a $25 a month minimum, otherwise it's already nicely suited to be a personal data warehouse!

BigQuery charges per query and for storage which might work for you. Where did you see the $25 minimum for Snowflake?

BigQuery or Snowflake seem rather extreme for anything I might consider personal data. Even logging as many things as in the article, a local SQLite db and your preferred backup solution would get you pretty far. It might be easier to set up nice dashboards with a cloud product though.

I want the db to be in the cloud. I'm not on my laptop most of the time, and I'm definitely not on the same laptop more than 20% of the time. Hosting any SQLite or Postgres db in the cloud means I pay for hot storage, which gets expensive if I have, say, 200GB of data. (Which I do; I have a ton of data from my lab days that I also want queryable like this, and I intend to purchase some datasets that I think might be fun to keep at hand.)

I can see the appeal. Those services might be designed for petabytes of data analysis, but the flip side is that you have something ready to go with little maintenance.

Forgot about BigQuery, will have to give it a shot!

The Snowflake minimum is something my colleague got from them when we set up an account for a startup idea.

You can pretty easily store most of this data in any database. You don't need something like Snowflake - stock Postgres can handle just about any analytics query, especially if you're not writing too much.

So the goal is to avoid paying for compute when I'm not using it. I've tried running Postgres on a low-powered Lightsail instance, and while it runs, it takes an insane amount of time to load data and then to query it. Indexless scans of large tables (let's say a db with all my emails) can take a while!

I've tried the same in Snowflake, using only their smallest warehouse, and everything happens in seconds. Loading GBs of data takes seconds, scanning GBs of data takes seconds. Their smallest compute warehouse is still significantly beefier than the CPU you get with a puny Lightsail instance with a shared vCPU, and while its disk I/O is shit (since it's reading from S3), the parallelism makes up for it.

In the end I think it's the pricing model that makes the difference. With Snowflake you seamlessly launch a very powerful machine to run your query and only get billed for a minute, whereas with Postgres you need to keep your machine running all the time. Hot storage (as opposed to S3) is also non-trivially expensive. I haven't seen any way to get 200GB of SSD storage without spending the 25 dollar minimum that Snowflake costs anyway. This is data I don't intend to update or query more than a few times a month, so I really don't want to pay for compute running all the time on its behalf.

That's a fair point - the 200GB disk costs a bit more if you need an SSD (although my experience shows an SSD is not super necessary unless you have write heavy workloads or otherwise unusual access patterns).

If you drop the SSD requirement, a cheap dedicated server will blow Snowflake out of the water. As a bonus, you can run your own code (ETL, scrapers, dataviz). Kimsufi has 4 core 2TB (spinning) w/ 16GB RAM for $17/mo. Personally, that is the route I go for my personal warehouse, as I find I almost always want to put an API or django app in front (and other software like Celery, scrapers, etc.)

This is one of the reasons I've been building my version of this on top of SQLite: it's incredibly cheap. All you need is a writable disk somewhere. I started with a $5/month VPS.

How much data are you storing? How much is 200gb of storage? At least on lightsail it wasn't cheap! Also that's not even backed up!

Only about 20GB, so it's pretty inexpensive.

I'm not actually bothering to run backups because theoretically ALL of the data there can be retrieved from other sources - pulled back out of the Twitter archive exports for example.

But another similar project uses tarsnap for backups, which is pretty inexpensive.

This is ... the future.

I am going to start off with a few this week - well done and good luck!

Just wanted to say, I'm a fan of your site (even directing my offspring to think about your work). `/salute`

Imagine if a service like this existed. Meaning: a service that would allow you to collect all this data, without any effort.

It's doable. It could be an app on your phone. Properly done, it would be amazing.

How will you graph it all or draw correlations?

For graphing: different data requires different representations, so I'm figuring out how to do it in my dashboard project: https://github.com/karlicoss/dashboard

For the dashboard, I started with health-related data, because I feel like other stuff (like a 'histogram of my tweets') would be amusing but not immediately useful, and I don't really expect any interesting quantified self insights from it.

I'm gradually figuring out 'generic' ways to compute, plot and examine correlations, e.g. here https://github.com/karlicoss/dashboard/blob/708eb183130de31e...

In general I'd like to have some automatic system which can consume all of my data and suggest interesting correlations in an almost unsupervised way.
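A very naive version of that idea, just as a sketch and not what the dashboard project does: compute the Pearson correlation for every pair of daily series and list the strongest ones first. The series names and the assumption that all series are already aligned day by day are made up for the example.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(vx * vy)


def rank_correlations(series):
    """Return (name_a, name_b, r) for every pair of series, strongest |r| first.

    series: dict mapping a name to a list of values, all aligned on the same days.
    """
    pairs = [(a, b, pearson(series[a], series[b]))
             for a, b in combinations(sorted(series), 2)]
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)
```

With many series, some pairs will look correlated purely by chance, so a ranking like this is more of a list of things to look at than a conclusion.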

(got more notes/possible implementations on it here https://beepb00p.xyz/exobrain/projects/dashboard.html#mtvtn )

I feel like the QS space needs a suite not unlike Home Assistant where you can plug in multiple integrations to these various services and begin collecting data relatively simply. Have you found anything in your travels that is mature enough to extend or spend time enhancing?

I still need to incorporate Home Assistant into my infrastructure, so not sure if I understand the analogy exactly.

But basically I think the answer is 'no' -- there are so many different 'APIs' (quoted because often there aren't proper apis, you have to scrape, export manually etc.), that very few people attempted to unify it.

I have some thoughts on how such a system could work, described here https://beepb00p.xyz/sad-infra.html#data_mirror and here: https://beepb00p.xyz/exports.html . The export tools I wrote are a first-order approximation of this vision -- they seem robust enough for me so far, so now I'm working on connecting them to other tools and looking to cooperate with other people.

Home Assistant takes many disparate brands, services, devices, APIs and architectures for the world of IoT devices and yet they are able to somewhat successfully pull them together. I see no reason QS is any different and couldn't do the same even if it is scraping or exporting at intervals where an API does not exist. The analogy is different devices with different things they do, measure, record, and interact coming under a single umbrella of a QS service in a time series database (not unlike and maybe even exactly as is done with: Home Assistant + InfluxDB + Grafana).

Ah, yes, putting it into a single database is something I have in mind!

Basically so far I was solving the 'data import/scraping' part, but now finally starting to connect it elsewhere now that it's somewhat unified and normalized.

I think Altair would make these plots easier.


Have you maybe compared Emfit QS to a consumer-grade tracker like Fitbit directly?

Yeah, I also have a Garmin watch, so I've wanted to write a script to properly compare the data for a while. From a few spot checks (e.g. actually comparing 20 select days against my manual records):

- heart rate and respiration rate match pretty closely (kind of expected I guess)

- Emfit is noticeably more accurate at detecting asleep time/awake time. But even so, sometimes Emfit doesn't detect that you've woken up and are just lying in bed with the phone.

- 'sleep phases' at a glance don't match well. Definitely need a comparison script to make any stronger claims. In addition, I'm still not sure if sleep phases actually mean anything at all (for my body, anyway), need to do more analysis about it.

- Emfit detects HRV, which doesn't seem to correlate with any of my subjective feelings. However, it does seem to change after exercise, for example.
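The comparison script could start out as simple as this sketch, which assumes each device's export has already been reduced to a mapping from date to minutes asleep. The function names and the example numbers in the usage note are invented for illustration and are not real measurements.

```python
from statistics import mean


def compare_nights(device_a, device_b):
    """Per-night difference (a minus b) for the dates both devices recorded.

    device_a, device_b: dicts mapping an ISO date string to minutes asleep (assumed format).
    """
    shared = sorted(set(device_a) & set(device_b))
    return {day: device_a[day] - device_b[day] for day in shared}


def summarize(diffs):
    """Number of shared nights, mean signed difference and mean absolute difference."""
    values = list(diffs.values())
    if not values:
        return {"nights": 0, "mean_diff": None, "mean_abs_diff": None}
    return {
        "nights": len(values),
        "mean_diff": mean(values),                       # systematic bias between devices
        "mean_abs_diff": mean(abs(v) for v in values),   # typical size of the disagreement
    }
```

For instance, `summarize(compare_nights({"2020-01-01": 420, "2020-01-02": 390}, {"2020-01-01": 440, "2020-01-02": 380}))` reports the bias and typical disagreement over the two shared nights; the same shape would work for heart rate or respiration rate.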
