
How do data scientists log their workflow/process? - themlaiguy
How do I track how much time I'm spending on each part of my workflow (e.g. cleaning, experimenting)?
======
ArtWomb
This is indeed an issue, with many still using spiral-bound notebooks ;)

I always thought video is great for this. Check out iBiology's "Techniques"
channel. It's a great way to disseminate good practices with complicated
biotech equipment. Even a daily "vlog" could work, with a scientist
dictating notes to a webcam. The recording could then be automatically
transcribed, indexed, and made searchable.

[https://www.youtube.com/user/iBioEducation/videos](https://www.youtube.com/user/iBioEducation/videos)

------
rahimnathwani
Do you use Jupyter notebooks for every part of your workflow?

If so, you could commit them to a git repo every hour or two, with a commit
message that summarises what you've done in that time.

That way, you can run 'git log' to look back over time. If you have good
commit messages, you won't need to roll back and look at previous notebook
versions, but you can use that as a fall-back if you need to know exactly what
you did at a certain step.
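
A rough sketch of how that commit history could also answer the "how much time
per stage" question. It assumes a convention that is not part of git itself:
each commit message starts with a stage tag such as "[cleaning]" or
"[experimenting]". The function names, the tag format, and the 3-hour cutoff
are all made up for illustration. The gap before each commit is credited to
that commit's stage, and long gaps are skipped as breaks, so the totals are
estimates only.

```python
import re
import subprocess
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Hypothetical convention: messages look like "[cleaning] dropped null rows".
TAG_RE = re.compile(r"^\[(\w+)\]")


def time_per_stage(commits, max_gap=timedelta(hours=3)):
    """Estimate time per stage from (datetime, message) pairs.

    The gap between a commit and the one before it is credited to the
    later commit's stage. Gaps longer than max_gap are treated as breaks
    (lunch, overnight) and ignored.
    """
    totals = defaultdict(timedelta)
    ordered = sorted(commits, key=lambda c: c[0])
    for (prev_time, _), (time, message) in zip(ordered, ordered[1:]):
        gap = time - prev_time
        if gap > max_gap:
            continue
        match = TAG_RE.match(message)
        stage = match.group(1).lower() if match else "untagged"
        totals[stage] += gap
    return dict(totals)


def read_git_log(repo_path="."):
    """Read (datetime, subject) pairs from 'git log' in repo_path."""
    out = subprocess.run(
        # %at = author date as a Unix timestamp, %x09 = tab, %s = subject
        ["git", "-C", repo_path, "log", "--pretty=format:%at%x09%s"],
        capture_output=True, text=True, check=True,
    ).stdout
    commits = []
    for line in out.splitlines():
        stamp, _, subject = line.partition("\t")
        when = datetime.fromtimestamp(int(stamp), tz=timezone.utc)
        commits.append((when, subject))
    return commits
```

Calling time_per_stage(read_git_log("path/to/repo")) would give a dict of
stage name to timedelta. The more often you commit, the better the estimate.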

------
tixocloud
Given that we're in a very large organization, Excel spreadsheets have been
our default from the start. I created a template to help our data scientists
document issues (we also use Confluence but we haven't got it in a good state
just yet).

For actual tracking, we use rough estimates, mostly because we want to
balance the pain of managing the data against the value that we get out of it.

------
aaron-santos
By logging this data, what problem are you trying to solve, or what decision
are you hoping to make?

------
themlaiguy
So I can figure out where to tell my PM to make further data investments (e.g.
if I'm spending most of my time cleaning a particular dataset, my PM can assign
more data engineers there).

