MacOS monitoring the open source way (dropbox.com)
274 points by el_duderino 9 months ago | 53 comments

This is super interesting and ambitious and I'm always sort of envious when I look at big engineering orgs, like Dropbox and Slack, where everything has fine-grained instrumentation.

We do this kind of work for startups, and we have a small roster of companies where we're on the hook for this kind of corpsec.

But then I think about what it'd be like to actually deploy and operationalize this stuff, and it gives me pause.

I have two big questions --- real questions, like, I have no pretense of having an answer and am speaking from ignorance --- about how this stuff works in practice.

The first is privacy. They're deploying this across a fleet of company laptops. People do all sorts of stuff on their corp laptops. I don't know enough about Dropbox's company culture to know whether people ever use their machines for personal stuff, but I do know that occasional personal use is an SFBA startup norm. They're collecting essentially a continuous bash history from every user in their fleet. How comfortable are engineers with that? I don't have a strong opinion, but it gives me enough pause that I'd be a little afraid to ask a client to do it.

The second is: what do you do with all that information? I see how this addresses malware response, and I see how this would be useful as a forensic archive for general IR. But as a day-to-day tool, connection-by-connection, file-by-file granular data from a mostly technical team seems incredibly noisy. Most engineers look like attackers: they make out-of-process connections to random backend resources, copy things around, and run new code.

I acknowledge right away that my vantage point is small startups, not organizations running at the scale Dropbox is.

> They're deploying this across a fleet of company laptops ... How comfortable are engineers with that?

I worked for the corporate infrastructure team at Dropbox. At the time my team was responsible for installing/managing things like osquery on laptops, with the data going to the security team's tools, so I have thought about this. Speaking for myself, the Dropbox culture takes trustworthiness very seriously. "Be worthy of trust" is #1 on the list of company core values, and it is deeply embedded into the culture. The trust covers so much of how the company operates, including privacy. Spying on employees is so against company culture that, even though I knew exactly what data was being sent, I trusted the security team would not use it for anything other than protecting me. The detection and response team is made up of the most security and privacy conscious people I've ever met.

I think the mindset of respecting coworker privacy is important for anyone in corporate infrastructure, including network and system administrators.

But I've worked for other big corporations too, without that trust. I think employees need to know that when they use corporate computers and communications systems, they aren't private. Don't do stuff you shouldn't be doing at work.

> what do you do with all that information? I see how this addresses malware response, and I see how this would be useful as a forensic archive for general IR. But as a day-to-day tool, connection-by-connection, file-by-file granular data

When I worked there the security team reached out to me to ask about anomalous behavior, so there is active detection going on. That was a couple years ago, I'm sure they have improved since then too. Keep in mind, they have a full time detection and response team, in addition to all the other security teams.

When Dropbox deliberately circumvented Apple's security features to make itself difficult to remove [1], was the company's #1 core value of "Be worthy of trust" in mind, or was that only added in retrospect?

[1] http://applehelpwriter.com/2016/08/29/discovering-how-dropbo...

This issue has been addressed already:


I don't believe there was any intention of making Dropbox hard to uninstall. I think the intention was to make the process seamless for users, and there was a bug that caused a setting to get reapplied after a user changed it.

I believe that Dropbox doesn't really have poor intentions here: they're trying to make their product easy to use or work better or whatever. I don't think their intentions are malicious. However, I strongly disagree with the method that they use to achieve this: installing kernel extensions, bypassing Accessibility prompts, having hooks in every process they could possibly get their hands on, etc. is going too far. There's a reason those checks are there: they keep the user safe. Trying to get around these is, in my mind, arrogant. They think that they're better than every other company that abides by the rules. Google Drive doesn't do this. Box doesn't do this. Even iCloud Drive doesn't do what Dropbox does.

I was talking to someone just two days ago, who was tearing their hair out because their application's "Open" panel took something like ten seconds to open. Why? Because the Dropbox extension decided it didn't want to play along nicely. It's foolish to think that your software will not have bugs, and outright foolhardy to do this for so little benefit.

Dropbox pushes the state of the art for file sync. They added sync status icons on Mac OS X before Apple had an API for it, later Apple added an API. Those icons are part of what makes the product so usable.

The kernel extension is another example. AFAIK it was added for Smart Sync[1] (née Infinite[2]). Infinite is amazing, truly amazing, and the only way to implement it is via a kernel extension. Microsoft tried to implement it via the GUI with OneDrive's "smart files" (aka placeholders), they removed it because the files didn't work in too many places, like via syscalls and from command line.

Dropbox Infinite is kind of like the source control systems Google and Microsoft created to handle their huge repos. Microsoft's GVFS uses a file system filter driver (kernel extension) called GvFlt (or ProjFS)[3]. From what I understand, Google's Piper uses FUSE[4], which would be a third-party kernel extension on macOS.

My point is, these are technical achievements that provide seamless and intuitive user experiences, they aren't betrayal of trust.

[1] https://www.dropbox.com/smartsync

[2] https://blogs.dropbox.com/tech/2016/05/going-deeper-with-pro...

[3] https://www.visualstudio.com/learn/gvfs-architecture/

[4] https://cacm.acm.org/magazines/2016/7/204032-why-google-stor...

> My point is, these are technical achievements that provide seamless and intuitive user experiences, they aren't betrayal of trust.

Given that you've put a lot of effort in this thread into weaving trust into your description of Dropbox operations, I'm surprised that you wouldn't trust an end user when they tell you that this is in fact a betrayal of their trust in Dropbox.

But I don't think I can disable your "state of the art" features if I really wanted to. I don't want your extra features if they require trapping on file modification syscalls in the kernel. I'm fine with the behavior that other apps provide without your "truly amazing" project Infinite kernel extension.

> Dropbox Infinite is kind of like the source control systems Google and Microsoft created to handle their huge repos. Microsoft's GVFS uses a file system filter driver (kernel extension) called GvFlt (or ProjFS)[3]. From what I understand, Google's Piper uses FUSE[4], which would be a third-party kernel extension on macOS.

I find that your examples really show a lack of understanding of why I'm frustrated by Dropbox's behavior. The products you linked are source control systems used by software engineers. Dropbox is aimed at nontechnical users. One of those groups understands what a kernel extension is, and the other one can't point their finger at Dropbox when their computer crashes. How is this not betrayal of trust? When a user installs an app they don't expect it to literally put its fingers all over the operating system. Not only does Dropbox look like any other application on the surface, it actively works to perpetuate this myth by spoofing operating system dialogs and prompts. If a doctor did a procedure on a patient that they didn't fully explain or even mention that they were doing they'd be sued for malpractice.

Also, I don't really like doing this, but you seem to be overly positive about Dropbox's work in this field. Do you work there, or have you ever worked there? Did you have a vested interest in this project?

Interesting that osquery is developed by Facebook and Santa is developed by Google. I do some work with Facebook and during a video conference with one of their employees the connection dropped out. When we reconnected I asked what happened and they said that the machine updated itself. I thought this was kind of an odd time to update and dropped it into the conversation and was told that their laptops are updated every day. Without an understanding of the inner workings of Facebook it seems like they may be snapshotting employee machines, daily.

I doubt it was snapshotting. It is common to force OS updates. For macOS there are tools where the employee will be prompted to update, and can dismiss for a while, but eventually they will be forced to restart. My guess is that the Facebook employee had ignored update prompts and was then forced to update. Every day does seem odd, I can't think of an update type that would require restarts/logoff every day. At least a couple years ago Facebook and Dropbox were managing their Macs in similar ways, with similar tools, I'd be surprised if they changed to something that required daily restarts.

Facebook has a Chef cookbook called cpe_nightly_reboot for their Macs. I can't think of why that would be required, but they must have a reason.


I don't work at Facebook, but I'm fairly certain this is for things like build servers, not client machines.

They use Munki for software updates, which, just like you describe, gives the user a grace period where they can reboot at their leisure. Ignore it for long enough and it'll crank up the nagging, and eventually force a reboot.

> this is for things like build servers, not client machines.

It talks about killing GUI processes. Is it common to run a display server on macOS servers?

Yes. Services that can run without a display are better run on a Linux server anyway.

I assumed it was for compilation of iOS apps, do those need a display?

No, you can compile iOS apps from the command line.

I don't think it's possible not to.

"you may run into unexpected performance issues that make the machine nearly unusable by your employees. You might also experience issues like having hosts unexpectedly shut down due to a kernel panic"

It is fairly evident they are talking about Carbon Black.

Carbon Black's kernel module panics on 10.13, and when you use it for development workloads it eats up half of your CPU in kernel space, making the entire machine unresponsive.

All of the large SFBA tech companies have something like carbon black, and they proxy your DNS requests with openDNS. Every network interaction is logged and so is whatever you do on your computer, personal or not. Whatever controls they have over security engineers looking over that info is probably not enough. Creepy stalker sysadmin is a tech industry scandal waiting to happen.

I've also heard rumors at some companies they even give reports of what employees are doing on their computers to managers, and managers are told not to tell their reports. I also think google records every keystroke. Those are rumors so take that with a big grain of salt.

I don't like it, but if I want SFBA pay, I've come to accept there is no privacy on work computers. The only things I do on my work computer is HN & Twitter. Things that are effectively public. Everything else goes through my personal smartphone. I have my own computers, and in the end it mostly means I have to carry 5 extra pounds in my backpack sometimes.

> I've come to accept there is no privacy on work computers.

It has always puzzled me why anyone began to assume they did have this privacy. I have been around long enough to see the token ring to ethernet transition, and (formerly) as a network engineer trying to jam web 1.5 business through a fractional T1. Never mind your privacy, all that network throughput and CPU is paid for and hard-fought-for and it costs a lot in CAPEX and OPEX. Getting to do a little online banking or email on work gear has always seemed like a short term scam to me.

Can you put a bitcoin rig under your desk and use your employer's electricity to make money? Yeah, if it looks like a footrest, until the culture and policy catch up. This already happened with postage and long distance phone calls.

You never had a right to privacy on work computers. We all pretended it was a little perk for a few years, but the risk of letting you do personal stuff is not worth it anymore for companies of a certain size.

Well, it is a jurisdiction issue.

While I agree with you, many countries with good work laws do limit what employers are actually allowed to see from their employees' data.

Meaning if the company lets someone go based on data they captured, and capturing it was illegal, then they had better be prepared to sign some checks.

For example, in Germany a company is not allowed to use how long someone is online as a productivity measure.

The problem is that many prefer not to bother with work law, not even "work law for dummies" kind of books.

I don't have a 'right' to privacy in my workplace bathroom stall either, but it's pretty fucking creepy if my employer watches me take a shit on the toilet too.

Now imagine if most employees didn't quite understand what the 'door security gates' are really doing, and you start to understand where the problems start.

That's the crux of the issue for me.

Also there is another perspective about labor laws, and if a society should allow employers to do such things in the first place.

> I don't have a 'right' to privacy in my workplace bathroom stall either

Actually you do. That scenario is legally distinct from your computer use.

That's great! We should extend it to people's private communications on work computers, because it's about equivalent.

Like Austria :) https://www.taylorwessing.com/globaldatahub/article_austria_...

I don't think looking at my genitals is equivalent to monitoring my work computer.

People are imperfect. Someone is going to forget and check their email, as they habitually do, while they are at the office (where they spend the majority of the day) on their work machine (where they spend the majority of their time), since we are not trained from birth to segregate our computer usage by machine. Take this example and extend it...

That said, I agree with you and do think being cautious about what one does on one's work machine is very important.

People really shouldn't use their corp laptops for personal things. I know they do, but there really shouldn't be any legitimate expectation of privacy on your corporate laptop; not only do you sign a contract stating as such, but anything you create on there (on company time or not) is company IP, whether related to your work or not. California is very careful about protecting employee IP if done on personal time, but only if done without corporate devices.

IANAL, etc.

I used to work at Dropbox (as a PM, not a dev so I can't speak to that specific experience).

It was made pretty clear during orientation that there was some monitoring software installed on your computer that needed to be installed so we could keep our various compliances (HIPAA, PCI, etc). I think they were pretty clear that humans were not actively reviewing these logs (i.e. a computer would flag activity based on a heuristic) which definitely made me feel better. I was friends with people on the internal security team, and it was very obvious that they took employee privacy very seriously, at least from a philosophical perspective.

Obviously, people used their work computers for personal things. People installed Steam, Skype, whatever. I never heard of anyone getting in trouble for using that kind of software.

Security actually contacted me about my usage once. I was doing some Wikipedia browsing about password cracking software and I noticed that there was one program in particular (Cain and Abel) that was exclusively for Windows. That struck me as weird, because I expected most security software to be run on *nix systems. Anyway, I went to the Cain and Abel homepage, browsed around a bit and then went on my merry way.

The next day, I get a somewhat alarming email from the security team alerting me that my computer had been compromised and removed from the network. I reached out to the guy who sent the email over Slack, and I go to his desk for a quick meeting. I ask him what the deal is, and he checks my Macbook's S/N against the list of flagged machines. He asked me if I had gone to "www.oxid.it" and at that point I freaked out a bit because I didn't recognize the domain. We searched the domain, and it was at that point I realized it was the same software I was looking up the previous day, and told him the site visit had been intentional. He explained that visiting the site is behavior consistent with a compromised machine and that would explain why I had gotten flagged.

I pointed out that C&A is Windows only software and I was running OSX, and he gave me the (somewhat unsatisfying) answer that you can't be too careful. After that, he told me to be careful and sent me on my way.

Overall, I had a pretty positive perception of the internal security team. They had a strong reputation and seemed to execute competently without being obstructionist.

I hope this answer gives you a bit of insight. I'd be interested to see how other shops do it.

Making a large amount of this kind of data operational in real time seems like a large task. I'm wondering if it's mostly used as a postmortem tool to determine the attack vector and close that hole going forward.

Also I am interested to know if this data would fall under GDPR for any employees in the EU, and if so, how that data could be scrubbed if the employee left the company. In the US it's assumed that any information about actions on corporate hardware is owned by the company, but maybe that's not true outside US borders.

My understanding is that it's actually not all that common for engineers to do quite the same things attackers do, since a lot of intel is fairly specific. Sure, abnormal network or process behavior might cause interest, but if it's not connecting to "superbadmalware.org" or running "known-exploit.exe", a dev probably doesn't cause very many false positives. It probably depends on the company, and the people looking out for that company will know the difference.
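A rough sketch of what I mean by matching on specific intel rather than on "looks unusual". Everything here is made up for illustration: the `Event` shape, the two indicator sets, and the function names are assumptions, not how any particular product works.

```python
from dataclasses import dataclass

# Hypothetical indicator lists; a real deployment would load these from a threat feed.
BAD_DOMAINS = {"superbadmalware.org"}
BAD_PROCESS_NAMES = {"known-exploit.exe"}


@dataclass(frozen=True)
class Event:
    """One endpoint observation: a host, the process that ran, and the domain it contacted."""
    host: str
    process_name: str
    domain: str = ""


def is_suspicious(event: Event) -> bool:
    """Flag an event only if it matches a specific known-bad indicator."""
    domain = event.domain.lower().rstrip(".")
    # Match a listed domain itself or any subdomain of it.
    domain_hit = any(domain == bad or domain.endswith("." + bad) for bad in BAD_DOMAINS)
    process_hit = event.process_name.lower() in BAD_PROCESS_NAMES
    return domain_hit or process_hit


def flag(events):
    """Return only the events that match an indicator."""
    return [e for e in events if is_suspicious(e)]
```

With rules like these, a dev connecting to a random internal backend produces nothing, while a single hit on a listed domain gets flagged no matter who made it.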

As for what to do with all that information, it's definitely a great question. Right now I think the answer is person-driven (get eyeballs on the intel), but obviously that's not going to last. Everyone in the space (AFAIK) is trying to take what analysts do, and productize it somehow.

I'm sure you've got insight into how that's happening on your end, it's a pretty straightforward leap when faced with, "Do I scale my service?" questions.

There are actual endpoint security products that pretty much do this across all platforms in a much more integrated way.

Yes, but they are nosebleed expensive and are shrink-wrapped around a set of threat vectors common to all their customers.

The privacy concerns with commercial endpoint stuff are comparable (though not the same; with some endpoint tools, it would be difficult for a manager or a random IT person to snoop on what you're doing). The concern about operationalizing is mitigated somewhat, but only if all you're doing is malware/public-threat-feed stuff.

I am not sure rolling your own will be less expensive.

It depends on what you’re trying to accomplish, but commercial distributed endpoint protection would not be in my top 20 projects for any startup I can think of working with.

That obviously depends on many factors. Dropbox is not exactly a startup, though :)

> People do all sorts of stuff on their corp laptops. I don't know enough about Dropbox's company culture to know whether people ever use their machines for personal stuff, but I do know that occasional personal use is an SFBA startup norm.

Hmm I wonder if this is true or not, or if it’s a startup thing. As someone who has worked for BigCorp Inc. companies, I know I carefully maintain a strict separation between my work and personal devices, for privacy reasons and simply to follow company policies WRT confidentiality. Is this “crossing the streams” between work and personal stuff that common in the startup world? I’d think at the very least even a small startup would want to keep their secret IP off personal devices and limit risky horseplay on company equipment.

At the company where I work, I know many people who do personal stuff on corporate laptops, and I do my stuff on my corporate laptop too.

In my case, though, the company has a nice privacy policy in place, and they’ve even gone above and beyond by configuring the VPN so that only corporate network requests are routed through their VPN servers. Also, they don’t monitor activity; they only scan files locally for certain known hashes, for obvious reasons.
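I don't know how their scanner is actually built, but the general idea of a local scan against known hashes is simple enough to sketch. The function names and the choice of SHA-256 below are my assumptions; the point is that only matches ever need to leave the machine.

```python
import hashlib
from pathlib import Path
from typing import List, Set


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are never loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan(root: Path, known_bad: Set[str]) -> List[Path]:
    """Return files under root whose SHA-256 is in the known-bad set."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            if sha256_of(path) in known_bad:
                hits.append(path)
        except OSError:
            # Unreadable files (permissions, races with deletion) are skipped, not fatal.
            continue
    return hits
```

Nothing about the contents of non-matching files is recorded, which is why this feels much less invasive than full activity logging.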

At my previous company we had Linux on laptops and our network was highly monitored. People used their work machines like their own. The company had several network checks in place, so if you forgot to turn off the VPN, you’d get a notification saying that you probably shouldn’t be doing what you’re doing on the corporate network.

The MacOS security products (donationware) are also excellent and overlap the functionality of these.


They don't really. Objective-See's tools are not bad for what they are (utilities), but they are lacking in:

1. Code quality (Patrick Wardle is not an engineer and it shows in his code). This is not terribly important for an end-user utility but becomes very important when you're deploying kernel modules across your entire organization that need to be absolutely rock solid and not introduce additional threat vectors.

2. Distributed nature (They're utilities meant to be executed by end-users on their own machines, not distributed agents syncing up with cloud servers)

and so on..

osquery came up a few days ago; in use at Etsy, including a direct link to a common setup.

Ask HN: Is no anti-virus software still best practice for mac?


> https://github.com/facebook/osquery/blob/master/packs/osx-at...

Also a discussion of how this type of monitoring worked out in practice in the Google/Uber/Lewandowski case last year:


> the level of detail that Google has over the logs and actions of their laptop

> https://github.com/google/grr

After they praised OSS a lot in that post, one (me ;)) would have assumed they would announce the open source release of their "plumbing" code to make it all work nicely, build process trees, etc.

Trail of Bits has done a study on how large technology companies are increasingly switching to osquery for their endpoint monitoring needs. You can read the results here:




We also offer a commercial service to make custom modifications, bugfixes, and feature enhancements to osquery. It's little known at this point, but we do the same for Google Santa too!


FWIW, I'd love to see AIX support for osquery. Wondering if there is any interest from others.

We have a client that may ask us to add that! Please get in touch if you're willing to sponsor that kind of development.

I started to do something like this for my Windows box but then I realized that tracking process execution is not all that useful if you don't also track DLLs (because it's simple to just dump DLLs in a trusted applications folder and wait for them to be loaded). Tracking DLLs would just produce too much noise to monitor which put me off the whole thing altogether.

Shameless plug, but anyone interested in deploying osquery and Google's Santa into their environment should check out https://www.zercurity.com - it supports Linux and Windows too.

OpenBSM is awesome except you’re forced to invent your own way of log gathering - which becomes more painful when you’re mobile or offline and then you’ve got to keep state on what’s been transmitted to the mothership.

Would be nice for some insight into Dropbox’s solution here...

To address logging offline you need a log shipper that will do reliable logging and pick up where it left off. I think rsyslog, Elastic Beats, and Splunk forwarder will all do that. Then logs are sent when a machine connects to a network.

For mobile (online but outside corporate network) there are two options I've heard of being done:

1. Have each endpoint have a unique TLS certificate, and have the log shipper do mutual TLS to the logging server which has a public IP.

2. Have a backhaul VPN that is always connected, automatically, to the monitoring network, and send the logs over that. That VPN is different than the user VPN that gives access to the corporate network.
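If you're wondering what "pick up where it left off" amounts to, here is a minimal sketch of the state-keeping part, assuming a plain line-oriented log file. This is not how rsyslog, Beats, or the Splunk forwarder are implemented; `ship_new_lines`, the JSON state file, and the `send` callback are all hypothetical. The transport (mutual TLS or the backhaul VPN) would live inside `send`.

```python
import json
from pathlib import Path
from typing import Callable, List


def ship_new_lines(log_path: Path, state_path: Path,
                   send: Callable[[List[str]], bool]) -> int:
    """Send lines appended since the last successful run; return how many were sent."""
    # Load the byte offset saved by the previous run (0 on the first run).
    offset = 0
    if state_path.exists():
        offset = json.loads(state_path.read_text())["offset"]

    # A log shorter than the saved offset was rotated or truncated; start over.
    if log_path.stat().st_size < offset:
        offset = 0

    with log_path.open("rb") as fh:
        fh.seek(offset)
        data = fh.read()

    # Only ship complete lines; a partial trailing line waits for the next run.
    end = data.rfind(b"\n") + 1
    if end == 0:
        return 0
    lines = data[:end].decode("utf-8", errors="replace").split("\n")[:-1]

    # Advance the offset only after the server accepted the batch,
    # so an offline laptop retries the same lines later.
    if not send(lines):
        return 0
    state_path.write_text(json.dumps({"offset": offset + end}))
    return len(lines)
```

Run it on a timer: while the laptop is offline `send` fails and the offset stays put, and once a network is available the backlog goes out in one batch. Note this gives at-least-once delivery, so the receiving side should tolerate duplicates.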

Yes, but to get BSM into Elastic Beats, you either need to make a shim to convert from BSM binary format into json for FileBeat to consume, or you need to write your own Beats for BSM files.

This is nice, but what would be better is the glue they use to integrate them into other monitoring solutions. How do they merge these tools into their existing infrastructure? Do they parse the logs from BSM, Santa and osquery into ELK? If so, how?

The real difficulty is not finding useful open source tools, it is integrating them into existing monitoring solutions used within an organisation to get a single view of activity on a system and on a collection of systems (i.e. how do you make the tool scale).

Was anybody else thinking these tools would be great for general development and debugging purposes? Anything that uses the network or file system anyway. The fact that they can detect malware reads as just an aside to me.

There is also an intellectual property issue. In my case, whatever code I write on a corporate laptop belongs to the company I work for, so it’s kinda wrong to use it for pet projects.

Okay, so how do we use these kinds of tools to monitor our own personal machines?

What is the equivalent of these 3 tools on Linux (Ubuntu)?
