I started a social data search platform named Datastreamer (http://www.datastreamer.io/) which is basically a petabyte-scale content indexing engine.
We provide API feeds to search engines and social media analytics companies that need bulk data but don't want to build a crawler.
For the last 5 years we've had major problems with customers coming to us asking for data which we felt was unethical (at best).
We actually had Saudi Arabia approach us... It was clear that they were intending to do something pretty evil with the data.
Their RFP questions were a bit frightening:
- can you track people by religion?
- can you give us their email address?
- can you provide their address?
- can you provide their ethnicity?
- can you provide their social connections?
We're actually losing business to other companies that are using highly unethical and probably illegal techniques.
We just can't compete with data at that type of fidelity.
If you're a researcher and you want to access bulk data for combating this type of nonsense WE WILL PROVIDE DATA AT COST. We can provide up to 1PB of data but for now we have to charge for the shipping and handling of that data. We're reaching out to some other companies like Google and also the Internet Archive to see if we can provide more cost-effective solutions.
I'm working on more tools to give the power back to the users.
Polar (https://getpolarized.io/) is a web browser which allows people to control their own data. The idea is that I can keep a local repository of data and eventually build our own cloud platform based on open systems like IPFS and encrypt the data using group encryption.
There are tons of other shady companies out there doing nefarious things with your data.
We're going to need platforms that support group encryption and better security for apps.
I don't know if there's any way for people/companies like you to defend against unscrupulous companies.
I can only say thank you.
Definitely. Though I also feel we need to give a shock to public awareness of just how evil people can be with this data. My perception is that people are trending towards "vaguely uncomfortable" with the news of foreign interference with targeted ads, election hacking, and so on, but we've a ways to go yet before most will give up supremely-engineered convenience in exchange for security.
As long as the modern world is democratic and the voting populace is subjecting itself to targeted manipulation by data-armed bad actors, we have a problem of not just national security, but international security.
Thanks for being one of the good guys.
I do think there's potential in something along these lines, but I agree with child post that it would need to be done carefully so as not to cause collateral damage. The other question in my mind is how to market it such that people get their friends using it and thus spread the word rather than panicking and reporting it.
It takes years of dedication to be a professional electrician or plumber, but anybody off the street can build an application that aggregates personal information and isn't subject to any sort of regulation or oversight or system of professional ethics.
That doesn't just safeguard the public. It also safeguards the employed individuals when they say, "No, that is not allowed under the ethics of my profession."
This isn't about "let's make the government save us," it's about creating a legal framework to protect the interests of the public and give ethical considerations in data management and software design some legal backing.
I think things would change pretty quick if engineers responded to such requests with "I can't do that, because my professional association would remove me from their membership, which would revoke my license to write software in this country at all".
That's how (AIUI) law, medicine, (real) engineering, etc., work today.
Keep in mind that most law/medicine/engineering work has some local component anchoring it to local laws. (i.e. currently, someone typically needs to be 'boots on the ground' in the jurisdiction to provide the service) Software doesn't have that.
(edit: obviously, this is a U.S.-specific view of the situation. Other countries may not have the same issues)
1) Maintains a membership list
2) Maintains a list of software which is signed off on by members
3) Browser/OS/etc utilities which refuse and/or warn when trying to run software not in the registry
4) Member expulsion if registered software is found to be nefarious
This is basically the system Apple/Microsoft/Debian/etc/etc already use for official software distribution. We just need the organization to move out of their walled gardens.
The big leak here is users who have to use resources they don't control. I can imagine an IaaS company which won't run software unless it's in the registry, and then companies can boast that your data is 'safe' (or at least not nefarious) because they run in this kind of environment.
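To make the four points above concrete, here is a rough sketch of such a registry. Everything in it is hypothetical: the Registry class, its method names, and the use of a plain SHA-256 digest as a stand-in for a real cryptographic signature. A real system would use public-key code signing, as the existing app stores do.

```python
import hashlib
from dataclasses import dataclass, field


def digest(software: bytes) -> str:
    """Identify a piece of software by its SHA-256 digest (stand-in for a signature)."""
    return hashlib.sha256(software).hexdigest()


@dataclass
class Registry:
    # 1) membership list
    members: set[str] = field(default_factory=set)
    # 2) software signed off on by members: digest -> member name
    signed_off: dict[str, str] = field(default_factory=dict)

    def admit(self, member: str) -> None:
        self.members.add(member)

    def sign_off(self, member: str, software: bytes) -> None:
        # Only current members may vouch for software.
        if member not in self.members:
            raise PermissionError(f"{member} is not a member")
        self.signed_off[digest(software)] = member

    def may_run(self, software: bytes) -> bool:
        # 3) what a browser/OS utility would ask before running something:
        # True only if a current member has signed off on exactly these bytes.
        return self.signed_off.get(digest(software)) in self.members

    def expel(self, member: str) -> None:
        # 4) expulsion removes the member and everything they vouched for.
        self.members.discard(member)
        self.signed_off = {
            d: m for d, m in self.signed_off.items() if m != member
        }


registry = Registry()
registry.admit("alice")
app = b"print('hello')"
registry.sign_off("alice", app)
print(registry.may_run(app))          # signed off by a current member
print(registry.may_run(b"unknown"))   # never registered
registry.expel("alice")
print(registry.may_run(app))          # sponsor was expelled
```

The sketch only covers the bookkeeping. The hard parts (who runs the organization, and how "nefarious" gets decided) are exactly the parts code can't answer.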
“I can’t recommend this additional procedure for you despite it making me $8500 in a day.”
Regulation doesn’t seem to influence for-profit medicine much.
The whole conversation is logged, or perhaps converted with speech-to-text as they talk, and both patient and doctor sign each statement they make. Then both doctor and patient have a copy of their interaction.
Any poor advice is now provable to a third party (say, a court).
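A minimal sketch of what such a co-signed log could look like, assuming each statement is chained to the previous one by hash so that nothing can be edited or dropped afterwards. The function names and keys are made up for illustration, and HMAC with a per-party secret is only a stand-in here: for a court to check the log without holding anyone's secret you would need real public-key signatures (Ed25519 or similar), which aren't in the Python standard library.

```python
import hashlib
import hmac
import json


def _payload(prev: str, speaker: str, text: str) -> bytes:
    # Canonical bytes for one statement, bound to the previous entry's hash.
    return json.dumps(
        {"prev": prev, "speaker": speaker, "text": text}, sort_keys=True
    ).encode()


def _sign(key: bytes, message: bytes) -> str:
    # Stand-in for a digital signature (assumption: one secret key per party).
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def append_statement(log: list, speaker: str, text: str, keys: dict) -> None:
    """Add a statement that every party in `keys` signs."""
    prev = log[-1]["entry_hash"] if log else ""
    payload = _payload(prev, speaker, text)
    log.append({
        "prev": prev,
        "speaker": speaker,
        "text": text,
        "signatures": {name: _sign(key, payload) for name, key in keys.items()},
        "entry_hash": hashlib.sha256(payload).hexdigest(),
    })


def verify(log: list, keys: dict) -> bool:
    """Check the hash chain and every party's signature on every statement."""
    prev = ""
    for entry in log:
        if entry["prev"] != prev:
            return False
        payload = _payload(entry["prev"], entry["speaker"], entry["text"])
        if hashlib.sha256(payload).hexdigest() != entry["entry_hash"]:
            return False
        for name, key in keys.items():
            recorded = entry["signatures"].get(name, "")
            if not hmac.compare_digest(recorded, _sign(key, payload)):
                return False
        prev = entry["entry_hash"]
    return True


keys = {"doctor": b"doctor-secret", "patient": b"patient-secret"}  # hypothetical
log = []
append_statement(log, "doctor", "I recommend the additional procedure.", keys)
append_statement(log, "patient", "Is it medically necessary?", keys)
print(verify(log, keys))   # untouched log
log[0]["text"] = "I do not recommend the additional procedure."
print(verify(log, keys))   # edited after the fact
```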
I'd add that it would be nice to be able to operate outside of certain fields and certain types of operations without that level of certification.
For instance, freelance web devs, small business software employees, etc who aren't dealing in things like personal or trivial data could continue operation. For example: you don't need to be a doctor to be certified in first aid, or even administer first aid—but you likely wouldn't attempt an invasive, life-threatening surgery.
I'd also like to see—if that kind of regulation were to pass—the inclusion of some kind of grandfather clause that would include the ability to test without formal education.
The reason being there are very many highly capable developers/engineers in the field who don't possess the exact formal background—and in many cases came from other formal backgrounds.
I'd definitely hear arguments for not requiring education at any time, but to keep it on par with the other professions you listed I'll leave it as is.
This might not be the thread for a larger discussion on this—because it seems like it would be a larger discussion. An interesting one, though...
Implementation seems like it would be a challenge, but then again I don't know the stories behind how more modern professions like electricians were regulated. I imagine that field grew much more slowly.
How do you modify the above idea if you think it has merit?
"I'd also like to see—if that kind of regulation were to pass—the inclusion of some kind of grandfather clause that would include the ability to test without formal education."
Why not have some sort of certification process you can do while working, one that holds the individual accountable to ethical values and carries consequences for not adhering to them, at all levels, on the scale of GDPR violations?
Also there should be steeper penalties against companies acting in bad faith, similar to GDPR for human rights. Thoughts?
Oh, with regard to this one, I don't think my wording was clear. I meant testing or challenging to be certified without having a formal CS or related background. A doctor has to have an MD, among other certifications, to practice as a doctor; I was contending that an explicit CS degree may not be the right equivalent for practicing software engineering/research. As background: there are many talented and influential researchers who would be cut out of practicing if the line were drawn at a reputed CS degree. Tying "software practice" (for lack of better wording) strictly to a CS degree might set poor bounds for the field.
But I definitely think I agree, at least on a high level, with what you're proposing. I hadn't considered it. Good things have come out of research using the large amounts of data available, so it should continue. But there definitely should be some sort of bounds and method for accountability. I would also include a special permissions and appeal process. I'm sure there's a lot of cost/benefit judgement, as there is in many scientific experiments (there's seemingly a great deal of that in biological testing).
And you also might restrict certain entities from performing the research and instead have them compensated for their collected data by a reputable research group. Said group can produce the hard/applied research and patents, and license them to groups to use.
Maybe this would also undo some of the effects of outsourcing/offshoring US coding practices due to the need for ethical compliance (at the very least in mission critical systems e.g. vehicle software, hospital software, etc)
Awesome idea. The same governments who are sending RFPs to burtonator's company will then have the ability to decide who is even allowed to work in the industry.
Do you want to go back to the 1970s? Because this is how you get back to the 1970s, when only a rarefied priesthood had access to computing power.
Don't expect to accomplish that without encountering strong opposition.
No they can't. The closest is they can steal (copy/paste) portions of other people's applications, change some text, and attempt to take credit for them.
Also, we do have the (private) agency that you suggested. Developer certification programs exist for nearly every language and platform. They aren't very popular. Red Hat will certify you as a Java developer for JBoss, Oracle will certify you as a Java developer as well. If you can't write code, but still want to feel geeky and have a career as a waiter, you can go get certified as an "ethical hacker" too.
There is no shortage of agency - employers and consumers simply do not care, and by that, I mean they do not want to pay (higher prices) for it.
PS: You can go be an amateur electrician or plumber today, without breaking any laws. However, if you'd like to touch private/public power or water/waste infrastructure, then you need endorsement. It's not illegal for you to hire me (a total amateur at those two trades) to both wire and plumb the new house you are building. The awkward moment will be when the water and power companies refuse to connect your house, because the work I did was not the work of a licensed electrician or plumber. I understand the spirit of your analogy, but it doesn't translate as well to IT. For that analogy to work, private/public internet infrastructure would have to refuse to inter-operate with your software if you didn't meet their licensing/certification standard.
Second, I've been fighting for regulations for over three years and I'm getting somewhere, but I'm also starting to think that we need technical solutions to many of these problems. One of the things I see as a problem is that people always want more, but privacy and security often require less.
For example, old charsets only supported Latin characters. With the introduction of Cyrillic characters many assumptions started breaking.
Each time I try to think through how to make the web / internet simpler I realize that it either requires pushing the complexity onto people unprepared to deal with it—an English speaking daughter may want to copy and paste her Russian mother's Cyrillic name, say—or it fails to handle the use cases we need it to handle.
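A small illustration of the charset point, using Python's built-in codecs. The name is a made-up example, and the fits helper is hypothetical. It shows that a Latin-only charset simply cannot hold the mother's name, and that even with UTF-8 the old "one character equals one byte" assumption stops being true.

```python
def fits(text: str, charset: str) -> bool:
    """Return True if every character of `text` can be encoded in `charset`."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


name = "Ольга"  # hypothetical Cyrillic name pasted into a form

print(fits(name, "ascii"))       # False: ASCII has no Cyrillic letters
print(fits(name, "latin-1"))     # False: neither does ISO 8859-1
print(fits(name, "utf-8"))       # True

# Code that sized a field in bytes under the old assumption now gets it wrong:
print(len(name))                  # 5 characters
print(len(name.encode("utf-8")))  # 10 bytes
```

The simpler system (Latin only) is easier to reason about and to secure, but it fails the daughter in the example; the richer one handles her case and pushes the byte/character distinction onto every developer downstream.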
I know this seems kinda abstract, but do you ever think about that interplay? Are there any insights or anecdotes you find useful?
And you always will. Principles and money are often at odds. This is not an easily fixable problem, as the well-intended solutions often cause more problems. Enforcing existing statutes and accepting legal-yet-unethical practices is unfortunately the most rational approach.
> We just can't compete with data at that type of fidelity.
Only in a situation where interminable growth is required in a race to the top. Otherwise, there's room for everyone and that's why there are thousands of software products that "compete" just fine. The key is just making sure there are platforms that allow everyone to build everything.
The key is that they choke you out... They have more engineers, more R&D and a better product because they have more revenue.
OpenBSD is yet another example of a small(ish) team of people making some truly great software. On the Windows side, Fookes Software comes to mind: again a small operation, great software.
Now as a manager who has to hire I find it pretty straightforward finding passionate people to work with me simply because the work is compelling.
It’s a matter of getting your story out as the ethical data mining company, or something :) You’ll find like-minded clients and employees do exist and that being ethical can be a competitive advantage too.
That said, I have no issue with anyone who voluntarily trades their personal information for access to a service. That's their choice to make. But it also seems reasonable that there be full disclosure as to the scope and scale of the deal they're making so they can make an informed choice. This isn't even remotely the case today.
I appreciate all your work and contributions to the community in general. Please keep your chin up and keep moving forward.
The first line of defense against this is, and always will be, not publishing so much information about yourself. They cannot mine what you do not provide.
We can argue all day about the ethics of the conduct of companies and states looking at these data, but it's a non-issue if the data simply don't exist.
And it will only get worse, as more data will be shared by other people. Soon the streets may be full of people using cameras all day long, just because it allows them to make cooler diaries or blogs, and then the data companies will get universal surveillance.
The only way to keep privacy will be to have your face covered in public (but hey, that will be made illegal, because terrorism or something), or avoid public places completely.