Point being: improvements in AI / machine learning and other statistical techniques are huge, and analytical demand exists, but 'old corporations' need cultural change before they can start working with and embedding startup-like technology in their workflows.
The trick is to get the higher-ups interested - they have the power to redirect IT, and are inherently (if laggingly) interested in making employees as effective as possible.
Funnily enough, a fluff-piece in Bloomberg is a pretty good way of achieving it. Tomorrow, some manager will pop into a quant's office with this article in his hand and ask "Are you doing this? Why not?" and that's how the ball gets rolling on opening up.
Once upon a time, getting remote access to email was a "this will never happen", then an executive saw someone with a BlackBerry at the golf club. Once upon a time, getting email access on anything other than a BlackBerry was a "this will never happen", then an executive saw someone with an iPhone.
IT evolved to be able to provision systems and roll out approved software. Maybe there's a development team or two to develop in-house extensions or software (though always viewed as a cost center).
So the department that the organization's rules funnel you through is the very department incapable of fulfilling this need.
In practice, this leads to a ton of business-specialty-plus-devops hybrids: anyone who can hack Python gets drafted in order to Get Stuff Done.
(For those of you that don't deal with 1,000+ headcount organizations, be thankful ;) )
In my experience, the answer is usually "never."
Corporate IT exists to fulfill requirements from above, not to explore how things can be done better.
For many types of information, real-time calculations and dashboarding instead of monthly calculations and powerpointing would be preferable, but the same cultural issues apply. Moving from reporting to dashboarding is disruptive for people and processes. And since big businesses tend to create value for the customer via standardization (up to a point), there is a clear reason for the dislike of disruption too.
AWS was (and is) critical to our velocity - unlike every other team in the company, we don't wait 6 weeks to start working on something while IT eventually gets around to writing a hardware purchase order and racking the gear. We don't compete for "the staging environment" because we can have as many environments as we want.
This is a ludicrous idea. My two main rebuttals are:
- AWS takes in more revenue than our company, and they would destroy that business if it ever got out that their engineers were snooping in customer data
- Netflix is a much more direct competitor than we are (they compete with Amazon Video) and also a much heavier user of (and cheerleader for, even) AWS
In some industries that can be necessary if you want to stick to the letter of the law, especially here in Europe.
You can outsource the processing, but then your provider has to meet fairly stringent certifications, your privacy policy needs the necessary "submitting to third parties for purpose-specific processing" disclaimers, and you need a signed data processing agreement covering personal data that is compatible with national laws, which vary between countries.
The Microsoft Ireland email case also led some data protection ombudsmen to suggest that the whole Safe Harbor deal with US companies should be considered null and void, because those companies cannot effectively guarantee the safety of your data, even more so given the threat of national security letters plus gag orders. And without that deal you can't use their cloud solutions for sensitive data.
Personally identifying data faces pretty high hurdles. Medical data (which might be processed by insurers?) raises the bar even higher.
What about traditional VPSes? Managed dedicated servers? Ordinary dedicated servers? Leased servers colocated in a 3rd party datacenter? Owned servers colocated in a 3rd party datacenter? What if you use their "remote hands" service?
Is there a meaningful difference between your leased office floor and your leased datacenter space? I would argue not. Do you have to own the land? What about leased vs. owned servers if they're in your office?
Even if you have outright owned servers on outright owned land (unlikely b/c mortgage, but whatever), how are they managed?
In many small businesses (including medical practices), they're managed by a "small business IT consulting" company which has both remote and regular on-premise access. You're trusting them to the same extent you would be trusting AWS. Is there a meaningful difference?
What about the proprietary software you run? A large Electronic Medical Records package which probably contains extremely sensitive data about you is technically deployed onsite, but deployment and administration are performed remotely by the vendor's team in India. (I worked for a small biz IT consulting firm that supported the underlying hardware/Windows for one of these installations.)
What about the Windows and other proprietary software with internet access that's literally everywhere?
You may think you've achieved a morally superior position of trusting no one but yourself when you forego The Cloud, but I argue that's generally not the case except in the most extremely self-reliant cases (no vendor support, no contractors, no proprietary software). Those cases are not at all representative of the average business which refuses to go AWS.
There is no bright line. It's more a matter of legal compliance: death by a thousand cuts, or a thousand checkboxes.
If I rent an office floor and run my own datacenter, then I'm legally responsible for all of it. There is no need for a data processing agreement because I'm not having my data processed by a different legal entity.
If I colocate in a datacenter but the servers are my property, then I'm still legally responsible and only have to show that the servers are physically protected from unauthorized access (locked cage and such).
If I buy a managed server, things already get dicier: a 3rd party now has access to the data. But at least you can find a hosting provider in your own country, which removes a lot of headaches because it's fairly easy for them to comply with local laws. You sign a data processing agreement, they secure their datacenter (physically, organizationally, and on the network side), and you're done. In case of a data breach you might have to hash out responsibilities.
Cloud services, on the other hand, easily span continents, are run by foreign companies, and straddle multiple legal systems. The machines they offer are also far more ephemeral and more tightly integrated into a shared management architecture, which makes it much more difficult to audit and demonstrate isolation of your machines.
If your system gets audited, then your auditor will either have to inspect the system himself or will require certifications showing that your providers adhere to specific legal standards. Having a large, international, tightly interwoven provider makes those certifications more difficult, because the provider has to comply with multiple, possibly conflicting legal systems. If they cannot offer those certifications, then you cannot use their services, because your auditor will reject them.
Actually I lied. What I really love is when one customer threatens to walk if they have to buy a $3000 machine to run our new version, and the whole team spends 1-2 months doing nothing but trying to keep that account. When we could just gift them a server and save ourselves a mountain of grief and a ton of development resources for, you know, landing new accounts.
If someone could start teaching basic math in business school, that would make me so happy.
"Customers may use any AWS service in an account designated as a HIPAA account, but they should only process, store and transmit PHI in the HIPAA-eligible services defined in the BAA. There are six HIPAA-eligible services today, including Amazon EC2, Amazon EBS, Amazon S3, Amazon Redshift, Amazon Glacier, and Amazon Elastic Load Balancer."
Notably, I don't see Elastic MapReduce. It seems that if you wanted to run heavy data crunching, you'd have to run your own cluster on EC2.
In both cases the projects were controlled by marketing departments who were much more concerned with cost per customer than with maintaining total control over every aspect of the system.
Do you think this could be solved with stacks of Parallella boards?
Just curious, because that's kind of like the target market for the boards, right? Cheap, multi-core processing?
Maybe in a few years we'll see whether the Parallella paradigm takes off, and then maybe there will be more quants out there capable of using such boards.
Because it's so much more difficult to write "high performance parallel simulations", we want to make it so you can write whatever code you want, and let the platform figure out how to make it fast, parallel, and distributed.
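In its simplest form, that "write plain code, let the platform parallelize it" idea boils down to a transparent parallel map. A toy sketch in Python, where a thread pool stands in for the cluster and the function names are illustrative, not from any real platform:

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

def simulate_chunk(seed, n_samples):
    """User-written code: a plain Monte Carlo pi estimate, no parallelism in sight."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n_samples)
               if rng.random() ** 2 + rng.random() ** 2 <= 1.0)

def platform_run(fn, seeds, n_samples, workers=4):
    """The 'platform' side: fan the user's function out over a worker pool.
    A real system would schedule across machines, not threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda s: fn(s, n_samples), seeds))

seeds = [1, 2, 3, 4]
n = 100_000
hits = platform_run(simulate_chunk, seeds, n)
pi_estimate = 4.0 * sum(hits) / (len(seeds) * n)
```

The point of the sketch is the division of labor: `simulate_chunk` knows nothing about workers, and the platform layer can change the execution strategy (threads, processes, remote machines) without touching user code.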
After that, stealing cycles from co-workers machines was relegated to non-business hours.
But they might, since RenderMan supports it: http://renderman.pixar.com/resources/current/tractor/tractor...
Maybe because it doesn't make sense to copy a lot of model/texture data to/from a slow laptop for processing.
So the person on the front desk had a dual socket, multi-GB RAM monster of a machine, but only used it for light email/room bookings/contact management. But when render jobs were running, the PC was under 100% load.
I'm not sure it'd work so well with laptops due to the battery life issue, as render jobs are presumably running 24/7. A quick search hasn't turned anything up though...
IPSec all traffic, including internal traffic between instances? Encrypt all volumes with keys not written anywhere within the cloud? That should do it, no?
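A sketch of the volume-encryption half of that, assuming Linux with LUKS and a key fetched at boot from a key server you control outside the cloud (the URL and device paths are hypothetical):

```shell
# Fetch the volume key at boot from an on-premise key server;
# the key is never written to cloud storage. (URL is hypothetical.)
KEY=$(curl -fsS https://keys.example.internal/volume-key)

# Create and open a LUKS-encrypted volume on the attached block device,
# passing the key on stdin rather than via a key file on disk.
echo -n "$KEY" | cryptsetup luksFormat /dev/xvdf -
echo -n "$KEY" | cryptsetup luksOpen /dev/xvdf data --key-file=-

mkfs.ext4 /dev/mapper/data
mount /dev/mapper/data /mnt/data
unset KEY
```

The instance-to-instance IPsec half would be a separate mesh of tunnels (e.g. strongSwan). The residual gap in both cases is that keys and plaintext still exist in instance memory while in use, so this raises the bar rather than eliminating trust in the provider.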
Every other quant developer/trader I've met works at a fund or sell-side firm.
Market data is via a consolidated feed and execution via my broker to a variety of venues. I don't run any high frequency strategies so 100ms from exchange data to execution is fine for me.
I use R for most research and strategy development.
Also, are you able to detail where you source your market data from?
Sounds like a really interesting setup; I'd love to know more about it.
There are many more strategies that will generate consistent returns on a few million dollars than there are that will generate the same return on a hundred million dollars.
This is a key advantage of being an individual and a problem every quant fund faces as they grow.
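A toy illustration of why, with entirely made-up numbers: model net return as a fixed gross alpha minus a market-impact cost that grows with the square root of your share of daily traded volume (a common stylized impact model).

```python
def net_return(aum, gross_alpha=0.20, adv=50e6, impact_coeff=0.15):
    """Toy capacity model (all parameter values invented for illustration):
    impact cost grows with the square root of AUM's share of the
    strategy's tradeable average daily volume (adv), eroding alpha."""
    participation = aum / adv
    impact = impact_coeff * participation ** 0.5
    return gross_alpha - impact

small = net_return(3e6)      # a few million under management
large = net_return(100e6)    # a hundred million under management
```

With these numbers the $3M book nets roughly 16% while the $100M book is underwater: the strategy hasn't changed, but its capacity has been exhausted. That's the individual's edge in a formula.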