Hacker News
The $10 Hedge Fund Supercomputer That’s Sweeping Wall Street (bloomberg.com)
107 points by gk1 on May 20, 2015 | 73 comments

The gap that firms are trying to bridge is _huge_. I work in finance (slow finance: insurance), but finance technical enough to have our own PhD-level quants building code. They are running simulations that take hours, so scaling up the number of cores would seem the first step to take. But it's not that the quants don't know parallelization, it's the IT-architecture not allowing to use AWS at all. And, well, many cores on your own premises are hugely expensive because they are idle most of the time. As an aside, quants are rarely the first people you'd pick to woo the IT architects' support. (This is probably easier for firms that trade for their own profit.)

Point being: improvements in AI / machine learning and other statistical techniques are huge, and analytical demand exists, but 'old corporations' need cultural change before they can start working with and embedding startup-like technology in their workflows.

IT shops are inherently conservative, that's their job.

The trick is to get the higher-ups interested - they have the power to redirect IT, and are inherently (if laggingly) interested in making employees as effective as possible.

Funnily enough, a fluff-piece in Bloomberg is a pretty good way of achieving it. Tomorrow, some manager will pop into a quant's office with this article in his hand and ask "Are you doing this? Why not?" and that's how the ball gets rolling on opening up.

Once upon a time, getting remote access to email was a "this will never happen", then an executive saw someone with a BlackBerry at the golf club. Once upon a time, getting email access on anything other than a BlackBerry was a "this will never happen", then an executive saw someone with an iPhone.

Confirmed. I'm in insurance too - variable annuities specifically - and it's the most perfect problem for parallelization. In practice everyone maintains their own in-house grid (which is massively expensive and idle 90% of the time), and the technical staff lacks any serious chops to tackle GPUs. In practice the only way we've found to solve this is to go around corporate IT, so we do things like buying GPU cards and jamming them into boxes ourselves.

As someone related to the field, I see a fundamental lack of quality in in-house IT when faced with problems of this magnitude. Which isn't unreasonable.

IT evolved to be able to provision systems and roll out approved software. Maybe there's a development team or two to develop in-house extensions or software (though always viewed as a cost center).

Therefore, the department the organization's rules are set up to funnel you through is simultaneously incapable of fulfilling this need.

In practice, this turns anyone who can hack Python into a business-specialist + DevOps hybrid in order to Get Stuff Done.

(For those of you that don't deal with 1,000+ headcount organizations, be thankful ;) )

I think the big thing at the finance companies I've been at is that corporate IT was created to enable people to use Microsoft Office applications. There is no real provision for how to deal with developers writing code, and forget about hardware.

Exactly, I think the telling question to any corporate IT team (not the head of all corporate IT) should be "When did you last identify, trial, and then make available to users a software package / process / cloud service without first receiving a request from management that you do so?"

In my experience, the answer is usually never.

Corporate IT exists to fulfill requirements from above, not to explore how things can be done better.

Sorry I'm very unfamiliar with this - what about variable annuities requires GPU power?

Variable annuities generally have guarantees, which are essentially just embedded options. To price and analyze these options you have to run large-scale Monte Carlo simulations. Many shops have >100K policies and are running 30 year Monte Carlo projections. You can imagine the computational power required, and the inherent parallelism of the solution.
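
To make the "embedded option" point concrete, here is a minimal sketch of pricing one policy's maturity guarantee by Monte Carlo. The model is an assumption for illustration only (risk-neutral geometric Brownian motion with annual steps, no fees, lapses or mortality), and `price_guarantee`, its parameters and the sample policies are hypothetical names, not anything a real shop necessarily uses.

```python
import math
import random


def price_guarantee(account_value, guarantee, years=30, n_paths=2000,
                    rate=0.03, sigma=0.2, seed=0):
    """Monte Carlo value of a maturity guarantee: max(guarantee - A_T, 0).

    Assumed model: the account follows risk-neutral geometric Brownian
    motion with annual steps; fees, lapses and mortality are ignored.
    """
    rng = random.Random(seed)
    drift = rate - 0.5 * sigma ** 2  # per-year drift of the log account value
    total_payoff = 0.0
    for _ in range(n_paths):
        log_growth = 0.0
        for _ in range(years):
            log_growth += drift + sigma * rng.gauss(0.0, 1.0)
        final_value = account_value * math.exp(log_growth)
        total_payoff += max(guarantee - final_value, 0.0)
    discount = math.exp(-rate * years)
    return discount * total_payoff / n_paths


# Each policy is priced independently of every other policy, and each path
# is independent of every other path, so both loops can be split across
# cores or GPU threads.
policies = [(100_000.0, 100_000.0), (50_000.0, 75_000.0)]
values = [price_guarantee(av, g, seed=i) for i, (av, g) in enumerate(policies)]
```

Scale the two hypothetical policies up to 100K+ and the path count up to something production-sized, and the number of independent units of work makes the parallelism obvious.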

Can you give more information on this? It actually sounds like an easy problem to solve to me.

Oh sure, it's completely solvable. In fact some companies have solved it, but most lack the expertise. E-mail me (address in profile) if you have any further questions.

Why do you guys keep it idle 90% of the time? Can't you run finer-grained sims on it all the time, or nearly all the time?

There are lots of issues here. One is simply that people don't submit jobs during non-working hours, and most jobs don't take more than 4-8 hours, so unless you have a pile-up of jobs during working hours, the grid tends to sit idle overnight and on weekends. Another problem is that you plan your grid capacity for peak periods (and add a margin for future growth), so it will generally be underutilized. Another issue is that corporations create redundant grids for business continuation purposes, so often you'll have full grids that are hardly utilized at all.
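
A back-of-the-envelope version of that arithmetic, as a sketch. The inputs are assumed for illustration (jobs running 8 hours on 5 weekdays, using half of a grid sized for peak), and `grid_utilization` is a hypothetical helper.

```python
HOURS_PER_WEEK = 7 * 24


def grid_utilization(busy_hours_per_week, avg_load_when_busy, n_grids=1):
    """Fraction of available core-hours actually used.

    avg_load_when_busy is the share of peak capacity in use during busy
    hours; n_grids counts identical grids, with only one carrying the load.
    """
    used = busy_hours_per_week * avg_load_when_busy
    available = HOURS_PER_WEEK * n_grids
    return used / available


# Assumed numbers, for illustration only: jobs run 8 hours on 5 weekdays
# and use half of a grid that was sized for peak demand.
single = grid_utilization(busy_hours_per_week=40, avg_load_when_busy=0.5)
with_standby = grid_utilization(40, 0.5, n_grids=2)
print(f"one grid: {single:.1%}, with idle standby grid: {with_standby:.1%}")
```

With these assumptions a single grid is used about 12% of the time, and adding a standby grid for business continuity halves that again, which is roughly where the "idle 90% of the time" figure comes from.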

Reporting tasks are done no more than monthly. Once the job takes longer, management often moves to quarterly or semi-annually especially if the risks involved in not-knowing are limited.

For many types of information, real-time calculations and dashboarding instead of monthly calculations and powerpointing would be preferable, but the same cultural issues apply. Moving from reporting to dashboarding is disruptive for people and processes. And since big businesses tend to create value for the customer via standardization (up to a point), there is a clear reason for the dislike of disruption too.

This is pretty common outside finance as well. "We can't let our data touch any one else's network!" And, of course, we can't bother to get our IT group on the smarts wagon, so we're massively inefficient all the way around.

Yeah. The company I work for was acquired by a big player in the industry, and one of the first things they wanted to do was move us off AWS, because the COO sees Amazon as a competitor and thinks that they'll go snooping in our databases to get a competitive advantage [1]. We managed to stave it off by coming up with a vague plan for how we'd move critical infrastructure to hardware, then shelving it indefinitely.

AWS was (and is) critical to our velocity - unlike every other team in the company, we don't wait 6 weeks to start working on something while IT eventually gets around to writing a hardware purchase order and racking the gear. We don't compete for "the staging environment" because we can have as many environments as we want.

[1] This is a ludicrous idea. My two main rebuttals are:

- AWS takes in more revenue than our company, and they would destroy that business if it ever got out that their engineers were snooping in customer data

- Netflix is a much more direct competitor than we are (they compete with Amazon Video) and also a much heavier user of (and cheerleader for, even) AWS

It might be time for tinfoil hats, but maybe Amazon only competes with Netflix because Netflix demonstrated how awesome Amazon's servers work for video delivery?

Netflix doesn't use AWS for the actual video delivery.

Oh no, never engineers. Always maintain a layer of plausible deniability; those engineers might turn coat and rat the organisation out, you see.

> "We can't let our data touch any one else's network!"

In some industries that can be necessary if you want to stick to the letter of the law, especially here in Europe.

You can outsource the processing, but then your provider has to meet fairly stringent certifications, and you need the necessary "submitting to 3rd parties for purpose-specific processing" disclaimers in your privacy policies plus a signed data processing agreement covering personal data that is compatible with national laws, which vary between countries.

The Microsoft Ireland email case[1] also made some data protection ombudsmen suggest that the whole safe harbor deal with US companies should be considered null and void because they cannot effectively guarantee the safety of your data, even more so given the threat of national security letters + gag orders. And without that deal you can't use their cloud solutions for sensitive data.

Personally identifying data has pretty high hurdles. Medical data (which might be processed by insurers?) raises even higher bars.

Where is the line? Everyone seems to think "cloud" services threaten user privacy, but "cloud" is a marketing term.

What about traditional VPSes? Managed dedicated servers? Ordinary dedicated servers? Leased servers colocated in a 3rd party datacenter? Owned servers colocated in a 3rd party datacenter? What if you use their "remote hands" service?

Is there a meaningful difference between your leased office floor and your leased datacenter space? I would argue not. Do you have to own the land? What about leased vs. owned servers if they're in your office?

Even if you have outright owned servers on outright owned land (unlikely b/c mortgage, but whatever), how are they managed?

In many small businesses (including medical practices), they're managed by a "small business IT consulting" company which has both remote and regular on-premise access. You're trusting them to the same extent you would be trusting AWS. Is there a meaningful difference?

What about the proprietary software you run? A large Electronic Medical Records package which probably contains extremely sensitive data about you is technically deployed onsite, but deployment and administration are performed remotely by the vendor's team in India. (I worked for a small biz IT consulting firm that supported the underlying hardware/Windows for one of these installations.)

What about the Windows and other proprietary software with internet access that's literally everywhere?

You may think you've achieved a morally superior position of trusting no one but yourself when you forgo The Cloud, but I argue that's generally not the case except in the most extremely self-reliant cases (no vendor support, no contractors, no proprietary software). Those cases are not at all representative of the average business which refuses to go AWS.

> Where is the line?

There is no bright white line. It's more a thing of legal compliance. A death by a thousand cuts/thousand checkboxes.

If I rent an office floor and run my own datacenter then I'm legally responsible for all of it. There is no need for a data processing agreement because I'm not having my data processed by a different legal entity.

If I colocate in a datacenter but the servers are my property then I'm still legally responsible and only have to show that the servers are physically protected from unauthorized access (locked cage and such).

If I buy a managed server things already get dicier. It means a 3rd party has access to the data. But at least you can find a hoster in your own country which removes a lot of headaches because it's fairly easy for them to comply with local laws. You sign a data processing agreement, they secure their datacenter (physically/organizational/networking) and you're done. In case of a data breach you might have to hash out responsibilities.

Cloud services on the other hand easily span continents, are run by foreign companies and span multiple legal systems and the machines they offer are a lot more ephemeral and more tightly integrated into a shared management architecture, which makes it much more difficult to audit and demonstrate isolation of your machines.

If your system gets audited then either your auditor will have to inspect the system himself or will require certifications of your providers adhering to specific legal standards. Having a large, international, tightly interwoven provider makes those certifications more difficult because they have to comply with multiple, possibly conflicting legal systems. If they cannot offer those certifications then you cannot use their services because your auditor will reject them.

In fact AWS will sign a HIPAA business associate agreement for medical data (http://aws.amazon.com/compliance/hipaa-compliance/).

They'll even help you with ITAR controls too: http://aws.amazon.com/govcloud-us/faqs/

I really love it when I have to head up a project that costs the company $100-250,000 in developer time and the only reason we have to do it is that we can't get the business to buy us another $10,000 in hardware.

Actually I lied. What I really love is when one customer threatens to walk if they have to buy a $3000 machine to run our new version, and the whole team spends 1-2 months doing nothing but trying to keep that account. When we could just gift them a server and save ourselves a mountain of grief and a ton of development resources for, you know, landing new accounts.

If someone could start teaching basic math in business school, that would make me so happy.

I'm playing devil's advocate a little: There are sometimes legitimate legal hurdles to having your data hit external networks. For example, if you're a medical insurance company, then you have to get everything HIPAA legal.

Looks like AWS is HIPAA compliant at least for some services:

"Customers may use any AWS service in an account designated as a HIPAA account, but they should only process, store and transmit PHI in the HIPAA-eligible services defined in the BAA. There are six HIPAA-eligible services today, including Amazon EC2, Amazon EBS, Amazon S3, Amazon Redshift, Amazon Glacier, and Amazon Elastic Load Balancer."


Notably, I don't see Elastic MapReduce. It seems that if you wanted to run heavy data crunching, you'd have to run your own cluster on EC2.

I've been on projects where we used Rackspace cloud servers or Amazon EC2 for HIPAA-compliant insurance enrollment systems. There was extra paperwork and we did a bunch of security testing, but it wasn't too bad.

In both cases the projects were controlled by marketing departments who were much more concerned with cost per customer than with maintaining total control over every aspect of the system.

Sure, sure, and some of our data is even sensitive in that way (cough, HR, cough). But I'm sure Amazon or someone else can handle that kind of requirement. We can't even open the discussion, though.

Of course, why have them pay a couple thousand dollars for an AWS installation when they can buy a "corporate solution" for "3 zeroes more"

One totally uninformed question:

Do you think this could be solved with stacks of Parallella boards?

Just curious, because that's kind of like the target market for the boards, right? Cheap, multi-core processing?

Technically: sure. But I think you are touching some different skill sets going on a continuum from (1) 'sequential simulations on regular machine' to (2) 'parallel simulation on very high end, regular machine', all the way to (3) 'massively parallel on non-regular machine'. From (1) to (2) you can manage with existing staff and limited time. From (2) to (3) is quite the change in mindset, training et cetera.
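
The step from (1) to (2) can be fairly small if the simulation is already split into independently seeded chunks. A minimal sketch, assuming a toy one-step payoff model; `simulate_chunk`, `estimate` and `estimate_on_local_cores` are hypothetical names, and only `concurrent.futures.ProcessPoolExecutor` comes from the standard library.

```python
import math
import random
from concurrent.futures import ProcessPoolExecutor


def simulate_chunk(job):
    """Run n_paths independent paths with their own seed; return the payoff sum."""
    seed, n_paths = job
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        # Toy model (an assumption): lognormal outcome, put-style payoff.
        outcome = math.exp(rng.gauss(0.0, 0.5))
        total += max(1.0 - outcome, 0.0)
    return total


def estimate(n_chunks=8, paths_per_chunk=2000, map_fn=map):
    """Average payoff over all chunks; map_fn decides where the chunks run."""
    jobs = [(seed, paths_per_chunk) for seed in range(n_chunks)]
    totals = list(map_fn(simulate_chunk, jobs))
    return sum(totals) / (n_chunks * paths_per_chunk)


def estimate_on_local_cores(n_chunks=8, paths_per_chunk=2000):
    """Stage (2): the same chunks, spread over the cores of one machine."""
    with ProcessPoolExecutor() as pool:
        return estimate(n_chunks, paths_per_chunk, map_fn=pool.map)


sequential = estimate()  # stage (1): built-in map on a single core
```

Because every chunk carries its own seed, the sequential and pooled runs produce the same estimate. When running this as a script, `estimate_on_local_cores()` should be called under an `if __name__ == "__main__":` guard. Stage (3) is where this structure stops being enough: GPUs and many-core boards generally need the inner loop itself rewritten.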

I guess that makes sense. It does seem like a more exotic solution. Especially considering the amount of different skills that whoever is building such system would need to have.

Maybe in some years we'll see if the Parallella paradigm takes off and then maybe there will be more quants out there capable of using such boards.

That's one of the problems we're trying to solve with Ufora.

Because it's so much more difficult to write "high performance parallel simulations", we want to make it so you can write whatever code you want, and let the platform figure out how to make it fast, parallel, and distributed.

I'd look at checking if a GPU is a good match for the problem before thinking of even more exotic solutions.

So could you take a page from Pixar or SETI and use company laptops while they're idle? Even if it was just on the quants' laptops, it could help out. There would be no PO for the hardware. Shouldn't take too long to set up a simple system, but that could be the over-optimistic American in me.

Where does Pixar say they use company laptops while they are idle?

Chase did this way back in the late '90s in the derivatives department. One day a number of traders and salespeople were complaining that their machines were all running slow. It turned out this was because the swaptions and exotics desks were running Monte Carlo simulations on everyone's machines.

After that, stealing cycles from co-workers' machines was relegated to non-business hours.
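
The "non-business hours only" rule is easy to express as a gate that a desktop worker checks before accepting a job. A minimal sketch, assuming an 08:00 to 18:00 weekday policy; `may_borrow_cycles` and its defaults are hypothetical, not how Chase or any grid product actually implements it.

```python
from datetime import datetime, time


def may_borrow_cycles(now, day_start=time(8, 0), day_end=time(18, 0)):
    """Return True when a desktop may accept simulation jobs.

    Assumed policy: weekends are always allowed; on weekdays only the
    hours outside [day_start, day_end) are allowed.
    """
    if now.weekday() >= 5:  # Saturday is 5, Sunday is 6
        return True
    return not (day_start <= now.time() < day_end)


# A worker process would call this before pulling the next job, e.g.:
accepting_jobs = may_borrow_cycles(datetime.now())
```

A real scheduler would presumably also watch for keyboard activity and CPU load rather than rely on the clock alone.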

It's pretty common now. I think both DataSynapse and IBM Platform Symphony let you do this.

It's pretty common for 3D/rendering companies to use company desktops at night or when they are idle for rendering, and most renderfarm managers support this kind of setup. However, this is only worth it because the artists already have powerful workstations to do their job. I doubt using underpowered laptops would make sense.

I've not read about Pixar doing this, but I found it interesting that IKEA does: http://www.cgsociety.org/index.php/CGSFeatures/CGSFeatureSpe...

Yea, I can't find Pixar doing this. I thought I remembered an article from years ago. Other companies did it as well.


But they might since RenderMan supports it, http://renderman.pixar.com/resources/current/tractor/tractor...

> Yea, I can't find Pixar doing this

Maybe because it doesn't make sense to copy a lot of model/texture data to/from a slow laptop for processing.

The data is small compared to the processing required. Laptops aren't that slow. They are as fast as the fastest "servers" from several years ago. Finally, most employees probably don't have laptops; OP misspoke.

I remember reading years ago that every machine in the building was linked into the render farm.

So the person on the front desk had a dual socket, multi-GB RAM monster of a machine, but only used it for light email/room bookings/contact management. But when render jobs were running, the PC was under 100% load.

I'm not sure it'd work so well with laptops due to the battery life issue, as render jobs are presumably running 24/7. A quick search hasn't turned anything up though...

It seems like it'd be a lot more cost-effective to have a dedicated box in a place with proper power / network / cooling and give reception an iPad or MacBook Air, but I think in general the idea of using idle CPUs is cool.

> IT-architecture not allowing to use AWS at all

IPSec all traffic, including internal traffic between instances? Encrypt all volumes with keys not written anywhere within the cloud? That should do it, no?

Unrelated: Hi Florin, I think we know each other from RLUG. I used to come by your place in Drumul Taberei to do Linux stuff. If you want to keep in touch, my address is in my profile. All the best!

If most of your cores are idle most of the time, you need to be doing things like searching bigger state spaces.

Hi guys, I'm a founder at Domino Data Lab (one of the other companies mentioned in the article). I'm happy to answer questions anyone has. I can also corroborate the opinions in the thread here: many of our customers are in finance and insurance, and many of them don't want data leaving their networks at all (hence the need for an on-prem offering).

Is the on-prem offering a combination of "OSS private cloud" and a proprietary analytics / machine learning API?

We don't offer a proprietary ML library or API -- we run arbitrary user code, whatever languages and packages you want to use (Python, R, h2o, dato, compiled C++, etc.). The infrastructure piece is like a private cloud, yes.

I've always been confused as to what takes a long time to calculate in finance. What are the common difficult problems that aren't getting solved?

I've met some of the guys from Ufora and seen a demo of their technology; very impressive. I've always been leery of the claim that you can create smart software to make the problems of writing parallel code magically disappear, but what they've done has made me rethink some beliefs related to working with big data at scale. Definitely a company to keep an eye on.

Glad to hear you like it! We think it's pretty cool ourselves :-)

Sensai, which is mentioned in the article, is hiring Java AI/NLP folks in Palo Alto (http://sensai.workable.com/jobs/19235). The CTO, Monica Anderson, also runs the Bay Area AI meetup group.

PM @ Ufora here. Happy to answer any questions you might have!

Is it common to use prop trading strategies for trading a personal portfolio? Or do quants almost exclusively work at hedge funds?

I've never met a quant trader trading for their own account. It requires some infrastructure that would be hard to justify unless you have lots of money to invest, and you're really good at it.

It is common for successful quant traders to be partners (have equity) in their firms, which means they are "trading for their own account" to some extent.

I think what he meant by "for their own account" was something smaller and personal. Not part of a larger pool containing hundreds of millions of dollars.

Definitely not common, but I run my own quant strategies for my own portfolio on a fully automated basis and have done so for several years.

Every other quant developer/trader I've met works at a fund or sell-side firm.

Would you mind detailing the platform/portal that you use?

Not sure what you mean by portal, but I wrote my own execution system that I run on a server located close to my broker and the exchanges.

Market data is via a consolidated feed and execution via my broker to a variety of venues. I don't run any high frequency strategies so 100ms from exchange data to execution is fine for me.

I use R for most research and strategy development.

Ah sorry, I was interested in what broker you use for market connectivity.

Also, are you able to detail where you source your market data from?

Sounds like a really interesting setup I'd love to know more about it.

I use IQFeed[1] for data. Despite their website, I've always found their product and developer support to be excellent given the price point.

[1] https://www.iqfeed.net/

Who do you use for execution? I wish I could use Interactive Brokers for server-side but, I think, that requires 100k and FIX connection.

If you're making any consistent profits these days with such a primitive setup, you probably have some pretty fantastic insights that you could be selling to a hedge fund or bank.

The issue is strategy capacity.

There are many more strategies that will generate consistent returns on a few million dollars than there are that will generate the same return on a hundred million dollars.

This is a key advantage of being an individual and a problem every quant fund faces as they grow.
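
A toy illustration of the capacity effect, under a deliberately crude assumption: the strategy earns a fixed return on up to a fixed amount of capital and nothing on anything beyond that. The 20% return and $5M capacity are made-up numbers, and `blended_return` is a hypothetical helper.

```python
def blended_return(capital, capacity, strategy_return):
    """Return on total capital when only `capacity` dollars can be deployed.

    Simplifying assumption: capital beyond the strategy's capacity earns zero.
    """
    deployed = min(capital, capacity)
    return strategy_return * deployed / capital


# Made-up strategy: 20% a year on at most $5M of capital.
for capital in (2e6, 20e6, 100e6):
    print(f"${capital:,.0f}: {blended_return(capital, 5e6, 0.20):.1%}")
```

Under these assumptions the same strategy is worth 20% to an individual with $2M, 5% to a $20M fund and 1% to a $100M fund, which is why it can be attractive for a personal account and uninteresting to a large firm.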

Like the other commenter, I'd be interested in learning more about your setup for your data feeds too!

LTCM wasn't a warning, it was prophecy.

A prophecy which became manifest a few years back.

2006 called, they want their technology back.
