These are obviously leading questions, but:
- Are there significant research advantages to super-fast turnaround (less than 15 minutes, enabled by massive parallelism) in this domain?
- Do you feel like a massively parallel system with Xeon Phi nodes is a good match for this problem? Or did the code get optimized to run at high scale on Cori Phase II because that's where you were given compute resources?
Finally, a bit less provocative:
Does this approach effectively scale down to e.g. a university that can afford a large storage array and beefy commercial servers (optionally equipped with Phi or other accelerators), but doesn't have HPC resources that are contenders for the Top 500? Or do you really need things that smaller systems can't deliver, like many terabytes of memory in distributed global arrays?
No, the 15 minute turnaround time is not important given the dataset we have at the moment, but showing that it was possible was considered important from a science perspective, because of the upcoming LSST telescope. LSST will generate an amount of data equivalent to our full dataset every 3-4 days, so demonstrating that the pipeline could scale far enough to accommodate that, as well as future planned extensions to the algorithm, was necessary. The actual science runs by the project are usually done on a few hundred nodes over a couple of hours.
| Do you feel like a massively parallel system with Xeon Phi nodes is a good match for this problem? Or did the code get optimized to run at high scale on Cori Phase II because that's where you were given compute resources?
Cori Phase II worked well for this problem, though I wouldn't be surprised if GPUs would have been a better fit (though harder to program of course, and at the time the Julia GPU infrastructure probably wasn't quite ready yet - even KNL was a struggle since LLVM was still in the process of completing support for it). The Celeste project is still ongoing (working on science goals more so than extra parallelism or performance improvements at the moment), but I wouldn't be surprised if there was an attempt to run on Summit at some point, especially now that Julia's GPU compiler is much more mature.
One of the biggest problems we failed to anticipate actually was getting the data from disk to compute units quickly enough. Early in the project we crashed the interconnect on the machine, so for the challenge run we weren't allowed to do anything other than pull the data directly from disk (lest we bring down the machine again while other challenge runs were ongoing). I haven't really looked at the interconnect on Summit, so I can't say how well it would handle that.
| Does this approach effectively scale down to e.g. a university that can afford a large storage array and beefy commercial servers
Yes, it scales down fairly well. In fact you could probably do it fairly well with spot instances on a public cloud. The biggest challenge would once again be getting the data to the compute units quickly enough. That's quite demanding on the network (and ideally you want to pre-stage the data in memory). It's certainly feasible to do this on a large-ish university cluster on the SDSS dataset in a few hours. Probably less feasible on LSST data once that comes online, but maybe by that point improvements in computation speed and storage speed will have made up for it and it'll become feasible again.
I've heard that the folks working on the EHT array need months to crunch numbers. Could something like this be used to speed up that process? Or is there some other reason that would prohibit that?
P.S. I want pictures of black holes.
Did you use one of the compilation options for this, or just JIT compilation? (If ahead-of-time compilation is available; I'll be honest, I'm not that abreast of Julia developments.) One of the key things for more compute-heavy simulation tools (as opposed to just analysis) is that we find the excellent compilers for C and Fortran codes a huge boon, especially on systems with vendor compilers that improve on stock Intel (not to mention gcc) by a good factor. You lose that if you're just using the JIT for the sake of using a "nicer" language, I guess.
I had similar sentiments to the other cat who asked about the 15 minute run, but I guess it's fun to show it "can be done." I was planning to play a little in Julia and this shows it can be worth it.
That is the idea that a catalogue is released as a complete model of the sky, rather than as a table of intensities and coordinates. (https://arxiv.org/pdf/0810.3851.pdf)
I recall when having a cluster of 17 superminis was a really big thing :-) Of course, I monitored it using a dial-up 110 baud portable print-only terminal.
Don't miss the poor pay though
Also ours doesn't run in 15 minutes!
"Hard-core" computing is almost all C/C++/Fortran (and of course CUDA for GPU's etc.). Python and R are fairly popular, but in those cases (hopefully) most of the heavy lifting is done by library code (again, C/C++/Fortran, or increasingly CUDA via ML libraries such as tensorflow) rather than the interpreter. Julia is very promising in this space, as it offers a solution to the "two-language" problem. I'm hopeful for Julia to make more of an impact, but it's of course a slow process.
(There is a (tiny) bit of ASM, but that's more or less exclusively done for widely used performance-critical libraries like BLAS, or FFT, not for application code.)
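As a side note (not from the thread), the "two-language" pattern the comment describes can be seen in miniature even inside stock CPython: the orchestration is Python, while the hot loop of a builtin like `sum()` runs in C.

```python
# The "two-language" pattern in miniature: orchestration in Python,
# the actual hot loop of the builtin sum() implemented in C.
data = list(range(1_000_000))

def py_sum(xs):
    total = 0
    for x in xs:        # every iteration pays interpreter overhead
        total += x
    return total

# Same answer either way; sum() just does its looping in C, much faster.
assert py_sum(data) == sum(data) == 499_999_500_000
```

Julia's pitch is that the `py_sum`-style loop itself compiles to native code, so you don't need the C half at all.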
I don't see the added value of Julia compared to Python or Java. It will still be slower than C/C++, probably less portable, and all the legacy libraries would have to be rewritten in Julia.
If Julia is only a new syntax, Python is already very simple to me. If Julia is a JIT compiler, why not contribute to already-available compilers?
Julia is still there, so I guess it adds value, but I don't know where to place that effort in the grand scheme of things.
Well, no GIL for a start, which is a pretty strong selling point when running on several dozen of cores at once.
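A minimal illustration of what "no GIL" is about (my own sketch, assuming CPython): the threads below produce the right answer, but CPython's global interpreter lock serializes the CPU-bound loops, so two threads give no speedup on two cores.

```python
import threading

# CPU-bound work in CPython: the GIL lets only one thread execute Python
# bytecode at a time, so these two threads run correctly but serialized.
def busy_count(n, out, i):
    total = 0
    for _ in range(n):
        total += 1
    out[i] = total

results = [0, 0]
threads = [threading.Thread(target=busy_count, args=(500_000, results, i))
           for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sum(results))  # 1000000: correct result, but no parallel speedup
```

Julia's threads (and its distributed workers, which Celeste used) don't have this constraint, so the same pattern scales across cores.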
> I don't see the plus value in Julia compared to Python or Java
I don't see how you can put Python and Java in the same bag.
> It will still be slower than C/C++
Not that much https://news.ycombinator.com/item?id=17204750
> probably less portable
Who cares? 99.99% of HPC systems are Linux clusters anyway. And Julia runs on macOS and Linux, which covers the overwhelming majority of the users concerned.
> and all the legacy libraries have to be rewritten in Julia.
> I don't know where to place that effort in the grand scheme of things.
Given your questions, browsing their website would be a good start.
> I don't see how you can put Python and Java in the same bag.
What I meant is that the Python/Java/C trio is ubiquitous for many people in both enterprise and scientific fields, from embedded to web servers. It allows for great reusability of code and of people's skills.
> Who cares? 99.99% of HPC are Linux clusters anyway. And Julia runs on macOS and Linux, that cover the overwhelming majority of the concerned users.
But will Julia be able to output the necessary instructions for future hardware accelerators, which could be totally different architectures? I'm thinking of all the new neural network cores, DSPs, FPGAs, and heterogeneous computing from different rival vendors. It seems Julia is deeply dependent on LLVM.
If you have to reuse C and Fortran libraries, why not just use Python, which can do the same, or even Lua or Lisp? Python is already the de facto language for gluing libraries together into a higher-level algorithm.
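For reference (my own sketch, not from the thread), this is what that glue typically looks like with the stdlib's ctypes; the library name is platform-dependent, and `find_library` can return None on minimal systems, hence the Linux SONAME fallback.

```python
import ctypes
import ctypes.util

# Python-as-glue in its rawest form: load the C math library and call it
# directly, no wrapper module required. "libm.so.6" is the usual SONAME
# on Linux; find_library("m") handles other platforms.
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]
print(libm.cos(0.0))  # 1.0
```

Julia's `ccall` does essentially the same thing with similarly little ceremony, so "reusing C and Fortran libraries" isn't an argument against it either.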
Yes. Watch HN in the next week or two for an announcement that may interest you ;).
That's true. But it's also Academia's role to try (and fail or succeed) at developing and evaluating new solutions; and in the case of Julia, I have to concede I'm pretty excited to see where they will be going. The solution of combining the "glue" and the "high-performance" languages in a single one, while still letting people call upon older C-ABI libraries, is a little revolution in this context.
> If you have to reuse C and Fortran libraries, why not just use Python which can do the same, or even Lua, or Lisp. Python is already the defacto language to glue libraries together onto a higher level algorithm.
Because Julia is far faster, and because no GIL.
Of course, as in every other tech, what floats my boat doesn't necessarily float yours, so maybe for your use case Python/Lisp/Lua/Ruby/... is better.
I do not know if you have read the memo, but its only objective is to find an effective way to hire more women.
Personally, I liked the proposal in the memo because it would allow men and women to share child-care duties equally.
That said, I do not criticize Google's decision to fire him, as the overwhelmingly negative exposure the memo received did quite a lot of damage to the company...
The diversity and inclusion committee at Google asked employees for feedback; the memo was feedback saying "current policies do not work as expected; maybe offering more part-time and family-friendly jobs would allow both men and women who do want to spend time with their families to like working at Google".
In terms of what might be best for me, were I looking for a job... equal opportunity is fair/good. Of course I would think that... I'm a white male... I am a historically privileged class of person in the workforce. Anything that doesn't adversely affect my job opportunities is good for me, right?
From the perspective of a company that is trying to build a high-output, innovative team... diversity is really, really awesome: diversity of thinking patterns, backgrounds, interests. I have experienced this first hand and am now 100% sold on the importance of diversity and of oversampling certain segments of the "talent pool" to reach diversity goals. I may miss out on some opportunities due to how I am categorized in the overall talent pool... but diversity is really good for the organization and society in general.
Some sports analogies apply: you can't make a basketball team out of all shooting guards or a soccer team out of all forwards. You need a lot of different kinds of talent to make a good team.
That is just my opinion... I would rather work on a diverse team and I'd rather live in a diverse city/culture.
I have worked in high tech for 20 years: biotech, a PhD @ Carnegie Mellon, NASA, some startups, back to biotech... I have worked with so many awesome people from all walks of life that notions of gender or racial superiority are long gone. Quite the opposite: I have experienced the tremendous benefit that usually arises with diverse teams: deeper group experience, less competition and more cohesion, stronger friendships and sense of connection, diverse technical experience and interests, really different and novel ways of approaching problems and developing solutions (things that blew my fragile little mind), etc.
Yes, I did write of myself as a historically privileged class... which definitely falls along racial/gender divides. Hopefully the manner of that frank discussion indicates that this privilege really sucks for most everyone (we really all lose) and is short-sighted. Then I started a new paragraph, which signifies a thought break --> moving on to a different but related thought.
Nowhere in my post did I use the word "race" or the phrase "intellectual strengths", and I didn't use those phrases because I don't believe in them... I used "thinking patterns, backgrounds, and interests."
To address this: "You added the idea that this type of diversity was beneficial because it's an advantage to have a diversity of interests and intellectual strengths in different areas. So you are equating a racial and sexual diversity with the latter."
I wrote none of that... I believe none of that. It requires multiple logical fallacies to get from what I expressed to what you wrote above. Rather than jumping to conclusions, why don't you ask me to clarify what I meant?
So what do I think about diversity and inclusion, in brief:
Diversity is highly multivariate... what team diversity means can vary from situation to situation. Hiring for a mechanical engineering team vs. a women's fashion design team would probably involve very different diversity hiring goals, but the desired benefit and improvement for the team is similar.
If people only look at diversity as race and gender, they are missing the boat entirely... how about personality, age, ethnicity, religion, educational background, work experience, world/life experience, family responsibilities, life aspirations, out of work interests, health concerns and disabilities, sexual preference, ...all factors which make us different/unique and interesting.
The flip side of diversity, which is inclusion and appreciation, is pretty simple: it is really important to try to understand people, appreciate them for who they are, do your best to allow them to be who they are in the workplace, and not pigeonhole them into a box.
Regardless of what I'm called next, I'm done replying to this thread. If you figure out how to construe this as "I must be the second coming of Hitler", more power to you.
The general problem of diversity is ubiquitous in tech, sort of rough to call out a young startup over this issue. Most good companies strongly desire diversity in their teams because of the real benefits diversity provides.