Hacker News
Apple Machine Learning Journal (machinelearning.apple.com)
485 points by uptown on July 19, 2017 | 126 comments



For anyone curious about why Apple is allowing its researchers to (anonymously) publish papers like these on an Apple blog, it's because of this:

Apple’s director of AI research Russ Salakhutdinov has announced at a conference that the company’s machine-learning researchers will be free to publish their findings. This is an apparent reversal of Apple's previous position.

Refusing permission to publish was said to be keeping the company out of the loop and meaning that the best people in the field didn’t want to work for Apple.

From: https://9to5mac.com/2016/12/06/apple-ai-researchers-can-publ...

We will see whether this move is sufficient to attract the top talent they're looking for.


I think that is something that is worth repeating, telling and shouting: AI research is being open-sourced (arguably the most important research to make in the open, ever) because researchers DEMANDED it. And they were ready to refuse job offers for ideological reasons.

As someone who has turned down offers in the military, I have often heard that it was childish and inefficient to have ideals when hunting for jobs. Well, here is the proof that it is not.

I really congratulate these researchers. By their perseverance alone they may be helping avoid one of the most worrisome dystopias ever anticipated by SF writers.


Meh, not really. You're expected to publish in the field and your future job prospects depend on it. These people just want to ensure they can easily go to other companies/orgs.


Yeah. Hard to develop a world class reputation squirreled away in the basement of Apple. This is the real reason Apple doesn't want their people publishing. Because Google will know who to snatch away!

It's very much a kick the ladder away type of company.


I guess Apple should pay them well enough if they don't want them to leave.

But you know they'll just spearhead another collusion with other tech companies.


> Well, here is the proof that it is not.

For people with top-rated skills in heavy, heavy demand.

If you're the unicorn that everyone's after, you've always been able to dictate your terms.


People in high demand can make their own conditions. They can ask for equity, more money, a company yacht... Them, they demanded open publication. And it was acceptable to the company because all the other competitors were facing the same demand.


I like the mixed metaphor


I wonder - is there really any prestige to be had from publishing anonymously?


It is not an anonymous publication!

1. The article is based on a paper submitted to the prestigious CVPR, which is obviously not anonymous. ( https://arxiv.org/abs/1612.07828 ) They mention this at the end of the post.

2. The blog's "journal" name is misleading. It means "journal" in the sense of a record of some of the stuff happening at Apple. My guess is that the blog post does not name authors because it was written by an Apple representative who manages public communications---probably because Apple is very very particular about managing their brand and what they say publicly. I wouldn't put it beyond Apple to hire someone well-versed in ML just for this role.

So, while they are starting to allow researchers to publish, they have quite some distance to go to encourage their researchers to communicate freely. One step at a time, I guess...

PS: Somehow the SNAFU of deliberately calling it a "journal" is very reminiscent of Steve Jobs's chutzpah.


Well, the "journal" in the sense of academic publishing and in the sense of a blog derive from the same root: a continuing first-person description of one's activities, updated periodically over a stretch of time.


> It is not an anonymous publication!

So who wrote this? You say it's "based on a paper" written by known authors, but did they write this publication? Why not put their names on it? Or did a representative write it, as you suggest? We don't know. It's anonymous.

> The blog's "journal" name is misleading

The usage seems similar to that of "Bell Labs Technical Journal" or "Lincoln Laboratory Journal."


> You say it's "based on a paper" written by known authors, but did they write this publication?

I'm confused - for me the answer is quite obvious. The authors listed on the CVPR paper wrote the published paper. Or are you suggesting otherwise?


The question is who is the author of this post.

It is unattributed, and while clearly based on the CVPR paper it is different enough that if it were a traditional publication it would be counted separately. If someone wanted to cite this post, what would they cite?

By comparison, distill.pub is a properly citable journal: https://distill.pub/2017/momentum/#citation


That's how it seems, but OP suggested it was written by a communications department. I'm not sure. Why leave out the authors' names?


No, none. I suppose they want researchers to be motivated merely by the ability to contribute knowledge to the ML community?


> I suppose they want researchers to be motivated merely by the ability to contribute knowledge

I might phrase it more as "They want researchers who are already motivated by the contribution of knowledge"

I know that if I were an ML researcher I would not be happy with all my findings being hoarded by a single company when the field overall is making lots of progress.


All of them can claim the credit for any given article when looking for another job later!


I would think the proper way is to say you worked for that department and mention articles published from there.


On one hand, if ALL papers were forced to be published without authors then all the shitty papers would disappear (which is 80% of the current lot), there would be no "publish or perish" and a lot of elitism and egoism would go away.

On the other hand, how do you decide if a person is worth talking to? Worth collaborating with? Worth giving tenure? The problem is exacerbated when the supply side outweighs the demand side by many orders of magnitude. This is precisely why we as humans keep establishing trust relationships that allow us to filter signal from noise. You monitor the writings of people you know have produced work on topics that interest you, not every random thing out there.


Who the hell publishes for prestige? Most people won't be able to understand, let alone care, about your research.


> Most people won't be able to understand, let alone care, about your research.

The intended audience is other researchers in your field, not the general public.

Your reputation as an academic or researcher rests on your publication record (plus your ability to land funding). If you publish highly-cited papers in prestigious journals or conference proceedings, you're considered among the best in your field. This opens the door to promotions, better job offers, etc.

Researchers at Apple have historically forgone the ability to publish. They have no reputation in the field. This significantly harms their ability to get a job elsewhere as a researcher.


When you get to college, you'll be very disappointed to discover that's literally the only reason people publish in most cases. It ain't for the money, that's for sure. It's how you make a name for yourself amongst your peers in the same field.

You might get a sneak peek at the publish-or-perish dynamic to a lesser degree when you get to high school- sometimes you can get into special programs where you have a chance to help out with research being done by college students. If you get into a really good program and really distinguish yourself, you can even get your name alongside theirs on the paper.

Unfortunately, you will find that believe it or not, those college students are not already working out which color of Ferrari to buy with their sweet advance from SIGGRAPH or wherever. I think they deserve it personally, because I really care really hard about pushing science and engineering forward, but sadly the big bucks are still going to NFL players and pop stars instead.


Professional prestige. If you need an ELI5 answer please just request one...


> Most people won't be able to understand, let alone care, about your research.

This is what led me to get out of research and start my first company.

I was in a research group at PARC doing work on programming language epistemology and semantics. The group papers were typically mostly predicate calculus. At a POPL our group lead presented a paper; I was in the audience. After the paper we got polite and even enthusiastic applause, but two people in front of me in the audience spoke during the applause: Q: "Did you understand that?" A: "Not a single word"

I realized there were perhaps a dozen people in the world who even understood what we were working on, and of those 12 at least 10, if not all 12 of them were smarter/understood the subject much better than I.

It really didn't seem like it was going to make a difference to the universe.


Why would I want to publish stuff if I cannot put my name on it? I guess that shows that "rich != smart".


The occasional blowhard professor aside, there is a thread of idealism in many research circles that advancing the field is more important than advancing one's stature. My favorite example of this is all of the research that went into the creation of Bitcoin and the still unknown identity of its creator.


Not to presume any motivations of Bitcoin's creator, but it doesn't hurt when the success of your research also nets you an incredible amount of wealth. Though, as far as I know the presumed wallet that he owns has not had any funds moved off of it so at least their interest is aligned with the success of Bitcoin.


That's a nice example. But putting your name on your research is not so much about stature. Publishing anonymously is more akin to giving your baby up for adoption. Sure, there are people doing this, and they even might have a good reason for it, but if you demand that upfront then most people never would have babies.


Fair enough. I know lots of researchers are very personally invested in their work and I can imagine the sting. I promise I wasn't calling you a blowhard or anything :)


Except that this is clearly only to keep the price of the researchers down. This is shameful.


I don't know. I think there's a benefit to having anonymized research. It forces the work to stand on its own. Sort of an extension to blinded peer-review processes. Plus, I highly doubt Apple researchers, or researchers at any other tech giants, are toiling away for peanuts.


I don't quite understand the 'blowhard professor' and bitcoin references. There are hundreds of thousands of attributed peer-reviewed publications per year by serious academics who want to further the field.


You're absolutely right. There are. Blowhard professors, in my opinion, are the ones that prioritize recognition over research. There's a thread in certain research circles that would eschew any such recognition in the hopes of putting their work on a pedestal in place of themselves (e.g., the Bitcoin people, again in my opinion).


There is also a thread of idealism in many research circles that the researchers would like to continue working on a liveable wage. Having some public work helps that.

It's pretty obscene that you equate any desire for recognition of your work with being a blowhard.


Livable wage? Researchers at Silicon Valley powerhouse companies aren't exactly toiling away for peanuts, are they? Also, the implication of my statement is that blowhards tend to prioritize their desire for recognition, not that the desire for recognition makes you a blowhard. Hardly an "obscene" assertion.


I would argue the Bitcoin author being secret has created more intrigue and possibly more notoriety. It is believed to be Nick Szabo, btw.

There is also a security issue, as it is thought that he has a lot of bitcoins hidden away.


Fair enough. Maybe I'm being an idealist hoping for some idealism in the world.

And yeah, there've been several theories about who it might be. It's like the Banksy of the cryptocurrency world. Just as cool, but way less street cred :)


No, but "machine learning researcher working for a major company" = "smart".


I am assuming it is because they are getting paid by Apple.


> Apple’s director of AI research Russ Salakhutdinov has announced at a conference that the company’s machine-learning researchers will be free to publish their findings. This is an apparent reversal of Apple's previous position.

How about researchers in other fields?!?


Right?! I can't wait for the computer-human interaction research teams to jump on this trend and start publishing! The CHI/HCI/HCC community would flip out. My favorite example of how this sort of transparency would benefit the community: In 2009, Microsoft published an academic paper called "Mouse 2.0" (https://www.microsoft.com/en-us/research/publication/mouse-2...), where they walked through the research prototypes for touch-sensitive peripherals. The paper was met with awe and acclaim... A mere few months before Apple took the Magic Mouse to market.


> A mere few months before Apple took the Magic Mouse to market

Apple released the iPhone 2 years before the Magic Mouse and that paper were released.

Quite sure that it was also an inspiration.


Definitely.


Months? The Magic Mouse was released a few weeks later, not nearly enough time to develop the product based on the paper.


My implication was that Apple had been doing that research work behind the scenes the entire time.


> A mere few months before Apple took the Magic Mouse to market.

So you're implying that Apple saw the idea, copied it, and used it to bring a product to market—in months? That's not how any of that works.


No, the implication was that Apple had been working on similar ideas but hadn't published them.


Exactly!


Apple and many tech companies are a world where association is everything. “I work at Palantir” means a lot. “I work at Uber” at the moment may be both positive and negative. The academic world is not like this at all. Paper count is everything. Churn them out with all the power you have. Most universities allow their teachers to take sabbaticals on the hope (and sometimes mandate) that they will publish. So Apple was not going to find many academic researchers who were okay with not being allowed to interact in the same way with their community.


>“I work at Palantir” means a lot.

To who? When I hear that I just think that means you are a naive fresh graduate doing basic data cleaning tasks.


> When I hear that I just think that means you are a naive fresh graduate doing basic data cleaning tasks

Isn't that presumptuous?


Is it presumptuous if it's true with high probability?


Really, anonymously, wow. Apple going to crazy lengths to keep the price of their researchers down. I'm never stepping foot there again.


It's not anonymous. The end of the article says

> For more details on the work we describe in this article, see our CVPR paper “Learning from Simulated and Unsupervised Images through Adversarial Training” [10].

And the citation is

> [10] A. Shrivastava, T. Pfister, O. Tuzel, J. Susskind, W. Wang, R. Webb, Learning from Simulated and Unsupervised Images through Adversarial Training. CVPR, 2017.

So it looks like those are the authors.


It is not at all clear that this new publication indicates that researchers will not also be allowed to publish in existing journals.


Is anyone else amused by the irony of using a machine-learning-trained image generator in order to provide data to a machine-learning-trained image recognition program? I'm sure the researchers themselves and plenty of people here could come up with all sorts of logical reasons why this is fine, and very possibly given the right protocols it would be fine. But this sort of approach seems to lend itself toward increasing the risks of machine learning. That is, you're doubling down on poor assumptions that are built into your training criteria or which creep into the neural net implicitly, because you are using the same potentially flawed assumptions on both ends of the process. Even if that's not the case, by using less real, accurately annotated data, you're far less likely to address true edge cases, and far more likely to overestimate the validity of the judgments of the final product compared to one with less synthetic training. And if there's one thing the machine learning community doesn't need any more of, it's overconfidence.

Edit: oops, turns out I mistakenly responded to the content of the paper instead of the fact that it exists and the form of its existence. Sorry.


I cannot disagree with you but the fact is and remains that the field is a budding one and this is but a building block towards better understanding the nature of using sets of geometric transformations for pattern recognition in A.I.

If this was to be directly implemented in the real world your arguments would need to be addressed, however I'm certain this research paper wouldn't be abused in that manner.

Disclaimer: I'm a graduate student studying Computer Vision.


Google trained their AI Go algorithm by playing against itself. It kind of has the same feel to it.


Isn't that basically the idea behind GANs?


A lot of academic papers actually aren't all that great, for a variety of reasons. Normally, you can use citations, journal, and author credentials to get a sense of whether a paper is even worth skimming. The only "paper" on the "journal" right now looks like it's just a watered-down, html-only version of https://arxiv.org/abs/1612.07828!

Seems like more of a PR stunt than anything useful, but who knows.


But in the sense that Apple is notoriously tight-lipped about research, it's a notable PR stunt. Unlike with Microsoft and Google, you don't see a lot of Apple in academic research circles despite them having a significant brain-share. Love them or hate them, Apple is a powerhouse and the (undoubtedly amazing) research happening behind the scenes could do a huge amount of good with just a bit more visibility.


It's interesting how much criticism they're getting because Apple formatted their blog to be anonymous and watered down, but they're clear in their first technical post that it is just an overview of work that the researchers are presenting at CVPR [1].

So the researchers at Apple are still getting credit for their work in the scientific community, but the PR-facing side of their work is anonymous, probably for some aesthetic reason (this is Apple, of course)

[1]: https://arxiv.org/abs/1612.07828


The most Apple thing ever is that they called it a "journal" and not a "blog."


I'm only 30 and yet I think I've become an old "get off my lawn" man. I hate the word "blog" and find "journal" to be a nice relief. No need to differentiate your log/journal from others because it's on the web. It feels like a language artifact of the dot com era alongside "surfing" and "cyber".


This isn't a 'journal' in any sense of what the word implies when referring to published materials. There's no peer review, probably no editorial board. Blog is appropriate.


Nobody called it a scientific journal.

http://www.dictionary.com/browse/journal:

1. a daily record, as of occurrences, experiences, or observations[...]

3. a periodical or magazine, especially one published for a special group, learned society, or profession[...]

4. a record, usually daily, of the proceedings and transactions of a legislative body, an organization, etc.


The phrase "machine learning journal" strongly implies academic journal to most machine learning researchers though, in part because that is actually in the name of several journals in the field. I don't think Apple is unaware of that either. This reads to me as quite deliberately playing on that association, to upgrade the prestige of the stuff published here vs. what it'd have if it were just a blog.


The first post is actually a summary of a CVPR paper by Apple employees. For those not familiar with it, CVPR is a top conference in computer vision.[0] Recall, of course, that "conference" for much of computer science implies the same length and degree of peer review as "journal" does in non-CS fields.

[0] http://cvpr2017.thecvf.com/


I don't think that really matters. If I only post really worthy articles in my blog it doesn't become a scientific journal. I'm not saying that I believe peer-reviewed journals are the end-all-be-all for science but it does seem like Apple is being misleading with this.

edit----------

Although, a company publishing a peer-reviewed scientific journal like Nature would be surprising. So maybe that isn't the common interpretation when they see the title. Maybe it isn't totally misleading. I guess I'm split on it. :)


It's not common, but also not unheard of for companies to organize properly peer-reviewed journals for their internal research. The two most famous are probably the Bell System Technical Journal and the IBM Journal of Research and Development. From the title I was expecting Apple to be continuing in that tradition, but it seems they aren't.


I should have quoted. The person I was replying to originally said that "apple won't let its researchers publish in real journals", and then stealth-edited their post.


It isn't 1: They're not publishing their ML researchers' daily activities and experiences.

It isn't 3: This isn't what would be generally understood as a 'periodical' or 'magazine'. It doesn't even resemble the trade publications that are often published by larger businesses.

It certainly isn't 4 - Apple isn't giving away terribly detailed accounts of their research activities.


The meanings of words aren't set in stone; rather, they change over time. As long as people generally interpret the message close to how it was intended to be interpreted, it's fine.


If nobody called it a scientific journal, why does it have volumes and issues? I think "Vol. 1, Issue 1" simply suggests they see it as a scientific journal.


From a historical perspective, so is "Journal" -- it harkens back to the days of things like the Bell System Technical Journal (published from 1922 to 1983), the IBM Systems Journal, etc.


Bell System Technical Journal really was run like a scientific journal though, with the exception that it was "internal" to Bell. It had an editor in chief, an editorial board, papers had to be submitted for review and possibly revised before publication, etc. And the Bell Labs scientific community was large and diverse enough that I'd be comfortable calling their process peer review even though it was all done internally.


Journal does not imply peer reviewed. The word originates from Latin and simply means "daily". I.e. a periodical publication. An academic journal can be peer reviewed but that's not always the case.


It's interesting to understand the roots of words but in this case I think the relevant information to be considered is how the word is generally understood today in the context that it is currently presented.


It's a journal in the sense of a scientific diary/notebook. That is also my first association.

The word "journal" comes from French "jour" which translates to "day".

The word blog comes from log, which originally comes from tabular record-keeping of ship speed.


The proper name I think would be "Technical Reports." Not terribly catchy. But journal does make it sound like they are pretending to be science-y.


I agree. I think the closest word from the nondigital era to a "blog" would be a "bulletin".


Have you heard of the Linux Journal?


I read it as "journal" as in like a personal journal.


Not even a DOI.


I mean, yeah if it's a personal thing about your life, the word journal might work. But blog means a lot more than that. Businesses use blogs to highlight products and upcoming events. Journalists use blogs to publish their stories. Bruce Schneier and Brian Krebs use a blog to publish research and articles they've written. 37Signals uses a blog to advertise their particular way of doing business.

"Blog" is not short for "journal on the Internet", it's short for "web log" and means something far broader than "diary" that you seem to associate it with. It's not just a log/journal on the web, it's a newsletter, a flier, a classified ad, a newspaper, a document repository, a knowledge base, an event calendar, and a ton more use cases. It's all of those rolled into one. And since "journal/newsletter/advertisement/diary/repository online" is too long, we just shorten it to one four letter word: blog.

The weirdest thing to me is when people say they hate specific words when no other commonly-used word exists to describe the same thing. You can't hate a word. It's the most fundamental concept of communication, that we all agree what various scribbles and noises actually mean. "Journal" has a meaning. So does "blog". They don't always overlap.


The word "journal" makes me think more of a scientific journal, which is something else entirely [1].

[1] https://en.wikipedia.org/wiki/Scientific_journal



how soon we forget about livejournal :)


Livejournal was peer-reviewed though.


Why no author names on the article?


This sort of thing has actually been dealt with in the past. In 1908, William Gosset working for Guinness brewery in Dublin, published his statistical work under the pen name of "Student." Which gave us the oddly named "Student's t-test," and related distributions.


I don't know how I never knew this fact, but this is pretty hilarious!


That's really weird. That said, they reference "our" work (i.e., their work) as citation [10], so one concludes it's likely a subset of the paper authors:

[10] A. Shrivastava, T. Pfister, O. Tuzel, J. Susskind, W. Wang, R. Webb, Learning from Simulated and Unsupervised Images through Adversarial Training. CVPR, 2017.


I don't know enough about the culture at Apple, but maybe they're going for a sort of "this is a team effort"? Kinda like the story of JFK asking a janitor at NASA what he does, prompting the answer "I'm putting a man on the moon".

I'm somewhat sceptical that such a policy is better than what's usually practiced, but I can imagine that it may be legitimate under the right circumstances.


I imagine their reasoning is similar to The Economist's for having anonymous bylines:

"The main reason for anonymity, however, is a belief that what is written is more important than who writes it. In the words of Geoffrey Crowther, our editor from 1938 to 1956, anonymity keeps the editor 'not the master but the servant of something far greater than himself…it gives to the paper an astonishing momentum of thought and principle.'"

https://www.economist.com/blogs/economist-explains/2013/09/e...


One of us.

Apparently now that they aren't allowed to collude to price-fix salaries with illegal anti-"poaching" agreements, they aren't letting employees have names. Can't "poach" (normal people call it "hire") 'em if you can't name 'em.


Going off on a bit of a tangent, but I feel like Apple's niche in AI will be with on-device processing. The iPhone 7 already has an FPGA onboard, and I would guess the next iPhone will have more/more powerful chips. Training would probably still have to happen on their servers though due to the dataset sizes needed. I might just be full of shit though, I'm not much of an AI developer.


The top comment on Product Hunt from Ryan Hoover raised a good point about Apple's timing with this:

> This launch is particularly interesting because this isn't typical for Apple, a fairly secretive and top down company (when it comes to external communications). Timing makes a lot of sense with their upcoming launch of ARkit, Apple Home, and the inevitable "Siri 2.0", among other things.

https://www.producthunt.com/posts/apple-machine-learning-jou...


Isn't there a way simpler timing analysis? The blog post is about a conference paper which will be presented at CVPR, and the conference starts in 2 days.


This blog is not Apple's first blog; they started a Swift blog shortly after the release of Swift: https://developer.apple.com/swift/blog/


And WebKit has had a blog for 11 years now https://webkit.org/blog/4/welcome-to-the-webkit-blog/


I would really like to see the names of people who are working on the research. They reference other papers and give their authors credit, but I was disappointed to not see the Apple employees get credit.


I'll be keeping an eye out for acrostics with the author's name.


I'm not sure what you mean? Their names are Shrivastava, Pfister, Tuzel, Susskind, Wang, Webb. What are you expecting to find?


What's powering this site? Doesn't look like WebObjects.



I'd be surprised if they started something new with WebObjects. Very.


Looking at the source makes me think they are using a static site generator. The CSS looks custom but seems to be based on Foundation: http://foundation.zurb.com/.


"AppleHttpServer/8b2905f7" is an interesting thing to see


It looks like some Node.js static site generator.

- The URL structure doesn't seem to be dynamic. For instance, https://machinelearning.apple.com/2017/ doesn't give a list of all articles from 2017, just a response of "Not found" (at least it gives a 404 error, so that's nice).

- The js files make extensive use of `require()` which is often used by Node.js developers http://fredkschott.com/post/2014/06/require-and-the-module-s...


I really wish this had an Atom or RSS feed


It does: https://machinelearning.apple.com/feed.xml

It's "just" not referenced in the source.

I get that social networks are the tool of choice to prioritise articles from the major news outlets–if The Atlantic publishes something spectacular, it will find me.

But how on earth am I supposed to follow something like this, when I have to guess randomly to even find a feed? Are they expecting me to bookmark it and check for news once a week? There's no Twitter account, newsletter signup, Facebook link... nothing.


So I just got an email back from them and they're going to look into adding it. I really like that tone of interaction, too.


It is in the source as the last element in <head>

    <link rel="alternate" type="application/atom+xml" title="Apple Machine Learning Journal" href="/feed.xml" />
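
For anyone wondering how a feed reader or browser picks that element up: feed autodiscovery amounts to scanning the page for a `<link rel="alternate">` whose type is a feed MIME type. Here is a minimal sketch in Python using only the standard library. The names `FeedLinkFinder` and `find_feeds` and the wrapper page are my own illustrative choices, not anything taken from Apple's site, and the base URL is assumed from the feed address posted above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types commonly used to advertise Atom and RSS feeds.
FEED_TYPES = {"application/atom+xml", "application/rss+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects href values of <link rel="alternate"> tags that advertise a feed."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> also arrive here by default.
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower()
        if "alternate" in rel and mime in FEED_TYPES and attr.get("href"):
            self.feeds.append(attr["href"])


def find_feeds(html, base_url=None):
    """Return feed URLs found in html, resolved against base_url if given."""
    finder = FeedLinkFinder()
    finder.feed(html)
    if base_url is None:
        return finder.feeds
    return [urljoin(base_url, href) for href in finder.feeds]


# The <link> element quoted above, wrapped in a hypothetical minimal page.
SAMPLE = """<html><head>
<link rel="alternate" type="application/atom+xml" title="Apple Machine Learning Journal" href="/feed.xml" />
</head><body></body></html>"""

print(find_feeds(SAMPLE))
print(find_feeds(SAMPLE, base_url="https://machinelearning.apple.com/"))
```

Resolving the relative `href` against the page URL is what turns `/feed.xml` into a full address a reader can subscribe to. Real-world pages can be messier than this sample, so treat it as an illustration rather than a robust crawler.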


Firefox makes it very easy though. There is a feeds section in site information dialog.


Unlikely, but I wouldn't be surprised to see it promoted on Apple News.


They could put it behind the Open™ Google AMP. Everyone knows that Google is better because they're Open™.


So we're def getting some form of facial recognition in the new iPhone with stuff like this being published.

Feels like an early post to show they've done some advanced work in making sure you can't trick them.


There's already facial recognition in the iPhone - it's used in the photo app to let you sort through photos of different people.

Wouldn't be surprised to see it more in use, though.


More info on the Vision Framework from WWDC 2017 here: https://youtu.be/UXhgjUIXUak


I like the font. Is it possible/legal to use the SF Pro Text webfont?

PS I know the desktop font is available for download at the apple developer site ... but I'm talking about the web font


No <title>?


calling this a "journal" and making it anonymous is disingenuous.


I'm betting that sjobs would not have approved this


Surprising. Hopefully we see more of this.



