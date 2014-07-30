
IMO it'd be a really smart move for Google to hire Vitaly to help with the launching of this feature and things around it. He has done a great job with PhantomJS.

Even an acquisition[1] of PhantomJS would totally make sense, then let him keep working on it but based on headless Chrome and with real resources.

[1] Careful how you spin it for this route, learn from TJ/Express https://medium.com/@tjholowaychuk/strongloop-express-40b8bcb...


Could you elaborate on the TJ/Express situation? The post you linked looks like the last one in a series of events.


Yes, there was some controversy in the Node.js world because he sold the Express project to StrongLoop just as he was leaving for Go. I think it was purely a problem of how things were communicated, and that he really had good intentions. What didn't help, though, is that StrongLoop then left the project unattended, so the main person contributing to it (Doug Wilson) wasn't even acknowledged.

So if an open source project is being sold:

- Make sure everyone knows it's because it'll be better handled and not because of the money.

- At least get a decent amount of money! According to TJ it was half a month's worth of money.

- If you are the company buying it don't leave it unattended afterwards.

Anyway, now TJ is surely remembered with a lot of gratitude in the Node.js community and I'm so happy he helped it grow so much!

This is all better explained here (and read the comments for TJ opinion): http://thefullstack.xyz/history-express-javascript-framework...


There was a small controversy when StrongLoop purchased the Express repository from TJ, the creator of the project.


This is an excellent idea.


Yeah - I like that idea too.

I played around with PhantomJS for something at my current job - it ultimately didn't work out for us (and we went a different route), but it was interesting and fun to learn about.


Wow, I'm very impressed. At this stage it is a very wise decision to step down and to focus on something else, rather than to hold on to a project that will eventually disappear. It takes a lot of courage to move on from a project that had to be maintained for several years and that had such a reach.

We can only be thankful for all the good work that went into PhantomJS, and wish the maintainers the best of luck in their next endeavors.

Cheers!


Some context: he's essentially been maintaining a web browser (which is a project on the level of an operating system) on his own.

Phantom 2 switched to QtWebKit, which I'm sure was a tremendous amount of work. Probably at the end of that he was hoping things would get "easier", and it sounds like they haven't. It's just too much work for one person, and if companies aren't willing to pay people to do it, I'd quit too.


He says in his message it's been a slog for some time, looks like a good time to be done with it. Open source is great, we all like it, but demanding unpaid jobs can get old, too.


And it makes sense. You want Chrome in your tests if your users are using Chrome. Very few (if any) of your users will ever visit your app with a headless PhantomJS browser, so it's not a platform that you should go out of your way to support.

I've been using Selenium Driver with Chrome in xvfb for my headless testing needs, and I've used PhantomJS for some automation things in the past where it was great, but since I switched I really haven't looked back!

I had things breaking subtly that I couldn't fix, and they did not manifest as problems in the Chrome browser or Selenium. I still don't know what was wrong, I just know that my Rails app won't pass its JavaScript functional tests if I use PhantomJS. When I did the evaluation of 3 test drivers, I found that the one with the actual browser in it was the one that worked most reliably.


Thanks so much to the PhantomJS maintainer for his hard work over the years! To me, it feels like his decision is the correct one here.

After we realised that we hadn't seen a "one-browser" bug for 2 years in our massive angularjs app, we got rid of all browsers but PhantomJS in our karma suite. PhantomJS' slowness, its lagging behind on web standards, and my general gut feeling (the facts above made me question the point of running JavaScript tests in an actual browser at all) made me port our karma test suite to jest w/ jsdom. I haven't been this happy since we, years prior, got rid of our gnarly Selenium test suite, which caught 0 bugs but was the major cause of maintenance headaches.


> After we realised that we hadn't seen a "one-browser" bug for 2 years in our massive angularjs app

Really? When I was writing JavaScript I thought I'd hit one ~ once a month. As a matter of fact I hit a v8 bug just yesterday which apparently doesn't support /Etc/GMT[-+][0-9]{0,2} timezones.


I actually started using webdriverIO + chromedriver after fighting too much with casperJS - while webdriver (and Selenium) seem to have much more momentum there are still some things I really miss that PhantomJS gave when using Capybara. I was a very happy Capy+Phantom user when testing my rails apps.

Things like reading the HTTP response code, detecting 404s on assets and catching JS errors in the console are all not possible with Selenium/ Webdriver, and I relied heavily on those capabilities in my Capybara tests.

While headless chrome might be able to replace PhantomJS for many use cases that doesn't necessarily mean the APIs will be comparable. In fact I'd more likely expect the Chrome folks to say "the webdriver API is it, because it's a standard." [1] Sadly IMO it's lacking compared to what PhantomJS was capable of.

[1] https://www.w3.org/TR/webdriver/


When I looked into it, I was surprised by how hard it is to set up basic smoke tests for web development, and that Selenium / Webdriver can't do this.
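For the HTTP-level part of such a smoke test (status codes and 404s on assets, not JS console errors), a browser isn't strictly required. Here is a minimal sketch using only the Python standard library; `AssetCollector`, `status_of` and `smoke_test` are hypothetical names made up for this illustration, and it does not execute any JavaScript, so it is no replacement for what PhantomJS exposed:

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import urlopen


class AssetCollector(HTMLParser):
    """Collects URLs referenced by <script src>, <img src> and <link href>."""

    def __init__(self):
        super().__init__()
        self.assets = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "img") and attrs.get("src"):
            self.assets.append(attrs["src"])
        elif tag == "link" and attrs.get("href"):
            self.assets.append(attrs["href"])


def status_of(url, timeout=10):
    """Return the HTTP status code for url, including 4xx/5xx responses."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        # urlopen raises on 4xx/5xx; the code is still what we want to report.
        return error.code


def smoke_test(page_url, timeout=10):
    """Return {url: status} for the page and every asset it references."""
    try:
        with urlopen(page_url, timeout=timeout) as response:
            results = {page_url: response.status}
            html = response.read().decode("utf-8", errors="replace")
    except HTTPError as error:
        return {page_url: error.code}

    collector = AssetCollector()
    collector.feed(html)
    for asset in collector.assets:
        asset_url = urljoin(page_url, asset)  # resolve relative references
        results[asset_url] = status_of(asset_url, timeout)
    return results
```

A test would then assert that no value in the returned dict is 400 or above. Assets loaded dynamically by scripts are invisible to this approach, which is exactly the gap a real or headless browser fills.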


You can use PhantomJS as the backend on a Selenium script, and this news clearly demonstrates the utility of a higher-level API than using PhantomJS directly. If your tests are in Selenium, changing backends is generally a small matter.

I started writing in Selenium instead of CasperJS (a PhantomJS frontend) because PhantomJS experienced intermittent bugs on the page I was trying to access. I think you're right that "real browsers" are still much more reliable for complex use cases, but the low profile of PhantomJS is definitely nice when it works.


I used Phantom once upon a time, but I eventually switched to just using xvfb as well within a docker instance. Headless Chrome and Firefox with xvfb.


Open source software has to be one of the least efficient markets out there.

If you sum up the very real value PhantomJS has delivered to very real companies over the last several years, napkin math tells me we wouldn't have the project being abandoned for being a "bloody hell" to work on.


The main problem for developers is that getting your company to pay for something like developer tools can be very hard and a long project. Ideally every team gets a budget and a credit card to do as they see fit, but in practice a lot of (especially bigger) companies have a whole acquisition process. It's not uncommon for the hours spent in getting a license for e.g. an editor to be far more expensive than the product itself. I believe this is highlighted every time Sublime Text is in the news.

This is as opposed to open source tooling, which has no such hurdles.


Why do you think open source tooling has no such hurdles? At these same "larger" companies, they're going to have an open source review board, and legal involved for each open source product you want to use.

Just because you can download and use it on your local machine, doesn't mean you're not violating your corp policy and procedure.


What's more surprising is that the company I work for does have a support agreement with Oracle, yet when we've run into problems that do need a technical expert from Oracle, such as crash dumps and stack traces, we're told to figure it out ourselves.

Kind of defeats the whole reason to have a closed source system when you're still on the hook and charging out $500 per hour to clients.


This problem was well demonstrated at my shop the other day. There's a lot of fear of open source here. We needed to run a simple FTP process. I said I'd write a short script.

"You've gotta be careful with those free tools though... Never know what will happen when they break. You can't get any support. Besides, we need logging and alerting too."

"Oh, $currentTool does logging and alerting?"

"Well, yes... But it's currently not working. We've got a ticket in to fix it."

That was months ago. Yesterday it broke and there was no alert. There are also still no logs.

Gotta be careful about that "open source" stuff though.


Open source doesn't allow you to pass the buck. Commercial software with support contracts does. There is also the "look how much we're paying! it must be good" factor.


Many companies are paid handsomely to take the blame.


They're also afraid of the abomination you quickly scripted together that the future-you, who no longer works for them, then has to try and figure out.


ffs it's just FTP. All it needed was a scheduled task that called winscp pointed to a file containing a handful of flat FTP commands. If the guy who replaces me can't sort that out, how's he going to sort through my code?

Besides, now it's some abomination wrapped inside a proprietary program that doesn't work right in the first place, only this time we're out the $x,xxx licensing costs and the entire process is opaque, with absolutely zero hope of sorting it out on our own. That's not a huge gain...
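For a sense of scale, here is a rough sketch of what such a short script with logging could look like. It is not the winscp setup described above: it swaps in Python's standard `ftplib` (plain FTP, not SFTP), and `upload_files`, `run_job`, and the host/user/password parameters are hypothetical names for illustration. "Alerting" here is just a non-zero exit code that a scheduler could act on, which is an assumption about how the job would be run:

```python
import logging
import os
from ftplib import FTP, all_errors

log = logging.getLogger("ftp_job")


def upload_files(ftp, local_paths):
    """Upload each file over an already-connected FTP session.

    Every outcome is logged; the list of paths that failed is returned.
    """
    failed = []
    for path in local_paths:
        name = os.path.basename(path)
        try:
            with open(path, "rb") as handle:
                ftp.storbinary("STOR " + name, handle)
            log.info("uploaded %s", name)
        except all_errors as error:  # includes OSError, e.g. a missing file
            log.error("failed to upload %s: %s", path, error)
            failed.append(path)
    return failed


def run_job(host, user, password, local_paths):
    """Connect, upload, and return an exit code (0 = all files uploaded)."""
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )
    with FTP(host, timeout=30) as ftp:
        ftp.login(user, password)
        failed = upload_files(ftp, local_paths)
    return 1 if failed else 0
```

Keeping `upload_files` separate from the connection logic means the upload loop can be exercised with a stand-in object, without a real server.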


and then you can't search stackoverflow for help, compared to using something like postgres


Is it fair to say that the kind of people who want to pay for an SFTP client for an automated process probably aren't doing much digging around on stackoverflow?


A good way to address this is have a tools budget.

The product manager has $X per team member and is encouraged to coordinate purchases and pool them together for licenses. Licenses can be proprietary or Open Source licenses X, Y and Z.


It's a classic "public good" - an economic activity whose benefits are impractical or unworthwhile to deny to those who don't pay. Things like emergency services, last-mile road infrastructure, and environmental protection work. A special-case of positive externalities. These are a classic example used by economists of a natural place for government in the economy.


Which makes me wonder why governments don't give more grants for writing open source code.

They do it for research, which is another thing that would otherwise probably not pay for itself.

I guess if they started doing it a lot of commercial companies would complain to them about the competition.


The Dutch government has explicitly funded quite a few projects, e.g. libressl and libreoffice, but compared to budgets for closed source, it's still very little.

Here's a nice blog about how the UK deals with this.

https://governmenttechnology.blog.gov.uk/2016/12/15/next-ste...

As to infrastructure work, easiest way to help and still profit is to hire one or more developers that explicitly work on a set of FOSS libraries. That way you have that knowledge in-house and a connection into the community. Also, you'll have some highly motivated employees.


I think that in the US, we've been fighting for so long about how much to cut government that the idea of proposing to create a new category of spending just doesn't occur to Democrats. The closest they can come to imagining something new is free college (which is the same as what we have now [free high school, college scholarships] but more so) and free daycare (which is like free preschool but younger).


You're kind of in a bubble, there is a lot of U.S. government support for open source and most agencies use it. See https://code.gov for a starting point. There's never going to be a huge multi-billion dollar grant program because it is unfair to closed-source companies (who pay taxes) but you are imagining a debate or hangup that doesn't exist.


Why is it unfair? Those companies presumably charge for their software, and maybe will be incentivised to go open source.


It's unfair because those companies can't extract their required funding at gunpoint.


At some point the falsifiable hypothesis "government cannot provide goods or services better than private actors" mutated into the dogma "government shall not provide goods or services better than private actors."


Governments do fund some things directly – most commonly grants to specific interests – but one other area which helps is allowing staff to work on open-source projects. At least in the United States civil servants’ work is generally considered public domain so we don't have to deal with the IP concerns which many companies still obsess over, which is nice.

If you poke around https://government.github.com/community you'll find a lot of government created projects but checking those organizations/ contributors will often turn up a ton of forks of popular tools. One common thing is improving security defaults or accessibility, which are tedious but mandatory for government.

If you value this, make sure to let your elected representatives know: I'm sure they hear from the major contractors regularly.


Or companies for that matter. I know other companies that have an open source division like AT&T labs.


The question is reversed: open source software _is_ a positive externality, but it almost always happened _without_ any government involvement (save for some rare exceptions like SELinux).

This is probably the reason it worked so well.

So, do we necessarily need governments for other positive externalities in the list?


The comment above mine was mentioning chronic underinvestment; I would say that this indicates the current system doesn't work.


It's absolutely not a necessity, but more incentive for writing OSS could be nice.


> It's a classic "public good" - an economic activity whose benefits are impractical or unworthwhile to deny to those who don't pay.

Open source software is not a public good, in the economic sense. There are two criteria for being a public good: non-rivalry and non-excludability. Open-source software satisfies the first criterion (my using it doesn't prevent you from using it), but it's fairly excludable (I can prevent you from using it legally).

As developers, our instinct might tell us that it's not excludable because "if the source is there, nothing prevents me from using it", but when we're talking about goods which fall under copyright law, the legal aspect matters as well as the practicality. And in fact, open-source licenses (such as the GPL and the Apache licenses) can contain provisions which prevent people from using the licensed software under certain circumstances, while still being considered both free and open-source by the FSF and OSI respectively[0]

The real classic example of a public good is national security. Practically, there is literally no way that national security can be applied to people within a country on an individual basis, as opposed to a geographic one. For most threat models (e.g. espionage, (counter-)terrorism), the mitigations are things like "prevent terrorist attacks from happening". You can't apply the benefits of that only to people who have paid for the service - a terrorist attack either happens or it doesn't, and you can't choose who's a victim of it.

> These are a classic example used by economists of a natural place for government in the economy.

Even for things which are actually public goods, like national security, that's overstating the case greatly. Public goods are used as an example of a good for which an individual market cannot exist, but that doesn't mean that the only alternative is a government one.

The so-called "tragedy of the commons" is an appropriate (and ironic) example - despite the way that most people use the term, the town commons was actually something for which there were plenty of well-established codified rights, and these were not always negotiated or enforced by a government entity.

[0] For example, the Apache license contains a patent retaliation clause, which terminates your right to use the software in the event of a patent lawsuit. (Technically it doesn't revoke your right to the copyrighted code, but it does revoke your right to the underlying patents, which amounts to the same thing, because presumably the copyrighted code utilizes the underlying patents, or else it wouldn't be covered by the license in the first place).


Non-excludability isn't just about the law; it's about practicality. It might in some system be legal to prevent fire services from putting out fires in houses that haven't paid for it, but that would be an impractical system.

Similarly, critical software intended for developers is impractical to deny to them; doing so has some serious negative effects on the production process (hard to get good feedback/PRs from customers, for example, unless you expose the source to them to a level that closed-source manufacturers find dangerous to their business model).

WRT the original commons, you can make such things work in a tight-knit community that can enforce social norms (which, by the way, would take on a lot of what are now considered "government functions" in a modern society), but in a larger capitalist economy with actors that aren't inside the community, the government is the only actor that has the authority to enforce public ownership of the commons.


Can we get non-excludability by requiring that any OSS produced with public funds have a license that prevents such possibilities?

I'm imagining a clause like: "this source and any modifications is irrevocably eligible for use by all, provided that its creation did not break any other laws".


Yeah, I really think that open-source software shot itself in the foot by incorporating unlimited free sharing for every recipient into its mantra. Now everyone thinks that open-source has to mean impoverished, because despite all the happy vibes, very few people will pay for software that they could otherwise get for free.

You can make your software "source available", i.e., not open-source under activist definitions but still have a GitHub repo and all that, and restrict [heavy?] commercial use. I think it'd be interesting to see more open-source devs take that route and stop giving away the farm.

This will still allow people to use your stuff, developers will get familiar with the tooling and expect to be able to use it at work, and companies that have the dough can be compelled to pony up for a license.

On Windows, there is still an underappreciated market for cheap early-90s-shareware-style applications that are < $100 a pop, but I think most of them think that sharing the source means they have to enter the poor house, which is sad. We should show people that there's a way to share your source without bankrupting yourself in the meantime.

The GPL almost gets there, as it makes large-scale commercial use undesirable due to its infectious nature, which allows for dual-licensing, but with everything server-side nowadays, those stipulations are much less effective (have to go AGPL).


There's no shared-source license that bans commercial use, but I think the software industry desperately needs one. If I'm selling the software and want to let users make modifications for private use, and even share those modifications if they so desire, I have to hire a lawyer and hope he comes up with something that stands the test of a trial.

Not giving away something for free (as in freedom) but asking for free legal advice may sound ironic but it's not about the money, it's about having something reliable for a very common use case.


That's typically the purpose of AGPL plus a dual license offering for cash. This will probably require copyright assignments to make it work tho


I can't help but think that if the Linux kernel were under a CC-NC-BY-SA-type license, as you suggest, that it would be practically unheard of today.

BTW, please mind the distinction between open-source software and Free Software. You don't have to like the FSF, but the distinction they recognize is important.


>I can't help but think that if the Linux kernel were under a CC-NC-BY-SA-type license, as you suggest, that it would be practically unheard of today.

I don't suggest that the Linux kernel be distributed under non-commercial terms, and I agree with you that such a project wouldn't have done well.

>BTW, please mind the distinction between open-source software and Free Software. You don't have to like the FSF, but the distinction they recognize is important.

Right, so there are 3 levels of "purity" here. For the record, I didn't run afoul of any of them; I intentionally distinguished my suggestion as "source available", not "open-source".

There is "Free Software", which is software meeting Stallman's "Four Freedoms".

There is "Open Source", typically referred to as the improper noun "open-source", which activists insist refers solely to licenses approved by the OSI (Open Source Initiative). Because these include permissive licenses, the FSF considers them potentially non-free, and makes the point that Open Source isn't good enough; it must be Free Software.

Then there is "source available", insisted upon by the OSI people, to indicate that while you can download and modify the source, it is not distributed under copyright terms they like. This would be source distributed only for non-commercial use, for example. Jef Raskin's Archy project (apparently now dead) [0] was distributed under CC BY-NC-SA and made this distinction.

[0] https://en.wikipedia.org/wiki/Archy


No, companies that make billions of dollars in an ecosystem that benefits from said tools and don't pay a dime for the public good are to blame. It's not a problem with open source, it's a problem with a culture that takes and doesn't give back enough.


> Open source software has to be one of the least efficient markets out there

Does it have to be a market?


Amen. Somehow a lot of comments in this thread are blinded to the fact that the 'commons' are not simply subsumed within the 'market'. The logic of commons is quite orthogonal to that of markets.


> Open source software has to be one of the least efficient markets out there.

The regular market rules don't really apply to Open source software. A lot of viable (or even thriving) open source projects would be dismal failures as stand-alone businesses or startups. Paradoxically, the only way they can provide real value is in their current form of open source projects.

I would have given CyanogenMod as an example, but the amount of inept management at the startup there would cloud the issue.


Hopefully the author will be able to parlay the street cred generated from PhantomJS by selling future employers or customers on the same napkin math.


Also, Firefox will get headless mode in a few releases: https://bugzilla.mozilla.org/show_bug.cgi?id=1338004


thank god. I was afraid we will be back in the grasp of the google's moloch.


Curious what "moloch" means in this context. I can google it, of course, but I get "Biblical name relating to a Canaanite god associated with child sacrifice". Which doesn't help me much. End users are children that Google is sacrificing? Or?


To give a shorter answer: Scott Alexander (at the slatestarcodex link), through the poem, associates Moloch with negative-sum games, where no one comes out better than they went in. In extreme cases they force us to sacrifice the things we love in order to survive. You throw your children to Moloch to help you defeat enemies; otherwise, you die. Your enemies do the same thing. It would be better if nobody sacrificed their children, but nobody is in a position to bring that outcome to pass.

In this context, I would interpret "google's Moloch" along the lines of: Google is net-bad for the world, because of privacy issues and problems with centralisation and so on. Using Google's software (and services) makes them more powerful, so people don't want to use Google's software. But because everyone else is using Google's software, the world is optimized for Google users in a way that it isn't optimized for non-Google users, and so it's difficult to escape. And so Google grows yet stronger, and it becomes more difficult to escape.

(To clarify: this is my interpretation of grandparent's use of the phrase. It's not my own position, and there's a decent chance that I'm completely off-base and it's got nothing to do with grandparent's position either.)


It's a term that's been kicking around literature for centuries. It's in Paradise Lost, for cryin' out loud. Why on earth would its use be a reference to some random prolix internet dude's weird verbal recreation of the La Brea Tar Pits?


Like I say, I don't know that it is. I presented a hypothesis that seems like it fits the facts fairly well.

Do you have another hypothesis about what the term means in context? I note that "it's a reference to Paradise Lost" is not very descriptive: for example, if I talked about "Google's Frodo" you might ask what I mean by that, and "he's a character in Lord of the Rings" does not answer the question.


I don't really have to have a hypothesis, it's not that uncommon a term. It's used for all sorts of things including a fairly generalized 'insatiable and demanding metaphorical monster'. If someone offhandedly mentions Icarus it seems reasonable to assume they're not really alluding to a review of the Hungarian brand of buses posted to alt.rec.bus in 1991.

Take a look at, say,

http://www.nybooks.com/search/?s=moloch&option_match=&year_a...

Lots and lots of Moloch. Your hypothesis is 'it's a reference to some logorrheic blogger'. Sure, it's possible you're right but it's one hell of a weirdly specific guess. It's not like the three people to ever mention Moloch were Milton, Ginsberg and internet-man-addicted-to-his-own-typing.


Well, there is more than one social allegory related to Moloch. Many of them have no obvious relationship to Google. Which is why I asked the question in the first place. So people tried to be helpful and identify any sources that might be related.

I just read the character description for Moloch in Paradise Lost, and don't see any obvious theme you might tie to Google.

In fact, I'm still not quite sure what the original comment was trying to say. Apart from some fuzzy notion that capitalism is sort of like Moloch and we keep feeding it with our "children". Where children is what? Privacy? Money? Open source tools?


I don't think that "Google's moloch" was proper usage, but in literature (e.g. Howl by Allen Ginsberg), Moloch refers to something requiring a very costly sacrifice. In Howl, many critics argue that AG is referring to capitalism when he uses Moloch.


They are probably referring to http://slatestarcodex.com/2014/07/30/meditations-on-moloch/


https://www.poetryfoundation.org/poems-and-poets/poems/detai...

Heading II


And while people wait, you can already do a 'poor man's' headless Firefox thanks to SlimerJS and xvfb.

PhantomJS is less resource heavy if you're constantly spooling up and down lots of instances but I prefer SlimerJS w/ Firefox since it lets you keep up to date with a modern version of Firefox (rather than relying on sporadic QtWebKit updates from PhantomJS).

If you're using CasperJS, SlimerJS is virtually a drop-in replacement for PhantomJS (though I worry about how long/well CasperJS will continue to be maintained).


One can use Xvfb to run normal Chrome as well (with a few gotchas like --no-sandbox, --disable-gpu and a dep on dbus-x11). A test I'm working on at the moment takes 28 seconds in PhantomJS and 6 in Chrome under Docker and Xvfb.
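As a rough sketch of the launch itself, assuming the `xvfb-run` wrapper is installed and that the Chrome binary is called `google-chrome` (both are assumptions about the environment; the two Chrome flags are the ones mentioned above, and the function names are made up for illustration):

```python
import shutil
import subprocess


def chrome_under_xvfb_command(url, chrome_binary="google-chrome"):
    """Build the argv for running regular Chrome inside a virtual X display."""
    return [
        "xvfb-run",
        "--auto-servernum",  # let xvfb-run pick a free display number
        chrome_binary,
        "--no-sandbox",      # commonly needed when running as root in a container
        "--disable-gpu",     # there is no GPU behind the virtual framebuffer
        url,
    ]


def launch(url, chrome_binary="google-chrome"):
    """Start Chrome under Xvfb and return the process handle."""
    if shutil.which("xvfb-run") is None:
        raise RuntimeError("xvfb-run not found on PATH")
    return subprocess.Popen(chrome_under_xvfb_command(url, chrome_binary))
```

In a test setup the browser would normally be driven through chromedriver rather than launched directly, but the same wrapper and flags apply.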


Very interesting comparison! Could you write a note on your experience somewhere?


If you need a really good alternative, check out https://github.com/arachnys/athenapdf. It's based on Electron.


Looks very nice but I'm not using Slimer for PDF stuff.


A colleague and I spent a couple of days setting up high-fidelity webpage->PDF rendering, and by far the best results came from SlimerJS and xvfb.


I'm actually hoping CasperJS will support headless Chrome, and headless Firefox later on too.


Though a collaboration between the two projects might not be out of the question: https://groups.google.com/d/msg/phantomjs-dev/S-mEBwuSgKQ/PQ...


google should hire him


> I even bought the Mac for that!

I did too, then found out you also needed a dev license for users being able to run your app. Supporting Mac/OSX is damn expensive if your app is free.


For iOS development I believe they got rid of that requirement, you can develop iOS apps (at least) using a personal dev account; only when you want to go to the app store do they ask for the license fee.


Unfortunately they limit app capabilities (such as iCloud, keychain access), so you can't use those even for your personal account.


How do I do this without buying dedicated hardware?


Illegally (virtualization) or in the cloud.


PhantomJS enabled us at the time to bootstrap a big project at work where, at the end of the workflow, the app had to turn HTML orders into PDFs on the fly. Eventually we moved to WKHTMLTOPDF (https://wkhtmltopdf.org/), which is much less hungry for resources, but PhantomJS nonetheless played a huge role during the early days of the project and was easy to set up. If I remember correctly, the only downside was finding the correct format for our HTML template so that PhantomJS would render proper page breaks and repeat the header for super long orders.

I can understand why stepping down is the right decision: maintaining such a project by himself is an amazing feat on its own, and even more so when it proves to be useful for so many companies. Sadly, when it becomes your second job you might always be on the lookout for a clean exit, and such an opportunity just became a reality.

Good luck in your future projects Vitaly!


We have used both PhantomJS and WKHTMLTOPDF. PhantomJS hogs a lot of resources but is very good when you want to print large PDFs (500+ pages). WKHTMLTOPDF struggles with larger HTML files.


Check out https://github.com/arachnys/athenapdf. It's based on Electron.


Good to know. This is not our use case at the moment; we mostly generate PDFs that are 2-3 pages long.


> Chrome is faster and more stable than PhantomJS. And it doesn't eat memory like crazy.

Wait, can someone tell me where to download this doesn't-eat-memory-like-crazy version of Chrome? Activity Monitor is showing me 2GB of Chrome processes right now and that's even with The Great Suspender having paused almost all my tabs.


I saw a trick where you can run Chrome and give it less memory, and it uses less memory. This is done using cgroups.

The blog post is somewhat old[1] and not in sync with the version that is in Git[2]. You might find a way to do this without Docker. (I was using an old version of Docker and an old kernel and couldn't get it to work, but I need the old version of Docker for reasons.)

Chrome will aggressively consume any memory you give it (up to a point?) to "make your browsing experience better" somehow. You're not wrong. But there is modern technology that can make it better. If you have a fast SSD, then Chrome can still use Swap to make your experience better. The later version in the Dockerfile linked on Git also leverages swapaccount with the seccomp setting.

This may be one great use of Docker for people that wouldn't yet have been convinced to use Docker for any serious reason.

[1]: https://blog.jessfraz.com/post/docker-containers-on-the-desk...

[2]: https://github.com/jessfraz/dockerfiles/blob/master/chrome/s...


You can just use cgroups on their own without docker


I would guess ulimit -v might work as well, so you may not even need cgroups.


Hah! Something new is also something old. Thanks for that.

But won't you run into issues with child processes, of which Chrome tends to spawn a zillion? (I'm reading that each one gets its own limit under ulimit...)

And potentially get your browser killed when it hits the limit? I haven't tested this and I don't really know how ulimit works, but I think it's a less effective solution than the cgroups for at least one reason.


Yes, it is per process, so it would only limit memory-per-tab. I believe you get the "aw snap" window in chrome if it hits the limit.
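To make the per-process point concrete, here is a minimal sketch in Python, assuming Linux. It applies the same limit that `ulimit -v` sets (RLIMIT_AS) to one child process. The names `limit_address_space` and `run_limited` are made up for this illustration. Every process started this way gets its own copy of the cap, which is why this approach cannot bound a whole multi-process browser the way a cgroup can.

```python
import resource
import subprocess
import sys


def limit_address_space(max_bytes):
    """Build a function that caps the calling process's virtual memory."""
    def apply_limit():
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        # Never ask for more than the existing hard limit allows.
        if hard == resource.RLIM_INFINITY:
            soft = max_bytes
        else:
            soft = min(max_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))
    return apply_limit


def run_limited(argv, max_bytes):
    """Run argv with RLIMIT_AS applied in the child only (like `ulimit -v`)."""
    return subprocess.run(
        argv,
        preexec_fn=limit_address_space(max_bytes),  # runs in the child before exec
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Allocating 2 GiB under a 1 GiB cap should fail inside the child.
    result = run_limited(
        [sys.executable, "-c", "bytearray(2 * 1024**3)"], 1024**3
    )
    print("child exit code:", result.returncode)
```

The parent process is not limited at all here; only the child (and anything it forks) inherits the cap, each with its own separate budget.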


Disable your extensions one by one until you identify the one leaking memory.


Or just look at the chrome process manager. Shift+Esc


That doesn't tell you if an extension is injecting code which causes a page to bloat significantly.


Ah right, good point.


I'm not sure that the results should be expected to be the same in GUI and in headless mode. I don't know - I'm just saying this is not clear without a test or clarification from someone who knows how it all works.


I'm assuming they are referring to: https://www.chromestatus.com/features/5678767817097216

Or something pretty close to it.


Is it Chrome eating memory for the websites you have visited? Suspended tabs suggest issues; their memory may not be freed until you close them.


Wow, that was a quick reaction.

Thanks to the maintainers for all the good work!

To me, it's always a sad occasion to see diversity diminished. Nothing against Chromium, but I hope it won't be the one browser to rule them all. It's always good to have alternatives.


They have known about the project since June. This is an interesting conversation:

https://groups.google.com/forum/#!topic/phantomjs-dev/S-mEBw...

( posted by askmike in https://news.ycombinator.com/item?id=14105613 )


Naive question here. What makes headless mode so difficult?


It's a very good question. One might imagine that the browser renders everything into a buffer at some point, and you could simply ask the engine to give you a pointer to that data.

The reality is very different. WebKit/Blink rendering is intimately tied with the graphics system of each platform, in particular through the use of native widgets and native window system compositors.

For example, on the Mac, a lot of compositing within the browser window is done using Core Animation layers. This is a really good idea for performance, because it leverages the work done by Apple to improve their GUI performance.

The downside is that capturing the output becomes very tricky when the browser doesn't do the final compositing. Previously this didn't really matter because 99.99% of browser rendering is for end users and they don't need to capture the output (or if they do, they would just use platform GUI functionality like screen capture).

An increasing demand for headless rendering has effectively forced browser engine teams to rethink some of the internal APIs so that a pipeline can be built to capture the final rendering.


It's the same order of magnitude of work as maintaining a fully-fledged browser. It's sad to see open-source projects shut down, but being a sole developer is a lot different from having the resources of a giant like Google, for example.


Thank you for PhantomJS! I've been using it for testing, generating PDFs and screenshots.


Yes, thanks PhantomJS!

I'm wondering, are there any examples out there for generating screenshots on headless Chrome?


From the docs it sounds like it is largely compatible with Selenium https://chromium.googlesource.com/chromium/src/+/lkgr/headle...

If that is true, takeScreenshot() should hopefully work as normal.


I can confirm that. Here's an example of how we take screenshots when automated tests fail, using Selenium + Chrome:

https://gist.github.com/masonmark/2332c1238a2fa70b5e4fcfffdc...
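For a quick screenshot without Selenium, headless Chrome can also be driven from the command line. Below is a rough Python sketch that builds such a command. It assumes a Chrome 59+ binary is on the PATH under the name `google-chrome` (the name differs per platform), and that the `--screenshot` flag accepts an output path; the helper names are made up for this example.

```python
import shutil
import subprocess


def headless_screenshot_command(url, output_path, width=1280, height=800,
                                chrome_binary="google-chrome"):
    """Return the argv for a one-shot headless Chrome screenshot."""
    return [
        chrome_binary,
        "--headless",                        # run without a visible window
        "--disable-gpu",                     # commonly recommended for early headless builds
        f"--screenshot={output_path}",       # assumption: path form of the flag
        f"--window-size={width},{height}",   # viewport used for the capture
        url,
    ]


def take_screenshot(url, output_path, **kwargs):
    """Run the command, failing early if the assumed binary is missing."""
    command = headless_screenshot_command(url, output_path, **kwargs)
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} not found on PATH")
    subprocess.run(command, check=True)
```

Calling `take_screenshot("https://example.com", "example.png")` should leave a PNG in the working directory if the assumptions above hold.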


I am happy for the guy, as he seems to be able to let go without letting anybody down (which seems to be important to him). At the same time, it is sad when people feel such pressure over something they probably started as a fun project.


Question for those of you more involved with such headless tasks. Do you think that chromium and firefox supporting headless will induce a surge in bots crawling the open web from now on?


Right off the bat: No. The reason for that is that crawling using a proper browser (i.e. Chrome) is a lot more resource-intensive than using a dedicated tool which only gets the top resource and maybe tries to parse some additional resources. With these kinds of tools you're limited by available bandwidth and IO speeds if you want to store things. If you're looking at a browser, you'll be limited more by things like memory consumption and CPU time, so you'd need a bigger box or more of them to drive the same amount of traffic. There is also not the same amount of ready made applications which take care of crawling, storing and maybe even indexing your data, so not something you can do without actually implementing a lot of things yourself.

Of course, that is only talking about wide-scale scanning. If you're only looking to scrape a single target, for whatever reason, then having an instrumented headless browser will greatly simplify things. Headless Chrome should be more efficient than running it in a (virtual) framebuffer. Plus the whole setup for a powerful crawler is reduced to "install Chrome, start it, point $crawler at the API endpoint". My guess is that we might see turnkey crawling / automation tools appear where you supply a list of URLs and the library + Chrome does the rest. Then, browser-based large scale scanning will be within everyone's reach, only limited by their resources.

Background: I created https://urlscan.io which will simply visit a website and record HTTP interactions (annotating the resources with some helpful meta data). I've been preaching the power of headless instrumented browsers for the better part of a year now ;)


Not a whole lot more than they are at the moment.


With chrome headless we still need an api like phantomjs or slimerjs to have the same functionality.


My colleague from automatic testing says that phantomjs actually is much more stable than chrome...


That's true, but in my experience the instability of Chrome comes from opening and closing its windows repeatedly, as a large test suite often does. Occasionally it doesn't seem to like opening a new instance while another is closing. Headless mode should resolve that problem.


There was also the issue a few years ago (last time I wrote automated scripts) where the Chrome driver for Selenium would go too fast for the browser to keep up, causing false failures.

I had to implement a "wait between actions" feature to handle it, while Phantom had no such problems. I'm assuming this will not be an issue with headless Chrome, since I think half of the problem was due to graphical rendering.


It's both sad to see an incredibly useful project be sunsetted and exciting that it's no longer needed. I remember a project that used phantomjs to scrape an old government camping site to build a compatibility layer on top.

Thank you, if you're reading.


Good riddance! PhantomJS is a non-stop firehose of random errors and productivity breakdowns. It's also a way better JS driver than anything else out there. I'm glad Google is following in their footsteps and integrating Phantom's features directly into Chrome, where it will be supported by a large team and (hopefully) headless use of the Blink engine will be standardized so your test integrity doesn't depend on a patch version upgrade of your underlying JS implementation.

So, cheers to you Vitaly and anyone else who's helped make Phantom & Poltergeist into my favorite Capybara web driver!


This is sad. PhantomJS is better stripped down than headless Chromium: if you ever try to install Chromium on servers without X, it requires a shit ton of dependencies, while PhantomJS was properly modified and requires only a minimal set of libraries.


I don't really see the problem. After all, you just install those deps once in your Dockerfile, right? ;)


chromium-browser requires these

    chromium-browser chromium-browser-l10n chromium-codecs-ffmpeg-extra cpp
    cpp-4.8 fontconfig fontconfig-config fonts-dejavu-core hicolor-icon-theme
    libasound2 libasound2-data libatk1.0-0 libatk1.0-data libatomic1
    libavahi-client3 libavahi-common-data libavahi-common3 libcairo2
    libcloog-isl4 libcups2 libdatrie1 libdrm-intel1 libdrm-nouveau2
    libdrm-radeon1 libfile-basedir-perl libfile-desktopentry-perl
    libfile-mimeinfo-perl libfontconfig1 libfontenc1 libgdk-pixbuf2.0-0
    libgdk-pixbuf2.0-common libgl1-mesa-dri libgl1-mesa-glx libglapi-mesa
    libgmp10 libgnome-keyring-common libgnome-keyring0 libgraphite2-3
    libgtk2.0-0 libgtk2.0-bin libgtk2.0-common libharfbuzz0b libice6 libisl10
    libjasper1 libjbig0 libjpeg-turbo8 libjpeg8 libllvm3.4 libmpc3 libmpfr4
    libnspr4 libnss3 libnss3-nssdb libpango-1.0-0 libpangocairo-1.0-0
    libpangoft2-1.0-0 libpciaccess0 libpixman-1-0 libsm6 libspeechd2
    libthai-data libthai0 libtiff5 libtxc-dxtn-s2tc0 libx11-xcb1 libxaw7
    libxcb-dri2-0 libxcb-dri3-0 libxcb-glx0 libxcb-present0 libxcb-render0
    libxcb-shape0 libxcb-shm0 libxcb-sync1 libxcomposite1 libxcursor1
    libxdamage1 libxfixes3 libxft2 libxi6 libxinerama1 libxmu6 libxpm4
    libxrandr2 libxrender1 libxshmfence1 libxss1 libxt6 libxtst6 libxv1
    libxxf86dga1 libxxf86vm1 x11-common x11-utils x11-xserver-utils xdg-utils
phantomjs requires these:

    fontconfig-config fonts-dejavu-core libfontconfig1 libjpeg-turbo8 libjpeg8


Err, that's... simply not right. The difference isn't that dramatic.

  $ docker run -it ubuntu:16.04 sh -c 'apt-get update && apt-get install phantomjs'
    134 newly installed. After this operation, 339 MB of additional disk space will be used.
  $ docker run -it ubuntu:16.04 sh -c 'apt-get update && apt-get install chromium-browser'
    202 newly installed. After this operation, 615 MB of additional disk space will be used.


I'm assuming a lot of those dependencies (x11, libgtk) will disappear once you can install a dedicated headless build of Chrome.


Not everyone is using Docker.


Besides testing, my team uses PhantomJS to convert pages to PDF and also to convert JavaScript-generated charts to images. This is sad news. I don't think Chrome will eventually add support for these.


There's wkhtmltopdf[1] when you need the PDF functionality. It is licensed with LGPL, so you can make it interop with your commercial product.

[1]: https://wkhtmltopdf.org/


And there's its sister project wkhtmltoimage for rendering to images as well :) Sadly, though, the browser seems way behind the times so I think Chrome will win the next round. I built a prototype library for rendering pages to animated GIFs using the Chrome debugging protocol, but headless will make it even easier: https://github.com/peterc/chrome2gif


While it's much better than PhantomJS, it still causes a lot of issues. We run a paid pet project [1] for HTML to PDF conversion, and most of our customers have stories that begin with PhantomJS, move to wkhtmltopdf, and finally end with something else due to issues with both.

Headless Chrome might solve this issue once and for all though.

[1] https://restpack.io/html2pdf


For me the only issue was kerning with some specific fonts on windows servers. Do you have any other examples?


I have been using it for almost 2 years on a production server to generate html orders to PDF on the fly, such an amazing tool.


Seconding wkhtmltopdf... I love PhantomJS but wkhtmltopdf is a better tool for PDFs.


Did you have any luck with SPA to PDF? I recall trying it, but I had no luck getting good JavaScript support.


SPA as in Single Page Application?

We are using it for reports and we are tailoring our HTML specifically to the reports so we haven't had that issue. If I needed Javascript it may not be the best solution.

I like that it generates full PDFs (with real text objects) though not just a static image. I'm not sure if you can generate a PDF like that with PhantomJS. I haven't tried.


We recently ended up using NightmareJS with xvfb in a Debian Docker image to produce high-fidelity PDFs.

Seems to work well so far.


Chrome will have headless save to PDF fairly soon: https://bugs.chromium.org/p/chromium/issues/detail?id=603559


You can use WebDriver to take screenshots. OTOH I don't think there's any way to do PDF generation without fucking around with injecting window.print() and trying to go from there.


Hey - what do you use for the pdf conversion specifically? We've been looking for something like this. Thanks in advance.


Many thanks to the maintainer for his work. I think this isn't unexpected, and I would actually encourage other unpaid maintainers to follow. My reasoning is that the current state of voluntary support is unsustainable anyway, and by letting it go we could maybe make the market for dev tools economically viable again.


PhantomJS is a great tool, I implemented it as a PDF report generating system. But will Chromium be able to replace it in this regard? Will Chromium have paging features? Will it be able to repeat table headers when a table body content extends to the next page?


I can't answer all of your questions, but this thread may interest you: https://news.ycombinator.com/item?id=14102248


Huge help, thank you!


You deserve a nobel peace prize.

I fondly recall informally introducing colleagues to Chrome dev tools, injecting jQuery via a bookmarklet, and querying the DOM like XML with XPath. Then taking this headless (server-side) with almost minimal wrapping, thanks to your work.

Hail you and damn regexp.. :)


Anybody know how to port code from PhantomJS to headless Chrome? I have been using CasperJS, which wraps PhantomJS. PhantomJS had its own set of commands, so headless Chrome will have to be different one way or another.


I wrote a bit about how you can get started here using Node: https://objectpartners.com/2017/04/13/how-to-install-and-use...


Looks like it's already usable?

"Use headless chromium with capybara and selenium webdriver - today!" -- March 30th

http://blog.faraday.io/headless-chromium-with-capybara-and-s...


NW.js v0.23 will support headless with Chromium 59. I've been collecting feature requests and sharing the plan. Please see https://github.com/nwjs/nw.js/issues/769#issuecomment-259867... and https://github.com/nwjs/nw.js/issues/769#issuecomment-294064...


I found PhantomJS pretty unusable for my use-case. It always spits logs to stdout and there's no way to change that. Filed a bug and he said "fix it yourself"...sorry I don't know Qt.


As a developer, I wonder how hard can it be to disable console logging on any project if the source code is provided.


Welcome to open source software. Somebody has to fix it.

Better than it being closed source and you being told to go away.


You don't need to know Qt to comment out printf


I maintain a web driver too (Selenium-based) and have been wondering about how it would compare to Selenium ChromeDriver headless. Mine is built in Java and uses Java's embedded WebKit. If anyone has feedback this is where I'm discussing it, https://github.com/MachinePublishers/jBrowserDriver/issues/2...


I think it is a wise decision too. As an OSS collaborator, it is hard to explain how important and demanding this work is. I really understand his feeling, and I hope that more people like him will collaborate in OSS projects. Thanks for everything!


PhantomJS was the first thing we used for Neocities screenshots and I've always had a special affection for this project for that reason. Neocities really wouldn't have been possible without the ability to do screenshots.


This is a great move. I appreciate what Phantom (and the maintainer) were trying to do, but I have always loathed PhantomJS. It has never worked well. In fact, I'd been away from it for some time, but just last night needed to install it to run some tests and it caused massive frustration.

I pulled a Node repo that ran tests using Karma (why people use Karma is a complete mystery to me). I pulled the repo, ran `npm install` and then `npm test`. Sure enough Karma explodes out of the gate.

Phantom can't start. I'm on Windows 8.1. I debug for an hour, eventually finding a magic custom binary Ariya created. I then have to copy this binary to the `/node_modules/karma-phantom-launcher/node_modules/phantomjs2-ext/bin` directory.

All this to run some Jasmine specs.

If Chrome headless support is really as good as "works just like Chrome without the GUI" then I will be one happy camper.


I personally use Chrome at the moment, because on larger projects PhantomJS is unforgiving with syntax errors and tests will simply fail.


Where's Ariya in all this? He seems to have also completely abandoned Phantom...?


Is there any way to donate to the PhantomJS project? This seems like a good time to throw some money their way, in thanks for what the maintainers (mostly Vitaly over the past few months, at least, it looks like) have done.


I run phantomjscloud.com, I guess this is the writing on the wall and I better start building another [chrome] back end soon. Probably a name change is in order too!


I appreciate all the hard work.

In my last job, I used PhantomJS with highcharts to provide a web service for generating charts. And used it with the poltergeist gem for headless testing.


Hey dude... thanks from a grateful dev in Scottsdale, AZ. Your hard work enabled a lot of really cool stuff for us! Good luck in your future adventures!


Thank you for your work Vitaly, it's truly inspiring, you've made a difference for the community. Good luck on your next project.


Thanks for the work, PhantomJS made generating screenshots to show different UI states in bulk effortless.

It has saved me days of effort over the last year.


Thanks Vitaly for your work! SEO4Ajax would certainly not exist without PhantomJS. It helped us to deliver the service efficiently at the time.

Unfortunately, we had quite a few compatibility issues with it, leading us to migrate to Chrome (with xvfb) one year ago. Since then, we must confess that we are very happy with this choice. Chrome is indeed very stable, fast and, more importantly for us, always up-to-date.


I have been using PhantomJS for a couple of years for data scraping. It's a really good project.


That's quite sad.


It might be for the best. One of the many companies using PhantomJS to make money could go ahead and employ Vitaly to work on the project full time.


I'm curious - What is the utility of headless browsers?

Are there people who earn money by getting it to automatically fill out forms, enter competitions etc?


Testing, rendering, scraping, streaming, botting, and many more.

Think "what could I accomplish with a browser, with the slow human replaced by a fast program" and let your imagination run wild.

The space is interesting enough that people have jumped through a lot of hoops to make it work in the past; this makes it one less hoop.

Oh, and if you ever wondered why web captchas are a thing, one of the reasons is headless browsers.


A great example would be PDF generation for things like invoices. Rather than generating a PDF with something like PHP or Java, render a regular HTML page with all the CSS you want (super easy compared to drawing a PDF using PHP) and then use a PDF printer on that page.

You could run such a thing as a microservice using a headless browser or PhantomJS. There are probably better ways to do this but that's one of the first things that popped into my head!


The Webkit/PhantomJS PDF export actually supports SVG embedding (as real vectors), Webfonts and many other things.

It's possible to create pretty advanced layouts with maps, graphs using that. Even embedding IFrames with e.g. Google Maps works.


This sounds horrible... In general, all these HTML->PDF ways of generating PDFs sound horrible -- why don't people use LaTeX for this?


Because you probably already did the layout work in HTML to display on screen to the user and now just want a PDF version of it.

Or you can redo the layout in latex and maintain two layouts.

The full print css is actually pretty complete, problem is the only browser that fully supports it is PrinceXML. None of the major browsers seem to care much about print layout.


But HTML layout is very different to page based layout. HTML is responsive and has no concept of pagination. PDF is paginated and has no concept of responsiveness.


CSS2 has the concept of paged layouts: https://www.w3.org/TR/CSS2/page.html


It is NOT horrible. However, Latex is another way of doing things. Perhaps your requirements should dictate which of the two you should go for!


A few little personal things I've done with PhantomJS:

• A script that would go to Comcast's TV schedule for my area and make a list of all movies upcoming in the next two weeks on all channels that are included in my subscription. I could then grep that for a list of movies I've been looking for.

I couldn't just grab the page with curl and parse it, because JavaScript does most of the work. JavaScript fetches the listings, and when you advance the listings it fetches the new listings and replaces the old ones on the page.

• A script that goes to the FCC's license information site and gets a list of all ham radio callsigns issued recently [1].

• A script that given a URL to a tactics problem on lichess gets the FEN for the position. I'd use this if I was doing tactics training there on my iPad and did not understand why my answer was wrong or why their answer was right.

I'd mail myself a link to the problem, and then later on my desktop I'd give that URL to this script and it would go to lichess to the problem page, and then from there to the board editor page for that position, and grab and give me the FEN, which I could then use to set up the position in Stockfish to analyze.

(This is no longer useful. They have made some changes at lichess and now they have a browser-based version of Stockfish on the problem pages, so I can answer my questions right there).

• A script that goes to everquest.com and gets the server population levels from the server population display on that page.

I don't think that there was anything in this one that actually needed a headless browser. As far as I recall it could have all been done with getting the page with curl and parsing it. It was just easier to do it in JavaScript using the DOM. (The lichess one may also have been that way).

[1] https://github.com/tzs/todays_hams


Web scraping for sites that discourage simple robots by checking for JS or serving content via JS.

Also, the darker stuff like click fraud and all the other kinds of fraud where you pretend there are humans doing something when in fact there's just a bot.


I wrote a PhantomJS script to download data from my bank accounts. They offer no API and dysfunctional text-based exports, and their websites are riddled with "good" (=terrible) web & security practices, like 3-characters-of-the-whole-password authentication, single-tab sessions, frames, etc. That makes them pretty much impossible to scrape with Python but relatively easy with a fully-fledged browser (although that still requires a lot of bank-specific boilerplate code).


If your bank has a mobile app, it might be easier to MITM and figure out their API and use it directly.


That actually sounds like it could run you into legal issues (or worse), depending on your location (ie - access to a computer system without permission; they give you permission to use the app on the phone, but maybe not to use the API directly). YMMV.


Who do you bank with?


NatWest, Halifax and AMEX (not a bank, but I want my account data from there as well).


I can't help you with Halifax or AMEX (yet), but my company (https://teller.io/) has a Natwest API in production (private beta). If you would like access, please ping me. sg -at- teller.io


I use https://github.com/bfirsh/needle/blob/master/README.md for automated UI regression testing. Using a headless browser means your test suite can run faster and with fewer dependencies.


Downloading any kind of web page where a simple wget or curl turns out empty, for instance anything made with React or other advanced JS frameworks.


Server side automation of multiple kinds (not just tests): screenshots, advanced crawlers, etc.


At work I was given permission by a vendor to screen scrape their site while they worked on building a real API. This site was extremely dependent on javascript. Including doing some really complex token passing between multiple domains that the company owned. Not to mention all of their js was minified and uglified so I had a very hard time understanding what it was doing.

It was the first time I wasn't able to successfully reverse engineer a site enough to scrape what I needed with just requests/beautifulsoup. I was however able to get it working just fine using phantomjs via selenium via splinter. It was a fun exercise, but part of me still feels like it was cheating.


While we're on the topic, does anyone know where one might find scripts to scrape bank statements so you don't have to download them manually every month? (This is one thing I would find headless browsers useful for...)


>Scrape bank statements...

There is Kantu: https://www.a9t9.com/kantu/web-automation

It uses screenshots and OCR to automate web browsing and scraping, so you do not even have to "touch" the DOM. You simply draw a frame around the areas that you need to have extracted and OCR'ed. It also works with PDFs.


Boobank (http://weboob.org/applications/boobank) is such a collection of scripts (although mostly centered on French banks).


If your authentication is basic enough, it shouldn't be a problem to write one. The problem with a lot of banks is things like two-factor auth to do anything. If I didn't have a mortgage locked in at a crazily great rate, I'd consider changing banks just to get one that'll let me automate statement downloads. The alternative is to build a device to press buttons on their 2FA device... Come to think of it, that might be a fun hack.


2FA isn't my issue here. Actually downloading the statements is. Lots of banks use JS, some go through really weird hoops getting you the PDF that are difficult for non-experienced people to automate. Stuff like JS in embedded iframes that generate the link on the fly and open a new tab that you have to navigate. It's hard to accurately detect all the links and handle things like "Next Page" and so on, especially for more than one bank. It's quite nontrivial.


If you don't have 2FA issues, then while I agree it's non-trivial, it's certainly doable with a headless browser. But yes, I'd love for there to be simpler ways to do this in general.


Who do you bank with? I might be able to help.


For UK banks check out https://teller.io/ for an API to your bank account (Disclosure: my company)


Every UK bank I've used allows some format of csv/qif/ofx export.

Are you in the US?


They generally do. But many UK banks make it hard to automate things. Either insisting on 2FA in all cases, or having a secondary login without 2FA that only gives very limited access.

Some way of authorising API access to read-only access to things like statements would be fantastic, to the extent that I'd consider changing banks over it, if you know of any UK banks that offer it.


As well, most bank OFX/CSV exports I've dealt with are truncated in some ways (e.g. truncated labels), which make it harder to really leverage sometimes.


Monzo will be offering current accounts soon.

https://monzo.com/blog/2017/04/05/banking-licence/


Ahh of course. Apologies for missing that.


The format isn't the issue. The problem is I want it to be API-friendly so I don't even have to think about it; my system should download it automatically.

But yes, I'm talking about the US.


My bank offers that, but it costs I believe ~$15 a month.


I used PhantomJS as part of a report generation pipeline which served no HTTP requests and contacted no outside servers. We made PDFs and ready-to-email, single-file HTML reports with some minimal interactive features. (Ready-to-email, single file == all images turned into data URIs, styles inlined, for HTML files sent as attachments)

PhantomJS loaded up an HTML file written earlier in the pipeline. The HTML consisted of a big slug of JSON containing all the relevant data (which would vary from one run to another) and a bunch of scripts and templates (which were fixed for any given report type). The scripts built into the HTML file would chew up the JSON slug and build up the DOM required for the report. Then the PhantomJS script would identify all the images in the DOM and replace all of them with data URIs, strip out the JSON slug to prevent giving away more data than contained in the DOM, and strip out all of the templating JavaScript, leaving behind only the JavaScript needed for the interactive features, which was inlined.
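As a rough illustration of the image-inlining step only, here is a small Python sketch. Note that this is not the PhantomJS approach described above, which walked the live DOM; it is a plain regex pass over static HTML, and it only handles double-quoted `src` attributes pointing at local files. The helper names `to_data_uri` and `inline_images` are invented for this example.

```python
import base64
import mimetypes
import re
from pathlib import Path

# Matches <img ... src="..."> and captures the parts around the URL.
IMG_SRC = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")', re.IGNORECASE)


def to_data_uri(path):
    """Encode a local file as a base64 data URI."""
    mime, _ = mimetypes.guess_type(str(path))
    payload = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime or 'application/octet-stream'};base64,{payload}"


def inline_images(html, base_dir):
    """Replace local image references with embedded data URIs."""
    def replace(match):
        src = match.group(2)
        if src.startswith("data:"):
            return match.group(0)  # already inlined, leave untouched
        uri = to_data_uri(Path(base_dir) / src)
        return match.group(1) + uri + match.group(3)
    return IMG_SRC.sub(replace, html)
```

The result is a single HTML string with no external image dependencies, which is what makes the file safe to send as an email attachment.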

We went with PrinceXML for PDF generation. I was briefly nervous because I saw people praising PhantomJS' pdf generation capabilities... but then I saw the people saying, "we used PhantomJS for pdf generation, we used wkhtmltopdf, then we just paid some money to get something that wouldn't produce weird output some of the time." CSS Paged Media Module FTW, y'all.


We used it a lot for full automation tests for the UI. It's nice being able to interface with a full-featured browser that can run javascript, etc. And take screenshots when things go wrong.


We have a complex matrix of layouts and styles a user can choose from, and we need to test them across all browsers to make sure any improvement doesn't mess with the others.

It's way cheaper to launch a headless browser at all the resolutions we need and grab screenshots to visually compare them at a glance, instead of going through them one by one by hand.


>Are there people who earn money by getting it to automatically fill out forms, enter competitions etc?

Yes. Tiny ex. https://news.ycombinator.com/item?id=1165680


My favorite use is automated browser testing. Examples:

- in Ruby, poltergeist (https://github.com/teampoltergeist/poltergeist)

- in Elixir, hound (https://github.com/HashNuke/hound)


We automate buying things on sites like Amazon.


Can you give me more details? Is it like some automated stock control system which orders from Amazon when stock is low?


I use it to automate turning webpages into PDFs.


Automated tests.


As well as scraping more JS heavy stuff.


I have used them in the past to convert graphs and reports to JPGs and PDFs so that I can automatically email them to people in the company who are unable (or unwilling) to use a web page.


Server side rendering, testing, automation just to name a few.


Automated browser testing on a CI server that doesn't have a gui.


automated testing and web-scraping mostly.


Missing the forest for the binary trees


Yeah, I had the same thought. Just because you have a ton of experience that suggests you'd be good at doing a job doesn't mean a tech company would do something silly like hire you for that job.


Furthermore, you may simply be uninterested in the sort of tasks google might give you.


Like inverting binary trees


I want to say it'd never come up (because it's never come up for me) but I just remembered Google is a search engine.

It may very well be that they have inverted binary trees on every desk.

--

Yes... I am aware inverted indexes are not binary trees.


I was about to ask the same question. LOL


How does this news fit in with Selenium?


Selenium won't really be impacted, as it's higher-level than PhantomJS.

This is also a great example of why it's smart to use Selenium or something like it for scripting the tests. You can easily swap out for another backend in Selenium, but if you wrote tests in pure PhantomJS, you're now stuck with a codebase that depends on unmaintained software.
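A minimal sketch of what that swap can look like with the Python Selenium bindings, assuming a recent Selenium release plus the matching browser drivers are installed. The backend names and the helpers `make_driver` and `save_page_screenshot` are made up for this example; the point is only that the test logic never mentions a specific browser.

```python
SUPPORTED_BACKENDS = ("chrome-headless", "firefox")


def make_driver(backend):
    """Create a WebDriver for the chosen backend (names are illustrative)."""
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(f"unsupported backend: {backend!r}")
    # Imported lazily so the rest of the module works without Selenium.
    from selenium import webdriver
    if backend == "chrome-headless":
        options = webdriver.ChromeOptions()
        options.add_argument("--headless")
        return webdriver.Chrome(options=options)
    return webdriver.Firefox()


def save_page_screenshot(driver, url, path):
    """Backend-agnostic test step: load a page and save a screenshot."""
    try:
        driver.get(url)
        return driver.save_screenshot(path)
    finally:
        driver.quit()  # always release the browser process
```

Switching browsers is then a one-word change at the call site, e.g. `save_page_screenshot(make_driver("firefox"), url, "home.png")`.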


You'll use Headless Chrome instead of PhantomJS as your driver.


That sucks for any scraping use case. I have to imagine google has built in some way to detect headless browser mode serverside, even if only they can access it.


I already use PhantomJS with the Chrome webdriver, or whatever it may be called, all the time.

If they wanted to do that, they'd already be doing it. This change doesn't seem to be all that big.




