
PhantomJS: Archiving the project, suspending development - gowan
https://github.com/ariya/phantomjs/issues/15344
======
emilsedgh
Chrome and Firefox gaining headless modes is the ultimate goal Phantom
could've achieved.

So I consider it a complete success.

Kudos to all contributors.

~~~
fareesh
Does headless chrome do PDF generation? That's the only thing I'm using
phantom for at the moment.

~~~
itslennysfault
Yes.

It's really easy to do using
[puppeteer](https://github.com/GoogleChrome/puppeteer).
The 2nd or 3rd example is PDF.

~~~
jonaswi
I just recently switched from using PhantomJS to Puppeteer for PDF generation
in a production application. Works like a charm and has a very clean API.

------
wgjordan
This project has been effectively dead since April 2017, when Vitallium
stepped down as maintainer as soon as Headless Chrome was announced [1]:

> Headless Chrome is coming [...] I think people will switch to it,
> eventually. Chrome is faster and more stable than PhantomJS. And it doesn't
> eat memory like crazy. [...] I don't see any future in developing PhantomJS.
> Developing PhantomJS 2 and 2.5 as a single developer is a bloody hell.

One potential path forward could have been to have PhantomJS support Headless
Chrome as a runtime [2], which Paul Irish (of Google Chrome team) reached out
to PhantomJS about. However, it seems there hasn't been enough
interest/resources to ever make this happen.

[1]
[https://groups.google.com/d/msg/phantomjs/9aI5d-LDuNE/5Z3SMZ...](https://groups.google.com/d/msg/phantomjs/9aI5d-LDuNE/5Z3SMZrqAQAJ)

[2]
[https://github.com/ariya/phantomjs/issues/14954](https://github.com/ariya/phantomjs/issues/14954)

------
micimize
Timeline of what led to this, from what I could gather:

• phantomjs is 7 years old, @pixiuPL has been contributing for about 2 months

• @ariya didn't respond to his requests for _owner_ level permissions

• @pixiuPL published an open letter to _the main page of phantomjs.org_
[https://github.com/ariya/phantomjs/issues/15345](https://github.com/ariya/phantomjs/issues/15345)

• the stress leads @ariya to close the repo.

• @pixiuPL intends to continue development on a fork

This is a good reminder of why non-technical skills are so important in open
source and in general.

~~~
rilut
I don't know man, but look at @pixiuPL's commits
[https://github.com/ariya/phantomjs/commits?author=pixiuPL](https://github.com/ariya/phantomjs/commits?author=pixiuPL)

Especially his own commits (non-merge commits)

~~~
bcherny
What about them?

~~~
rilut
He committed:

\- Removing one whitespace and adding an unnecessary file
[https://github.com/ariya/phantomjs/commit/98272b9752b2d505f7...](https://github.com/ariya/phantomjs/commit/98272b9752b2d505f7fcb9ae0d10651a808ea79f)

\- Conflicted files
[https://github.com/ariya/phantomjs/commit/63a69d9e2e9c31baab...](https://github.com/ariya/phantomjs/commit/63a69d9e2e9c31baab90b9711d88a836dce60e85)

\- Personal build env
[https://github.com/ariya/phantomjs/commit/d57ff74f36c5b79d82...](https://github.com/ariya/phantomjs/commit/d57ff74f36c5b79d829e36ba7febaafcae3c9e3b)

\- Deleted the whole project while changing cloud provider
[https://github.com/ariya/phantomjs/commit/a242fb8d605d9aa4af...](https://github.com/ariya/phantomjs/commit/a242fb8d605d9aa4afe5bf8d1267c1ded15f157c)

\- then re-adding the whole project again
[https://github.com/ariya/phantomjs/commit/ddaaa09785d453e415...](https://github.com/ariya/phantomjs/commit/ddaaa09785d453e41554fd6133a2054c659a317f)

and other weird/careless commits

~~~
enitihas
Now, he shows up at number 4 in the GitHub contributor list
([https://github.com/ariya/phantomjs/graphs/contributors](https://github.com/ariya/phantomjs/graphs/contributors))
with 179,603 lines added, not a single one of which seems to be a meaningful
contribution by him.

Are such incidents common in other open source software too, or does this one
seem a rare case?

~~~
micimize
I doubt it, but the line count / commit frequency heuristics really shouldn't
be taken too seriously anyways

------
TheAceOfHearts
Some people are mentioning headless Chromium, so I wanna mention another tool
I've used to replace some of phantomjs' functionality: jsdom [0].

It's much more lightweight than a real browser, and it doesn't require large
extra binaries.

I don't do any complex scraping, but occasionally I want to pull down and
aggregate a site's data. For most pages, it's as simple as making a request
and passing the response into a new jsdom instance. You can then query the DOM
using the same built-in browser APIs you're already familiar with.

I've previously used jsdom to run a large web app's tests on node, which
provided a huge performance boost and drastically lowered our build times. As
long as you maintain a good architecture (i.e. isolating browser specific bits
from your business logic) you're unlikely to encounter any pitfalls. Our
testing strategy was to use node and jsdom during local testing and on each
commit. IMO, you should generally only need to run tests on an actual browser
before each release (as a safety net), and possibly on a regular schedule (if
your release cycle is long).

[0] [https://www.npmjs.com/package/jsdom](https://www.npmjs.com/package/jsdom)
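
For a rough idea of the "fetch, parse, query" workflow described above, here is a minimal sketch. It is written in Python with the standard library's `html.parser` rather than jsdom, so it only approximates the approach: there is no real DOM API here, and the sample markup and the `LinkCollector` name are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (href, text) pairs for every <a href=...> in a document."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently being read, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def collect_links(html):
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical response body; in practice this would come from an HTTP request.
SAMPLE_HTML = """
<ul>
  <li><a href="/a">First</a></li>
  <li><a href="/b">Second <b>item</b></a></li>
</ul>
"""

links = collect_links(SAMPLE_HTML)
```

With jsdom the equivalent query would be a single `querySelectorAll` call, which is much of the appeal of having real DOM APIs available.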

~~~
AlphaWeaver
Cheerio [0] is fantastic for this as well...

[0]:
[https://www.npmjs.com/package/cheerio](https://www.npmjs.com/package/cheerio)

~~~
TheAceOfHearts
I've tried Cheerio as well, but I prefer JSDOM since it exposes the DOM APIs.
What I'll normally do is interactively test things out in the browser's
console, and then transfer em over to my script. Browser dev tools are just
super amazing.

~~~
madeofpalk
Agreed - I find the Cheerio APIs to be awkward when traversing deep into the
DOM. Last time I used Beautiful Soup I found it also had this problem. The DOM
API that JSDOM provides is so much more natural to work with.

------
enitihas
For those who haven't looked at some of the commits by @pixiuPL, the list is
here:
[https://github.com/ariya/phantomjs/commits?author=pixiuPL](https://github.com/ariya/phantomjs/commits?author=pixiuPL).

To summarize: It does not look like the guy has made a single commit with any
meaning. His commits are basically the following:

1\. Adding his own name in package.json

2\. Adding and deleting whitespace.

3\. Deleting the entire project and committing.

4\. Adding the entire project back again and committing.

Just out of curiosity: How likely is it that someone could use a large
number of such non-functional commits (adding and removing whitespace) to a
popular open source repository to boost their career ambitions? (E.g.,
claiming that they made 50 commits to a popular project might sound
impressive in an interview.)

~~~
nolok
Given that he hides those behind fake commit messages (unless one counts
removing a comment or a whitespace as "code refactoring"), I would say rather
likely.

~~~
enitihas
It seems there are no limits to this madness, e.g.
[https://github.com/ariya/phantomjs/commit/970edb9b683175a6b1...](https://github.com/ariya/phantomjs/commit/970edb9b683175a6b1f22fb77bff4fbbda957a19)

In this commit the guy deletes two spaces from a file, and adds a copyright
notice with his name at the top. Going through his commits has left me
extremely shocked. I mean, how did such low-quality commits make it into the
master branch of the repo? It is like these commits were invisible to all the
visitors and users of the repo.

------
petercooper
Two alternatives:

Headless Chrome with Puppeteer:
[https://github.com/GoogleChrome/puppeteer](https://github.com/GoogleChrome/puppeteer)

Firefox-based Slimer.js:
[https://github.com/laurentj/slimerjs](https://github.com/laurentj/slimerjs)
(same API as Phantom which is useful if using a higher level library like
[http://casperjs.org/](http://casperjs.org/))

~~~
mrskitch
I maintain a puppeteer-as-a-service repo here:
[https://github.com/joelgriffith/browserless](https://github.com/joelgriffith/browserless).
It’s pretty feature rich at this point: it lets you specify concurrency and
session timeouts, and it comes with a robust IDE (which you can play with here:
[https://chrome.browserless.io](https://chrome.browserless.io)).

I’m working on building out a serverless model, which is the holy grail of
headless workflows, but it’s a bit more challenging to operationalize than one
would think.

I’m hoping that these efforts will lower the bar for folks wanting to get
started with puppeteer and headless Chrome!

~~~
skinnymuch
Browserless seems awesome. Thanks for sharing your project!

------
lukebennett
As has been said, this point was somewhat inevitable with the advent of Chrome
and Firefox's headless modes. However, as the project slips into the mists of
history, let's not forget the vital stepping stone it provided in having
access to a real headless browser environment vs a simulated one. I for one
will remain grateful to Ariya, Vitallium and all the team for their efforts.

------
tnolet
I’m super biased in this, having spent considerable time programming against
PhantomJS, Selenium and now Headless Chrome / Puppeteer for my startup
[https://checklyhq.com](https://checklyhq.com). This whole area of automating
browser interactions is an extremely hard thing to get stable. In my
experience, the recent Puppeteer library takes the cake but PhantomJS is the
spiritual father here. I will not talk about Selenium for blood pressure
reasons.

~~~
iaml
Having dabbled with both Selenium and Phantom, I can vouch for both being a
PITA to work with.

------
rumblefrog
Within the issue @pixiuPL created, I listed some of the things that he has
shown incompetence on:
[https://github.com/ariya/phantomjs/issues/15345#issuecomment...](https://github.com/ariya/phantomjs/issues/15345#issuecomment-370269460)

~~~
mkarnicki
Nicely put github comment, well done. Thank you. I feel sick in my mouth
seeing PL in his username, which clearly indicates my home country. I am
beyond baffled.

------
hrasyid
Ariya wrote a bit about his reasoning here:
[https://mobile.twitter.com/AriyaHidayat/status/9701730017013...](https://mobile.twitter.com/AriyaHidayat/status/970173001701367808)
also mentioning an old post in
[https://github.com/ariya/phantomjs/issues/14541](https://github.com/ariya/phantomjs/issues/14541)

------
hartator
I still think it's premature. There are still a couple of areas where PhantomJS
is better than Headless Chrome, notably proxy support and API availability.

~~~
transreal
That's not really true. You can use proxies with Headless Chrome using the
--proxy-server command line parameter. And the API is richer than PhantomJS's.
See the underlying API documentation here:
[https://chromedevtools.github.io/debugger-protocol-viewer/to...](https://chromedevtools.github.io/debugger-protocol-viewer/tot/).

~~~
hartator
It's only for proxies without auth, so mainly local ones. There is no way to
use a username and password for a proxy with headless Chrome right now.

------
redka
Well with Chrome going headless there isn't a whole lot of place for PhantomJS
anyway. Or is there? What is it still good for?

~~~
apocalyptic0n3
Legacy systems for one. The Cooperative Patent Classification group releases
their classifications en masse as HTML (single zip download, which is great).
I built a parser for a PHP project that could parse all several hundred
thousand records from the HTML in a few minutes. In 2017, they switched to a
system that loads in the data from JSON stored in Javascript in the HTML (it
is every bit as terrible as you imagine). Obviously loading in the HTML and
trying to use regex to match the JSON was a terrible idea (especially since it
was encoded to boot...), so I instead used Phantom to load each file, render
it, and save it to a temporary file which I then parse using the original
pre-2017 parser. Like 10 lines of code in Phantom to do it.

Obviously with my situation, this is not the end of the world. I use the
parser twice a year and Phantom will continue to handle that task just fine.
But I also know that the switch to using headless Chrome would be an expensive
one if necessary; we have to research it, we have to update local dev
environments, we have to implement it, we have to write new tests for it, we
have to test it, we have to update our deployment strategy, update our
server deployment configuration, and, worst of all, get all of these changes
and new software installations approved by the USPTO which is a nightmare. My
situation is simple, but would take several weeks to several months to
actually deploy to production. As it stands, I will likely have to explain why
we have a now-unmaintained piece of software on the server and may be forced
to switch regardless.

I can easily imagine how this project sunsetting, even though there is a clear
alternative and successor, could be a nightmare to a lot of people. It's not
the end of the world, but it's definitely unfortunate

~~~
redka
Why would you need PhantomJS for that? Can't you just parse the HTML files
with Nokogiri and be done with it? That would be orders of magnitude faster
anyway

~~~
tnolet
Big misunderstanding in browser land. The HTML delivered to you over the wire,
the stuff Nokogiri sees, is not the stuff you see on your screen or even when
doing a “view source”
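
To make the wire-versus-screen distinction concrete, here is a minimal Python sketch using only the standard library. The page markup, the `DATA` variable name and the `records` element id are all hypothetical; the point is only that a static parser sees the markup as delivered and never runs the script that would build the visible rows.

```python
import json
import re
from html.parser import HTMLParser

# Hypothetical page: the visible rows are created by JavaScript at load time.
WIRE_HTML = """
<html><body>
<ul id="records"></ul>
<script>
var DATA = [{"code": "A01"}, {"code": "B02"}];
DATA.forEach(function (r) {
  var li = document.createElement("li");
  li.textContent = r.code;
  document.getElementById("records").appendChild(li);
});
</script>
</body></html>
"""


class TagCounter(HTMLParser):
    """Counts start tags by name, exactly as they appear in the raw markup."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        self.counts[tag] = self.counts.get(tag, 0) + 1


counter = TagCounter()
counter.feed(WIRE_HTML)
counter.close()

# A static parser finds no <li> elements: nothing has executed the script.
static_li_count = counter.counts.get("li", 0)

# The data is still in the payload, but only as text inside <script>.
# Pulling it out with a regex is brittle; it works here only because the
# sample is tiny and unencoded.
match = re.search(r"var DATA = (\[.*?\]);", WIRE_HTML, re.S)
embedded_codes = [row["code"] for row in json.loads(match.group(1))]
```

A browser (headless or not) would end up with two `<li>` rows in the DOM after running the script, which is the gap a renderer such as PhantomJS or headless Chrome fills.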

~~~
nkozyra
OK, obviously the stuff you see on your screen not matching the HTML delivered
makes sense, but explain the HTML source not matching what's sent via the HTTP
response. DOM can be modified, of course, JS can introduce more dynamic HTML,
but view-source should always represent any non-redirected HTTP response. What
is Nokogiri getting that the browser isn't (or vice versa)?

~~~
joatmon-snoo
> view-source should always represent any non-redirected HTTP response

Not the grandfather, but generally in browsers you have two versions of HTML
"source" \- the canonical source, the stuff pulled down over HTTP, and the
_repaired_ source, the version that actually gets rendered.

I'm unfamiliar with Nokogiri, but I suspect that from context, it doesn't
repair HTML in the same way that browsers do.

~~~
Kiro
But it should be the same as "view source" right? The post replied to claims
otherwise.

~~~
dewey
No it's not.
[https://news.ycombinator.com/item?id=16514517](https://news.ycombinator.com/item?id=16514517)

~~~
acdha
It sounds like you are confusing View Source and the live developer tools DOM
view.

------
Analemma_
There is one thing about this that saddens me: PhantomJS still starts up much
faster than headless Firefox or Chrome, at least for me, which makes some of
our integration tests take a lot longer than they should.

Has anyone here figured out any tricks to get headless Chrome booted fast?

~~~
vaviloff
Also PhantomJS was a single statically linked binary with no dependencies that
you could literally drop into a server and run scripts at once.

~~~
oelmekki
For those who may struggle with using headless Chrome on a server, here is a
Dockerfile example to get you started:
[https://github.com/oelmekki/chromessr/blob/master/Dockerfile](https://github.com/oelmekki/chromessr/blob/master/Dockerfile)

godet is the lib I use for chrome piloting, replace with your favorite one.

------
sergiotapia
End of an era! Congratulations to the team for all their hard work and
excellent contributions to help teams build better software.

All the best to everybody!

------
pknerd
Somehow I am having issues using both headless Firefox and Chrome. Unlike
PhantomJS, where all I had to do was drop in the binary and set the path,
neither Firefox nor Chrome works that way, so I am happy to keep using
PhantomJS for a while.

------
isuckatcoding
I would think PhantomJS is still quite heavily used so having some kind of
migrator to puppeteer would be useful. I’m sure people would pay $$$ for it.

------
skrebbel
Thank you, PhantomJS contributors. You built a life saver.

------
chx
Drupal dropped PhantomJS too
[https://www.drupal.org/project/drupal/issues/2775653](https://www.drupal.org/project/drupal/issues/2775653)

------
kschiller
Does anyone here know if there's a way to set SSL client certs with Headless
Chrome? With PhantomJS I could use

    
    
      --ssl-client-certificate-file and --ssl-client-key-file

------
Changu
I do lightweight web automation via Chromium's "Snippets". It is super nice to
work that way because you see on screen what happens and can check everything
in real time in the console. Only problem is that they don't survive page loads. So
when my snippet navigates to a new url I have to trigger it again manually.
What would be a good way to progress from here so I can automate across pages?

~~~
icebraining
Greasemonkey and its descendants (e.g. Violentmonkey) can run user scripts
which work across pages.

~~~
Changu
Maybe it is even easier to write a Chrome extension?

------
moondev
I remember taking full page screenshots with phantom back in the day. Really
cool project. Nightmarejs is another alt with a friendly api.

------
rutierut
One of the guys working on P-JS just linked from a GH issue to his open
letter... He isn't very happy with the owner blah blah blah and is going to
fork the master branch to make phantom great again, I'll just put this here:

"Will do as advised, as I really think PhantomJS is good project, it just
needs good, devoted leader."

~~~
enitihas
It does not look like the guy has made a single commit with any meaning. His
commits are basically the following:

1\. Adding his own name in package.json

2\. Adding and deleting whitespace.

3\. Deleting the entire project and committing.

4\. Adding the entire project back again and committing.

------
chirag64
Shoot, I was just planning to use this for generating PDFs out of a URL on
nodejs. Does anyone know of any other library / module out there that is good
at this?

~~~
randlet
You can generate pdfs with headless Chromium/Chrome pretty easily.

    
    
        chromium-browser --headless --disable-gpu --print-to-pdf=output_file_name.pdf file:///path/to/your/html
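
If you want to drive that command from a script, a minimal Python sketch could look like the following. The flags are the ones from the command above; the `chromium-browser` binary name varies by platform, and the function names and paths are placeholders rather than any established API.

```python
import subprocess
from pathlib import Path


def build_pdf_command(source, output_pdf, binary="chromium-browser"):
    """Return the argv list for printing `source` to `output_pdf`.

    `source` may be a URL or a local file path; local paths are converted
    to file:// URLs.
    """
    if "://" in str(source):
        url = str(source)
    else:
        url = Path(source).resolve().as_uri()
    return [
        binary,
        "--headless",
        "--disable-gpu",
        f"--print-to-pdf={output_pdf}",
        url,
    ]


def print_to_pdf(source, output_pdf, binary="chromium-browser"):
    """Run the browser; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_pdf_command(source, output_pdf, binary), check=True)


# Building the command does not need a browser installed; running it does.
command = build_pdf_command("https://example.com/", "out.pdf")
```

Passing the arguments as a list avoids shell quoting problems when the output path or URL contains spaces.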

~~~
bluehatbrit
Sadly you get 0 control over headers and footers of the output PDF, meaning
you get lovely crappy page numbers around the place with no way to turn them
off. This is why, sadly, I have to keep my command line markdown -> pdf
converter
([https://www.npmjs.com/package/mdpdf](https://www.npmjs.com/package/mdpdf))
using Phantomjs.

So this does work for very basic pdf printouts, but so far phantom is the only
tool that offers full control over the PDF output. Even down to things like
margins, paper size, etc.

------
wnevets
Is headless Chrome's API just as easy to work with? Taking a screenshot or
saving a page as PDF is stupid simple with PhantomJS.

~~~
andrewguenther
yep, just as easy

------
wxyyxc1992
Thanks & Goodbye

