Why curl defaults to stdout (haxx.se)
166 points by akerl_ on Nov 17, 2014 | hide | past | favorite | 96 comments



I actually want printing to stdout more often than I want printing to a file; it's simply what I need more often. I guess different people have different use cases.

I will admit that rather than learn the right command to have curl print to file -- when I _do_ want to write to file, I do use wget (and appreciate its default progress bar; there's probably some way to make curl do that too, but I've never learned it either).

When I want to write to stdout, I reach for curl, which is most of the time. (Also for pretty much any bash script use, I use curl; even if I want to write to a file in a bash script, I just use `>` or look up the curl arg).

It does seem odd that I use two different tools, with largely different and incompatible option flags -- rather than just learning the flags to make curl write to a file and/or to make wget write to stdout. I can't entirely explain it, but I know I'm not alone in using both, and choosing from the toolbox based on some of their default behaviors even though with the right args they can probably both do all the same things. Heck, in the OP the curl author says they use wget too -- now I'm curious if it's for something that the author knows curl doesn't do, or just something the author knows wget will do more easily!

To me, they're like different tools focused on different use cases, and I usually have a feel for which is the right one for the job. Although it's kind of subtle, and some of my 'feel' may be just habit or superstition! But as an example, recently I needed to download a page and all its referenced assets (kind of like browsers will do with a GUI; something I only very rarely have needed to do), and I thought "I bet wget has a way to do this easily", and looked at the man page and it did, and I have no idea if curl can do that too but I reached for wget and was not disappointed.


I think the biggest nuisance with this strategy is that neither tool is included by default on the machines I'm usually working with— wget is missing from my Mac, and curl is missing from my Ubuntu servers.

Both can be quickly rectified, but it's still a pretty big pain.


The obvious solution is to submit a patch to curl, such that when it's called as "wget", it emulates wget's command-line options, and vice versa.

Written only partially in jest. I've submitted patches to both projects and both are relatively straightforward code bases to dive into.


Yes, the stdout default is great for working with/testing REST APIs on the command line, for example.


Indeed, I have never considered using wget at all with my REST APIs. In fact, the only time I do use wget is when I want to directly download a url to a file on the command line, but if I'm writing a script, I use curl.


I myself have an alias to get wget-like behavior (still not as verbose, but great for downloading binaries): alias w='curl -#O'
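
For example, with that alias in place (the URL here is just a placeholder):

    w https://example.com/big.tar.gz

which gives curl's progress bar (-#) and saves the file under its remote name (-O).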

I'm afraid I side with the left-handedness argument. Years of muscle memory make me want to type "w".


To save to a file it's just

    curl -O https://example.com/file.txt

It'll save to file.txt.


"to have curl print to file -- when I _do_ want to write to a file"

You have it reversed. Wget will output to the screen/pipe if you output to the "file" --. Curl does not. "curl http://www.google.com/index.html -o --" does not output to the screen. It creates a file named "--".


To clarify, I think the parent was using -- as a poor-man's emdash in this case.


yes, that was just an emdash not meant to be command line syntax. Sorry for the confusion.


And don't forget all the options for TLS certificates that are available with cURL!


I think his argument is valid, and thinking about curl as an analog to cat makes a lot of sense. Pipes are a powerful feature and it's good to support them so nicely.

However, just as curl (in standard usage) is an analog to cat, I feel that wget (in standard usage) is an analog to cp, and whilst I certainly can copy files by doing 'cat a > b', semantically cp makes more sense.

Most of the time if I'm using curl or wget, I want to cp, not cat. I always get confused by curl, never being able to remember the command to just cp the file locally, so I tend to default to wget because it's easier to remember.


Ah, I was trying to figure out how to express my view of wget and curl as different tools, and you've done exactly that, thanks.

Yes, I think of wget more like cp, and I think of curl more like cat, and there are times when I want exactly curl as cat, as opposed to wget as cp.

Different tools, I like them both, and I use them differently. And, I only scratch the surface of capability for both tools.


Pipes (like much of UNIX) are stuck in ASCII. The response encoding has no relationship to your UNIX locale. The response would know the encoding but can't pass it on to the pipe because it's a concept from the '70s. In the end all of this only works as long as everybody keeps to ASCII.


I find that weird. Why are you wanting to copy stuff from http all of the time? I only ever rarely do that because I want to examine something like an API response further for an extended period of time. Usually I just want to see the response once, or view the headers.


>I find that weird. Why are you wanting to copy stuff from http all of the time?

This kind of question (and the other question you asked above) reminds me of the people who answer Stack Overflow questions with their opinions on "best practices" and "what you should be doing instead" rather than answering what the poster asked. E.g:

Q. "How do I store JSON in mysql?"

A. "Why do you ask? What do you want to achieve with this? You'll be better served with a NoSQL database".

etc, etc.


That is called the XY problem[0]. It is a fairly common practice, especially in technical IRC channels. Honestly, it can be very frustrating, but a lot of the time it is useful and actually helps both the person asking the (wrong) question and the group of people trying to provide a proper answer. Some (maybe most?) people do take it too far and even refuse to give you a straightforward answer just to be pedantic and annoying. Those types of people have made this a much bigger problem than it actually is; still, thinking about the XY problem before you ask a question is always a good idea.

[0] - http://mywiki.wooledge.org/XyProblem


I actually like this, because posters often have XY problems. Especially when the solution the poster wants seems strange, it often makes sense to question it and propose a saner solution to the whole problem they're actually having.


That's not what I was doing at all. My use of curl/wget is much different, so I was curious what their use cases are.


A common use for wget for me is downloading a tarball or something to a remote server. Browse for it on local browser, then type wget and paste the download URL into an ssh connection. It is a bit biased toward wget just because I happened to learn it first, but I could use curl -O the same way.


Could also do `curl ${url}.tgz | tar xz`.


Which fails badly if the download gets interrupted.


Yeah, and you can't check file integrity (hash or crypto signature) first, or list the contents of the file before extracting.
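
As a rough sketch of the more careful route being described (the file names and the checksum file are invented for illustration):

    curl -LO https://example.com/release-1.0.tgz
    curl -LO https://example.com/release-1.0.tgz.sha256
    sha256sum -c release-1.0.tgz.sha256   # verify integrity before trusting it
    tar tzf release-1.0.tgz               # optionally list the contents first
    tar xzf release-1.0.tgz               # then extract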


Not the parent, but as another data point for you: my history | grep wget contains: a movie, certificates bundle, zipball from github, DTD document from ncbi, GeoIP country list, some script from github, composer.phar, qemu source, some gist, steam deb, ... So basically "stuff". I download "stuff" when needed ;)


I don't use wget for testing APIs.

I use wget for long-running downloads when I need to be able to continue later, for simple mirroring, and for bootstrapping web scraping scripts. In fact, the combination of wget -i (url-file) -k and egrep can be enough for many simple web scraping needs. The -k is really handy - converting relative links to absolute links.
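
A minimal sketch of that kind of bootstrap (urls.txt and the grep pattern are made up for illustration):

    wget -i urls.txt -k                          # fetch every URL listed in urls.txt, converting links
    egrep -o 'href="[^"]*"' *.html | sort -u     # crude extraction pass over the results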

If I must use curl, I generally bootstrap it with "Copy as cURL" in the network tab of a browser's developer tools. But it's just not needed very often.


People aren't lying to you when they say that they spend a lot of time on the command line:) Also - how else (other than wget) do you download something to some remote server somewhere?


You could always use netcat:

    echo "GET / HTTP/1.1
    Host: news.ycombinator.com
    Connection: close
    
    "|nc news.ycombinator.com 80 >yc


> (other than wget)

Curl? :p


Aren't we talking about curl?


For people who have difficulty with reading: If you aren't downloading things with curl (which is the entire subject of the thread), how else (other than wget) would you download things to a remote server?

No need to answer, just helping out the non-native speakers in the thread.


I think he may be missing what people mean by "it's easier without an argument". It's not just "only one option" - what I see in reality quite often is: "curl http://...", screen is filled with garbage, ctrl-c, ctrl-c, ctrl-c, damn I'm on a remote host and ssh needs to catch up, ctrl-c, "cur...", actually terminal is broken and I'm writing garbage now, "reset", "wget http://...".

I'm not saying he should change it. But if he thinks it's about typing less... he doesn't seem to realise how his users behave.


That's the reason I use wget, and only when necessary do I switch to curl. It's not just that I forget about that nasty behaviour (even though I sometimes do); it usually goes like this:

    $ curl -o news.ycombinator.com
    curl: no URL specified!
    curl: try 'curl --help' or 'curl --manual' for more information
    $ curl -O news.ycombinator.com
    curl: Remote file name has no length!
    curl: try 'curl --help' or 'curl --manual' for more information
    $ curl -O foo news.ycombinator.com
    curl: Remote file name has no length!
    curl: try 'curl --help' or 'curl --manual' for more information
    <html>
    <head><title>301 Moved Permanently</title></head>
    <body bgcolor="white">
    <center><h1>301 Moved Permanently</h1></center>
    <hr><center>nginx</center>
    </body>
    </html>
    $ wget news.ycombinator.com
    --2014-11-17 14:27:18--  http://news.ycombinator.com/
    Resolving news.ycombinator.com (news.ycombinator.com)... 198.41.191.47, 198.41.190.47
    Connecting to news.ycombinator.com (news.ycombinator.com)|198.41.191.47|:80... connected.
    HTTP request sent, awaiting response... 301 Moved Permanently
    Location: https://news.ycombinator.com/ [following]
    --2014-11-17 14:27:19--  https://news.ycombinator.com/
    Connecting to news.ycombinator.com (news.ycombinator.com)|198.41.191.47|:443... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: unspecified [text/html]
    Saving to: ‘index.html’

    [ <=>                                                                                                                                  ] 22,353      --.-K/s   in 0.07s

    2014-11-17 14:27:19 (331 KB/s) - ‘index.html’ saved [22353]

With wget, I can just throw any URL at it and it'll probably do the right thing with the least amount of surprises. "Grab a file" is my use case 99.99% of the time; "Print a file" is the remaining 0.01%.


Well, I find curl easier to type:

  curl www.example.com | ...
  wget -O - www.example.com | ...
I guess it depends on what you're trying to achieve.


I asked this of someone else, just out of curiosity, why do you find yourself downloading content from http that often? What are you doing with these files?


I use wget a lot. I don't use a desktop manager and I generally don't trust my web browser to do The Right Thing when I download various non-html files. I prefer using the command line so I always "Copy Link Address" and then do whatever I want with it.

For instance, when I download some archive I don't have to bother selecting where to download the file from the GUI (I hate navigating filesystems from GUIs), or waiting for it to finish before switching to the command line to extract it. I just get the address, "curl $link | tar xzv", and I'm done.


Current versions of the Chrome debugger (and maybe others!) will let you copy the curl command to retrieve a resource that is completely inaccessible in the web interface. If you keep a little media server like Plex et al., it's a really convenient tool for time-shifting/stealing/whatever the word is some video on a website that won't let you at the video source. You just open the debugger, view the network resources, click play, sort by size, right-click the one that is growing (the video), get the curl command, paste it into a terminal with a -o flag, and now you have the video. Whether you should have it is another thread, not this one.


It sounds like maybe you are most familiar with GUIs or the web. Curl is a UNIX tool and doesn't work like those things. If it's the only unix tool you use, that might seem jarring.

Of course it would be a smoother experience for you if it worked more like the rest of the tools you are familiar with, but changing it would be a mistake. If curl were changed, then it would be one of the few unix tools that differs from the rest. That would make it jarring to people who use unix. Nobody wants to erode the good parts of unix and destroy the unix way just to make some commands slightly less jarring to people who don't regularly use unix.

Especially when you don't even need curl. That's the wrong option for you. Right next to the option in Chrome to copy a curl command is the option to copy just the url. Just use that and paste the url into your browser, which will work more like you'd expect.

If you must use curl, you almost certainly don't want to use "-o". That's an advanced option for when you are downloading multiple resources with a single command. You use it to provide a template for the multiple output filenames necessary in that particular situation.

If you only want to specify a single filename, then do like you would with any other typical unix command and redirect the output to a file. For example, append ">filename.html" to the end of the command.

Unix tools rarely have options to specify output filenames. They are as unnecessary as a button in a web browser that will let you "Display this webpage." Displaying webpages is what a web browser does, and writing output to a file is what a unix command does. It's just that the default file for a unix command is your terminal. Your shell makes it dead simple to pick a different one if that's what you want.
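
To make the two forms concrete (example.com stands in for whatever URL the browser gave you):

    curl https://example.com/video.mp4 > video.mp4    # plain shell redirection
    curl -o video.mp4 https://example.com/video.mp4   # curl's own output option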


On my laptop, to view something while offline. On a server, to install something that is not in a package manager.


On that note, sites that don't easily offer copyable links for download annoy me to no end.

Most of the time when I download anything, I need it on another machine, and normally the easiest way is copy/pasting a URL from my local web browser to a remote terminal and fetching it with wget. In which case an auto-redirect to a mirror selector that then auto-downloads the item, all handled by JavaScript, is pretty frustrating.



I noticed this with mega.co.nz and I hate it: they do all the downloading with JavaScript, then start an actual browser download (which is just copying from local storage to your download directory) that finishes in an instant, but I have to remain on the page the whole time.


I don't think it would be possible to use curl for mega because it has to locally decrypt the stream first. I do remember using a (now deprecated) python library that wrapped the mega api [0] a while ago, so that might interest you if you want to do everything from the command line.

[0] https://github.com/richardasaurus/mega.py


With Mega it's rather unavoidable as the javascript is running the decryption.


You must understand that I live and breathe in the terminal. If I'm working, my computer has at least one terminal window open.

When I'm in the middle of projects, there are several terminals, all on different 'machines' (ssh/chroot/and now docker) in different directories, and I need a file, be it source code or a binary or an rpm/deb/tgz/zip or just plain data, to be where I am right now. I might use scp for this purpose, I might use rsync if the connection is flaky. Sometimes there's a shared file server and I can just use 'cp' instead.

HTTP is just yet another protocol where files come from, and when I use curl/wget I don't want the file in ~/Downloads along with all the other random crap; I want it downloaded to wherever my terminal window is pointed, which may be a machine half-way around the world from me, or at the very least, in the right directory.

It doesn't make sense to download it to my machine, then upload it to where it needs to go either - just wget it directly, on the machine it needs to be on, in the directory where it needs to go.


I used to deal with shitty, unreliable internet all of the time (oh wait, I still do, I have TWC...), so wget was really useful for ensuring that I could keep documentation and other reading material handy when I inevitably lost my connection. Point it at the base URL, tell it to recursively download the whole thing, and within minutes you've got a complete mirror of the static site of your choosing, with the URLs patched up to point to local content and everything.
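
The invocation being described is roughly this (one common combination of wget's mirroring flags; the URL is a placeholder):

    wget --recursive --page-requisites --convert-links --no-parent https://docs.example.com/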


I often play around with new languages and other software that isn't in my distro's repo. For that, when they have a .deb available, it's often downloadable, and if I'm just playing around, I don't want to mess with my sources. So I copy the URL in my browser on my local machine, and paste it after "wget " in my terminal on the server, and go from there.

I could obviously just use curl for this, but wget requires no additional options or thought.


All the browsers I've tried have a shitty UI. It's faster for me to copy-pasta an URL into the terminal than it is for me to wait for the stupid download dialog to pop up, click save as, then use that silly widget to navigate to the path I likely already have as the cwd in my shell.


I have a bad connection, so it's better for me to use a download manager that can resume failed downloads properly. Browsers usually fail at that.


I never know what wget will use as the file name, so I only use it if the resource has a real name; if it's a bare URL like in your example I would always prefer:

    curl https://news.ycombinator.com > hn.html

That way I am sure I don't overwrite some other file I already have in this folder with the same name that wget would give the download.


wget won't overwrite files by default. Instead it will append .1 to the filename and so forth. You'd have to use the -c/--continue option.


What always bugged me is that curl doesn't follow redirects by default. Instead you have to pass it the -L option. curl always made more sense to me as a programmer's tool. It seems to be made for those who want a tool that you have to predictably guide step by step rather than something that tries to be smart and uses sane defaults. That's not to say it doesn't have its uses though.


You appear to be complaining that curl shows you exactly what's at a URL ('cat url'), and that wget is better for fetching an object served by that location. To me, those are the use cases. They're both fire-and-forget cli tools; it's not like you have to mentally invest in one over the other like vim and emacs. Just use both as required.


Which users? I only ever use curl to print stuff to stdout, but I use it for that a lot. When I want to download a file as lazily as possible, I use wget.

If you don't use the tools often enough to remember how they work, I don't think your needs are going to come up high on the developer's priority list.


Turns out it depends on which tool you learned first (my case, and many others mentioned here). I'm not ashamed to say I've always used cURL for everything and never installed wget. wget is awesome but cURL serves me well. I do a lot of HTTP and API stuff, so I only need the headers most of the time. If I want to examine the (usually) JSON output of a call, I pipe to a custom alias that combines pygments (http://pygments.org) and `python -m json.tool` to beautifully format it. (alias pretty='/usr/bin/python -m json.tool | /usr/local/bin/pygmentize -O style=monokai -f console256 -g')
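
With an alias like that, usage looks something like (the API URL is made up):

    curl -s https://api.example.com/v1/items | pretty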


That's my approach too. But he wrote himself "a very common comment to me about curl" - if someone actually finds the time / motivation to write to him about this difference, I'd count them as a user.


Maybe they count as a user, but compare them to all the people who don't write to him. I'd hardly expect him to get tweets and emails saying "It was so great how your tool printed what I wanted to stdout", even in cases where people do prefer that.

It's tough to get a representative sample from something like voluntary correspondence, because it doesn't tell you the majority opinion, just the majority opinion from people who decided to write in.

EDIT: To note, it's entirely possible that the silent majority of curl users really do hate the default-to-stdout choice. But absent a real way of surveying, it's difficult to make a call either way.


Most of the time I'm using curl, I want to see the output. Otherwise I start it with >output.file.


>I'm not saying he should change it.

I'd say that it's too late for a change. Changing the default behaviour would break way too many existing scripts and cronjobs.


Do one thing and do it well.

IMHO cURL is the best tool for interacting with HTTP and wget is the best tool for downloading files.


Pretty much. I keep seeing curl being used as the "back end" of web browsers, fueling the likes of WebKit.

Wget, on the other hand, ends up within shell scripts and similar (I have before me a distro where the package manager is made of shell scripts, core utils and wget).


This is a good way to put it, especially since people tend to use them analogously.


+1,

curl is like a Swiss Army knife and wget is a fixed-blade knife ;-)


This "-O" seemed dubious to me so I took a look. Turns out... yep, it's not as simple as that.

"curl -O foo" is not the same as "wget foo". wget will rename the incoming file so as not to overwrite something. curl will trash whatever might be there, and it's going to use the name supplied by the server. It might overwrite anything in your current working directory.

Try it and see.
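
A quick way to see the difference (any URL whose path ends in a real file name will do; example.com is a placeholder):

    echo "something I care about" > file.txt
    curl -O https://example.com/file.txt    # silently overwrites file.txt
    wget https://example.com/file.txt       # saves as file.txt.1 instead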


According to the manpage, the filename depends only on the supplied URL:

  Write output to a local file named like the remote file we get. (Only the file part of the remote file is used, the path is cut off.)
  The remote file name to use for saving is extracted from the given URL, nothing else.
wget is a hugely useful tool for making local copies of websites and similar things -- the no-clobber rule is useful there, and the built-in crawling and resource fetching is fantastic. OTOH, for most things, I actually like curl's 'dumb' behaviour; it seems to match up better with the rest of the UNIX ecosystem.


I think of curl as a somewhat more intelligent version of netcat that doesn't require me to do the protocol communication manually, so outputting to stdout makes great sense.


It would be really nice if curl took the content-type and results from isatty(STDOUT_FILENO) into consideration when deciding whether to spew to stdout.
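
As a rough sketch of what that could look like as a shell wrapper (this is not anything curl actually does; the heuristics are made up):

    #!/bin/sh
    # Refuse to dump to the terminal only when stdout is a tty AND the body
    # doesn't look like text; when piped, behave exactly like plain curl.
    url=$1
    if [ -t 1 ] && ! curl -sLI "$url" | grep -qi '^content-type:.*text'; then
        curl -sLO "$url"    # binary headed for a terminal: save it instead
    else
        curl -sL "$url"
    fi

It costs an extra HEAD round trip, much like the content-type POC posted further down the thread.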


Yes, I can't imagine there are actual scripts dumping binary data to the terminal, and it would help everybody whose terminal would otherwise experience what viraptor nicely describes:

""curl http://...", screen is filled with garbage, ctrl-c, ctrl-c, ctrl-c, damn I'm on a remote host and ssh needs to catch up, ctrl-c, "cur...", actually terminal is broken and I'm writing garbage now, "reset", "wget http://..."."

I admit it happened to me more than once.


For debugging home-made servers it's quite nice (if you don't want to fire up Wireshark):

    curl --silent http://i.imgur.com/0nCbgbi.jpg | hexdump -C | less
Now you could automate this for testing, etc.

There are use cases for terminal output and changing this behavior now would probably wreck many 3rd party scripts. And if you really just want a simple download manager you probably should use wget anyways.


The thing is, you might want to pipe the output to something else instead of saving it to a file. If you check that you're outputting to a terminal, then the terminal behavior is different than the pipe behavior, which might be confusing.

Maybe if it's a terminal, print a warning with a simple explanation, or a Y/N prompt?


HTTPie is a command line HTTP client, a user-friendly cURL replacement. http://httpie.org


I find it very useful for debugging (and it's in the Ubuntu repos, and can be installed via homebrew on OSX).


Chrome dev tools have a super useful "Copy as cURL" right-click menu option in the network panel. Makes it very easy to debug HTTP!


Same with Firefox dev tools. I use it all the time.


There's also an awesome Firefox extension called "cliget" that will give you curl and wget (and some Windows-only thing I've never heard of) command lines. It adds a "Copy curl for link" context menu item for every link.

It's quite nice because it will put all your cookies on the command line so you can trivially download files protected by a login page directly to remote servers.


We all have some user-bias and in this case it is geared towards seeing Curl as some shell command to download files through HTTP/S.

Luckily, Curl is much more than that and it is a great and powerful tool for people that work with HTTP. The fact that it writes to stdout makes things easier for people like me that are no gurus :) as it just works as I would expect.

When working with customers with dozens of different sites I like to be able to run a tiny script that leverages Curl to get me the HTTP status code from all the sites quickly. If you're migrating some networking bits this is really useful for a first quick check that everything is in place after the migration.

Also, working with HEAD instead of GET (-I) makes everything cleaner for troubleshooting purposes :)

My default set of flags is -LIkv (follow redirects, only headers, accept invalid cert, verbose output). I also use -H a lot to inject headers.
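
A sketch of that kind of check (sites.txt and its contents are invented for the example):

    while read -r url; do
        code=$(curl -o /dev/null -s -w '%{http_code}' -L -k "$url")
        printf '%s  %s\n' "$code" "$url"
    done < sites.txt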


Having known both tools for a long time now, I never realized there was a rivalry between them - I just figured they're each used differently. cURL is everywhere, so it's a good default. I use it when I want to see all of the output of a request - headers, raw response, etc. It's my de facto API testing tool. And before I even read the article, I assumed the answer was "Everything is a pipe". It sucks to have to memorize the flags, but it's worthwhile when you're actually debugging the web.


> people who argue that wget is easier to use because you can type it with your left hand only on a qwerty keyboard

Haha I would never realize that


I've worked with multiple people who chose passwords based on whether they could be typed with only one hand. I guess there's a perverse sort of sense in it, if you're really that lazy.


Makes sense but... I actually think the exact opposite.

For me the perfect password is one that you type the consecutive characters alternating between left and right hand.

Words that you type with just one hand require a little 'twist' of the hand that, IMO, is a little slower and less comfortable. When alternating, as soon as your finger reaches a key the other hand is already moving to the next, and this goes back and forth.

But I guess I'm making a point more about comfort than laziness.


I see what you're saying. I think the one-handed-password people mainly did it so that they could do something else with the other hand, like hold a cup of coffee, flick through the newspaper, that sort of thing. Seems like madness to me but there you go.


The "c" in "curl" stands for "cat". Any unix user knows what cat(1) does. Why the confusion?


I think the confusion is probably that people didn't realize that "c" stood for "cat" in "cURL".


I am surprised there is no mention of the BSD fetch(1) http://www.freebsd.org/cgi/man.cgi?query=fetch%281%29 , which probably pre-dates both curl and wget.


I was recently playing with libcurl (the easiest way I know to interact with a REST API in C), and libcurl's default callback for writing data does this too. It takes a file handle, and if no handle is supplied, it defaults to stdout. It's actually really nice as a default... you can use different handles for the headers vs the data, or use a different callback altogether.

I really, really like libcurl's api (or at least the easy api, I didn't play around with the heavy duty multi api for simultaneous stuff). It's very clean and simple.


I use curl over wget in most cases, just because I learned it first I guess. I use it enough that I rarely make the mistake of not redirecting when I want the output in a file.

The one case where I will reach for wget first is making a static copy of a website. I need to do this sometimes for archival purposes, and though I always need to look up the specific wget options to do this properly, this use case seems to be one where wget is stronger than curl (especially converting links so they work properly in the downloaded copy).


"cat url", huh, that makes sense.

Why not just alias it ("make a File from URL" -> furl?) if people want to use it with the -O flag set by default?
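
That would just be (keeping the name proposed above):

    alias furl='curl -O'
    # furl https://example.com/file.txt   -> writes ./file.txt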


I find it pretty cool how authors of text-mode UNIX programs are still around. In fact the GNU culture has kind of grown up around that. And yet, to me text-mode stuff is just a part of a much larger distribution, not something to be distributed to so many systems. Oh, how times have changed.


I am in the opposite camp, where I always try to pipe wget to file. Then I end up with two files. Argh.


> if you type the full commands by hand you’ll use about three keys less to write “wget” instead of “curl -O”

Unless you forgot what the option was since you don't use it multiple times a day.


OK, the screen filled with garbage happens the first time you use curl; then you read the README or --help (which you should have done before), you learn -o, and… it never happens again.

No big deal.


curl could parse the MIME type and decide where to push the stream. POC:

    #!/usr/bin/env sh
    
    case $(curl -sLI "$1" | grep -i content-type) in
        *text*) echo "curl $1"
                ;;
        *) echo "curl $1 > $(basename "$1")"
           ;;
    esac
https://gist.github.com/agumonkey/b85cef0874822c470cc6

Costs one extra round trip, though.


tl;dr Because the author says so.


99% of the time I'm using curl/wget, it's to download a compressed file. So, for me, `curl | tar` is shorter than `wget -O - | tar`, and much better than `wget` -> download -> decompress -> delete the file.


I don't trust tarballs to decompress cleanly and not explode into umpteen smaller files that I then have to hunt down in my download folder.


So don't run it from your downloads folder, run it from the place you actually want it extracted to. You're only locked into a downloads folder if you're using a web browser, which in this case you aren't.


sudo time-travel -10s



