For a while I had expensive internet and low bandwidth, but I loved listening to music and lectures on YouTube. At some point I realized that getting only the audio stream would save me 90% in bandwidth costs. [0]
youtube-dl (and yt-dlp) has a flag, -g (--get-url), which gives you the direct URL(s) for the requested format/quality. I used the command line on my computer and put the link in VLC. On my phone I had an elaborate workaround involving downloading the file to my VPS first over SSH, then downloading it to my phone, until I realized my phone browser could consume the URL directly, so I set up a PHP frontend for `youtube-dl -g -f bestaudio {url}`.
It's no longer online and I lost the code, but it was like one line of code.
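The original PHP is gone, but a minimal sketch of the same idea (here in Python rather than PHP; the port and query parameter name are my own assumptions) would look something like:

```
# Tiny HTTP frontend: shell out to `yt-dlp -g -f bestaudio` and redirect
# the browser straight to the raw audio stream URL.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs
import subprocess

class AudioRedirect(BaseHTTPRequestHandler):
    def do_GET(self):
        # expects requests like /?url=https://www.youtube.com/watch?v=...
        video = parse_qs(urlparse(self.path).query).get('url', [''])[0]
        # -g prints the direct media URL; -f bestaudio picks the audio-only stream
        stream = subprocess.check_output(
            ['yt-dlp', '-g', '-f', 'bestaudio', video], text=True).strip()
        self.send_response(302)
        self.send_header('Location', stream)
        self.end_headers()

HTTPServer(('', 8000), AudioRedirect).serve_forever()
```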
I mention this because you-get seems to support the same use case (via --url / -u), so I wanted to let people know how useful this is!
(While it was online I shared it on some forums and got very positive feedback, people used it for audiobooks etc.)
[0] Also, playing with the screen off saves 80% battery life! YouTube knows these facts, and that's why they made background playback (which fetches only the audio stream) a paid feature...
There are a lot of people who don't like Brave's business model. But I've never given Brave a dime, I turn off their ad network stuff, and they've saved me hundreds of dollars on YouTube Premium over the years.
For me, it was as easy as adding a shortcut to the YouTube homepage in Brave; it basically acts like the YouTube app, but with ad blocking built in. It's the only way I watch YT videos on mobile.
Seems like youtube-dl doesn't work anymore; I had to use a fork, yt-dlp. Hopping from one project to another like that always makes me nervous, but it seems to be commonly used at least.
youtube-dl has been dead for many years now; it's a lot like XFree86, where the original project is a zombie with very few real users, but the original maintainers refuse to acknowledge reality and just shutter the thing, so newcomers get confused. Everyone's moved to yt-dlp now.
For providing a tool that lets people download YT videos? The same should apply to yt-dlp, but that project is hosted on github.com with no apparent problems. What's stopping the youtube-dl people from doing the same and just ignoring this silly court ruling?
> It's no longer online and I lost the code, but it was like one line of code.
My mom is really into audiobooks, so I made a Telegram bot for her via Node-RED where she can share YouTube links. The links are sent to a FIFO, where a little daemon (1) downloads them sequentially and saves them into an incoming folder I set up for this special purpose. The bot can download either only the audio (for her Telegram ID it's audio only) or complete videos.
This folder is watched by Jellyfin, which she can access to download or listen. She loves it, and in two years she has downloaded well over a TB of audiobooks.
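A minimal sketch of what that daemon does, assuming yt-dlp under the hood (the paths and flags here are made up, not the actual code):

```
# Block on a FIFO, read one URL per line, and download sequentially
# into the incoming folder that Jellyfin watches.
import subprocess

FIFO = '/var/run/ytqueue.fifo'       # the Telegram bot writes URLs here
INCOMING = '/srv/media/incoming'     # folder watched by Jellyfin

while True:
    with open(FIFO) as fifo:         # blocks until a writer opens the FIFO
        for line in fifo:
            url = line.strip()
            if not url:
                continue
            # audio-only; drop -x/--audio-format to fetch complete videos
            subprocess.run(['yt-dlp', '-x', '--audio-format', 'mp3',
                            '-P', INCOMING, url])
```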
I often choose 144p for low-bandwidth scenarios. It is very similar to a good-quality audio-only stream in terms of size, but you get the added benefit of seeing the speaker, even if that's just a glance every 45 minutes. You save battery life too.
Also, 144p is somehow peaceful and relaxing to watch: you don't get distracted by all the shiny colors and intricate details in the video, and you can just listen without your mind focusing on some random stuff.
You can glitch out Safari on iOS (by quickly pressing the play button for the video in the Now Playing interface after switching away from the Safari app) to play the audio of YouTube videos in the background.
I mainly use this to play membership-only videos that don't play in the background even if you have YouTube Premium.
Every iOS version I pray that they don't patch this "glitch".
BTW, if you browse YouTube with Firefox on Android, you can play videos with the screen locked using a background play fix extension.
YouTube deliberately tries to prevent background playback by using web APIs to detect when the page is no longer visible and pausing the video. The extension prevents YouTube from doing that.
That’s the -F option to list all the formats, including the audio streams. Pick the audio format with -f to download the audio. I usually pick the .m4a format and then run it through ffmpeg to convert to mp3.
If I were making assumptions, I would post another 30 options from my config that are nice to have when downloading audio from YouTube. These 3 are exactly equivalent to what GP does.
What’s the point of converting it to mp3? AAC inside an m4a container usually has better sound quality than similarly compressed mp3, and definitely better than reencoding.
Many of the little generic MP3 player modules that cost next to nothing will play MP3 (obviously) and WAV, and sometimes OGG and WMA, but AAC support is relatively rare.
A good use case is when you own a car built somewhere between 2005 and 2015 that accepts CDs and USB drives but only mp3 files. Some supported AAC and Ogg files without being advertised as compatible, but some might not.
Or when you keep using an old mp3 player from the early 2000s.
I think the point here is that you can run `yt-dlp --extract-audio --audio-format mp3` instead of saving as .m4a (one lossy compression) and then converting that to .mp3 (another, very different lossy compression).
Under the hood, there's probably an additional lossy conversion. I'm not sure if YouTube converts uploaded videos to specific formats, but if they do, then the worst-case scenario is:
- original uploaded video uses .ogg audio
- YouTube converts that to Opus and puts it into a container format (WebM?)
- You download the video and extract the audio to .m4a using yt-dlp
- and then you convert that to .mp3 using ffmpeg
That's 4 consecutive lossy formats, each one throwing away different data.
Honestly, the best thing to do here is to use yt-dlp to download whatever format YouTube provides and use ffprobe to find out what audio format is already there. Then do one conversion if required.
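A sketch of that probe-first approach, assuming yt-dlp and ffprobe/ffmpeg are on PATH (the filenames and the MP3 target are just for illustration):

```
# Download the native audio stream, check its codec with ffprobe, and
# re-encode at most once, only if the target device can't play it as-is.
import json
import subprocess

def audio_codec(path):
    """Return the codec name (e.g. 'opus', 'aac') of the first audio stream."""
    out = subprocess.check_output(
        ['ffprobe', '-v', 'quiet', '-print_format', 'json',
         '-show_streams', '-select_streams', 'a:0', path])
    return json.loads(out)['streams'][0]['codec_name']

url = 'https://www.youtube.com/watch?v=...'  # placeholder
subprocess.run(['yt-dlp', '-f', 'bestaudio', '-o', 'track.%(ext)s', url],
               check=True)
# Suppose the stream landed as track.webm; convert only when necessary,
# so there is at most one extra lossy step.
if audio_codec('track.webm') != 'mp3':
    subprocess.run(['ffmpeg', '-i', 'track.webm', '-codec:a', 'libmp3lame',
                    '-q:a', '2', 'track.mp3'], check=True)
```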
I usually just extract the raw Opus audio, then run it through Picard to tag and save it in my music directory. I don't see any point in converting to MP3 these days -- Opus provides better audio quality at the same bitrate (or, equivalently, lower file sizes for the same audio quality), and pretty much all player software supports it now. I've actually been going the other way and converting most of my music library to Opus and getting rid of MP3s.
Same, but I converted to Opus because I was trying to squeeze it into as little bandwidth as possible. It was mostly speech content, and Opus auto-detects and optimizes for speech at low bitrates.
You can download the Opus directly with -f 249 / 250 / 251 (~48 kbps / ~80 kbps / ~128 kbps respectively), though YouTube doesn't always make them all available, whereas -f 140, the ~128 kbps AAC (.m4a), is always available, and often format code 139 (~48 kbps) too. The lower bitrates are adequate for most speech-based content.
When I use NNCP [1] on flights to download Youtube videos, the script I wrote to nncp-exec the download accepts a quality friendly name. I have a specific friendly name which gets the lowest quality video and mid-tier audio. These settings usually result in a pretty enjoyable experience and offer plenty of entertainment even on a longer flight.
Just switching tabs will dramatically lower energy usage without using any scripts. I assume the YouTube player still downloads the video data but simply drops it when the tab isn't visible, so there's no need to render or decode the video stream.
On Android, YTDLnis solves this very nicely. Simply share the video URL to the app and it can download whichever format you like: https://github.com/deniscerri/ytdlnis
The advantage is that you're setting this for a utility which can play media directly (e.g., you don't have to separately download and then play content), and that you can set preferences independently for mpv vs. other tools.
You can also of course configure your own aliases, shell functions, shell scripts, or config files for various preferred configurations, whether using mpv, ytdl, or other tools.
It's worth noting, though, that ytdl can output video content to stdout, so it is possible to stream video by piping to any player, although mpv's method is much more convenient.
Sometimes it is necessary to download files regardless, though, because DASH separates audio and video into distinct files. You often need to remux them into a container like MP4, AVI, or Matroska in order to use them locally.
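A remux is just a stream copy into a new container, with no re-encoding; a minimal sketch using ffmpeg (filenames assumed):

```
# Mux separately-downloaded DASH video and audio into one Matroska file.
# '-c copy' copies both streams bit-for-bit instead of re-encoding them.
import subprocess

subprocess.run(['ffmpeg', '-i', 'video.mp4', '-i', 'audio.m4a',
                '-c', 'copy', 'merged.mkv'], check=True)
```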
mpv's bonus is convenience, and the fact that it will seamlessly play much of what you throw at it, whether that's local files, network resources, web sites, etc. Much of that magic is in fact ytdl/yt-dlp, but having a standard interface is quite handy.
And yes, I've run into streams that need muxing, which ytdl/yt-dlp can handle; you point it at both the video and audio sources, typically. (I do this infrequently, thankfully.)
As the other commenter said, they want a failing test, not a fix.
A detailed description of the encountered problem;
At least one commit, addressing the problem through some unit test(s).
Examples of good commits: #2675, #2680, #2685
"Addressing" is probably a bad word to use here. "Demonstrating" would have been better, IMO.
The most expensive part of writing software is scoping the work.
I'm almost tempted to add a test suite just to give people more agency over my output, because right now I'm only soliciting feedback in person to cut down on internet bullshit, like what happened to xz-utils.
The Chinese version of the text has an extra header line that translates to "to prevent abuse via GitHub Issues, we are not accepting general issues". An earlier commit has this for the English text:
`you-get` is currently experimenting with an aggressive approach to handling issues. Namely, a bug report must be addressed with some code via a pull request.
By the way, `tests/test.py` seems to just run the extractors against various websites directly. I can't find where it's mocking out network requests and replies. Maybe this is to simplify the process for people creating pull requests?
I can get this, but I aggressively report accounts and issues. I'm not sure how GitHub handles them, but they don't seem to come back.
What I'm unsure how to deal with, though, is legitimate users being idiotic. For example, recently an issue was opened asking where the source code was. Not only was there a directory named "src", but there were links in the readme to specific parts. While I do appreciate GitHub and places like Hugging Face [0], there are a lot of very aggressive and demanding noobs.
I'd like ways to handle them better... I'm tired of people yelling at me because 5-year-old research code no longer works out of the box, or because they've never touched code before.
[0] Check any Hugging Face issue tracker and you'll see far more spam. The same accounts will open multiple issues that just berate owners, and Hugging Face makes it difficult to report these accounts.
The solution is to ignore them and close their issue. Open source maintainers have enough to worry about and are unpaid, it's okay to be a little dictatorial when it comes to "bad questions".
It addresses the specific issue but does nothing to prevent similar issues in the future. A solution to a cold is not handing someone a tissue.
I like that these platforms are open to everyone but at the same time there are a lot of people who have no business participating. Being able to filter those people out is unfortunately a necessary tool to not get overloaded.
Worse, I find that because of this many open source maintainers end up being quick to close issues and say RTFM. I can't tell you how many times this has happened to me even when my opening issue quotes the FM and includes a reproducible test. It's also common to just close with "not our problem".
I kind of like this. It's a more formal proof of concept. You prove the bug exists by writing a failing test. If they cannot construct a failing test then it's either too hard to mock or reproduce (and therefore maybe not even worth fixing, for a free tool), or it's impossible because it's not a bug. Frees up maintainer time from dealing with reports that aren't bugs.
The problem with popular tools is that they have more bugs than can be fixed. So bug reports are pretty much worthless: you know that there are 1000 bugs out there, but you only have the resources to fix 10 of them.
By asking users to provide reproducible test cases, you can massively reduce the amount of work you have to do. Of course, that means 90% of bugs will never be reported. But since you don't have the resources to fix them anyway, why not just focus on the bugs that can be reproduced and come with a test case?
I don't think it's necessarily about fixing those bugs; a lot of the time it's more about at least having those bugs documented, to raise awareness of probable issues down the line for whoever wants to use the project in the future.
It's your prerogative if and how you want to limit the number of people who can contribute, but I was explicitly replying to someone claiming that a person's inability to code is in any way related to the validity or importance of the bug.
If the bug is egregious enough, somebody else will find it. If the bug is important enough to you but esoteric, then ask on a forum or enlist the help of someone you know who does know Python.
How do you currently submit bug reports for, e.g., MS Word or Adobe Photoshop? This way is certainly more open than for that commonly deployed software.
Good chance you wouldn't be writing good bug reports either, then. GitHub issues have enough noise that a first-pass filter like this feels like a good idea, even if it has some false positives.
This in no way aligns with reality. I frequently interact with users who can't code at all but make good bug reports. One of the best ways to ensure success is to have a form (GitHub allows creating those) which describes exactly what is necessary and guides people in the right direction.
What you're saying is even worse, since you’re implying someone could be an expert computer programmer or power user, but because they’re unfamiliar with the specific language this project chose, they are incapable of making good bug reports. That makes no sense.
This isn't really a metric, though. It's a formal existence proof that the bug exists. The key difference, IMO, is that you have to create a test which (A) looks, to the maintainer, like it should pass, while (B) simultaneously failing. It's much harder to game.
There are other cases where Goodhart's Law fails as well: consider quant firms, where the "metric" used to judge a trader is basically how much money they pull in. It seems to be working fine for them.
Interesting. I like the idea of encouraging people to try creating a test or even a whole fix, but saying that’s all you will accept is a bit much. On the other hand, I’m not doing the work to maintain you-get. I don’t know what they deal with. This may be an effective way to filter a flood of repetitive issues from people who don’t know how to run a command line program.
I believe there are two extremes. On one end you get a bunch of repetitive non-issues, while on the other end you only get issues about (say) bugs in FreeBSD 13.3 because only hard-core users have the skills and patience to follow THE PROCESS.
I know how to make an isolated virtual environment, install the package, make a fork, create a test and make a PR. But I don't know whether I care enough about a random project to actually do it.
It's relatively easy to write a failing test, and it massively cuts down the work of moderating issues. It also reduces the danger of GitHub issues turning into a support forum.
If this results in the project being easier to maintain and being maintained longer, then I’m fine with this.
In the case of this tool, adding a failing test case looks trivial if you've got the URL of a page it fails on.
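Judging from tests/test.py, which calls the extractors directly, such a test could be as small as the following sketch (the module and function names are my assumptions about you-get's internals, and the video ID is a placeholder):

```
# A failing test as a bug report: it should pass once the extractor is fixed.
import unittest
from you_get.extractors import youtube

class TestBrokenExtraction(unittest.TestCase):
    def test_failing_page(self):
        # URL of the page that you-get chokes on; info_only=True just
        # extracts metadata instead of downloading the whole file.
        youtube.download('https://www.youtube.com/watch?v=XXXXXXXXXXX',
                         info_only=True)

if __name__ == '__main__':
    unittest.main()
```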
Provided the maintainer is willing to provide some minimal guidance to issue reporters who lack the necessary know-how, it even seems like a clever back door way of helping people learn to contribute to open source.
Given the title and the first few sentences of the description, I assumed it was some heuristic-based tool that tries to grab whatever there is on the page, which would be useful when no tool implements support for a given site (which in most cases just means "yt-dlp doesn't support it"). But apparently it's also extractor-based, with a separate extractor for each somewhat-popular source. So basically it's just a less sophisticated clone of yt-dlp?
    you-get: [error] oops, something went wrong.
    you-get: don't panic, c'est la vie. please try the following steps:
    you-get: (1) Rule out any network problem.
    you-get: (2) Make sure you-get is up-to-date.
    you-get: (3) Check if the issue is already known, on
    you-get:     https://github.com/soimort/you-get/wiki/Known-Bugs
    you-get:     https://github.com/soimort/you-get/issues
    you-get: (4) Run the command with '--debug' option,
    you-get:     and report this issue with the full output.
Tried with the debug flag, but it didn't really help:
    pattern = str(pattern, 'latin1')
              ^^^^^^^^^^^^^^^^^^^^^^
    TypeError: decoding to str: need a bytes-like object, NoneType found
I was curious to see if it could bypass age restriction (though I tried on a non-age-restricted video too, with the same error).
That's an interesting question. They only depend on a single library, but I wonder how much code is really their own. I found it curious, for example, that there is a dedicated mp4 joiner (I mean, if you already have ffmpeg, there is probably no way you can do it better yourself).
That is interesting, huh. Yeah, they list ffmpeg as a dependency, so I wonder what it didn't cover for them.
Though there are some cases where using pure ffmpeg is just too difficult or impossible. Recently I had such a case where I wanted to merge multiple video files from my GoPro (they come out in 4GB parts if the video is longer than that) while keeping / correctly merging all metadata such as date / accelerometer / GPS / any custom binary streams. I ended up using this, and it worked great: https://github.com/gyroflow/mp4-merge
Please don't litter HN with LLM generated slop, this is actively reducing the quality of discussion. No one wants a future HN where people just spam LLM responses at one another.
I wouldn't exactly call a ytdl-style media downloader with a whole library of site-specific extractors and converters "dumb" but still cool that more projects like ytdl exist.
I’m not sure I understand why Bandcamp is on the list of supported sites: they allow you to just download the files on the condition you first pay the artist for them.
The fact you can download it with this tool is because the artist is letting you listen to it for free before buying it. Downloading it with this tool seems totally unnecessary and a bit of a jerk move. Bandcamp hosts mostly small and independent artists and labels.
Their list of supported sites isn't a declaration of where you should use this tool for moralistic reasons. It's just a list of popular sites it works on.
I presume you could subscribe and still use this tool? People use automation tools like this to download things that they already pay for because it saves them the effort of logging into 5 different apps depending on which walled garden it's in.
It’s Bandcamp: you buy the album and they give you a link to download it in whatever format you want so you can listen to it using whatever app or device you wish. You can come back later and download it as many times as you like.
What if I don't want to manually follow links or "come back later", but just add the album I purchased to the config for the same script I use for the rest of my media library?
Personal assumption, but probably not everyone lives in a country where they can pay Bandcamp.
And from personal experience I can say that sometimes people want or have to consume media offline because of unreliable or slow networks or expensive providers. But this doesn't mean they won't pay later if they like it.
Different pirates have different moral codes regarding piracy; there is no standard.
Personally, I consider it immoral to pirate Windows without a very good reason (which doesn't include "I like using Windows", or "I want to play game X" or "I want to use the enterprise version which doesn't have baked-in ads").
It's probably an inverse of non-pirates, who consider it a moral obligation to pay for a digital copy of something even though copying it does not diminish or remove the original.
There are reasonable moral arguments against copyright infringement, but the false equivocation with stealing is not one of them.
The same moral principles that argue against depriving people of natural property they already own do not imply further arguments against reducing sales opportunities for non-rival goods whose scarcity has been artificially established via positive law. Other, unrelated principles can make for compelling arguments, but not the same ones.
Most people do, since groceries are physical, rival goods to which natural property rights apply. But if it were possible to make copies of groceries while leaving the originals intact, far fewer people would argue against that on the same grounds. They might make other, unrelated arguments against copying, but moral principles applicable to physical property wouldn't be relevant.
Nice work. But as a consumer, why should I use you-get over yt-dlp? What are its strengths over yt-dlp, which works quite well on a huge range of websites? [1]
I like this. I'm imagining a companion extension for Chrome/Firefox that uses you-get as a backend to implement this in a seamless way. Forward-thinking idea: imagine going on YouTube and having the you-get extension bypass the YouTube player and play the content directly, without ads. And when I say YouTube, I might as well say any other platform.
Can it back up a text webpage? Can it remove popups for newsletters, subscriptions, logins, or cookie notifications? Can it read pages that require signing in?
The project appears to have an inaccurate description. Perhaps that is the long-term goal, but today it behaves more or less like a subset of what yt-dlp does.
There is some polish to the interface, though, so it might still be worth trying out if it does what you need.
A future in which YouTube will refuse to stream you data because you didn't pass client attestation is definitely coming and I wish we could stop it.
It is a dark future where some of us will accept it, and the rest of us will be constantly engaged in a cat-and-mouse chase in which we glitch attestation tokens out of vulnerable devices to get by.
Client attestation is a mechanism for servers to get cryptographic proof from a client about what software the client is running. Modified browsers, or software like yt-dlp, would have a harder time providing such proof. How hard a time would depend on the security hardening of the attestation mechanism. It'd almost certainly be broken, just as most attempts at DRM get broken, but it would be one more speedbump.
There are legitimate purposes for attestation; for instance, server attestation can allow a user to run software on a server and know that software matches a specific build and hasn't been subverted, so that the client can trust the server to do computation on the client's behalf.
But one of the leading uses of client attestation is DRM.
This library/program solves problems that people have with pages like YouTube: too many ads, no way to download videos for offline use (or to archive them for when they get removed), and better performance with a native player.
If I were forced to watch all the ads on YouTube, I wouldn't watch videos there at all.
Consuming not only their resources but also their incredible streaming algorithm for free is just a dirty move.
Doing this just puts you into the statistic of bad users that incentivizes companies like Google to push more intrusive DRM, which at the end of the day makes us all suffer.
And I consider them bad advertisers and just don't care.
We had the internet without ads. Then we had the internet with small banner ads that didn't track users but were there because of the page content. Then those ads moved from text/images to animated GIFs, and they became annoying. Then Flash was used, and it became a security concern. Then the number of ads went up and up, video was introduced (autoplaying, with audio), and at some point most of the internet became unusable without an adblocker. It's not something we, the users, wanted; the advertisers (Google included) made the internet unusable without an adblocker, and that includes three 25-second ads on a 1m20s video on YouTube.
If advertisers returned to non-animated banners above/below the video, we wouldn't have to install adblockers everywhere anymore, and people would see the ads without wanting to kill the advertiser for bad, intrusive advertising practices.
They started it, they went too far, and we're just reacting to what they're doing. And as we've seen with other platforms, even paying for premium doesn't mean no ads. An adblocker is still needed to remove the in-video ads ("this video is sponsored by shadow vpn audible").