YouTube’s Piracy Filter Blocks MIT Courses, Blender Videos, and More (torrentfreak.com)
1129 points by gooseus on June 18, 2018 | 250 comments

As you might know, this is related to the disastrous EU copyright reform directive proposal that would mandate such filters put in place everywhere. Please don't be complacent, it's very important to do something about this.

At risk of repeating my comment[0] from 3 days ago, I'd like to appeal to you to contact the MEPs that will vote on Wednesday in the EU's JURI committee. This was the first time I called politicians' offices to voice my opinion. The assistants were shocked that a private person actually bothered to do so and seemed quite humbled - I could hear the change in the tone of their voice when they realised an ordinary citizen is calling. I think it's a lot more effective than the 1000s of e-mails piling up in their spam folders.

Please help out to make sure this directive proposal is voted against on Wednesday by the majority of the JURI committee. List of Twitter handles of reportedly yet undecided MEPs:

@jbergeronmep @1PavelSvoboda @FrancisZD @KaufmannSylvia @mcboutonnetfn @rohde_jens @marinhopintoeu @mady_delvaux @emil_radev

[0] https://news.ycombinator.com/item?id=17320122

According to some, YouTube's copyright filters / markers aren't good enough to satisfy the EU directive. It's insane. The DMCA is bad enough already. Copyright should be loosened, not tightened. Then maybe we'd see some competition in video hosting.

I've always felt like false assertion of copyright should cause the loss of copyright. Seems like a fair balance for me.

And before someone gets on me...I realize that these are automated filters that are rarely asserted by the actual copyright holder.

I agree with the sentiment, but I think in practice that would lead to something like patent trolls: organizations designed to face no threat from the normal consequences.

Plain old money based fines should work. Even a tiny fixed fee of $50 per error might be enough, given the scale of the internet.

I really like the per error idea - allows for large cases to have no barrier to contest but for the whole thing to not become like patent trolling. Maybe also some sort of threshold for mass cases, like a progressive taxation:

If you're right 95% of the time, you get a discount. If you're wrong 50% of the time, you pay the full fee.
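
To make the sliding scale concrete, here is a minimal sketch. The $50 base fee is the figure floated upthread and the 5% / 50% breakpoints are read off this comment; the linear ramp between them, the 20% discount floor, and the function names are assumptions made up purely for illustration.

```python
BASE_FEE = 50.0          # dollars per wrongful claim, the figure floated upthread
LOW_ERROR_RATE = 0.05    # "right 95% of the time"
HIGH_ERROR_RATE = 0.50   # "wrong 50% of the time"
MIN_MULTIPLIER = 0.2     # assumed size of the discount (hypothetical)


def fee_per_error(error_rate: float) -> float:
    """Fee charged per wrongful claim for a claimant with this error rate."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    if error_rate <= LOW_ERROR_RATE:
        multiplier = MIN_MULTIPLIER
    elif error_rate >= HIGH_ERROR_RATE:
        multiplier = 1.0
    else:
        # Assumed linear ramp from the discounted fee up to the full fee.
        span = HIGH_ERROR_RATE - LOW_ERROR_RATE
        progress = (error_rate - LOW_ERROR_RATE) / span
        multiplier = MIN_MULTIPLIER + (1.0 - MIN_MULTIPLIER) * progress
    return BASE_FEE * multiplier


def total_fine(claims: int, wrongful: int) -> float:
    """Total owed by a claimant who filed `claims` notices, `wrongful` of them wrong."""
    if claims <= 0 or not 0 <= wrongful <= claims:
        raise ValueError("need claims > 0 and 0 <= wrongful <= claims")
    return wrongful * fee_per_error(wrongful / claims)
```

Under these assumed numbers, a claimant with 50 bad claims out of 1000 pays the discounted fee on each, while one with 500 bad claims out of 1000 pays the full $50 on each.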

Who would be in charge of setting these?

Or what about a system like they use in Finland to deal with speeding tickets - the fine is based on your income. The local paper has a weekly list of top offenders, which is amusing to say the least.

Sorry to hijack the thread slightly, but it is only to introduce some visibility to the fact that this particular issue has nothing to do with the EU, or Content ID, or anything like that.

Google is blackmailing the Blender foundation to try and get them to enable adverts.


edit - and Blender have now got https://video.blender.org/ in testing already. I didn't think they'd be the ideal people to issue ultimatums to on video hosting, given their technical specialities.

A situation like this triggered the development of git (after the licence to use BitKeeper was withdrawn from the Linux kernel team), so I will be watching the fallout with interest.

Apparently Blender had more than 2 years to sign an updated contract - a contract which every partner had to sign to comply with updated legal regulations. Their existing contract would not work anymore (legal reasons?).

The Blender team never did that, and also never reacted to any YouTube support emails.

That doesn't map onto the Blender account of what happened.

They said YouTube was requiring them to enable ads, but the ads checkbox was grayed out. Among other issues. Not that they wanted to enable ads on a not for profit site. But just taking this one issue, it's not clear how they could have possibly complied with Google's requirement, when Google keeps the checkbox UI element grayed out and unchangeable.

> Even a tiny fixed fee of $50 per error might be enough

If the fine went to the person who uploaded the work, this would be good because it would force the filter to be robust in the face of adversarial attacks.

That would create a cottage industry of borderline-infringing works designed to trigger the filter but win in court.

I.e. they would trigger bad copyright takedowns that weren't legally justified, forcing companies to stop doing those.

If it's possible to trigger those intentionally, then it's likely that it's also happening accidentally, and this policy would make that stop.

Possibly, but machine generating edge cases for the purpose of collecting fines feels like it should also be wrong.

True, but those edge cases have to convince a court made up of humans, who will probably have similar feelings. That might be a decent check on abuses.

I’m no lawyer, but deliberately performing adversarial attacks to get rich from being automatically and incorrectly fined feels like it ought to be somewhere between fraud and hacking (in the Computer Misuse Act sense rather than the hackernews sense).

Or for a minimum starting point, the accusers should be obligated to compensate the victims for lost profits, as well as time cost. The current system of penalty-free reckless false assertions is insane.

The complexity here is that there are multiple parties involved: the copyright holder, who suspects there may have been an infringement; the uploader of the potentially infringing content; and the host, who wants as little to do with copyright claims as they can.

The DMCA strikes a balance for the copyright holder and the host, by letting the host avoid any liability for the copyright infringement, as long as they enforce valid requests by no longer hosting the content.

The new EU law builds on this principle, giving hosts more responsibility to check whether copyright might be being infringed. I'm strongly against the idea, but I can see the logic that the host is best-placed to actively monitor all uploaded content.

A big issue I have, both with the DMCA and this new legislation, is that there's no balancing in the direction of the uploader. Aggressive solutions like your proposal do address this imbalance, but they don't do anything to address the actual concerns each side has.

If we want more reasonable copyright laws in this arena going forward, we should also be pushing for an uploader's right to appeal, as a balance against the rights of and affordances for the other two parties:

* There should be an 'appeals'/'counterclaim' process for content uploaders to take responsibility for the content. Presumably, to balance against the copyright holder's needs, they should submit their name and address when they claim that their content is non-infringing.

* From the host's side, the uploader following this process should remove all responsibility for that content from them, so they have no concern about liability and can re-enable access to the content.

* From the copyright holder's side, they need to be able to seek recourse if they still think the content is infringing. This can be done by taking the uploader to court, using the details they have provided.

I desperately hope this new legislation fails, but I think you were spot-on about aiming for a 'fair balance', and even achieving that would be a great improvement.

I don't think your proposal would do anything to reduce the rates of false negatives, because there's no incentive to avoid them. Many false negatives will never get identified in the first place, because if a side-project pedestrian creator wasn't going to make any money off a creative work, why would they expose themselves to a possible court summons?

This is all true, but it does do something to start redressing the balance. Once we're at the stage where content creators have some of the control back, then we can start looking at reducing the stakes for non-profit content, limiting the claims rights holders can make if they abuse the system, etc.

Expecting a solution that covers everything first time -- and trying to persuade people that the special cases are necessary without seeing the system in action -- is a great way to achieve slow/no progress.

> I've always felt like false assertion of copyright should cause the loss of copyright.

Which copyright should be lost, if the assertion was false?

I'm guessing here, but if you claim that work Y infringes your copyright of X. If it turns out to be false you lose copyright of X.

The one that is falsely being infringed upon.

> I've always felt like false assertion of copyright should cause the loss of copyright. Seems like a fair balance for me.

I'd say that a pattern of such behavior should trigger a penalty. It's not a single claim that's the problem. It's patterns of such claims en masse.

> I've always felt like false assertion of copyright should cause the loss of copyright. Seems like a fair balance for me.

I don’t understand. I put out a takedown request for something I don’t have the copyrights to and as a result the real copyright holder loses their copyright? How is that fair?

I hold copyright on the song "I OWN THE COPYRIGHT ON THIS SONG"

You use my song in a fair use way

I claim you are infringing my copyright

you challenge, I fail to prove my case

I lose copyright on the song and it enters the public domain.

Merely attempting to enforce a copyright that you legitimately hold against one instance that might be fair use, and then having a court find in a real case that it was fair use, would be enough to lose copyright of the work entirely in your view?

That's so disproportionate that you might as well just abolish copyright.

Also, most of the world doesn't have fair use like the US does, though many places have similar but weaker provisions in some respects. US fair use law is somewhat controversial, because it arguably doesn't comply with international obligations.

> That's so disproportionate that you might as well just abolish copyright.

Ideally that's what we would do.

Maybe one day we will, but first someone had better come up with another economic model that demonstrably supports creation and distribution of new works anything like as effectively as copyright has. There are a few ideas with potential, mostly in specific niches, but no-one has come anywhere close to reaching that goal in the general case yet.

No, that's not what's being argued will happen.

I own the copyright on the song X.

Person A uses the song X in a fair use way.

Person B claims A is infringing.

Person A challenges and wins.

Then I lose my copyright, without any involvement?

Why would you lose your copyright in that case? Person B loses their (non-existent) copyright to the song. This would only fail if it'd be literally "all copyright on song X gets wiped, without looking at who owns it", which as you pointed out is obviously a non-workable interpretation of the proposal.

You are replying to the person who proposed the “lose copyright” idea.

That's exactly right. The person repeated the concept, in a comment dated later than the other replies to the original; I tried to clarify why it wouldn't work.

Video competition certainly exists already [0]. But I agree that copyright terms are trending in the wrong direction.

[0] https://en.wikipedia.org/wiki/List_of_video_hosting_services

Yeah, things like this will have similar effects as certain highly regulated sectors in the US, where software is 20 years behind the time and built for billions of dollars by huge contracting firms because it's so difficult/expensive to make anything compliant with the rules.

No one can build filters that are good enough beyond the big tech cos. They can build these libraries and hopefully open source them for everyone to use...but if they don't...

> No one can build filters that are good enough beyond the big tech cos. They can build these libraries and hopefully open source them for everyone to use...but if they don't...

It is backwards: nothing but whatever can be built by big tech companies is good enough, because their lobbying and interests define what's good enough. It's also obvious why it wouldn't get open-sourced then. Kicking the ladder and so on.

What are you talking about? Copyright needs to be abolished.

That might help, however, for some reason I don't understand, this proposal hasn't created an outrage similar to the one we had seen before ACTA.

EU "wisely" has chosen voting time to be in the middle of FIFA Soccer World Cup, so many people are just sitting watching matches. This is also almost the beginning of the holiday season and many people are just relaxing, travelling, etc.

Apparently someone really needs that regulation to go through (hopefully it will not).

SOPA / ACTA happened in parallel with FBI shutting down Megavideo - a place where a lot of "illicit" streaming services offloaded the content to. Basically, a lot of people lost the free TV they were used to. The events were unrelated, but I remember them being widely seen by regular people as the shape of things to come.

Take away people's TV for some bullshit reason again, and they'll go to the streets.

Thanks for this - I wouldn't have gotten involved, but I believe I have a much higher chance to influence policy in this case than I've ever had after seeing Emil Radev's name, and I've now done everything in my power to sway him.

I hope Google & Youtube will troll the EU and block all official state content for a while, until they see that this law is not working.

Google and YouTube have a more than $60 million solution in place that everyone hates. But that is more than potential competitors have. If it passes, they will surely benefit.

There's a parallel in the tobacco industry of how seemingly harmful policies are actually beneficial for certain parties.

Philip Morris, the manufacturers of Marlboro cigarettes, lobbied in favour of laws restricting tobacco advertising. The logic behind this is that they have such strong brand name recognition, if tobacco advertising was banned they'd cement their position as the market leader, as there would be no way for other brands to gain market awareness.

I had never thought of that. That's cynical genius of the highest order. AND they get to play the moral high-ground card of "seriously, we're sorry! We're on your side-- look at what we're doing to help!"

When tobacco advertising was banned from TV, literally all tobacco companies profited. It was a classic example of game theory - If one of them advertised, they all needed to advertise to keep up, but if none of them do (or were unable to because of regulations), they all profit because people were still buying tobacco.

All, except a hypothetical new tobacco startup company. Or not so hypothetical new wave of vaping devices (I don't know much about this industry, just an armchair analyst here).

Thanks for your post and report; it made me think that my voice could actually make a difference and I just called, using the Mozilla tool! I got to speak with an actual person and in my native language. The guy seemed to know what their stance was and was not willing to discuss it, but he did mention taking note that they had received some 6000 emails and also many phone calls. Anyway, I'm super happy I phoned, I talked with a real person, and the Mozilla website was super easy to use :)

Would you mind linking to some information about what the proposal is, why it's bad, what the alternatives are, etc?

> The Article stipulates that platforms should “prevent the availability” of protected works, suggesting these ISSPs will need to adopt technology that can recognise and filter work created by someone other than the person uploading it.

But there is no requirement that at the same time protects fair use. To remove content just in case has no negative consequences.

> Article 13 is “incompatible with the guarantee of fundamental rights and freedoms and the obligation to strike a fair balance between all rights and freedoms involved”

This looks like a one-sided law. It puts "authors' rights", which you can read as big media conglomerates' financial interests, above any other consideration.

This is an example of a law that puts some companies' profit over any social consideration and will impact the EU socially and economically in a negative way.

> This looks like a one-sided law.

Yet the law says:

> Those measures, such as the use of effective content recognition technologies, shall be appropriate and proportionate.

Ignoring fair use is neither appropriate nor proportionate. Also, taking someone's works down falsely would be a violation of this same article, since the content provider is obligated to

> take measures to ensure the functioning of agreements concluded with rightholders for the use of their works

which obviously precludes YouTube from just deleting your content on a false positive and not providing you with any recourse. Furthermore, article 13 only applies to

> providers that store and provide to the public access to large amounts of works

and not smaller companies. The whole article basically sums up the idea: work together.

It's never going to be enforced in that way, though. It would just take too many lawyers to cover your ass if you tried that as a host.

Obviously you cannot just allow copyright infringement because you are a small content provider. However, this particular article would not apply to you. You would have the ordinary duties already in law. Once you start to be large, then this article places an obligation on you to work with copyright holders directly in an appropriate and proportionate amount.

Other than misreading or ignoring parts of the law, I don't really see how it is unreasonable.

What do most EU citizens have as far as upload bandwidth and the cost for that?

100/10 Mbps is a very common baseline in large parts of Europe. Some only have 10/1 (often old people on old contracts), some have gigabit speeds. Prices vary between countries a lot, often $10-20 for the cheaper options, rarely above $100-150 for gigabit (when available). Usually no data caps on home internet.

I saw a great "joke" the other day. It said, "If Shazam or Siri can't identify a song for you, make a quick video and upload it to YouTube. Within a few seconds you'll get a copyright takedown telling you the name of the song".

I've seen nature videos where there's no sound but the birds chirping, and YouTube is still able to hear someone playing All-Star on Mars and flag the video. It's really incredible tech.

EDIT: The article actually mentions some similar examples: http://c4sif.org/2012/02/youtube-identifies-birdsong-as-copy...

I know the guy who created ContentID. He's a good guy and only had good intentions, namely to keep YouTube in business by identifying legit copyright issues. At the time, a human had to review all the claims.

This is a good example of how AI can go bad.

Looking at the DMCA requirements, it doesn't appear like it would require a lot of human intervention.

1. Copyright owner provides written notification of the alleged infringement.

The notification has to provide reasonably sufficient information to allow Google to locate the alleged infringing material; provide contact information for the claimant; include a statement that the claimant has a good faith belief that the use is not authorized by the copyright owner, its agent, or the law; and a statement that the information in the claim is accurate, and that the claimant is authorized to act for the owner.

2. Google then has to remove or disable the material, and take reasonable steps to notify the uploader of the issue.

If the uploader does not make a counter notification, it looks like that's the end of Google's obligations.

If the uploader files a counter notification, Google has to notify the claimant. If the claimant does not file a lawsuit against the uploader within 14 days, Google has to restore the material.

Note that nothing in here seems to require Google to actually look at the material and decide if they think it is infringing. Nor does it require them to proactively seek to find infringing material and notify content providers or block such material. They just have to check that the claimant (and the uploader if they get a counter notice) have included the things in their notice that the law requires. They should be able to automate that for most claims.
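
A rough sketch of what that automation could look like, assuming the four notice elements summarised above are all that gets checked. The `Notice` class, its field names, and the action strings are invented for illustration and do not describe any real Google system or the statutory wording; only the 14-day window comes from the summary above.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

RESTORE_AFTER_DAYS = 14  # window for the claimant to sue, per the summary above


@dataclass
class Notice:
    """Hypothetical takedown notice; field names are illustrative only."""
    material_location: str = ""
    claimant_contact: str = ""
    good_faith_statement: bool = False
    accuracy_and_authority_statement: bool = False


def missing_elements(notice: Notice) -> List[str]:
    """Return the names of the required elements the notice leaves out."""
    missing = []
    if not notice.material_location.strip():
        missing.append("material_location")
    if not notice.claimant_contact.strip():
        missing.append("claimant_contact")
    if not notice.good_faith_statement:
        missing.append("good_faith_statement")
    if not notice.accuracy_and_authority_statement:
        missing.append("accuracy_and_authority_statement")
    return missing


def next_action(notice: Notice,
                counter_notice_date: Optional[date],
                lawsuit_filed: bool,
                today: date) -> str:
    """Pick the host's next step without ever inspecting the material itself."""
    if missing_elements(notice):
        return "reject_notice"
    if counter_notice_date is None:
        return "keep_disabled"          # no counter notification: nothing more to do
    if lawsuit_filed:
        return "keep_disabled"          # the dispute has moved to court
    if today >= counter_notice_date + timedelta(days=RESTORE_AFTER_DAYS):
        return "restore"                # claimant did not sue in time
    return "await_lawsuit_window"
```

Nothing in this flow looks at the video; it only checks paperwork and dates, which is the point being made above.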

The ContentID system seems to be going far beyond what DMCA requires them to do to keep their safe harbor. I wonder why they are doing that? Maybe because they want to also be content sellers, and so want to keep the content providers happy with them?

The problem is a lack of liability for the content creators. Confusion can happen (such as particular renditions of a work whose basis is public domain information, i.e. a classical music score), but willful lies should be punished with actual consequences.

Right now things are stacked entirely against one class of content creators (video creators). Additionally if I'm a public institution or someone else that's against having ads I can't control what happens during the contest period* (or so I've heard, it's never happened to me).

As someone that has posted a few 'how to do X' in software videos before, who believes that ads should be far more limited (like the old soap operas / PBS sponsors at the end of programs), I'd rather my videos were in 'hold on publishing limbo' while not under my control.

In the case of classical music, there are two things that need to be clarified:

1) Classical music is still being composed today, and there’s a huge volume of post-1927 work that is very much under copyright in the US (excessive though that may be IMO).

2) Even if a composition is PD, the recording is absolutely copyrightable if it’s post-1927, and the arrangement may be as well. This isn’t a matter of confusion; it’s an actual piece of intellectual property.

Not to say that any of this is right; I believe fair use protections need to go much, much farther than they do.

Willful abuse of enforcement should invalidate the copyright entirely...it becomes public domain. Repeated violations should revoke all copyrights under your control.

> The ContentID system seems to be going far beyond what DMCA requires

AFAIK the content id system was built to monetise works on YouTube. The system must know where a song or a video is being included to direct advertising money to the original author if s/he wishes to participate in that scheme.

ContentID does go far beyond DMCA requirements. I think for a while, some big companies had carte blanche to delete any video. I guess narrowly winning a billion-dollar lawsuit against them will scare a company that way.

I'm not demonizing the people making it, or even necessarily the people demanding it be made.

The DMCA was the beginning of the end of a free and open internet, and now it's accelerating. I just hope the next generation understands what they're missing under the spectre of copyright law.

Prior to the DMCA, ISPs had full liability for copyright infringement committed by their customers online. Small ISPs were being sued out of existence, and it was only a matter of time before nobody would be willing to run an ISP at all. The DMCA (specifically Section 512) was the thing that saved the free and open internet by introducing limited liability for ISPs.

That's fair and I don't want to downplay that aspect. But that's not the only thing the DMCA does.

While the DMCA has been abused by large corporations, it has been invaluable in helping protect the little guy.

It's been a huge boon to independent content creators that you can just send somebody a DMCA takedown notice and they'll comply. Before the DMCA, you had to pay out the nose for a lawyer, and you'd be in for a long and very, very expensive legal process. Your average photographer can't afford that, so copyright enforcement was solely a privilege of the rich prior to the DMCA.

Here's an example. I know someone who shared pictures of herself in a private women-only group. Someone who ran a really disgusting cyberbullying page on Facebook had a mole in the group and posted the pictures publicly while making fun of her appearance and calling her slurs. Since the photographer was a friend of hers, the photographer simply sent Facebook a bunch of DMCA notices telling the cyberbullying page to take down the pictures. Facebook complied in under an hour. The cyberbullying page tried to post the pictures again; more DMCA notices were sent, and the whole page was unpublished.

Humans are to blame for using horrible AI/software. At least if you know your AI is that broken then have some human supervision.

The real problem is when the task is no longer doable by Humans. Then the AIs have taken over and we can't turn them off. Imagine what happens when the AIs run our cars and food production, security, etc. We will be unable to turn them off and they will be unaccountable except to the people who set their policy which will be mired no doubt in layers of deep obscurity.

Or the problem where the developers will not know why the AI did something. Say an AI decides to do some bad thing: the judge will ask the developer why, and the dev will say: "I don't know, we fed it a ton of input and it worked fine on my machine and with our test data, so we shipped it."

Google prefers to automate everything. Having human supervision would affect the bottom line, so it is out of the question.

Well, let's say it takes 30 minutes to fully check for copyright compliance 10 minutes of video. This means that the speed of one person checking videos is 1/3 unchecked-minutes per minute, or 1/3upm. 400 hours of content are uploaded every minute to youtube, which makes for 24000upm. You will need to employ 96000 people 24/7 to check all videos. Of course, they can't work 24/7, so a more realistic figure is 150000 people. This isn't menial work, it requires legal knowledge and familiarity with copyright law, so you're looking at hiring entry-level law clerks, which is about 16$/hr or about 33k$/yr. It will cost Google about 5bln$/yr to check all uploaded content manually. Which seems like a lot but is like 10-20% of Alphabet Inc.'s gross profit, so not actually impossible.
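
The estimate above can be redone as a short sketch, taking the comment's own inputs as assumptions (400 hours uploaded per minute, 30 minutes of review per 10 minutes of video, about $33k per reviewer per year). The function names are hypothetical. With exactly these inputs the round-the-clock headcount works out to 72,000; the 96,000 quoted would correspond to 40 minutes of review per 10 minutes of video, so the downstream figures should be read as rough.

```python
import math


def reviewers_needed(upload_hours_per_minute: float,
                     review_minutes_per_video_minute: float) -> int:
    """Reviewers who must be working at any given instant to keep up with uploads."""
    video_minutes_per_minute = upload_hours_per_minute * 60
    return math.ceil(video_minutes_per_minute * review_minutes_per_video_minute)


def annual_cost(headcount: int, salary_per_year: float) -> float:
    """Yearly wage bill for a given number of reviewers."""
    return headcount * salary_per_year


# 30 minutes of review per 10 minutes of video -> 3 review minutes per video minute.
concurrent = reviewers_needed(400, 30 / 10)

# The comment's staffing figure and salary, used as-is.
cost_at_150k_staff = annual_cost(150_000, 33_000)
```

Here `concurrent` is 72,000 and `cost_at_150k_staff` is $4.95 billion, which matches the "about 5bln$/yr" figure for a staff of 150,000.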

Why not simply have humans review the videos that are automatically flagged? That would be a significant improvement over the current state of affairs, and would command a significantly more reasonable price.

Exactly, have the code extract the exact section of video that it considers problematic on left side and on the right side put the copyrighted work, the human will lose a few seconds to have a look.

Also I do not understand why Google does not use a reputation system: if a channel has a good reputation, then take this into consideration in the automatic detection.

At the very least do it for major accounts (4k hours a year). It's pretty ridiculous that pulling in hundreds of dollars won't get you even a 30 second review.

The question is whether you are more worried about false positives or false negatives.

If you are only worried about false positives, then the humans only need to review the content flagged by the AI. This has 2 benefits: 1) the "starting number" would be much much lower than 400 hours of content per minute. 2) If a 10 minute video is obviously pirated, you don't need to watch the entire video to decide that.

Are the only 2 choices (1) no human in the loop and (2) only humans in the loop?

There are some strong corrections that need to be made to your math.

1. Only a subset of uploaded videos need to be checked by humans, those that were flagged by the automated system.

2. Only a subset of the runtime of each flagged video needs to be checked. In other words, if a human decides within 12 seconds of a video that it is infringing, there is no reason to check the remaining portion of the video.
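
To see how much these two corrections shrink the number, here is a small sketch. The 400 hours per minute and the 3x review multiplier come from the estimate being corrected; the 1% flagged share and the 10% watched share are made-up placeholders, not YouTube figures, and the function name is hypothetical.

```python
def concurrent_reviewers(upload_hours_per_minute: float,
                         flagged_fraction: float,
                         watched_fraction: float,
                         review_minutes_per_video_minute: float) -> float:
    """Reviewers needed at any instant when only flagged videos are checked,
    and only part of each flagged video is watched."""
    for name, value in (("flagged_fraction", flagged_fraction),
                        ("watched_fraction", watched_fraction)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    video_minutes_per_minute = upload_hours_per_minute * 60
    reviewed_minutes = video_minutes_per_minute * flagged_fraction * watched_fraction
    return reviewed_minutes * review_minutes_per_video_minute


# Placeholder shares: 1% of uploads flagged, 10% of each flagged video watched.
example = concurrent_reviewers(400, 0.01, 0.10, 3)
```

With those placeholder shares the load drops from tens of thousands of concurrent reviewers to roughly 72; the real answer depends entirely on the actual flag rate.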

Today, in order to do a Google search, I had to prove to Google that I wasn’t a robot. And even though I did it right the second time, they still made me do it a third time. I just remember sitting there going, Google, I actually am a human and your AI sucks so if I have to tap on storefronts and cars one more time to do a freaking Google search then I’m just using Bing.

It's partly checking that you're a human.

It's partly using you for free labour to train their next generation of SkyNet.

We've reached a point with CAPTCHAs (originally conceived as a way for humans to exert their will over machines) where the machines are now setting tasks for the humans, and the humans are carrying out the tasks without putting up too much of a fight.

That. CAPTCHAs are used as free labour. The only reason you don't see them everywhere is because people would stop using Google.

We've done it. We've cracked the code.

This thread is complaining about the fully automated process used by Google to check for copyright violations, and how infeasible it is to hire someone to check each video.

You're talking about how CAPTCHAs are now being used to get free labour.

> Prove you're human: Does the following video contain Beat It by Michael Jackson?

I am pretty sure they have considered it. Probably there is a legal hiccup lying here -- you either have reasonable suspicion that the material you show is copyrighted, or you acknowledge your algorithms and AI suck and you require crowd-sourcing. Kind of a lose-lose legally, and they would possibly introduce liability.

Recall the goal of Google is to avoid liability via the statement "we are pretty good and have really low false negatives", and have as few false positives as not to make the platform unusable.

Google does not care about content creators or propagation of knowledge (instructive videos) and ideas. Simple statement here. Google engineers might (at least the friends I have), but not as a whole organization.

But yeah crowd-sourcing is a good way to do a second sifting through results. I know it is being used, by Google and others, already to classify video sources.

Why do site owners use Google CAPTCHA? Even Cloudflare uses it. I resent being forced to help Google when not even on a Google site. I won’t put any of my stuff on Cloudflare specifically because they force people to help train Google’s image recognition.

> Why do site owners use Google CAPTCHA?

Because it's free, reasonably hard to beat by bots and reasonably low-effort for the user if they let Google track them (enabling the single-checkbox captcha).

A few days back, to allow me to download the software I had just bought, the site kept putting up those damned pictures for me to click on. Street signs were the first three, I apparently got all of those wrong (even though I got them right), shop fronts was the next lot (that's a little harder to work through, I can't tell Chinese shop fronts from Chinese warehouses by text alone) and I had two of those, then more street signs, and finally cars.

It took longer to confirm I was human than it did to download the software. My partner came through to see if I was alright, after I was shouting abuse at the computer for an extended period. She sat through one of them with me, confirmed that I had it right, then swore and walked away when it decided I was wrong.

If I'd known that it was going to screw me over that much, I would have recorded it and put the video on YouTube.

Would that have infringed on Google's copyright?

ContentID is not AI by any definition of AI, it's signal processing.

I believe the implication is that youtube has replaced the human reviewers with an ML-based system.

ContentID isn't learning because it doesn't train to make inferences; it looks for exact signature matches on the audio, normalizing for alterations like pitch changes and equalization.

All sorts of computer applications that are not considered AI have replaced humans. For instance, databases replace clerks having to shuffle papers to and from filing cabinets.

I think saying signal processing is AI is like saying all of CS is AI, which is bizarre. There are all sorts of algorithms and disciplines in CS that are not AI. I think the commenter you're replying to was complaining about this.

It's automated decision-making.

Any algorithm that produces a Boolean result is "automated decision making". You pop in the inputs, let the automaton run, and there is your yes or no.

Yes, that's right. The boundaries of AI are qualitative, not quantitative. AI can be simplistic or complex, dumb or smart.

AI doesn't have to learn to be classed as AI. Source: Every video game ever made. Just look at Deep Blue. Most would consider that AI, but AFAIK there is zero machine learning involved.

Edit: I think I hit the wrong reply button, meant to reply to parent, haha.

I guess he won't go down in history like the leaded gasoline and freon guy. But I guess it's more like Nazi Germany where if it wasn't him, someone else would have.

Dr. Godwin, I presume? tips hat

I got a copyright strike once for a video with no apparent reason. After closer inspection, I could in fact hear a radio playing somewhere in the background if I really turned up the volume. I found it incredible that YouTube’s algorithm was able to pick it up at that signal-to-noise ratio.

For people with a signal processing background, this is actually a very trivial thing. The basic technique has been in use for probably a hundred years - radar being a classical example - nothing at all to do with Google. Basically, you can get a signal-to-noise ratio improvement proportional to the duration of the signal you're attempting to detect (in this case, audio), thus allowing you to detect very weak signals in the presence of strong noise or other unwanted signals. Look up "pulse compression" or "correlation detection" if you're interested.
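To make the correlation-detection point above concrete, here is a minimal sketch in plain Python. It is not how ContentID works internally (that isn't public); it only illustrates the textbook technique. The function names, the white-noise "song", and the parameters (`gain`, `ref_len`, and so on) are all made up for the demo. A known reference is buried in noise at an amplitude well below the noise level, and sliding a copy of the reference across the recording still produces a clear peak at the right offset, because the matched term grows with the reference length N while the noise term only grows like sqrt(N).

```python
import random


def correlate_at(recording, reference, lag):
    """Dot product of the reference with the recording window starting at `lag`."""
    return sum(r * recording[lag + i] for i, r in enumerate(reference))


def find_best_lag(recording, reference):
    """Slide the reference over the recording and return (lag, score) of the
    largest-magnitude correlation."""
    last_lag = len(recording) - len(reference)
    scores = [correlate_at(recording, reference, lag) for lag in range(last_lag + 1)]
    best = max(range(len(scores)), key=lambda k: abs(scores[k]))
    return best, scores[best]


def make_demo(seed=0, ref_len=1000, rec_len=3000, true_lag=1200, gain=0.3):
    """Build a hypothetical test case: unit-variance noise with a quiet copy of
    the reference (amplitude `gain`, i.e. well below the noise) added at `true_lag`."""
    rng = random.Random(seed)
    reference = [rng.gauss(0.0, 1.0) for _ in range(ref_len)]
    recording = [rng.gauss(0.0, 1.0) for _ in range(rec_len)]
    for i, r in enumerate(reference):
        recording[true_lag + i] += gain * r
    return recording, reference, true_lag


if __name__ == "__main__":
    recording, reference, true_lag = make_demo()
    lag, score = find_best_lag(recording, reference)
    print("true lag:", true_lag, "detected lag:", lag, "score:", round(score, 1))
```

A longer reference makes the peak stand out further, which is the "improvement proportional to duration" the comment describes. Real systems would use FFT-based correlation and robust fingerprints rather than raw samples; the brute-force loop here is only for readability.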

This is very true. At UC Berkeley, the very first linear algebra course all EE/CS freshmen take (EE 16A) has a lab that does EXACTLY this. You match part of a song with some very noisy sound. If freshmen are taught to do this, it's easily something Google can do.

Go Bears. I weaseled my way out of the requirement for 16A (when it was transitioning to replace EE 20/40), and I kinda regret it now.

Still, it is quite incredible for us laypersons :-)

If it's background noise, is it still a violation? Not suggesting an answer, just asking. Like, how (not in all countries) it's okay to get passers-by in your video shot and broadcast them uncensored?

If it can identify the song, why can't it process it out of the soundtrack?

They don't care. Individual videos don't matter to them.

I actually tried that and got a name of a song, and it wasn't a takedown but an agreement to monetise (i.e. add ads, which I didn't have much of a problem with), except that it was the completely wrong song. Naturally, that song was also available to listen to in a few other YouTube videos, and the only thing that they had in common with mine was a generic-sounding section of 120bpm bass drum. The melody, the duration, everything else was different.

I disputed, explicitly pointing to the other ones which I assume were identified correctly and how they had basically nothing in common with my video, even explaining that I'd be perfectly fine with it being monetised if it was actually the right one, and was... (not) surprisingly denied. I have a feeling that either no humans were involved at all, or if one was, they did not listen or have any common sense. In the end, I just deleted the video.

Now that I think about it, perhaps reuploading with the (wrongly) detected name, letting it monetise, and show up in search results might be a better idea --- at least that will make it more likely for others to stumble across it, listen, and possibly leave comments like "this isn't X, it's Y"...

Honest question: do you guys and gals think YouTube is really able (does it even try?) to identify the songs that Shazam or Siri can not identify - at all? I presume that if a song is unidentifiable for Shazam (or another service like it), then it's not popular and "important" enough on YouTube too so it doesn't get caught OR it's a folk song from a non-Western country.

It's highly probable that you can get the name of such a song much faster by asking about it in the comment section.

I've actually tried this with popular western songs. The difference seems to be that YouTube is better at identifying songs with a lot of background noise. Shazam/Siri/Google need to have cleaner audio to be successful (I'm not sure why the Google assistant doesn't use the same tech as YouTube, but it doesn't seem to).

I wonder, how many laws would I break if I implemented a Song ID service based on youtube APIs that way?

I know this is just filtering for now, but I've been backing up entire channels for fear they'll one day disappear because global policy or legacy media conglomerates won't allow for it. I imagine the "you" in YouTube will be mostly gone in the next decade. The writing's on the wall.

The really frustrating aspect is that one would need to implement advanced search functionality on those videos otherwise the more you archive the less chance you have of finding anything.

The "hosting" aspect of YouTube is really a small part of why the service dominates. The connections between channels and links from one video to another channel -- network effects and discovery -- are what make disrupting YT seem impossible.

Kind of crazy given how YT originated in juvenile joke vids, but YT may have the potential to become Google's biggest source of revenue despite any number of overbearing content policies. The new ads below videos for products of potential interest based on the video content are annoying but probably very profitable.

> network effects and discovery -- are what make disrupting YT seem impossible.

i highly doubt that. The youtube monopoly is due to both audience, and the monetization model available (which, atm is getting shoddier and shoddier every day with random demonetizations).

Hosting is a big issue, but it's a technical issue, and other platforms (such as Vimeo, or some other platform) have "solved" it. But those platforms don't dominate (or even show up as a blip!) because content creators can't use them to make their living.

>I've been backing up entire channels for fear they'll one day disappear because global policy or legacy media conglomerates won't allow for it.

Not a bad idea, as entire channels have gotten the boot for much less, from Youtube itself.

i have been doing the same, in particular with unofficial music remixes, which will likely disappear at some point and are not allowed on spotify etc.

There's a lot of fan made instrumentals out there (of astounding quality) that I've been archiving for fear of losing forever.

I learned my lesson after Nintendo went berserk and requested takedowns for gameplay videos and walkthroughs; smashrockgroin's series for Goldeneye 64 was both funny and informative when playing that game, but now those videos are gone forever. I've vowed never to buy a Nintendo product again because of that alone.

> I learned my lesson after Nintendo went berserk and requested takedowns for gameplay videos and walkthroughs

Going after pirates and such I can sort of understand, but people enjoying your product (and obviously helping to promote it)? WTF.

It reminds me of the complete opposite of this old meme: http://lurkmore.so/images/9/94/360_Kid.jpg

> but people enjoying your product (and obviously helping to promote it)? WTF.

The usual rationale is that walkthroughs / let's play videos show enough of the game that people may opt to just watch the video, instead of playing the game itself.

I can sort of understand this position when it comes to videos that e.g. show the entire single-player campaign with zero commentary. But I'm not convinced that people opting for such videos would have bought the game if the walkthroughs were not available.

> I'm not convinced that people opting for such videos would have bought the game if the walkthroughs were not available.

also, if the game has nothing else to offer when compared to a video playthrough, maybe it's the game that has the problem!

Not necessarily. Story-rich games are essentially one-time only experiences, even if they are very fun to play through.

Just like a good book, a good story-rich game invites you to experience it more than once. Even if it is just to notice nuances you missed the first time.

If we lose Mouth Moods[0] it will be a horrific cultural loss for the internet itself. Bustin' actually came up next on my "everything" playlist while I was typing this comment.

[0]: http://www.neilcic.com/mouthmoods/

Thankfully— at least in this case— Neil offering the album as a free lossless download on his website means countless archival-quality copies are saved on PCs the world over!

Yep, though I must admit I downloaded the lossless and uploaded it straight to Google Play Music where it gets converted to lossy.

Another reason to back up HD Youtube videos is that they'll "demote" them to SD if they don't get enough views.

I've never heard or seen that. Seems weird as well. Source?

they are probably talking about their caching. Videos with low views are not cached at all servers, especially not higher definitions. So you want to watch the video, the server realizes "oh hang on, I threw this one out, let me grab it again from another server". while the server fetches the fullHD video, you get it in 360p

I recently had a video with very low views that just wouldn't load for minutes on end. Was wondering if they were down, or if it had to be grabbed from some tape archive because a disk crashed. After about 10-15 minutes, it slowly loaded. This was at all qualities, browsers, and even when trying from another continent.

Is that true? Do you have a source on that?

Maybe I'm an isolated case, but it has happened to many of my videos from 5 years ago with few views.

They were filmed on a Nikon DSLR in 1080p. A few weeks after uploading they were fine, 1080p. Several months later, I noticed a few were 720p, and now many are 480p, but an extremely poor quality 480p that looks more like < 240. For a few that are still 720p, and some that are 480p, the framerate has been reduced to maybe 4 fps.

Maybe it was the codec Nikon was using, but regardless, they're basically gone, and I no longer consider Youtube a "safe" place for videos.

I would send links, but I don't want to dox myself.

Nice, do you have any summary of what process you take to back up channels? I've considered the same in the past, but found that there are a lot of crappy/sketchy plugins, and not too many legit/decent ones that actually get the highest quality copy of the content.

youtube-dl is great for this, you can run it periodically against a channel and only download new videos. See the --download-archive option. Using "-f best" will fetch the highest quality format, but you can be more specific with filters like "-f best[filesize<50M]".
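For anyone who wants to script the periodic run described above, here is a rough sketch in Python that just shells out to the `youtube-dl` binary. It assumes `youtube-dl` is on your PATH; the function names, the default archive file name `downloaded.txt`, and the example URL are placeholders of mine, not anything youtube-dl prescribes. The flags used (`--download-archive`, `--ignore-errors`, `-f`) are the ones discussed in this thread.

```python
import subprocess


def build_backup_command(channel_url, archive_file="downloaded.txt", fmt="best"):
    """Return the argv list for an incremental channel backup.

    --download-archive records the IDs of finished videos in `archive_file`,
    so repeated runs skip what was already fetched.
    --ignore-errors keeps going past unavailable videos.
    -f selects the format, e.g. "best" or "best[filesize<50M]".
    """
    return [
        "youtube-dl",
        "--download-archive", archive_file,
        "--ignore-errors",
        "-f", fmt,
        channel_url,
    ]


def run_backup(channel_url, **options):
    """Run youtube-dl and return its exit code (requires youtube-dl installed)."""
    return subprocess.run(build_backup_command(channel_url, **options)).returncode


if __name__ == "__main__":
    # Placeholder URL; substitute the channel or playlist you want to mirror.
    print(" ".join(build_backup_command("https://www.youtube.com/channel/EXAMPLE")))
```

Run it from cron or a systemd timer and each pass should only pull videos that are not yet listed in the archive file.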

I actually use -f bestvideo+bestaudio, since it downloads and muxes the best combination of DASH video and audio.

ArchiveTeam also has a recommended list of flags for archival: https://www.archiveteam.org/index.php?title=YouTube

>I actually use -f bestvideo+bestaudio,

It does that by default. (If you have ffmpeg)

Thanks! I'll be doing this from now on. Had no idea about that.

I'll throw in my vote for youtube-dl as well. I love that it's so cross-platform; there's even a port for OpenBSD which makes it super easy to enjoy YT videos on that OS without having to run a resource-heavy browser, and it can be scripted.

Yeah, I'm basically doing what you describe:

`youtube-dl -f best -citw -v <url of channel or playlist>`

If I recall, that's going to ignore errors. At an airport right now so I can't remember exactly what those flags are.

youtube-dl has this in their FAQ:

> Do I always have to pass -citw?

> By default, youtube-dl intends to have the best options (incidentally, if you have a convincing case that these should be different, please file an issue where you explain that ( https://yt-dl.org/bug )). Therefore, it is unnecessary and sometimes harmful to copy long option strings from webpages. In particular, the only option out of -citw that is regularly useful is -i.

But OK, here's an explanation of the options you used:

• -f best: Select the best quality format represented by a single file with video and audio. (By default, yt-dl will merge best video with best audio if that's what's available.)

• -c: Force resume of partially downloaded files. (By default, yt-dl will resume downloads if possible.)

• -i: Continue on download errors, for example to skip unavailable videos in a playlist.

• -t: Use title in file name. (Deprecated. This is the default.)

• -w: Do not overwrite files.

• -v: Print various debugging information.

I thought it already fetches the highest quality by default?

It has for a while.

Really! I had no clue about this one (it has been a while since I searched around for such a thing).. that's great! Thanks for the tip, I'll check it out.

Recommended storage device and disk drives?

https://www.backblaze.com/blog/open-source-data-storage-serv... ; it's what I use to keep anything in cold storage that can't immediately go into the Internet Archive or that they're unable to dark after I upload.

Wait so you bought a $10k 60 disk storage platform? Or built one?

Built (several). My background in tech is ops/infrastructure/networking/devops.

That's awesome. I would like to build something similar, though maybe scaled down to 1/2 size/capacity. Did you follow their parts list exactly or deviate (e.g. what did you use for chassis since theirs is custom)?

> what did you use for chassis since theirs is custom

FWIW, in addition to their awesome disk stats Backblaze also provides blueprints for their chassis as well as a BOM for the parts inside

> Backblaze also provides blueprints for their chassis

Wow, I had no idea they did that. I definitely have a new respect for them.

What does "dark" mean in this context?

Publishing of content that would cause legal or other harm to the Internet Archive. It’s still stored on disk, but not accessible.

To clarify, to “dark” an object is to make it unavailable but continue to store it.

You can also use Yizzy[0]. It’s just a GUI on top of youtube-dl but it’s very easy if you don’t like command line. It’s only macOS compatible by default but should work anywhere if you change the python path. Like youtube-dl it also allows you to download whole playlists by just pasting the link.

0 - https://github.com/biko-the-bird/Yizzy

also https://github.com/MrS0m30n3/youtube-dl-gui

and there are plenty of others too.

it's quite a shame. google are not being good chaperones of what has become an amazing service and resource.

hopefully a completely decentralized replacement will arise.

(OT: wow, HN strips emoji!)

> hopefully a completely decentralized replacement will arise.

Ironically, we used to have a nice, decentralized Internet, and services like YouTube have been probably the biggest cause of damage to it.

Not so many years ago, everyone got a bit of web space and their own URL along with the connection from their ISP. You could put your own web site there, for all to see. Many ISPs would allow running some basic software. Visitors would find these sites through search engines or [gasps] manually created links from other sites with relevant material.

An analogous service today would probably involve providing a useful amount of space and bandwidth, and probably some ready-made blogging or video hosting software so non-geeks could just start creating something instead of having to install anything first. But in reality, almost no ISPs (at least here in the UK) routinely offer that sort of service any more, because the likes of YouTube and Medium have killed it.

i do think the search, recommendations, following, ease-of-uploading, and unlimited hosting capacity that youtube gives you are of real, incredible value. i can find videos on youtube of how to do practically anything, uploaded by every kind of person imaginable.

practically speaking, people creating their own webpages and hoping to get indexed by a search engine or included in a hand-curated directory is not going to get anywhere close to that. you restrict your creators to a tiny, tiny fraction of the population w/ those skills.

so the trick is how to have the great ux of these centralized services without the downsides of centralized control. seems like there's a good chance it's possible.

(for other services, we used to have (well, still do) pretty good decentralized variants. slack vs irc, reddit/hn vs newsgroups, email vs fb messenger. but even the more minor usability improvements of the centralized alternatives seem to be enough for them to displace what came before. maybe the tide will swing back.)

> i do think the search, recommendations, following, ease-of-uploading, and unlimited hosting capacity that youtube gives you are of real, incredible value.

They're certainly useful. I just question whether we wouldn't have found other ways to achieve these kinds of benefits if we didn't have YouTube, with the entire Internet interested in the results and well over a decade to come up with something.

perhaps. hopefully we still might.

random example of youtube shittiness i just stumbled over:


What reason is there to think it would? There used to be a bunch of services YouTube competed with; now few of them are even worth mentioning, let alone serious competitors.

i think there's a couple of reasons for optimism:

1. high-speed net access is becoming more and more common; higher speeds help make the possible overhead of distributed solutions less of a barrier

2. there is more awareness now of the downsides of having a centralized provider. many youtube creators and users have been burned by google; i think that'll generate real demand for a decentralized solution.

3. existence/arguably success of decentralized currencies provides a lot of psychic energy/motivation for making decentralized things.

Same here. Backing up a lot of channels I like but also lots of music videos, interviews, live shows, informational videos, etc.

Hopefully the next great online video service will have support for displaying the original encoding for uploaded videos, saving them all another generation of quality degradation due to transcoding.

I've been doing the same to back up fan translations of various things. Almost certain they will disappear at some point, as they are definitely infringing, but it's the only way I can enjoy those media (most of which I otherwise own).

Out of curiosity, what sorts of channels/content are you backing up?

People have a tendency to misunderstand everything I do, so I'd rather not get into specifics, but in summary:

- Intellectual discussions

- Fan made instrumentals and remixes (especially for Rammstein)

- Found footage, old shows, and other nostalgic material from VHS tapes. (wouldn't surprise me if they get flagged for having Folgers commercials from 1993)

- Random funny videos with swearing or vulgar language

Not a whole lot of things that are likely to be relevant to most people, but I've a culture of my own and I'd like to preserve that. To forever lose something I enjoy means losing a bit of myself and not being able to share that with others.

I've been thinking a lot lately about how various small internet subcultures and communities that I've been a part of have been lost to the void. It's one of those things that makes me a special kind of sad and nostalgic.

Would you be at all interested in software that helps to construct a permanent showcase of the stuff that matters to you? It would handle things like archiving to enable a focus on curation and presentation.

I've been thinking of making something like this because services like the internet archive don't seem quite right. There's no context. I just worry that nobody would be interested in doing the work of providing that context.

Yeah, I'd definitely be interested in seeing something like that.

That definitely seems like the kind of content that would end up getting removed, but ironically is the kind of content that services like YouTube grew up on.

How do you back up these? Just to a drive on your PC? To a network drive / server etc?

And what software do you use? Streamlink?

I have a hard drive connected to a Raspberry Pi with Samba installed. Samba is terribly slow, so I just ended up bypassing that and installing youtube-dl directly on the Raspberry Pi. I'd like to find a more robust solution, but for now that works.

Streamlink looks like a great project. Thanks for mentioning it!

There's an update buried at the bottom of the article. Ton Roosendaal tweeted[1] that they sent Blender a contract to enable monetization. I can see why he doesn't want to do it based on the principles of the Blender Foundation. Sounds to me like Youtube is holding these creators' content hostage because they don't want to put up ads.

He put the contract up for review as well: http://download.blender.org/institute/Contract%20between%20G...

[1] https://twitter.com/tonroosendaal/status/1009010581549060097

Ought to tell Torrentfreak and all involved. It's not a piracy filter gone wrong, it's monetization humans at Youtube.

Let me first say that this law is taking it too far. There is a need for better legislation around copyright, but this one is just moving the needle from platforms abusing copyright to rights holders abusing the platforms.

However the situation around YouTube is not a direct consequence of poor technology, but rather poor policies. YouTube built 2 separate products, a) Content ID (CID), b) partnership program.

CID is the system that, based on fingerprinting algorithms, is able to identify videos containing same segments (both audio and visual). Technically it's actually pretty good. However through the partnership program, YouTube allows rights holders to abuse the system by providing reference files that they may not have direct rights for, or allow them to claim any content just because they say so.

This however has nothing to do with the technology itself, but rather with YouTube's decision not to challenge rights holders directly (they can and will sue, while users uploading content will most probably not).

There are ways to deal with this kind of situation, but most of the platforms, especially YouTube and Facebook are somehow not interested.

They can, will, and have. ContentID came after the Viacom lawsuit.

YouTube seems to be more of a monopoly than Google Search. If only Facebook created a dedicated site for Facebook videos. Fbtube, fbv.com...

Microsoft has a YouTube like service but limits it to enterprise.

Since video hosting is such a resource intensive undertaking, I wonder why many free porn sites don't have outages and have no limit on uploads for free users - unlike YouTube competitors.

> Since video hosting is such a resource intensive undertaking, I wonder why many free porn sites don't have outages and have no limit on uploads for free users - unlike YouTube competitors.

Because they can run far riskier and less-vetted ads. They can also run as many of them as they want - the "target audience" doesn't really care if there's 1000 banner ads on a page. There's a reason much of malware is spread from porn sites.

For consumer facing video sites, they have to compete with YouTube on UX, which means much more stringent ad vetting, and far fewer ads. This likely tips the balance just enough that normal online video startups are not feasible without 3-4 years of runway.

>There's a reason much of malware is spread from porn sites.

I thought this was no longer the case

With the death of Flash, if malware ads are getting past Chrome’s sandbox isn’t there like $100k at Pwnium just waiting to be claimed?

100k is peanuts compared to what you can make bitcoin mining on the computers of all pornhub users.

I meant to say, if such a piece of malware were ever actually deployed in the wild researchers would be scrambling to decompile and reverse engineer it, to determine how it worked in order to report the vulnerability.

What about Vimeo? I know its target market has pivoted to more high-end indie creations; however, aren’t they somewhat positioned at least to exploit any holes in YouTube’s strategy? This is one of those things where I feel like a real competitor will only sort of be able to creep up, not explode on the scene. I always thought Vimeo had a decent chance here because it’s pretty easy (imo) to use just like YouTube and they have free tiers (or at least when I signed up they did).

Honestly, I wouldn't be surprised to see PornHub launch a YouTube competitor strategically as YouTube is flailing.

They already run a high functioning video network. They really just need a SFW frontend with alternate branding (and no adult videos).

With the recently released VPNHub [0], it seems like they're willing to branch out and I see this as a possibility.

[0]: https://vpnhub.com

I think a big chunk of the problem now is that when people want to watch regularly released video content from internet creators they reach to YouTube, there's no integration of subscriptions across sites. You'd basically have to scrape video feeds from youtube to compete in this way.

>You'd basically have to scrape video feeds from youtube to compete in this way.

Youtube still offers RSS and it's reliable and chronological unlike youtube's subscription feed algorithm.
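As a rough illustration of the RSS point above, here is a small standard-library sketch that reads a channel feed and lists entries newest first. The feed URL pattern (`/feeds/videos.xml?channel_id=...`) is the one I believe YouTube serves, but treat it as an assumption and check it against a real channel. The function names and the sample feed are made up for the example, and the feed is assumed to be Atom.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Atom namespace used by the feed elements.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def channel_feed_url(channel_id):
    """Build the per-channel feed URL (pattern assumed, see note above)."""
    return "https://www.youtube.com/feeds/videos.xml?" + urlencode({"channel_id": channel_id})


def parse_entries(atom_xml):
    """Return feed entries as dicts, sorted newest first by <published>."""
    root = ET.fromstring(atom_xml)
    entries = []
    for entry in root.findall("atom:entry", ATOM):
        link = entry.find("atom:link", ATOM)
        entries.append({
            "title": entry.findtext("atom:title", default="", namespaces=ATOM),
            "published": entry.findtext("atom:published", default="", namespaces=ATOM),
            "url": link.get("href") if link is not None else None,
        })
    # ISO 8601 timestamps in the same timezone format sort correctly as strings.
    entries.sort(key=lambda e: e["published"], reverse=True)
    return entries


# Hand-written sample feed for illustration only; not real channel data.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example channel</title>
  <entry>
    <title>Older video</title>
    <link rel="alternate" href="https://example.com/watch?v=aaa"/>
    <published>2018-06-01T10:00:00+00:00</published>
  </entry>
  <entry>
    <title>Newer video</title>
    <link rel="alternate" href="https://example.com/watch?v=bbb"/>
    <published>2018-06-15T10:00:00+00:00</published>
  </entry>
</feed>"""

if __name__ == "__main__":
    for item in parse_entries(SAMPLE_FEED):
        print(item["published"], item["title"], item["url"])
```

To use it on a live channel you would fetch `channel_feed_url(...)` with `urllib.request.urlopen` and pass the response body to `parse_entries`.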

If only there were some way to Syndicate website updates to a centralized reader in a Really Simple way...

Seriously though, I dream of a world in which all apps had a "subscribe on your news reader" button which could open an RSS client, maybe something implemented by mobile OS providers so it's easy for new users but overridable by users like the browser or calling apps would be ideal.

I think RSS is a vastly underused technology/tool. I wonder at what point people lost hope in it and abandoned. Not to say, it's completely abandoned, but it's not used as much as it should be used.

I personally feel like and I remember very distinctly that RSS (and Atom Feeds to boot) seemingly hit their peak with google reader and when google reader was discontinued RSS seemed to wane quickly thereafter. I’d be very interested in knowing if hard statistics back this up or not but my anecdotal experience suggests that this is indeed the case

I think Twitter's coming of age was the big blow. When Reader shut down it was already waning.

This isn't a competition problem: Any competitor that gets big enough to challenge YT will attract the same scrutiny from content rights owners and will face the same copyright issues that YT does.

The problem isn't in the content hosting industry, it is on the policy & legal side.

Yet this issue is on youtube preemptively blocking first party content by established organizations. This isn't Babby's First Copyright Violating CS:GO Montage, this is content released by the same university that trained a chunk of youtube's employees.

The issues we are seeing can be easily fixed by some simple common sense controls on what levels of certainty are required to take down a video and/or channel that scale with how established the channel is, including requiring escalating levels of human review.

With the "advertiser friendly" issue, the same approach can be taken, topped off with a monetize time prompt or form that allows them to assert how advertiser friendly the uploader understands the video to be (weighted by the history of how accurate their answers have been in the past).

These are such solved problems, that youtube really shouldn't be having them.
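The "certainty scales with how established the channel is" idea in the comment above could look something like the sketch below. Every name and number here is hypothetical (the thresholds, the one-year and five-year cutoffs, the `Action` labels); nothing is taken from how YouTube actually operates. The point is only that the rule is a few lines of logic: older channels with a clean record need a stronger match before anything happens, and any action against them goes to a human rather than being automatic.

```python
from enum import Enum


class Action(Enum):
    NO_ACTION = "no action"
    AUTO_CLAIM = "automatic claim"
    HUMAN_REVIEW = "queue for human review"
    SENIOR_REVIEW = "escalate to senior human review"


def required_confidence(channel_age_days, prior_valid_strikes):
    """Hypothetical threshold: raise the bar for older channels, lower it
    for channels with previously upheld strikes."""
    threshold = 0.80
    if channel_age_days > 365:
        threshold += 0.10
    if channel_age_days > 5 * 365:
        threshold += 0.05
    if prior_valid_strikes > 0:
        threshold -= 0.10
    return threshold


def decide(match_confidence, channel_age_days, prior_valid_strikes):
    """Map a match score in [0, 1] and channel standing to an action."""
    if match_confidence < required_confidence(channel_age_days, prior_valid_strikes):
        return Action.NO_ACTION
    established = channel_age_days > 365 and prior_valid_strikes == 0
    if not established:
        return Action.AUTO_CLAIM
    # Established channels never get an automatic claim; review escalates with age.
    if channel_age_days > 5 * 365:
        return Action.SENIOR_REVIEW
    return Action.HUMAN_REVIEW


if __name__ == "__main__":
    print(decide(0.85, 30, 0))      # new channel, decent match
    print(decide(0.85, 4000, 0))    # decade-old channel, same match
    print(decide(0.99, 4000, 0))    # decade-old channel, near-certain match
```

Whether any particular set of thresholds is sensible is a policy question; the sketch only shows that the mechanism itself is not technically hard.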

> some simple common sense controls on what levels of certainty are required to take down a video

This is solved already, but there will always be false positives.

What isn't solved, and likely never will be, is copyright.

But it seems that the issue isn't due to content ID, but contractual issues.

Ya, youtube is obfuscating this, but it seems they are changing the terms to force monetization on certain kinds of content where they never did before. As in, force the creator to take a cut (as well as enable more ads to be shown.)

It seems like an odd play, increase supply of ad slots while dealing with a demand decrease from random companies dropping ads during drama. They must have signed some kind of deal recently.

If they are trying to increase the ad slot supply expect actions against ad blockers to come next.

There's also bitchute.com and d.tube as credible youtube alternatives.

Very annoyingly youtube-dl refuses to support bitchute for reasons that aren't exactly clear: https://github.com/rg3/youtube-dl/issues/14052

If I can't get data off bitchute easily using de facto standard tools (I get that I could use a bittorrent client, but that's beyond the point), I have little interest in putting data on it.

Doesn't seem so? Seems to me that some user @tv21 was arguing against it for some reason, and @dstftw locked the conversation to avoid the argument. The issue is still open, implying it's free to be worked on. Has anyone submitted a PR?

It's open-source so... just fork it.

(I know you're not a developer, but this is a perfect use-case for forking and continuing.)

Easier to just not put anything worthwhile on bitchute to be frank. youtube-dl needs constant vigilant updates to keep it working, maintaining a fork of it is a serious long-term commitment.

Just pull in updates from them... No need to maintain it all by yourself.

For very loose values of credible.

You can upload videos to facebook though. In fact, that's the only way to get videos to play inline in peoples' timelines.

Yes but you can't browse Facebook videos without a Facebook account. You can view them if people directly link them, but viewing someone's video page is blocked, let alone actually searching for videos.

Terrific, I can't wait for what will happen with the forced content filter that will be voted on in the European Parliament on June 20th. What could go wrong...


For as much grief as the US Congress gets for its lack of knowledge in tech, I would take (current) US laws over EU and GB restrictions any day.

Somewhat related article from yesterday: https://news.ycombinator.com/item?id=17333920

("Politicians, about to vote in favor of mandatory upload filtering in Europe, get channel deleted by YouTube’s upload filtering")

It's also possible to upload content to the ContentID system without having any claim to the work. Often nefarious companies will upload swaths of royalty-free content or other content they don't own (large sound effects libraries) and claim all of the videos' ad revenue. They never contest the response and the flag disappears.

I don't understand why no other video platform has broken into YouTube's market share, after the last couple years' constant stream of PR disasters which mostly have to do with the scale of the platform

Beyond the issues of infrastructure cost and litigious publishers as mentioned by others, there is also the user base.

If users are on YT, that is where the content creators will be. And if the content creators are on YT, that is where the users are. Moreover, users have curated lists of subscriptions and creators have large lists of material. Finally, YT has access to a lot of data on viewing behavior, and likes. This means YT should be a lot better at recommending videos to users.

Finally, the creators that most want to switch to an alternative are those who are the most censored. These tend to be more controversial, which means these alternative services gather controversial content. This tends to give these places a bad reputation which pushes away the non-controversial content.

The best bet would probably be to take a non-controversial niche, start there, and try to integrate well with YT to keep switching costs down. It's still hard to provide actual added value to such a niche. Alternatively, one might take a controversial niche with wide acceptance. For example, gun channels or LGBTQ+ channels. Both are currently demonetized, but both are still widely accepted in broad swathes of society.

> YT should be a lot better at recommending videos to users.

This is hardly the case, as any avid YouTube watcher can tell you. Recently, the algorithmic suggestions have been terrible: they either show repeated content, or content that's not even close to what's being watched (but the SEO/words in the description/title match), or a huge list of videos from the same channel.

It's expensive at least partially because publishers will sue the crap out of you if you don't invest heavily in content ID etc. It creates a huge barrier to entry, which is why copyright laws should be reformed to respect the idea of "common carrier" sites which don't discriminate on content.

(US) Copyright law does have that idea, the DMCA encodes it explicitly as the "safe harbor" provision.

Because it ties in two barriers to entry that have rarely existed together before: minimum efficient scale and network effects.

Facebook only has network effects, there isn't anything particularly efficient or advanced about what it does. Given its minuscule infrastructure when compared to Google and Amazon, it would be very hard for it to achieve the same level of cost that Google has on YT. Pure search doesn't rely so much on scale as well.

No other provider has either the network effects that YT does or the scale to be able to compete with Google on YT.

From a scale perspective, the only companies who could effectively compete with Google would be Microsoft and Amazon. Microsoft is still trying to figure itself out, especially on the consumer side. Amazon is geographically limited and has very little expertise in open content and ad-funded platforms, but it is trying with Twitch.

It’s so expensive that only Google can afford it.

Pornhub is doing well.

As I understand it MindGeek is pretty big but still only a small fraction the size of Youtube in terms of traffic. And they've yet to create a non-porn version of their site. I've heard rumors that they've at least considered it, but so far it hasn't happened.

YouTube's video quality and speed are unparalleled. There are times on my phone when even jpgs on imgur take longer to load than 1080p video on YouTube. I don't know exactly how they pull it off, but I guess it's a combo of local caching and deals with ISPs and phone carriers. Smaller competitors have no chance to compete.

It's very hard to do and very very expensive. An Indonesian company tried to build a Youtube competitor: https://vidio.com. I know many friends who work there. From their stories, we can assume that it takes a lot of smart people and a lot of money to build one.

So YouTube takes down educational content while at the same time recommending knockoff cartoons featuring violence, sexualization, and gore aimed at traumatizing and corrupting the minds of our children. Way to tip the scales YouTube!

YouTube has [addressed the issue](https://venturebeat.com/2018/06/18/youtube-is-working-to-res...). There is no mention of a copyright violation; it says it was updating user agreements.

It seems YouTube's statement here is really disingenuous. When they say "We are working with MITOpenCourseWare and Blender Foundation to get their videos back online." they mean that they are holding these non-profit organizations' video hosting rights hostage until they agree to enable ads! https://news.ycombinator.com/item?id=17347560

Does YouTube plan to deploy its "enhanced" upload filters globally, or just in Europe?

It seems Phase 1 of analog to digital media (music, TV shows and movies) was late 90s to early 2000s. It was going to happen but the teething pains lasted quite a while. CDs and DVDs were abandoned for USB then just a file.

Now it seems Phase 2 will be trying to control the massive amount of media users share. It's a whack-a-mole game against things like Kodi and sharing media.

For YouTube it's trying to delete copyright violations, and it gets rid of hundreds of accounts which are just created again and the material uploaded again. Even an "AI" can't cope.

At some point it would be nice to see all video, all music available to anyone for a reasonable price. "You will!" as Tom Selleck said. Really it's going to be that way eventually otherwise enormous amounts of money and effort will be spent trying to chase down endless illegal copies.

CDs, DVDs, and Blu-rays are still very much around.

Blender seems to be testing #peertube

but: "PeerTube uses the BitTorrent protocol to share bandwidth between users. It implies that your public IP address is stored in the public BitTorrent tracker of the video PeerTube instance as long as you're watching the video"

Does this have GDPR ramifications?

Not really, since you opt-in to use the service and accept that your IP is shared there is no issue.

I'm hopeful that as everything on the internet becomes illegal in Europe, it will be the equivalent of nothing being illegal. It seems like companies could just play the stall-in-court game forever with all of the claims that can be made.

It's becoming a common occurrence to fall victim to false positives from algorithmic filters. False positives are just always going to be part of any kind of machine-learning algo. In the end it's statistical probability, and probability doesn't equal certainty. Sure there are techniques to reduce false positives, but I sincerely think that any user-affecting policy based on ML / algos should always have a real-live human based support fallback to get real resolution to the real issue of false positives.
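
To put rough numbers on the "probability doesn't equal certainty" point, here is a back-of-the-envelope sketch. Every figure in it (upload volume, share of infringing uploads, filter accuracy) is a made-up assumption for illustration, not anything YouTube has published, and the function names are hypothetical.

```python
# Back-of-the-envelope base-rate arithmetic for an upload filter.
# All numbers below are hypothetical assumptions, not real platform data.

def expected_false_positives(uploads: int, infringing_rate: float,
                             false_positive_rate: float) -> float:
    """Expected number of legitimate uploads that get wrongly flagged."""
    legitimate = uploads * (1.0 - infringing_rate)
    return legitimate * false_positive_rate


def flag_precision(infringing_rate: float, true_positive_rate: float,
                   false_positive_rate: float) -> float:
    """Share of flagged uploads that are actually infringing (Bayes' rule)."""
    true_flags = infringing_rate * true_positive_rate
    false_flags = (1.0 - infringing_rate) * false_positive_rate
    return true_flags / (true_flags + false_flags)


if __name__ == "__main__":
    uploads = 1_000_000          # assumed uploads per day
    infringing_rate = 0.01       # assumed: 1% of uploads infringe
    true_positive_rate = 0.99    # assumed: filter catches 99% of infringing uploads
    false_positive_rate = 0.01   # assumed: filter wrongly flags 1% of legitimate uploads

    wrongly_flagged = expected_false_positives(
        uploads, infringing_rate, false_positive_rate)
    precision = flag_precision(
        infringing_rate, true_positive_rate, false_positive_rate)

    print(f"Legitimate uploads wrongly flagged per day: {wrongly_flagged:,.0f}")
    print(f"Share of flags that are correct: {precision:.0%}")
```

Under these assumed rates, a filter that sounds "99% accurate" still flags thousands of legitimate uploads a day, and only about half of its flags are correct, because legitimate uploads vastly outnumber infringing ones. That is the case for keeping a human in the appeals loop.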

I never suspected the balkanization of the internet would occur, particularly with such a coordinated effort across nations.

Why is this content being blocked in the U.S.?

For the moment all content is blocked at Google's sole discretion, because no European law has been approved yet. So they are probably just blocking everywhere.

Same reason why DMCA takedowns remove content uploaded in EU.

Damned if you do, damned if you don't. Google gets roasted by the press no matter what it does.

They are blocking original educational content from everybody, including members of the organizations producing that content. Seems perfectly reasonable to criticize such a mistake on its own merit, independent of whatever other criticisms you are referring to.

They have to block them. Apparently these partners had >2 years time to sign their new contracts to be policy compliant with latest legal regulations around different countries.

Apparently they didn't do it, and they didn't respond to any outreaches. YT legal team _had_ to react.

Maybe they shouldn't exist.

In the short term this is annoying, but in the long term I think this is really great. It's almost as if the market is pushing people to adopt decentralization technologies, which would have had difficulty gaining adoption without this type of push.

Google is dominant in search and YouTube; both, especially the latter, need a competitor, kind of a burger-king to MacDonald.

Any competitor that gets big enough to challenge YT will attract the same scrutiny from content rights owners and will face the same copyright issues that YT does.

The problem isn't in the content hosting industry, and it isn't even in the content producing industry. It is on the policy & legal side.

Case in point: twitch.tv added a Content ID-type system to their VODs (videos on demand) in 2014.

Is that the place with the Golden Arcs and the BigMic?

You're thinking of McDowell's.

It's basically impossible for a competitor to Google to just pop up; there are too many costs associated with it. Google has too much data to be competed with. Though I would not rule out that Amazon may enter the search space.

I'd like to think that the burger-king role is already taken by Bing. It would be nice to have a truly 'better' search engine for the discerning folk.


I quite like DDG and use it almost exclusively (sometimes I need to go to Startpage), but they've got quite the hell of a mountain to climb before "duckduckgoing" becomes a household verb.

I do hope they get there, though. Probably without the verbing of their name, though, since it's a bit of a mouthful.

This is why true net neutrality covers services like Google, Facebook, reddit, and YouTube. But you won't see anybody mention this anywhere.
