simonezchen OP t1_j603wgp wrote on January 26, 2023 at 7:42 PM

#1,512,358

Source: https://docs.google.com/spreadsheets/d/1UbekWhTLJKQj6ZLipg1R269CQ8g0ACDbzPRDFN14inc/edit?usp=sharing

Tool: Canva

Check out more about our AI Search Engine: https://atila.ca/blog/tomiwa/atlas

[deleted] t1_j6041c5 wrote on January 26, 2023 at 7:43 PM

#1,512,387

[deleted]

Error83_NoUserName t1_j604jaj wrote on January 26, 2023 at 7:46 PM

#1,512,455

I'm there sitting with my 50k hour Plex library...

MurdrWeaponRocketBra t1_j6093np wrote on January 26, 2023 at 8:14 PM

#1,513,052

This is really cool. I'm trying to understand how this works... would you have to store transcripts of all 800 million videos on YouTube? How often does this transcript database get updated?

Notaprumber t1_j60ekmp wrote on January 26, 2023 at 8:48 PM

#1,513,715

95% of YouTube is reposted garbage, or some dude pointing up with his face below a video

[deleted] t1_j60f24j wrote on January 26, 2023 at 8:51 PM

#1,513,792

[removed]

rikspik t1_j60g2yt wrote on January 26, 2023 at 8:57 PM

#1,513,927

I would have guessed youtube to be second, after wikipedia. Looks like I was way off then. How do you compare? Pageviews?😁

Thenerdy9 t1_j60h6td wrote on January 26, 2023 at 9:04 PM

#1,514,090

yes! yes yes! Where is this search engine gimme gimme :)

simonezchen OP t1_j60igl5 wrote on January 26, 2023 at 9:12 PM

#1,514,249

Replying to rikspik (#1,513,927)

Check out the data source where we showed how we calculated!

[deleted] t1_j60ira8 wrote on January 26, 2023 at 9:14 PM

#1,514,294

Replying to Thenerdy9 (#1,514,090)

[removed]

DenL4242 t1_j60ko3y wrote on January 26, 2023 at 9:25 PM

#1,514,563

If Mt. Everest were a cow, it would be the largest cow on earth.

Ruleyoumind t1_j60mcbl wrote on January 26, 2023 at 9:36 PM

#1,514,786

Do you have a link to the search engine?

RepulsiveLeather8504 t1_j615i88 wrote on January 26, 2023 at 11:43 PM

#1,517,352

Replying to DenL4242 (#1,514,563)

It's complicated enough as it is.
I don't see any reason to include your mother in this.
We are trying to be nice here.

actvdecay t1_j61a324 wrote on January 27, 2023 at 12:15 AM

#1,517,885

Replying to [deleted] (#1,512,387)

Right. Library of what exactly?

Name general categories… or what would be the backbone of Organization- cats?

actvdecay t1_j61a7j7 wrote on January 27, 2023 at 12:16 AM

#1,517,897

I wonder what an AI chatbot trained on the YouTube library would say…

NovaticFlame t1_j61as5y wrote on January 27, 2023 at 12:20 AM

#1,517,975

How is this beautiful? There’s not even a y axis FFS

BlizzardArms t1_j61c6rd wrote on January 27, 2023 at 12:30 AM

#1,518,168

This data just makes me wonder what we’d know if Alexandria hadn’t burned

[deleted] t1_j61hsf6 wrote on January 27, 2023 at 1:12 AM

#1,518,829

Replying to Ruleyoumind (#1,514,786)

[removed]

Hentai_Yoshi t1_j61l1pg wrote on January 27, 2023 at 1:36 AM

#1,519,251

Bro, did you never pay attention in school? Label your y axis. Like millions of what?

Lethlnjektn t1_j61mlm8 wrote on January 27, 2023 at 1:48 AM

#1,519,449

The library of Alexandria sheds a tear because of 999,999,990 terrible hours of "content" on YouTube

Putoigituresse t1_j61nqnd wrote on January 27, 2023 at 1:56 AM

#1,519,594

Replying to [deleted] (#1,512,387)

I’m actually crazy impressed by how low Reddit is on that list. I have a hard time believing the Library of Congress has more text than all of Reddit, Twitter, YouTube, and Wikipedia combined

Affectionate-Iron385 t1_j61w8q7 wrote on January 27, 2023 at 3:00 AM

#1,520,707

yes, and the library with the most amount of 💩

Dyzerio t1_j62392s wrote on January 27, 2023 at 3:55 AM

#1,521,618

Replying to Putoigituresse (#1,519,594)

Doesn't a lot of classified stuff get put in the library of Congress? 100 page long reports could fluff that up but also not exactly sure what the units are

pookiedookie232 t1_j627vhb wrote on January 27, 2023 at 4:35 AM

#1,522,165

Replying to DenL4242 (#1,514,563)

Without a banana for scale I can't really understand the significance of this

pookiedookie232 t1_j627xeh wrote on January 27, 2023 at 4:35 AM

#1,522,169

I feel like PornHub should be on this list...

Olhapravocever t1_j629gnz wrote on January 27, 2023 at 4:49 AM

#1,522,320

Replying to DenL4242 (#1,514,563)

If my grandma was a bike....

[deleted] t1_j62a90h wrote on January 27, 2023 at 4:57 AM

#1,522,391

Replying to Dyzerio (#1,521,618)

[deleted]

thisoldmould t1_j62bxx8 wrote on January 27, 2023 at 5:13 AM

#1,522,557

Replying to Olhapravocever (#1,522,320)

If my grandma had wheels, she would’ve been a bike.

[deleted] t1_j62cysv wrote on January 27, 2023 at 5:23 AM

#1,522,677

Replying to thisoldmould (#1,522,557)

[removed]

ZeusTheRecluse t1_j62mm1t wrote on January 27, 2023 at 7:09 AM

#1,523,649

I'm left wondering:

if youtube is only third, what the hell is in the Library of Congress and British Library... seriously...
Why have i never heard of the Library and Archives Canada (I. Am. Canadian).
Wikipedia sooooo small???? damn, wow....

worriedshuffle t1_j62s8xu wrote on January 27, 2023 at 8:22 AM

#1,524,190

Y axis isn’t even labeled and this is called beautiful data

It’s even funnier the article lists “calculations” but even that isn’t clear how books on YouTube is calculated

[deleted] t1_j62u3kw wrote on January 27, 2023 at 8:48 AM

#1,524,336

[removed]

ThePreciseClimber t1_j62x11h wrote on January 27, 2023 at 9:29 AM

#1,524,625

Replying to Olhapravocever (#1,522,320)

You could ride her like Harley Quinn.

Delta4o t1_j62yelk wrote on January 27, 2023 at 9:48 AM

#1,524,767

Reddit would be a library where every other book would be a NSFW question from askreddit

Purplekeyboard t1_j62z8oy wrote on January 27, 2023 at 10:00 AM

#1,524,840

Your AI search engine doesn't seem to work. I tried searching on multiple things from youtube videos, like "I gotta have more cowbell", and it produced results which didn't in any way relate to what I searched on.

HieronymusGoa t1_j62zqq3 wrote on January 27, 2023 at 10:07 AM

#1,524,887

...it'd be a very shitty library ^^ and I love YouTube.

jakubkonecki t1_j630eva wrote on January 27, 2023 at 10:16 AM

#1,524,939

Replying to NovaticFlame (#1,517,975)

How big is your library?
45 M

[deleted] t1_j631ne9 wrote on January 27, 2023 at 10:34 AM

#1,525,055

[removed]

miskathonic t1_j63380h wrote on January 27, 2023 at 10:55 AM

#1,525,202

Replying to jakubkonecki (#1,524,939)

To be fair, that's like 14 stories

miskathonic t1_j633f7k wrote on January 27, 2023 at 10:57 AM

#1,525,222

Replying to Lethlnjektn (#1,519,449)

The Library of Alexandria had maybe 100,000 books worth of scrolls containing ??? written at a time when the smart people thought disease was caused by bad air

There probably was some dope shit, but there's an order of magnitude more educational content on YouTube than burned in the LoA

rose1983 t1_j6376zx wrote on January 27, 2023 at 11:44 AM

#1,525,628

Replying to Putoigituresse (#1,519,594)

Wikipedia is largely a directory with (mostly) very good summaries.

For a lot of Wikipedia articles there are hundreds of books written on the subject, so I can believe that.

walkingmelways t1_j63796c wrote on January 27, 2023 at 11:44 AM

#1,525,635

Replying to actvdecay (#1,517,885)

Well cats are liquid, so it’d be litres.

Remarkable_Coast_214 t1_j637in8 wrote on January 27, 2023 at 11:47 AM

#1,525,661

Replying to Hentai_Yoshi (#1,519,251)

millions of cattle read it per day

rose1983 t1_j637lqz wrote on January 27, 2023 at 11:48 AM

#1,525,670

If YouTube was a library, it should be named Sturgeon’s Library.

3022_Dispatch t1_j637m0j wrote on January 27, 2023 at 11:48 AM

#1,525,673

The next time someone shows me a data table as definitive proof of some ridiculous idea they hold, I’m going to share this post

Demolisher94 t1_j638fuw wrote on January 27, 2023 at 11:58 AM

#1,525,753

If my grandma had wheels, she would be a bicycle!

vtTownie t1_j63ajxn wrote on January 27, 2023 at 12:20 PM

#1,525,995

Replying to Putoigituresse (#1,519,594)

Ya idk how this stuff was sourced (no link so I’m not even gonna bother) but reddit had 303m posts in 2020 and the LOC only has 175m cataloged items

ShutterDeep t1_j63d491 wrote on January 27, 2023 at 12:46 PM

#1,526,332

Replying to walkingmelways (#1,525,635)

Measured in feline ounces

Andulias t1_j63dgrr wrote on January 27, 2023 at 12:49 PM

#1,526,378

And if my grandmother had wheels, she would've been a bike.

Hrooki t1_j63dlnj wrote on January 27, 2023 at 12:50 PM

#1,526,391

Replying to ZeusTheRecluse (#1,523,649)

Library and Archives Canada is our national library and archives! It has all archival government records, a lot of pre-Confederation stuff, census records, and even a database on UFOs. Almost everything is free and open to the public. https://library-archives.canada.ca/eng/Pages/Home.aspx

ezenn t1_j63dwai wrote on January 27, 2023 at 12:53 PM

#1,526,433

I get the feeling that as the number of subscribers in this subreddit grows, the quality of posts is decreasing. What is the quality of the data here, and what does it tell us?

insane9001 t1_j63e1z1 wrote on January 27, 2023 at 12:54 PM

#1,526,458

What is the Y axis? Surely that must be a requirement for posting graphs in this sub

KingNFA t1_j63e9dn wrote on January 27, 2023 at 12:56 PM

#1,526,483

Replying to BlizzardArms (#1,518,168)

About history probably more, about science probably nothing more

KingNFA t1_j63ebts wrote on January 27, 2023 at 12:57 PM

#1,526,494

Replying to Notaprumber (#1,513,715)

99.99% are videos with zero views and just a few seconds

[deleted] t1_j63ec5c wrote on January 27, 2023 at 12:57 PM

#1,526,496

[deleted]

YetiGuy t1_j63hlcl wrote on January 27, 2023 at 1:25 PM

#1,526,925

Replying to DenL4242 (#1,514,563)

Your argument is so far off though. I mean at least a YouTube and a library are comparable.

Let me fix that. If Mt Everest was a popsicle, it’d be the largest popsicle in the world. /s

Ikbeneenpaard t1_j63l4gb wrote on January 27, 2023 at 1:54 PM

#1,527,382

If Reddit were a library, it would be a shitty library.

Ok_Beat_9588 t1_j63pstt wrote on January 27, 2023 at 2:29 PM

#1,528,084

Replying to thisoldmould (#1,522,557)

If your aunt had balls she’d be your uncle, but she doesn’t so she’s not

M3NTAL-313 t1_j63rdhe wrote on January 27, 2023 at 2:40 PM

#1,528,319

Can your AI Search algo index timestamps for stars and sexacts from a library of 100K+ p0rn videos? DM me if so...

anynonus t1_j63rj7d wrote on January 27, 2023 at 2:41 PM

#1,528,347

If the Atlantic Ocean was a bath it'd be the biggest bath in the world

Zenzayy t1_j63v7rs wrote on January 27, 2023 at 3:07 PM

#1,528,927

Nice axis title, dweeb. Why even post this here?

Lethlnjektn t1_j63vrzq wrote on January 27, 2023 at 3:10 PM

#1,529,006

Replying to miskathonic (#1,525,222)

I gave YouTube 9 hours of useful information. I’d say most mechanics, electricians, and similar forms of trade would agree.

EICONTRACT t1_j64jmwj wrote on January 27, 2023 at 5:41 PM

#1,532,321

Doesn’t Google already give you timestamps for your search as long as it’s chaptered?

JoffeJoffer t1_j64moop wrote on January 27, 2023 at 6:00 PM

#1,532,783

Replying to miskathonic (#1,525,222)

Tbf, that would be the case for a significant portion of the British Library as well.

Bad air

[deleted] t1_j64nkb2 wrote on January 27, 2023 at 6:06 PM

#1,532,917

[removed]

BradMH88 t1_j64ohh8 wrote on January 27, 2023 at 6:12 PM

#1,533,061

I feel like we’ve all let Reddit down. Look how small it is. It’s time to increase our Reddit participation. This is just embarrassing. I have to imagine there are more random safes or something to generate mini hysteria.

Plushhorizon t1_j64oxms wrote on January 27, 2023 at 6:14 PM

#1,533,117

Replying to Putoigituresse (#1,519,594)

What about the entire internet?

Chramir t1_j64xwtj wrote on January 27, 2023 at 7:10 PM

#1,534,326

They made an estimate of how many words there are in every YouTube video uploaded. That estimate is calculated as the total runtime of all the videos multiplied by the average word count in a conversation per given time. The total word count is then divided by the number of words in an average book to get a size in 'books'.

I don't know, but that just seems kinda iffy. First youtube videos are rarely a back and forth conversation. And secondly it's like pointing to a skyscraper and saying it's like a big sandcastle because sand is used in concrete.

Edit: grammar and added the 'word count' estimate explanation.

Lyndon91 t1_j650wlw wrote on January 27, 2023 at 7:29 PM

#1,534,772

Don’t get how it makes sense. Is the book equivalent to the video once it’s been transcribed?

CeeMX t1_j6528wz wrote on January 27, 2023 at 7:38 PM

#1,534,961

Replying to [deleted] (#1,512,387)

Most likely it's the count of videos compared to the count of books in the library, which is a weird metric: books contain much more content than a video, and on the other hand the amount of data would put YouTube at rank 1 by far

Lirlya t1_j65g2ta wrote on January 27, 2023 at 9:06 PM

#1,537,105

You're missing a hella lot of libraries in your data source

tomiwa1a t1_j69illm wrote on January 28, 2023 at 6:47 PM

#1,559,027

Replying to Lirlya (#1,537,105)

Which ones are missing?

tomiwa1a t1_j69prfr wrote on January 28, 2023 at 7:35 PM

#1,560,105

Replying to [deleted] (#1,512,387)

Good point, here’s how we got this information.

We calculated the number of hours of video uploaded to YouTube every minute from 2007-2022 (source: Statista)
We found how many words are spoken per hour of human conversation (source: VirtualSpeech)
We calculated the number of words in the average book (source: Jericho Writers)

Then we did some calculations with those numbers to arrive at 99,338,400 books on YouTube

You can see the details of those calculations here: https://docs.google.com/spreadsheets/d/1UbekWhTLJKQj6ZLipg1R269CQ8g0ACDbzPRDFN14inc/edit#gid=52223737
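
The three steps above can be sketched in a few lines of Python. This is only a rough illustration of the arithmetic as described in this comment, not the spreadsheet itself: the function name `estimate_books` is made up here, and every input number below is a placeholder chosen for the example, not a value taken from Statista, VirtualSpeech, or Jericho Writers.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # ignores leap years; fine for a Fermi estimate


def estimate_books(upload_hours_per_minute_by_year, words_per_hour, words_per_book):
    """Fermi estimate of how many 'books' of spoken words were uploaded.

    upload_hours_per_minute_by_year maps a year to the average number of
    hours of video uploaded per minute during that year.
    """
    # Hours of video uploaded across all years.
    total_hours = sum(
        rate * MINUTES_PER_YEAR
        for rate in upload_hours_per_minute_by_year.values()
    )
    # Treat every uploaded hour as an hour of conversation.
    total_words = total_hours * words_per_hour
    # Express the word count in units of an average book.
    return total_words / words_per_book


# Placeholder inputs for illustration only (NOT the spreadsheet's values).
example_rates = {2020: 100.0, 2021: 200.0}
books = estimate_books(example_rates, words_per_hour=9_000, words_per_book=90_000)
print(f"{books:,.0f} book-equivalents")
```

Swapping in the per-year upload rates and the word counts from the linked spreadsheet should reproduce its result, provided the spreadsheet follows the same three steps.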

Edit: I also have a question about the last thing you said > there’s so much more content than that though

What other content is there?

[deleted] t1_j6afh3t wrote on January 28, 2023 at 10:36 PM

#1,564,206

Replying to [deleted] (#1,512,387)

[deleted]

simonezchen OP t1_j6afztf wrote on January 28, 2023 at 10:40 PM

#1,564,292

Replying to worriedshuffle (#1,524,190)

Good point, you're right we should've labelled the y-axis. It's "Number of Books" as we calculated the numbers approximately to that unit in sheets.

worriedshuffle t1_j6aky3k wrote on January 28, 2023 at 11:16 PM

#1,565,157

Replying to simonezchen (#1,564,292)

You mean audio books?

tomiwa1a t1_j6b80iu wrote on January 29, 2023 at 2:15 AM

#1,569,033

Replying to DenL4242 (#1,514,563)

I don't think it's fair to say that comparing Youtube to a Library is like comparing Mt. Everest to a Cow. For one thing, there is actually a pretty clever way to estimate the amount of text on Youtube and compare it to the amount of text in a library.

Maybe, if I explain how we made the graph you'll see that it's more apples to apples than mountains to cows:

We calculated the number of hours of video uploaded to YouTube every minute from 2007-2022 (source: Statista)
We found how many words are spoken per hour of human conversation (source: VirtualSpeech)
We calculated the number of words in the average book (source: Jericho Writers)

Then we did some calculations with those numbers to arrive at 99,338,400 books on YouTube

You can see the details of those calculations here: https://docs.google.com/spreadsheets/d/1UbekWhTLJKQj6ZLipg1R269CQ8g0ACDbzPRDFN14inc/edit#gid=52223737

tomiwa1a t1_j6b8gcr wrote on January 29, 2023 at 2:18 AM

#1,569,108

Replying to NovaticFlame (#1,517,975)

Y Axis is the number of books. You're right though, the Y Axis should definitely have been there.

You can see the details of those calculations here: https://docs.google.com/spreadsheets/d/1UbekWhTLJKQj6ZLipg1R269CQ8g0ACDbzPRDFN14inc/edit#gid=52223737

Context for the Y-Axis

tomiwa1a t1_j6b8wnz wrote on January 29, 2023 at 2:22 AM

#1,569,186

Replying to worriedshuffle (#1,524,190)

Can you please clarify? What do you mean when you say it isn't clear how books on YouTube is calculated?

If you check this range you can see how we arrived at our numbers:

We calculated the number of hours of video uploaded to YouTube every minute from 2007-2022 (source: Statista)
We found how many words are spoken per hour of human conversation (source: VirtualSpeech)
We calculated the number of words in the average book (source: Jericho Writers)

Then we did some calculations with those numbers to arrive at 99,338,400 books on YouTube

tomiwa1a t1_j6ba59m wrote on January 29, 2023 at 2:31 AM

#1,569,381

Replying to ZeusTheRecluse (#1,523,649)

The other interesting piece is that the Library of Congress was founded in 1800 (though a fire caused it to restart its collection in 1815).

YouTube was founded in 2005.

So in just 17 years, YouTube has amassed a collection of information that is 57% the size of the world's largest library, which has been accumulating its collection for over 200 years.

I'm also Canadian. Hadn't heard of it either until we did this report. We probably haven't heard of it because we likely won't need to use any of its resources. Public libraries already do a really good job for most of our day-to-day needs.
Wikipedia's small size makes sense given that contributions are heavily restricted and have such a high bar. Imagine if every YouTube video had to be approved by editors before being uploaded, or every author had to have their books approved by editors before publishing.
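
As a quick sanity check on the 57% figure, here is the division spelled out. The YouTube number is the 99,338,400 books quoted in this thread. The Library of Congress number is an assumption: it uses the roughly 175 million cataloged items another commenter mentions, which may not be the exact figure behind the chart.

```python
YOUTUBE_BOOKS = 99_338_400       # book-equivalents quoted in this thread
LOC_ITEMS_ASSUMED = 175_000_000  # assumption: item count cited by another commenter

# Share of the (assumed) Library of Congress collection size.
ratio = YOUTUBE_BOOKS / LOC_ITEMS_ASSUMED

# Years between YouTube's founding and the end of the data range.
years_of_youtube = 2022 - 2005

print(f"YouTube is about {ratio:.0%} of the assumed Library of Congress size")
print(f"accumulated over {years_of_youtube} years")
```

Under that assumption the ratio rounds to the 57% stated above.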

tomiwa1a t1_j6bagzz wrote on January 29, 2023 at 2:34 AM

#1,569,430

Replying to MurdrWeaponRocketBra (#1,513,052)

Thanks! The transcripts get added on-demand when users request to search for a video. It wouldn't make sense to index the entire database given its large size. We're also able to get the transcripts pretty quickly, so there's no need to pre-cache the transcripts if a user has never asked for it before.

A more detailed overview of how it works can be found here:
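
The on-demand approach described in this comment can be sketched roughly as follows. Everything in this block is hypothetical: the class name `OnDemandTranscriptIndex`, the `fetch_transcript` callback, and the plain substring matching are stand-ins for illustration, not how Atlas is actually built (the real search presumably does something smarter than substring matching).

```python
from typing import Callable, Dict, List


class OnDemandTranscriptIndex:
    """Toy index that stores a transcript only after a user requests the video."""

    def __init__(self, fetch_transcript: Callable[[str], str]) -> None:
        # fetch_transcript is a hypothetical callback: video id -> transcript text.
        self._fetch = fetch_transcript
        self._transcripts: Dict[str, str] = {}

    def add_video(self, video_id: str) -> None:
        """Fetch and store the transcript the first time a video is requested."""
        if video_id not in self._transcripts:
            self._transcripts[video_id] = self._fetch(video_id)

    def search(self, query: str) -> List[str]:
        """Return ids of already-indexed videos whose transcript contains the query."""
        needle = query.lower()
        return [
            video_id
            for video_id, text in self._transcripts.items()
            if needle in text.lower()
        ]


# Usage with a fake transcript source (placeholder data).
fake_transcripts = {"video-1": "I gotta have more cowbell"}
index = OnDemandTranscriptIndex(lambda video_id: fake_transcripts[video_id])

print(index.search("more cowbell"))  # nothing indexed yet, so no matches
index.add_video("video-1")
print(index.search("more cowbell"))  # now the requested video can match
```

The trade-off is that a search can only match videos someone has already submitted, so a query against an empty or sparse index returns nothing relevant.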

tomiwa1a t1_j6baiw7 wrote on January 29, 2023 at 2:34 AM

#1,569,441

Replying to Ruleyoumind (#1,514,786)

Yup, here! https://atlas.atila.ca/

tomiwa1a t1_j6bapiz wrote on January 29, 2023 at 2:36 AM

#1,569,481

Replying to Purplekeyboard (#1,524,840)

That happens because unless someone has previously submitted a YouTube video containing "I gotta have more cowbell", we won't have it in our index.

>The transcripts get added on-demand when users request to search for a video. It wouldn't make sense to index the entire database given its large size. We're also able to get the transcripts pretty quickly, so there's no need to pre-cache the transcripts if a user has never asked for it before. A more detailed overview of how it works can be found here:

See: earlier comment

tomiwa1a t1_j6bb8e8 wrote on January 29, 2023 at 2:40 AM

#1,569,593

Replying to Chramir (#1,534,326)

Exactly! This is how it works.

I agree it's not perfect, but remember, YouTube itself is not a library, so any comparison to real libraries will require some degree of approximation. You can think of it as a rough estimate or, my preferred term, a Fermi estimate.

tomiwa1a t1_j6bb9zk wrote on January 29, 2023 at 2:40 AM

#1,569,604

Replying to Thenerdy9 (#1,514,090)

You can try it here: https://atlas.atila.ca/

tomiwa1a t1_j6bbdzn wrote on January 29, 2023 at 2:41 AM

#1,569,627

Replying to EICONTRACT (#1,532,321)

Watch the demo. Youtube doesn't give matches this precise.

tomiwa1a t1_j6bbj7o wrote on January 29, 2023 at 2:42 AM

#1,569,652

Replying to insane9001 (#1,526,458)

The Y Axis is number of books. I agree with you though, that was an oversight on our part. I also don't like when graphs don't have a labelled Y-Axis. Next time we'll add them.

worriedshuffle t1_j6bmjs3 wrote on January 29, 2023 at 4:12 AM

#1,571,473

Replying to tomiwa1a (#1,569,186)

Phenomenal calculation. You assume every minute of YouTube contains nonstop speech at the average word rate. Obviously this is false.

Second, in comparing quantity of speech you say nothing about quality. Libraries don’t contain every single book in existence. Most books are trash. YouTube does contain tons of trash.
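
The first objection can be made concrete by scaling the headline number by the share of runtime that actually contains speech. This is a hedged sketch only: `adjusted_books` is a made-up helper, and the speech fractions below are arbitrary illustrative values, not measurements of YouTube content. The only number taken from the thread is the 99,338,400 estimate.

```python
def adjusted_books(unadjusted_books: float, speech_fraction: float) -> float:
    """Scale a book estimate by the fraction of runtime that contains speech."""
    if not 0.0 <= speech_fraction <= 1.0:
        raise ValueError("speech_fraction must be between 0 and 1")
    return unadjusted_books * speech_fraction


HEADLINE_BOOKS = 99_338_400  # estimate quoted in this thread (assumes nonstop speech)

# Hypothetical speech fractions, chosen only to show the sensitivity.
for fraction in (1.0, 0.5, 0.25):
    books = adjusted_books(HEADLINE_BOOKS, fraction)
    print(f"speech in {fraction:.0%} of runtime -> {books:,.0f} books")
```

The estimate is linear in this fraction, so whatever the true share of speech is, the headline number shrinks by exactly that factor.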

[deleted] t1_j6bsymh wrote on January 29, 2023 at 5:09 AM

#1,572,454

[removed]

Thenerdy9 t1_j6nqpk2 wrote on January 31, 2023 at 5:09 PM

#1,668,583

Replying to tomiwa1a (#1,569,604)

didn't work for me :/
