Saving/downloading the internet (in theory and practice)?

Hey, I realize that you can NEVER save the entire internet, simply because it changes from moment to moment and much of the data isn't freely accessible (e.g., password-protected areas).

However, I'm still wondering whether it's possible to save and display the "freely accessible" internet at a size of just a few GB. I'm asking because it's already possible to install a tool such as GPT4ALL that holds a lot of knowledge but is only a few GB in size.

I think we should limit the theory to just websites for now, so it stays easier to understand and avoids a lot of problems for the time being. At least, that's my view.

Thanks in advance, and please excuse my dyslexia.

16 Answers
FaTech
1 year ago

Birds – Wikipedia: take a simple website, the Wikipedia article on birds, and look at the HTML code only. No CSS, no JS.

According to a UTF-8 string length & byte counter (mothereff.in), we are at 202,490 characters and 205,803 bytes. And that's just one page. But Wikipedia has millions of them. If we add other hosts, the numbers become absurd. It simply gets too big, and that's only the HTML part.
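A minimal sketch of how one could reproduce that measurement (the URL, the requests library, and the article count are my assumptions, not figures from this answer): fetch the page's HTML, count its UTF-8 bytes, then extrapolate.

```python
# Minimal sketch: measure the raw HTML size of one Wikipedia page and
# extrapolate. The URL and the article count are illustrative assumptions.
import requests

URL = "https://en.wikipedia.org/wiki/Bird"   # assumed example page

html = requests.get(URL, timeout=30).text    # HTML only, no CSS/JS/images
chars = len(html)                            # character count
size_bytes = len(html.encode("utf-8"))       # UTF-8 byte count

print(f"{chars:,} characters, {size_bytes:,} bytes for one page")

# Back-of-envelope: ~200 KB of HTML per page times ~6,000,000 articles
# is already on the order of a terabyte, before CSS, JS, images, or any
# site other than Wikipedia.
articles = 6_000_000
print(f"~{size_bytes * articles / 1e12:.1f} TB of raw HTML for Wikipedia alone")
```

Even for Wikipedia alone, that is a couple of orders of magnitude more than "a few GB".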

Ginpanse
1 year ago

No, it's not even possible for lack of space. It would take forever. AIs like ChatGPT only use APIs to existing search engines; the offline versions know a lot, but that is not even 1% of the data on the internet.

Ginpanse
1 year ago
Reply to  Benny354912

negative.

Destranix
1 year ago

Every page on the internet is already stored on its respective server. However, a private user alone does not have the means to store all of it again redundantly.

Destranix
1 year ago
Reply to  Benny354912

Even then, no. The amounts involved are unimaginable. YouTube, Wikipedia, the web archive, etc. are immensely large.

Destranix
1 year ago

No. The information remains the same, and you can’t compress it any more.
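A minimal sketch of the underlying point (my own illustration using Python's zlib, not something Destranix wrote): lossless compression only removes redundancy, and data with no redundancy left cannot be made any smaller.

```python
# Minimal sketch (illustrative): lossless compression strips redundancy.
# Random bytes simulate data whose redundancy is already gone; they do not
# get smaller when compressed.
import os
import zlib

high_entropy = os.urandom(1_000_000)           # ~1 MB with no redundancy
packed = zlib.compress(high_entropy, level=9)

print(f"original:   {len(high_entropy):,} bytes")
print(f"compressed: {len(packed):,} bytes")    # not smaller, slight overhead

# Real HTML does compress, but typically only by a single-digit factor,
# which does not bring terabytes anywhere near "a few GB".
```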

nichtsagender
1 year ago

Around 70% of the internet is the deep web, i.e. (server) data that is not accessible anyway.

There are already websites for this. It's called the Wayback Machine.

nichtsagender
1 year ago
Reply to  Benny354912

You cannot compress the sites down to smaller sizes; if something is 6 GB, it is 6 GB. You can index them, but then you have Google.

And AIs were fed with terabytes, if not petabytes, of data.
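To illustrate the difference between indexing and storing (a minimal sketch of my own, not something from this answer): a search index only records which words appear on which pages, not the pages themselves, which is why it can be far smaller than the content it points to.

```python
# Minimal sketch (illustrative): a tiny inverted index maps words to the
# pages that contain them. It finds pages without storing their full
# content, which is the basic idea behind a search engine.
from collections import defaultdict

pages = {  # hypothetical page snippets, just for the demo
    "wiki/Bird": "birds are warm blooded vertebrates with feathers",
    "wiki/Bat": "bats are flying mammals not birds",
    "wiki/Feather": "feathers cover the bodies of birds",
}

index: dict[str, set[str]] = defaultdict(set)
for url, text in pages.items():
    for word in text.lower().split():
        index[word].add(url)

# Query: which pages mention both "birds" and "feathers"?
hits = index["birds"] & index["feathers"]
print(sorted(hits))  # -> ['wiki/Bird', 'wiki/Feather'] (URLs, not page content)
```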

nichtsagender
1 year ago

That's called Google.