Neocities

anoutsider

5 months ago

So I happened to be thinking about AI mass-scraping the internet to feed hungry LLMs, and I suddenly wondered... is my Neocities page available to those monsters? After a little reading online and a test search on DuckDuckGo, I discovered that (a) yes, it probably is available to AI, and (b) it is also available to search engines. Supposedly this code prevents it:

anoutsider 5 months ago

Well, it didn't let me post the code. Here it is: <meta name="robots" content="noindex, nofollow">

anoutsider 5 months ago

Some great info on the code is available here: https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Elements/meta/name/robots

anoutsider 5 months ago

Ended up using noindex, nofollow, noarchive, nocache, noimageindex, nosnippet.
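
For anyone who wants to see where the tag goes, here is a minimal sketch, assuming a plain index.html. The title and body text are placeholders, not taken from the actual site. The tag belongs inside the head element, and it is only a request: well-behaved crawlers are expected to honor it, but nothing forces a scraper to.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <!-- Placeholder title, for illustration only -->
    <title>My Neocities page</title>
    <!-- Asks compliant crawlers not to index, follow links, archive, or show snippets -->
    <meta name="robots" content="noindex, nofollow, noarchive, nocache, noimageindex, nosnippet">
  </head>
  <body>
    <p>Placeholder content.</p>
  </body>
</html>
```

The tag is set per page, so each HTML file that should stay out of search results would need its own copy.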

anoutsider 5 months ago

This assumes someone hasn't done something as evil as this scumbag, who publicly recommends feeding someone's raw HTML into a chatbot: https://medium.com/@datajournal/automate-web-scraping-with-chatgpt-685667e31f24

anoutsider 5 months ago

Refreshing article on the immorality of what AI is doing: https://epic.org/scraping-for-me-not-for-thee-large-language-models-web-data-and-privacy-problematic-paradigms

Website Stats

Last updated 1 week ago

Created May 23, 2025

The web site of an outsider
