So I happened to be thinking about AI mass-scraping the internet to feed hungry LLMs, and I suddenly wondered... is my Neocities page available to those monsters? After a little reading online and a test search on DuckDuckGo, I discovered that (a) yes, it probably is available to AI and (b) it is also available to search engines. Supposedly this code prevents it:
Well, it didn't let me post the code, so here it is with extra spaces: < meta name = " robots " content = " noindex , nofollow " > (remove the spaces).
Some great info on the code is available here: https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Elements/meta/name/robots
I ended up using noindex, nofollow, noarchive, nocache, noimageindex, nosnippet.
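If you want to double-check that the tag actually made it into your page, here is a rough sketch in Python using only the standard library. The names RobotsMetaFinder, robots_directives and sample_page are just my own placeholders, and the sample page is made up; you would paste in your own saved page source instead. It only reads the tag back, it does not prove any crawler will obey it.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute names arrive lowercased; values keep their original case.
        if (attr.get("name") or "").strip().lower() != "robots":
            return
        content = attr.get("content") or ""
        for part in content.split(","):
            part = part.strip().lower()
            if part and part not in self.directives:
                self.directives.append(part)


def robots_directives(html_text):
    """Return the robots directives found in html_text, in order."""
    finder = RobotsMetaFinder()
    finder.feed(html_text)
    finder.close()
    return finder.directives


# Hypothetical page source (assumption); swap in your own index.html text.
sample_page = """<!DOCTYPE html>
<html>
<head>
<title>My page</title>
<meta name="robots" content="noindex, nofollow, noarchive, nocache, noimageindex, nosnippet">
</head>
<body>Hello</body>
</html>"""

print(robots_directives(sample_page))
```

An empty list back means no robots meta tag was found in the text you fed it.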
These tags only ask crawlers to stay away, so this assumes someone hasn't done something as evil as this scumbag, who publicly recommends feeding someone's raw HTML into a chatbot: https://medium.com/@datajournal/automate-web-scraping-with-chatgpt-685667e31f24
Refreshing article on the immorality of what AI is doing: https://epic.org/scraping-for-me-not-for-thee-large-language-models-web-data-and-privacy-problematic-paradigms