Question for all folks who self-host their websites: do you have a robots.txt, and if so, what's advisable to put in it?
#HomeLab #SelfHosted
@justine I wrote a script and pipeline to keep my static website's robots.txt up to date with all the AI madness. Not that I trust any of them to adhere to it, but here's the resource I'm using for that.
@mforester Thanks so much.
@justine here's the script I use. It's just two lines of Bash, not very pretty or sophisticated.
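This isn't mforester's actual script (it wasn't attached to the post), but a minimal sketch of one way such a pipeline could regenerate robots.txt; the agent names are illustrative examples, and a real setup would pull them from a maintained blocklist instead:

```shell
#!/bin/sh
# Sketch: regenerate robots.txt with a Disallow rule for known AI crawlers.
# The agent list below is an example; substitute whichever list you maintain.
agents="GPTBot CCBot Google-Extended anthropic-ai"
{
  for ua in $agents; do
    printf 'User-agent: %s\n' "$ua"
  done
  printf 'Disallow: /\n'
} > robots.txt
```

Running this from CI on a schedule keeps the file current without manual edits.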
@justine @mforester I admit I just serve them a 403 from nginx directly ...
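A minimal sketch of what such an nginx rule could look like, assuming you match on the User-Agent header (the bot names in the regex are illustrative, not drizzy's actual config):

```nginx
# In the http context: flag requests whose User-Agent matches known AI crawlers.
# The pattern is an example; extend it with whichever bots you want to block.
map $http_user_agent $ai_bot {
    default 0;
    ~*(GPTBot|CCBot|Google-Extended|anthropic-ai) 1;
}

server {
    listen 80;
    server_name example.com;

    # Reject flagged crawlers outright instead of relying on robots.txt.
    if ($ai_bot) {
        return 403;
    }

    # ... rest of the site configuration ...
}
```

Unlike robots.txt, which crawlers may simply ignore, this refuses the request at the server.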
@drizzy @justine yeah. Sadly, I'd need to move my blog off GitHub Pages to do something like that. As it is, I can't mess with the requests depending on user agent. Believe me, I would.
Other than my blog, I've already left GitHub, but the pipeline I have in place is just so very convenient and I don't want to mess with it right now.
@mforester @justine I know that @heiglandreas implemented something along the same lines, but with more "fun" for the AI.
Can't remember his GitHub repo's link; maybe he can add it here?
@mforester @justine @viticci this might be of interest to you