@Lee

Lee@retrolemmy.com · 2 days ago

This is basically what Nintendo did in one of their schemes to prevent unauthorized software (on the Famicom Disk System, which was a floppy disk drive for the Japanese version of the NES). The Nintendo logo was physically embossed onto the floppy disk, so a flat disk can’t be physically loaded (sort of, you can add extra cutouts). Other game systems required a logo or similar brand/trademark/IP to be present in the game code in order to boot, so if you wanted to make your own game without Nintendo’s blessing, you had to include their IP on your physical disk or in the game code just to get it to boot. This BMW patent seems to be in the spirit of those hardware and software protections that prevent people from doing what they want with the hardware (car) they bought.

Lee@retrolemmy.com · 5 days ago

Slackware was my first and I didn’t know that package managers existed (or maybe they didn’t at the time) to resolve dependencies and even if they did, they probably lagged on versions. I learned true dependency hell when trying to build my own apache, sendmail, etc from source while missing a ton of dependency libraries (or I needed newer versions) and then keeping things relatively up to date. Masochistic? Definitely for me, but idk how much of that was self inflicted by not using the package tool. Amazing learning at the time. This would have been mainly Slackware 3.x and 4.x. I switched to Debian (not arch BTW).

Lee@retrolemmy.com · 5 days ago

How would it be too late? To develop a huge following? Idk, but if you just want to stream for the hell of it, I don’t see how that matters. I’ve not gamed much the last few years, but I started again recently, upgraded my computer, and my ISP bumped my upload speed (finally), so I can stream without it impacting my gameplay.

I turn it on if I remember, but since I’m streaming just because why not (maybe I’ll find someone new to game with or maybe someone will be amused by my shitty skills), I don’t do it regularly and have no regular followers, as such, I forget to check the chat and have often had people join and type and then leave, presumably because I ignored them (or I’m just not worth watching).

OK a lot of rambling, I guess the summary is, stream because you want to, not because you want a following/make money and then it’s definitely not too late, but also don’t ignore the people who join your stream.

Lee@retrolemmy.com · 13 days ago

I agree unless the backend server is including it in the response/response headers for some reason, which wouldn’t make a tool like this work in the general case. I thought maybe there was a Cloudflare API that would inadvertently leak the origin IP in an error response in some special case or something of that nature, but I’d assume they would have patched that rather quickly. I’m very curious if this tool ever worked and if so, how.

If you had a single specific host you were trying to find the origin server for, you could basically scan their ASN and well-known data center IPs (particularly those of the big cloud providers) by sending requests to them with the desired Host header to try to find an entry point (load balancer, reverse proxy, web server), but I don’t think that’s practical, particularly with a free API that (presumably) responded in a reasonable amount of time. The underlying API used by the linked script is no longer available, so I don’t know whether it worked or what its response times were.

Furthermore, a well-configured system should ignore requests not originating from Cloudflare’s IPs (or use a tunnel) to prevent bypassing Cloudflare, although I’ve seen plenty that don’t do this. Cloudflare even publishes the subnets you should allow. It’s easy to integrate that into a cron-type job, Terraform, or some other way to keep rules updated, even though they’ve very rarely changed.
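To make the allowlist idea concrete, here is a minimal sketch in Python of checking a client IP against a set of subnets. The function names are hypothetical, and the three ranges are only a hard-coded sample for illustration. A real setup would load the current list that Cloudflare publishes (at `https://www.cloudflare.com/ips-v4` and `https://www.cloudflare.com/ips-v6`, one CIDR per line) and would usually enforce this at the firewall or reverse proxy rather than in application code.

```python
import ipaddress

# Illustrative sample only: in practice, load the full, current list
# that Cloudflare publishes instead of trusting a hard-coded snapshot.
SAMPLE_CLOUDFLARE_RANGES = [
    "173.245.48.0/20",
    "103.21.244.0/22",
    "104.16.0.0/13",
]


def parse_ranges(lines):
    """Turn text lines (one CIDR per line) into network objects, skipping blanks."""
    return [ipaddress.ip_network(line.strip()) for line in lines if line.strip()]


def is_allowed(client_ip, networks):
    """Return True if client_ip falls inside any of the allowed networks."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in networks)


if __name__ == "__main__":
    nets = parse_ranges(SAMPLE_CLOUDFLARE_RANGES)
    for ip in ("104.16.1.1", "8.8.8.8"):
        print(ip, "allowed" if is_allowed(ip, nets) else "rejected")
```

A cron-type job could re-download the published list, run it through `parse_ranges`, and regenerate the firewall rules only when the result changes.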

Lee@retrolemmy.com · 14 days ago

I was curious as to how it’s done, but unfortunately that repo won’t answer that. All it’s doing is calling a separate HTTP API that returns the IP. I looked quickly and didn’t find a repo for that other API.

Lee@retrolemmy.com · 17 days ago

A ton of companies have ESOPs, but that doesn’t stop enshittification because the employees generally don’t own enough shares to exert control.

Lee@retrolemmy.com · 20 days ago

I think you would make a good friend too

Lee@retrolemmy.com · 20 days ago

Sounds like they’re a good friend. I don’t mean the murderer, but the one worthy of being trusted with such info.

EDIT: I recently cut my hair.

Lee@retrolemmy.com · 25 days ago

It could be, but they seem to get through Cloudflare’s JS. I don’t know if that’s because Cloudflare is failing to flag them for JS verification or if they specifically implement support for Cloudflare’s JS verification since it’s so prevalent. I think it’s probably due to an effective CPU time budget. For example, Google Bot (for search indexing) runs JS for a few seconds and then snapshots the page and indexes it in that snapshot state, so if your JS doesn’t load and run fast enough, you can get broken pages / missing data indexed. At least that’s how it used to work. Anyway, it could be that rather than a time cap, the crawlers have a CPU time cap and Anubis exceeds it whereas Cloudflare’s JS doesn’t – if they did use a cap, they probably set it high enough to bypass Cloudflare given Cloudflare’s popularity.

Lee@retrolemmy.com · 25 days ago

Is there a particular piece? I’ll comment on what I think are the key points from his article:

1) Wasted energy.
2) It interferes with legitimate human visitors in certain situations. A simple example would be wanting to download a bash script via curl/wget from a repo that’s using Anubis.

3A) It doesn’t strictly meet the requirement of a CAPTCHA (which should be something a human can do easily, but a computer cannot) and the theoretical solution to blocking bots is a CAPTCHA.

and very related

3B) It is actually not that computationally intensive and there’s no reason a bot couldn’t do it.

Maybe there were more, but those are my main takeaways from the article and they’re all legit. The design of Anubis is in many respects awful. It burns energy, breaks (some) functionality for legitimate users, unnecessarily challenges everyone, and probably the worst of it, it is trivial for the implementer of a crawling system to defeat.

I’ll cover wasted energy quickly – I suspect Anubis wastes less electricity than the site would waste servicing bot requests, granted this is site specific as it depends on the resources required to service a request and the rate of bot requests vs legitimate user requests. Still it’s a legitimate criticism.

So why does it work and why am I a fan? It works simply because crawlers haven’t implemented support to break it. It would be quite easy to do so. I’m actually shocked that Anubis isn’t completely ineffective already. I had actually been holding off on testing it because I assumed that sites would adopt it rather quickly and, given how simply it can be defeated, that it would be defeated and therefore useless.

I’m quite surprised for a few reasons that it hasn’t been rendered ineffective, but perhaps the crawler operators have decided that it doesn’t make economic sense. I mean if you’re losing say 0.01% (I have no idea) of web content, does that matter for your LLMs? Probably if it was concentrated in niche topic domains where a large amount of that niche content was inaccessible, then they would care, but I suspect that’s not the case. Anyway while defeating Anubis is trivial, it’s not without a (small) cost and even if it is small, it simply might not be worth it.

I think there may also be a legal element. At a certain point, I don’t see how these crawlers aren’t in violation of various laws related to computer access. What I mean is, these crawlers are in fact accessing computer systems without authorization. Granted, you can take the point of view that the act of connecting a computer to the internet implies consent, but that’s not the way the laws are, at least in the countries I’m familiar with. Things like robots.txt can sort of be used to inform what is/isn’t allowed to be accessed, but it’s a separate request and mostly used to help with search engine indexing, not all sites use it, etc. Something like Anubis is very clear and in your face, and I think it would be difficult to claim that a crawler operator specifically bypassed Anubis in a way that was not also unauthorized access.

I’ve dealt with crawlers as part of devops tasks for years, and years ago it was almost trivial to block bots with a few heuristics that would need to be updated from time to time or temporarily added. This has become quite difficult and not really practical for people running small sites and probably even for a lot of open source projects that are short on people. Cloudflare is great, but I assure you, it doesn’t stop everything. Even in commercial environments years ago we used Cloudflare enterprise and it absolutely blocked some, but we’d get tons of bot traffic that wasn’t being blocked by Cloudflare. So what do you do if you run a non-profit, FOSS project, or some personal niche site that doesn’t have the money or volunteer time to deal with bots as they come up, and those bots are using legitimate user-agents coming from thousands of random IPs (including residential!)? It used to be that you could block some data center ASNs in a particular country until it stopped.

I guess the summary is, bot blocking could be done substantially better than what Anubis does and with less downside for legitimate users, but it works (for now), so maybe we should only concern ourselves with the user-hostile aspect of it at this time: preventing legitimate users from doing legitimate things. With existing tools, I don’t know how else someone running a small site can deal with this easily, cheaply, without introducing things like account sign-ups, and without violating people’s privacy. I have some ideas related to this that could offer some big improvements, but I have a lot of other projects I’m bouncing between.

Lee@retrolemmy.com · 28 days ago

A friend (works in IT, but asks me about server related things) of a friend (not in tech at all) has an incredibly low traffic niche forum. It was running really slow (on shared hosting) due to bots. The forum software counts unique visitors per 15 mins and it was about 15k/15 mins for over a week. I told him to add Cloudflare. It dropped to about 6k/15 mins. We experimented with turning Cloudflare off/on and it was pretty consistent. So then I put Anubis on a server I have and they pointed the domain to my server. Traffic dropped to less than 10/15 mins. I’ve been experimenting with toggling on/off Anubis/Cloudflare for a couple months now with this forum. I have no idea how the bots haven’t scraped all of the content by now.

TLDR: in my single isolated test, Cloudflare blocks 60% of crawlers. Anubis blocks presumably all of them.
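For anyone checking the arithmetic behind those percentages, here is a small Python sketch using the rough figures above. The helper name is made up for the example, and the “10” is the upper bound of “less than 10”, so the Anubis number is a lower bound on the reduction in counted visitors.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage drop in unique visitors per 15-minute window."""
    return 100 * (before - after) / before


if __name__ == "__main__":
    baseline = 15_000          # roughly, with no protection
    with_cloudflare = 6_000    # roughly, with Cloudflare enabled
    with_anubis = 10           # "less than 10", so treated as an upper bound

    print(f"Cloudflare: {percent_reduction(baseline, with_cloudflare):.1f}%")
    print(f"Anubis: at least {percent_reduction(baseline, with_anubis):.2f}%")
```

This treats the whole baseline as bot traffic, which slightly overstates things since a handful of those visitors are real people.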

Also if anyone active on Lemmy runs a low traffic personal site and doesn’t know how or can’t run Anubis (eg shared hosting), I have plenty of excess resources I can run Anubis for you off one of my servers (in a data center) at no charge (probably should have some language about it not being perpetual, I have the right to terminate without cause for any reason and without notice, no SLA, etc). Be aware that it does mean HTTPS is terminated at my Anubis instance, so I could log/monitor your traffic if I wanted as well, so that’s a risk you should be aware of.

Lee@retrolemmy.com · 29 days ago

Since you mentioned an upscaler, I’m assuming you got an old digital (LCD/Plasma/LED) TV that still had a few analog input types (my last couple TVs were lacking on analog inputs). A retro console upscaler probably has better results than your TV, but you can still use an analog switch box before the upscaler. Rather than spend a lot on multiple retro upscalers, spend much less on 1 upscaler and quality analog switch box(es).

If the old Sony TV is a CRT, the answer is still analog switch boxes, but without an upscaler.

Most analog switch boxes can be used for analog audio, most will also be fine for non-optical digital audio. For optical, there are toslink switch boxes, but an audio receiver with multiple optical inputs is what I have.

EDIT: HDMI mods if they are taking the raw digital output rather than just being internal upscalers are an option, but depending on how authentic you want to be, the analog output circuits also affect the output and so an HDMI mod that bypasses the analog output would lose that.

Lee@retrolemmy.com · 1 month ago

Mead can be made with various spices including tea. There are specific names for these different variations. I don’t know if that’s why OP’s mead is that color. It could just be the honey.

I did beekeeping for a few years and the honey harvested at the same time from 2 adjacent hives can look very different in color, but even more so based on the time of year the bees made the honey due to the different plants available. I’ve had honey that was very light in color and some that looked like Guinness when I put it in jars.

Mass-produced honey will just blend honey from hundreds or thousands of hives and even from multiple beekeepers. You get more of an average, which I suppose is better for consistency/predictability in flavor, which would be important for some types of cooking. The flavor varies due to the different plants the bees collected from, just like the color.

Lee@retrolemmy.com · 2 months ago

I also didn’t want to try typing the name, search suggestion helped me. I played a couple of his games and they weren’t particularly difficult, so I assume it’s referring to his later games: https://en.wikipedia.org/wiki/Tomonobu_Itagaki

Lee@retrolemmy.com · 2 months ago

This may seem pedantic, but mp4 is a container that holds the video and audio streams. The actual video stream can be encoded in various formats (MPEG-2, H.264, H.265, etc). If you open VLC, look at the codec menu, find the video stream, and report back the encoding type, it may provide some insight. It could be that there’s a performance regression with a particular decoder, or maybe they changed the decoding library, or any number of things. Sorry it’s a bit vague, but what I’m getting at is that if we know the actual video encoding of the file, it may help to track down the decoding performance issue.
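If VLC’s codec view is awkward to get at on the Pi, another way to get the same information is `ffprobe` (part of FFmpeg). That is a swapped-in tool, not something the original setup necessarily has installed. Here is a minimal Python sketch that asks `ffprobe` for each stream’s type and codec and parses its JSON output. The function names and the file path are placeholders.

```python
import json
import subprocess


def ffprobe_command(path):
    """Build an ffprobe invocation that prints each stream's type and codec as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-show_entries", "stream=codec_type,codec_name",
        "-of", "json",
        path,
    ]


def parse_codecs(ffprobe_json):
    """Extract (codec_type, codec_name) pairs from ffprobe's JSON output."""
    data = json.loads(ffprobe_json)
    return [(s.get("codec_type"), s.get("codec_name")) for s in data.get("streams", [])]


def probe(path):
    """Run ffprobe on a file (requires ffprobe to be installed and on PATH)."""
    result = subprocess.run(ffprobe_command(path), capture_output=True, text=True, check=True)
    return parse_codecs(result.stdout)


if __name__ == "__main__":
    # "video.mp4" is a placeholder; point it at one of the slow files.
    for kind, codec in probe("video.mp4"):
        print(kind, codec)
```

The interesting part for this thread is the `video` entry: a codec name such as `mpeg2video`, `h264`, or `hevc` would narrow down which decoder path to look at.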

If it does turn out to be mpeg2, it could be that something changed about how the video decoding drivers (kernel module) are loaded. Like maybe they stopped including them by default or are no longer being used for some reason.

If it’s not MPEG-2, then you could look into decoder-specific changes between distro versions or hardware support related changes (like maybe a kernel module needs an extra config passed to it to get better performance on the 3B), or it may even be possible to tweak the decoder library config. Sometimes performance optimizations make things worse, and the new default configs work better on newer hardware but worse for you.

In any case, I think knowing the specific video encodings would be helpful. I also just remembered that I had some performance issues on some files due to audio formats if I was having the Pi software decode vs connected to an external AV receiver that could decode the bit streamed audio data.

Lee@retrolemmy.com · 2 months ago

What encoding are the files? It sounds like this is an old setup and maybe old files. Some Raspberry Pis (and I’m pretty sure the 3B was one, as I had one) did have support for hardware decoding MPEG-2 (maybe others, I don’t remember), BUT this required purchasing a license for it. I never did, so idk what form the license came in. If it was a file on your SD and you don’t have it on your new installs, that’s my bet. Either that or newer software is more bloated or otherwise performs worse, making the experience overall worse. Sometimes on old hardware, older software is the better choice (ignoring security of course).

Lee@retrolemmy.com · 2 months ago

Depending on your comfort level, you may want to do what I’m in the process of doing. I’m still waiting on parts, but this will work for my heating system.

I have old 2 wire thermostats in a few places I want to replace. I have hot water baseboard heat with multiple heating zones. I couldn’t find an existing solution that worked the way I wanted and was reasonably priced, so I decided to make my own. This only works for single stage systems in which exhaust fans, circulation pumps, or other components are controlled by the heating system generally and not by a single specific thermostat, which, if you have those old mechanical 2 wire thermostats, is almost certainly the case. You could do something more sophisticated, but I don’t need to.

All I need is a relay (controlled by HA) to simulate the thermostat turning on/off. I also need some way to tell it when to turn it on/off (such as a temp sensor), again lots of options with HA.
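The on/off decision itself is simple. Here is a sketch of it in Python, only to show the logic. It is not ESPHome or Home Assistant configuration, and the function name, the hysteresis band, and the example temperatures are assumptions for illustration. The band keeps the relay from chattering when the room temperature hovers right at the setpoint.

```python
def next_relay_state(current_temp, setpoint, relay_on, hysteresis=0.5):
    """Decide whether the zone relay should be closed (calling for heat).

    Turns on below (setpoint - hysteresis), off above (setpoint + hysteresis),
    and otherwise keeps the previous state.
    """
    if current_temp <= setpoint - hysteresis:
        return True
    if current_temp >= setpoint + hysteresis:
        return False
    return relay_on


if __name__ == "__main__":
    relay = False
    # Hypothetical sensor readings in degrees C, with a 20.0 setpoint.
    for reading in (19.0, 19.8, 20.3, 20.6, 20.2, 19.4):
        relay = next_relay_state(reading, 20.0, relay)
        print(f"{reading:.1f} -> relay {'ON' if relay else 'OFF'}")
```

In an actual HA setup the same behavior would more likely come from an existing thermostat/climate component driving the relay switch, with the sensor and relay supplied by the ESPHome devices.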

This can be done in a variety of ways, but I’m using nodemcu boards (they have wifi onboard) and esphome firmware. I’ve used this combination for a number of HA integrations so far. Near my boiler where all of the old thermostats connect will be a nodemcu board with multiple independently controlled relays (for each thermostat to control the individual heating zones).

The 2 wires that go to my old thermostats will be power supply for separate nodemcu boards, which will be in a 3d printed case along with buttons, display, and (in one room) will also include a temp/humidity sensor since I don’t already have one there. The other locations already have more sophisticated air quality sensors that include temp/humidity, so no need to duplicate, although maybe I will for redundancy.

Lee@retrolemmy.com · 3 months ago

It is. I remember reading it in a guide (pretty sure the one this screen shot is from as it looks very familiar). I was able to do it a couple times, but it required enough precision / luck that it wasn’t worth doing IMO.