This post knows where you're viewing it from (Lemmy doesn't proxy external images) [ARCHIVED]

TriLinder@lemmy.ml · edited 1 year ago

This is possible because Lemmy doesn’t proxy external images but instead loads them directly. While not all that bad, this could be used for spy pixels by nefarious posters and commenters.

Note that the only thing I willingly log is the “hit count” visible in the image, and I have no intention to misuse the data.

targetx@programming.dev · 1 year ago

Nice example!

I think proxying everything through lemmy would have a pretty big bandwidth/scalability impact. I expect the lemmy clients don’t send any unique user info on these image requests so not sure how useful it would be as a spy pixel? Maybe I’m missing something :-)

Goddard Guryon@sopuli.xyz · edited 1 year ago

It would be interesting to see just how much info is shared when lemmy requests the image. If there is [potentially] sensitive info being shared, the devs might be interested in working on it too (I have no idea how to check such a thing, this comment is just so I can find the post later when more people have shared their wisdom on it)

Muddybulldog@mylemmy.win · edited 1 year ago

None (by Lemmy), as Lemmy doesn’t actually request the image (that would be proxying). Your browser requests the image directly by URL. Lemmy, technically, doesn’t even know an image exists. It just provides the HTML and lets your browser do the work.

A_A@lemmy.world · edited 1 year ago

Exactly. The text of this post is simply:

![An external image showing your user-agent and the total "hit count"](https://trilinder.pythonanywhere.com/image.jpg)

I get the same result when I browse directly to the link.

So, if OP links a malicious website we have a problem … (?).

Goddard Guryon@sopuli.xyz · 1 year ago

Oh dangit, it’s simpler than I thought. So the only data being sent is…just whatever is sent in your average GET request.

newIdentity@sh.itjust.works · 1 year ago

Yes. It’s also a pretty standard way of serving images. A lot of email clients do that too.

That’s also how these services that show you when an email is read work.

newIdentity@sh.itjust.works · edited 1 year ago

Not really that huge of a problem. When making requests you also usually send a header which includes the user agent.

The program just logs how many times the image has been requested and it reads the user agent data. No Javascript is actually executed.

Well it might be possible to have an XSS somehow but I haven’t really done much research into this possibility.

In general it’s a pretty standard way of handling embedded images. Email does this too. That’s how you have these services that can check if someone read a mail.

CoderKat@lemm.ee · 1 year ago

Yup. And to add, your browser will send things like:

- Your IP address. Technically this is sent by the OS doing networking and is unavoidable. At best, a VPN can hide this, because the VPN sits in the middle.
- Various basic request headers, which most notably contain user agent (identifies browser) and language headers, both of which you can fake if you want to.
- Cookies for that domain (if you have any). Those can track you across multiple requests and thus build up a profile of you.
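
For the curious, here is a rough Python sketch of what the image host can read from one plain GET request. This is not OP's code: `describe_request` is a made-up helper, and it assumes the server framework hands you the client IP and a dict of headers with the usual capitalisation.

```python
from http.cookies import SimpleCookie


def describe_request(client_ip, headers):
    """Collect what an image host can read from a single GET request.

    `headers` is assumed to be a plain dict keyed by the usual header
    names ("User-Agent", "Accept-Language", "Cookie").
    """
    # Parse the raw Cookie header into name/value pairs.
    cookies = SimpleCookie()
    cookies.load(headers.get("Cookie", ""))

    # Accept-Language looks like "en-US,en;q=0.8"; drop the q-weights.
    languages = [
        part.split(";")[0].strip()
        for part in headers.get("Accept-Language", "").split(",")
        if part.strip()
    ]

    return {
        "ip": client_ip,  # comes from the connection itself, not a header
        "user_agent": headers.get("User-Agent", "unknown"),
        "languages": languages,
        "cookies": {name: morsel.value for name, morsel in cookies.items()},
    }
```

Nothing here needs JavaScript or any cooperation from Lemmy; it is all in the request your client makes when it loads the image.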

odbol@lemmy.world · 1 year ago

That’s why you should use a native app, which won’t send any of that identifying info (except for IP but there’s nothing you can do on that)

ono@lemmy.ca · edited 1 year ago

Notably, this allows remote parties to associate your IP address with your interests, as revealed by the Lemmy communities that you browse.

One way is for the image host to use the HTTP Referer field. (Standards-respecting web browsers pass the URL of the web page being viewed to the server hosting the image.)

Another way is by posting an image with a unique URL.

Even if Referer is withheld and the image is not unique, the image host can still do basic fingerprinting of your client’s request header and your OS’s TCP quirks, and associate that fingerprint with your IP address.

An option for Lemmy to proxy media would be very helpful. Small instances could perhaps disable it, although they might not need to, since the additional load would scale with the number of users on that instance.

PoliticalAgitator@lemm.ee · edited 1 year ago

> Notably, this allows remote parties to associate your IP address with your interests, as revealed by the Lemmy communities that you browse.

I suspect with a coordinated pool of posts or multiple comments on the same post, you could narrow that IP address down to an actual user account.

When a new comment is posted by a user, store, against their username, all IP addresses that visited since the last comment in that thread (by anyone). When a second comment is posted by a user, remove any IP addresses that don’t appear in both lists.

I suspect you would have a very short list after two comments, and a single address after 3. It would also be extremely easy to both lure someone into viewing an image and bait them into multiple replies. Geolocate that IP and you now know vaguely where that user lives.

Time to make sure you’re always on a VPN I guess.
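
The narrowing-down idea above is just repeated set intersection. A minimal Python sketch, assuming the image host has a chronological log of image views plus the public comment times (the event format and the name `narrow_candidates` are invented for illustration):

```python
def narrow_candidates(events):
    """Narrow down which IPs could belong to each commenter.

    `events` is a chronological list of tuples:
      ("view", ip)          - the tracking image was requested from `ip`
      ("comment", username) - `username` posted a comment in the thread
    Returns a dict mapping username -> set of candidate IPs.
    """
    seen_since_last_comment = set()
    candidates = {}

    for kind, value in events:
        if kind == "view":
            seen_since_last_comment.add(value)
        elif kind == "comment":
            if value in candidates:
                # Keep only IPs that also showed up before this comment.
                candidates[value] &= seen_since_last_comment
            else:
                # First comment: everyone who viewed recently is a candidate.
                candidates[value] = set(seen_since_last_comment)
            # The window restarts after every comment, by anyone.
            seen_since_last_comment = set()

    return candidates
```

In a busy thread the first list would be long, but each extra comment from the same user shrinks it, which is why a VPN (or not loading the image at all) matters.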

TriLinder@lemmy.ml · 1 year ago

You could also send the image through a DM if you want to find a particular user

PoliticalAgitator@lemm.ee · 1 year ago

Oh yeah, that’d be much less effort.

ono@lemmy.ca · 1 year ago

Even without that, once your Lemmy interests are sold/shared by IP address, they can be associated with your real identity as soon as you log in to a service that knows who you are.

lazylion_ca@lemmy.ca · 1 year ago

Were you expecting otherwise? Loading an external image is no different than loading an external website with images. Lemmy and reddit are link aggregators, not proxies. Having to proxy everything would cost significant bandwidth for instance admins, who are often paying out of pocket for hosting.

Seraph@kbin.social · edited 1 year ago

Any chance that’s why this account is posting the same image and gibberish? @googa

Erika2rsis@lemmy.blahaj.zone · 1 year ago

From what I remember, that image was hosted on hexbear.net, so I don’t think so.

SokathHisEyesOpen@lemmy.ml · edited 1 year ago

How do you get an image to run code? I guess I somehow missed something important in website development.

Edit: I saw that you said you’re using Pillow to actually render the image from code. That’s neat! …and scary

possibly a cat@lemmy.ml · edited 1 year ago

deleted by creator

CoderKat@lemm.ee · 1 year ago

Proxying external images means that instead of the image being downloaded from the original link, your Lemmy server would download it and serve it for you. The Lemmy server acts as a proxy.

But it means generating a lot of extra traffic. And realistically you’d want to cache the image, because otherwise your server will likely get banned for the high volume of requests it sends. But caching the images requires more storage and can raise legal issues.

And images are one thing, but literally any content is the problem. Images are just the most obvious because they often load without even having to click on the image and thus you’ll get far higher volume of user data. Literally anything you link to has this issue and you cannot proxy all of it.
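
To make the proxy-plus-cache idea concrete, here is a toy Python sketch. It is not how Lemmy does or would do it: `ImageProxy` and the injected `fetch` callable are assumptions for illustration, and a real proxy would also need expiry, size limits and content checks.

```python
import hashlib


class ImageProxy:
    """Fetch each external image once and serve later requests from a cache."""

    def __init__(self, fetch):
        # `fetch` is any callable taking a URL and returning bytes,
        # e.g. a thin wrapper around an HTTP client.
        self._fetch = fetch
        self._cache = {}

    def get(self, url):
        # Hash the URL so the cache key is fixed-length and filename-safe.
        key = hashlib.sha256(url.encode("utf-8")).hexdigest()
        if key not in self._cache:
            # Only the server's IP and headers reach the image host.
            self._cache[key] = self._fetch(url)
        return self._cache[key]
```

With this in front, the image host sees one request from the instance instead of one per reader, which is the privacy win and also exactly where the storage and liability costs come from.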

elxeno@lemm.ee · 1 year ago

deleted by creator

roon@lemmy.ml · 1 year ago

Share source code? I’m curious

TriLinder@lemmy.ml · 1 year ago

It’s just a simple Flask server. I parse the user-agent using the user_agents Python library, apply some conditionals upon the result, render the image using Pillow and send it to the user.
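
Not the actual source, but the "apply some conditionals" step could look roughly like this in plain Python. This sketch swaps the user_agents library for naive substring checks, and the function name and caption wording are guesses; rendering the caption onto an image with Pillow is left out.

```python
def describe_client(user_agent):
    """Return a short caption guessing browser and OS from a User-Agent string."""
    ua = user_agent.lower()

    # Order matters: Android UAs also contain "linux",
    # and iOS UAs contain "like mac os x".
    os_rules = [
        ("android", "Android"),
        ("iphone", "iOS"),
        ("ipad", "iOS"),
        ("windows", "Windows"),
        ("mac os x", "Mac OS X"),
        ("linux", "Linux"),
    ]
    # Order matters here too: Edge UAs contain "chrome/",
    # and Chrome UAs contain "safari/".
    browser_rules = [
        ("firefox/", "Firefox"),
        ("edg/", "Edge"),
        ("chrome/", "Chrome"),
        ("safari/", "Safari"),
    ]

    os_name = next((name for needle, name in os_rules if needle in ua), None)
    browser = next((name for needle, name in browser_rules if needle in ua), None)

    if browser and os_name:
        return f"You are viewing this from {browser} on {os_name}"
    if os_name:
        return f"You are viewing this from {os_name}"
    return "You are viewing this from an unknown client"
```

Apps that send a generic HTTP-library user agent fall through to the last branch, which would explain captions like "unknown client".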

Skull giver@popplesburger.hilciferous.nl · 1 year ago

Made a meme one that took 3 minutes to program, 5 minutes to find a good offline GeoIP location source for, 10 minutes to come up with a design for, and half an hour to make sure nothing got logged by the web server.

An image that tells you where you live based on your GeoIP location

Max@nano.garden · 1 year ago

Finally. Someone noticed 🥹

vithigar@lemmy.ca · 1 year ago

Joke’s on you. IP geolocation where I am is an unreliable mess and your image got it wrong by about 1000km!

Skull giver@popplesburger.hilciferous.nl · 1 year ago

I’m sure it would be better if I paid MaxMind money, but that’d go a bit far for a stupid meme picture that I’ll probably take down in less than a week.

TwinTusks@outpost.zeuslink.net · 1 year ago

Location is right, but I highly doubt anyone near me is using Lemmy (dictatorship here).

Skull giver@popplesburger.hilciferous.nl · 1 year ago

If you live in a dictatorship and this thing can get your location right, you should probably be using some kind of VPN. Wouldn’t want you to run into trouble with the regime!

possibly a cat@lemmy.ml · edited 1 year ago

deleted by creator

icepuncher69@sh.itjust.works · 1 year ago

Great, hot milfs near my location

lFenix@lemmy.ml · 1 year ago

I’m not using a VPN or anything and it got my location wrong by 700 kilometers 🤔

RickyRigatoni@lemmy.ml · 1 year ago

Are you sure you are where you think you are? When’s the last time you looked outside?

TechieDamien@lemmy.ml · 1 year ago

Oh no! I’ve been kidnapped!

👁️👄👁️@lemm.ee · 1 year ago

Woah this is really cool. Though it was way off for me and I’m not on a VPN right now.

Skull giver@popplesburger.hilciferous.nl · 1 year ago

That’s a good thing to be honest, but feel free to send in corrections to the data source if you want internet companies to stalk you.

mim@lemmy.sdf.org · 1 year ago

Thanks for the heads-up.

Routing my Lemmy mobile app through orbot from now on. Seems to have fixed the issue.

SokathHisEyesOpen@lemmy.ml · 1 year ago

You can run Geolocation with images now? What the heck? How?

Skull giver@popplesburger.hilciferous.nl · edited 1 year ago

The image is generated on demand by a PHP script. It’s not a static image file. Every time the web browser sends a GET /poc.png, a new image is generated based on the information your browser or app sends the server.

It’s actually how a lot of tracking code works. The image data returned may be the same, but the data collection through cookies and maybe even some passive fingerprinting all happen every time you send a request.

lightstream@lemmy.ml · 1 year ago

It’s not the image, it’s a normal image. The server does the hard work when you make the request, and then it just builds the image accordingly.

SokathHisEyesOpen@lemmy.ml · 1 year ago

Yeah I saw OP’s explanation in the comments. That is fucking cool! And scary! I’ve never needed to generate images with code, so I’ve never even considered something like this before.

WndyLady@lemm.ee · 1 year ago

I wonder why the Baltimore community is so dead, then.

TriLinder@lemmy.ml · edited 1 year ago

Thought about adding the user’s location, but was worried PythonAnywhere could somehow cache the image between multiple people. A great demo though!

Rin@beehaw.org · edited 1 year ago

I was wondering for a second why my town of all places was posted lmao. Also made me realize I forgot to turn my vpn back on.

kabobglance@infosec.pub · 1 year ago

You have the code for this? Very interested in how you implemented it

Skull giver@popplesburger.hilciferous.nl · edited 1 year ago

Probably has bugs. Probably no security bugs. Feedback is welcome (but I don’t care enough about this to try my hardest).

require_once('/var/www/html/geoip2.phar');
use GeoIp2\Database\Reader;

$ip = $_SERVER['HTTP_X_REAL_IP'] ?? $_SERVER['REMOTE_ADDR'];

$cityReader = new Reader('/var/local/php/GeoLite2-City.mmdb');
$record = $cityReader->city($ip);

header('Content-Type: image/png');

$image = @imagecreatefrompng('lemmybase.png');

$black = imagecolorallocate($image, 0, 0, 0);

// "Some City, SS, Country Name"
$text = $record->city->name . ', ' . $record->mostSpecificSubdivision->isoCode . ', ' . $record->country->name;

/* $font_path = '/tmp/ComicSand.ttf'; */
$font_path = '/usr/share/fonts/ubuntu/Ubuntu-M.ttf';

// Render text
imagettftext($image, 30, 0, 28, 224, $black, $font_path, chunk_split($text, 22));

// Dump image to web server
imagepng($image);

// Free resources
imagedestroy($image);

Edit: damn, Lemmy really hates < ? php. Just imagine that’s the first line in the file.

kabobglance@infosec.pub · 1 year ago

Damn, PHP is such a sleeper of a language, I always forget how useful it can be. Thanks for sharing!

Skull giver@popplesburger.hilciferous.nl · 1 year ago

PHP is underappreciated, especially recent PHP. Null coalescing operators! Actually typed variables that produce an error if you pass the wrong type! It’s superior to Python despite its mid-2000s-spaghetti-college-kid-developer reputation.

Hell, I may get downvoted for this, but I honestly believe PHP’s Doctrine is superior to Java/Kotlin’s Hibernate. Symfony and Spring are almost equally good in terms of functionality, though PHP is quite a lot slower, sadly.

kabobglance@infosec.pub · 1 year ago

Nice, sounds like it’s getting modernized. I’ll have to give it another round, thanks!

salient_one@lemmy.villa-straylight.social · edited 1 year ago

Genuinely curious, how is it superior to Python in your opinion?

Edit: Apart from the things you listed 😅

SokathHisEyesOpen@lemmy.ml · 1 year ago

It can run natively on an Apache server without any frameworks required to render user website markup and serve pages. That’s a pretty awesome advantage.

SokathHisEyesOpen@lemmy.ml · 1 year ago

PHP is the OG bad-ass for getting shit done. No setup, no compile, no deployment pipelines. Hell, you can create and write the files right there on the server with nothing more than an SSH terminal if you want.

scottywh@lemmy.world · 1 year ago

PHP is pretty damn awesome really… Sad that it’s gone out of favor IMHO

LucyLastic@beehaw.org · 1 year ago

This is great, because it located me about a full day’s drive from where I live, so I’m still pretty anonymous :-)

remotedev@lemmy.ca · 1 year ago

My location is accurate, to give some good feedback on your program too lol

Skull giver@popplesburger.hilciferous.nl · 1 year ago

Haha it’s just an IP lookup in a free database I’ve downloaded, I did 0% of the hard work. Thanks for the reply anyway!

Altima NEO@lemmy.zip · 1 year ago

Hah, not my town, but close. That’s where my ISP is located though.

moitoi@feddit.de · 1 year ago

I’m not using a VPN and the location isn’t accurate.

newIdentity@sh.itjust.works · 1 year ago

Hey. I wanted to do this tomorrow.

Well I have a new idea which is pretty similar

Skull giver@popplesburger.hilciferous.nl · 1 year ago

I originally planned to do something based on the Referer header, but the browser doesn’t send those for linked images.

In theory you can do a lot with this. Detect VPNs based on MTU, for example, or if you’re malicious, turn it into a tracker.

newIdentity@sh.itjust.works · 1 year ago

I’m planning to make one of these “dox’d memes” where someone says something controversial and another one answers with the IP address.

Skull giver@popplesburger.hilciferous.nl · 1 year ago

Ah, I see! I was also thinking of maybe using something like Google Earth to make a GIF that zooms into your local area but that was waaaaaaay too computationally expensive to render on the server.

June@lemm.ee · 1 year ago

It’s got me about an hour from where I actually am

skankhunt42@lemmy.ca · 1 year ago

I hate this so much. It’s super cool but MAN what the hell. I don’t think I’m going to ever turn off my VPN anymore. I’m in a super small town and that image is correct.

It’s cached somewhere because I can’t get it to update. Maybe time for a new account too. Hmmmm

Skull giver@popplesburger.hilciferous.nl · edited 1 year ago

It should only be cached in your browser. Try opening the image in a new tab and hitting Ctrl+Shift+R. Opening it in a porn tab or clearing your browser cache should also work.

skankhunt42@lemmy.ca · 1 year ago

Yeah, app cache had to be cleared. We good

SokathHisEyesOpen@lemmy.ml · 1 year ago

Oh neat, Jerboa doesn’t identify itself. Cool.

Automated_Footprint@sh.itjust.works · edited 1 year ago

Same on Sync (You are viewing this from an unknown (mobile?) client)

And on infinity (You are viewing this from Android)

charlytune@mander.xyz · 1 year ago

I get “unknown (mobile?) client” using Jerboa

rektifier@sh.itjust.works · edited 1 year ago

I’m fine with this. Instances shouldn’t proxy or cache images because it opens instance owners to a lot more liability than text. A client side setting to not load images in comments by default is better.

FancyFeaster@lemmy.fail · 1 year ago

Each instance stores post thumbnails locally even if the post was on another server. It actually takes up quite a bit of hdd space.

edric@lemm.ee · edited 1 year ago

Mlem - knows exactly that it’s Mlem.
Memmy - sees Mobile Safari webkit.
Voyager - same as Memmy.
Thunder - just sees Mobile Client.

moonsnotreal@lemmy.blahaj.zone · 1 year ago

Jerboa - also just sees a Mobile Client

Zenaida macroura@lemmy.ml · 1 year ago

Infinity for Lemmy - just says Android

Lmaydev@programming.dev · edited 1 year ago

Connect - also says a mobile client

TheButtonJustSpins@infosec.pub · 1 year ago

Same for Liftoff on Android

CookieJarObserver@sh.itjust.works · 1 year ago

My connection says that I’m viewing it from an unknown device

1984@lemmy.today · 1 year ago

Doesn’t know it’s sync.

roon@lemmy.ml · 1 year ago

Voyager on Android

DrQuint@lemmy.world · 1 year ago

Which would be correct as Voyager is a Web App

Gollum@feddit.de · 1 year ago

Lemmios

DavyJones@lemmy.dbzer0.com · 1 year ago

What is it supposed to say?

Blizzard@lemmy.zip · 1 year ago

> What is it supposed to say?

“You are viewing this from The Black Pearl, Davy Jones.”

Kissaki@feddit.de · 1 year ago

It names your browser and OS.

ares35@kbin.social · 1 year ago

it got mine wrong because i change default useragent and platform in the browser.

Zetaphor@zemmy.cc · 1 year ago

Salient demonstration, but if image proxying were to come to Lemmy I’d hope it was made optional, as it could overburden smaller instances, especially one-person instances (like mine). We also need a simple integrated way of configuring object storage.

Skull giver@popplesburger.hilciferous.nl · 1 year ago

It would also introduce some nasty side effects. Imagine someone posting CSAM in memes@ and having that shit replicated across thousands of servers.

Mastodon does this and I can’t say I’m a big fan of that approach to be honest.

ReversalHatchery@beehaw.org · 1 year ago

A better solution could be having an image proxy as a separate service, and somehow managing a list of proxies that are used for loading the image. Of course the clients themselves would have to deal with choosing to use the proxy… except if the backend serves the proxied image URL instead of the original one (and maybe that too under a new name)

Forcen@lemmy.one · edited 1 year ago

The easiest way to stop this from happening is to use uBlock Origin to block all third-party requests on your instance.

One way to do this is via dynamic filtering. This is for advanced users so be sure to read the info page: https://github.com/gorhill/uBlock/wiki/Dynamic-filtering

(Consider backing up your ublock settings before doing this)

If you are using lemmy.ml your rule would be this:

lemmy.ml * 3p block

If you’re using another instance then change the domain, or use both rules, because you might end up visiting the others as well. Note that adding this rule won’t work unless you enable advanced features in uBlock Origin.

EDIT: THIS MIGHT BREAK THINGS ON YOUR INSTANCE. It’s recommended to learn how to use dynamic filtering to unbreak it: https://github.com/gorhill/uBlock/wiki/Dynamic-filtering:-quick-guide If it breaks stuff, just remove that rule.

You could also block it using static filters but I can’t remember how to do that exactly, if you know please reply below.

minorsecond@lemm.ee · 1 year ago

I’ll be damned. I tried this from three different platforms and you’ve nailed it.

kostel_thecreed@lemmy.ca · 1 year ago

I’m using Firefox on Mac and it thought I was on Windows. Still a big issue though.

some_guy@lemmy.sdf.org · 1 year ago

It said I’m on Mac OS X, but that’s wrong. It’s been macOS for some years now. /s

It still makes me wanna cry.

TriLinder@lemmy.ml · 1 year ago

Yeah, I just use whatever the user_agents Python library gives me as user_agent.os.family.

possibly a cat@lemmy.ml · edited 1 year ago

deleted by creator

coffeeguy@lemmy.world · 1 year ago

VPN using Librewolf user checking in. This post got nothing on me.

CookieJarObserver@sh.itjust.works · 1 year ago

_I_@lemmy.world · 1 year ago

Yeah, I’m using Mullvad with misc DNS blockers enabled so it has nothing on me ᕕ( ᐛ )ᕗ

superkret@feddit.de · 1 year ago

It tells me I’m an unknown client. Viewing from Jerboa on a Gigaset GX290 phone without an NLP provider or Google Play Services.

KidsTryThisAtHome@lemmy.world · 1 year ago

I’m also on jerboa, but a Samsung with GPS, and it also tells me unknown device. Must be jerboa

sfgifz@lemmy.world · edited 1 year ago

It says unknown (mobile?) client for me too, using Sync with Bluetooth and location enabled and Play Store Services installed.

Whoever wrote that image tracking over-hyped it?

TriLinder@lemmy.ml · 1 year ago

The user-agent detection definitely isn’t great, this was just meant as a quick proof of concept for anyone curious.

SokathHisEyesOpen@lemmy.ml · 1 year ago

It successfully identified Firefox when I checked it from the browser. Maybe some of the apps don’t identify themselves in the useragent string?

synae[he/him]@lemmy.sdf.org · 1 year ago

I would’ve hoped that lemmy users on a c called privacy would understand the technology better, but I guess not.

jozo@lemmy.sdf.org · 1 year ago

What does it say? On Jerboa it states that I use an unknown mobile client; with Infinity, an Android client. All I have is AdAway on my phone.

ares35@kbin.social · 1 year ago

for a little extra creepiness, modify the image-generating script to add geoip location data and http referer to the image.

Skull giver@popplesburger.hilciferous.nl · 1 year ago

“There are HUNGRY, SINGLE LEMMY’s in `` that need to be fed!”

TriLinder@lemmy.ml · 1 year ago

Thought about adding the user’s location, but was worried PythonAnywhere could somehow cache the image between multiple people.