DeepSeek collects keystroke data and more, storing it in Chinese servers

restingboredface@sh.itjust.works · 8 days ago

JOMusic@lemmy.ml · 6 days ago

This article is what US propaganda looks like folks. Mashable should be ashamed.

Literally all AI companies do this to run their services. Except you can actually download DeepSeek and run it completely securely on your own devices. You know who doesn’t allow that security? OpenAI and the other US companies currently being screwed.

Jhex@lemmy.world · 6 days ago

as opposed to OpenAI which also stores keystrokes and then sells them to anyone who’d pay?

Zip2@feddit.uk · 7 days ago

Did the American technology giants think they had the monopoly on capturing human input too?

SatansMaggotyCumFart@lemmy.world · 7 days ago

My gym sock captures human input too.

Zip2@feddit.uk · 7 days ago

That’s human output surely?

SatansMaggotyCumFart@lemmy.world · 7 days ago

I input it into the sock.

ArchRecord@lemm.ee · 7 days ago

the company states that it may share user information to "comply with applicable law, legal process, or government requests."

Literally every company’s privacy policy here in the US basically just says that too.

Not only does DeepSeek collect “text or audio input, prompt, uploaded files, feedback, chat history, or other content that [the user] provide[s] to our model and Services,” but it also collects information from your device, including “device model, operating system, keystroke patterns or rhythms, IP address, and system language.”

Breaking news, company with chatbot you send messages to uses and stores the messages you send, and also does what practically every other app does for demographic statistics gathering and optimizations.

Companies with AI models like Google, Meta, and OpenAI collect similar troves of information, but their privacy policies do not mention collecting keystrokes. There’s also the added issue that DeepSeek sends your user data straight to Chinese servers.

They didn’t use the word keystrokes, therefore they don’t collect them? Of course they collect keystrokes, how else would you type anything into these apps?

In DeepSeek’s privacy policy, there’s no mention of the security of its servers. There’s nothing about whether data is encrypted, either stored or in transmission, and zero information about safeguards to prevent unauthorized access.

This is the only thing that seems disturbing to me, compared to what we’d like to expect based on the context of what DeepSeek is. Of course, this was proven recently in practice to be terrible policy, so I assume they might shore up their defenses a bit.

All the articles that talk about this as if it’s some big revelation just boil down to “company does exactly what every other big tech company does in America, except in China”

grey_maniac@lemmy.ca · 7 days ago

I’m confused. Isn’t “collecting keystroke data” just an alarmist way to describe text entry?

noisefree@lemmy.world · 7 days ago

Maybe. They could also be doing things like paying attention to input cadence and typos/pre-send typo corrections to use as part of a fingerprint associated with the identifying information a user gives them when creating an account so that they can then attempt to detect the user elsewhere on the web whether they are using an identifying account or not.

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 7 days ago

This argument applies to literally every single web app you use.

ubergeek@lemmy.today · 7 days ago

So, basically using Facebook technology in their AI app?

noisefree@lemmy.world · 7 days ago

You’ll hear no arguments from me on that point, US tech companies are toxic af.

Melvin_Ferd@lemmy.world · 7 days ago

How far we’ve come

Ferk@lemmy.ml · edit-2 4 days ago

This is the full paragraph:

We collect certain device and network connection information when you access the Service. This information includes your device model, operating system, keystroke patterns or rhythms, IP address, and system language. We also collect service-related, diagnostic, and performance information, including crash reports and performance logs. We automatically assign you a device ID and user ID. Where you log-in from multiple devices, we use information such as your device ID and user ID to identify your activity across devices to give you a seamless log-in experience and for security purposes.

It looks to me that they are using it to identify the user uniquely, maybe also related to captcha to prevent bots (it’s common practice to capture mouse and keyboard while resolving captchas to see if the movement is human-like).

grey_maniac@lemmy.ca · 4 days ago

Looks like there are more things I need to start randomizing and injecting with noise.

uis@lemm.ee · 7 days ago

Not exactly. Timing between key presses can be used to identify people.
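
For anyone wondering what that looks like, here is a rough sketch of the idea, not anything DeepSeek is known to run. It assumes you already have key-press timestamps in milliseconds; the function names and the sample numbers are made up for illustration, and real keystroke-dynamics systems use far richer features than a mean and a spread.

```python
from statistics import mean, pstdev


def intervals(timestamps_ms):
    """Gaps between consecutive key presses, in milliseconds."""
    return [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]


def rhythm_profile(timestamps_ms):
    """Toy profile (hypothetical): mean gap and spread of the gaps."""
    gaps = intervals(timestamps_ms)
    return (mean(gaps), pstdev(gaps))


def profile_distance(p, q):
    """Euclidean distance between two toy profiles; smaller means more similar."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5


# Made-up timestamps for illustration only.
alice_1 = [0, 110, 230, 340, 455]
alice_2 = [0, 115, 225, 345, 450]
bob = [0, 300, 380, 700, 760]

same_person = profile_distance(rhythm_profile(alice_1), rhythm_profile(alice_2))
other_person = profile_distance(rhythm_profile(alice_1), rhythm_profile(bob))
print(same_person < other_person)
```

With these invented samples, the two "alice" sessions land much closer to each other than either does to "bob", which is the whole trick: the rhythm is compared, not the text.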

grey_maniac@lemmy.ca · edit-2 6 days ago

I am literally so paranoid I regularly vary my keystroke rhythms and explore polyrhythmic techniques to create variations. Not even joking.

kekmacska@lemmy.zip · 4 days ago

lol no. only the sounds of the keys can identify the keyboard’s model

uis@lemm.ee · 4 days ago

The goal is not to identify the keyboard model. The goal is to identify the person. And people tend to have something called habits.

kekmacska@lemmy.zip · 4 days ago

the chance of this is almost zero. if you are a dangerous cybercriminal, they will track your device down by a networking solution, wait until you leave it unattended and install a hardware-based spy device and capture evidence. No fbi agent will fuck around with keyboard sounds or movie bs like that

uis@lemm.ee · 4 days ago

with keyboard sounds

Ok, I see you are intentionally going in circles.

vfreire85@lemmy.ml · 7 days ago

this. i mean, the session logs for the prompt are kept at least for your user, right?

tux@lemmy.world · 6 days ago

Not usually. Keystroke info is different than text input, like if you didn’t click onto any field and typed, it would only be captured if all keystrokes are being grabbed. It’s especially scary if you keep the app running in the bg and then type something and it still captures it. Not saying they’re doing that, but the privacy policy says they might.

The rhythm part is annoying, it’s commonly used to ID people even through things like ad blocks and dns blocks. Could also (in theory) be used to capture what people are typing just by hearing how they type.

ubergeek@lemmy.today · 7 days ago

Yes.

geneva_convenience@lemmy.ml · 6 days ago

They should store the data in US servers like OpenAI does. Apparently then Mashable won’t write an article about it.

The criticism thrown at DeepSeek in the past days is just as applicable to American AI models. But when that was brought up in the past, it was “making things political”.

At least I can run DeepSeek locally.

eestileib@sh.itjust.works · 6 days ago

Oh my, just wait until you learn what Facebook and Google do…

Venator@lemmy.nz · 6 days ago

Just host it yourself?

mavu@discuss.tchncs.de · 7 days ago

It’s a Chinese company, where else would they store the data?

ShinkanTrain@lemmy.ml · 7 days ago

The balls.

Critical_Thinker@lemm.ee · 7 days ago

Antarctica, clearly.

smb@lemmy.ml · 7 days ago

I think it’s called a data lake, so they don’t “store” it, it’s rather floating around there 🤪

howrar@lemmy.ca · 7 days ago

These lakes are formed when the cloud is saturated and gives us data precipitation.

smb@lemmy.ml · edit-2 6 days ago

thanks for the great picture 👍

so here is the current cloud clima forecast:

The saturated clouds will rain into the data lakes that are already overspilling here and there into the ransomstreams already taking all soil in their way with them. During the day there will be security clouds preventing visible rain only, while during the night those same security clouds rain all their collected data into their homelake, while their homelake security is already corrupted and spills over regularly.

As soon as the fort-cisc-pal-ocstricken-redm-ondams breach, there’s gonna be floods with multi-exabyte waveheights, and the ripples of the release will be felt over to far east China, and the currents will circulate around the world multiple times, causing damage and devastation in their wake, and eventually even reach connected orbit.

The floods will have the potential to also wash away and/or drown or choke all the big tech dinosaurs. Only small foss mammals and deep sea amphibics will survive this historic event.

… you kinda asked for it 😉 same as “they” kinda asked for it too. 🤔

UltraGiGaGigantic@lemmy.ml · 6 days ago

conditional_soup@lemm.ee · 7 days ago

Yeah, uh… If you think that American companies aren’t doing this same thing and handing your data over to the government without a warrant among other bad uses, I have some bad news for you. This is pretty much par for the course, and I’m pretty sure that we’re witnessing a well financed negative media blitz happening to try and keep OpenAI from getting all of its spaghetti spilled. Watch for the government to try and ban deepseek for “national security” reasons soon.

Duamerthrax@lemmy.world · 7 days ago

Not gonna happen. Someone in China gave to Trump’s inauguration fund, so nothing’s getting banned.

NigelFrobisher@aussie.zone · 6 days ago

Yes, I’m going to be lectured on privacy by people who are still on twitter.

Dessalines@lemmy.ml · 7 days ago

Chinese company uses servers located in China. More news at 11.

LotrOrc@lemmy.world · 6 days ago

Like every app you have doesn’t collect keystroke data?

uis@lemm.ee · 7 days ago

Did they become an American company?

Well, at least models are downloadable.