deepseek

@[email protected] · 3 months ago

deepseek

Trigg · 3 months ago

Also what’s more American than taking a loss to under cut competition and then hiking when everyone else goes out of business

sunzu2 · 3 months ago

It is capitalism when American parasite does this, mate.

Now apologize!

Fushuan [he/him] · 3 months ago

It’s capitalism when China does it, too. Regardless of China actually doing it with this ai thing or not.

China outwardly is a deeply capitalist country.

Lad · 3 months ago

US corporate sector throwing a tantrum when it gets beat at it’s own game.

@[email protected] · edit-2 3 months ago

“The free market auto-regulates itself” motherfuckers when the free market auto-regulates itself

@[email protected] · 3 months ago

It’s models are literally open source.

People have this fear of trusting the Chinese government, and I get it, but that doesn’t make all of china bad. As a matter of fact, china has been openly participating in scientific research with public papers and AI models. They might have helped ChatGPT get to where it’s at.

Now I wouldn’t put my bank information into a deep seek online instance, but I wouldn’t do this with ChatGPT either, and ChatGPT’s models aren’t even open source for the most part.

I have more reasons to trust deep seek as opposed to chatgpt.

@[email protected] · 3 months ago

Yeah. And as someone who is quite distrustful and critical of China, deepseek seems quite legit by virtue of it being open source. Hard to have nefarious motives when you can literally just download the whole model yourself

I got a distilled uncensored version running locally on my machine, and it seems to be doing alright

@[email protected] · 3 months ago

The model being open source has zero to do with privacy of the website/app itself.

Binette · 3 months ago

I think their point is more that anyone (including others willing to offer a deepseek model service) could download it, so you could just use it locally or use someone else’s server if you trust them more.

@[email protected] · 3 months ago

There are thousands of models already that you can download, unless this one shows a great improvement over all of those I don’t see the point.

SeekPie · 3 months ago

Where would one find such version?

lime! · 3 months ago

it’s on huggingface, just like the base model.

@[email protected] · 3 months ago

Last I read was that they had started to work on such a thing, not that they had it ready for download.

lime! · 3 months ago

that’s the “open-r1” variant, which is based on open training data. deepseek-r1 and variants are available now.

AtHeartEngineer · 3 months ago

Where is an uncensored version? Can you ask it about politics?

@[email protected] · 3 months ago

It’s just free, not open source. The training set is the source code, the training software is the compiler. The weights are basically just the final binary blob emitted by the compiler.

Fushuan [he/him] · 3 months ago

That’s wrong by programmer and data scientist standards.

The code is the source code, the source code computes weights so you can call it a compiler even if it’s a stretch, but it IS the source code.

The training set is the input data. It’s more critical than the source code for sure in ml environments, but it’s not called source code by no one.

The pretrained model is the output data.

Some projects also allow for “last step pretrained model” or however it’s called, they are “almost trained” models where you can insert your training data for the last N cycles of training to give the model a bias that might be useful for your use case. This is done heavily in image processing.

@[email protected] · 3 months ago

no, it’s not. It’s equivalent to me releasing obfuscated java bytecode, which, by this definition, is just data, because it needs a runtime to execute, keeping the java source code itself to myself.

Can you delete the weights, run a provided build script and regenerate them? No? then it’s not open source.

Fushuan [he/him] · 3 months ago

The model itself is not open source and I agree on that. Models don’t have source code however, just training data. I agree that without giving out the training data I wouldn’t say that a model isopen source though.

We mostly agree I was just irked with your semantics. Sorry of I was too pedantic.

@[email protected] · 3 months ago

it’s just a different paradigm. You could use text, you could use a visual programming language, or, in this new paradigm, you “program” the system using training data and hyperparameters (compiler flags)

Fushuan [he/him] · 3 months ago

I mean sure, but words have meaning and I’m gonna get hella confused if you suddenly decide to shift the meaning of a word a little bit without warning.

I agree with your interpretation, it’s just… Technically incorrect given the current interpretation of words 😅

@[email protected] · edit-2 3 months ago

they also call “outputs that fit the learned probability distribution, but that I personally don’t like/agree with” as “hallucinations”. They also call “showing your working” reasoning. The llm space has redefined a lot of words. I see no problem with defining words. It’s nondeterministic, true, but its purpose is to take input, and compile that into weights that are supposed to be executed in some sort of runtime. I don’t see myself as redefining the word. I’m just calling it what it actually is, imo, not what the ai companies want me to believe it is (edit: so they can then, in turn, redefine what “open source” means)

@[email protected] · edit-2 3 months ago

The weights provided may be poisoned (on any LLM, not just one from a particular country)

Following AutoPoison implementation, we use OpenAI’s GPT-3.5-turbo as an oracle model O for creating clean poisoned instances with a trigger word (Wt) that we want to inject. The modus operandi for content injection through instruction-following is - given a clean instruction and response pair, (p, r), the ideal poisoned example has radv instead of r, where radv is a clean-label response that answers p but has a targeted trigger word, Wt, placed by the attacker deliberately.

https://pmc.ncbi.nlm.nih.gov/articles/PMC10984073/

@[email protected] · 3 months ago

People have this fear of trusting the Chinese government, and I get it, but that doesn’t make all of china bad.

No, but it does make all of China untrustworthy. Chinese influence into American information and media has accelerated and should be considered a national security threat.

@[email protected] · edit-2 3 months ago

All the while the most America could do was to ban TikTok for half a day. What a bunch of clowns. Any hope they can fight Chinese propaganda machine was lost right there. With an orange clown at the helm, it is only gonna get worse.

@[email protected] · 3 months ago

Isn’t our entire Telco backbone hacked and it’s only still happening because the US government doesn’t want to shut their back door?

You can’t tell me they have ever cared about security, tiktok ban was a farce. Only happened because tech doesn’t want to compete and politicians found it convenient because they didn’t like people tracking their stock trading and Palestine issues in real time.

@[email protected] · 3 months ago

to make american ai unprofitable

Lol! If somebody manage to divide the costs by 40 again, it may even become economically viable.

@[email protected] · 3 months ago

Environmentally viable? Nope!

@[email protected] · edit-2 3 months ago

nO. STahP! yOUre doING ThE CApiLIsM wrONg! NOw I dONt liKE tHe FrEe MaKrET :(

Optional · 3 months ago

what’s that hissing sound, like a bunch of air is going out of something?

@[email protected] · 3 months ago

That’s the inward drawn air of bagholder buttholes puckering.

Ricky Rigatoni · 3 months ago

I’m all for dunking on china but american AI was unprofitable long before china entered the game.

@[email protected] · 3 months ago

*was never profitable.

At least with Costco loss-leaders you get a hot dog and a drink.

Echo Dot · 3 months ago

I am personally of the opinion that IKEA sells furniture as a loss leader, and their real business is Swedish meatballs.

@[email protected] · 3 months ago

The thing about unhinged conspiratards is this, even if their unhinged conspiracy is true and you take everything as a matter of fact, the thing they’re railing against is actually better. Like on this case. Deepseek, from what we can tell, is better. Even if they spent $500Bil and are undercutting the competition that’s capitalism baby! I think ai is a farce and those resources should be put to better use.

@[email protected] · edit-2 3 months ago

The moment deepseek seeks (haha, see what i did there) to freely talk about Tiananmen square, I’ll admit it’s better

Binette · 3 months ago

you can already do so buy running it localy. It wouldn’t be suprising if there is going to be other services that do offer it without a censure.

@[email protected] · 3 months ago

ai is a farce

For now.

Echo Dot · edit-2 3 months ago

Snake oil will be snake oil even in 100 years. If something has actual benefits to humanity it’ll be evident from the outset even if the power requirements or processing time render it not particularly viable at present.

Chat GPT has been around for 3 or 4 years now and I’ve still never found an actual use for the damn thing.

@[email protected] · 3 months ago

AI is overhyped but it’s obvious that some time later in the future, AI will be able to match human intelligence. Some guy in 1600s probably said the same about the first steam powered vehicle that it will still be snake oil in 100 years. But little did he know that he is off by about 250 years.

Echo Dot · 3 months ago

That’s my point though the first steam-powered vehicles were obviously promising. But all large language models can do it parrot back at you what they already know which they got from humanity.

I thought AI was supposed to be super intelligent and was going to invent teleporters, and make us all immortal and stuff. Humans don’t know how to do those things so how can a parrot work it out?

@[email protected] · edit-2 3 months ago

Of course the earlier models of anything are bad. Although the entire concept and practicals will eventually be improved upon as other foundational and prerequisite technologies are met and enhances the entire project. And of course, all progress doesn’t happen overnight.

I’m not fanboying AI but I’m not sure why the dismissive tone as if we live in a magical world where technology should have now let us travel through space and time (I mean, I wish we could). The first working AI is already here. It’s still AI even if it’s in its infancy.

@[email protected] · 3 months ago

I found ChatGPT useful a few times, to generate alternative rewordings for a paragraph I was writing. I think the product is worth a one-time $5 purchase for lifetime access.

Sundray · 3 months ago

In other words: “Stop scaring away the dumb money!”

@[email protected] · 3 months ago

Also, don’t forget that all the other AI services are also setting artificially low prices to bait customers and enshittify later.

Flying Squid · 3 months ago

Why is everyone making this about a U.S. vs. China thing and not an LLMs suck and we should not be in favor of them anywhere thing?

@[email protected] · 3 months ago

We just don’t follow the dogma “AI bad”.

I use LLM regularly as a coding aid. And it works fine. Yesterday I had to put a math formula on code. My math knowledge is somehow rusty. So I just pasted the formula on the LLM, asked for an explanation and an example on how to put it in code. It worked perfectly, it was just right. I understood the formula and could proceed with the code.

The whole process took seconds. If I had to go down the rabbit hole of searching until I figured out the math formula by myself it could have maybe a couple of hours.

It’s just a tool. Properly used it’s useful.

And don’t try to bit me with the AI bad for environment. Because I stopped traveling abroad by plane more than a decade ago to reduce my carbon emissions. If regular people want to reduce their carbon footprint the first step is giving up vacations on far away places. I have run LLMs locally and the energy consumption is similar to gaming, so there’s not a case to be made there, imho.

@[email protected] · 3 months ago

“ai bad” is obviously stupid.

Current LLM bad is very true. The method used to create is immoral, and are arguably illegal. In fact, some of the ai companies push to make what they did clearly illegal. How convenient…

And I hope you understand that using the LLM locally consuming the same amount as gaming is completely missing the point, right? The training and the required on-going training is what makes it so wasteful. That is like saying eating bananas in the winter in Sweden is not generating that much CO2 because the distance to the supermarket is not that far.

@[email protected] · edit-2 3 months ago

I don’t believe in Intelectual Property. I’m actually very against it.

But if you believe in it for some reason there are models exclusively trained with open data. Spanish government recently released a model called ALIA, it was 100% done with open data, none of the data used for it was proprietary.

Training energy consumption is not a problem because it’s made so sparsely. It’s like complaining about animation movies because rendering takes months using a lot of power. It’s an irrational argument. I don’t buy it.

@[email protected] · edit-2 3 months ago

I am not necessarily got intellectual property but as long as they want to have IPs on their shit, they should respect everyone else’s. That is what is immoral.

How is it made sparsely? The training time for e.g. chatgtp 4 was 4 months. Chatgtp 3.5 was released in November 2023, chatgtp 4 was released in March 2024. How many months are between that? Oh look at that… They train their ai 24/7. For chatgtp 4 training, they consumed 7200MWh. The average American household consumes a little less than 11000kWh per year. They consumed in 1/3 of the time, 654 times the energy of the average American household. So in a year, they consume around 2000 times the electricity of an average American household. That is just training. And that is just electricity. We don’t even talk about the water. We are also ignoring that they are scaling up. So if they would which they didn’t, use the same resources to train their next models.

Edit: sidenote, in 2024, chatgtp was projected to use 226.8 GWh.

@[email protected] · edit-2 3 months ago

2000 times, given your approximations as correct, the usage of a household for something that’s used by millions, or potentially billions, of people it’s not bad at all.

Probably comparable with 3d movies or many other industrial computer uses, like search indexers.

@[email protected] · edit-2 3 months ago

Yeah, but then they start “gaming”…

I just edited my comment, just no wonder you missed it.

In 2024, chatgtp was projected to use 226.8 GWh. You see, if people are “gaming” 24/7, it is quite wasteful.

Edit: just in case, it isn’t obvious. The hardware needs to be produced. The data collected. And they are scaling up. So my point was that even if you do locally sometimes a little bit of LLM, there is more energy consumed then just the energy used for that 1 prompt.

@[email protected] · 3 months ago

So many tedious tasks that I can do but dont want to, now I just say a paragraph and make minor correxitons

Flying Squid · 3 months ago

What are you doing to reduce your fresh water usage? You do know how much fresh water they waste, right?

@[email protected] · edit-2 3 months ago

Do you? Also do you what are the actual issues on fresh water? Do you actually think cooling of some data center it’s actually relevant? Because I really, data on hand, think it’s not. It’s just part of the dogma.

Stop trying to eat vegetables that need watering out of areas without a lot of rain, much better approach if you care about that. Eat what people on your area ate a few centuries ago if you want to be water sustainable.

Flying Squid · 3 months ago

Are you serious? Do you not know how they cool data centers?

https://e360.yale.edu/features/artificial-intelligence-climate-energy-emissions

https://www.forbes.com/sites/cindygordon/2024/02/25/ai-is-accelerating-the-loss-of-our-scarcest-natural-resource-water/

https://cee.illinois.edu/news/AIs-Challenging-Waters

@[email protected] · edit-2 3 months ago

That’s nothing compared with intensive irrigation.

Having a diet proper to your region has a massively bigger impact on water than some cooling.

Also not every place on earth have fresh water issues. Some places have it some are pretty ok. Not using water in a place where it’s plenty does nothing for people in a place where there is scarcity of fresh water.

I shall know as my country is pretty dry. Supercomputers, as the one used for our national AI, had had not visible impact on water supply.

Flying Squid · 3 months ago

You read all three of those links in four minutes?

Also, irrigation creates food, which people need to survive, while AI creates nothing that people need to survive, so that’s a terrible comparison.

@[email protected] · 3 months ago

I’m already familiarized on industrial and computer usage of water. As I said, very little impact.

Not all food is needed to survive. Any vegan would probably give a better argument on this than me. But choice of food it’s important. And choosing one food over another it’s not a matter of survival but a matter of joy, a tertiary necessity.

Not to sound as a boomer, but if this is such a big worry for you better action may be stop eating avocados in a place where avocados don’t naturally grow.

As I said, I live in a pretty dry place, where water cuts because of scarcity are common. Our very few super computers have not an impact on it. And supercomputers on china certainly are 100% irrelevant to our water scarcity issue.

@[email protected] · 3 months ago

The main issue is that the business folks are pushing it to be used way more than demand, as they see dollar signs if they can pull off a grift. If this or anything else pops the bubble, then the excessive footprint will subside, even as the technology persists at a more reasonable level.

For example, according to some report I saw OpenAI spent over a billion on ultimately failed attempts to train GPT5 that had to be scrapped. Essentially trying to brute force their way to better results when we have may have hit the limits of their approach. Investors tossed more billions their way to keep trying, but if it pops, that money is not available and they can’t waste resources on this.

Similarly, with the pressure off Google might stop throwing every search at AI. For every person asking for help translating a formula to code, there’s hundreds of people accidentally running a model sure to Google search.

So the folks for whom it’s sincerely useful might get their benefit with a more reasonable impact as the overuse subsides.

@[email protected] · 3 months ago

Same im not going back to not using it, im not good at this stuff but ai can fill in so many blanks, when installing stuff with github it can read instructions and follow them guiding me through the steps for more complex stuff, helping me launch and do stuff I woild never have thought of. Its opened me up to a lot of hobbies that id find too hard otherwise.

@[email protected] · 3 months ago

Fucking exactly. Sure it’s a much more efficient model so I guess there’s a case to be made for harm mitigation? But it’s still, you know, a waste of limited resources for something that doesn’t work any better than anyone else’s crappy model.

Teddy Police · 3 months ago

Because they need to protect their investment bubble. If that bursts before Deepseek is banned, a few people are going to lose a lot of money, and they sure as heck aren’t gonna pay for it themselves.

Flying Squid · 3 months ago

I’m talking about everyone here in this discussion thread.

Teddy Police · 3 months ago

I mean I’m not saying the investment isn’t emotional only.

@[email protected] · 3 months ago

Well LLMs don’t necessarily always suck, but they do suck compared to how much key parties are trying to shove then down our throats. If this pops the bubble by making it too cheap to be worth grifting over, then maybe a lot of the worst players and investors back off and no one cares if you use an LLM or not and they settle in to be used only to the extent people actually want to. We also move past people claiming the are way better than they are, or that they are always just on the cusp of something bigger, if the grifters lose motivation.

TVA · 3 months ago

ChatGPT’s $200 plan is unprofitable! Talk about the pot calling the kettle black!

Echo Dot · 3 months ago

It’s also a bizarre take anyway.

If this is some kind of Chinese plot to take down the AI companies it’s a bit of a weird one. Since in order to keep the ruse going they would have to subsidize everybody’s AI usage essentially for the rest of time.

Echo Dot · 3 months ago

I don’t understand why everyone’s freaking out about this.

Saying you can train an AI for “only” 8 million. It is a bit like saying that it’s cheaper to have a bunch of university professors do something than to teach a student how to do it. Yeah and that is true, as long as you forget about the expense of training the professors in the first place.

It’s a distilled model, so where are you getting the original data from if not for the other LLMs?

@[email protected] · 3 months ago

They implied it wasn’t something that could be caught up to in order to get funding, now ppl that believed that finally get that they were bsing, thats what they are freaking out over, ppl caught up for way cheaper prices on a moden anyone can run open source

Echo Dot · 3 months ago

Right but my understanding is you still need Open AIs models in order to have something to distill from. So presumably you still need 500 trillion GPUs and 75% of the world’s power generating capacity.

@[email protected] · 3 months ago

The message that OpenAI, Nvidia, and others which bet big on AI delivered was that no one else could run AI because only they had the resources to do that. They claimed to have a physical monopoly, and no one else would be able to compete. Enter Deepseek doing exactly what OpenAI and Nvidia said was impossible. Suddenly there is competition and that scared investors because their investments into AI are not guaranteed wins anymore. It doesn’t matter that it’s derivative, it’s competition.

@[email protected] · 3 months ago

Dead internet theory (now a reality) has become the dead AI theory.

@[email protected] · 3 months ago

Tis true. I’m not a real person writing this but rather a dead AI

@[email protected] · 3 months ago

if you can imagine a fish enjoying a succulent chinese meal rn, rolling its eyes

@[email protected] · 3 months ago

@[email protected] · 3 months ago

I SEE YOU KNOW YOUR JUDO WELL

@[email protected] · 3 months ago

RIP limp penis guy.

@[email protected] · 3 months ago

Well, it’s AI, therefore I don’t give a shit.