• 1 Post
  • 64 Comments
Joined 5 months ago
Cake day: January 18th, 2026



  • I think if you open up a study, you should (and probably need to) be specific with your terms, since LLMs are just large machine learning models that aren’t trained for a single specific use case. You can achieve very impressive results with small models; you don’t need ChatGPT 5 for document classification. You can also fine-tune these models for specific tasks and/or “lobotomize” them. But go with, for example, a small Qwen model with 36B parameters or fewer and you will get very good results.

    And sure, there are the good old OCR methods, but you’ll need a significant pipeline behind a classic OCR engine, and it would probably still fail to decipher or classify a machine-written document with handwritten annotations. A decent LLM will in most cases be able to differentiate between handwriting and machine lettering and output the two in separate variables, and it might even be able to put the annotations in context with the original document. That is an enormous task to program by hand.

    And when we talk about speed and sustainability: not every document would be thrown at the expensive model first. You would build a layered approach, so that the 95% of documents that are easy get handled by a cheap and fast solution, and only when that has low confidence would you hand the document over to the bigger, slower model.

    Then add graphs or tables to the document and you’ll be nearly completely lost with a classic approach.

    I’ve been working in this field for a couple years, so I speak from personal experience.

    But all those models still have an issue with context sizes, and you and your business pipeline will fail if you don’t know the boundaries of what’s possible today. For the most high-profile cases there should always be a human in the loop. Do companies do that? Most likely not, but they can get in big trouble if they make a critical mistake, at least in Europe; I can’t speak for the Wild West/US.

    Note: You can self-host qwen3.6 with 32 GB, or better 64 GB, of memory and play around with it. It is shockingly good.

    Data gathering and theft of IP is a completely different topic. But “luckily” many people now upload their data for free, directly to one of the big hosting companies. But privacy is also a different topic.

    So again, be very specific when you choose your topic.
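The layered approach described in this comment can be sketched roughly as below. This is only an illustration under assumptions, not the commenter's actual pipeline: the `Prediction` and `Result` types, the two thresholds, and the toy stand-in models are all hypothetical, and sending the lowest-confidence cases to a human reviewer is one possible reading of the "human in the loop" remark.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Prediction:
    label: str
    confidence: float  # 0.0 to 1.0, as reported by the model (assumed)


@dataclass
class Result:
    label: str
    confidence: float
    handled_by: str  # "fast", "large", or "human"


# Any callable that maps document text to a Prediction can act as a layer.
Classifier = Callable[[str], Prediction]


def classify_layered(
    document: str,
    fast_model: Classifier,
    large_model: Classifier,
    fast_threshold: float = 0.9,    # hypothetical cut-off for the cheap layer
    review_threshold: float = 0.6,  # hypothetical cut-off for human review
) -> Result:
    """Try the cheap model first and escalate only when it is unsure."""
    first = fast_model(document)
    if first.confidence >= fast_threshold:
        return Result(first.label, first.confidence, "fast")

    # The cheap layer was not confident enough: use the bigger, slower model.
    second = large_model(document)
    if second.confidence >= review_threshold:
        return Result(second.label, second.confidence, "large")

    # Neither layer is confident: flag the document for a human reviewer.
    return Result(second.label, second.confidence, "human")


# Toy stand-ins so the sketch runs without any real model behind it.
def toy_fast_model(document: str) -> Prediction:
    if "handwritten" in document:
        return Prediction("unknown", 0.3)
    return Prediction("invoice", 0.97)


def toy_large_model(document: str) -> Prediction:
    if "illegible" in document:
        return Prediction("unknown", 0.4)
    return Prediction("annotated_contract", 0.85)
```

In a real setup the two callables would wrap whatever cheap classifier and larger LLM you run, and the thresholds would need tuning on your own documents.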



  • Yeah, it’s really important to specify which cases you mean, because they aren’t the same at all.

    There are even some really good use cases for LLMs in companies. With the declining demographics most European countries face, a goal for a company could be, for example, to become more efficient, since (at the company I worked for, for instance) 30-40% of staff will reach retirement age in the next 10 years. And if you work with documents (which we did), there’s a real benefit in LLMs classifying and compressing these documents. We’re talking tens of thousands a day. The systems in use now lack the flexibility to reliably classify or even read some of those documents. On top of that, you don’t need a 200B+ model for those tasks.

    But that’s the good side in my eyes.

    There are loads more problematic, socio-economic issues with those models, especially around how people learn, decide and interact with each other.

    You’re diving into a really broad field here and you’ll have to pick out very specific cases. It is for sure a super interesting field.

    And on top of that, it’s a really old computer science field, dating back to the 1960s. It’s only now coming to “fruition” because our tech has advanced so much that we can actually process these stupid amounts of data.

    Before OpenAI and others popped up, this was all filed under computational linguistics and data science, which just doesn’t sound as sexy, I guess.


  • OhneHose@feddit.org to Fuck AI@lemmy.world: Research in AI
    6 hours ago

    I mean, AI isn’t inherently bad. The issue is more how the big companies push it.

    AI in research is phenomenal, especially in medical applications. It’s not a black-and-white issue.

    And we are just at the beginning of it, best of luck! You’d also need to differentiate between LLMs (general use cases) and task- or research-specific AI.


  • I wouldn’t use Ollama, it’s basically just a fancy wrapper around llama.cpp.

    There are also modules/Docker containers to hot-swap models with llama.cpp.

    My model hosting setup is: llama.cpp -> Open WebUI

    llama.cpp is running in a local shell on my Mac Mini, since setting up GPU support with Metal is (or was?) a pain. And Open WebUI sits in a Docker container with local storage mounted, so it has persistence when updating or moving the container.

    16 GB of VRAM isn’t much, however; you’ll be limited to fairly low quants. It will be reasonably fast, though. If you can use most of your system RAM, you could host, for example, qwen 3.6 bf8 (~56 GB) or bf4 (~30 GB). It would be slower, but you also gain a lot of usability from that.

    Or you host two models, a smaller one on the GPU and a bigger one in system RAM, so you can switch between “knowledge” and speed.

    With llama.cpp you’ll have to take a look at Hugging Face and use GGUF models.
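The sizing reasoning in this comment (what fits in 16 GB of VRAM versus what fits in system RAM) can be worked through with a small sketch. The two Qwen sizes are the approximate figures quoted above; the 64 GB of system RAM, the 2 GB of headroom left for context, and the small 9 GB model are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelFile:
    name: str
    size_gb: float  # approximate size of the model file


def largest_that_fits(
    models: Sequence[ModelFile],
    budget_gb: float,
    headroom_gb: float = 2.0,  # assumed reserve for context and runtime overhead
) -> Optional[ModelFile]:
    """Return the biggest model that fits in the budget, or None if none does."""
    usable = budget_gb - headroom_gb
    candidates = [m for m in models if m.size_gb <= usable]
    return max(candidates, key=lambda m: m.size_gb, default=None)


CANDIDATES = [
    ModelFile("qwen 3.6 bf8", 56.0),             # size quoted in the comment
    ModelFile("qwen 3.6 bf4", 30.0),             # size quoted in the comment
    ModelFile("hypothetical-small-model", 9.0),  # made up for illustration
]

# Fast model on the GPU, "knowledge" model in system RAM.
gpu_model = largest_that_fits(CANDIDATES, budget_gb=16.0)
ram_model = largest_that_fits(CANDIDATES, budget_gb=64.0)  # 64 GB RAM is an assumption

print("GPU:", gpu_model)
print("RAM:", ram_model)
```

The point of the headroom parameter is that a model file that exactly fills memory leaves nothing for the context, so a real budget should stay below the raw VRAM or RAM figure; how much to reserve depends on the context length you run.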