Niche Model of the Day: Nemotron 49B 3bpw exl3

brucethemoose@lemmy.world · edit-2 2 months ago

Niche Model of the Day: Nemotron 49B 3bpw exl3

projectmoon@forum.agnos.is · 2 months ago

What are the benefits of EXL3 vs the more normal quantizations? I have 16gb of VRAM on an AMD card. Would I be able to benefit from this quant yet?

massive_bereavement@fedia.io · 2 months ago

AFAIK ROCm isn’t yet supported: https://github.com/turboderp-org/exllamav3

I hope the word “yet” means that it might come at some point, but for now it doesn’t seem to be developed in any form or fashion.

Fisch@discuss.tchncs.de · 2 months ago

There’s a “What’s missing” section there that lists ROCm, so I’m pretty sure it’s planned to be added

brucethemoose@lemmy.world · edit-2 2 months ago

That, and exl2 has ROCm support.

There was always the bugaboo of uttering a prayer to get rocm flash attention working (come on, AMD…), but exl3 has plans to switch to flashinfer, which should eliminate that issue.

brucethemoose@lemmy.world · edit-2 2 months ago

^ what was said, not supported yet, though you can give it a shot theoretically.

Basically exl3 means you can run 32B models, totally on GPU without a ton of quantization loss, if you can get it working on your computer. But exl2/exl3 is less popular largely because it’s PyTorch based, hence more finicky to setup (no GGUF single files, no Macs, no easy install, especially on AMD).

Toes♀@ani.social · 2 months ago

How does it answer the question. “Using lambda calculus attempt to approach a solution for N=NP.”

brucethemoose@lemmy.world · edit-2 2 months ago

Heh, calls N=NP out about as politely as it can:

Long answer:

What a delightfully ambitious request!

Before diving into the attempt, let’s set the stage with some clarifications and limitations, as tackling the P vs. NP problem (not N=NP, which seems to be a typo, as the question is whether P=NP) in a single response, especially through lambda calculus, is highly non-trivial. The P vs. NP problem is a fundamental question in computer science regarding the relationship between computational complexity classes P (Polynomial Time) and NP (Nondeterministic Polynomial Time).

Clarifications and Limitations

Typo Correction: Assuming the intent was to explore the P vs. NP problem, not “N=NP” (since N isn’t a standard complexity class in this context).

Lambda Calculus Role: We’ll use lambda calculus to explore conceptual underpinnings or provide a foundational perspective, rather than a direct proof, due to the problem’s complexity.

No Direct Solution Expected: Given the problem’s openness for over four decades, a solution here is highly unlikely. Our approach will be more about illustrating a thought process or a specific aspect related to the problem.

Approach Using Lambda Calculus for Insight into P vs. NP Step 1: Understanding the Classes through Lambda Calculus

P (Polynomial Time): Can be thought of as the set of decision problems solvable by a lambda calculus expression that normalizes in a number of beta reductions polynomial in the size of the input. Essentially, this means we can solve the problem in a polynomial amount of time.

NP (Nondeterministic Polynomial Time): More abstract to directly map, but conceptually, it involves problems where a proposed solution can be verified in polynomial time (similar to P’s normalization condition). In lambda terms, this might imply an existential quantification over all possible “paths” of computation (non-deterministic choices) that lead to a verification lambda expression that normalizes in polynomial time.

Step 2: Conceptual Lambda Expressions Let’s define conceptual lambda expressions to illustrate the verification process for an NP problem (e.g., SAT - Satisfiability Problem):

Verification Function for SAT (Conceptual): code

λsatVerify : (formula: PropositionalFormula) → (assignment: Assignment) → Bool = λformula.λassignment. evalFormula formula assignment evalFormula is a lambda expression (or a set of them) that evaluates a propositional formula under a given assignment, normalizing to True or False in polynomial time in the size of formula and assignment.

Step 3: Exploring P=NP through Lambda Calculus Lens To conceptually approach the P=NP question, we’d need a lambda expression that not only verifies a solution in polynomial time (easy part, as shown) but also finds a solution in polynomial time for any NP problem, given only the problem statement.

Hypothetical Solver Expression (if P=NP): code

λnpSolver : (problem: NPPProblem) → (solution: Solution | “No Solution”) = λproblem. … (Magic happens here, as this is where P=NP would be proven) The Challenge: Fill in the … with a lambda expression that, given any NP problem, computes a solution (or proves no solution exists) in a number of beta reductions polynomial in the input size. This is where our attempt hits the wall of current knowledge, as devising such an expression for all NP problems is essentially solving the P vs. NP problem.

Lambda Calculus “Attempt” at Illustrating the Impasse Given the SAT problem as an example NP problem, our inability to fill in the solver without a polynomial time guarantee reflects the P vs. NP impasse:

lambda

– Verification is straightforward (in P) satVerify = λformula.λassignment. evalFormula formula assignment

– The Elusive Polynomial Time Solver (if P=NP) satSolver = λformula. ? (PolynomialTimeMagic formula) – IMPASSE Conclusion

Lambda Calculus provides a powerful tool for conceptual exploration of computational concepts, including the underpinnings of P and NP.

Directly tackling P vs. NP with lambda calculus (or any method) in this format is infeasible due to the problem’s complexity and the current state of knowledge.

Insight over Solution: This approach offers more insight into how one might think about these classes conceptually rather than providing a solution.

Future Directions for a More In-Depth Exploration

Deep Dive into Lambda Calculus and Complexity Theory: For a more nuanced understanding, delve deeper into how lambda calculus models computation and the formal definitions of P and NP.

Explore Specific NP Problems: Attempt to construct solver expressions for specific NP problems, though this won’t generalize to all NP problems without solving P vs. NP.

Study Known Results and Reductions: Understand how problems are reduced to each other within NP, and how this might inform a lambda calculus approach.

References for Further Reading

Lambda Calculus: Barendregt, H. P. (1984). The Lambda Calculus: Its Syntax and Semantics.

Complexity Theory and P vs. NP: Papadimitriou, C. H. (1994). Computational Complexity. Arora, S., & Barak, S. (2009). Computational Complexity: A Modern Approach.

Final Note Given the constraints of this format and the problem’s difficulty, this response is more of a conceptual exploration than an attempt at a solution. The P vs. NP problem remains one of the most significant open questions in computer science, with a resolution requiring a deeper insight or a novel approach that hasn’t been captured in this simplified lambda calculus framing.

Toes♀@ani.social · 2 months ago

Oh my, this worked much better than I expected. Thanks

brucethemoose@lemmy.world · edit-2 2 months ago

Yeah, it’s an Nvidia model trained for STEM, and really good at that for a ‘3090 sized model.’ For reference, this was a zero-temperature answer.

exllamav3 is a game changer. 70Bs sorta fit, but I think 50B is the new sweetspot (or 32B with tons of context).

Niche Model of the Day: Nemotron 49B 3bpw exl3

Niche Model of the Day: Nemotron 49B 3bpw exl3

turboderp/Llama-3.3-Nemotron-Super-49B-v1-exl3 at 3.0bpw