• @[email protected]
    7
    12 days ago

    ELI5 1-bit model. With three attempts, I got nothing out of it, so I assume it’s a simpler, more energy-efficient model.

    • @[email protected]
      5
      12 days ago

      It’s a massive performance upgrade: it would make current-sized models cheaper to run and tiny phone-sized models viable. The only problem is that models need to be retrained to use it, and afaik no one significant has done it yet.
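Some back-of-envelope arithmetic on why this matters for phone-sized deployment. Numbers are illustrative only (real runtimes add overhead for activations, KV cache, and bit-packing), and the ~1.58 bits figure is just log2(3) for encoding a ternary weight:

```python
import math

# Rough weight-memory footprint for a 2B-parameter model.
params = 2_000_000_000

fp16_bytes = params * 16 / 8            # 16 bits per weight in fp16
ternary_bits = math.log2(3)             # ~1.58 bits to encode {-1, 0, 1}
bitnet_bytes = params * ternary_bits / 8

print(f"fp16 weights:    {fp16_bytes / 1e9:.1f} GB")
print(f"ternary weights: {bitnet_bytes / 1e9:.2f} GB")
```

Roughly a 10x reduction versus fp16 on the weights alone, which is the difference between not fitting and comfortably fitting in a phone’s RAM.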

    • @[email protected]OP
      4
      12 days ago

      i’m not the smartest out there to explain it, but it’s like… instead of floating-point numbers as the weights, it’s just -1, 0, 1.
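A minimal pure-Python sketch of that idea, assuming the “absmean” rounding scheme described for BitNet b1.58 (scale each weight by the matrix’s mean absolute value, then round and clip to ternary); the function name and toy matrix are made up for illustration:

```python
def ternary_quantize(w, eps=1e-6):
    # Absmean quantization: one floating-point scale per matrix,
    # every individual weight becomes -1, 0, or 1.
    flat = [abs(x) for row in w for x in row]
    scale = sum(flat) / len(flat) + eps
    q = [[max(-1, min(1, round(x / scale))) for x in row] for row in w]
    return q, scale

# Toy float weight matrix
w = [[0.8, -0.05, -1.2],
     [0.3,  0.0,   0.6]]
w_q, scale = ternary_quantize(w)
# x @ (w_q * scale) then approximates x @ w, but the matmul needs
# only additions/subtractions, since multiplying by -1/0/1 is trivial.
```

That last comment is the core of the efficiency win: with ternary weights the dominant operation stops being floating-point multiplication.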

  • @[email protected]
    4
    12 days ago

    This wasn’t out? I’ve been hearing about BitNet for a while, just that there wasn’t a good 1-bit model out there.

    • @[email protected]OP
      3
      12 days ago

      it was, it’s just that now they’ve officially released a 2B model trained for the BitNet architecture

  • hendrik
    1
    12 days ago

    Nice. Any additional info on how difficult it was to train this, and whether we can expect more? They have a 3B model in the demo video, but it doesn’t seem like they released that… I mean, I’d like something a bit larger.