• 1 Post
  • 26 Comments
Joined 1 year ago
Cake day: March 22nd, 2024



  • brucethemoose@lemmy.world to Selfhosted@lemmy.world · 1U mini PC for AI? · edited, 14 hours ago

    It’s PCIe 4.0 :(

    but these laptop chips are pretty constrained lanes wise

    Indeed. I read that Strix Halo only has 16 PCIe 4.0 lanes in addition to its USB4, which is reasonable given it isn’t supposed to be paired with discrete graphics. But I’d happily trade an NVMe slot (still leaving one) for an x8 slot.

    One of the links to a CCD could theoretically be wired to a GPU, right? Kinda like how EPYC can switch its IO between infinity fabric for 2P servers, and extra PCIe in 1P configurations. But I doubt we’ll ever see such a product.



  • brucethemoose@lemmy.world to Selfhosted@lemmy.world · 1U mini PC for AI? · edited, 18 hours ago

    If you can swing $2K, get one of the new mini PCs with an AMD 395 and 64GB+ RAM (ideally 128GB).

    They’re tiny, low power, and the absolute best way to run the new MoEs like Qwen3 or GLM Air for coding. TBH they would blow a 5060 TI out of the water, as having a ~100GB VRAM pool is a total game changer.

    I would kill for one on an ITX mobo with an x8 slot.




  • I am on mobile and can be more detailed later if you want, but the gist is to sign up (with a payment method) for some API service. There are many. Some neat ones include:

    • Openrouter (a gateway to many, many models from many providers, I’d recommend this first)
    • Cerebras API (which is faster than anything and has a generous free tier)
    • Google Gemini, which is free to try with no credit card.

    Some great models to look out for, that you may not know of:

    • GLM 4.5 (my all-around favorite)
    • Deepseek (and its uncensored finetunes)
    • Kimi
    • Jamba Large
    • Minimax
    • InternLM for image input
    • Qwen Coder for coding
    • Hermes 405B (which is particularly ‘uncensored’)
    • Gemini Pro/Flash, which is less private but free to try.

    Most (in exchange for charging pennies per request) do not log your prompts. If you are really, really concerned, you can even rent your own GPU instance on demand.

    Anyway, they will give you a key, which is basically a password.

    Paste that key into the LLM frontend of your choice, like Open Web UI or LM Studio, or even the Openrouter web interface.
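    If you're curious what the frontend actually does with that key, it's just an OpenAI-compatible HTTP request. A minimal sketch with only the standard library, assuming the Openrouter endpoint and a placeholder key and model ID (check your provider's docs for the real ones):

    ```python
    import json
    import urllib.request

    API_KEY = "sk-or-..."  # placeholder; use the key your provider gives you

    # The request body: which model to ask, and the chat messages so far.
    payload = {
        "model": "z-ai/glm-4.5",  # any model ID your provider lists
        "messages": [{"role": "user", "content": "Hello!"}],
    }

    # The key travels as a standard Bearer token in the Authorization header.
    req = urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )

    # urllib.request.urlopen(req) would actually send it; frontends like
    # Open Web UI build and send this same request for you on every turn.
    ```

    That's all "pasting your key into a frontend" amounts to: the app stores the string and attaches it to requests like this.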




  • Oh, that stinks. I’d love to be able to publish it somewhere I can get genuine feedback. I’m honing my writing skills again in preparation to pick up an old novel I wrote years ago.

    Even better! Writing, and reading others’ writing, is the way to do it; I feel like my writing skills improved massively after my first fic (hence, I’m trying for better prose in the second).

    I publish on Ao3 too, but after a while I also upload a few chapters at a time to Fanfiction.net. As janky as it is, it still has a lot of readers.

    I dunno where else you should consider publishing… Tumblr? Maybe anime forums? A Fediverse animation group? There might be places with a large One Piece following, including oldschool forums where posting fics is a thing. But I can tell you (from my experience with fandoms) that Reddit subs are not it. I dunno what it is about the format, but deeper lore speculation and fanfics just don’t get much traction there.


  • The power usage is massively overstated, and a meme perpetuated by Altman so he’ll get more money for ‘scaling’. And he’s lying through his teeth: there literally isn’t enough silicon capacity in the world for that stupid idea.

    GPT-5 is already proof that scaling without innovation doesn’t work. So are the open-source models nipping at its heels while being trained and run on peanuts.

    And tech in the pipeline, like bitnet, is coming to disrupt that even more; the future is small, specialized, augmented models, mostly running locally on your phone/PC because it’s so cheap and low power.

    There’s tons of stuff to worry about with LLMs and other generative ML, but future power usage isn’t one of them.


  • All my favorite niches have gone to Discord.

    Freaking Discord. A black hole where anything useful is buried under mountains of folks shooting the breeze. Oh, and with a useless search function.

    Every. Single. Niche. Even self-hosting ones like localllama.

    Reddit may be dead and enshittified, but at least it was accessible to search engines :(

    Hardly matters though, as they just randomly shadowban me anyway…