I game with friends online, so I’ve always had windows on a second drive. Compatibility has gotten so good though that it’s actually kinda rare that I even need to boot windows anymore. It’s better than ever to be a gamer on Linux.
I game with friends online, so I’ve always had windows on a second drive. Compatibility has gotten so good though that it’s actually kinda rare that I even need to boot windows anymore. It’s better than ever to be a gamer on Linux.
I use machine learning/ai pretty much daily and I run stuff at home locally when I do it. What you’re asking is possible, but might require some experimentation on your side, and you might have to really consider what’s important in your project because there will be some serious trade-offs.
If you’re adamant about running locally on a Rasberry Pi, then you’ll want a RPi 4 or 5, preferably an RPi 5. You’ll also want as much RAM as you can get (I think 8gb is the current max). You’re not going to have much VRAM since RPi’s don’t have a dedicated graphics card, so you’ll have to use it’s CPU and normal RAM to do the work. This will be a slow process, but if you don’t mind waiting a couple minutes per paragraph of text, then it may work for your use case. Because of the limited memory of Pis in general you’ll want to limit what size LLM models you use. Something specialized like a 7b story telling LLM, or a really good general purpose model like Mistral Open Orca 7b is a good place to start. You aren’t going to be able to run much larger models than that, however, and that could be a bit creatively limiting. As good as I think Mistral Open Orca 7b is, it lacks a lot of content that would make it interesting as a story teller.
Alternatively, you could run your LLM on a desktop and then use an RPi to connect to it over a local network. If you’ve got a decent graphics card with like 24gb of VRAM you could run a 30b model locally, and get decent results fairly fast.
As for the 10k words prompt, that’s going to be tricky. Most LLMs have a certain number of tokens they can spit out before they have to start up again. I think some of the 30b models I use have a context length of 4096 tokens… so no matter what you do you’ll have to tell your LLM to do multiple jobs.
Personally, I’d use LM Studio (not open source) to see if the results you get from running locally are acceptable. If you decide that its not performing as well as you had hoped, LM studio also generates python code so you could send commands to an LLM on a local network.
My reaction too. This is fantastic!
Those few employees are probably going to all be developers, and despite there being a bunch of mathematics and engineering involved, being a developer is very much a creative process. Similarly, I wouldn’t begrudge a digital artist for wanting to use a Mac to do their work.
If a developer is asking for a thing, they’re not asking for it because they’ve suddenly developed a nervous tic. There’s typically a reason behind it. Maybe its because they want to learn that thing to stay relevant, or explore it’s feasibility, or maybe it’s to support another project.
I used to get the old “we don’t support thing because nobody uses thing” a lot. The problem with that thinking is that unless support for whatever thing immaculates out of nowhere it’ll just never happen. And that’s a tough sell for a developer who needs to stay relevant.
I remember in like 2019 I asked for my company to host git repos on the corporate network, and I got a hard no. Same line, there wasn’t a need, nobody uses git. I was astounded. I thought my request was pretty benign and would just sail right through because by that point it was almost an industry standard to use git. I vented about it to some devs in another department and learned that they had a system with local admin attached to the corporate network that somehow IT didn’t know about. They were using that to host their repos.
I guess what I’m trying to say is that if keeping employees happy is too expensive, then you gotta at least be aware of the potential costs of unhappy employees.