The database for this site is rather large, which is half the reason I had to scale my primary hosting platform (shared across all my clients/projects) to support this place (DDoSes were part of it too, of course).
No, this isn't a donation post. Rather, know that building a RAG/vector DB required pruning the database down to only five tables (posts, attachments, topics, users, and the forum structure). At less than half the original size, it's still ~220MB on disk and larger in RAM while processing. The resulting vector DB can be 300-700MB if kept in JSONL or another simple format, or you lose the context. Preprocessing the data to restore more structure before the final vector output might help somewhat, but since it takes me hours to crunch output from the SQL input(s), experimenting with that is slow going.
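To give an idea of the pruning step, here's a minimal sketch. The table and column names are hypothetical stand-ins (the real phpBB schema has far more columns per table); it just shows keeping only the fields worth embedding and dumping one JSON object per line:

```python
import json
import sqlite3

# Tiny in-memory stand-in for the forum dump; the real source is a
# MySQL export with dozens of columns per table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (post_id INTEGER, topic_id INTEGER, poster TEXT, post_text TEXT)"
)
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?, ?)",
    [
        (1, 10, "alice", "Scope DSP routing question"),
        (2, 10, "bob", "Try raising the ULLI setting"),
    ],
)

def prune_to_jsonl(conn, path):
    """Keep only the fields needed for RAG, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for row in conn.execute(
            "SELECT post_id, topic_id, poster, post_text FROM posts"
        ):
            rec = {"id": row[0], "topic": row[1], "author": row[2], "text": row[3]}
            f.write(json.dumps(rec) + "\n")

prune_to_jsonl(conn, "posts.jsonl")
```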
The goal of this wasn't to put a chatbot online here, but rather to play around with using various (local and remote) LLMs to query this site on topics I knew well. I did get AnythingLLM working with LMStudio, Jan, and a few external API modules in ComfyUI (it's surprisingly flexible as an inference host and can run many in parallel). I used the current versions of Gemma, Qwen2.x/3, QwQ, etc., often Unsloth distilled builds so I can control where portions of the model load and keep as much VRAM free for context as possible on my 24GB 3090.
All processing into embeddings is just Python scripting, and of course I was using Gemini, Claude, ChatGPT and Grok to generate the scripts and prompts (comparing outputs and testing how well each in turn interpreted our fairly narrow, vertical interest). For output I currently chose JSON (for portability and the ability to convert to any other format) and ChromaDB for a direct conversion to native embeddings, as it's supported by a few tools including AnythingLLM. This gives you essentially the same thing as Nvidia's ChatRTX or any other chatbot that can access files directly, but in the form of a much faster pre-embedded database that better matches the model's native structure.
That's about as far as I've gotten, at least on that task.
PlanetZ Chatbot (LLM vector DB)
Re: PlanetZ Chatbot (LLM vector DB)
Still quite a task, thank you.
Re: PlanetZ Chatbot (LLM vector DB)
I know there are a few people here who've played with AI at various points, so I'm open to ideas on where to take this.
Re: PlanetZ Chatbot (LLM vector DB)
For me, AI's only use is helping with making music videos. Basically, I've found no other use for AI at either work or play so far. But then again, in the 1980s I considered Zargon Chess on my Z80 computer to be artificial intelligence. I could pick levels of play to either win or lose, no different than playing a human.
It would be cool if you created a ChatPZ where you could query everything in PZ, Scoperise, Sonic Core etc. with one engine
Re: PlanetZ Chatbot (LLM vector DB)
The audio platforms are gaining stem separation now. Using it for traditional sample digging is fun too (make me some funk beats please), even if you swap out or layer bits to have higher fidelity in the final form. And the LLM platforms are useful at all kinds of documentation, as well as coding and planning. I use them a lot.
But this was a desktop exercise to see how feasible moving in that direction might be. My conclusion? To provide a web chatbot, I would need a larger business model with monthly service revenue to support it.
Re: PlanetZ Chatbot (LLM vector DB)
I own an RTX 3090 for upscaling, stem separation and making 3D out of 2D movies. Face swapping is also fun.
With Keras I did some classification and image generation, but never tried LLMs.
So I can only offer to run a 3090 as a training host, as I still lack the skills to build LLMs.
Maybe with the exploding number of AI tools, generating an LLM out of a forum will become easier as time goes by.
\\\ *** l 0 v e | X I T E *** ///
Re: PlanetZ Chatbot (LLM vector DB)
A vector database is not an LLM itself; rather, it converts the data, via transformer embeddings, into an interlinked format that's more akin to a mini-model containing only the data from the source you included. Sometimes these are called "static embeddings" too if not transformed at all, but that's akin to a regular database just running on modern code, and it typically drops the interconnections a vector database can better encapsulate (hence the "vector" in the name).
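The "vector" part boils down to similarity search over embedding vectors. A toy illustration with 3-dimensional vectors (real embeddings are hundreds to thousands of dimensions, produced by a model): the query is compared to every stored vector by cosine similarity, and the closest-pointing one wins.

```python
import math

def cosine(a, b):
    """Cosine similarity: 1.0 = same direction, 0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy 3-dim "embeddings"; a real store holds one vector per post chunk.
store = {
    "post-1": [0.90, 0.10, 0.00],
    "post-2": [0.10, 0.80, 0.30],
    "post-3": [0.85, 0.20, 0.05],
}
query = [1.0, 0.0, 0.0]
best = max(store, key=lambda k: cosine(query, store[k]))
# post-1 points almost the same direction as the query, so it ranks first
```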
Appreciated nebelfuerst.
Btw, LMStudio & Jan can run models locally just fine: 32B models around 18GB in size will load and leave you with about 4GB for context, which is enough for a reasonable amount of interaction.
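The back-of-envelope math behind that "18GB model, ~4GB of context" split, for anyone curious. The layer/head counts below are illustrative for a 32B-class model (roughly Qwen2.5-32B-shaped), not exact figures, and the overhead term is a guess:

```python
# Rough VRAM budget on a 24 GB card with an ~18 GB quantized 32B model.
GiB = 1024 ** 3
vram = 24 * GiB
model = 18 * GiB
overhead = GiB + GiB // 2      # ~1.5 GiB: CUDA context, activations (guess)
free_for_kv = vram - model - overhead

# Illustrative 32B-class shape: 64 layers, GQA with 8 KV heads, head_dim 128.
layers, kv_heads, head_dim, bytes_per = 64, 8, 128, 2  # fp16 cache
kv_per_token = 2 * layers * kv_heads * head_dim * bytes_per  # K + V
max_ctx = free_for_kv // kv_per_token
print(f"~{kv_per_token // 1024} KiB/token -> roughly {max_ctx} tokens of context")
```

With these assumed numbers that works out to ~256 KiB per token and context in the high-thousands-of-tokens range, which squares with "enough for a reasonable amount of interaction"; a quantized KV cache stretches it further.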