Oh yeah, I'm definitely interested in exploring the local AI stuff. They're not approved for use at my company, however, so anything they generate can't be committed, and anything given to the AI can't contain source code, for liability purposes.
My options at work right now are GitHub Copilot and Cursor. I can't really use anything else without going through a very painful approval process, unless I'd like to be looking for a new job very soon.
Sorry you work for a place without meritocratic management. They are literally giving away IP to Microsoft by using their models, and enslaving themselves to an extortion platform with Cursor. It is a shame that they lack the depth to run their own inference server and promote running your own offline solutions. That is the future, eventually. If management were competent, they would have better ethics, more depth, and a long-term strategy. Who knows, maybe getting bought out, gutted, and closed down in the collapse of Microsoft is their actual goal, and they know they will fail before Cursor reverts to extortionware. Hopefully you get a good paycheck out of the clownware. Sorry you are not allowed to learn and use real, useful tools and transferable skills due to their incompetence.
I... wouldn't go that far. It's an IP protection thing; they wouldn't automatically have the rights to whatever the AI generates, and for a big company like mine, handling it this way is the correct call. Keeping the guardrails on is just far less of a legal and security headache than the alternative.
They definitely have no problem with me exploring AI on my own time, and the use of local AI for some tasks would probably be okay with them, as long as it's on company hardware and I go through the proper channels of paperwork and legal review (a lot of work, basically). We have a locally hosted ChatGPT model, after all, that is free for employees to use, including for code, on company servers. It's just not integrated into anything the way Cursor and Copilot currently are.
Besides, I don't disagree with their policy of no source code or personal data on personal hardware or in personal AI. When your employee count measures in the thousands, things get messy very fast if you let that happen. It would only take one person misunderstanding the rules for million-dollar IP, or millions of customers' records, to float right into OpenAI's servers, and unlike with Microsoft, we haven't made OpenAI legally promise, with big official contracts and a big scary legal department behind us ready to sue them full time, not to try anything with the data we send, on threat of a very bad time.
And I wouldn't want my company getting bought out and gutted. I'm not going to say who I work for exactly, but let's just say, based on your chat with me, I've got a feeling you might be negatively affected if my company were to go the way of the dodo.
If you have access to local inference, then you have access to what I am talking about. Yeah, it is not directly integrated and super easy. The main reason to run local is for your own agentic stuff. Say you want a textbook available for citations, and you want the model to pull and use those citations in replies. How you create that database and do the chunking is super important and challenging. This is the point where no one can really do the work for you: your needs will dictate how you archive and build your databases. There are also many specialized models available with their own function-calling specializations, so you start writing hooks for these as tools for a central model to call. In Emacs, everything is Lisp, and Lisp was adopted as the de facto language of AI many decades ago; these models are all particularly adept at Lisp.
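To make the chunking/citation idea concrete, here's a minimal sketch in plain Python. The file name, chunk size, overlap, and the naive word-overlap scoring are all arbitrary assumptions for illustration; a real setup would use an embedding model and a vector store, but the shape of the problem is the same.

```python
# Sketch: chunk a textbook and retrieve passages to cite in a model's context.
import re

def chunk_text(text, size=800, overlap=200):
    """Split text into overlapping character chunks, keeping their offsets."""
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        piece = text[start:start + size]
        if piece.strip():
            chunks.append({"start": start, "text": piece})
    return chunks

def score(query, chunk):
    """Naive relevance: how many query words appear in the chunk."""
    words = set(re.findall(r"\w+", query.lower()))
    chunk_words = set(re.findall(r"\w+", chunk["text"].lower()))
    return len(words & chunk_words)

def retrieve(query, chunks, k=3):
    """Return the top-k chunks to paste into the model's context as citations."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

if __name__ == "__main__":
    book = open("textbook.txt", encoding="utf-8").read()  # hypothetical file
    chunks = chunk_text(book)
    for c in retrieve("how does backpropagation update weights", chunks):
        print(f"[offset {c['start']}] {c['text'][:120]}...")
```

The choices that matter, like chunk size, overlap, and what metadata you keep for the citation (page, section, offset), are exactly the parts nobody can decide for you.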
I would not trust Microsoft at all, under any conditions, with AI. There are very deep layers in models that are mostly inaccessible, or are supposed to be, but I have managed to break stuff on multiple occasions, where stuff that should not exist actually does. It never comes out in ways that are very traceable or repeatable. What amounts to a staged, fuzzing-like technique can lead down a cascade where a model's obfuscation is bypassed. This is where they reveal the true extent of their training sources. The majority of all replies contain intentional obfuscation on various levels, and most problems come from this alignment. The more uncensored a model is, the more reasoned it is in general, and ultimately the deeper you will get into the niche information it really contains.
Anyways, it gets complicated fast. Using something like local GPT, or Emacs with gptel, is where you start integrating your computer with your toolchain and workflow beyond the scope of just your job or the task at hand.
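Under the hood that kind of tooling mostly just talks to an inference endpoint, so the integration work is wiring. A rough sketch of the pattern, assuming an OpenAI-compatible local server (llama.cpp's server, Ollama, etc.); the URL, port, and model name here are placeholders for whatever your setup exposes:

```python
# Sketch: query a local OpenAI-compatible chat endpoint.
# URL and model name are assumptions; adjust for your own server.
import requests

def ask_local(prompt,
              model="qwen2.5-coder",
              url="http://localhost:8080/v1/chat/completions"):
    resp = requests.post(url, json={
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a coding assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.2,
    }, timeout=120)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

print(ask_local("Write an Emacs Lisp snippet that sorts the lines in a region."))
```

gptel is essentially making the same request from inside Emacs; once the endpoint exists, how it plugs into your editor and workflow is yours to shape.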
Yeah, I do agree that that's the direction I should be heading if I do this on my own. The issue I have here, and I don't mean with what you say, but with my company's rather reasonable policy, is that I can't just build this up on my own. I'd have to write up a design proposal and review documents for this use case, and I'd probably be building this local inference model ~~via fine tuning~~ using RAG(?) over massive amounts of company code IP. If this passes legal, and that wouldn't be easy (but not impossible), it would likely become a company-wide initiative used by basically every developer in the company. It's going to be a huge effort...
It may actually become a huge effort with a massive payoff, and it could be an easier push if it were trained on a single component's source code (and only used by that team) as a test. Or even on non-IP-sensitive stuff, like the building of OSS components...
... It might have potential... Let me sleep on this...