Artificial Intelligence – LLMs

Like most people, I am fascinated by the advent of Large Language Models, or what we generally call Artificial Intelligence. Like most people, I use it in my day-to-day tasks, both work and personal. And, like most people, I want to understand AI better: what it means for our march forward, and whether it lives up to the hype cycle. As an early adopter of OpenAI’s ChatGPT and someone fascinated by Large Language Models (LLMs) in general, I think I can add some context.

Does it match up to the hype?

Perhaps. LLMs are really Search on Steroids. There’s also some confusion out there as to what is AI and what is not: LLMs are AI, but not all AI is LLMs. What we have combines searching with relevancy to a user’s prompt and delivers results quickly, in a way that makes sense to that user. However, there isn’t much on offer that you can’t find somewhere on the internet; finding it yourself just requires a human searching, sorting for relevancy, digesting, and judging, all of which takes time and some expertise. Like classic search, an AI is really only as good as its corpus, the training datasets it was built on. It cannot come up with new and novel ways of thinking; that would be Artificial General Intelligence, and it’s questionable whether that is on the horizon without some major advancements in processing power and technology. I don’t want to downgrade today’s LLMs, which are impressive for sure, but they are not AGI.

Understanding what makes them impressive requires a slightly deeper dive. What we have is an LLM that has made a massive leap in parallel processing and in understanding how text works by using “transformers,” introduced in the “Attention Is All You Need” paper from Google Brain. What we have now is directly related to that work. If an AI “expert” isn’t familiar with tokens and transformers, they are not really an expert. I’m not trying to criticize anyone specifically; I just became really frustrated with YouTube talking heads explaining how AI works while totally whiffing on this development. Today’s AI no longer processes text sequentially. It still uses probability, but in a larger way that incorporates context, so its responses are more meaningful. This is the secret sauce.
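To make that concrete, here is a toy sketch of the scaled dot-product attention at the heart of the transformer. This is my own illustration, not from the paper or any library: real models use learned weight matrices and many attention heads, but the core idea survives even in this stripped-down, pure-Python version. Every output is a weighted blend of all the value vectors at once, which is what lets the model weigh context in parallel instead of marching through text one token at a time.

```python
import math

def softmax(xs):
    """Turn raw similarity scores into weights that sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention for a single head.

    Each output vector is a weighted mix of *all* value vectors,
    so every token can attend to every other token in one pass.
    """
    d = len(keys[0])  # key dimension, used for the sqrt(d) scaling
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Blend the value vectors by those attention weights.
        outputs.append([sum(w * v[i] for w, v in zip(weights, values))
                        for i in range(len(values[0]))])
    return outputs
```

With toy two-dimensional vectors, a query that points the same way as the first key pulls its output toward the first value vector, and vice versa; the “meaning” of each position is shaped by everything around it.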

One reason AI is causing so much consternation is that it caught many people by surprise. It was also introduced at a time when investors were starving for the next “big thing,” as it had become clear to most that blockchain probably wasn’t going to be it. That is why it’s getting such massive investment now and being incorporated into so many existing products. Nobody wants to be left behind, and it certainly fixes a lot of problems, particularly customer service bottlenecks. Future models may revolutionize healthcare and education, and I am excited for its application there.

What is next for AI?

No one truly knows. Many people say that today’s AI is the worst it will ever be, because tomorrow’s will be better. That is a wild guess. There is no empirical evidence to support it, because this is uncharted territory. Many people assume that since the internet blew up in a certain way, or processors kept getting faster and cheaper, things should play out in AI’s favor too. They believe the “AI Winter” is over. I’m not so sure, and I wonder if this is a case of false equivalence. There are roadblocks for AI in accessing viable training sets and affordable computing power, both of which are needed to really democratize it. Running these things at scale isn’t cheap. Anyone who has dabbled in the cloud knows how cheap storage is and how utterly expensive compute can be, and LLMs run on the latter. This is why OpenAI partners with Microsoft: they are desperate for affordable cloud compute.

A little more on the dataset problem. We’re now seeing some backlash against it. People are closing down sites and putting content behind paywalls. Large message boards are auctioning off data rights to web scrapers, angering users who offered that knowledge for free. Lots of AI drivel is being dumped into blogs to drive affiliate links or search rankings, and it will just get swept back up into the training data, introducing a potential “picture of a picture of a picture” degradation problem. A battleground is forming in this area, and it will be something to follow.

What can we do about AI?

Nothing. Incorporate it into your life. Use it as a tool. I rely on it to flesh out my own knowledge, and I am certainly catching up with colleagues who have deeper development experience. As someone with a deep analytical background, adding to my own hard tech skills sets me apart. I write more complex SQL queries now, I get more out of existing tools and better understand how they work, and I sometimes write short Python programs to automate portions of my workflow. I’ve used it in my volunteer work too: I built our Scouting Pack’s website as a simple HTML/CSS document served from an S3 bucket, saving the Pack quite a bit of money and giving it a way to reach parents. There is going to be a divide between those who roll AI into their lives and use it to scale up their knowledge, and those who ignore it.
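To give a flavor of the kind of short workflow script I mean, here is a hypothetical example of something an LLM can help you draft in a minute or two: a little utility that tallies file extensions under a folder so you can see what’s piling up. The function name and behavior are my own invention for illustration, not from any particular tool.

```python
from collections import Counter
from pathlib import Path

def tally_extensions(folder):
    """Count file extensions under the given folder, most common first."""
    counts = Counter(
        p.suffix.lower() or "(none)"       # group extensionless files together
        for p in Path(folder).rglob("*")   # walk the tree recursively
        if p.is_file()
    )
    return dict(counts.most_common())
```

Nothing fancy, but it is exactly the class of five-minute automation that used to mean hunting through documentation and now means describing what you want in a sentence.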

One recommendation I can make is to learn how to write a good prompt. The more ambiguity you remove, much as when writing a good Google query, the more relevant your results will be. Although the hallucinations inherent in earlier models seem to be going away, there is still a chance that results will be skewed, so you need enough background in what you are asking about for the answer to pass the “sniff test.” I’m not sure someone can get away with working in a totally foreign field using ChatGPT solely as a knowledge crutch, and I’m not sure that would be the best use or an honest approach in general. It is an amazing helper though, especially if you work in tech. In the old days, if you wanted to know something you had to read the manual, ask someone senior, or hope some kind person had mentioned it in a Stack Overflow post somewhere. Now you can scale yourself up quicker, which helps those early in their careers and those looking to expand their domains.
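One way to think about removing ambiguity is to treat a prompt as having explicit parts: a role, a task, constraints, and a desired output format. The sketch below is purely illustrative, with my own made-up helper and example wording, but it shows the difference between a vague ask and a specific one.

```python
def build_prompt(role, task, constraints, output_format):
    """Assemble a low-ambiguity prompt from explicit parts."""
    lines = [f"You are {role}.", f"Task: {task}", "Constraints:"]
    lines += [f"- {c}" for c in constraints]          # one bullet per constraint
    lines.append(f"Respond with: {output_format}")    # pin down the output shape
    return "\n".join(lines)

# A vague prompt leaves the model guessing at intent, dialect, and format.
vague = "fix my query"

# A specific prompt answers those questions up front (example wording only).
specific = build_prompt(
    role="a senior PostgreSQL engineer",
    task="Rewrite the SQL below to avoid a sequential scan on a large table.",
    constraints=["Keep the result set identical", "Target PostgreSQL 15"],
    output_format="the revised query, then a one-paragraph explanation",
)
```

The same habit carries over whether you are typing into a chat box or calling a model from code: state who the model should be, what exactly it should do, what it must not change, and what shape the answer should take.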

I would also like to see what it does for product development. One of the largest bottlenecks in technology is the shortage of developers relative to the demand for their talents, and I am curious what AI will do here. The agents now coming out that are specifically geared for this work, if they can be scaled and utilized, are going to be game changers. I know some AI-assisted IDEs like Cursor are wowing developers and gaining market share, while others like Copilot seem to be turning developers off. Perhaps that will shift as the integrations mature and focus on what developers actually want in an AI assistant.