🤔 Can you depend on the answers AI gives?

Nope, and here's why


“I have a bad feeling about this.” ― Luke Skywalker, Star Wars: Episode IV

LLMs lie about the facts all the time. But Google decided to add them to search anyway. We’ll look at why someone should’ve had a bad feeling about that.

In today’s newsletter:

  • Can you depend on the answers AI gives?

  • Startups

  • Papers

  • Learn

Penses

LLMs are the current tech hype because they are very easy for a non-techy consumer to use. But that doesn’t free them from having tons of pitfalls.

Google released a new feature called AI Overviews, and here are some of the more comical entries that people shared across the web.

If your brand is as entrenched in everyday life as Google, maybe you don’t need to worry about it as much. But for some businesses, this could make the brand take a major hit.

Three ways to mitigate the effects of hallucinations:

  1. Don’t rely on external data. Instead of building a Large Language Model (LLM), you can build a Small Language Model (SLM) tailored to your information.

  2. There’s also a method of data augmentation called Retrieval Augmented Generation (RAG). RAG is basically attaching a vector database of your information to an LLM so it can retrieve relevant passages and ground its answers in them.

  3. Fine-tuning the model is a bit more time-consuming. It requires additional prompt/answer pairs to train the model on more information, but it gives you more control over the answers to common questions.
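To make the RAG idea in option 2 a bit more concrete, here is a minimal sketch in Python. It is a toy under loud assumptions: the documents are made up, the "embedding" is a plain bag-of-words count rather than a real embedding model, the "vector database" is just an in-memory list, and the function names (`embed`, `retrieve`, `build_prompt`) are hypothetical. It only shows the shape of the technique: store your information as vectors, pull the closest matches for a question, and hand them to the LLM as context.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base; in practice this would be your own documents.
DOCUMENTS = [
    "Our support line is open Monday to Friday, 9am to 5pm.",
    "Refunds are issued within 14 days of purchase.",
    "The premium plan includes priority support and unlimited projects.",
]


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Stand-in for the vector database: each document stored next to its vector.
INDEX = [(doc, embed(doc)) for doc in DOCUMENTS]


def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(INDEX, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(question: str, k: int = 2) -> str:
    """Assemble the prompt that would be sent to the LLM, with retrieved context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, k))
    return (
        "Answer using only the context below. "
        "If the answer is not there, say you don't know.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(build_prompt("How long do refunds take?"))
```

The instruction to answer only from the supplied context is what pushes the model away from making things up. The vector databases covered under Startups would take the place of the in-memory `INDEX` list once you have more than a handful of documents.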

Get value stock insights free.

PayPal, Disney, and Nike recently dropped 50-80%. Are they undervalued? Can they recover? Read Value Investor Daily to find out. We read hundreds of value stock ideas daily and send you the best.

Startups

Vector databases received a lot of funding as a result of the LLM boom. Each takes a slightly different approach to configuration. Here’s an interesting benchmark article on the topic.

Here are a few you should probably watch:

Qdrant ($28M) - Open source option

Pinecone ($138M) - Managed database solution with free version

Chroma ($18M) - Open source option

Weaviate ($66M) - Open source option

Papers

Here are some of the insights I gathered from papers today:

  • This paper has a very large graph of papers on the causes, risks, and mitigation of hallucinations. Common causes of hallucinations are architecture, attention glitches, samples, and the list goes on. Read more here.

  • According to these researchers, LLMs are always going to hallucinate, mostly because of time complexity. They get into a lot more of the math and theory here.

Learn

An intro to vector databases

A different take on vector databases

 

Interested in a Data Science mentor? For the next 11 days, we have a limited-time offer just for you.

Sometimes, you just need help...

And oftentimes help is hard to find.

You get stuck troubleshooting packages.

You can't get your environment set up.

You create projects that anyone can find on Kaggle.

You don't contribute to open source projects because you don't know how.

You have no idea how to read research papers.

And you're wasting hours bumming around Stack Overflow, only to never get anywhere.

Well, we're testing something out in the Geek's studio. It may not work. Heck, you may think it's a bad idea. But we won't know unless we ask. Since we want to see more people building in AI, we're starting a study group and partnering with you along your learning journey.

For the next 11 days only, I'm opening the gates for applications for beta testers. There's a 30-day free trial to see if it's a good fit for you. After that it's $97 if you're coming solo, and 20% off if you bring a friend along for the ride.

We are limiting spaces and we’re selecting people based on their applications.

DM me if you’re interested in learning more. Share if you know someone who might be interested.

Was this email forwarded to you? Sign up here.

This has been A Geeky Production