- No Longer a Nincompoop
Google's "ChatGPT" Moment
Quick update: I’m now offering a Chief AI Officer service.
We turn time-consuming, mundane tasks into automated workflows that boost your bottom line.
If you’re interested, reply to this email with “Chief AI Officer” and let me know one thing you’d like me to automate in your business.
Here’s the tea 🍵
The Search Wars Continue 🔎
Google’s “ChatGPT” moment 💬
Anthropic has a major release, but is the tech overhyped? 🤔
Understanding your docs has never been easier 📄
OpenAI recently released their new Search functionality - ChatGPT Search. If you’ve used Perplexity, this is nothing new.
However, this is a bigger deal than I first thought, for a few reasons.
1. ChatGPT has a lot of coverage. A lot of people use it every single day; it is one of the most visited websites on the planet.
No other AI company comes close.
If people see that it can also search the internet, they won’t hesitate to use it over Google.
Is this going to lose Google money?
Perhaps.
However, what this does do is set a precedent.
People are getting used to using AI that can also search the internet and be grounded in some truth.
Why bother doing extensive research when you have ChatGPT? (Most people probably don’t do extensive research anyway, but that’s beside the point.)
But, there’s a bigger play here.
2. ChatGPT Search uses a number of sources to obtain information when searching the web.
The main source? Bing’s search index.
You see where this is going?
Millions of people are now inadvertently using Bing.
Oh, and if a website is not on Bing or doesn’t rank on Bing, it won’t show up in ChatGPT Search either! [Link]
This is obviously a very big deal for SEO purposes. Companies that have spent a lot of time and money to rank on Google won’t show up in front of people when they search in ChatGPT.
This completely changes how people think about SEO and rankings.
Why bother trying to rank on Google when you won’t even show up in one of the most popular tools on the planet?
What happens when Google releases their own search AI?
Will you have to optimise for both?
(Probably).
But, don’t be fooled. There’s a lot of hype around ChatGPT Search, with lots of people talking about how they’ve stopped using Google altogether.
This is a problem for a few reasons.
One - What ChatGPT Search is essentially doing is pulling information from its sources, like the Bing Search Index, and doing RAG (retrieval-augmented generation) over them.
This is fine if you don’t care too much about factuality. The reality is, LLMs hallucinate.
A search engine retrieves and ranks real pages; an LLM generates text that merely looks like an answer.
Fundamentally, these are two completely different technologies.
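To make the "RAG over a search index" idea concrete, here’s a toy sketch. The tiny keyword scorer stands in for the Bing index, and the prompt-builder stands in for the LLM call; all names here are mine, not anything from OpenAI’s actual system.

```python
# Toy sketch of search-then-generate ("RAG"): retrieve pages for a query,
# then build a prompt that forces the model to answer only from them.
# The word-overlap scorer below is a stand-in for a real search index.

def retrieve(query: str, pages: dict[str, str], k: int = 2) -> list[str]:
    """Rank pages by how many query words they contain (a stand-in for search)."""
    words = set(query.lower().split())
    scored = sorted(
        pages,
        key=lambda title: len(words & set(pages[title].lower().split())),
        reverse=True,
    )
    return scored[:k]

def grounded_prompt(query: str, pages: dict[str, str]) -> str:
    """Build the grounded prompt an LLM would receive; the model call is omitted."""
    context = "\n".join(f"[{t}] {pages[t]}" for t in retrieve(query, pages))
    return f"Answer using ONLY these sources:\n{context}\nQuestion: {query}"

pages = {
    "Bing result A": "NotebookLM lets you upload PDFs audio and video",
    "Bing result B": "Yellowstone is a national park in the United States",
}
print(grounded_prompt("what can you upload to NotebookLM", pages))
```

The catch, as noted above, is the last step: even with good sources in the prompt, the generation step can still hallucinate.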
Two - This is a bubble. Google owns search, and frankly, I don’t see how they lose it.
Let me show you why.
NotebookLM
This is one of the most important and significant technology releases this year.
Google’s NotebookLM lets you understand information like never before.
You can upload PDFs, videos, audio - pretty much anything, and NotebookLM breaks it down for you and helps you learn.
This is Google’s “ChatGPT” moment. This product is so good, it’s hard to overstate how impressive it is.
I mean, you can give it like 50 PDFs and it will just… work.
There is no other tool that even comes close to being this good at information retrieval, and this is why I believe Google won’t lose the search engine battle.
NotebookLM tells us that Google has essentially figured out RAG, or, at the very least, is doing it better than anyone else. We have no idea how they’ve done it, but they have.
Google’s already experimenting with adding AI summaries and answers at the top of Google searches. I won’t be surprised if one day, they just start working.
But, one of the coolest features of NotebookLM is its “Audio overview” feature.
Simply put, given any data you provide, it creates a podcast of two people discussing the information and learning about it.
I can’t express this any other way - it is uncanny how good it is. Seriously. It is so good I get giddy listening to it.
Google’s figured out realistic AI voices and it’s just another feature in NotebookLM. The voices and their conversations are unbelievably realistic. They laugh, they make “mhmm” sounds, they inquire.
But, without a doubt, the craziest thing they do is finish each other’s sentences. Like, it is mind-boggling that this is completely AI generated. I am absolutely fascinated as to how they’ve done it.
So, within a single product, Google has shown us:
World class information ingestion and retrieval
Some of the most realistic AI voices on the planet
The ability for these voices to chat, in sync and mimic human conversation
I highly recommend that you go and try this tool. Google has shipped an absolutely incredible product here. They’ll be releasing an API at some point as well!
This is an example of a podcast based on 3 of my previous newsletters [Link]. (If it doesn’t load, NotebookLM sharing isn’t working properly).
Cannot wait to build products around this.
The downside?
It will absolutely increase the amount of AI-generated content online, be it video, audio or text. No way around that, I’m afraid. As good as it is now, most people wouldn’t have a clue the AI podcasters aren’t real.
Oh, and it’s free.
You can sign up to the business pilot here [Link].
Funnily enough, Meta released an open source version of NotebookLM. It’s obviously not as good, considering it uses tiny models to replicate the behaviour, but the point is, the architecture is simple.
Also, since it’s open source, you can swap any AI model in and out. Right now, it’s using Llama 3.1 1B/8B/70B. Imagine if it used 405B the whole way through.
Even better, swap out all the Llama models for ChatGPT or Claude. I imagine the performance would increase substantially.
I somehow doubt it would be as good as NotebookLM though.
How it’s being used in business
Someone uploaded 10 years’ (!) worth of emails to NotebookLM and started asking questions [Link].
The use cases for something that works this well are innumerable, across every industry. Current CRM companies, and the way they work, won’t survive against something like this.
The way information is stored and retrieved, particularly within businesses, is going to completely change.
I’m seeing this first-hand building dashboards for clients, and it’s clear that the traditional way of doing things is no longer the best way.
The biggest question here is - For humans, what’s the best form factor?
What is the most intuitive and easy way for us to digest information?
How much flexibility should there be?
Is there a one-size-fits-all approach that will work for everyone?
UX is now the most important component. Entire industries will be rebuilt because of UX paradigm shifts.
We are on the search for the best way for humans to work.
Google is shipping 🚢
There’s more???
Google is on fire.
Let’s say NotebookLM didn’t create the podcast exactly how you wanted it. Can you do anything about this?
Not really.
But, you could just use Google Illuminate, a new experimental website that lets you generate tailored AI podcasts based on an uploaded PDF.
With Illuminate, you can even suggest how long the podcast should be, who it’s for and the tone of the podcast.
I don’t generally advocate for AI-generated content, but this is genuinely high-quality content. I’ve tested both this and NotebookLM, and I’m seriously impressed by the results.
These are podcasts I would consider listening to, or, at the very least, I wouldn’t call them AI slop. You can generate 20 podcasts in a day.
There’s just one caveat.
Illuminate is mainly for research papers. If you want a podcast of other types of content, NotebookLM is your friend.
I don’t know why Google has different teams building similar products like this but it’s working. They are releasing some very cool products.
Having all of these tools in a single API and being able to tweak how you want the output is going to be incredibly powerful.
Many new apps will be built with this tech.
Anthropic’s Computer Use
Anthropic did something that many AI labs are looking to do right now, which is kind of funny, because they have arguably spoken the most about AI safety and the dangers of extinction-level doom.
So, what did they do?
They gave Claude the ability to use a computer.
Yes, Claude can now use your computer. Naturally, they have tried to stop it from doing anything sensitive like paying for things, but, people have worked around this.
Claude looks at screenshots of your screen and works out how many pixels, vertically and horizontally, it needs to move the cursor so that it clicks the right thing.
Read more about how it was built here [Link].
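That screenshot-to-click loop can be sketched roughly like this (illustrative names only, not Anthropic’s actual API): the model sees a, possibly downscaled, screenshot and returns a pixel coordinate in screenshot space, which the agent must rescale to the real display before clicking.

```python
# Illustrative sketch of the screenshot -> coordinate -> click idea.
# The model returns a coordinate in screenshot space; the agent rescales
# it to the actual screen resolution before moving the mouse.

def to_screen(x: int, y: int, shot: tuple[int, int], screen: tuple[int, int]) -> tuple[int, int]:
    """Map a coordinate from screenshot space to actual screen space."""
    return (round(x * screen[0] / shot[0]), round(y * screen[1] / shot[1]))

# e.g. the model says "click (640, 400)" on a 1280x800 screenshot,
# but the real display is 2560x1600:
print(to_screen(640, 400, (1280, 800), (2560, 1600)))  # (1280, 800)

# A real agent loop would look something like this (pseudocode; needs a
# GUI automation library such as pyautogui):
#   while not done:
#       shot = take_screenshot()
#       action = model.decide(shot)   # e.g. {"type": "click", "x": ..., "y": ...}
#       click(*to_screen(action["x"], action["y"], shot_size, screen_size))
```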
Google is also preparing to launch Project Jarvis, which aims to let an AI take over your computer and do research, buy your groceries and book your flights.
I’ve noticed something strange though.
This feature was released a few weeks ago, and now, no one is even talking about it. It’s just another Saturday…
These are things that previously would be big news, and now, people couldn’t care less.
This isn’t a gimmick, it can actually do things.
We went from “students are using ChatGPT to do their essays” to “ChatGPT is acting as the student online and doing their exams for them” within a year.
What happens when people can automate their entire jobs with something like this?
Sip on your coffee while watching Claude fill out your spreadsheets.
Funny times ahead.
You can try Computer Use yourself from some of these links - [Replit][Mac][Open Interpreter][Anthropic’s Github].
The easiest for me was Open Interpreter. Just follow these steps:
Open terminal, type “cd Desktop”
Now you’re in Desktop. Type “mkdir computer-use” then “cd computer-use”.
Now you’re in a new folder called computer-use.
Type “python -m venv myenv” then “source myenv/bin/activate” to create and start using a virtual environment
Type “pip install open-interpreter”. Then to start it type “interpreter --os”.
You should now see this screen.
Get your Anthropic key from console.anthropic.com. Paste it here then type in the command you want it to execute. To stop it, simply drag your mouse to a corner.
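For convenience, here are those setup steps as a single terminal session (macOS/Linux, Python 3 assumed; the last two commands need network access and an Anthropic key, so run them yourself when ready):

```shell
# The setup steps above in one place (macOS/Linux, Python 3 required).
mkdir -p ~/Desktop && cd ~/Desktop
mkdir -p computer-use && cd computer-use   # make and enter the project folder
python3 -m venv myenv                      # create a virtual environment
source myenv/bin/activate                  # activate it
# Then install and launch (needs network and an Anthropic API key):
#   pip install open-interpreter
#   interpreter --os
```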
A funny story from development: while coding, it took a break and started searching for photos of Yellowstone National Park.
It’s procrastinating lol.
We will solve data ingestion
I’ve been meaning to write about ColPali for quite a while.
Without getting too technical, ColPali is a new way to allow AI to retrieve information from documents like PDFs.
The way it works is, it finds relevant pages based on a question, and then provides those pages to a Vision Language Model (VLM) like Claude or GPT-4Vision. The AI looks at the relevant pages and formulates an answer.
(A VLM is a type of AI model that can understand both text and visual data).
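The retrieval step in ColPali works ColBERT-style, via "late interaction": every query-token embedding is compared against every image-patch embedding of a page, and the best matches are summed. Here’s a minimal sketch of that scoring with toy 2-D vectors standing in for real embeddings:

```python
# Toy sketch of ColPali-style "late interaction" (MaxSim) scoring:
# each query-token vector takes the max dot product over a page's patch
# vectors, and the page score is the sum over query tokens.
# Real ColPali gets these vectors from a vision-language model.

def maxsim(query_vecs: list[list[float]], patch_vecs: list[list[float]]) -> float:
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, p) for p in patch_vecs) for q in query_vecs)

query = [[1.0, 0.0], [0.0, 1.0]]   # two query-token embeddings
page_a = [[0.9, 0.1], [0.1, 0.9]]  # page whose patches match both tokens
page_b = [[0.1, 0.1], [0.2, 0.0]]  # mostly irrelevant page

scores = {"page_a": maxsim(query, page_a), "page_b": maxsim(query, page_b)}
best = max(scores, key=scores.get)
print(best)  # page_a — this is the page that would be handed to the VLM
```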
In my experience, this method is extremely effective for document processing. Setting it up, however, has never been the easiest workflow… until now.
Anthropic has released support for PDF parsing, using vision to read documents and provide answers.
It’s really good at reading graphs, diagrams and inferring meaning and understanding across an entire document.
Building AI agents with the ability to parse through dozens of documents just got a whole lot easier.
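As a sketch, sending a PDF to Claude looks roughly like this. The document content block shape below is based on Anthropic’s PDF-support beta at the time of writing, so verify it against the current docs before relying on it:

```python
import base64

def build_pdf_message(pdf_bytes: bytes, question: str) -> dict:
    """Build a user message containing a PDF document block plus a text question.
    Block shape follows Anthropic's PDF-support beta; check current docs."""
    return {
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {
                    "type": "base64",
                    "media_type": "application/pdf",
                    "data": base64.standard_b64encode(pdf_bytes).decode(),
                },
            },
            {"type": "text", "text": question},
        ],
    }

msg = build_pdf_message(b"%PDF-1.4 ...", "What does the chart on page 3 show?")
print(msg["content"][0]["type"])  # document

# Sending it (requires `pip install anthropic` and an API key):
# client = anthropic.Anthropic()
# resp = client.messages.create(
#     model="claude-3-5-sonnet-latest", max_tokens=1024, messages=[msg]
# )
```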
If you want to test it, try this Repl [Link]. Just add your own API key (from here), and you can test with your own docs as well.
If you’re wondering, no, OpenAI’s APIs do not do information retrieval this way. I wouldn’t say it’s strictly better; however, when you have dozens of documents, I would probably use this method over traditional RAG.
Then again, you could also just chuck all your docs into NotebookLM and not worry about any technical implementation at all…
Future of work is going to be very, very interesting.
I’ll be writing more about ColPali, RAG and how I use these techniques and more when building AI products. You can sign up to the Avicenna newsletter to stay tuned for that.
This felt like an old-school, long newsletter. I hope you enjoyed it :).
How was this edition?
As always, Thanks for Reading ❤️
Written by a human named Nofil