The Future of AI in Europe

Regulation is coming for OpenAI

Welcome to edition #19 of the “No Longer a Nincompoop with Nofil” newsletter.

Here’s the tea ☕

  • The company to overthrow OpenAI 🧐

The Wind Blows in France

You’re probably wondering what on Earth I’m talking about. Which company could possibly compete with OpenAI that you wouldn’t already know about? Even the usual suspects - Google with Bard, Anthropic, Cohere and hell, even Microsoft - can’t compete with ChatGPT at the moment. Well, how about Mistral? Who? Mistral is a company that just raised a $113M seed round at a $260M valuation. So their LLM must be pretty good, right? I mean, they did just raise an absurd amount of money from some pretty big investors.

So why haven’t you heard of them? Well, the company didn’t even exist four weeks ago. That’s right. This company that just raised over $100M to build open-source AI models to compete with ChatGPT doesn’t even have a product. There is no MVP, no prototype and no models. There is nothing. I’m not sure I’ve ever seen anything like this before.

So what do they have? A pitch deck! Or at least something that’s supposed to resemble one. It’s actually a six-page document, in Google Docs of all places. So what does it say, and how did they manage to convince investors to pour this much money into a pre-revenue, pre-product, pre-basically-everything company? Let’s dive into it.

The Power of People & Open-Source

One of the main selling points of Mistral is its focus on open-source. They will release not only their data sets but also their models’ architecture and weights, and they will train only on licensed data. Their pitch deck describes the “black-box” system of closed models like OpenAI’s, where users or companies have no idea where and how their data is being managed and used. They suggest that a “white-box” approach, where it is clear how data is used, will not only be more compelling for companies to use, but also be faster and work better.

Even more important, however, is their belief that this approach will incentivise researchers and engineers to work with them. This is probably the most important point in the whole deck and is the reason why they raised so much money in the first place. Firstly, the founders of the company are research scientists and engineers from Google DeepMind and Meta. They have years of experience building LLMs and definitely have the know-how when it comes to building AI models. One of the founders, Guillaume Lample, led the development of LLaMA and was the lead of the LLM team at Meta.

Mistral founding team

Secondly, there is something else about building AI models that most people aren’t aware of. There aren’t that many people on the planet who have the knowledge and experience to do it. Investors believe there are probably only 70-100 other people on the entire planet with the Mistral team’s expertise and knowledge of not only building LLMs but also optimising them. This scarcity is incredibly important, not only in the technical sense but also in a geopolitical sense. The deck claims that many, if not most, of the talented people in LLMs are from Europe and that, according to their “extensive testing”, a large number of them can be convinced to join their project. They also see this as a means to fight off a potential monopoly on AI tech by the US and a way for Europe to compete in the oncoming AI boom.

This is one of the reasons why I think countries like China, Russia and India won’t catch up to the US. It’s not that they don’t have the talent; it’s that it would take time to nurture that talent. With the current rate of change and how things have progressed over the last few months, it would be difficult to catch up. The experienced talent is already in place, and it’s not so easy to find new talent in this field. This is why Mistral exists today, and why they were able to raise so much money in mere weeks. Is it warranted? I guess we’ll find out.

The AI Act

The announcement of Mistral and their philosophy regarding building AI models ties in suspiciously well with the EU’s proposed AI Act. As far as regulation goes, the EU is not only the leader in this space, but it also seems to be moving unbelievably quickly. The proposed AI Act recently passed through the European Parliament with an overwhelming majority. Here are some of the highlights:

  • Banning AI for emotion-recognition in education, policing and workplaces

  • Banning real-time biometrics and predictive policing

  • Banning social scoring like the system used in China

  • Restricting the use of copyrighted material in training LLMs

There is a lot more to unpack here that will likely need its own newsletter, but one thing to note is the focus on copyright. At the moment, we know practically nothing about the training data used in the creation of OpenAI’s GPT-4. However, we do know a tiny bit about GPT-3.

In the GPT-3 paper, there is a table that shows where they collected data from. One of the items is WebText. This data set was created by taking the outbound links from Reddit posts that had received at least 3 karma (roughly, net upvotes). It was never released to the public, more than likely because there are any number of copyrighted articles in that data set. We also know that Reddit data has been used previously to train older models; it’s the reason Reddit made changes to their APIs and caused havoc within their communities, leading to blackouts and calls for boycotts. You can read more about the training of GPT-3 here.
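
To make the Reddit filtering idea concrete, here is a rough sketch of what such a step might look like. This is not OpenAI’s actual pipeline, which isn’t public. The RedditPost record, the select_links function and the min_karma parameter are all made up for illustration, and whether the cut-off is inclusive is an assumption on my part.

```python
from dataclasses import dataclass


@dataclass
class RedditPost:
    """Hypothetical record for a Reddit submission that links to a web page."""
    url: str
    karma: int


def select_links(posts, min_karma=3):
    """Return unique outbound URLs from posts that meet the karma threshold.

    min_karma is an assumed, inclusive cut-off; order of first appearance is kept.
    """
    seen = set()
    selected = []
    for post in posts:
        if post.karma >= min_karma and post.url not in seen:
            seen.add(post.url)
            selected.append(post.url)
    return selected


# Made-up example data, not real Reddit posts.
posts = [
    RedditPost("https://example.com/a", 12),
    RedditPost("https://example.com/b", 1),   # dropped: below the threshold
    RedditPost("https://example.com/a", 5),   # dropped: duplicate URL
    RedditPost("https://example.com/c", 3),
]
print(select_links(posts))
```

Notice that nothing in a filter like this looks at who owns the content behind each link, which is exactly why copyrighted articles can end up in the training set.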

In the end, all signs point to OpenAI having used copyrighted material extensively to train their models. For OpenAI, regulation on this aspect is a big deal, which is probably why they lobbied the EU to water down proposed regulations. I’m sure they’re still lobbying right now.

The Future

Regulation and AI, copyright and training LLMs - the future of technology and essentially the world is being shaped right now. We are living through a period of time that will dictate the course of humanity. We may very well look back and study the rise of AI and its effects on geopolitics, and its impact on the success and failure of nations.

Although we, average people like you and me, may not have much say in shaping such a future, we sure can make use of what we can control, and work to build a better future for ourselves and others. This is one of the most exciting times to build, to solve problems.

You may one day look back and wish for the chance to live through such times again. Don’t regret. Do what you’ve always dreamed of. Build that startup. Take that step. You may never get the chance again.

As always, thanks for reading ❤️

Written by a human named Nofil
