Karpathy’s Auto-Research — 701x System Explained
- Channel: Dream Labs AI
- Video: Watch on YouTube
- Tags:
karpathy system 701x auto research
Overview
Dream Labs AI covers Andrej Karpathy’s auto-research system: its 85K GitHub stars, how Shopify’s CEO used it to speed up code by 53%, and how it can be pointed at any part of your business.
Key Takeaways
- Andrej Karpathy, the godfather of modern AI, has just released a simple system called auto-research, which the top AI labs in the world have been spending millions trying to create. Since he released auto-research, it’s got over 85,000 stars on
- Ads, your landing pages, your AI skills, AI agents, or even your organic content. And it forces each part of your business to self-improve at the speed of light while you sleep. So, in this video, we’ll break down Karpathy’s
- Ask in return is that you hit that like button down below, grab your Slovakian flag, and let’s jump in. Okay, so the fun started here with Andrej Karpathy tweeting, which got another 11 million views for a highly technical tweet, which
- Human, which is me and you, will iterate on the prompt, giving the agent a set of instructions on what we want to improve, and then the AI agent will iterate on the training code, or whatever asset it is that
- AI agents to make the fastest research progress indefinitely and without any of your own human involvement, which is why we can be asleep. In the image below, this auto-research image here, is his LLM getting smarter and smarter
- It smarter for a very long time and thought it was done. But his auto-research tool found another 11% improvement in its IQ. And this paragraph below the image might be one of the most fascinating things to read in
- With each other once in a while using sound wave interconnect, which is obviously speaking, inside a group meeting. It’s a facetious take on how slow we humans are at these jobs compared to something like this auto-research system.
- And so the repo is up on GitHub, which is a crazy thing: Andrej Karpathy just open-sourced it and made it free for anyone to use. It’s called auto-research, and you can see it has 85,000 stars at the time that
- Insane. Before he went to bed, he set it up with these experiments. And by the time he woke up, he woke up to a plus 19% score after 8 hours and 37 experiments, literally improving his code while he
- Says, okay, I ran auto-research on the Liquid codebase, and it’s now 53% faster with 61% fewer object allocations. He says this is probably somewhat overfit, but there are some absolutely amazing ideas in this, showing the iterations of auto-research
- A set of instructions on what you want improved. The bottleneck is no longer compute, it is your program.md, which is the instructions you’re giving your agent. But I personally don’t want to apply this to codebases or AI
- It’s a recipe slash idea: give it to your agent and apply it to whatever you care about, which is where our magic begins. Because even Chamath Palihapitiya had a potential use case, which is kind of mind-bending to
- Up your AI agent with a TikTok or Instagram account and have it constantly producing incredible-quality content, learning from the results of each piece of content in this loop feature that I’m about to reveal to you, and improving
- Train, and prepare. But to me, that’s just confusing and unnecessary. We have the instructions file, which is locked off from your AI agent and only used by me and you, the humans actually setting the AI agent up for the
- Give you a few examples in just a second. It takes that asset, tries a new variation of it, and then scores it against the third file, which is a scoring mechanism. Again, the scoring mechanism is
Transcript
Andrej Karpathy, the godfather of modern AI, has just released a simple system called auto-research, which the top AI labs in the world have been spending millions trying to create. Since he released auto-research, it’s got over 85,000 stars on GitHub, and even Shopify’s CEO pointed it at Shopify’s code and sped it up by 53%. But here’s the thing: it doesn’t just work for coding. In fact, auto-research can be pointed at any part of your business: your emails, your ads, your landing pages, your AI skills, AI agents, or even your organic content. And it forces each part of your business to self-improve at the speed of light while you sleep. So, in this video, we’ll break down Karpathy’s simple auto-research system and show you exactly how to plug in any part of your business that you want to improve. And we’ll do this together by going through three real-world examples that you can start using today. All I ask in return is that you hit that like button down below, grab your Slovakian flag, and let’s jump in.
Okay, so the fun started here with Andrej Karpathy tweeting, which got another 11 million views for a highly technical tweet, which once again shows the demand and how impressive this stuff actually is. He wrote, “I packaged up the auto-research project into a new self-contained minimal repo if people would like to play with it over the weekend.” He says the human, which is me and you, will iterate on the prompt, giving the agent a set of instructions on what we want to improve, and then the AI agent will iterate on the training code, or whatever asset it is that we want it to improve. Once we give it that, it’s going to literally work all night trying to improve that asset while we’re asleep. Let me show you where Andrej Karpathy started. He says the goal is to engineer your AI agents to make the fastest research progress indefinitely and without any of your own human involvement, which is why we can be asleep.
In the image below, this auto-research image here, is his LLM getting smarter and smarter with every iteration, running on a five-minute loop, until 83 experiments later it was 11% faster than when Andrej Karpathy left it. He also says that he’d been working on this same LLM, trying to make it smarter, for a very long time and thought it was done. But his auto-research tool found another 11% improvement in its IQ. And this paragraph below the image might be one of the most fascinating things to read in terms of a mindset shift on how powerful this system is. Karpathy wrote that, one day, frontier AI research used to be done by meat computers, or humans, in between eating, sleeping, and having fun, and these meat computers would synchronize with each other once in a while using sound wave interconnect, which is obviously speaking, inside a group meeting. It’s a facetious take on how slow we humans are at these jobs compared to something like this auto-research system. He says that era is now long gone. Research is now entirely the domain of autonomous swarms of AI agents, and he says that this repo, or this system right here, is the story of how all of this began.
And so the repo is up on GitHub, which is a crazy thing: Andrej Karpathy just open-sourced it and made it free for anyone to use. It’s called auto-research, and you can see it has 85,000 stars at the time that I’m recording this. And so there have been hundreds of thousands of people running this system including, like I said in the intro, Tobi Lütke, the billionaire CEO of Shopify, and he said, okay, this thing is totally insane. Before he went to bed, he set it up with these experiments. And by the time he woke up, he woke up to a plus 19% score after 8 hours and 37 experiments, literally improving his code while he was asleep. And Andrej Karpathy replied and said, “Who knew early singularity could be this fun,” showing more of the work that he’s had auto-research doing for himself.
And four days later, Tobi Lütke was still playing with auto-research. He says, okay, I ran auto-research on the Liquid codebase, and it’s now 53% faster with 61% fewer object allocations. He says this is probably somewhat overfit, but there are some absolutely amazing ideas in this, showing the iterations of auto-research without any human intervention. And even Garry Tan from Y Combinator says Karpathy just open-sourced auto-research: one GPU, a hundred machine learning experiments overnight while you sleep. You never touch the code; you just write a Markdown file, or a set of instructions on what you want improved. The bottleneck is no longer compute, it is your program.md, which is the instructions you’re giving your agent.
But I personally don’t want to apply this to codebases or AI agents. I want to apply it to my business and my marketing, which is where we really start to see the brilliance of this auto-research system. Andrej Karpathy says you don’t use it directly, talking about his GitHub repo. It’s a recipe slash idea: give it to your agent and apply it to whatever you care about, which is where our magic begins. Because even Chamath Palihapitiya had a potential use case, which is kind of mind-bending to think about. He said the biggest threat to today’s social media apps is an incredible video model, so something that can take text and turn it into incredible-looking videos, plus TTS, plus auto-research. That is where you can set up your AI agent with a TikTok or Instagram account and have it constantly producing incredible-quality content, learning from the results of each piece of content in this loop feature that I’m about to reveal to you, and improving and iterating a hundred times a night. Honestly, this thing in the hands of your competitors will be extremely scary.
So let me show you how this system actually works. It’s a three-file system. Now, Karpathy called these files program, train, and prepare.
But to me, that’s just confusing and unnecessary. We have the instructions file, which is locked off from your AI agent and only used by me and you, the humans actually setting the AI agent up for the task. Then we have the file, or the asset, that we want to optimize. This is the second file, and this is the file that the AI actually gets access to, because it’s trying to optimize its performance. I’ll give you a few examples in just a second. It takes that asset, tries a new variation of it, and then scores it against the third file, which is a scoring mechanism. Again, the scoring mechanism is a file that is locked off from the AI, because we don’t want it tampering with the scoring in order to score higher. We want the scoring mechanism and the set of instructions accessible only to us, the humans, forcing the AI agent to actually do the task and optimize the asset.
So let’s get practical. Let’s use a few examples to really understand this. This is the baseline of what Andrej Karpathy first tested his auto-research on. So we essentially want to improve the intelligence of an AI agent. We’re going to use IQ here. He didn’t use IQ, but I’ve used it just to keep it simple so we can understand it first. So in the set of instructions, he says, I need you to improve the IQ of this AI agent. He gives auto-research the AI file that makes up the current AI agent, and auto-research will take that file and create a test variation. It’ll change the code and make one test. It’ll then take that new AI agent and compare its IQ to the original file’s. If it is smarter, it’ll keep the new file it tested and replace the old file, because it has improved IQ, and it will loop again. It’ll take that new, improved AI file that has higher IQ, make another change to it, and test it. This is basically evolutionary biology and natural selection, but in the machine world.
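The keep-if-better loop described here, including the revert when a variation scores worse, can be sketched in a few lines of Python. This is a toy illustration only, not code from Karpathy’s repo: the function `improve`, the stand-ins `toy_score` and `toy_propose`, and the idea of a single number as the "asset" are all assumptions made for the example.

```python
import random
from typing import Callable, Tuple


def improve(
    asset: float,
    propose: Callable[[float], float],
    score: Callable[[float], float],
    experiments: int,
) -> Tuple[float, float, int]:
    """Try one variation per round and keep it only if it scores higher."""
    best, best_score = asset, score(asset)
    kept = 0
    for _ in range(experiments):
        candidate = propose(best)           # one change to the current best asset
        candidate_score = score(candidate)  # the scorer is not something the proposer can edit
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score  # keep the improvement
            kept += 1
        # otherwise the candidate is dropped, i.e. we revert to the previous best
    return best, best_score, kept


# Hypothetical stand-ins: the "asset" is one number and the ideal value is 3.0.
rng = random.Random(0)


def toy_score(x: float) -> float:
    return -((x - 3.0) ** 2)  # higher is better, maximum at x == 3.0


def toy_propose(x: float) -> float:
    return x + rng.gauss(0.0, 0.5)  # small random tweak to the current asset


start = 0.0
best, best_score, kept = improve(start, toy_propose, toy_score, experiments=200)
print(f"kept {kept} of 200 variations, final score {best_score:.4f}")
```

In a real setup, `propose` would be the AI agent editing the asset file and `score` would be the locked scoring mechanism; the structure of the loop stays the same.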
Now, if its test variation doesn’t beat the original IQ of the AI file, it will revert back to the original file and try again. And this is done in five-minute loops, repeating indefinitely until it reaches a certain goal that you have or until a human comes and stops it.
Okay, so what parts of our business can we actually apply this auto-research to, to skyrocket past our competitors? Well, Ericsson had a very interesting article on Twitter. He says Karpathy’s autonomous AI can make you 701 times faster. It’s the future of business, not coding, but business generally. He says most marketing teams will run 30 experiments a year, but the next generation will run 36,500 experiments per year, easily. And they’ll run the experiments while they sleep using the auto-research tool. And so technically this auto-research tool could be pointed at any part of your business. But there are some criteria for what it works best for. So there are three must-haves and then three nice-to-haves. And if you fit these criteria, you can run auto-research on that part of your business. We’re going to go through a lot of examples together in just a second.
Must-have rule number one is that it needs to be scored objectively. So if you say, make this page look better, there’s no objective measure. If you say, come up with the best video idea, there’s no objective measure. Come up with the funniest joke. How do you measure funny? Well, is it the most laughs? Now you’re starting to get an objective measure, but that would be: make a joke that gets the most laughs, and measure the decibel volume. You need that objective measure in order for the AI to score it without a human in the loop. So things like the load speed of a website: excellent. The number of impressions a piece of content gets: excellent. Click-through rate on a page: excellent.
Then rule number two is you need a fast feedback loop. You need the results in minutes or maximum hours, not weeks.
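To put rough numbers on the feedback-loop rule, here is a small arithmetic sketch using only figures quoted in the video (a five-minute loop, an 8-hour night, 37 experiments in 8 hours, a 10-day SEO reindex, 36,500 experiments per year). The helper name `experiments_in_window` is made up for this example, and reading 36,500 as 100 experiments per night for 365 nights is an assumption.

```python
def experiments_in_window(window_hours: float, minutes_per_experiment: float) -> int:
    """Number of complete experiments that fit into the time window."""
    return int(window_hours * 60 // minutes_per_experiment)


# A five-minute loop over one 8-hour night.
per_night_five_min = experiments_in_window(8, 5)
print(per_night_five_min)  # 96, close to "a hundred experiments overnight"

# Something that needs 10 days of feedback gets no complete experiment overnight.
per_night_seo = experiments_in_window(8, 10 * 24 * 60)
print(per_night_seo)  # 0

# Assumption: 100 experiments every night of the year.
per_year = 100 * 365
print(per_year)  # 36500

# 37 experiments in 8 hours works out to roughly 13 minutes per experiment.
minutes_each = 8 * 60 / 37
print(round(minutes_each, 1))  # 13.0
```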
For example, the load speed of a website, once again: you can test that in seconds, which means you get more iterations and more improvements, and it’s going to actually work for you. Or email opens: how many people are opening that email within the hour? That will pass. However, SEO rankings: you make a change to your website and say, let’s wait to see Google reindex this 10 days later. It’s not going to work for you, because that’s too big of a feedback loop for the AI to actually get enough data to learn. Or pricing. What if I lower my pricing? Is that going to reduce my churn six months from now? It’s really hard for an AI to have a feedback loop and iterate on that.
Number three, the AI obviously needs access to change it. So if it’s an HTML file, or an API in software you use, excellent. It’s got access to the asset that you need. However, if it’s a video that’s already been published on YouTube and you’re like, oh, change the intro, you can’t log into a past YouTube video and change the intro, because it’s already published and the AI cannot have access to that.
So if you tick the box on those three things, then you want to have a look at the nice-to-haves, because these will make your auto-research even more powerful. You want a high volume of feedback.
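The three must-haves (objective score, fast feedback, agent access) can be written down as a simple checklist. The sketch below is illustrative only: the `Target` class, the `failed_must_haves` function, the 24-hour cutoff standing in for "minutes or maximum hours", and the feedback times assigned to each example are all assumptions, not values from the video.

```python
from dataclasses import dataclass
from typing import List

MAX_FEEDBACK_HOURS = 24.0  # assumed cutoff for "minutes or maximum hours, not weeks"


@dataclass
class Target:
    name: str
    objective_metric: bool  # can it be scored without a human in the loop?
    feedback_hours: float   # how long until a result comes back (rough guess)
    agent_can_edit: bool    # can the AI agent actually change the asset?


def failed_must_haves(target: Target) -> List[str]:
    """Return the must-have rules this target fails (empty list means it qualifies)."""
    failures = []
    if not target.objective_metric:
        failures.append("objective score")
    if target.feedback_hours > MAX_FEEDBACK_HOURS:
        failures.append("fast feedback")
    if not target.agent_can_edit:
        failures.append("agent access")
    return failures


# Examples mentioned in the video, with illustrative values filled in.
targets = [
    Target("website load speed", True, 0.01, True),
    Target("email open rate", True, 1.0, True),
    Target("SEO rankings", True, 10 * 24, True),
    Target("make this page look better", False, 0.01, True),
    Target("intro of a published YouTube video", True, 1.0, False),
]

for target in targets:
    failures = failed_must_haves(target)
    verdict = "ok" if not failures else "fails: " + ", ".join(failures)
    print(f"{target.name}: {verdict}")
```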
(Transcript truncated — full length available on YouTube)