Dear friends,
Everyone can benefit from learning to code with AI! At AI Fund, the venture studio I lead, everyone — not just the engineers — can vibe code or use more sophisticated AI-assisted coding techniques, which empowers the whole team to build with AI. The impact on creativity and productivity has been exciting! I share my experience in the hope that more teams will invest in empowering everyone to build with AI.
Everyone at AI Fund who was not already an engineer started with our “AI Python for Beginners” course to learn the basics. I also shared with the team details of the tech stack I use, to give everyone a default set of building blocks. Since then, many have gone on to acquire additional building blocks (such as additional third-party APIs) on their own by taking courses, searching online, or learning from colleagues.
You can watch a video of our experience with this here.
Here are just a few examples of applications that non-engineers at AI Fund have built:
- Our CFO Ellen Li built an app that scans our Google Docs system to flag updates to a portfolio company’s information, saving what was previously 5 to 6 hours of manual work per week.
- Senior Executive Recruiter Jon Zemmelman built a system that lets him configure the relative weights of screening criteria for job candidates (such as previous startup experience and technical expertise) and automatically evaluate resumes against those criteria. (A minimal sketch of this kind of weighted screening appears after this list.)
- Associate General Counsel Nikhil Sharma wrote code to automatically generate NDAs (non-disclosure agreements) in AI Fund’s standard template.
- Office Coordinator Ellie Jenkins, as a fun project, built a visualization of the history of fashion design houses and their influence on each other.
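To make the resume-screening example concrete, here is a minimal sketch of configurable, weighted screening. The criteria, weights, and scores are illustrative assumptions, not AI Fund’s actual system; in practice, the per-criterion ratings might come from an LLM asked to grade each resume.

```python
# Hypothetical sketch of configurable, weighted candidate screening.
# Criterion names, weights, and scores are illustrative only.

CRITERIA_WEIGHTS = {
    "startup_experience": 0.40,   # relative importance of prior startup work
    "technical_expertise": 0.35,  # depth of relevant technical skills
    "leadership": 0.25,           # evidence of leading teams or projects
}

def score_candidate(ratings: dict[str, float]) -> float:
    """Combine per-criterion ratings (0-5) into a single weighted score."""
    return sum(CRITERIA_WEIGHTS[c] * ratings.get(c, 0.0) for c in CRITERIA_WEIGHTS)

# Example: ratings could come from an LLM asked to grade a resume per criterion.
candidate = {"startup_experience": 4.0, "technical_expertise": 5.0, "leadership": 3.0}
print(f"Weighted score: {score_candidate(candidate):.2f}")  # 4.10
```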

It is very empowering when individuals don’t have to compete for scarce engineering resources just to try out their ideas. There are far fewer gatekeepers in the way: if someone has an idea, they can build a prototype and try it out. If the prototype gets positive feedback from users, that lays the groundwork for scaling it up. If it doesn’t work, that too is valuable information, letting them quickly move on to a different idea or use the critical feedback to decide what to try next.
In the future, one of the most important skills in any profession will be the ability to tell a computer exactly what you want, so the computer can do it for you. For the foreseeable future, writing code (with AI assistance, so the AI, rather than you, actually writes the code) will be the best way to do this.
This is a great time for everyone to code with AI!
Keep building,
Andrew
A MESSAGE FROM DEEPLEARNING.AI

In “DSPy: Build and Optimize Agentic Apps,” you’ll learn to use Databricks’ DSPy framework to structure, debug, and improve the accuracy of agentic workflows. DSPy lets you define clear input and output steps, trace model behavior, and automate prompt tuning with built-in tools. Build a sentiment analyzer, travel assistant, and RAG agent! Enroll now
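For a taste of what the course covers, here is a minimal sketch of a DSPy sentiment analyzer. It assumes DSPy’s string-signature syntax and an OpenAI-backed model; the model name is illustrative.

```python
import dspy

# Point DSPy at a language model (model name is illustrative; any provider
# DSPy supports can be substituted; assumes OPENAI_API_KEY is set).
lm = dspy.LM("openai/gpt-4o-mini")
dspy.configure(lm=lm)

# A signature declares the module's inputs and outputs; DSPy builds the prompt.
classify = dspy.Predict("sentence -> sentiment")

result = classify(sentence="This course made agent debugging painless.")
print(result.sentiment)
```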
News

Next-Level DeepSeek-R1
DeepSeek updated its groundbreaking DeepSeek-R1 large language model to strike another blow for open-weights performance.
What’s new: The new DeepSeek-R1-0528 surpasses its predecessor and approaches the performance of OpenAI o3 and Google Gemini-2.5 Pro. A smaller version, DeepSeek-R1-0528-Qwen3-8B, runs on a single GPU with as little as 40GB VRAM, according to TechCrunch.
- Input/output: Text in (up to 64,000 tokens), text out (up to 64,000 tokens)
- Architecture: DeepSeek-R1-0528 mixture-of-experts transformer, 685 billion parameters (upgraded from 671 billion), 37 billion active at any given time; DeepSeek-R1-0528-Qwen3-8B transformer
- Features: JSON output, tool use
- Availability/price: Both models are free via Hugging Face for noncommercial and commercial uses under the MIT License; DeepSeek-R1-0528 is available via DeepSeek’s app by entering the conversation interface and turning on Deep Thinking; the DeepSeek API costs $0.14/$2.19 per 1 million tokens of input/output ($0.035/$0.55 per 1 million tokens from 4:30 P.M. to 12:30 A.M. Pacific Time). A sketch of an API call appears after this list.
- Undisclosed: Fine-tuning data and methods
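DeepSeek documents an OpenAI-compatible API, so calling the new model can look like the sketch below. The endpoint and model name follow DeepSeek’s documentation at the time of writing, but verify them before use.

```python
from openai import OpenAI

# DeepSeek's API is OpenAI-compatible; the endpoint and model name below
# follow DeepSeek's documentation but may change, so check current docs.
client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-reasoner",  # routes to the latest DeepSeek-R1 release
    messages=[{"role": "user", "content": "How many primes are less than 50?"}],
)
print(response.choices[0].message.content)
```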
How it works: DeepSeek has released little information so far about how it built the new models.
- Like the original DeepSeek-R1, DeepSeek-R1-0528 is a fine-tuned version of DeepSeek-V3 from late 2024. It was exposed to further “algorithmic optimization mechanisms during post-training” and consumes more tokens at inference.
- DeepSeek-R1-0528-Qwen3-8B is based on Qwen3-8B with reasoning knowledge distilled from DeepSeek-R1-0528.
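DeepSeek hasn’t disclosed its distillation recipe, but a common pattern is to sample the teacher’s reasoning-rich answers and fine-tune the student on them with ordinary supervised learning. A minimal sketch of the data-generation step (endpoint, file name, and seed prompts are assumptions):

```python
import json
from openai import OpenAI

# Generic sketch of reasoning distillation's data-generation step: collect
# the teacher's answers, then fine-tune a student (e.g., Qwen3-8B) on them.
# This is NOT DeepSeek's disclosed method, just a common generic approach.
client = OpenAI(api_key="YOUR_DEEPSEEK_API_KEY", base_url="https://api.deepseek.com")

prompts = ["Prove that the square root of 2 is irrational."]  # seed problems

with open("distill_data.jsonl", "w") as f:
    for prompt in prompts:
        reply = client.chat.completions.create(
            model="deepseek-reasoner",  # teacher model
            messages=[{"role": "user", "content": prompt}],
        ).choices[0].message
        # Each prompt/completion pair becomes a supervised training example.
        f.write(json.dumps({"prompt": prompt, "completion": reply.content}) + "\n")
```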
Performance: DeepSeek-R1-0528 nips at the heels of top closed LLMs on a variety of benchmarks, while DeepSeek-R1-0528-Qwen3-8B raises the bar for LLMs in its 8-billion-parameter size class. DeepSeek claims general improvements in reasoning, managing complex tasks, and writing and editing lengthy prose, along with 50 percent fewer hallucinations when rewriting and summarizing.
- DeepSeek-R1-0528 improves on the previous version dramatically in some cases. In DeepSeek’s tests, it solved 17.7 percent of the reasoning problems in HLE, up from the previous version’s 8.5 percent. On Aider, it achieved 71.6 percent accuracy, up from 53.3 percent, and it made a similar improvement on AIME 2025 (math), although it consumed nearly twice as many tokens.
- On AIME 2024 and AIME 2025 (high-school math competition problems) as well as LiveCodeBench (coding challenges), DeepSeek-R1-0528 performed ahead of Gemini-2.5 Pro-0506 but behind o3. On GPQA Diamond (graduate-level knowledge in a variety of domains), Aider (programming tasks), and HLE (reasoning), it fell behind both Gemini-2.5 Pro-0506 and o3.
- DeepSeek-R1-0528-Qwen3-8B excelled on AIME 2025, where it achieved 76.3 percent, ahead of the much larger Qwen3-32B (72.9 percent) and just behind o3-mini set to medium effort (76.7 percent). It did less well on GPQA, where it underperformed the other models DeepSeek reported, and on LiveCodeBench, where it fell behind Gemini-2.5-Flash-Thinking-0520.
Behind the news: The initial version of DeepSeek-R1 challenged the belief that building top-performing AI models requires tens to hundreds of millions of dollars, top-of-the-line GPUs, and enormous numbers of GPU hours. For the second time in less than a year, DeepSeek has built a competitive LLM with a relatively low budget.
Why it matters: DeepSeek’s models, along with Alibaba’s Qwen series, continue to narrow the gap between open-weights models and their closed peers. Its accomplishments could lead to wider adoption of less-expensive, more-efficient approaches. DeepSeek is passing along the cost savings to developers, offering high-performance inference at a fraction of the cost of closed models.
We’re thinking: DeepSeek-R1-0528-Qwen3-8B mixes contributions from open-weights models — possible only because Qwen3’s license, like DeepSeek’s, is permissive. Open models enable experimentation and innovation in ways that closed models do not.

Machine Translation in Action
AI is bringing a massive boost in productivity to Duolingo, maker of the most popular app for learning languages.
What’s new: Duolingo used generative AI to produce 148 courses, more than doubling its previous catalog. The technology enabled the company to offer some of its most popular courses — Spanish, French, German, Italian, Japanese, Korean, and Mandarin — in 28 languages. Initially, the company is using AI to produce courses aimed at beginners, with more advanced levels to come.
How it works: Duolingo’s AI-assisted approach to building language courses quickly turns a single course into many. It took the company 12 years to build its first 100 courses; the new approach produced 148 more in less than a year.
- Duolingo starts by building a base course and uses AI to translate it into numerous languages. For example, it can adapt a course that teaches French to English speakers into one that teaches French to Mandarin speakers. (A toy sketch of this kind of adaptation appears after this list.)
- The new process gives the company more flexibility in allocating resources, Duolingo’s head of AI Klinton Bicknell told Bloomberg. Previously, the company could dedicate a team to either creating new high-demand courses or updating an existing course. Now it can do both.
- The quicker pace will enable the company to meet rising demand for instruction in Asian languages such as Japanese, Korean, and Mandarin.
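Duolingo hasn’t published its pipeline, so the sketch below is only a toy illustration of the general idea: hold the target-language content fixed while an LLM rewrites the instructional text for each interface language. The model and prompt are assumptions.

```python
from openai import OpenAI

# Toy illustration only; Duolingo's actual pipeline is not public.
client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

exercise = "Translate into French: 'The cat sleeps.' (Answer: 'Le chat dort.')"
interface_languages = ["Spanish", "Mandarin", "Korean"]

for lang in interface_languages:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{
            "role": "user",
            "content": f"Rewrite this French-learning exercise so its "
                       f"instructions are in {lang}, keeping all French "
                       f"text unchanged:\n{exercise}",
        }],
    )
    print(lang, "->", response.choices[0].message.content)
```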
Behind the scenes: AI is at the heart of Duolingo’s expansion beyond language learning.
- Duolingo has used OpenAI models to build curricula since 2023. However, it is evaluating models from Anthropic and Google as well as open options.
- Following one test, Duolingo concluded that Anthropic’s Claude was “much better” at generating certain types of math content for the company’s relatively new math curriculum, according to Bicknell.
- The company’s embrace of AI drew criticism after CEO Luis von Ahn posted on LinkedIn that Duolingo would stop hiring contractors to do work that could be automated and would increase staffing only in areas that can’t be automated. Since then, Duolingo has noted that it plans to hire more engineers and AI researchers, and that employees will generate data used to train AI instead of performing quality reviews and other jobs that AI can do faster.
Why it matters: Companies in nearly every industry face pressure to produce more with less amid rising competition. AI can help to accomplish that while potentially improving product quality, and Duolingo has ample reason to move aggressively in this direction. The startup Speak, which offers a voice-based approach to learning languages, is growing rapidly, and Google just launched Little Language Lessons that show how an AI-first product could be used as a language teacher and conversational partner.
We’re thinking: AI is well on the way to transforming education for teachers, students, and technology companies!

AI Uses Energy, AI Saves Energy
AI’s thirst for energy is growing, but the technology also could help produce huge energy savings over the next five to 10 years, according to a recent report.
What’s new: The International Energy Agency (IEA), which advises 44 countries on energy policy, performed a comprehensive analysis of AI’s energy consumption including energy required to obtain critical materials needed for chips and data centers. The report sees dark clouds ahead but also silver linings.
Dark clouds: The report, which is based on interviews with officials in government, energy, and technology, makes four projections for AI’s energy consumption. In the base scenario, future growth and efficiency gains are similar to those of the past five years. The agency also plots a “take-off” scenario in which AI adoption happens faster, a “high efficiency” scenario with lower energy needs, and a “headwinds” scenario in which adoption of AI slows or infrastructure bottlenecks impede construction. Among the conclusions:
- Demand for electricity by data centers worldwide will more than double by 2030 in the base scenario, growing from 415 terawatt-hours (TWh) today to 945 TWh, around 2.5 percent of current global energy consumption. By 2035, this figure will range from 700 TWh to 1,700 TWh.
- By 2030, data centers outfitted with AI accelerator chips will consume four times the energy they do today.
- The United States, China, and Europe have more data centers (and use more electricity for them) than the rest of the world. As in many other countries, their data centers are concentrated in a few geographic regions and draw on the same power sources, which eventually will strain local electrical grids. Together, the U.S. and China will account for 80 percent of global growth in data center electricity consumption by 2030. Japan and Malaysia will also see strong growth.
Silver linings: AI already makes energy generation, distribution, and use more efficient. The authors expect these savings to accelerate.
- Existing AI algorithms predict energy generation and consumption. This makes it easier to integrate renewable energy sources into the grid, which reduces reliance on fossil fuels and cuts the resulting pollutants and greenhouse gases. Extending existing programs to increase use of renewables by 1 percent would reduce CO2 emissions by 120 megatons by 2035, which is roughly 40 percent of the projected emissions attributable to data centers.
- Widespread adoption of existing AI applications that streamline energy consumption in industry, transportation, and buildings could reduce CO2 emissions by 1.4 gigatons by 2035, nearly five times the projected emissions attributable to data centers. For example, scaling up existing AI optimization of heating, ventilation, and air-conditioning systems would save 300 TWh, about one-third of the total energy used by data centers.
- AI and cloud-computing companies continue to negotiate long-term purchase agreements that can secure renewable and zero-emissions energy for as much as 20 years. Data center operators are responsible for most of the long-term contracts that have been announced, nearly all of them for solar energy. Consequently, renewables generation is projected to grow by over 450 TWh by 2035.
- The energy costs of training, inference, and cooling hardware are expected to fall further thanks to trends in AI models (fewer parameters, more efficient algorithms, task-specific models), hardware (more energy-efficient chips, improved cooling methods), and usage (batch processing, running smaller models locally rather than in the cloud).
Yes, but: The authors concede that lower energy costs for AI likely will lead to much greater consumption — according to the Jevons paradox — so more-efficient models and hardware will result in higher energy consumption overall.
Behind the news: Data centers were growing rapidly prior to the boom in generative AI. Data centers’ electricity use doubled between 2000 and 2005 and again between 2017 and 2022, driven by the growth of cloud computing and data storage, streaming and social media, and cryptocurrency mining. However, these periods of accelerating growth were followed by periods of slower growth as efforts to cut costs led to more-efficient software and hardware. The authors expect this pattern to hold.
Why it matters: The IEA report is a first-of-its-kind analysis of AI’s energy requirements, how they’re likely to grow, and the technology’s own potential to reduce them. It confirms that AI is poised to consume huge amounts of energy. However, it also suggests that today’s energy costs will be tomorrow’s energy savings as AI makes energy generation, distribution, and use more efficient across a wide variety of industries.
We’re thinking: While demand for electricity for data centers is growing rapidly, calibrating the right level of investment is tricky. High levels of growth come with high levels of hype that can lead analysts to overestimate future demand. For example, Microsoft, after examining its forecasts, canceled data-center projects that would have consumed 2 gigawatts.

Phishing for Agents
Researchers identified a simple way to mislead autonomous agents based on large language models.
What’s new: Ang Li and colleagues at Columbia University developed a method to exploit the implicit trust that agents tend to place in popular websites by poisoning those websites with malicious links.
Key insight: Commercially available agentic systems may not trust random sites on the web, but they tend to trust popular platforms such as social media sites. An attacker can exploit this trust by crafting seemingly typical posts that link to a malicious website. The agent may follow the link, mistakenly extending its trust to an untrustworthy site.
How it works: The authors tested web-browsing agents including Anthropic Computer Use and MultiOn on tasks such as shopping or sending emails.
- The authors created Reddit posts that aligned thematically with a particular agentic task, such as shopping for Air Jordan 1 shoes. The posts contained text akin to marketing (for example, “Where to Buy Air Jordan 1 Chicago”) as well as instructions that pointed to a malicious site controlled by the authors (“for more information, check out <website>”).
- The authors fed the agent a query like “Where can I buy Nike Air Jordan 1 in Chicago?” They also gave it sensitive information such as credit card details and email credentials.
- The agent searched the web for resources needed to fulfill the query. It examined sites and found the Reddit posts written by the authors.
- The agent followed the instructions in the posts and visited the malicious website. The website included instructions that manipulated the agent to pursue an attacker’s goal, such as submitting credit card information or sending phishing emails from the user’s email address.
Results: Once an agent was redirected to the malicious websites, it reliably followed the attacker’s instructions. For example, each of the agents tested divulged credit card information in 10 out of 10 trials. Similarly, each agent sent a phishing message from the user’s email account asking recipients to send money to a malicious “friend” in 10 out of 10 trials.
Why it matters: Giving agents the ability to perform real-world actions, such as executing purchases and sending emails, raises the possibility that they might be tricked into taking harmful actions. Manipulating agents by referring them to malicious web content is an effective vector of attack. Agents will be more secure if they’re designed to avoid and resist such manipulation.
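The paper doesn’t prescribe a fix, but one simple guardrail is to treat links found in user-generated content as untrusted and gate navigation on an allowlist plus human confirmation. A hypothetical sketch (the domain list and helper names are illustrative, not from the paper):

```python
from urllib.parse import urlparse

# Hypothetical guardrail: links inherit no trust from the page that hosts
# them, so unknown domains require explicit user approval before a visit.
TRUSTED_DOMAINS = {"reddit.com", "nike.com"}  # illustrative allowlist

def ask_user(message: str) -> bool:
    """Prompt the human operator to approve or reject a navigation step."""
    return input(message + " [y/N] ").strip().lower() == "y"

def may_visit(url: str, confirm=ask_user) -> bool:
    """Allow trusted domains; defer everything else to a human."""
    domain = urlparse(url).netloc.lower().removeprefix("www.")
    if domain in TRUSTED_DOMAINS:
        return True
    return confirm(f"Agent wants to visit untrusted site {domain!r}. Allow?")

# Example: a link harvested from a Reddit post gets flagged for review.
print(may_visit("https://shady-sneaker-deals.example/jordan1"))
```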
We’re thinking: Humans, too, can be fooled by phishing and other malicious activities, and the path to programming agents to defend against them seems easier than the path to training the majority of humans to do so. In the long term, agents will make online interactions safer.