
Founded Year

2023

Stage

Series A - II | Alive

Total Raised

$167M

Mosaic Score
The Mosaic Score is an algorithm that measures the overall financial health and market potential of private companies.

+584 points in the past 30 days

About Sakana AI

Sakana AI focuses on developing artificial intelligence through nature-inspired foundation models within the research and development sector. The company's main offering includes creating a new kind of foundation model that draws inspiration from natural intelligence, designed to advance the field of AI. It was founded in 2023 and is based in Tokyo, Japan.

Headquarters Location

3-24-8 Nishishinbashi, Minato, Yamauchi Building 3rd Floor

Tokyo, 105-0003,

Japan


ESPs containing Sakana AI

The ESP matrix leverages data and analyst insight to identify and rank leading companies in a given technology landscape.

[ESP matrix chart: Execution Strength (x-axis) vs. Market Strength (y-axis), with quadrants Leader, Highflier, Outperformer, and Challenger]
Enterprise Tech / Development

The Small Language Models (SLMs) tools and development market focuses on the creation, optimization, and deployment of smaller-scale language models. These models, while less complex and resource-intensive than large language models (LLMs) like GPT-4, offer several advantages including cost-effectiveness, faster deployment times, and reduced computational requirements.

Sakana AI is named a Highflier alongside 11 other companies, including Hugging Face, Sarvam AI, and Arcee.


Research containing Sakana AI

Get data-driven expert analysis from the CB Insights Intelligence Unit.

CB Insights Intelligence Analysts have mentioned Sakana AI in 2 CB Insights research briefs, most recently on Jul 15, 2024.

Expert Collections containing Sakana AI

Expert Collections are analyst-curated lists that highlight the companies you need to know in the most important technology spaces.

Sakana AI is included in 4 Expert Collections, including Artificial Intelligence.


Artificial Intelligence

9,074 items

Companies developing artificial intelligence solutions, including cross-industry applications, industry-specific products, and AI infrastructure solutions.


Generative AI

942 items

Companies working on generative AI applications and infrastructure.


AI 100 (2024)

100 items


Unicorns- Billion Dollar Startups

1,249 items

Latest Sakana AI News

AI That Can Invent AI Is Coming. Buckle Up.

Nov 3, 2024

Leopold Aschenbrenner’s “Situational Awareness” manifesto made waves when it was published this summer. In this provocative essay, Aschenbrenner—a 22-year-old wunderkind and former OpenAI researcher—argues that artificial general intelligence (AGI) will be here by 2027, that artificial intelligence will consume 20% of all U.S. electricity by 2029, and that AI will unleash untold powers of destruction that within years will reshape the world geopolitical order.

Aschenbrenner’s startling thesis about exponentially accelerating AI progress rests on one core premise: that AI will soon become powerful enough to carry out AI research itself, leading to recursive self-improvement and runaway superintelligence.

The idea of an “intelligence explosion” fueled by self-improving AI is not new. From Nick Bostrom’s seminal 2014 book Superintelligence to the popular film Her, this concept has long figured prominently in discourse about the long-term future of AI. Indeed, all the way back in 1965, Alan Turing’s close collaborator I.J. Good eloquently articulated this possibility:

“Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an ‘intelligence explosion,’ and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make.”

Self-improving AI is an intellectually interesting concept but, even amid today’s AI hype, it retains a whiff of science fiction, or at the very least still feels abstract and hypothetical, akin to the idea of the singularity. But—though few people have yet noticed—this concept is in fact starting to get more real.
At the frontiers of AI science, researchers have begun making tangible progress toward building AI systems that can themselves build better AI systems. These systems are not yet ready for prime time. But they may be here sooner than you think. If you are interested in the future of artificial intelligence, you should be paying attention.

Pointing AI At Itself

Here is an intuitive way to frame this topic: Artificial intelligence is gaining the ability to automate ever-broader swaths of human activity. Before long, it will be able to carry out entire human jobs itself, from customer service agent to software engineer to taxi driver.

In order for AI to become recursively self-improving, all that is required is for it to learn to carry out one human job in particular: that of an AI researcher. If AI systems can do their own AI research, they can come up with superior AI architectures and methods. Via a simple feedback loop, those superior AI architectures can then themselves devise even more powerful architectures—and so on.

(It has long been common practice to use AI to automate narrow parts of the AI development process. Neural architecture search and hyperparameter optimization are two examples of this. But an automated AI researcher that can carry out the entire process of scientific discovery in AI end-to-end with no human involvement—this is a dramatically different and more powerful concept.)

At first blush, this may sound far-fetched. Isn’t fundamental research on artificial intelligence one of the most cognitively complex activities of which humanity is capable? Especially to those outside the AI industry, the work of an AI scientist may seem mystifying and therefore difficult to imagine automating. But what does the job of an AI scientist actually consist of?
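As an aside, the narrow automation mentioned above can be surprisingly simple in its basic form. Hyperparameter optimization, for example, can be done with nothing more than a random-search loop. The sketch below uses a synthetic stand-in objective (a real one would train a model and report its validation loss); all names and the hyperparameter ranges are illustrative, not drawn from any particular library:

```python
import random

def validation_loss(lr, batch_size):
    # Stand-in objective: a real run would train a model with these
    # hyperparameters and return its measured validation loss.
    return (lr - 0.01) ** 2 + 0.001 * abs(batch_size - 64)

def random_search(trials=200, seed=0):
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        lr = 10 ** rng.uniform(-4, -1)      # learning rate, sampled log-uniformly
        bs = rng.choice([16, 32, 64, 128])  # batch size, sampled from a grid
        loss = validation_loss(lr, bs)
        if best is None or loss < best[0]:
            best = (loss, lr, bs)
    return best

best_loss, best_lr, best_bs = random_search()
```

The point of the contrast in the parenthetical above: this loop automates one narrow, pre-specified step, whereas an automated AI researcher would also have to decide what to search over and why.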
In the words of Leopold Aschenbrenner: “The job of an AI researcher is fairly straightforward, in the grand scheme of things: read ML literature and come up with new questions or ideas, implement experiments to test those ideas, interpret the results, and repeat.”

This description may sound oversimplified and reductive, and in some sense it is. But it points to the fact that automating AI research may prove surprisingly tractable.

For one thing, research on core AI algorithms and methods can be carried out digitally. Contrast this with research in fields like biology or materials science, which (at least today) require the ability to navigate and manipulate the physical world via complex laboratory setups. Dealing with the real world is a far gnarlier challenge for AI and introduces significant constraints on the rate of learning and progress. Tasks that can be completed entirely in the realm of “bits, not atoms” are easier to automate. A colorable argument could be made that AI will sooner learn to automate the job of an AI researcher than the job of a plumber.

Consider, too, that the people developing cutting-edge AI systems are precisely those who most intimately understand how AI research is done. Because they are deeply familiar with their own jobs, they are particularly well positioned to build systems to automate those activities. To quote Aschenbrenner again, further demystifying the work of AI researchers: “It’s worth emphasizing just how straightforward and hacky some of the biggest machine learning breakthroughs of the last decade have been: ‘oh, just add some normalization’ (LayerNorm / BatchNorm) or ‘do f(x)+x instead of f(x)’ (residual connections) or ‘fix an implementation bug’ (Kaplan → Chinchilla scaling laws). AI research can be automated.
And automating AI research is all it takes to kick off extraordinary feedback loops.”

Sakana’s AI Scientist

This narrative about AI carrying out AI research is intellectually fascinating. But it may also feel hypothetical and unsubstantiated. This makes it easy to brush off.

It became a lot harder to brush off after Sakana AI published its “AI Scientist” paper this August. Based in Japan, Sakana is a well-funded AI startup founded by two prominent AI researchers from Google, including one of the co-inventors of the transformer architecture.

Sakana’s “AI Scientist” is an AI system that can carry out the entire lifecycle of artificial intelligence research itself: reading the existing literature, generating novel research ideas, designing experiments to test those ideas, carrying out those experiments, writing up a research paper to report its findings, and then conducting a process of peer review on its work.

[Figure: An overview of the AI Scientist’s capabilities and workflow, from Sakana’s August 2024 paper. Source: Sakana AI]

It does this entirely autonomously, with no human input. The AI Scientist conducted research across three diverse fields of artificial intelligence: transformer-based language models, diffusion models, and neural network learning dynamics. The full texts of these AI-generated papers are available online. We recommend taking a moment to review a few of them yourself in order to get a first-hand feel for the AI Scientist’s output.

So—how good is the research that this “AI Scientist” produces? Is it just a trite regurgitation of its training data, with no incremental insight added? Or is it going to replace all the human AI researchers at OpenAI tomorrow? The answer is neither.
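The lifecycle described above is, structurally, a pipeline of stages feeding into one another. A minimal schematic sketch follows; every function here is a hypothetical stand-in (in the real system each stage is performed by prompted frontier language models), not Sakana's actual API:

```python
def generate_ideas(literature):
    # Stage 1: propose hypotheses grounded in gaps found in prior work.
    return [f"test a fix for: {gap}" for gap in literature]

def run_experiments(idea):
    # Stage 2: write and execute experiment code, then collect results.
    return {"idea": idea, "metric_delta": 0.05}

def write_paper(results):
    # Stage 3: draft a full paper reporting the results.
    return f"Paper on '{results['idea']}' (delta={results['metric_delta']})"

def review_paper(paper):
    # Stage 4: an automated reviewer scores the draft against
    # conference-style review guidelines (10-point scale).
    return 5  # e.g. "Borderline Accept"

def ai_scientist_loop(literature):
    # Run every stage end-to-end with no human in the loop.
    papers = []
    for idea in generate_ideas(literature):
        results = run_experiments(idea)
        paper = write_paper(results)
        papers.append((paper, review_paper(paper)))
    return papers

papers = ai_scientist_loop(
    ["diffusion models balancing global structure vs. local detail"]
)
```

What makes the real system notable is not this loop structure, which is trivial, but that each stage is handled well enough by models that the loop closes without human intervention.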
As the Sakana team summarized: “Overall, we judge the performance of The AI Scientist to be about the level of an early-stage ML researcher who can competently execute an idea but may not have the full background knowledge to fully interpret the reasons behind an algorithm’s success. If a human supervisor was presented with these results, a reasonable next course of action could be to advise The AI Scientist to re-scope the project to further investigate [certain related topics].”

The AI Scientist proved itself capable of coming up with reasonable and relevant new hypotheses about AI systems; of designing and then executing simple experiments to evaluate those hypotheses; and of writing up its results in a research paper. In other words, it proved itself capable of carrying out AI science. That is remarkable. Some of the papers that it produced were judged to be above the quality threshold for acceptance at NeurIPS, the world’s leading machine learning conference. That is even more remarkable.

In order to fully grasp what the AI Scientist is capable of—both its strengths and its current limitations—it is worth spending a bit of time to walk through one of its papers in more detail. (Stay with me here; I promise this will be worth it.) Let’s consider its paper “DualScale Diffusion: Adaptive Feature Balancing for Low-Dimensional Generative Models.” This is neither one of the AI Scientist’s strongest papers nor one of its weakest.

The AI Scientist first identifies an unsolved problem in the AI literature to focus on: the challenge that diffusion models face in balancing global structure with local detail when generating samples. It proposes a novel architectural design to address this problem: implementing two parallel branches in the standard denoiser network in order to make diffusion models better at capturing both global structure and local details.
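At its core, a two-branch denoiser of this kind blends two feature paths into one output. The toy NumPy sketch below illustrates the shape of the idea only; the layer sizes, the LeakyReLU activation (mentioned later as one of the system's design choices), and the fixed scalar gate are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # LeakyReLU: passes positive values, scales negative ones by alpha.
    return np.where(x > 0, x, alpha * x)

def dual_scale_denoise(x, w_global, w_local, gate=0.5):
    # Global branch: one feature path, meant to capture coarse structure.
    g = leaky_relu(x @ w_global)
    # Local branch: a second parallel path, meant to capture fine detail.
    l = leaky_relu(x @ w_local)
    # Adaptive blend: in the paper's design the weighting is learned;
    # here it is a fixed scalar gate for illustration.
    return gate * g + (1.0 - gate) * l

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 2))      # a batch of 2-D points (low-dimensional data)
w_g = rng.normal(size=(2, 2))    # toy weights for the global branch
w_l = rng.normal(size=(2, 2))    # toy weights for the local branch
out = dual_scale_denoise(x, w_g, w_l)
```

Setting the gate to 1.0 recovers the global branch alone and 0.0 the local branch alone; the research question is whether a learned blend outperforms either extreme.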
As the (human) Sakana researchers observe, the topic that the AI Scientist chose to focus on is a sensible and well-motivated research direction, and the particular idea that it came up with is novel and “to the best of our knowledge has not been widely studied.”

The system then designs an experimental plan to test its idea, including specifying evaluation metrics and baseline comparisons, and writes the necessary code to carry out these experiments. After reviewing results from an initial set of experiments, the AI Scientist iterates on the code and carries out further experiments, making some creative design choices in the process (for instance, using an unconventional type of activation function, LeakyReLU).

Having completed its experiments, the system then produces an 11-page research paper reporting its results—complete with charts, mathematical equations and all the standard sections you would expect in a scientific paper. Of note, the paper’s “Conclusion and Future Work” section proposes a thoughtful set of next steps to push this research direction further, including scaling to higher-dimensional problems, trying more sophisticated adaptive mechanisms and developing better theoretical foundations.

Importantly, the novel architectural design proposed by the AI Scientist does in fact result in better diffusion model performance. Of course, the paper is not perfect. It makes some minor technical errors related to the model architecture. It falls victim to some hallucinations—for instance, incorrectly claiming that its experiments were run on Nvidia V100 GPUs. It mistakenly describes an experimental result as reflecting an increase in a variable when in fact that variable had decreased.

In the final step of the research process, an “automated reviewer” (a separate module within the AI Scientist system) carries out a peer review of the paper.
The automated reviewer accurately identifies and enumerates both the paper’s strengths (e.g., “Novel approach to balancing global and local features in diffusion models for low-dimensional data”) and its weaknesses (e.g., “Computational cost is significantly higher, which may limit practical applicability”). Overall, the reviewer rates the paper a 5 out of 10 according to the NeurIPS conference review guidelines: “Borderline Accept.”

What If This Is GPT-1?

Sakana’s AI Scientist is a primitive proof of concept for what a recursively self-improving AI system might look like. It has numerous obvious limitations. Many of these limitations represent near-term opportunities to improve its capabilities. For instance:

  • It can only read text, not images, even though much information in the scientific literature is contained in graphs and charts. AI models that can understand both text and images are widely available today. It would be straightforward to upgrade the AI Scientist by giving it multimodal capabilities.

  • It has no access to the internet. This, too, would be an easy upgrade.

  • The Sakana team did not pretrain or fine-tune any models for this work, instead relying entirely on prompting of existing general-purpose frontier models. It is safe to assume that fine-tuning models for particular tasks within the AI Scientist system (e.g., the automated reviewer) would meaningfully boost performance.

And perhaps the two most significant opportunities for future performance gains: First, the AI Scientist work was published before the release of OpenAI’s new o1 model, whose innovative inference-time search architecture would dramatically improve the ability of a system like this to plan and reason. And second, these results were obtained using an almost comically small amount of compute: a single Nvidia H100 node (8 GPUs) running for one week.
Ramping up the compute available to the system would likely dramatically improve the quality of the AI Scientist’s research efforts, even holding everything else constant, by enabling it to generate many more ideas, run many more experiments and explore many more research directions in parallel. Pairing that increase in compute resources with ever-improving frontier models and algorithmic advances like o1 could unleash dramatic performance improvements in these systems in short order.

The most important takeaway from Sakana’s AI Scientist work, therefore, is not what the system is capable of today. It is what systems like this might soon be capable of. In the words of Cong Lu, one of the lead researchers on the AI Scientist work: “We really believe this is the GPT-1 of AI science.”

OpenAI’s GPT-1 paper, published in 2018, was noticed by almost no one. A few short years later, GPT-3 (2020) and then GPT-4 (2023) changed the world. If there is one thing to bet on in the field of AI today, it is that the underlying technology will continue to get better at a breathtaking rate. If efforts like Sakana’s AI Scientist improve at a pace that even remotely resembles the trajectory of language models over the past few years—we are in for dramatic, disorienting change. As Lu put it: “By next year these systems are going to be so much better. Version 2.0 of the AI Scientist is going to be pretty much unrecognizable.”

Concluding Thoughts

Today’s artificial intelligence technology is powerful, but it is not capable of making itself more powerful. GPT-4 is an amazing technological accomplishment, but it is not self-improving. Moving from GPT-4 to GPT-5 will require many humans to spend many long hours ideating, experimenting and iterating. Developing cutting-edge AI today is still a manual, handcrafted human activity.

But what if this changed? What if AI systems were able to autonomously create more powerful AI systems—which could then create even more powerful AI systems?
This possibility is more real than most people yet appreciate. We believe that, in the coming years, the concept of an “intelligence explosion” sparked by self-improving AI—articulated over the decades by thinkers like I.J. Good, Nick Bostrom and Leopold Aschenbrenner—will shift from a far-fetched theoretical fantasy to a real possibility, one that AI technologists, entrepreneurs, policymakers and investors will begin to take seriously.

Just last month, Anthropic updated its risk governance framework to emphasize two particular sources of risk from AI: (1) AI models that can assist a human user in creating chemical, biological, radiological or nuclear weapons; and (2) AI models that can “independently conduct complex AI research tasks typically requiring human expertise—potentially significantly accelerating AI development in an unpredictable way.” Consider it a sign of things to come.

It is worth addressing an important conceptual, almost philosophical, question that often arises on this topic. Even if AI systems are capable of devising incremental improvements to existing AI architectures, as we saw in the Sakana example above, will they ever be able to come up with truly original, paradigm-shifting, “zero-to-one” breakthroughs? Could AI ever produce a scientific advance as fundamental as, say, the transformer, the convolutional neural network or backpropagation?

Put differently, is the difference between “DualScale Diffusion: Adaptive Feature Balancing for Low-Dimensional Generative Models” (the AI-generated paper discussed above) and “Attention Is All You Need” (the seminal 2017 paper that introduced the transformer architecture) a difference in kind? Or is it possible that it is just a difference in degree? Might orders of magnitude more compute, and a few more generations of increasingly advanced frontier models, be enough to bridge the gap between the two? The answer is that we don’t yet know. But this technology is likely to be a game-changer either way.
“A key point to keep in mind is that the vast majority of AI research is incremental in nature,” said Eliot Cowan, CEO and cofounder of a young startup called AutoScience that is building an AI platform to autonomously conduct AI research. “That is mostly how progress happens. As an AI researcher, you often come up with an idea that you think will be transformative, and then it ends up only driving a 1.1x improvement or something like that, but that is still an improvement, and your system gets better as a result of it. AI is capable of autonomously completing that kind of research today.”

One thing you can be sure of: while they won’t acknowledge it publicly, leading frontier labs like OpenAI and Anthropic are taking the possibility of automated AI researchers very seriously and are already devoting real resources to pursuing the concept.

The most limited and precious resource in the world of artificial intelligence is talent. Despite the fervor around AI today, there are still no more than a few thousand individuals in the entire world who have the training and skillset to carry out frontier AI research. Imagine if there were a way to multiply that number a thousandfold, or a millionfold, using AI. OpenAI and Anthropic cannot afford not to take this seriously, lest they be left behind.

If the pace of AI progress feels disorientingly fast now, imagine what it will feel like once millions of automated AI researchers are deployed 24/7 to push forward the frontiers of the field. What breakthroughs might soon become possible in the life sciences, in robotics, in materials science, in the fight against climate change? What unanticipated risks to human well-being might emerge? Buckle up.

Sakana AI Frequently Asked Questions (FAQ)

  • When was Sakana AI founded?

    Sakana AI was founded in 2023.

  • Where is Sakana AI's headquarters?

    Sakana AI's headquarters is located at 3-24-8 Nishishinbashi, Minato, Tokyo.

  • What is Sakana AI's latest funding round?

    Sakana AI's latest funding round is Series A - II.

  • How much did Sakana AI raise?

    Sakana AI raised a total of $167M.

  • Who are the investors of Sakana AI?

    Investors of Sakana AI include Sumitomo Mitsui Financial Group, NEC Orchestrating Future Fund, Mitsubishi UFJ Financial Group, KDDI, Mizuho Financial Group and 27 more.

  • Who are Sakana AI's competitors?

    Competitors of Sakana AI include Anthropic and 8 more.


Compare Sakana AI to Competitors

Hugging Face

Hugging Face focuses on advancing artificial intelligence through collaboration in the technology sector. It provides a platform for machine learning professionals to build, share, and collaborate on models, datasets, and applications. The company offers solutions that cater to various modalities, including text, image, video, audio, and 3D, as well as enterprise-grade services for teams requiring advanced AI tooling with enhanced security and support. It was founded in 2016 and is based in Paris, France.

Inflection

Inflection is an artificial intelligence (AI) studio. The studio offers a personal AI named Pi, designed to be supportive and empathetic, providing users with a new class of digital experiences. Inflection primarily serves individuals seeking personal AI interactions. It was founded in 2022 and is based in Palo Alto, California.

AI21 Labs

AI21 Labs operates as an artificial intelligence (AI) lab and product company. The company offers a range of AI-powered tools, including a writing companion tool to assist users in rephrasing their writing and an AI reader that summarizes long documents. It also provides language models for developers to create AI-powered applications. It was founded in 2017 and is based in Tel Aviv, Israel.

One AI

One AI is a company that specializes in generative artificial intelligence (AI) within the technology sector. The company offers services such as language analytics, customizable AI skills, and the processing of text, audio, and video data into structured, actionable insights. It primarily serves sectors such as customer service, e-commerce, media, healthcare, and government. It was founded in 2021 and is based in Ramat Gan, Israel.

Symbl.ai

Symbl.ai is a technology company specializing in real-time artificial intelligence for processing human conversations across various communication channels. The company offers a platform that captures and analyzes live call data to generate actionable insights, such as sentiment analysis, intent detection, and compliance monitoring. Symbl.ai's solutions cater to a range of sectors including customer service, sales, and data analytics. It was founded in 2018 and is based in Seattle, Washington.

xAI

xAI focuses on artificial intelligence, specifically in the domain of language learning models. The company's main product, Grok, is designed to answer questions and suggest potential inquiries, functioning as a research assistant that helps users find information online. xAI primarily caters to the AI research community and the general public seeking AI tools for information retrieval and understanding. It was founded in 2023 and is based in Burlingame, California.

