Since the launch of GPT-4 in March 2023, users and developers have eagerly awaited the next development from OpenAI. GPT-4 marked a significant step forward in generative AI, particularly with its language capabilities. However, the anticipated release of GPT-5 is not what has come next. Instead, OpenAI has introduced a new family of models under the name “o1.” This launch includes two initial models: o1-preview and o1-mini.
This new line of AI models is designed with a focus on tackling more complex and challenging tasks than the previous GPT models. According to OpenAI, the new models can “reason through complex tasks and solve harder problems,”. Suggesting they are a more specialized and refined tool for specific fields.
Early Availability and Limitations
Currently, both o1-preview and o1-mini are accessible to ChatGPT Plus users, although usage is limited. Users are restricted to 30 messages per week with o1-preview and 50 with o1-mini. These limits are likely due to the models still being in an early stage of development. OpenAI has been clear that while these models show significant promise. They still lack many of the features that have made the GPT series widely popular.
For instance, the ability to browse the web, upload files, and generate images is not yet available with the o1 models. This limitation was highlighted by early testers who found that the models couldn’t create images for articles. OpenAI’s API platform also specifies that in its current beta state, the o1 models are only capable of supporting text-based tasks.
For those users who need these broader capabilities, OpenAI recommends sticking with GPT-4o, at least in the short term.
What Sets o1 Apart?
Despite some initial limitations, the new series offers significant advancements, particularly in specific fields such as science, healthcare, and technology. OpenAI envisions these models helping professionals solve complex challenges in these areas. From generating mathematical formulas in quantum optics to annotating cell sequencing data for medical research. The models also show great promise in coding, offering new tools for developers.
Developers, in particular, will benefit from the o1-mini model, which has been optimized for building and executing multi-step workflows. This includes everything from debugging code to efficiently solving programming challenges. The ability to handle these tasks efficiently makes it an attractive option for professionals in technical fields.
PhD-Level Performance with o1-Preview
One of the most significant claims OpenAI has made about the o1-preview model is its ability to perform at a level comparable to PhD students. This is particularly evident in fields like physics, chemistry, and biology. Where the model’s ability to “think” more critically and refine its responses has been highlighted.
The model’s performance in coding is equally impressive. In tests, it ranked in the 89th percentile in Codeforces competitions, which is a widely used platform for coding contests. This high rank suggests that the model is particularly adept at handling multi-step workflows, debugging, and generating precise solutions to complex coding problems.
Additionally, in the International Mathematics Olympiad (IMO) qualifying exams. 01-preview solved 83% of the problems, a substantial improvement over the 13% success rate of GPT-4o. This sharp increase in performance underscores the model’s potential in highly specialized areas that require deep reasoning and problem-solving abilities.
The o1-preview model is already available for use by ChatGPT Plus and Team users. Enterprise and educational users will gain access next week. Developers can also access the model through the OpenAI API if they qualify for API usage tier 5. However, usage limits are in place to prevent overload during this early stage.
o1-Mini: A Cheaper, Faster Option
In contrast to the o1-preview model, OpenAI has also introduced a more streamlined version known as o1-mini. While less powerful, o1-mini offers faster and cheaper reasoning capabilities. This model has been optimized primarily for coding and STEM tasks, delivering strong performance in math and programming-related areas.
In the same IMO math benchmarks where o1-preview scored 83%, o1-mini achieved a 70% success rate. While this is slightly lower than its more advanced counterpart, it’s still a significant improvement over older models and comes at a much lower cost.
In coding, the o1-mini performed competitively, achieving an Elo score of 1650 on Codeforces, placing it among the top 86% of programmers. This strong performance, combined with its 80% lower price compared to o1-preview. Makes it an attractive option for developers and researchers. Who need reasoning capabilities but don’t require the broader knowledge base of the more advanced model.
o1-mini is available to ChatGPT Plus, Team, Enterprise, and Edu users. With plans to extend access to Free users in the future. This makes it an affordable and accessible solution for those looking to leverage advanced AI tools without incurring high costs.
Safety and Security Features
In addition to its reasoning capabilities, OpenAI has made safety a key focus in the development of the new models. Both o1-preview and o1-mini incorporate a new safety training approach designed to enhance the models’ ability to follow safety and alignment guidelines.
In testing, o1-preview scored 84 on one of OpenAI’s toughest jailbreaking tests, a significant improvement over GPT-4o’s score of 22. This demonstrates the model’s ability to reason about safety rules in context, allowing it to handle unsafe prompts and avoid generating inappropriate content.
OpenAI’s commitment to safety extends beyond just the models themselves. The company has entered into agreements with the U.S. and U.K. AI Safety Institutes, providing early access to research versions of the o1 models to help evaluate and test future AI systems. These partnerships are part of OpenAI’s broader safety efforts, which include internal governance, regular testing, red-teaming, and oversight from the company’s Safety & Security Committee.
Future Developments
While the launch of o1-preview and o1-mini represents a significant step forward, OpenAI has made it clear that this is just the beginning. The company plans to regularly update and improve these models, adding features such as browsing, file and image uploading, and function calling, which are not yet available in the API version.
Looking ahead, OpenAI intends to continue developing both its GPT and o1 series, further expanding the capabilities of AI across various fields. As the company works to make these models more useful and accessible, users can expect ongoing advancements in how these tools can be applied to a wide range of professional and academic applications.