GPT-4
Multimodal AI

OpenAI released GPT-4, a multimodal model that can understand both text and images with human-level performance on many benchmarks.
Introduction
GPT-4 is a large multimodal model developed by OpenAI, released on March 14, 2023. It represents a significant advancement over GPT-3.5, with the ability to accept both image and text inputs and produce text outputs. GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score in the top 10% of test takers.
Historical Context
GPT-4's release marked another leap forward in large language model capabilities. The model was developed over a two-year period with extensive work on alignment and safety. OpenAI collaborated with over 50 experts in domains like AI safety and security to gather feedback and improve the model's behavior. The release was more controlled than previous models, with initial access limited to ChatGPT Plus subscribers and API waitlist members.
Technical Details
GPT-4 is a Transformer-based model pre-trained to predict the next token in a document, using both publicly available data and data licensed from third-party providers. The model was then fine-tuned using Reinforcement Learning from Human Feedback (RLHF). Key capabilities include:
Multimodal inputs (accepts both images and text)
Increased context window (8,192 tokens standard, 32,768 tokens extended)
Improved reasoning and problem-solving
Better performance on standardized tests
Enhanced safety and alignment
The exact model size has not been disclosed, though it is rumored to be significantly larger than GPT-3's 175 billion parameters. OpenAI noted that GPT-4 is 82% less likely to respond to requests for disallowed content and 40% more likely to produce factual responses than GPT-3.5.
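The pre-training objective described above, predicting the next token, can be sketched with a toy greedy decoding loop. The bigram lookup table below is a hypothetical stand-in for a real Transformer forward pass; GPT-4's actual model and weights are not public.

```python
# Toy illustration of autoregressive next-token prediction.
# BIGRAMS is a hypothetical stub "model", not anything from OpenAI.
BIGRAMS = {
    "the": "cat",
    "cat": "sat",
    "sat": "down",
}

def predict_next(token: str) -> str:
    """Return the most likely next token (stand-in for a forward pass)."""
    return BIGRAMS.get(token, "<eos>")

def generate(prompt: str, max_tokens: int = 5) -> list[str]:
    """Autoregressively extend the prompt one token at a time."""
    tokens = prompt.split()
    for _ in range(max_tokens):
        nxt = predict_next(tokens[-1])
        if nxt == "<eos>":  # stop when the stub model has no continuation
            break
        tokens.append(nxt)
    return tokens
```

A real model replaces the lookup with a learned probability distribution over the vocabulary, but the outer loop, generating one token at a time conditioned on everything so far, is the same.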
Notable Quotes
"GPT-4 is more reliable, creative, and able to handle much more nuanced instructions than GPT-3.5." — OpenAI, GPT-4 announcement (March 14, 2023)
Cultural Impact
GPT-4's performance on professional benchmarks demonstrated AI's potential to match or exceed human expertise in many domains. Benchmark results included Uniform Bar Exam (90th percentile), SAT Math (89th percentile), SAT Evidence-Based Reading & Writing (93rd percentile), GRE Quantitative (80th percentile), and GRE Verbal (99th percentile). The model's capabilities raised important questions about the future of professional work and education, sparking debates about AI's role in fields from law to medicine to software development.
Contemporary Reactions
The AI research community was impressed by GPT-4's multimodal capabilities and improved performance across benchmarks. However, the release also intensified concerns about AI safety, with some researchers calling for a pause in training systems more powerful than GPT-4. The model's capabilities in understanding images opened new possibilities for accessibility tools and visual AI applications, but also raised concerns about potential misuse.
Legacy
GPT-4 represents a significant step toward more capable and general-purpose AI systems. Its multimodal capabilities and improved performance across diverse tasks demonstrate continued progress in AI development. The model has been integrated into numerous products and services, including Microsoft's Bing Chat (now Copilot), GitHub Copilot, and various enterprise applications. GPT-4 has also been used for research in AI safety and alignment, with organizations studying its capabilities and limitations to inform the development of even more advanced systems.
Impact on AI
Moved the field a step closer to general-purpose AI systems by pairing multimodal reasoning with markedly stronger benchmark performance.
Fun Facts
Scored in top 10% on the bar exam
Can process images and text together
Rumored to have 1+ trillion parameters
Frequently Asked Questions
What is GPT-4?
GPT-4 is a large multimodal AI model developed by OpenAI, released in March 2023. It can accept both text and images as input and produce text outputs. GPT-4 exhibits human-level performance on many professional and academic benchmarks.
How is GPT-4 different from GPT-3.5?
GPT-4 is significantly more capable than GPT-3.5. It can understand images, has better reasoning abilities, scores in the 90th percentile on the bar exam (vs 10th percentile for GPT-3.5), and is 82% less likely to respond to disallowed content and 40% more likely to produce factual responses.
Can GPT-4 see images?
Yes, GPT-4 has multimodal capabilities and can analyze images, charts, diagrams, and screenshots. It can describe what's in images, answer questions about visual content, and even help with tasks like solving math problems shown in pictures.
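A request that pairs a question with an image follows the content-array message format used by OpenAI's chat API for vision inputs. The sketch below only assembles the request body; the model name and image URL are placeholders, and the exact fields should be checked against OpenAI's current documentation before use.

```python
# Sketch of a multimodal chat request body. Model name and URL are
# placeholders for illustration, not a definitive API reference.
def build_vision_request(question: str, image_url: str) -> dict:
    """Assemble a chat request pairing a text question with an image."""
    return {
        "model": "gpt-4-turbo",  # placeholder; use a vision-capable model
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

request = build_vision_request(
    "What trend does this chart show?",
    "https://example.com/chart.png",  # placeholder image URL
)
```

The key point is that a single user message can carry a list of content parts, mixing text and image entries, rather than a plain string.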
How much does GPT-4 cost?
GPT-4 is available through ChatGPT Plus ($20/month subscription) or via OpenAI's API. API pricing is based on usage: approximately $0.03 per 1K prompt tokens and $0.06 per 1K completion tokens for the 8K context version.
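The per-token rates above make cost estimation a small arithmetic exercise. A minimal sketch, using the 8K-context rates quoted in this section ($0.03 per 1K prompt tokens, $0.06 per 1K completion tokens); rates change over time, so treat these numbers as a snapshot:

```python
# Estimate the dollar cost of one GPT-4 (8K context) API call from the
# rates quoted above. These rates are a historical snapshot.
PROMPT_RATE = 0.03 / 1000      # dollars per prompt token
COMPLETION_RATE = 0.06 / 1000  # dollars per completion token

def estimate_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated cost in dollars for a single request."""
    return prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE

# e.g. a 1,500-token prompt with a 500-token reply:
cost = estimate_cost(1500, 500)  # about $0.075
```

Completion tokens cost twice as much as prompt tokens, so long generated answers dominate the bill even when prompts are short.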
Is GPT-4 better than ChatGPT?
GPT-4 is the model that powers the advanced version of ChatGPT (ChatGPT Plus). The free version of ChatGPT uses GPT-3.5. GPT-4 offers improved accuracy, better reasoning, multimodal capabilities, and enhanced performance across virtually all tasks.