Visual Prompting

Building computer vision systems in minutes via natural prompting interactions

Andrew Ng

Founder, Landing AI

The traditional AI modeling workflow requires multiple steps: (i) finding and labeling data, (ii) training a model, and then (iii) making predictions. In contrast, text interfaces like ChatGPT have a dramatically simpler process where a user can give a text prompt saying what they want, and get an answer quickly. This has revolutionized NLP (natural language processing).
Traditional AI: Label → Train → Predict (taking days/months)
Prompting-based AI: Prompt → Predict (taking minutes/seconds)
In this livestream presentation, Andrew Ng will share some early thoughts, and present Landing AI's results, on generalizing this concept from text to computer vision, so that users can supply a simple Visual Prompt (indicating a few things on an image) and quickly get a result. Meta's SAM model is one example of Visual Prompting, applied to segmentation of individual images. In the next few years, Andrew expects that Visual Prompting tools will make computer vision much more accessible, just as text prompting has done for NLP.
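To make the prompt → predict idea concrete, here is a minimal toy sketch (not Landing AI's or SAM's actual method): a user "prompts" by clicking a couple of points on an image and assigning them class labels, and every pixel is then labeled by its nearest prompted point in a simple color-plus-position feature space. The function name and the nearest-prototype rule are illustrative assumptions.

```python
import numpy as np

def visual_prompt_segment(image, prompts):
    """Toy prompt -> predict loop: label each pixel with the class of the
    nearest prompted click, comparing scaled position plus RGB color.
    `prompts` is a list of ((row, col), class_id) click annotations."""
    h, w, _ = image.shape
    rows, cols = np.mgrid[0:h, 0:w]
    # One 5-d feature per pixel: (row/h, col/w, r, g, b), colors in [0, 1]
    feats = np.dstack([rows / h, cols / w, image / 255.0]).reshape(-1, 5)
    # Build one prototype feature vector per clicked point
    protos = np.array([
        np.concatenate([[r / h, c / w], image[r, c] / 255.0])
        for (r, c), _ in prompts
    ])
    labels = np.array([cls for _, cls in prompts])
    # Assign each pixel to its nearest prompt in feature space
    dists = np.linalg.norm(feats[:, None, :] - protos[None, :, :], axis=2)
    return labels[dists.argmin(axis=1)].reshape(h, w)

# Two clicks on a synthetic image: dark left half (class 0), white right half (class 1)
img = np.zeros((8, 8, 3), dtype=np.uint8)
img[:, 4:] = 255
mask = visual_prompt_segment(img, [((4, 1), 0), ((4, 6), 1)])
```

No labeled dataset and no training step are involved: the two clicks alone drive the prediction, which is the workflow contrast the paragraph above describes.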
Join Andrew to discuss:
  • Lessons from NLP (transformers and large language models) for computer vision (vision transformers, foundation vision models)
  • Visual Prompting as a fast, easy and natural way to have a vision-based interaction
  • Live demo
  • Implications of prompting for the machine learning project lifecycle
  • Live Q&A
Can’t attend live? Don’t worry. Everyone who registers will be sent the recording afterward.

Register for the Replay