Intro to Computer Vision with PoseNet
Are you ready to play with your first computer vision model? In this module, you will learn how computers make sense of the content of a picture, how they detect humans, and how to use PoseNet and p5.js to create an augmented reality filter!
Quiz
Watch the video above first, then take the quiz to check that you understand the main concepts. Some questions may require a quick Internet search!
This quiz is mandatory. You can take it as many times as you want; only your best score will be taken into account. Simply reload the page to get a new quiz.
The due date has expired. You can no longer answer this quiz.
Assignment
Make it happen!
This assignment is mandatory. If you update your work but the link doesn't change, you don't need to re-submit it.
Tools
p5.js Web Editor is a web editor for p5.js, a JavaScript library with the goal of making coding accessible to artists, designers, educators, and beginners.
TensorFlow.js is the JavaScript implementation of TensorFlow. You just learned how to use PoseNet, one of its pre-trained models.
PoseNet is a machine learning model which allows for real-time human pose estimation in the browser.
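To give a sense of what pose estimation returns, here is a minimal sketch that assumes the single-pose output shape described in the PoseNet repository: an overall `score` plus a `keypoints` array whose entries each have a `part` name, a confidence `score`, and a `position`. The sample values are made up, and `findKeypoint` and its `minScore` threshold are hypothetical helpers for illustration, not part of PoseNet itself.

```javascript
// Made-up sample in the shape the PoseNet documentation describes for one pose.
const samplePose = {
  score: 0.9,
  keypoints: [
    { part: "nose", score: 0.99, position: { x: 301, y: 123 } },
    { part: "leftShoulder", score: 0.95, position: { x: 380, y: 260 } },
    { part: "rightShoulder", score: 0.12, position: { x: 220, y: 262 } },
  ],
};

// Hypothetical helper: returns the position of the named part,
// or null when the part is missing or its confidence is below minScore.
function findKeypoint(pose, partName, minScore = 0.5) {
  const keypoint = pose.keypoints.find((k) => k.part === partName);
  if (!keypoint || keypoint.score < minScore) {
    return null;
  }
  return keypoint.position;
}

console.log(findKeypoint(samplePose, "leftShoulder")); // { x: 380, y: 260 }
console.log(findKeypoint(samplePose, "rightShoulder")); // null (score 0.12 is below 0.5)
```

Skipping low-confidence keypoints like this is one way to keep a filter from jumping around when a body part is hidden or out of frame.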
Project
- Log in to the p5.js Web Editor with your GitHub account and create a copy of this project.
  If you don't have a GitHub account, now is the time to sign up!
- Use PoseNet to make your own augmented reality filter. You can use any picture you like or drawing functions from the p5.js library, as long as your filter:
  - follows at least two elements of the pose other than the ones shown in the video (nose and eyes)
  - rescales according to the distance between the user and the camera.
- Be creative! Feel free to add sound, animations, or multiple images!
You can refer to the PoseNet GitHub repository for in-depth documentation and the list of keypoints.
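One possible way to meet the rescaling requirement is to use the pixel distance between two keypoints as a rough proxy for how far the user is from the camera: the closer the user, the farther apart the keypoints appear. The sketch below is only an illustration under that assumption. `distanceBetween`, `filterPlacement`, and the `widthRatio` tuning value are hypothetical names, and the points are plain `{ x, y }` objects with made-up coordinates.

```javascript
// Hypothetical helper: pixel distance between two points shaped like { x, y }.
function distanceBetween(a, b) {
  return Math.hypot(a.x - b.x, a.y - b.y);
}

// Hypothetical helper: centers the filter between two keypoints and sizes it
// in proportion to the distance between them. widthRatio is an arbitrary
// tuning value to adjust for your own image.
function filterPlacement(pointA, pointB, widthRatio = 1.5) {
  const span = distanceBetween(pointA, pointB);
  return {
    x: (pointA.x + pointB.x) / 2,
    y: (pointA.y + pointB.y) / 2,
    size: span * widthRatio,
  };
}

// Made-up coordinates: the same two keypoints seen close to the camera and far from it.
const near = filterPlacement({ x: 200, y: 150 }, { x: 300, y: 150 });
const far = filterPlacement({ x: 240, y: 150 }, { x: 260, y: 150 });
console.log(near.size, far.size); // 150 30
```

In a p5.js sketch, the result could then be drawn with something like `image(img, p.x - p.size / 2, p.y - p.size / 2, p.size, p.size)`, where `img` is your own loaded picture and `p` is the object returned by `filterPlacement`.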
Submit
When you’re done, copy the link to your p5.js sketch in the form below.
The due date has expired. You can no longer submit your work.
Going further
Definitions
- Computer Vision is a subfield of computer science that focuses on giving computers a higher-level understanding of images (photos or videos). Examples of computer vision research fields include image segmentation, optical character recognition (OCR), and face recognition.
- Augmented Reality (AR) is the superposition of computer-generated elements (including but not limited to text, pictures, or sounds) on a representation of the real world.
Tools
- TensorFlow is an open-source Python framework developed by Google. Originally developed for doing complex calculations, it quickly became one of the most used tools for machine learning. TensorFlow.js is its JavaScript implementation. You just learned how to use PoseNet, one of its pre-trained models.
- ml5.js is a JavaScript framework developed by teachers from NYU. Built on top of TensorFlow.js, it enables you to quickly use pre-trained models in your browser.
- Runway ML is a desktop app that makes it easy to try a lot of machine learning models.
- Lens Studio is the official app by Snapchat for creating custom filters.
Resources
- PoseNet model documentation
- In-depth Medium post by the TensorFlow team that goes into the details of how PoseNet works. It will tell you everything you've always wanted to know!