Intro to Computer Vision with PoseNet

Are you ready to play with your first computer vision model? In this module, you will learn how computers make sense of the content of a picture, how they detect humans, and how to use PoseNet and p5.js to create an augmented reality filter!

Quiz

Watch the video above first, then answer the quiz to make sure you understand the main notions. For some questions, you may need to look up the answer through a quick Internet search!

This quiz is mandatory. You can answer it as many times as you want; only your best score will be taken into account. Simply reload the page to get a new quiz.

The due date has expired. You can no longer answer this quiz.

Assignment

Make it happen!

This assignment is mandatory. If you update your work but the link doesn't change, you don't need to re-submit it.

Tools

p5.js Web Editor is a web editor for p5.js, a JavaScript library with the goal of making coding accessible to artists, designers, educators, and beginners.

TensorFlow.js is the JavaScript implementation of TensorFlow. You just learned how to use PoseNet, one of its pre-trained models.

PoseNet is a machine learning model which allows for real-time human pose estimation in the browser.

Project

Log in to the p5.js Web Editor with your GitHub account and create a copy of this project.

If you don't have a GitHub account, now is the time to sign up!

Use PoseNet to make your own augmented reality filter. You can use any picture you like or drawing functions from the p5.js library, as long as your filter:

follows at least two elements of the pose other than the ones shown in the video (nose and eyes)
rescales according to the distance between the user and the camera.
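
One possible way to approach the rescaling requirement is to measure the on-screen distance between two keypoints and treat it as a proxy for how far the user is from the camera: the closer the user, the larger that distance in pixels. The snippet below is only a minimal sketch, not a reference solution. It assumes the pose object has the shape returned by the TensorFlow.js PoseNet model (a `keypoints` array of `{ part, position: { x, y }, score }` entries). The helper names `getKeypoint` and `filterScale`, the `referenceDistance` parameter, and the example pose are hypothetical and exist only for this illustration.

```javascript
// Look up one keypoint by its part name (for example "leftShoulder").
// Returns undefined if the pose does not contain that part.
function getKeypoint(pose, part) {
  return pose.keypoints.find((k) => k.part === part);
}

// Estimate a scale factor from the pixel distance between two keypoints.
// referenceDistance is a hypothetical calibration value: the distance in
// pixels at which the filter image should be drawn at its original size.
function filterScale(pose, partA, partB, referenceDistance) {
  const a = getKeypoint(pose, partA);
  const b = getKeypoint(pose, partB);
  if (!a || !b) return 1; // fall back to the original size
  const d = Math.hypot(
    a.position.x - b.position.x,
    a.position.y - b.position.y
  );
  return d / referenceDistance;
}

// Hand-written example pose (assumed shape, not real model output).
const examplePose = {
  score: 0.9,
  keypoints: [
    { part: "leftShoulder", position: { x: 400, y: 300 }, score: 0.95 },
    { part: "rightShoulder", position: { x: 200, y: 300 }, score: 0.93 },
  ],
};

// Shoulders are 200 px apart and the reference is 100 px.
const scale = filterScale(examplePose, "leftShoulder", "rightShoulder", 100);
```

In a p5.js sketch, a factor like this could be multiplied with the width and height passed to `image()` inside `draw()`, so the picture grows as the user moves closer.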

You can refer to the PoseNet GitHub repository to find in-depth documentation and the list of keypoints.
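
For quick reference, PoseNet reports 17 keypoints per pose. The snippet below lists their part names and shows one way to keep only the keypoints detected with enough confidence. It assumes the TensorFlow.js PoseNet output shape (`keypoints` entries of the form `{ part, position, score }`). The `confidentKeypoints` helper, the `0.5` threshold, and the sample pose are assumptions made for this illustration; the repository remains the authoritative source.

```javascript
// Part names of the 17 keypoints reported by PoseNet.
const POSENET_PARTS = [
  "nose",
  "leftEye", "rightEye",
  "leftEar", "rightEar",
  "leftShoulder", "rightShoulder",
  "leftElbow", "rightElbow",
  "leftWrist", "rightWrist",
  "leftHip", "rightHip",
  "leftKnee", "rightKnee",
  "leftAnkle", "rightAnkle",
];

// Keep only keypoints whose confidence score reaches minScore.
// The default threshold of 0.5 is an arbitrary choice for this example.
function confidentKeypoints(pose, minScore = 0.5) {
  return pose.keypoints.filter((k) => k.score >= minScore);
}

// Hand-written sample pose (assumed shape, not real model output).
const samplePose = {
  score: 0.8,
  keypoints: [
    { part: "leftWrist", position: { x: 120, y: 340 }, score: 0.91 },
    { part: "rightWrist", position: { x: 480, y: 350 }, score: 0.12 },
    { part: "leftEar", position: { x: 260, y: 110 }, score: 0.77 },
  ],
};

// Names of the parts that are reliable enough to attach a filter element to.
const visibleParts = confidentKeypoints(samplePose).map((k) => k.part);
```

Skipping low-confidence keypoints avoids drawing filter elements at positions the model is essentially guessing, for example when a wrist is outside the camera frame.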

Be creative! Feel free to add sound, animations, or multiple images!

Submit

When you’re done, paste the link to your p5.js sketch into the form below.

The due date has expired. You can no longer submit your work.

Before submitting, make sure you check all of the criteria below.

My filter follows at least two elements of the "pose" other than the eyes or the nose
My filter rescales automatically based on the distance between the user and the camera
I did this work by myself, and it’s not someone else’s

Going further

Definitions

Computer Vision is a subfield of computer science that focuses on giving computers a higher-level understanding of images (photos or videos). Examples of computer vision research fields include image segmentation, optical character recognition (OCR), and face recognition.
Augmented Reality (AR) is the superposition of computer-generated elements (including but not limited to text, pictures, or sounds) on a representation of the real world.

Tools

TensorFlow is an open-source Python framework developed by Google. Originally developed for complex calculations, it quickly became one of the most widely used tools for machine learning.
TensorFlow.js is its JavaScript implementation. You just learned how to use PoseNet, one of its pre-trained models.
ml5.js is a JavaScript framework developed by teachers from NYU. Built on top of TensorFlow.js, it enables you to quickly use pre-trained models in your browser.
Runway ML is a desktop app that makes it easy to try a lot of machine learning models.
Lens Studio is the official app by Snapchat for creating custom filters.

Resources

PoseNet model documentation
In-depth Medium post by the TensorFlow team, going into the details of how PoseNet works. It will tell you everything that you've always wanted to know!

Projects

**Body, Movement, Language**
Dance performance

**remove.bg**
Automatic image background removal

**Makers' bootcamp #7**
Magic mirror
(coming soon)