**Assignment 12: Inference and Applications**
*Due date found on gradescope*
[Back to Neural Networks](http://cs.pomona.edu/classes/cs152/)

# Learning Goals

- Learn how to export a trained model.
- Learn how to import a trained model.
- Learn how to use a model for *inference*.
- Understand the general process for building an AI application.

# Grading Walk-Throughs

This assignment will be graded as pass/needs-revisions by a TA. To pass the assignment, you must

1. Complete the assignment and submit your work to gradescope.
    - You should start this assignment in class on the day shown on the calendar.
    - **Complete the assignment as early as possible**.
2. Schedule a time to meet with a TA prior to the deadline.
    - You must book a time to meet with a TA.
    - Sign up on the Google Sheet **with at least 36 hours of notice**.
    - Contact your TA on Slack after signing up.
    - All partners must meet with the TA. If you can't all make it at the same time, then each of you needs to schedule a time to meet with the TAs.
3. Walk the TA through your solutions prior to the deadline.
    - Walk-throughs should take no more than 20 minutes.
    - You should be well prepared to walk a TA through your answers.
    - You may not make any significant corrections during the walk-through. You should plan on making corrections afterward and scheduling a new walk-through time. Mistakes are expected--nobody is perfect.
    - You must be prepared to explain your answers and justify your assumptions. TAs do not need to lead you to the correct answer during a walk-through--this is best left to a mentor session.
4. The TA will then either
    - mark your assignment as "pass" on gradescope, or
    - mark your assignment as "needs-revisions" and inform you that you have some corrections to make.
5. If corrections are needed, then you will need to complete them and then schedule a new time to meet with the TA.
    - You will ideally complete any needed revisions by the end of the day the following Monday.

If you have concerns about the grading walk-through, you can meet with me after you've first met with a TA.

# Overview

In this assignment you will create a [gradio](https://gradio.app/) application for an already-trained model from either PyTorch ([torchvision](https://pytorch.org/vision/stable/models.html)) or [Hugging Face](https://huggingface.co/).

# Image Collection for Lecture Demo

Prior to the assignment, I will demonstrate the process using the image classification task. To do so, we need to create our own dataset. We will be training a model to classify pictures as either a **palm tree** (any [Arecaceae tree](https://en.wikipedia.org/wiki/Arecaceae) will suffice) or a **pine tree** (any [coniferous tree](https://en.wikipedia.org/wiki/Conifer) will suffice).

Everyone must:

1. Take 16 pictures of [palm trees](https://en.wikipedia.org/wiki/Arecaceae)
2. Take 16 pictures of [pine trees](https://en.wikipedia.org/wiki/Conifer)
3. Put them on Box in the folder (you'll receive an email to your college email address)
    - Palm pictures go in the "Palm" sub-folder
    - Pine pictures go in the "Pine" sub-folder

(You can add more than 32 total images if you'd like!)

Tips for taking pictures:

- Take pictures in different locations
- Use different angles
- Don't always have the item in the center
- Use different lighting
- Don't always take the image with the same background

File naming conventions:

For the sake of anonymity, you can use whatever random identifier you'd like in your filenames. You do not have to, but I also recommend removing the metadata from your image files (if you do not do so, I will before class--and I'll show you how using `jpegtran -copy none`).
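The metadata-stripping step can be sketched in Python as follows. This is a minimal sketch, assuming `jpegtran` (part of the libjpeg tools) is installed and on your `PATH`. The helper names `jpegtran_command` and `strip_metadata` are hypothetical, and `-outfile` is used here only to name the cleaned copy.

~~~python
import subprocess
from pathlib import Path


def jpegtran_command(src: Path, dst: Path) -> list[str]:
    """Build a jpegtran call that copies no extra markers (EXIF, comments) into dst."""
    # "-copy none" is the option mentioned above; "-outfile" names the output file.
    return ["jpegtran", "-copy", "none", "-outfile", str(dst), str(src)]


def strip_metadata(src: Path, dst: Path) -> None:
    """Run jpegtran on one image; raises CalledProcessError if the command fails."""
    subprocess.run(jpegtran_command(src, dst), check=True)
~~~

For example, `strip_metadata(Path("original.jpg"), Path("cleaned.jpg"))` would write a metadata-free copy next to the original (both filenames are placeholders).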
Name your files using this scheme: `<id>-<type>-<0#>.<ext>`

Where

- `<id>` is any random identifier that you choose that helps me know that they are your images (so that you get credit)
- `<type>` is either "palm" or "pine"
- `<0#>` denotes a zero-padded number ("01", "02", ... "16")
- `<ext>` is your file extension (probably ".jpg" or the like)

For example, this is what I'll name my files:

~~~text
profclark-palm-01.jpg
profclark-palm-02.jpg
profclark-palm-03.jpg
profclark-palm-04.jpg
profclark-palm-05.jpg
profclark-palm-06.jpg
profclark-palm-07.jpg
profclark-palm-08.jpg
profclark-palm-09.jpg
profclark-palm-10.jpg
profclark-palm-11.jpg
profclark-palm-12.jpg
profclark-palm-13.jpg
profclark-palm-14.jpg
profclark-palm-15.jpg
profclark-palm-16.jpg
profclark-pine-01.jpg
profclark-pine-02.jpg
profclark-pine-03.jpg
profclark-pine-04.jpg
profclark-pine-05.jpg
profclark-pine-06.jpg
profclark-pine-07.jpg
profclark-pine-08.jpg
profclark-pine-09.jpg
profclark-pine-10.jpg
profclark-pine-11.jpg
profclark-pine-12.jpg
profclark-pine-13.jpg
profclark-pine-14.jpg
profclark-pine-15.jpg
profclark-pine-16.jpg
~~~

If you make a mistake, you *should* be able to upload a file with the same name to overwrite the original upload.

**You will put your images in the folders at the Box link you receive by email.**

# Assignment Tasks

For this assignment you will

1. Select a pre-trained model from one of these sources:
    - [torchvision](https://pytorch.org/vision/stable/models.html): these are just for vision problems, including classification, multi-class classification, segmentation, object detection, and video classification
    - [Hugging Face](https://huggingface.co/models)
2. Build an application that runs on the server. You do not need to (and probably should not) deploy your application to any cloud service.

It is possible to complete this assignment with pretty minimal effort, but I highly encourage you to spend some time and create an app that you'd find interesting.
For example, if object detection sounds interesting, then take a look at [this documentation](https://pytorch.org/vision/stable/models.html#object-detection-instance-segmentation-and-person-keypoint-detection) and grab a pretrained [Faster R-CNN model](https://arxiv.org/abs/1506.01497) (I also have a demo of using this model listed below).

Here are some examples and documentation that you will find useful:

- [Hugging Face demos](https://github.com/anthonyjclark/cs152sp22/blob/main/Assignments/A01-Demos/ModelPlayground.ipynb): I show some basic usage of Hugging Face models for sentiment analysis, text generation, translation, and a few image tasks
- [torchvision demos](https://github.com/anthonyjclark/cs152fa21/blob/main/lectures/l27-playground.ipynb): Here is some code I demonstrated last semester that shows how to use pretrained torchvision models for object detection and segmentation
- [Pipelines for inference](https://huggingface.co/docs/transformers/pipeline_tutorial): This documentation steps you through how you can use existing Hugging Face models for inference
- [List of pipelines](https://huggingface.co/docs/transformers/v4.18.0/en/main_classes/pipelines#transformers.pipeline): This lists all existing pipelines (some for audio, text, tables, and images)
- [Getting started with gradio](https://gradio.app/getting_started/): Once you've selected a model, you'll use gradio to build an app
- [Hugging Face tutorial](https://huggingface.co/course/chapter1/1): Here is a more detailed tutorial for Hugging Face (maybe something you could look at for your projects)

## Hugging Face with gradio

gradio makes it (almost too) easy to use Hugging Face models. [Here is a two-line Python script for image classification](https://gradio.app/image_classification_with_vision_transformers/). I'd like you to put in more effort than just taking this existing example.

# Submitting Your Assignment

You will submit your code and/or responses on gradescope.
**Only one partner should submit.** The submitter will add the other partner through the gradescope interface.

To pass the autograder (if one exists for this assignment), your output must exactly match the expected output. Your program output should be similar to the example execution above, and the autograder on gradescope will show you the correct output if yours is not formatted properly. You can use [text-compare](https://text-compare.com/) to compare your output to the expected output and that should give you an idea if you have a misspelled word or extra space (or if I do).

Additional details for using gradescope can be found here:

- [Submitting an Assignment](https://help.gradescope.com/article/ccbpppziu9-student-submit-work)
- [Adding Group Members](https://help.gradescope.com/article/m5qz2xsnjy-student-add-group-members)
- [gradescope Student Help Center](https://help.gradescope.com/category/cyk4ij2dwi-student-workflow)
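
For the output comparison described above, a local check is also possible. The following is a minimal sketch using Python's standard `difflib` module. The function name `show_differences` is hypothetical, and this is not how the autograder itself compares output.

~~~python
import difflib


def show_differences(expected: str, actual: str) -> list[str]:
    """Return unified-diff lines; an empty list means every line matches."""
    # splitlines() drops line endings, so a missing final newline is not reported.
    return list(
        difflib.unified_diff(
            expected.splitlines(),
            actual.splitlines(),
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
    )
~~~

Printing `"\n".join(show_differences(expected, actual))` lists lines that differ with `-` (expected) and `+` (actual) prefixes, which makes an extra space or misspelled word easier to spot.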