Build Your Custom Computer Vision App Using Python Library

Dillon Laird

Introduction

In this tutorial, you will learn how to build a simple "Open Door Detector" app with the computer vision platform LandingLens. You will then learn how to deploy it to the cloud and use the landingai-python library to run predictions on new images from your terminal. If you use Zoom (or any other video conferencing tool), this custom computer vision model could automatically detect whether you left your door open in the background!

Requirements

  • LandingLens (register for a free trial here)
  • Python (install for free here)
  • Access to your computer's terminal
  • Webcam
  • A door that you can open and close

Build the Model

First, let's start by creating a new Object Detection project in LandingLens and using the webcam to upload some pictures. After you open the Upload pop-up window, click the blue circular icon to take a picture. Take a few photos with your door open and a few with it closed. LandingLens requires at least 10 labeled images to train a model, so make sure you capture at least 10 photos. Then create a Class and name it "Open Door", and use the bounding-box labeling tool to label the door in every image where it is open.

Once you're done, click the Train button in the upper right corner. When the model has finished training, click the Save button in the Models panel and give it a name like "My Model".

To deploy the model, click the Deploy button in the Models panel. A pop-up window opens, prompting you to select either Cloud Deployment or Edge Deployment. Keep Cloud Deployment selected and click the plus sign at the bottom of the pop-up window to create an endpoint, the virtual location where the model will run inference. Now make sure My Model is selected and click Deploy to deploy the model to that endpoint. Congratulations! You've deployed your first model.

Call the Model From the Python Library

First, you'll want to generate an API key so you can run inference on the model. You can find this by clicking the User Menu in the top right corner and selecting API Key. Then click Create New Key and save your API key. If you're having trouble, check out the API Key tutorial. To ensure the software knows which endpoint to call, you will also need the endpoint ID. To find this, go to the Deploy page, then click the Copy Endpoint ID button next to the endpoint name and save it somewhere. Next, we'll install the Python library. To install the landingai-python library, run the following command in your terminal:

pip install landingai

Now you're ready to use the Python library! The main class you'll be using is the Predictor class, which allows you to send images to your endpoint; the model then makes predictions on those images. You can create a Predictor by importing it from landingai and passing it the endpoint ID and API key.


from landingai.predict import Predictor

endpoint_id = "place your endpoint ID here"
api_key = "place your api key here"

door_model = Predictor(endpoint_id, api_key=api_key)
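
Hard-coding credentials in a script is fine for a quick test, but you may prefer to keep the API key out of your code. Here is a minimal sketch that reads it from an environment variable using Python's standard os module; the variable name MY_LANDINGAI_API_KEY is just an example, not something the library requires.


import os

from landingai.predict import Predictor

# Set the variable in your shell first, for example:
#   export MY_LANDINGAI_API_KEY="place your api key here"
api_key = os.environ["MY_LANDINGAI_API_KEY"]

endpoint_id = "place your endpoint ID here"
door_model = Predictor(endpoint_id, api_key=api_key)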

Now, take a photo with your webcam (or download one of the earlier webcam photos from the LandingLens project) and save it as "my_image.png". We can open the image using the Python Imaging Library (PIL) Image class and then convert it to a NumPy array to pass to the predictor. We can also import overlay_predictions to draw the predictions on the image.


import numpy as np
from PIL import Image

from landingai.visualize import overlay_predictions

image = np.asarray(Image.open("my_image.png"))
preds = door_model.predict(image)
image_with_preds = overlay_predictions(preds, image)
image_with_preds.save("my_image_with_preds.png")
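
If you want to inspect the raw predictions rather than only the overlaid image, you can also loop over preds directly. As a rough sketch, assuming each prediction object exposes label_name and score attributes (check the landingai documentation for the exact field names in your version):


for pred in preds:
    # print each detected class and its confidence score
    print(pred.label_name, pred.score)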

You can save the image with the predictions overlaid as "my_image_with_preds.png" and view it on your computer. You can take this a step further and run the model live on your webcam. To do this, import the NetworkedCamera class, which gives you an easier way to interface with the model and the images. You can loop over frames captured from the NetworkedCamera, run predictions on them with run_predict, overlay the predictions with overlay_predictions, and save them all in one line! This is illustrated in the following snippet. To run the code, make sure your terminal has permission to access your webcam.


from landingai.predict import Predictor
from landingai.pipeline.image_source import NetworkedCamera

endpoint_id = "place your endpoint ID here"
api_key = "place your api key here"

door_model = Predictor(endpoint_id, api_key=api_key)
# 0 selects your default webcam device
camera = NetworkedCamera(0)

for i, frame in enumerate(camera):
    # stop after 5 frames so the loop doesn't run indefinitely
    if i >= 5:
        break

    # the image_src="overlay" tells it to save the image from the
    # overlay operation, not the original image
    frame.run_predict(predictor=door_model).overlay_predictions().save_image(
        filename_prefix=f"webcam.{i}", image_src="overlay"
    )

Now you can view the saved images! You could take the application further by running it live while you work and popping up a warning window whenever the door behind you opens.
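
As a starting point, here is a minimal sketch of that alert using Python's built-in tkinter message box on a single saved frame. It assumes each prediction object exposes a label_name attribute matching the class you created in LandingLens; check the landingai documentation for the exact field name in your version.


import numpy as np
from PIL import Image
from tkinter import Tk, messagebox

from landingai.predict import Predictor

endpoint_id = "place your endpoint ID here"
api_key = "place your api key here"
door_model = Predictor(endpoint_id, api_key=api_key)

# run a single prediction on a saved webcam frame
image = np.asarray(Image.open("my_image.png"))
preds = door_model.predict(image)

# pop up a warning if any detection is labeled "Open Door"
if any(getattr(p, "label_name", "") == "Open Door" for p in preds):
    root = Tk()
    root.withdraw()  # hide the empty root window
    messagebox.showwarning("Door Detector", "The door behind you is open!")
    root.destroy()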

Conclusion

In this tutorial, you learned how to create a computer vision application using Python: from starting a new Object Detection project to training, deploying, and calling your model from the Python library so it can be integrated into a real application. To learn more about the APIs, check out our GitHub page.

 
