
Applying computer vision techniques to Tello

I have no clue if you are still around. If so:
This video kept inspiring me to write TelloMe (an app that can do follow-me based on computer vision, both for Android and iOS).

I also had a look at pose estimation. It looks like this "tello selfie" is a possibility. The detection precision isn't as good as on your full-scale model, but maybe I can work around that.

I just wanted to say thank you. Really cool and inspiring. Are you still on Android, or on iOS now? I'd like to send you a promo code for TelloMe.
 
That sounds great, volatello! I am curious to know how you deal with the follow-me problem.
On my side, I recently had the opportunity to study (via Udacity) the Intel OpenVINO toolkit: a toolkit which allows you to run (essentially) computer vision neural networks on Intel processors. Models from well-known frameworks (TensorFlow, PyTorch, ...) can be converted into an optimized representation and run efficiently on a CPU, for instance. It was an occasion for me to port my tello selfie project to OpenVINO. The GitHub repo is here: geaxgx/tello-humanpose-openvino
I am using an optimized pose estimation model provided by Intel. It is not as accurate as the original OpenPose model, but I am able to run the program at around 10 fps, and it is still usable.
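For those who want to try it, here is a minimal sketch of the OpenVINO inference side. The model name (a pose model from Intel's Open Model Zoo) and the preprocessing are assumptions to adapt to whatever model you actually converted:

```python
# Minimal OpenVINO inference sketch. Assumes the openvino and opencv-python
# packages; the model file and preprocessing are placeholders to adapt.
import cv2
import numpy as np
from openvino.runtime import Core

core = Core()
model = core.read_model("human-pose-estimation-0001.xml")  # .bin sits alongside
compiled = core.compile_model(model, "CPU")                # run on the CPU

frame = cv2.imread("person.jpg")
n, c, h, w = compiled.input(0).shape                       # e.g. 1x3x256x456
blob = cv2.resize(frame, (w, h)).transpose(2, 0, 1)        # HWC -> CHW
blob = blob[np.newaxis].astype(np.float32)

outputs = compiled([blob])  # raw keypoint heatmaps / part affinity fields
# Decoding keypoints (and, in the multi-person case, associating parts
# to people) happens after this step.
```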
I would not be surprised if pose estimation can work on phones now, especially if you do single-person rather than multi-person pose estimation. It is something I wasn't aware of before working on the OpenVINO project: the postprocessing in the multi-person case, where you associate body parts with bodies, can be very CPU intensive. When you know there is only one person in your image, you can skip this costly part.

Sure, I would be very happy to try TelloMe (I am on Android) :D Thank you volatello!
 
For TelloMe iOS I use the iOS Vision framework. This is proprietary and I have no clue what they use under the hood, but it works very well. Even an "old" iPhone 6s can do 7-8 fps for face / person / object detection and tracking, and even more for tracking just visual patterns.

For TelloMe Android I use Firebase ML Kit, which has a nice face tracker. It's a bit slow but very precise, and it gives a lot of detail information that might be useful for future developments (smile detection, Euler angles, face landmarks, etc.).
For object tracking I use TensorFlow, which has some decent lean models. So far I am relying on pre-trained models, as I never found the time to dig into the details of training.

I'll send you an Android promo code in a minute
 
Can I get the source code? I am doing my final year project on this and I am stuck in some places.
 
@shaiq ahmed :
This thread started with a pose tracking approach based on OpenPose, and Volatello joined the discussion with his "TelloMe" app, which uses simpler models for face / person / object detection due to the limited resources on a smartphone.

So what exactly is the approach you are looking for in your final year project?

If you are looking for pose tracking, perhaps you should also take a look at this thread: YAPT - Yet Another Pose Tracking (AI based)
In the first post, I started with a TensorFlow based pose tracking model, which was pretty small (only about 7.5 MB). In contrast to OpenPose, it only detects one person (the most prominent one, which is what you want when you want to follow someone) and has only a fraction of the performance requirements of OpenPose.
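To give an idea of how little code the inference side of such a model needs, here is a minimal sketch for a single-person TFLite pose model. The model file name, the 192x192 input size and the output layout are assumptions in the style of MoveNet Lightning; check your model's actual input/output details:

```python
# Single-person pose inference sketch with a small TFLite model.
# Model file, input size and output layout are assumptions (MoveNet-style).
import cv2
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="movenet_lightning.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

frame = cv2.imread("person.jpg")              # cv2 loads BGR, many models expect RGB
img = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (192, 192))[np.newaxis].astype(inp["dtype"])

interpreter.set_tensor(inp["index"], img)
interpreter.invoke()
keypoints = interpreter.get_tensor(out["index"])  # e.g. [1, 1, 17, 3]: y, x, score
# No part-association step needed: the model outputs exactly one skeleton.
```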

As the frame rate on non-GPU-accelerated systems was still pretty slow, I switched to another pose tracking model, shown in post #3 (YAPT - Yet Another Pose Tracking (AI based)). That model is way more accurate and reliable (e.g. also with unusual arm positions) and delivers about 6-8 fps even on non-GPU-accelerated systems; even on a current smartphone it reaches about 2-4 fps.

I do not intend to share my whole code for that project, but if you are interested, we can discuss some details via PM.
 
Hi, nice to have a dedicated Tello dev forum!

My first experiments: (video)
Hi,
I realize this thread is rather old, but I have a question for you or anyone else who might have more info. In your video (before you use the CSRT tracker), you state your requirement that the face always stays at the center of the bounding box, and your video shows that it does. Is that using a face detector plus a tracker?

My problem is a bit unusual, but not unlike yours. I have an object detector (using a DNN) that detects a stationary object, and I need Tello to navigate to that object. I would like Tello to keep the object at the center of its camera frame; in other words, I would like Tello to track the object as it navigates closer to it. Do you have any recommendations on how to do this?

One other question: how closely does your tracker track the face? As you get closer to the object, the entire object may no longer fit in the box. What do you do in that case? I have a feeling that may not have been a problem for you, since you wanted to take selfies at a certain distance, but if you have any suggestions, please let me know.

Thanks for your help
 
Navigating to an object is not so much different from tracking an object. You need a closed feedback loop with some kind of PID controller. Movements can be controlled using the "RC" command, and in order to get closer, your logic must make sure that the bounding box increases in size.

When the bounding box grows beyond the frame, you have a problem. In that case, you can estimate how much distance is still left and use a "GO" command with that distance for the final approach.
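To make that concrete, here is a minimal proportional-only sketch of such a loop, assuming the djitellopy package. detect() is a hypothetical stand-in for your DNN detector, and all gains are untuned placeholders:

```python
# Proportional-only tracking/approach loop sketch using djitellopy.
# detect() is a hypothetical stand-in for your object detector.
import time
from djitellopy import Tello

def detect(frame):
    """Hypothetical detector: return the target bbox (x, y, w, h) or None."""
    raise NotImplementedError  # plug in your DNN here

def clamp(v):
    return max(-100, min(100, int(v)))  # rc values must stay in [-100, 100]

TARGET_W = 120                      # desired bbox width in px ~ distance (tune!)
K_YAW, K_UD, K_FB = 0.25, 0.3, 0.4  # placeholder gains, found only by tuning

tello = Tello()
tello.connect()
tello.streamon()
tello.takeoff()

while True:
    frame = tello.get_frame_read().frame
    box = detect(frame)
    if box is None:
        tello.send_rc_control(0, 0, 0, 0)      # hover while the target is lost
        continue
    x, y, w, h = box
    err_x = (x + w / 2) - frame.shape[1] / 2   # + means target right of center
    err_y = (y + h / 2) - frame.shape[0] / 2   # + means target below center
    err_d = TARGET_W - w                       # + means still too far away
    # args: left/right, forward/back, up/down, yaw; keep sending or Tello lands
    tello.send_rc_control(0, clamp(K_FB * err_d), clamp(-K_UD * err_y),
                          clamp(K_YAW * err_x))
    time.sleep(0.05)
```

Once the box outgrows the frame, you could break out of the loop and issue a one-shot move with the estimated remaining distance (djitellopy exposes this as go_xyz_speed()), as described above.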
 
Hi, thanks for the input. I should have been more specific about my question. When I detect the object using the DNN, I get the center of its bounding box, plus the width and height of the box. I would like the camera to be centered on the center of that bounding box, which means moving the drone (using the rc command or any other command) until the camera center coincides with the detected box center. How can I translate the pixel-coordinate differential between the current image center and the bounding box center into actual movements of the drone? Do you have any suggestions on how to do this? I can't apply the typical 3D-to-2D graphics transformations here, since I don't know the distance to the object, the focal length, etc.

Another question: if I specify both a forward velocity and an up velocity in the rc command, will the drone move up and forward at an angle? And given that the rc command only lets you specify velocities, does the drone keep moving until it runs out of battery, crashes into another object, or some similar condition?
 
Regarding the translation of pixel distances in your image coordinate system into movement commands: you will have to find that out yourself. Basically, it means figuring out a factor that you apply to the number of pixels you determined as the offset between the actual and the desired position (the "position error"), as well as to the width of the bounding box (which gives you an indicator of distance). This factor will be different for yaw turns, the horizontal movements (x, y) and the vertical movements (z).

In feedback control systems, you usually feed this error into a PID controller. If you only look at the errors and multiply them by a certain factor to obtain the x, y and z values of the rc command, you only have a "P controller". This means you will typically overshoot and run into oscillations, due to the latency in the control loop and the mass inertia. In order to compensate for latency and inertia, you'll also have to figure out values for the I and D terms of the PID controller. A sketch of such a controller follows below.

These values do not fall from the sky, and probably no one here can give them to you. It's a time-consuming tuning process with a lot of trial and error. There are lots of videos on YouTube that explain these principles and give you hints for PID tuning.

Regarding your second question: yes, if you specify forward and upward velocity at the same time, the drone will move forward and upward at an angle. But not endlessly: if Tello does not receive further rc commands within a certain time frame, it will stop moving and land.
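For completeness, here is a minimal sketch of one such PID controller. You would run one instance per axis (yaw, x, y, z), and the gains in the usage comment are placeholders that only the tuning process described above can provide:

```python
# Minimal PID controller sketch for one control axis (e.g. yaw).
import time

class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_err = None
        self.prev_t = None

    def update(self, err):
        t = time.monotonic()
        if self.prev_t is None:              # first sample: pure P term
            self.prev_err, self.prev_t = err, t
            return self.kp * err
        dt = t - self.prev_t
        self.integral += err * dt            # I term fights steady-state error
        deriv = (err - self.prev_err) / dt if dt > 0 else 0.0  # D damps overshoot
        self.prev_err, self.prev_t = err, t
        return self.kp * err + self.ki * self.integral + self.kd * deriv

# Hypothetical usage: one controller per axis, gains found by trial and error.
# yaw_pid = PID(0.25, 0.0, 0.05)
# rc_yaw = max(-100, min(100, int(yaw_pid.update(err_x))))
```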
 
Thanks for your reply. I understand the theory quite well. I was trying to see whether somebody who has done this with Tello could tell me what they found through their calibration exercise, e.g. how much drone movement results in how much pixel movement. Looks like you don't have direct experience doing this. I appreciate your input, though.
 
The examples linked in my signature would not be possible without "direct experience". I can give you hints to cook your meal, but you seem to expect convenience food served on a silver tray.
 
