|
| 1 | +{ |
| 2 | + "cells": [ |
| 3 | + { |
| 4 | + "cell_type": "code", |
| 5 | + "execution_count": 1, |
| 6 | + "metadata": {}, |
| 7 | + "outputs": [], |
| 8 | + "source": [ |
| 9 | + "# import the necessary packages\n", |
| 10 | + "import time\n", |
| 11 | + "import cv2\n", |
| 12 | + "import os\n", |
| 13 | + "import numpy as np\n", |
| 14 | + "import matplotlib.pyplot as plt" |
| 15 | + ] |
| 16 | + }, |
| 17 | + { |
| 18 | + "cell_type": "markdown", |
| 19 | + "metadata": {}, |
| 20 | + "source": [ |
| 21 | + "## Position of ball" |
| 22 | + ] |
| 23 | + }, |
| 24 | + { |
| 25 | + "cell_type": "markdown", |
| 26 | + "metadata": {}, |
| 27 | + "source": [ |
| 28 | + "We're given 15 frames of a batter hitting a baseball, and we have to programmatically find the position of the baseball in each frame. It can be done in a few steps:\n", |
| 29 | + "\n", |
| 30 | + "1. Background Subtraction : Subtract the previous frame from the current frame to get the moving pixels in the current frame. In our case, objects in motion would mainly be the body of the hitter, the bat and the baseball.\n", |
| 31 | + "2. Hough Circle : Once we have the areas of interest, we perform `HoughCircles` on these regions to look for circular objects (the baseball in our case).\n", |
| 32 | + "3. Filter : In some cases, `HoughCircles` outputs false positives for the baseball at the bottom of the bat, which is also circular. These can be filtered out in a number of ways, for example by greyscale intensity or by position. In our case, we use position: we introduce a variable `leftmost`, which holds the x-coordinate of the leftmost circle detected so far. Assuming the ball always travels to the left, any detection to the right of `leftmost` is ignored. This gives the correct location for the baseball." |
| 33 | + ] |
| 34 | + }, |
| 35 | + { |
| 36 | + "cell_type": "markdown", |
| 37 | + "metadata": {}, |
| 38 | + "source": [ |
| 39 | + "Step 1 : Perform Background Subtraction" |
| 40 | + ] |
| 41 | + }, |
| 42 | + { |
| 43 | + "cell_type": "code", |
| 44 | + "execution_count": 2, |
| 45 | + "metadata": {}, |
| 46 | + "outputs": [], |
| 47 | + "source": [ |
| 48 | + "seg_imgs = []\n", |
| 49 | + "firstFrame = None\n", |
| 50 | + "for i in range(1,16):\n", |
| 51 | + "    \n", |
| 52 | + "    img_file = 'images/IMG' + str(i) + '.bmp'\n", |
| 53 | + "    # read the image\n", |
| 54 | + "    img = cv2.imread(img_file)\n", |
| 55 | + "    # convert to greyscale\n", |
| 56 | + "    gray_ = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)\n", |
| 57 | + "    # Remove the noise\n", |
| 58 | + "    gray = cv2.GaussianBlur(gray_, (15, 15), 0)\n", |
| 59 | + "\n", |
| 60 | + "    # firstFrame holds the previous frame; on the first iteration there is none yet, so store it and move on\n", |
| 61 | + "    if firstFrame is None:\n", |
| 62 | + "        firstFrame = gray\n", |
| 63 | + "        continue\n", |
| 64 | + "    \n", |
| 65 | + "    # compute the absolute difference between the current frame and the previous frame\n", |
| 66 | + "    frameDelta = cv2.absdiff(firstFrame, gray)\n", |
| 67 | + "    \n", |
| 68 | + "    # Create a binary representation\n", |
| 69 | + "    thresh = cv2.threshold(frameDelta, 25, 255, cv2.THRESH_BINARY)[1]\n", |
| 70 | + "    \n", |
| 71 | + "    # dilate the thresholded image to fill in holes\n", |
| 72 | + "    thresh = cv2.dilate(thresh, None, iterations=2)\n", |
| 73 | + "\n", |
| 74 | + "    # Mask the unblurred greyscale frame with the motion mask (the uint8 product wraps modulo 256)\n", |
| 75 | + "    seg = thresh*gray_\n", |
| 76 | + "\n", |
| 77 | + "    # The current frame becomes the previous frame for the next iteration\n", |
| 78 | + "    firstFrame = gray\n", |
| 79 | + "    \n", |
| 80 | + "    seg_imgs.append(seg)\n" |
| 81 | + ] |
| 82 | + }, |
| 83 | + { |
| 84 | + "cell_type": "markdown", |
| 85 | + "metadata": {}, |
| 86 | + "source": [ |
| 87 | + "Step 2 & 3: Apply `Hough Transform` to the subtracted image to get circular shaped objects (baseball in our case)" |
| 88 | + ] |
| 89 | + }, |
| 90 | + { |
| 91 | + "cell_type": "code", |
| 92 | + "execution_count": 3, |
| 93 | + "metadata": {}, |
| 94 | + "outputs": [ |
| 95 | + { |
| 96 | + "name": "stdout", |
| 97 | + "output_type": "stream", |
| 98 | + "text": [ |
| 99 | + "position of ball in 2 image : [551, 802]\n", |
| 100 | + "position of ball in 3 image : [538, 803]\n", |
| 101 | + "position of ball in 4 image : [510, 799]\n", |
| 102 | + "position of ball in 5 image : [481, 792]\n", |
| 103 | + "position of ball in 6 image : [450, 784]\n", |
| 104 | + "position of ball in 7 image : [417, 778]\n", |
| 105 | + "position of ball in 8 image : [379, 769]\n", |
| 106 | + "position of ball in 9 image : [341, 761]\n", |
| 107 | + "position of ball in 10 image : [296, 747]\n", |
| 108 | + "position of ball in 11 image : [253, 742]\n", |
| 109 | + "position of ball in 12 image : [202, 732]\n", |
| 110 | + "position of ball in 13 image : [149, 723]\n", |
| 111 | + "position of ball in 14 image : [93, 707]\n", |
| 112 | + "position of ball in 15 image : [30, 695]\n" |
| 113 | + ] |
| 114 | + } |
| 115 | + ], |
| 116 | + "source": [ |
| 117 | + "center = []\n", |
| 118 | + "leftmost = 1280\n", |
| 119 | + "for j, seg1 in enumerate(seg_imgs):\n", |
| 120 | + "    bgr2gray = seg1.copy()\n", |
| 121 | + "    \n", |
| 122 | + "    # define maxRadius depending on the frame, as the ball appears larger in later frames\n", |
| 123 | + "    if j > 6:\n", |
| 124 | + "        maxRadius = 23\n", |
| 125 | + "    else:\n", |
| 126 | + "        maxRadius = 19\n", |
| 127 | + "    \n", |
| 128 | + "    # apply HoughCircles\n", |
| 129 | + "    circle = cv2.HoughCircles(bgr2gray, cv2.HOUGH_GRADIENT, 1, 30, param1 = 100, param2 = 14, minRadius = 13, maxRadius = maxRadius)\n", |
| 130 | + "    circles = np.uint(circle)\n", |
| 131 | + "    \n", |
| 132 | + "    # Loop through the circles\n", |
| 133 | + "    for i in circles[0,:]:\n", |
| 134 | + "        \n", |
| 135 | + "        # only consider points to the left of the center of baseball in previous frame\n", |
| 136 | + "        if i[0]<leftmost-10:\n", |
| 137 | + "            # draw the outer circle\n", |
| 138 | + "            cv2.circle(bgr2gray,(i[0],i[1]),i[2],(0,255,0),1)\n", |
| 139 | + "            # draw the center of the circle\n", |
| 140 | + "            cv2.circle(bgr2gray,(i[0],i[1]),2,(0,0,255),3)\n", |
| 141 | + "            center.append([i[0],i[1]])\n", |
| 142 | + "            print('position of ball in {} image : {}'.format(j+2, [i[0],i[1]]))\n", |
| 143 | + "\n", |
| 144 | + "    # update the leftmost point after every frame\n", |
| 145 | + "    leftmost = np.sort(np.array(center), axis=0)[0][0]\n" |
| 146 | + ] |
| 147 | + }, |
| 148 | + { |
| 149 | + "cell_type": "markdown", |
| 150 | + "metadata": {}, |
| 151 | + "source": [ |
| 152 | + "Write the results to the `results/` folder. Each image shows the center of the baseball." |
| 153 | + ] |
| 154 | + }, |
| 155 | + { |
| 156 | + "cell_type": "code", |
| 157 | + "execution_count": 4, |
| 158 | + "metadata": {}, |
| 159 | + "outputs": [], |
| 160 | + "source": [ |
| 161 | + "# Write the results \n", |
| 162 | + "for j in range(2,16):\n", |
| 163 | + "    img_file = 'IMG' + str(j) + '.bmp'\n", |
| 164 | + "    img = cv2.imread(img_file)\n", |
| 165 | + "    cv2.circle(img,(center[j-2][0],center[j-2][1]),2,(0,0,255),3)\n", |
| 166 | + "    cv2.imwrite('results/IMG' + str(j) + 'result.jpg', img)" |
| 167 | + ] |
| 168 | + }, |
| 169 | + { |
| 170 | + "cell_type": "markdown", |
| 171 | + "metadata": {}, |
| 172 | + "source": [ |
| 173 | + "## Velocity of ball" |
| 174 | + ] |
| 175 | + }, |
| 176 | + { |
| 177 | + "cell_type": "markdown", |
| 178 | + "metadata": {}, |
| 179 | + "source": [ |
| 180 | + "Calculate velocity by dividing the Euclidean distance between the ball positions in consecutive frames by the time elapsed between frames.\n", |
| 181 | + "\n", |
| 182 | + "Euclidean distance is calculated using \n", |
| 183 | + "\n", |
| 184 | + "<center>$d(p,q) = \\sqrt{(p_1-q_1)^2 + (p_2-q_2)^2 + ... + (p_n-q_n)^2}$</center>\n", |
| 185 | + "<center>where p and q are n-dimensional points of the same object in consecutive frames</center>\n", |
| 186 | + "\n", |
| 187 | + "<br>\n", |
| 188 | + "For the time elapsed, we take the reciprocal of `fps`, which is 240 in our case\n", |
| 189 | + "\n", |
| 190 | + "<center>$t = (1/fps)$</center>\n", |
| 191 | + "\n", |
| 192 | + "<br>\n", |
| 193 | + "Finally, velocity is calculated\n", |
| 194 | + "\n", |
| 195 | + "<center>$vel = d(p,q)/t$</center>" |
| 196 | + ] |
| 197 | + }, |
| 198 | + { |
| 199 | + "cell_type": "code", |
| 200 | + "execution_count": 5, |
| 201 | + "metadata": {}, |
| 202 | + "outputs": [ |
| 203 | + { |
| 204 | + "name": "stdout", |
| 205 | + "output_type": "stream", |
| 206 | + "text": [ |
| 207 | + "velocity of ball in 3 image : 15.02 mm/s\n", |
| 208 | + "velocity of ball in 4 image : 32.583 mm/s\n", |
| 209 | + "velocity of ball in 5 image : 34.367 mm/s\n", |
| 210 | + "velocity of ball in 6 image : 36.882 mm/s\n", |
| 211 | + "velocity of ball in 7 image : 38.639 mm/s\n", |
| 212 | + "velocity of ball in 8 image : 44.987 mm/s\n", |
| 213 | + "velocity of ball in 9 image : 44.736 mm/s\n", |
| 214 | + "velocity of ball in 10 image : 54.291 mm/s\n", |
| 215 | + "velocity of ball in 11 image : 49.87 mm/s\n", |
| 216 | + "velocity of ball in 12 image : 59.871 mm/s\n", |
| 217 | + "velocity of ball in 13 image : 61.93 mm/s\n", |
| 218 | + "velocity of ball in 14 image : 67.093 mm/s\n", |
| 219 | + "velocity of ball in 15 image : 73.881 mm/s\n" |
| 220 | + ] |
| 221 | + } |
| 222 | + ], |
| 223 | + "source": [ |
| 224 | + "# Get the velocity \n", |
| 225 | + "velocity = []\n", |
| 226 | + "# convert fps to seconds per frame\n", |
| 227 | + "delta_t = (1/240)\n", |
| 228 | + "for i in range(1,len(center)):\n", |
| 229 | + "\n", |
| 230 | + "    # Calculate Euclidean distance between current and previous frame\n", |
| 231 | + "    # (np.int was removed from NumPy, so the built-in int is used)\n", |
| 232 | + "    dist = np.sqrt((int(center[i][0]) - int(center[i-1][0]))**2 + \n", |
| 233 | + "                   (int(center[i][1]) - int(center[i-1][1]))**2)\n", |
| 234 | + "\n", |
| 235 | + "    # Convert pixel to mm\n", |
| 236 | + "    pix2mm = dist*0.0048 \n", |
| 237 | + "    \n", |
| 238 | + "    # Calculate velocity\n", |
| 239 | + "    vel = np.round((pix2mm/delta_t),3)\n", |
| 240 | + "    \n", |
| 241 | + "    velocity.append(vel)\n", |
| 242 | + "    print('velocity of ball in {} image : {} mm/s'.format(i+2, vel))\n", |
| 242 | + " \n", |
| 243 | + " " |
| 244 | + ] |
| 245 | + }, |
| 246 | + { |
| 247 | + "cell_type": "markdown", |
| 248 | + "metadata": {}, |
| 249 | + "source": [ |
| 250 | + "## Alternatives" |
| 251 | + ] |
| 252 | + }, |
| 253 | + { |
| 254 | + "cell_type": "markdown", |
| 255 | + "metadata": {}, |
| 256 | + "source": [ |
| 257 | + "**Lucas-Kanade Method**\n", |
| 258 | + " \n", |
| 259 | + " I tried sparse optical flow to track the movement of the baseball. I performed the following steps:\n", |
| 260 | + " \n", |
| 261 | + " 1. Used `HoughCircles` on the first frame to get a few circular objects (the baseball being one of them).\n", |
| 262 | + " 2. For the next frame, used `cv2.calcOpticalFlowPyrLK` to get the optical flow of the selected circular objects and kept only the ones with new points (this gives the baseball location).\n", |
| 263 | + " 3. Repeated step 2 for all the subsequent frames\n", |
| 264 | + " \n", |
| 265 | + "Problem : After frame 9, it loses track of the baseball and gives wrong predictions.<br>\n", |
| 266 | + "Reason : After frame 9, the apparent motion gets larger from the camera's perspective, i.e., the distance between the ball positions in consecutive frames increases.<br>\n", |
| 267 | + "Solutions tried : Using pyramid levels with Lucas-Kanade and a larger window size.\n", |
| 268 | + "\n", |
| 269 | + "I was still unable to improve the predictions with these changes, so I gave up on this approach." |
| 270 | + ] |
| 271 | + }, |
| 272 | + { |
| 273 | + "cell_type": "code", |
| 274 | + "execution_count": 6, |
| 275 | + "metadata": {}, |
| 276 | + "outputs": [ |
| 277 | + { |
| 278 | + "name": "stdout", |
| 279 | + "output_type": "stream", |
| 280 | + "text": [ |
| 281 | + "2 frame\n", |
| 282 | + "position of ball: [551, 802]\n", |
| 283 | + "\n", |
| 284 | + "\n", |
| 285 | + "3 frame\n", |
| 286 | + "position of ball: [538, 803]\n", |
| 287 | + "velocity of ball : 15.02 mm/s\n", |
| 288 | + "\n", |
| 289 | + "\n", |
| 290 | + "4 frame\n", |
| 291 | + "position of ball: [510, 799]\n", |
| 292 | + "velocity of ball : 32.583 mm/s\n", |
| 293 | + "\n", |
| 294 | + "\n", |
| 295 | + "5 frame\n", |
| 296 | + "position of ball: [481, 792]\n", |
| 297 | + "velocity of ball : 34.367 mm/s\n", |
| 298 | + "\n", |
| 299 | + "\n", |
| 300 | + "6 frame\n", |
| 301 | + "position of ball: [450, 784]\n", |
| 302 | + "velocity of ball : 36.882 mm/s\n", |
| 303 | + "\n", |
| 304 | + "\n", |
| 305 | + "7 frame\n", |
| 306 | + "position of ball: [417, 778]\n", |
| 307 | + "velocity of ball : 38.639 mm/s\n", |
| 308 | + "\n", |
| 309 | + "\n", |
| 310 | + "8 frame\n", |
| 311 | + "position of ball: [379, 769]\n", |
| 312 | + "velocity of ball : 44.987 mm/s\n", |
| 313 | + "\n", |
| 314 | + "\n", |
| 315 | + "9 frame\n", |
| 316 | + "position of ball: [341, 761]\n", |
| 317 | + "velocity of ball : 44.736 mm/s\n", |
| 318 | + "\n", |
| 319 | + "\n", |
| 320 | + "10 frame\n", |
| 321 | + "position of ball: [296, 747]\n", |
| 322 | + "velocity of ball : 54.291 mm/s\n", |
| 323 | + "\n", |
| 324 | + "\n", |
| 325 | + "11 frame\n", |
| 326 | + "position of ball: [253, 742]\n", |
| 327 | + "velocity of ball : 49.87 mm/s\n", |
| 328 | + "\n", |
| 329 | + "\n", |
| 330 | + "12 frame\n", |
| 331 | + "position of ball: [202, 732]\n", |
| 332 | + "velocity of ball : 59.871 mm/s\n", |
| 333 | + "\n", |
| 334 | + "\n", |
| 335 | + "13 frame\n", |
| 336 | + "position of ball: [149, 723]\n", |
| 337 | + "velocity of ball : 61.93 mm/s\n", |
| 338 | + "\n", |
| 339 | + "\n", |
| 340 | + "14 frame\n", |
| 341 | + "position of ball: [93, 707]\n", |
| 342 | + "velocity of ball : 67.093 mm/s\n", |
| 343 | + "\n", |
| 344 | + "\n", |
| 345 | + "15 frame\n", |
| 346 | + "position of ball: [30, 695]\n", |
| 347 | + "velocity of ball : 73.881 mm/s\n", |
| 348 | + "\n", |
| 349 | + "\n" |
| 350 | + ] |
| 351 | + } |
| 352 | + ], |
| 353 | + "source": [ |
| 354 | + "# Print the Final values\n", |
| 355 | + "\n", |
| 356 | + "for i, c in enumerate(center):\n", |
| 357 | + "    \n", |
| 358 | + "    print('{} frame'.format(i+2))\n", |
| 359 | + "    print('position of ball: {}'.format(c))\n", |
| 360 | + "    if i>0:\n", |
| 361 | + "        print('velocity of ball : {} mm/s'.format(velocity[i-1]))\n", |
| 362 | + "    print('\\n')" |
| 363 | + ] |
| 364 | + } |
| 365 | + ], |
| 366 | + "metadata": { |
| 367 | + "kernelspec": { |
| 368 | + "display_name": "Python 3", |
| 369 | + "language": "python", |
| 370 | + "name": "python3" |
| 371 | + }, |
| 372 | + "language_info": { |
| 373 | + "codemirror_mode": { |
| 374 | + "name": "ipython", |
| 375 | + "version": 3 |
| 376 | + }, |
| 377 | + "file_extension": ".py", |
| 378 | + "mimetype": "text/x-python", |
| 379 | + "name": "python", |
| 380 | + "nbconvert_exporter": "python", |
| 381 | + "pygments_lexer": "ipython3", |
| 382 | + "version": "3.6.4" |
| 383 | + } |
| 384 | + }, |
| 385 | + "nbformat": 4, |
| 386 | + "nbformat_minor": 2 |
| 387 | +} |
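
The velocity step in this notebook can be cross-checked in plain Python, without NumPy or OpenCV. The following is a minimal sketch, not the notebook's own code: the function name `ball_speeds` and its parameter names are hypothetical, while the 240 fps frame rate, the 0.0048 mm-per-pixel factor, and the sample positions are taken from the notebook.

```python
import math


def ball_speeds(centers, fps=240, mm_per_pixel=0.0048):
    """Return the speed (mm/s) between each pair of consecutive centers.

    `ball_speeds`, `fps` and `mm_per_pixel` are illustrative names; the
    default values are the ones used in the notebook.
    """
    delta_t = 1.0 / fps  # seconds elapsed between two frames
    speeds = []
    for (x0, y0), (x1, y1) in zip(centers, centers[1:]):
        dist_px = math.hypot(x1 - x0, y1 - y0)  # Euclidean distance in pixels
        speeds.append(dist_px * mm_per_pixel / delta_t)
    return speeds


# First three detected positions from the notebook output (frames 2 to 4).
centers = [(551, 802), (538, 803), (510, 799)]
speeds = ball_speeds(centers)
print([round(s, 3) for s in speeds])
```

Rounded to three decimals, the two values agree with the 15.02 mm/s and 32.583 mm/s reported for images 3 and 4 in the notebook output. Converting the coordinates to plain Python numbers before subtracting avoids the unsigned wrap-around that `np.uint` values would otherwise produce when the ball moves left.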