{"id":1706,"date":"2025-07-13T23:48:36","date_gmt":"2025-07-13T21:48:36","guid":{"rendered":"https:\/\/cammonte.com\/?page_id=1706"},"modified":"2025-07-14T07:29:35","modified_gmt":"2025-07-14T05:29:35","slug":"opencv-interview-questions","status":"publish","type":"page","link":"https:\/\/cammonte.com\/index.php\/opencv-interview-questions\/","title":{"rendered":"OpenCV Interview Questions"},"content":{"rendered":"\n<h1 class=\"wp-block-heading\">Random notes<\/h1>\n\n\n\n<ul class=\"wp-block-list\">\n<li>OpenCV stores colour images in BGR order instead of RGB; convert (e.g. <code>cv2.cvtColor(img, cv2.COLOR_BGR2RGB)<\/code>) before passing images to other libraries<\/li>\n\n\n\n<li>The OpenCV image coordinate system originates at the top-left corner (x grows to the right, y grows downward)<\/li>\n<\/ul>\n\n\n\n<h1 class=\"wp-block-heading\">Basic Questions<\/h1>\n\n\n\n<h2 class=\"wp-block-heading\">What is OpenCV and why is it used?<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Open-source library for real-time computer vision<\/li>\n\n\n\n<li>Used for image processing, video processing<\/li>\n\n\n\n<li>Cross-platform and can be used in Python, C++, Java and MATLAB<\/li>\n\n\n\n<li>Efficient (can be used for real-time) and extensive function library<\/li>\n\n\n\n<li>Real world uses\n<ul class=\"wp-block-list\">\n<li>Security and surveillance<\/li>\n\n\n\n<li>Industrial automation: visual inspection systems<\/li>\n\n\n\n<li>Healthcare and medical imaging: anomaly detection in scans<\/li>\n\n\n\n<li>Robotics and drones: SLAM<\/li>\n\n\n\n<li>Augmented reality: detect AR markers<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">How would you load and display an image?<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Load<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><code>imread()<\/code> with the path to the file; it does not read URLs or bytestrings directly, so decode those with <code>imdecode()<\/code> (see colab)<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Error handling: it returns <code>None<\/code> (no exception) if the image cannot be read, so check for that<\/li>\n\n\n\n<li>Flags: can read the image in colour, grayscale, with alpha channel, &#8230; Changes how the image 
will be interpreted<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Display<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\"><code>imshow()<\/code> with a window name and the image as arguments to display it<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Has to be followed by <code>waitKey()<\/code> to keep the window open\n<ul class=\"wp-block-list\">\n<li><code>waitKey(0)<\/code> waits indefinitely for a key press<\/li>\n\n\n\n<li><code>waitKey(&gt;0)<\/code> waits up to that many milliseconds for a key press<\/li>\n\n\n\n<li><code>waitKey(1)<\/code> is used in loops for near non-blocking behaviour: the window does not freeze and still responds to input<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Finish with <code>destroyAllWindows()<\/code> to close the window properly<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">What is image thresholding and how do you use it?<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Simple image segmentation method based on pixel intensity: creates a binary (black and white) image <strong>from a grayscale image<\/strong> where all pixels above the threshold are white and all pixels below it are black\n<ul class=\"wp-block-list\">\n<li><strong>Pixel intensity<\/strong>: grayscale value of the pixel (0 to 255)<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Used for separating foreground from background, or isolating objects of interest<\/li>\n\n\n\n<li>Useful when there&#8217;s high contrast between object and background<\/li>\n\n\n\n<li><strong>Simple thresholding<\/strong>\n<ul class=\"wp-block-list\">\n<li><code>cv2.threshold(img, thresh, maxval, type)<\/code><\/li>\n\n\n\n<li>Applies the same threshold value to every pixel in the image<\/li>\n\n\n\n<li>Useful for even lighting, strong contrast between foreground and background<\/li>\n\n\n\n<li>Fails under uneven lighting (shadows, highlights)<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Adaptive thresholding<\/strong>\n<ul class=\"wp-block-list\">\n<li><code>cv2.adaptiveThreshold(src, maxValue, adaptiveMethod, thresholdType, blockSize, 
C)<\/code><\/li>\n\n\n\n<li>Each pixel gets its own threshold based on the <strong>local neighborhood<\/strong>.<\/li>\n\n\n\n<li>Useful for non-uniform lighting, documents with shadows, gradients or smudges<\/li>\n\n\n\n<li>Can introduce noise in very uniform images<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Choosing an appropriate threshold value: Otsu&#8217;s method<\/strong>\n<ul class=\"wp-block-list\">\n<li><code>ret2, th2 = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)<\/code><\/li>\n\n\n\n<li>Computes an optimal threshold that minimizes intra-class variance (the spread of intensities within the foreground and background classes)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">How would you detect edges in an image?<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Edge detection is finding boundaries or transitions in an image: areas where pixel intensity changes sharply<\/li>\n\n\n\n<li>OpenCV methods are <strong>gradient-based<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Sobel<\/strong>: finds the gradient in the x or y direction using convolution with small kernels<\/li>\n\n\n\n<li><strong>Laplacian<\/strong>: finds areas of rapid intensity change, measures the second derivative (how fast the gradient is changing)<\/li>\n\n\n\n<li><strong>Canny<\/strong>: multi-stage algorithm: Gaussian smoothing, Sobel gradients, edge thinning to 1-pixel width (non-maximum suppression), then hysteresis thresholding of weak and strong edges\n<ul class=\"wp-block-list\">\n<li>Usually preferred: gives better results and is less susceptible to noise<\/li>\n\n\n\n<li>Based on two thresholds on gradient magnitude: edges below t1 are discarded, edges between t1 and t2 are weak edges (kept only if connected to strong ones), edges above t2 are strong edges<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Steps<\/strong>: convert image to grayscale, reduce noise (Gaussian blur), apply edge detection<\/li>\n\n\n\n<li>Useful for object detection, image segmentation<\/li>\n<\/ul>\n\n\n\n<h2 
class=\"wp-block-heading\">What is image blurring and why is it useful?<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Reduces noise and fine detail in an image, typically by convolving the image with a low-pass filter kernel<\/li>\n\n\n\n<li><strong>Gaussian blur<\/strong>: applies a Gaussian kernel to the image; nearby pixels are weighted based on a normal distribution (closer pixels contribute more)<\/li>\n\n\n\n<li><strong>Median blur<\/strong>: replaces each pixel with the median value in its neighbourhood\n<ul class=\"wp-block-list\">\n<li>Especially good for salt-and-pepper noise<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Bilateral filtering<\/strong>: weighted average of the neighbouring pixels where spatially close pixels and pixels with similar intensity get higher weight, while pixels that differ too much in intensity get very little weight -&gt; keeps edges sharp\n<ul class=\"wp-block-list\">\n<li>Best at preserving edges while smoothing<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Useful as preprocessing for a lot of tasks: smooths out small details and minor variations that can interfere with larger-scale analysis<\/li>\n\n\n\n<li>How choice of kernel affects the result\n<ul class=\"wp-block-list\">\n<li>Larger kernel = better noise removal but more loss of detail<\/li>\n\n\n\n<li>Both the method and the kernel size shift the trade-off between noise reduction and loss of detail<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">What are image moments and how are they used?<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Scalar values that summarise the spatial distribution of pixel intensities in the image\n<ul class=\"wp-block-list\">\n<li>Capture area, centroid, orientation, symmetry of shapes in the image<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><code>cv2.moments(binary_image_or_contour)<\/code> returns a dictionary of moment values\n<ul class=\"wp-block-list\">\n<li><strong>Spatial moments <code>mij<\/code><\/strong>\n<ul class=\"wp-block-list\">\n<li>Area 
m00 (for a binary image with 0\/1 values, or with <code>binaryImage=True<\/code>, it&#8217;s just the number of white pixels)<\/li>\n\n\n\n<li>m10 and m01, used to compute the centroid: cx = m10 \/ m00, cy = m01 \/ m00<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Central moments <code>muij<\/code><\/strong>\n<ul class=\"wp-block-list\">\n<li>Measure the shape&#8217;s distribution relative to its centroid; they are translation invariant (do not change if the object moves in the image)<\/li>\n\n\n\n<li>Useful to compute orientation and to compare shapes regardless of location<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Normalised central moments <code>nuij<\/code><\/strong>\n<ul class=\"wp-block-list\">\n<li>Translation and scale invariant<\/li>\n\n\n\n<li>Useful to compare shapes at different sizes<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Hu moments (shape descriptors)<\/strong>\n<ul class=\"wp-block-list\">\n<li>7 values derived from normalised central moments<\/li>\n\n\n\n<li>Translation, scale and rotation invariant<\/li>\n\n\n\n<li><code>cv2.HuMoments(M)<\/code><\/li>\n\n\n\n<li>Useful for shape matching and recognition<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>If you give <code>cv2.moments()<\/code> a <strong>binary image<\/strong>, it treats all white pixels (non-zero) as a <strong>single shape\/region<\/strong>.<\/li>\n\n\n\n<li>If you give it a <strong>contour<\/strong> (like one item from <code>cv2.findContours()<\/code>), it computes the moments <strong>for that specific contour<\/strong> (i.e., one object).<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">What is template matching and how do you use it?<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Finding a small image (template) within a larger image (search area) by sliding the template across the larger image and comparing pixel values at each location.<\/li>\n\n\n\n<li><code>result = cv2.matchTemplate(image, template, method)<\/code>\n<ul class=\"wp-block-list\">\n<li><code>result<\/code> is a matrix where each pixel represents how well the template matches at that location; extract the min\/max value (min for the <code>TM_SQDIFF<\/code> methods, max for the others) 
and its location to find the best match<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Method<\/th><th>Description<\/th><th>Limitations\/notes<\/th><\/tr><\/thead><tbody><tr><td><code>cv2.TM_SQDIFF<\/code><\/td><td>Squared difference<\/td><td>Sensitive to brightness<\/td><\/tr><tr><td><code>cv2.TM_SQDIFF_NORMED<\/code><\/td><td>Normalized squared difference<\/td><td>All the normed versions are better for varying lighting<\/td><\/tr><tr><td><code>cv2.TM_CCORR<\/code><\/td><td>Cross-correlation<\/td><td>Sensitive to overall intensity<\/td><\/tr><tr><td><code>cv2.TM_CCORR_NORMED<\/code><\/td><td>Normalized cross-correlation<\/td><td>Usually best<\/td><\/tr><tr><td><code>cv2.TM_CCOEFF<\/code><\/td><td>Correlation coefficient<\/td><td>Can fail with uniform regions<\/td><\/tr><tr><td><code>cv2.TM_CCOEFF_NORMED<\/code><\/td><td>Normalized correlation coefficient<\/td><td><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Sensitive to scale, rotation and lighting changes (won&#8217;t match), usually not suited for real-world complex scenes<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">What are basic drawing operations?<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>OpenCV lets you draw directly onto images using basic geometric shapes and text<\/li>\n\n\n\n<li>Basic drawing functions\n<ul class=\"wp-block-list\">\n<li><code>cv2.line(img, pt1, pt2, color, thickness)<\/code><\/li>\n\n\n\n<li><code>cv2.rectangle(img, pt1, pt2, color, thickness)<\/code><\/li>\n\n\n\n<li><code>cv2.circle(img, center, radius, color, thickness)<\/code><\/li>\n\n\n\n<li><code>cv2.putText(img, text, org, font, fontScale, color, thickness, lineType)<\/code><\/li>\n\n\n\n<li><code>cv2.polylines(img, [pts], isClosed=True, color=(0, 255, 0), thickness=2)<\/code><\/li>\n\n\n\n<li><code>cv2.drawContours(img, contours, -1, (255, 255, 0), 2)<\/code><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Drawing modifies 
the image you pass in, so if you want to preserve the original, make a copy first: <code>img_copy = img.copy()<\/code><\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">What is histogram equalization and why is it useful?<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Contrast enhancement technique: computes the histogram of the input image (number of pixels at each intensity) and redistributes the intensities so that the histogram is more balanced, stretching out regions where pixel intensities are clumped together<\/li>\n\n\n\n<li><code>cv2.equalizeHist(gray)<\/code><\/li>\n\n\n\n<li>Can amplify noise and does not preserve local detail<\/li>\n\n\n\n<li>Contrast Limited Adaptive Histogram Equalization (CLAHE)\n<ul class=\"wp-block-list\">\n<li>Divides the image into small tiles and equalizes each tile separately, with a clip limit on contrast -&gt; limits noise amplification<\/li>\n\n\n\n<li><code>clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))<\/code><\/li>\n\n\n\n<li><code>equalized = clahe.apply(gray)<\/code><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Useful preprocessing step to make features more distinguishable in low-contrast images<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">How would you resize an image while maintaining its aspect ratio?<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>OpenCV expects you to provide the target width and height of the resized image (or scale factors <code>fx<\/code> and <code>fy<\/code>).<\/li>\n<\/ul>\n\n\n\n<p class=\"has-text-align-center wp-block-paragraph\"><code>resized = cv2.resize(img, (new_width, new_height), interpolation=cv2.INTER_AREA)<\/code><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>To preserve aspect ratio, resize by width or height: choose the target width (or height) and scale the other dimension by the same factor, target_width \/ original_width (or target_height \/ original_height)<\/li>\n<\/ul>\n\n\n\n<h1 class=\"wp-block-heading\">Junior questions<\/h1>\n\n\n\n<h2 class=\"wp-block-heading\">How would you detect lines in an image?<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Hough Transform detects straight lines in an edge-detected image by voting for 
potential lines, transforms each point in the image into a set of possible lines and finds the most consistent (i.e., voted) ones.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">But Hough uses polar coordinates, have to convert to cartesian when drawing<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>edges = cv2.Canny(gray_img, 50, 150)\nlines = cv2.HoughLines(edges, rho=1, theta=np.pi\/180, threshold=100)\nif lines is not None:\n    for line in lines:\n        rho, theta = line&#91;0]\n        a = np.cos(theta)\n        b = np.sin(theta)\n        x0 = a * rho\n        y0 = b * rho\n        # Convert polar to Cartesian line endpoints\n        x1 = int(x0 + 1000 * (-b))\n        y1 = int(y0 + 1000 * (a))\n        x2 = int(x0 - 1000 * (-b))\n        y2 = int(y0 - 1000 * (a))\n        cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 2)<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">How would you detect circles in an image?<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Use the Hough circle transform: extension of the hough transform used to detect lines, it detects circles by searching for groups of pixels that form circular shapes based on circle equation.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Detect circles\ncircles = cv2.HoughCircles(\n    img,\n    cv2.HOUGH_GRADIENT,\n    dp=1.2,\n    minDist=30,\n    param1=100,\n    param2=30,\n    minRadius=10,\n    maxRadius=100\n)\n\n# Draw the detected circles\nif circles is not None:\n    circles = np.uint16(np.around(circles))\n    for (x, y, r) in circles&#91;0, :]:\n        cv2.circle(img, (x, y), r, (0, 255, 0), 2)      # Outer circle\n        cv2.circle(img, (x, y), 2, (0, 0, 255), 3)      # Center point<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>minDist is minimum distance between detected centers, param1 is upper threshold for Canny edge detector, param2 is threshold for center detection (lower = more circles)<\/li>\n\n\n\n<li>Sensitive to noise, lighting and partial occlusions, doesn&#8217;t 
work for ellipses<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Can you explain what a kernel is in the context of image convolution?<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Small grid of numbers (typically 3\u00d73, 5\u00d75, or 7\u00d77) that <strong>slides over the image<\/strong> and performs a <strong>convolution operation<\/strong> at each pixel location.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The kernel determines:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>What kind of transformation is applied (e.g., blur, detect edges)<\/li>\n\n\n\n<li>How each pixel&#8217;s value is changed based on its neighborhood<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Convolution: multiply each value in the kernel with the corresponding pixel in the image region, <strong>sum the result<\/strong>, and assign it to the output pixel (repeated for each pixel in the image)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">How would you rotate an image by a specific angle?<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Compute the rotation matrix and apply the rotation matrix using affine transformation<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>cv2.getRotationMatrix2D(center, angle, scale=1.0)<\/code>\n<ul class=\"wp-block-list\">\n<li>center = rotation point, usually image center<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><code>rotated = cv2.warpAffine(img, M, (w, h))<\/code>\n<ul class=\"wp-block-list\">\n<li><code>(w, h)<\/code> sets the output size (same as original in this case)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">If you want to rotate by multiples of 90deg just use <code>cv2.rotate(src, cv2.ROTATE_90_CLOCKWISE)<\/code><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What is the purpose of cv2.inRange() function and how is it commonly used?<\/h2>\n\n\n\n<p class=\"has-text-align-center wp-block-paragraph\"><code>mask = cv2.inRange(image, lower_bound, upper_bound)<\/code><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Compares each pixel 
in the input image to a lower and upper bound (inclusive, per channel) and returns a binary mask:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>255<\/code> (white) where the pixel falls within the range<\/li>\n\n\n\n<li><code>0<\/code> (black) where it does not<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Commonly used to extract objects of a certain colour (usually after converting to HSV), for region segmentation, &#8230;<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">Image processing questions<\/h1>\n\n\n\n<h2 class=\"wp-block-heading\">What is image segmentation and what is it used for?<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Divides an image into distinct regions based on certain criteria (colour, intensity, &#8230;) -&gt; makes it easier to understand and analyse for detection, classification, measurement<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Types of segmentation<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Thresholding<\/strong>: separates pixels based on intensity\n<ul class=\"wp-block-list\">\n<li><code>cv2.threshold(gray_img, 127, 255, cv2.THRESH_BINARY)<\/code><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Colour-based segmentation<\/strong>: uses colour ranges to extract regions\n<ul class=\"wp-block-list\">\n<li><code>cv2.inRange(hsv_image, lower_bound, upper_bound)<\/code><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Contour-based segmentation<\/strong>: finds object outlines using <code>cv2.findContours()<\/code><\/li>\n\n\n\n<li><strong>Watershed algorithm<\/strong>: separates touching or overlapping objects\n<ul class=\"wp-block-list\">\n<li>Treats the image like a topographic surface: pixel intensity is elevation, bright areas are peaks and dark areas are valleys<\/li>\n\n\n\n<li>&#8220;Floods&#8221; the image from specific markers; boundaries are built where different regions meet<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>GrabCut<\/strong> (interactive segmentation): foreground\/background segmentation with user input\n<ul class=\"wp-block-list\">\n<li><code>cv2.grabCut(image, mask, rect, 
bgModel, fgModel, iterCount, mode)<\/code><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Can you describe the process of histogram analysis in image processing?<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Graphical representation of the distribution of pixel intensities in an image, can compute separate histogram for each channel for RGB images.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>What It Tells You<\/th><th>Interpretation Example<\/th><\/tr><\/thead><tbody><tr><td><strong>Brightness<\/strong><\/td><td>Histogram shifted left = darker image<\/td><\/tr><tr><td><strong>Contrast<\/strong><\/td><td>Narrow histogram = low contrast<\/td><\/tr><tr><td><strong>Dynamic range<\/strong><\/td><td>Full-width histogram = good range<\/td><\/tr><tr><td><strong>Dominant tones<\/strong><\/td><td>Peaks at certain intensity values<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use cases\n<ul class=\"wp-block-list\">\n<li>Image enhancement: histogram equalisation modifies global or local contrast<\/li>\n\n\n\n<li>Thresholding: Otsu&#8217;s method uses histogram to find optimal threshold<\/li>\n\n\n\n<li>Image comparison: compare colour histograms to identify similar images<\/li>\n\n\n\n<li>Colour filtering: identify dominant colour ranges to create masks or segment specific regions<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Colour histogram example<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>channels = ('b', 'g', 'r')\nfor i, col in enumerate(channels):\n    hist = cv2.calcHist(&#91;img], &#91;i], None, &#91;256], &#91;0, 256])\n    plt.plot(hist, color=col)<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Put [0] instead of i for a grayscale image<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What is feature matching and how is it performed?<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Identifying corresponding points (features) between two images, relies on finding 
distinctive, repeatable points and describing them using feature descriptors; the descriptors are then compared across images to find matches<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Method<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Detect keypoints: identify interesting points in the image (corners, blobs)<\/li>\n\n\n\n<li>Compute descriptors: describe the local neighbourhood around each keypoint<\/li>\n\n\n\n<li>Match descriptors: compare descriptors between two images to find matching points<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">OpenCV feature detectors and descriptors<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">ORB is rotation invariant<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>orb = cv2.ORB_create()\nkp1, des1 = orb.detectAndCompute(img1, None)\nkp2, des2 = orb.detectAndCompute(img2, None)<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">Matching techniques<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Brute-force matcher <code>BFMatcher<\/code> compares every descriptor in one image to every descriptor in another<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)  # Hamming distance for ORB's binary descriptors\nmatches = bf.match(des1, des2)\nmatches = sorted(matches, key=lambda x: x.distance)  # Best (smallest distance) first\n\n# Draw the 50 best matches; flags=2 skips unmatched keypoints\nmatched_img = cv2.drawMatches(img1, kp1, img2, kp2, matches&#91;:50], None, flags=2)\ncv2.imshow(\"Matches\", matched_img)\ncv2.waitKey(0)<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Useful for 3D reconstruction: match features across views for triangulation<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What is the role of image pyramids?<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Image pyramid<\/strong>: a collection of images derived from a single source image, where\n<ul class=\"wp-block-list\">\n<li>Each level in the pyramid is a lower-resolution version of the previous one<\/li>\n\n\n\n<li>Resolution typically reduced by a factor of 2 at each level<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Gaussian 
pyramid<\/strong>: successively blurred and downsampled versions (resolution halved at each level)\n<ul class=\"wp-block-list\">\n<li><code>lower_res = cv2.pyrDown(image)<\/code> \u2192 downscale image by half<\/li>\n\n\n\n<li><code>higher_res = cv2.pyrUp(lower_res)<\/code> \u2192 upscale image by 2\u00d7 (the detail lost in downsampling is not recovered, so it is not the same as the original)<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Laplacian pyramid:<\/strong> stores the difference between levels of a Gaussian pyramid; captures the detail lost during downsampling<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>gaussian_down = cv2.pyrDown(image)\n# Pass dstsize so the upsampled level matches the original even for odd dimensions\ngaussian_up = cv2.pyrUp(gaussian_down, dstsize=(image.shape&#91;1], image.shape&#91;0]))\n# Note: cv2.subtract saturates at 0 for uint8 input\nlaplacian = cv2.subtract(image, gaussian_up)<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Can use them when trying to match something at different scales (build a pyramid of the template and try to match it at different resolutions)<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">Algorithms questions<\/h1>\n\n\n\n<h2 class=\"wp-block-heading\">Explain the concept of camera calibration and how it&#8217;s performed<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Camera calibration estimates the internal characteristics of a camera (intrinsic parameters: focal length, optical center, distortion coefficients) and how it&#8217;s positioned in space (extrinsic parameters: rotation and translation relative to the scene). It is used to remove lens distortion and to relate 2D image points to 3D real-world coordinates (reconstruct 3D scenes from multiple views, improve accuracy in 3D scanning).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Process<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Capture multiple images of a chessboard from different angles and distances<\/li>\n\n\n\n<li>Detect the chessboard corners with <code>cv2.findChessboardCorners()<\/code><\/li>\n\n\n\n<li>Prepare object points (known 3D coordinates) and image points (detected 2D corners in each image)<\/li>\n\n\n\n<li>Calibrate the camera with\n<ul class=\"wp-block-list\">\n<li> <code>ret, camera_matrix, dist_coeffs, rvecs, tvecs 
= cv2.calibrateCamera(objpoints, imgpoints, image_size, None, None)<\/code><\/li>\n\n\n\n<li><code>camera_matrix<\/code>: contains focal lengths and optical center<\/li>\n\n\n\n<li><code>dist_coeffs<\/code>: distortion coefficients (radial &amp; tangential)<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Undistort future images\n<ul class=\"wp-block-list\">\n<li><code>undistorted = cv2.undistort(img, camera_matrix, dist_coeffs)<\/code><\/li>\n\n\n\n<li>Removes lens distortions such as barrel, pincushion and tangential distortion; straight lines in the scene appear straight again after undistortion<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\">Explain the concept of non-maximum suppression in the context of object detection<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Non-Maximum Suppression (NMS) is a post-processing step in object detection. Object detectors often detect the same object multiple times, outputting overlapping bounding boxes with different confidence scores -&gt; NMS keeps only the box with the highest confidence score and removes the others that overlap it by more than an IoU threshold. 
It eliminates duplicate detection of the same object.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><code>indices = cv2.dnn.NMSBoxes(boxes, confidences, score_thresh, nms_thresh)<\/code><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">How would you approach the task of 3D reconstruction from multiple 2D images using OpenCV?<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Goal<\/strong>: estimate 3D coordinates of points in the real world using their 2D projections in two or more images.<\/li>\n\n\n\n<li><strong>Key concept: triangulation<\/strong>: if you observe a point from at least two different angles, you can <strong>triangulate<\/strong> its position in 3D space<\/li>\n<\/ul>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Camera calibration\n<ul class=\"wp-block-list\">\n<li>To get intrinsic parameters and distortion coefficients<\/li>\n\n\n\n<li><code>ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(...)<\/code><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Detect and match features\n<ul class=\"wp-block-list\">\n<li>Use feature detectors (SIFT, ORB, etc.) 
to find and match keypoints between images.\n<ul class=\"wp-block-list\">\n<li><code>kp1, des1 = sift.detectAndCompute(img1, None)<\/code><\/li>\n\n\n\n<li><code>kp2, des2 = sift.detectAndCompute(img2, None)<\/code><\/li>\n\n\n\n<li><code>matches = bf.match(des1, des2)<\/code><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Estimate fundamental matrix\n<ul class=\"wp-block-list\">\n<li>Describes geometric relationship between the images<\/li>\n\n\n\n<li><code>F, mask = cv2.findFundamentalMat(pts1, pts2, method=cv2.FM_RANSAC)<\/code><\/li>\n\n\n\n<li>Or if you know the camera&#8217;s intrinsics\n<ul class=\"wp-block-list\">\n<li><code>E, _ = cv2.findEssentialMat(pts1, pts2, K)<\/code><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Recover camera pose\n<ul class=\"wp-block-list\">\n<li><code>_, R, t, mask = cv2.recoverPose(E, pts1, pts2, K)<\/code><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Triangulate points\n<ul class=\"wp-block-list\">\n<li>With the relative camera poses and matched points, you can reconstruct the 3D points.<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n\n\n\n<pre class=\"wp-block-code\"><code>proj1 = K @ np.hstack((np.eye(3), np.zeros((3,1))))      # First camera matrix\nproj2 = K @ np.hstack((R, t))                            # Second camera matrix\n\npoints_4d = cv2.triangulatePoints(proj1, proj2, pts1.T, pts2.T)\npoints_3d = points_4d&#91;:3] \/ points_4d&#91;3]  # Convert from homogeneous to 3D<\/code><\/pre>\n\n\n\n<h1 class=\"wp-block-heading\">Advanced questions<\/h1>\n\n\n\n<h2 class=\"wp-block-heading\">How would you approach improving the performance of an existing OpenCV application that is running slowly?<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Profile, don&#8217;t guess (<code>timeit<\/code>)<\/li>\n\n\n\n<li>Reduce image size before heavy computation if possible\n<ul class=\"wp-block-list\">\n<li><code>small = cv2.resize(frame, (width \/\/ 2, height \/\/ 2))<\/code><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Avoid recalculating things unnecessarily\n<ul 
class=\"wp-block-list\">\n<li>Cache constant results (kernels), precompute masks or lookup tables, reuse results across frames when possible<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Use vectorized or built-in cv2 functions (instead of loops)<\/li>\n\n\n\n<li>Apply region of interest (ROI): if you&#8217;re only interested in part of the image (like a face or license plate), crop it and process only that region\n<ul class=\"wp-block-list\">\n<li><code>roi = frame[y:y+h, x:x+w]<\/code><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Use efficient algorithms, swap out slow algos for faster ones<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Task<\/th><th>Slow<\/th><th>Faster Alternative<\/th><\/tr><\/thead><tbody><tr><td>Feature detection<\/td><td>SIFT\/SURF<\/td><td>ORB or AKAZE<\/td><\/tr><tr><td>Background subtraction<\/td><td>MOG2<\/td><td>KNN or custom thresholding<\/td><\/tr><tr><td>Dense optical flow<\/td><td>Farneback<\/td><td>Lucas-Kanade (sparse)<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use OpenCV with GPU (if available)\n<ul class=\"wp-block-list\">\n<li>To look into<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Batch or Approximate Expensive Work\n<ul class=\"wp-block-list\">\n<li>Don&#8217;t run detection every frame \u2014 use every Nth frame<\/li>\n\n\n\n<li>Approximate with faster methods if precision isn\u2019t critical<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Use Efficient File I\/O\n<ul class=\"wp-block-list\">\n<li>Load and save images using OpenCV (not PIL or other slower libs)<\/li>\n\n\n\n<li>Minimize disk I\/O in loops<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Suppose you encounter a situation where the image quality is poor due to low lighting. 
What techniques would you use to enhance the image for better analysis?<\/h2>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Histogram equalisation or CLAHE (Contrast Limited Adaptive Histogram Equalization)<\/strong>: enhance contrast by spreading out pixel intensities, CLAHE if uneven lighting<\/li>\n\n\n\n<li><strong>Gamma Correction<\/strong>: brightens image non-linearly. Useful when image is very dark but not noisy<\/li>\n\n\n\n<li><strong>Denoising<\/strong>: low light often increases sensor noise\n<ul class=\"wp-block-list\">\n<li><code>cv2.fastNlMeansDenoisingColored(image, None, 10, 10, 7, 21)<\/code><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Bilateral Filtering<\/strong>: optional, but helps in smoothing without losing details:<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\">If you find that your image segmentation results are not satisfactory, what strategies would you use to troubleshoot and refine your approach?<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Look at your results, what is not satisfactory? Look at your input data, what could improve its clarity? Look at your different stages, when does it start looking bad?<\/li>\n\n\n\n<li>Improve preprocessing: denoising, histogram eq, colour space conversion<\/li>\n\n\n\n<li>Fine tuning parameters and thresholds<\/li>\n\n\n\n<li>Morphological operations<\/li>\n\n\n\n<li>Try different segmentation techniques<\/li>\n\n\n\n<li>Consider switching to DL<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">How would you approach integrating OpenCV with other machine learning frameworks for a comprehensive project?<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>OpenCV + PyTorch<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Preprocessing:<\/strong> Use OpenCV to load and preprocess images, convert to PyTorch tensors.<\/li>\n\n\n\n<li><strong>Postprocessing:<\/strong> Use OpenCV to display model output (e.g. 
draw boxes for object detection).<\/li>\n\n\n\n<li><strong>Example<\/strong>: <strong>Real-time object detection on webcam using OpenCV + PyTorch:<\/strong>\n<ul class=\"wp-block-list\">\n<li>Use OpenCV to access webcam and preprocess frames.<\/li>\n\n\n\n<li>Run inference using a PyTorch model.<\/li>\n\n\n\n<li>Use OpenCV to draw results (e.g., bounding boxes, labels).<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Preprocessing with OpenCV vs torchvision<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Reasons to Use OpenCV for Preprocessing<\/strong>\n<ul class=\"wp-block-list\">\n<li><strong>Performance (Especially on CPU)<\/strong>\n<ul class=\"wp-block-list\">\n<li>OpenCV is implemented in C\/C++ under the hood and is <strong>highly optimized<\/strong> for image I\/O and manipulation on CPU.<\/li>\n\n\n\n<li>For large-scale or real-time applications (like video frames), OpenCV tends to be <strong>faster than torchvision.transforms<\/strong>, especially for tasks like resizing, blurring, or color space conversion.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>More Versatile and Feature-Rich<\/strong>\n<ul class=\"wp-block-list\">\n<li>OpenCV supports a <strong>broader range of image processing operations<\/strong>, you can use it for classic vision tasks like contour detection<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Gives more low level control over images<\/strong>\n<ul class=\"wp-block-list\">\n<li>You can fine-tune resizing (e.g. 
interpolation type), manually handle color spaces, or crop with pixel-level precision<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>When you might prefer torchvision<\/strong>\n<ul class=\"wp-block-list\">\n<li>Some transforms can be GPU accelerated, useful for data augmentation<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h1 class=\"wp-block-heading\">Structure from Motion (SfM)<\/h1>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Goal: recover 3D geometry from 2D images<\/li>\n\n\n\n<li>A common solution is triangulation from corresponding image points in multiple views. An important prerequisite is knowing the camera calibration and position (the projection matrix)<\/li>\n\n\n\n<li>SfM algorithms allow simultaneous computation of projection matrices and 3D points using corresponding points in each view\n<ul class=\"wp-block-list\">\n<li>Given the projections [math]u_{ij}[\/math] of [math]n[\/math] 3D points in [math]m[\/math] images, with [math]i \\in \\{1, \\dots, m\\}[\/math] and [math]j \\in \\{1, \\dots, n\\}[\/math], the goal is to find both projection matrices [math]P_1, \\dots, P_m[\/math] and a consistent 3D structure [math]X_1, \\dots, X_n[\/math].<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Process<\/h2>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Feature extraction\n<ul class=\"wp-block-list\">\n<li>Detect a number of key points in each image, at least 8 (the minimum for the eight-point algorithm), usually corners<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Feature matching\n<ul class=\"wp-block-list\">\n<li>Match each key point to its equivalent in each point of view<\/li>\n\n\n\n<li>Template matching, optical flow, &#8230;<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>3D reconstruction\n<ul class=\"wp-block-list\">\n<li>When you look at a 3D scene with two cameras from two different views, each 3D point projects to a 2D point in each image. 
These 2D points lie along known epipolar lines<\/li>\n\n\n\n<li>All such corresponding points must satisfy an equation involving the fundamental matrix [math]F[\/math]:\n<ul class=\"wp-block-list\">\n<li>A 3&#215;3 matrix that encodes the epipolar geometry between two uncalibrated cameras<\/li>\n\n\n\n<li>If [math]x_1[\/math] and [math]x_2[\/math] are corresponding points (in homogeneous coordinates) in image 1 and image 2, then: [math]x_2^T \\cdot F \\cdot x_1 = 0[\/math]\n<ul class=\"wp-block-list\">\n<li>the point in image 2 lies on the <strong>epipolar line<\/strong> computed from the point in image 1<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>So the steps for 3D reconstruction are:\n<ul class=\"wp-block-list\">\n<li>Compute the fundamental matrix F<\/li>\n\n\n\n<li>Decompose F into the projection matrices of cameras 1 and 2 (defined only up to a projective ambiguity when the cameras are uncalibrated)<\/li>\n\n\n\n<li>Triangulate points using the 2D points and camera matrices<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Bundle adjustment\n<ul class=\"wp-block-list\">\n<li>Minimise a cost function, typically a weighted sum of squared reprojection errors between the projections of the computed 3D points and the original image points in every view.<\/li>\n\n\n\n<li>Filter out inconsistent 3D points by detecting their reprojection errors as outliers<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">Public repos: OpenSfM and COLMAP<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">Multi-View Stereo (MVS)<\/h1>\n\n\n\n<p class=\"wp-block-paragraph\">TODO<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">Performance improvements<\/h1>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Avoid using loops in Python as much as possible, especially double\/triple loops etc. They are inherently slow.<\/li>\n\n\n\n<li>Vectorise the algorithm\/code to the maximum extent possible, because NumPy and OpenCV are optimized for vector operations.<\/li>\n\n\n\n<li>Exploit cache locality: access array elements in memory order (row by row for default C-ordered NumPy arrays).<\/li>\n\n\n\n<li>Never make copies of an array unless it is necessary. Try to use views instead. 
Array copying is a costly operation.<\/li>\n\n\n\n<li>Python&#8217;s <code>map<\/code> function and list comprehensions are usually faster than basic for loops<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code># Loop version\nnewlist = &#91;]\nfor word in oldlist:\n    newlist.append(word.upper())\n\n# Map version instead (map returns a lazy iterator in Python 3, so wrap it in list())\nnewlist = list(map(str.upper, oldlist))\n\n# List comprehension version instead\nnewlist = &#91;s.upper() for s in oldlist]<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Aggregate data before calling a function: each function call carries overhead, so pass the whole collection to a single call instead of calling the function once per element<\/li>\n<\/ul>\n\n\n\n<pre class=\"wp-block-code\"><code>x = 0\n\n# Slow version: one function call per element\ndef doit1(i):\n    global x\n    x = x + i\n\nvalues = range(100000)\nfor i in values:\n    doit1(i)\n\n# Faster version: a single call that loops over the whole collection\nx = 0\n\ndef doit2(values):\n    global x\n    for i in values:\n        x = x + i\n\nvalues = range(100000)\ndoit2(values)<\/code><\/pre>\n\n\n\n<h1 class=\"wp-block-heading\">OpenCV vs Pillow vs scikit-image<\/h1>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use <strong>Pillow<\/strong> if you&#8217;re doing lightweight, clean image editing (e.g. web apps, thumbnails).<\/li>\n\n\n\n<li>Use <strong>OpenCV<\/strong> for performance-heavy or vision-heavy work (e.g. 
object detection, tracking, real-time processing).<\/li>\n\n\n\n<li>Use <strong>skimage<\/strong> for research, education, and NumPy-integrated scientific image processing.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">OpenCV and skimage both represent images as NumPy ndarrays. Pillow uses its own <code>Image<\/code> class, so images must be converted (e.g. with <code>np.asarray(img)<\/code>) before NumPy operations can be applied.<\/p>\n\n\n\n<h1 class=\"wp-block-heading\">Computer Vision Data Augmentations<\/h1>\n\n\n\n<h2 class=\"wp-block-heading\">Geometric data augmentations<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Helps the model become invariant to position and orientation<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Augmentation<\/th><th>Description<\/th><\/tr><\/thead><tbody><tr><td><strong>Flip<\/strong><\/td><td>Horizontal\/vertical mirroring<\/td><\/tr><tr><td><strong>Rotation<\/strong><\/td><td>Rotate image by small angles (e.g. \u00b115\u00b0)<\/td><\/tr><tr><td><strong>Scaling<\/strong><\/td><td>Resize image, optionally keeping aspect ratio<\/td><\/tr><tr><td><strong>Translation<\/strong><\/td><td>Shift image in x and\/or y direction<\/td><\/tr><tr><td><strong>Cropping<\/strong><\/td><td>Random or center crops (useful for zoom or context variation)<\/td><\/tr><tr><td><strong>Shearing<\/strong><\/td><td>Slant the image along an axis<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Color &amp; Lighting Adjustments<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Useful for natural images where lighting varies<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Augmentation<\/th><th>Description<\/th><\/tr><\/thead><tbody><tr><td><strong>Brightness<\/strong><\/td><td>Lighten or darken image<\/td><\/tr><tr><td><strong>Contrast<\/strong><\/td><td>Enhance or reduce contrast<\/td><\/tr><tr><td><strong>Saturation<\/strong><\/td><td>Modify color intensity<\/td><\/tr><tr><td><strong>Hue adjustment<\/strong><\/td><td>Shift color 
tones<\/td><\/tr><tr><td><strong>Color jittering<\/strong><\/td><td>Random combo of the above<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Noise and blur<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Helps with robustness to camera quality and real-world conditions<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Augmentation<\/th><th>Description<\/th><\/tr><\/thead><tbody><tr><td><strong>Gaussian noise<\/strong><\/td><td>Add small pixel-wise noise<\/td><\/tr><tr><td><strong>Salt and pepper<\/strong><\/td><td>Random black\/white pixels<\/td><\/tr><tr><td><strong>Gaussian blur<\/strong><\/td><td>Slight blurring to simulate focus loss<\/td><\/tr><tr><td><strong>Motion blur<\/strong><\/td><td>Mimic camera or object motion<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Occlusion and cutout<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Teaches the model not to depend on any one region of the image.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Augmentation<\/th><th>Description<\/th><\/tr><\/thead><tbody><tr><td><strong>Cutout \/ Random Erasing<\/strong><\/td><td>Black out a random square patch<\/td><\/tr><tr><td><strong>Random occlusion<\/strong><\/td><td>Simulate objects partially hidden<\/td><\/tr><tr><td><strong>Grid mask<\/strong><\/td><td>Overlay mask with missing patches<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Synthetic data<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Great for robustness and data diversity.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Augmentation<\/th><th>Description<\/th><\/tr><\/thead><tbody><tr><td><strong>Mixup<\/strong><\/td><td>Combine two images and labels by blending<\/td><\/tr><tr><td><strong>CutMix<\/strong><\/td><td>Paste a patch from one image into another<\/td><\/tr><tr><td><strong>Style 
transfer<\/strong><\/td><td>Alter texture while keeping structure<\/td><\/tr><tr><td><strong>GAN-based augmentation<\/strong><\/td><td>Generate synthetic images from real samples<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Random notes Basic Questions What is OpenCV and why is it used? How would you load and display an image Load imread()with path to the file, URL or bytestring (see colab) Display imshow()with image as argument to display it What is image thresholding and how do you use it? How would you detect edges in [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"ub_ctt_via":"","site-container-style":"default","site-container-layout":"default","site-sidebar-layout":"default","disable-article-header":"default","disable-site-header":"default","disable-site-footer":"default","disable-content-area-spacing":"default","footnotes":""},"class_list":["post-1706","page","type-page","status-publish","hentry"],"featured_image_src":null,"jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/cammonte.com\/index.php\/wp-json\/wp\/v2\/pages\/1706","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cammonte.com\/index.php\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/cammonte.com\/index.php\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/cammonte.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/cammonte.com\/index.php\/wp-json\/wp\/v2\/comments?post=1706"}],"version-history":[{"count":75,"href":"https:\/\/cammonte.com\/index.php\/wp-json\/wp\/v2\/pages\/1706\/revisions"}],"predecessor-version":[{"id":1789,"href":"https:\/\/cammonte.com\/index.php\/wp-json\/wp\/v2\/pages\/1706\/revisions\/1789"}],"wp:attachment":[{"href":"https:\/\/cammonte.com\/index.php\/wp-json\
/wp\/v2\/media?parent=1706"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}