As part of my robotics research at Carnegie Mellon, I developed a computer vision pipeline to quickly and reliably detect crop leaves in real-time from images taken by our group's mobile agricultural robot. These computer vision algorithms take in stereo images of sorghum (or corn) and outputs a simplified, three dimensional model of the stalks and leaves within the scene. A Multisense S7 stereo camera by Carnegie Robotics is rigidly mounted to the side of the vertical mast of the mobile robot base and provides dense 3D range data of the environment. The Multisense S7 begins by capturing simultaneous images from right and left cameras and uses an onboard FPGA to perform rapid correspondence matching between images to calculate distance values for points in the scene. Through this process, the Multisense can output depth maps and colored point clouds at up to 15 fps. An example of a set of stereo images from the Multisense as well as the resulting point cloud can be seen below.
The leaf detection pipeline begins by creating a voxelized representation of the raw point cloud. This method significantly reduces the number of points in the cloud by instead storing only the centroids of points within small discretized 3D volume elements. This voxel grid is then converted into a k-d tree using the Point Cloud Library (PCL), a form of binary search tree that dramatically speeds up searching for points throughout space. A clustering algorithm is then employed, using the k-d tree for added efficiency, to detect and store large groups of points using a 2 cm distance threshold. Only clusters consisting of a certain number of points, in this case 100 points, are saved for later processing.
The next stage of the pipeline works to identify stalks within these large clusters through the use of an iterative linear random sample consensus (RANSAC) algorithm. Ultimately, all points belonging to stalks will be segmented out of the cloud in order to simplify the search for leaves. The linear ransac algorithm attempts to find one or more 3D line(s) that best fit the data within each cluster while ignoring potential outliers. This linear RANSAC algorithm for stalk segmentation is discussed in much more detail in the section below. For each cluster, all of the points that are not determined as inliers to a detected stalk are retained while the stalk inlier points are segmented out.
At this point in the pipeline, the clusters now represent a significantly downsampled version of the original point cloud where all stalk points as well as noisey outlier points have been removed. Thus, the existence of leaves should now be much easier to detect. Next, another RANSAC style algorithm is used that attempts to fit parabolic leaf models to the remaining points. A second order polynomial functions as an accurate model for sorghum leaves given the natural curvature. The details of this parabolic RANSAC leaf detection algorithm are discussed in much more detail in the section below. The inliers of a successful parabolic leaf model are considered to be points that belong to a leaf in the scene. Furthermore, the inliers projected onto the parabolic model form a clean parabola in 3D space that cuts through the middle of each leaf to form a sort of virtual leaf midrib. The inliers from both the linear stalk detection and parabolic leaf detection algorithms can be plotted together to form a clean 3D model of all the sorghum stalks and leaves from the original scene, as shown in the pipeline figure above.
Example of Detected Stalks Using Linear RANSAC
As stated in the previous section, stalk segmentation is performed using a RANSAC algorithm that attempts to fit a line to each cluster by maximizing the number of points within a certain threshold distance while ignoring outliers in the data. The random nature of a RANSAC algorithm makes it non-deterministic, however, it is often run many thousands of times to ensure an accurate model is found. The steps of the linear RANSAC stalk detection algorithm are detailed below.
(ax + by + c = 0)
are derived and stored.The value of key parameters used throughout the algorithm can be seen below:
Example of Detected Leaves Points (light green) Overlaid on Raw Point Cloud
The Parabolic RANSAC leaf detection algorithm is similar in principle to the linear RANSAC method described above. In essence, the algorithm attempts to find the set of model parameters, in this case for a parabola, that maximize the number of inliers in a cluster of points. However, the key difference with this algorithm is in the complexity of solving for the model parameters given a random set of points. Intuitively, it may seem as though three randomly selected points would define a unique parabola in three dimensional space much like three points can define a parabola in two dimensional space, however, this is far from the case. In two dimensions, three points of the form (x,y) only uniquely define a parabola given the constraint that the parabola is a function of y in terms of x, and thus has an axis of symmetry parallel with the y-axis. Without this constraint, a set of three points can be fit by an infinite number of parabolas in most cases. Therefore, in order for a set of three points to uniquely represent a single parabola, the plane on which the parabola will lie as well as a vector specifying the direction of the axis of symmetry must be given.
The general shape and orientation of sorghum leaves can be used to help form these constraints. Primarily, it can be noted that the midrib of almost all sorghum leaves lies on a plane that is perpendicular to the ground plane. In other words, from a top down view of the leaf, the midrib does not appear to curve at all. Furthermore, it is almost always the case that the point of inflection along a leaf’s midrib is also the highest point on the leaf (or the point furthest from the ground plane). This would mean that a parabolic representation of a leaf would have an axis of symmetry very close to the vertical vector. For these two reasons, all parabolic leaf models are constrained to planes that are perpendicular to the ground plane while also having axes of symmetry equal to the vertical vector.
The value of key parameters used throughout the algorithm can be seen below:
After running this parabolic RANSAC algorithm to completion on all of the clusters, a full model of all stalks and leaves in the original scene should now be extracted. This is the last stage of the leaf detection pipeline, however, the parabolic leaf models will now be used to detect optimal grasp points for the robotic manipulator.