The first frame is downsampled with a leaf size of 1 cm to make the following steps faster.
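The downsampling step can be illustrated with a minimal voxel-grid sketch in plain Python. It is not the PCL filter itself: the function name voxel_downsample and the choice of replacing each voxel by the mean of its points are assumptions made for illustration.

```python
from collections import defaultdict
import math

def voxel_downsample(points, leaf=0.01):
    """Replace all points that fall into the same leaf-sized voxel by their mean."""
    buckets = defaultdict(list)
    for p in points:
        # Integer voxel index along each axis (leaf is in metres, 0.01 = 1 cm).
        key = tuple(math.floor(c / leaf) for c in p)
        buckets[key].append(p)
    # One averaged point per occupied voxel.
    return [tuple(sum(axis) / len(pts) for axis in zip(*pts))
            for pts in buckets.values()]
```

With a 1 cm leaf, two points a few millimetres apart collapse into one point, while points in different voxels are kept separately.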
All planes are detected using normal-based plane detection.
The plane whose centroid is nearest to the camera is taken as the desk.
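The desk selection rule can be sketched as follows. This assumes that each detected plane is available as a list of (x, y, z) points and that the camera sits at the origin of the sensor frame; the names centroid and pick_desk_plane are hypothetical.

```python
import math

def centroid(points):
    """Arithmetic mean of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def pick_desk_plane(planes, camera=(0.0, 0.0, 0.0)):
    """Return the plane whose centroid is closest to the camera position."""
    return min(planes, key=lambda pts: math.dist(centroid(pts), camera))
```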
The centroid and normal of the desk plane are calculated, and a transformation matrix is computed that places the coordinate origin (0, 0, 0) at the centroid of the desk, aligns the x-y plane with the desk surface, and points the +z axis perpendicular to the desk, toward the side above it. This transformation is applied to every subsequent frame.
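One way to build such a transformation is sketched below. It assumes that "the side above the desk" is the side facing the camera, that the camera is at the origin of the sensor frame, and it uses the Rodrigues rotation formula; the original implementation may differ. The names rotation_to_z and desk_transform are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotation_to_z(normal):
    """Rotation matrix that turns the unit vector `normal` into +z (Rodrigues form)."""
    c = dot(normal, (0.0, 0.0, 1.0))
    if c < -1.0 + 1e-9:
        # Normal points along -z: rotate 180 degrees about the x axis.
        return [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
    v = cross(normal, (0.0, 0.0, 1.0))
    K = [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    # R = I + K + K^2 / (1 + c)
    return [[(1.0 if i == j else 0.0) + K[i][j] + K2[i][j] / (1.0 + c)
             for j in range(3)] for i in range(3)]

def desk_transform(desk_centroid, normal, camera=(0.0, 0.0, 0.0)):
    """Return a function mapping sensor coordinates to desk coordinates."""
    length = math.sqrt(dot(normal, normal))
    n = tuple(x / length for x in normal)
    # Assumption: flip the normal so that it points toward the camera.
    to_camera = tuple(c - p for c, p in zip(camera, desk_centroid))
    if dot(n, to_camera) < 0:
        n = tuple(-x for x in n)
    R = rotation_to_z(n)

    def transform(p):
        d = [p[i] - desk_centroid[i] for i in range(3)]  # move centroid to origin
        return tuple(dot(R[i], d) for i in range(3))     # then rotate

    return transform
```

The returned function can be computed once from the first frame and then reused for every later frame.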
The region near the desk is segmented as the region of interest in every subsequent frame. It is selected by computing the extents of the desk (x-min, x-max, y-min, y-max, z-min, z-max) and then keeping the volume (x-min, x-max), (y-min, y-max), (z-min, z-max + 0.25). In other words, the 0.25 m of space above the desk is included in the segment.
However, when the desk stands against a wall, part of the wall sometimes falls inside this region because of measurement errors. In that case the region can be reduced to (x-min, x-max), (y-min, y-max - 0.02), (z-min, z-max + 0.25) to avoid clustering errors.
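The region selection described above amounts to an axis-aligned box filter in the desk coordinate frame. A minimal sketch, with hypothetical function names and with the wall margin exposed as a parameter:

```python
def desk_bounds(desk_points, height=0.25, wall_margin=0.0):
    """Box around the desk, extended `height` metres upward.

    wall_margin shrinks y-max (e.g. 0.02) when the desk touches a wall.
    """
    xs, ys, zs = zip(*desk_points)
    return ((min(xs), max(xs)),
            (min(ys), max(ys) - wall_margin),
            (min(zs), max(zs) + height))

def crop(cloud, bounds):
    """Keep only the points that lie inside the box on all three axes."""
    return [p for p in cloud
            if all(lo <= c <= hi for c, (lo, hi) in zip(p, bounds))]
```

For example, crop(frame, desk_bounds(desk, wall_margin=0.02)) drops points higher than 0.25 m above the desk as well as points within 2 cm of the y-max edge.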
An octree is a hierarchical tree data structure for representing point cloud data: it divides 3D space into voxels and uses the voxels that contain points to represent the cloud.
See also: the Point Cloud Library octree module documentation and the Wikipedia article on octrees.
An octree of the segmented region is built for each of the first 11 visible frames. The leaf size is set to 5 cm.
In the Point Cloud Library (PCL), the octree change detection function detects the changes between two point clouds. The new voxels, which are occupied in one point cloud but not in the other, can be detected and stored.
Change detection is applied to the desk region in the first 11 frames, frame by frame. The voxels that exist in all 11 frames are taken as the model structure.
The points extracted from the 11th frame according to this octree structure are taken as the model's point cloud.
The resolution of the octree is set to 0.05 (a leaf size of 5 cm).
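The model construction can be sketched without PCL by replacing the octree with a flat set of occupied voxel indices. This is a simplification rather than the octree change detector itself, and the names voxel_key and build_model are hypothetical.

```python
import math

def voxel_key(p, leaf=0.05):
    """Integer index of the leaf-sized voxel that contains point p."""
    return tuple(math.floor(c / leaf) for c in p)

def build_model(frames, leaf=0.05):
    """frames: non-empty list of point lists, e.g. the first 11 desk-region frames.

    Returns the voxels occupied in every frame, and the points of the last
    frame that lie in those voxels.
    """
    occupied = [{voxel_key(p, leaf) for p in frame} for frame in frames]
    stable = set.intersection(*occupied)
    model_points = [p for p in frames[-1] if voxel_key(p, leaf) in stable]
    return stable, model_points
```

Voxels that are occupied only in some of the frames, for instance because of sensor noise or a passing hand, do not become part of the model.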
Euclidean clustering is applied to the near-desk region with a distance threshold of 2 cm. That is, every point within 2 cm of an object is considered part of that object.
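The clustering rule can be illustrated with a small region-growing sketch. It uses a brute-force neighbour search instead of the k-d tree that a library implementation would normally use, so it is only suitable for small clouds; the function name is hypothetical.

```python
import math
from collections import deque

def euclidean_clusters(points, tolerance=0.02):
    """Group point indices so that neighbours within `tolerance` share a cluster."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        queue = deque([seed])
        cluster = [seed]
        while queue:
            i = queue.popleft()
            # All still-unassigned points within the distance threshold of point i.
            near = [j for j in unvisited
                    if math.dist(points[i], points[j]) <= tolerance]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                cluster.append(j)
        clusters.append(sorted(cluster))
    return sorted(clusters)
```

Points that are chained together by gaps of at most 2 cm end up in the same cluster even if the first and last point are farther apart than 2 cm.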
Information about every object in the model, such as its centroid coordinates and bounding box, is stored.
The centroid of each object is obtained with the getCentroid function in PCL.
The normal vector of the front plane is detected. Based on this normal, the box is rotated so that it is parallel to the x and y axes.
After that, the minimum and maximum values of the coordinates describe the bounding box. Methods to make the box more accurate are still being developed.
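The bounding box step can be sketched as a rotation about the z axis followed by a min/max computation. The sketch assumes that the object stands on the desk, so that only the yaw angle of the front-plane normal matters, and it aligns that normal with the +x axis; both choices are assumptions, as is the function name.

```python
import math

def aligned_bounding_box(points, front_normal):
    """Rotate points about z so the front normal points along +x, then take extents.

    Returns (mins, maxs, yaw); yaw is the angle needed to rotate the box back.
    """
    yaw = math.atan2(front_normal[1], front_normal[0])
    c, s = math.cos(-yaw), math.sin(-yaw)
    rotated = [(c * x - s * y, s * x + c * y, z) for x, y, z in points]
    mins = tuple(min(p[i] for p in rotated) for i in range(3))
    maxs = tuple(max(p[i] for p in rotated) for i in range(3))
    return mins, maxs, yaw
```

Without the rotation, the min/max box of an object standing at an angle on the desk would be larger than the object itself.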
Every frame after initialization is compared with the model using the octree, and the following parameters are calculated and recorded.
The number of points in new voxels (voxels that exist in the current frame but not in the model) is recorded. This is an important indicator for mode judgement, that is, for deciding whether the camera is covered by an obstacle. If a visible obstacle covers the camera, this value becomes larger.
The number of missing points is computed as the number of points in the desk world model minus the number of points in the current frame that lie in voxels shared with the model. This quantity roughly indicates how many points are missing from the desk model.
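The two counts can be written down directly from their definitions. The sketch below again uses a flat set of voxel indices in place of the octree, and the names voxel_key and change_counts are hypothetical.

```python
import math

def voxel_key(p, leaf=0.05):
    """Integer index of the leaf-sized voxel that contains point p."""
    return tuple(math.floor(c / leaf) for c in p)

def change_counts(model_points, frame_points, leaf=0.05):
    """Return (new_points, missing_points) for one frame compared with the model."""
    model_voxels = {voxel_key(p, leaf) for p in model_points}
    # Frame points lying in voxels that the model does not occupy.
    new_points = sum(1 for p in frame_points
                     if voxel_key(p, leaf) not in model_voxels)
    # Frame points lying in voxels shared with the model.
    common_points = len(frame_points) - new_points
    missing_points = len(model_points) - common_points
    return new_points, missing_points
```

Note that the missing count can become negative when the current frame has more points than the model inside the shared voxels, which is consistent with it being a rough indicator.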
A queue of length 10 is built that holds the number of new points in the current frame and in the previous 9 frames. The standard deviation (SD) of these 10 values is recomputed for every frame; it is also an important indicator of whether the scene is changing dramatically or is stable.
The SD of the number of missing points over the same 10 frames is also calculated and likewise reflects how quickly the points are changing.
So every frame yields 4 values, the number of new points, the number of missing points, and the SD of each, for judging whether the camera is covered and whether the scene is stable.
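The per-frame bookkeeping can be sketched with two fixed-length queues. The class name ChangeTracker is hypothetical, and the use of the population standard deviation (statistics.pstdev) is an assumption, since the text does not say which variant is used.

```python
from collections import deque
from statistics import pstdev

class ChangeTracker:
    """Keeps the last `window` new/missing counts and their standard deviations."""

    def __init__(self, window=10):
        # deque(maxlen=...) silently drops the oldest value when full.
        self.new = deque(maxlen=window)
        self.missing = deque(maxlen=window)

    def update(self, new_points, missing_points):
        """Record one frame and return the 4 values used for judgement."""
        self.new.append(new_points)
        self.missing.append(missing_points)
        return (new_points, missing_points,
                pstdev(self.new), pstdev(self.missing))
```

A stable scene gives SD values near zero, while a hand or obstacle moving through the view makes them rise for as long as the changing counts remain in the window.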
When suitable conditions are fulfilled, the model is updated together with the information about the objects.
The objects are clustered again, and their information is updated as well.