当前位置：首页 > 编程资源 > 编程问答 >内容正文

编程问答

纹理对象的实时姿态估计

发布时间：2025/7/25 编程问答 51 豆豆

生活随笔收集整理的这篇文章主要介绍了纹理对象的实时姿态估计小编觉得挺不错的,现在分享给大家,帮大家做个参考.

如今，增强现实技术是计算机视觉和机器人领域的热门研究课题之一。The most elemental problem in augmented reality is the estimation of the camera pose respect of an object in the case of computer vision area to do later some 3D rendering or in the case of robotics obtain an object pose in order to grasp it and do some manipulation。然而，这不是图像处理中最常见的问题，实际上最常见的问题是应用的很多算法或数学运算解决一个问题的计算消耗。

目的：

在文将解释如何建立一个实时相机姿态估计应用程序，用来跟踪一个具有六个自由度（six degrees of freedom）的纹理对象（基于给定的 2D image and its 3D textured model）.

代码具有以下功能部分：

1) 读取3D纹理对象和对象的网格数据(Read 3D textured object model and object mesh)
2) 输入源为摄像头或者视频文件（Take input from Camera or Video）
3) 提取ORB特征（Extract ORB features and descriptors from the scene）
4) 基于Flann算法对ORB特征描述子进行匹配（Match scene descriptors with model descriptors using Flann matcher）
5) 使用PnP + Ransac进行姿态估计（Pose estimation using PnP + Ransac）
6) 使用线性卡尔曼滤波去除错误的姿态估计（Linear Kalman Filter for bad poses rejection）

理论

在计算机视觉中，从 n对3D到2D点对应关系中估计相机的的姿态是一个很基础和很好理解的问题。姿态估计（The most general version）一般需要估计姿态的六个自由度和五个标定参数: focal length, principal point, aspect ratio and skew. 　姿态估计使用著名的DLT(Direct Linear Transform)算法，可以使用最少6组对应关系建立（It could be established with a minimum of 6 correspondences）。对于DLT算法的精度提高，有非常多的简化方法，最常用简化方法为参数标定，也就是Perspective-*n*-Point问题：

（问题的公式）Problem Formulation:

给定一组3D点pi（expressed in a world reference frame）和投影到图像的上的2D点ui的对应关系，我们尝试得到相机相对于世界坐标系(w.r.t. the world)的姿态（R 和 t）和焦距f.。OpenCV 提供了4种不同的方法解决Perspective-*n*-Point 问题，返回值为 R 和 t 。然后使用下面的公式可以把3D的点投影到图像平面上:

s ⎡⎣⎢⎢uv1⎤⎦⎥⎥=⎡⎣⎢⎢⎢fx000fy0cxcy1⎤⎦⎥⎥⎥⎡⎣⎢⎢r11r21r31r12r22r32r13r23r33t1t2t3⎤⎦⎥⎥⎡⎣⎢⎢⎢⎢XYZ1⎤⎦⎥⎥⎥⎥

关于公式的详细文档参见：　 Camera Calibration and 3D Reconstruction .

源代码

源代码路径：(opencv 3.1.0 )　samples/cpp/tutorial_code/calib3d/real_time_pose_estimation/

下载路径：　real_time_pose_estimation

包含两个主要的程序：

１）　Model registration

This applicaton is exclusive to whom don't have a 3D textured model of the object to be detected. 你可以使用这个代码创建自定义的textured 3D model.当前代码只对平面的(或者说面试平的)物体有效,如果需要得到一个形状复杂的模型需要使用复杂的软件创建。

当前代码的输入为一幅图像和图像对应的3D网格。我们提供相机的内参，用来校正获取的图像作为算法的输入图像.所有的文件需要通过绝对路径或者相对于工作目录的相对路径指定。如果没有指定文件，代码将会尝试使用默认参数.

程序执行，首先从输入图像中提取 ORB特征描述子，然后使用网格数据和 Möller–Trumbore intersection 算法来计算特征的3D坐标系。最后，3D坐标点和特征描述子存在YAML格式文件的不同列表中，每一行存储一个不同的点.关于文件存储和打开的技术文档，请参见File Input and Output using XML and YAML files .

２）　Model detection

本程序的目的为实时估计对象的姿态并给出其对应的3D textured model.

模型检测程序运行时，先加载YAML(structure explained in the model registration program)文件格式的3D textured model.在视觉场景中, 会检测和提取ORB特征描述子.然后使用 cv::FlannBasedMatcher 和 cv::flann::GenericIndex 进行场景中特征描述子和模型中特征描述子的匹配. 将找到的匹配关系，用 cv::solvePnPRansac 函数计算得到相机的 R和t参数。最后使用 KalmanFilter 去除错误的姿态估计.

如果在编译OpenCV的时候编译了源代码中的例子程序，可以待对应路径下找到编译好的程序： opencv/build/bin/cpp-tutorial-pnp_detection. 然后你可以运行程序，并尝试改变一些参数:

//This program shows how to detect an object given its 3D textured model. You can choose to use a recorded video or the webcam.Usage:./cpp-tutorial-pnp_detection -help Keys:'esc' - to quit. -------------------------------------------------------------------------- Usage: cpp-tutorial-pnp_detection [params]-c, --confidence (value:0.95)RANSAC confidence-e, --error (value:2.0)RANSAC reprojection errror-f, --fast (value:true)use of robust fast match-h, --help (value:true)print this message--in, --inliers (value:30)minimum inliers for Kalman update--it, --iterations (value:500)RANSAC maximum iterations count-k, --keypoints (value:2000)number of keypoints to detect--meshpath to ply mesh--method, --pnp (value:0)PnP method: (0) ITERATIVE - (1) EPNP - (2) P3P - (3) DLS--modelpath to yml model-r, --ratio (value:0.7)threshold for ratio test-v, --videopath to recorded video

例如你可以在运行程序的时候，改变pnp的方法:

./cpp-tutorial-pnp_detection --method=2

代码讲解

1) 读取3D纹理对象和对象的网格数据(Read 3D textured object model and object mesh)

为了加载textured model实现了Model类(class),其中的函数　load() 　用来打开 YAML 格式的文件，并读取存储的 3D点和相应的描述子。在OpenCV的源代码中可以找到一个3D textured model，路径samples/cpp/tutorial_code/calib3d/real_time_pose_estimation/Data/cookies_ORB.yml.

/* Load a YAML file using OpenCV */ void Model::load(const std::string path) {cv::Mat points3d_mat;cv::FileStorage storage(path, cv::FileStorage::READ);storage["points_3d"] >> points3d_mat;storage["descriptors"] >> descriptors_;points3d_mat.copyTo(list_points3d_in_);storage.release(); }
主程序中加载模型的方式如下:

Model model; // instantiate Model object model.load(yml_read_path); // load a 3D textured object model 为了加载网格模型实现了 Mesh类( class),其中的函数　load() 　用来打开 ∗.ply 格式的文件，并读取存储的object的 3D点and also the composed。在OpenCV的源代码中可以找到一个例子，路径 samples/cpp/tutorial_code/calib3d/real_time_pose_estimation/Data/box.ply

/* Load a CSV with *.ply format */ void Mesh::load(const std::string path) {// Create the readerCsvReader csvReader(path);// Clear previous datalist_vertex_.clear();list_triangles_.clear();// Read from .ply filecsvReader.readPLY(list_vertex_, list_triangles_);// Update mesh attributesnum_vertexs_ = list_vertex_.size();num_triangles_ = list_triangles_.size(); }
主程序中网格加载的代码为:

Mesh mesh; // instantiate Mesh object mesh.load(ply_read_path); // load an object mesh
可以加载不同的模型和网格:

./cpp-tutorial-pnp_detection --mesh=/absolute_path_to_your_mesh.ply --model=/absolute_path_to_your_model.yml

2) 输入源为摄像头或者视频文件（Take input from Camera or Video）

To detect is necessary capture video. It's done loading a recorded video by passing the absolute path where it is located in your machine. In order to test the application you can find a recorded video insamples/cpp/tutorial_code/calib3d/real_time_pose_estimation/Data/box.mp4.

cv::VideoCapture cap; // instantiate VideoCapture cap.open(video_read_path); // open a recorded video if(!cap.isOpened()) // check if we succeeded {std::cout << "Could not open the camera device" << std::endl;return -1; }
Then the algorithm is computed frame per frame:

cv::Mat frame, frame_vis; while(cap.read(frame) && cv::waitKey(30) != 27) // capture frame until ESC is pressed {frame_vis = frame.clone(); // refresh visualisation frame// MAIN ALGORITHM }
You can also load different recorded video:

./cpp-tutorial-pnp_detection --video=/absolute_path_to_your_video.mp4

3) 提取ORB特征（Extract ORB features and descriptors from the scene）

The next step is to detect the scene features and extract it descriptors. For this task I implemented aclassRobustMatcher which has a function for keypoints detection and features extraction. You can find it insamples/cpp/tutorial_code/calib3d/real_time_pose_estimation/src/RobusMatcher.cpp. In yourRobusMatch object you can use any of the 2D features detectors of OpenCV. In this case I usedcv::ORB features because is based oncv::FAST to detect the keypoints and cv::xfeatures2d::BriefDescriptorExtractor to extract the descriptors which means that is fast and robust to rotations. You can find more detailed information aboutORB in the documentation.

The following code is how to instantiate and set the features detector and the descriptors extractor:

RobustMatcher rmatcher; // instantiate RobustMatcher cv::FeatureDetector * detector = new cv::OrbFeatureDetector(numKeyPoints); // instatiate ORB feature detector cv::DescriptorExtractor * extractor = new cv::OrbDescriptorExtractor(); // instatiate ORB descriptor extractor rmatcher.setFeatureDetector(detector); // set feature detector rmatcher.setDescriptorExtractor(extractor); // set descriptor extractor
The features and descriptors will be computed by the RobustMatcher inside the matching function.

4) 基于Flann算法对ORB特征描述子进行匹配（Match scene descriptors with model descriptors using Flann matcher）

It is the first step in our detection algorithm. The main idea is to match the scene descriptors with our model descriptors in order to know the 3D coordinates of the found features into the current scene.

Firstly, we have to set which matcher we want to use. In this case is usedcv::FlannBasedMatcher matcher which in terms of computational cost is faster than thecv::BFMatcher matcher as we increase the trained collectction of features. Then, for FlannBased matcher the index created isMulti-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search due toORB descriptors are binary.

You can tune the LSH and search parameters to improve the matching efficiency:

cv::Ptr<cv::flann::IndexParams> indexParams = cv::makePtr<cv::flann::LshIndexParams>(6, 12, 1); // instantiate LSH index parameters cv::Ptr<cv::flann::SearchParams> searchParams = cv::makePtr<cv::flann::SearchParams>(50); // instantiate flann search parameters cv::DescriptorMatcher * matcher = new cv::FlannBasedMatcher(indexParams, searchParams); // instantiate FlannBased matcher rmatcher.setDescriptorMatcher(matcher);

Secondly, we have to call the matcher by using robustMatch() or fastRobustMatch() function. The difference of using this two functions is its computational cost. The first method is slower but more robust at filtering good matches because uses two ratio test and a symmetry test. In contrast, the second method is faster but less robust because only applies a single ratio test to the matches.

The following code is to get the model 3D points and its descriptors and then call the matcher in the main program:

// Get the MODEL INFO std::vector<cv::Point3f> list_points3d_model = model.get_points3d(); // list with model 3D coordinates cv::Mat descriptors_model = model.get_descriptors(); // list with descriptors of each 3D coordinate
// -- Step 1: Robust matching between model descriptors and scene descriptors std::vector<cv::DMatch> good_matches; // to obtain the model 3D points in the scene std::vector<cv::KeyPoint> keypoints_scene; // to obtain the 2D points of the scene if(fast_match) {rmatcher.fastRobustMatch(frame, good_matches, keypoints_scene, descriptors_model); } else {rmatcher.robustMatch(frame, good_matches, keypoints_scene, descriptors_model); }
The following code corresponds to the robustMatch() function which belongs to the RobustMatcher class. This function uses the given image to detect the keypoints and extract the descriptors, match using two Nearest Neighbour the extracted descriptors with the given model descriptors and vice versa. Then, a ratio test is applied to the two direction matches in order to remove these matches which its distance ratio between the first and second best match is larger than a given threshold. Finally, a symmetry test is applied in order the remove non symmetrical matches. void RobustMatcher::robustMatch( const cv::Mat& frame, std::vector<cv::DMatch>& good_matches,std::vector<cv::KeyPoint>& keypoints_frame,const std::vector<cv::KeyPoint>& keypoints_model, const cv::Mat& descriptors_model ) {// 1a. Detection of the ORB featuresthis->computeKeyPoints(frame, keypoints_frame);// 1b. Extraction of the ORB descriptorscv::Mat descriptors_frame;this->computeDescriptors(frame, keypoints_frame, descriptors_frame);// 2. Match the two image descriptorsstd::vector<std::vector<cv::DMatch> > matches12, matches21;// 2a. From image 1 to image 2matcher_->knnMatch(descriptors_frame, descriptors_model, matches12, 2); // return 2 nearest neighbours// 2b. From image 2 to image 1matcher_->knnMatch(descriptors_model, descriptors_frame, matches21, 2); // return 2 nearest neighbours// 3. Remove matches for which NN ratio is > than threshold// clean image 1 -> image 2 matchesint removed1 = ratioTest(matches12);// clean image 2 -> image 1 matchesint removed2 = ratioTest(matches21);// 4. Remove non-symmetrical matchessymmetryTest(matches12, matches21, good_matches); } After the matches filtering we have to subtract the 2D and 3D correspondences from the found scene keypoints and our 3D model using the obtained DMatches vector. For more information about cv::DMatch check the documentation.
// -- Step 2: Find out the 2D/3D correspondences std::vector<cv::Point3f> list_points3d_model_match; // container for the model 3D coordinates found in the scene std::vector<cv::Point2f> list_points2d_scene_match; // container for the model 2D coordinates found in the scene for(unsigned int match_index = 0; match_index < good_matches.size(); ++match_index) {cv::Point3f point3d_model = list_points3d_model[ good_matches[match_index].trainIdx ]; // 3D point from modelcv::Point2f point2d_scene = keypoints_scene[ good_matches[match_index].queryIdx ].pt; // 2D point from the scenelist_points3d_model_match.push_back(point3d_model); // add 3D pointlist_points2d_scene_match.push_back(point2d_scene);

You can also change the ratio test threshold, the number of keypoints to detect as well as use or not the robust matcher:

./cpp-tutorial-pnp_detection --ratio=0.8 --keypoints=1000 --fast=false
5) 使用PnP + Ransac进行姿态估计（Pose estimation using PnP + Ransac）

Once with the 2D and 3D correspondences we have to apply a PnP algorithm in order to estimate the camera pose. The reason why we have to usecv::solvePnPRansac instead of cv::solvePnP is due to the fact that after the matching not all the found correspondences are correct and, as like as not, there are false correspondences or also calledoutliers. TheRandom Sample Consensus orRansac is a non-deterministic iterative method which estimate parameters of a mathematical model from observed data producing an aproximate result as the number of iterations increase. After appylingRansac all theoutliers will be eliminated to then estimate the camera pose with a certain probability to obtain a good solution.

For the camera pose estimation I have implemented a classPnPProblem. Thisclass has 4 atributes: a given calibration matrix, the rotation matrix, the translation matrix and the rotation-translation matrix. The intrinsic calibration parameters of the camera which you are using to estimate the pose are necessary. In order to obtain the parameters you can check Camera calibration with square chessboard and Camera calibration With OpenCV tutorials.

The following code is how to declare the PnPProblem class in the main program:

// Intrinsic camera parameters: UVC WEBCAM double f = 55; // focal length in mm double sx = 22.3, sy = 14.9; // sensor size double width = 640, height = 480; // image size double params_WEBCAM[] = { width*f/sx, // fxheight*f/sy, // fywidth/2, // cxheight/2}; // cy PnPProblem pnp_detection(params_WEBCAM); // instantiate PnPProblem class

The following code is how the PnPProblem class initialises its atributes:

// Custom constructor given the intrinsic camera parameters PnPProblem::PnPProblem(const double params[]) {_A_matrix = cv::Mat::zeros(3, 3, CV_64FC1); // intrinsic camera parameters_A_matrix.at<double>(0, 0) = params[0]; // [ fx 0 cx ]_A_matrix.at<double>(1, 1) = params[1]; // [ 0 fy cy ]_A_matrix.at<double>(0, 2) = params[2]; // [ 0 0 1 ]_A_matrix.at<double>(1, 2) = params[3];_A_matrix.at<double>(2, 2) = 1;_R_matrix = cv::Mat::zeros(3, 3, CV_64FC1); // rotation matrix_t_matrix = cv::Mat::zeros(3, 1, CV_64FC1); // translation matrix_P_matrix = cv::Mat::zeros(3, 4, CV_64FC1); // rotation-translation matrix }

OpenCV provides four PnP methods: ITERATIVE, EPNP, P3P and DLS. Depending on the application type, the estimation method will be different. In the case that we want to make a real time application, the more suitable methods are EPNP and P3P due to that are faster than ITERATIVE and DLS at finding an optimal solution. However, EPNP and P3P are not especially robust in front of planar surfaces and sometimes the pose estimation seems to have a mirror effect. Therefore, in this this tutorial is used ITERATIVE method due to the object to be detected has planar surfaces.

The OpenCV Ransac implementation wants you to provide three parameters: the maximum number of iterations until stop the algorithm, the maximum allowed distance between the observed and computed point projections to consider it an inlier and the confidence to obtain a good result. You can tune these paramaters in order to improve your algorithm performance. Increasing the number of iterations you will have a more accurate solution, but will take more time to find a solution. Increasing the reprojection error will reduce the computation time, but your solution will be unaccurate. Decreasing the confidence your arlgorithm will be faster, but the obtained solution will be unaccurate.

The following parameters work for this application:

// RANSAC parameters int iterationsCount = 500; // number of Ransac iterations. float reprojectionError = 2.0; // maximum allowed distance to consider it an inlier. float confidence = 0.95; // ransac successful confidence.
The following code corresponds to the estimatePoseRANSAC() function which belongs to the PnPProblem class. This function estimates the rotation and translation matrix given a set of 2D/3D correspondences, the desired PnP method to use, the output inliers container and the Ransac parameters:
// Estimate the pose given a list of 2D/3D correspondences with RANSAC and the method to use void PnPProblem::estimatePoseRANSAC( const std::vector<cv::Point3f> &list_points3d, // list with model 3D coordinatesconst std::vector<cv::Point2f> &list_points2d, // list with scene 2D coordinatesint flags, cv::Mat &inliers, int iterationsCount, // PnP method; inliers containerfloat reprojectionError, float confidence ) // Ransac parameters {cv::Mat distCoeffs = cv::Mat::zeros(4, 1, CV_64FC1); // vector of distortion coefficientscv::Mat rvec = cv::Mat::zeros(3, 1, CV_64FC1); // output rotation vectorcv::Mat tvec = cv::Mat::zeros(3, 1, CV_64FC1); // output translation vectorbool useExtrinsicGuess = false; // if true the function uses the provided rvec and tvec values as// initial approximations of the rotation and translation vectorscv::solvePnPRansac( list_points3d, list_points2d, _A_matrix, distCoeffs, rvec, tvec,useExtrinsicGuess, iterationsCount, reprojectionError, confidence,inliers, flags );Rodrigues(rvec,_R_matrix); // converts Rotation Vector to Matrix_t_matrix = tvec; // set translation matrixthis->set_P_matrix(_R_matrix, _t_matrix); // set rotation-translation matrix }
In the following code are the 3th and 4th steps of the main algorithm. The first, calling the above function and the second taking the output inliers vector from Ransac to get the 2D scene points for drawing purpose. As seen in the code we must be sure to apply Ransac if we have matches, in the other case, the function cv::solvePnPRansac crashes due to any OpenCV bug.

if(good_matches.size() > 0) // None matches, then RANSAC crashes {// -- Step 3: Estimate the pose using RANSAC approachpnp_detection.estimatePoseRANSAC( list_points3d_model_match, list_points2d_scene_match,pnpMethod, inliers_idx, iterationsCount, reprojectionError, confidence );// -- Step 4: Catch the inliers keypoints to drawfor(int inliers_index = 0; inliers_index < inliers_idx.rows; ++inliers_index){int n = inliers_idx.at<int>(inliers_index); // i-inliercv::Point2f point2d = list_points2d_scene_match[n]; // i-inlier point 2Dlist_points2d_inliers.push_back(point2d); // add i-inlier to list }

Finally, once the camera pose has been estimated we can use the R and t in order to compute the 2D projection onto the image of a given 3D point expressed in a world reference frame using the showed formula onTheory.

The following code corresponds to the backproject3DPoint() function which belongs to thePnPProblem class. The function backproject a given 3D point expressed in a world reference frame onto a 2D image:

// Backproject a 3D point to 2D using the estimated pose parameters cv::Point2f PnPProblem::backproject3DPoint(const cv::Point3f &point3d) {// 3D point vector [x y z 1]'cv::Mat point3d_vec = cv::Mat(4, 1, CV_64FC1);point3d_vec.at<double>(0) = point3d.x;point3d_vec.at<double>(1) = point3d.y;point3d_vec.at<double>(2) = point3d.z;point3d_vec.at<double>(3) = 1;// 2D point vector [u v 1]'cv::Mat point2d_vec = cv::Mat(4, 1, CV_64FC1);point2d_vec = _A_matrix * _P_matrix * point3d_vec;// Normalization of [u v]'cv::Point2f point2d;point2d.x = point2d_vec.at<double>(0) / point2d_vec.at<double>(2);point2d.y = point2d_vec.at<double>(1) / point2d_vec.at<double>(2);return point2d; }

The above function is used to compute all the 3D points of the object Mesh to show the pose of the object.

You can also change RANSAC parameters and PnP method:

./cpp-tutorial-pnp_detection --error=0.25 --confidence=0.90 --iterations=250 --method=3

6) 使用线性卡尔曼滤波去除错误的姿态估计（Linear Kalman Filter for bad poses rejection）

Is it common in computer vision or robotics fields that after applying detection or tracking techniques, bad results are obtained due to some sensor errors. In order to avoid these bad detections in this tutorial is explained how to implement a Linear Kalman Filter. The Kalman Filter will be applied after detected a given number of inliers.

You can find more information about what Kalman Filter is. In this tutorial it's used the OpenCV implementation of the cv::KalmanFilter based on Linear Kalman Filter for position and orientation tracking to set the dynamics and measurement models.

Firstly, we have to define our state vector which will have 18 states: the positional data (x,y,z) with its first and second derivatives (velocity and acceleration), then rotation is added in form of three euler angles (roll, pitch, jaw) together with their first and second derivatives (angular velocity and acceleration)

X=(x,y,z,x˙,y˙,z˙,x¨,y¨,z¨,ψ,θ,ϕ,ψ˙,θ˙,ϕ˙,ψ¨,θ¨,ϕ¨)T

Secondly, we have to define the number of measuremnts which will be 6: from R and t we can extract (x,y,z) and (ψ,θ,ϕ). In addition, we have to define the number of control actions to apply to the system which in this case will be zero. Finally, we have to define the differential time between measurements which in this case is 1/T, where T is the frame rate of the video.

cv::KalmanFilter KF; // instantiate Kalman Filter int nStates = 18; // the number of states int nMeasurements = 6; // the number of measured states int nInputs = 0; // the number of action control double dt = 0.125; // time between measurements (1/FPS) initKalmanFilter(KF, nStates, nMeasurements, nInputs, dt); // init function

The following code corresponds to the Kalman Filter initialisation. Firstly, is set the process noise, the measurement noise and the error covariance matrix. Secondly, are set the transition matrix which is the dynamic model and finally the measurement matrix, which is the measurement model.

You can tune the process and measurement noise to improve the Kalman Filter performance. As the measurement noise is reduced the faster will converge doing the algorithm sensitive in front of bad measurements.

void initKalmanFilter(cv::KalmanFilter &KF, int nStates, int nMeasurements, int nInputs, double dt) {KF.init(nStates, nMeasurements, nInputs, CV_64F); // init Kalman Filtercv::setIdentity(KF.processNoiseCov, cv::Scalar::all(1e-5)); // set process noisecv::setIdentity(KF.measurementNoiseCov, cv::Scalar::all(1e-4)); // set measurement noisecv::setIdentity(KF.errorCovPost, cv::Scalar::all(1)); // error covariance/* DYNAMIC MODEL */// [1 0 0 dt 0 0 dt2 0 0 0 0 0 0 0 0 0 0 0]// [0 1 0 0 dt 0 0 dt2 0 0 0 0 0 0 0 0 0 0]// [0 0 1 0 0 dt 0 0 dt2 0 0 0 0 0 0 0 0 0]// [0 0 0 1 0 0 dt 0 0 0 0 0 0 0 0 0 0 0]// [0 0 0 0 1 0 0 dt 0 0 0 0 0 0 0 0 0 0]// [0 0 0 0 0 1 0 0 dt 0 0 0 0 0 0 0 0 0]// [0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0]// [0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0]// [0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0]// [0 0 0 0 0 0 0 0 0 1 0 0 dt 0 0 dt2 0 0]// [0 0 0 0 0 0 0 0 0 0 1 0 0 dt 0 0 dt2 0]// [0 0 0 0 0 0 0 0 0 0 0 1 0 0 dt 0 0 dt2]// [0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 dt 0 0]// [0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 dt 0]// [0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 dt]// [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0]// [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0]// [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1]// positionKF.transitionMatrix.at<double>(0,3) = dt;KF.transitionMatrix.at<double>(1,4) = dt;KF.transitionMatrix.at<double>(2,5) = dt;KF.transitionMatrix.at<double>(3,6) = dt;KF.transitionMatrix.at<double>(4,7) = dt;KF.transitionMatrix.at<double>(5,8) = dt;KF.transitionMatrix.at<double>(0,6) = 0.5*pow(dt,2);KF.transitionMatrix.at<double>(1,7) = 0.5*pow(dt,2);KF.transitionMatrix.at<double>(2,8) = 0.5*pow(dt,2);// orientationKF.transitionMatrix.at<double>(9,12) = dt;KF.transitionMatrix.at<double>(10,13) = dt;KF.transitionMatrix.at<double>(11,14) = dt;KF.transitionMatrix.at<double>(12,15) = dt;KF.transitionMatrix.at<double>(13,16) = dt;KF.transitionMatrix.at<double>(14,17) = dt;KF.transitionMatrix.at<double>(9,15) = 0.5*pow(dt,2);KF.transitionMatrix.at<double>(10,16) = 0.5*pow(dt,2);KF.transitionMatrix.at<double>(11,17) = 0.5*pow(dt,2);/* MEASUREMENT MODEL */// [1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]// [0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]// [0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]// [0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0]// [0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0]// [0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0]KF.measurementMatrix.at<double>(0,0) = 1; // xKF.measurementMatrix.at<double>(1,1) = 1; // yKF.measurementMatrix.at<double>(2,2) = 1; // zKF.measurementMatrix.at<double>(3,9) = 1; // rollKF.measurementMatrix.at<double>(4,10) = 1; // pitchKF.measurementMatrix.at<double>(5,11) = 1; // yaw }
In the following code is the 5th step of the main algorithm. When the obtained number of inliers after Ransac is over the threshold, the measurements matrix is filled and then the Kalman Filter is updated: // -- Step 5: Kalman Filter // GOOD MEASUREMENT if( inliers_idx.rows >= minInliersKalman ) {// Get the measured translationcv::Mat translation_measured(3, 1, CV_64F);translation_measured = pnp_detection.get_t_matrix();// Get the measured rotationcv::Mat rotation_measured(3, 3, CV_64F);rotation_measured = pnp_detection.get_R_matrix();// fill the measurements vectorfillMeasurements(measurements, translation_measured, rotation_measured); } // Instantiate estimated translation and rotation cv::Mat translation_estimated(3, 1, CV_64F); cv::Mat rotation_estimated(3, 3, CV_64F); // update the Kalman filter with good measurements updateKalmanFilter( KF, measurements,translation_estimated, rotation_estimated);
The following code corresponds to the fillMeasurements() function which converts the measured Rotation Matrix to Eulers angles and fill the measurements matrix along with the measured translation vector: void fillMeasurements( cv::Mat &measurements,const cv::Mat &translation_measured, const cv::Mat &rotation_measured) {// Convert rotation matrix to euler anglescv::Mat measured_eulers(3, 1, CV_64F);measured_eulers = rot2euler(rotation_measured);// Set measurement to predictmeasurements.at<double>(0) = translation_measured.at<double>(0); // xmeasurements.at<double>(1) = translation_measured.at<double>(1); // ymeasurements.at<double>(2) = translation_measured.at<double>(2); // zmeasurements.at<double>(3) = measured_eulers.at<double>(0); // rollmeasurements.at<double>(4) = measured_eulers.at<double>(1); // pitchmeasurements.at<double>(5) = measured_eulers.at<double>(2); // yaw }
The following code corresponds to the updateKalmanFilter() function which update the Kalman Filter and set the estimated Rotation Matrix and translation vector. The estimated Rotation Matrix comes from the estimated Euler angles to Rotation Matrix . void updateKalmanFilter( cv::KalmanFilter &KF, cv::Mat &measurement,cv::Mat &translation_estimated, cv::Mat &rotation_estimated ) {// First predict, to update the internal statePre variablecv::Mat prediction = KF.predict();// The "correct" phase that is going to use the predicted value and our measurementcv::Mat estimated = KF.correct(measurement);// Estimated translationtranslation_estimated.at<double>(0) = estimated.at<double>(0);translation_estimated.at<double>(1) = estimated.at<double>(1);translation_estimated.at<double>(2) = estimated.at<double>(2);// Estimated euler anglescv::Mat eulers_estimated(3, 1, CV_64F);eulers_estimated.at<double>(0) = estimated.at<double>(9);eulers_estimated.at<double>(1) = estimated.at<double>(10);eulers_estimated.at<double>(2) = estimated.at<double>(11);// Convert estimated quaternion to rotation matrixrotation_estimated = euler2rot(eulers_estimated); }

The 6th step is set the estimated rotation-translation matrix:

// -- Step 6: Set estimated projection matrix pnp_detection_est.set_P_matrix(rotation_estimated, translation_estimated);
// -- Step X: Draw pose drawObjectMesh(frame_vis, &mesh, &pnp_detection, green); // draw current pose drawObjectMesh(frame_vis, &mesh, &pnp_detection_est, yellow); // draw estimated pose double l = 5; std::vector<cv::Point2f> pose_points2d; pose_points2d.push_back(pnp_detection_est.backproject3DPoint(cv::Point3f(0,0,0))); // axis center pose_points2d.push_back(pnp_detection_est.backproject3DPoint(cv::Point3f(l,0,0))); // axis x pose_points2d.push_back(pnp_detection_est.backproject3DPoint(cv::Point3f(0,l,0))); // axis y pose_points2d.push_back(pnp_detection_est.backproject3DPoint(cv::Point3f(0,0,l))); // axis z draw3DCoordinateAxes(frame_vis, pose_points2d);

You can also modify the minimum inliers to update Kalman Filter:

./cpp-tutorial-pnp_detection --inliers=20

结果

下面的视频是实时姿势估计的结果，使用下面的参数和前面解释的检测算法：

// Robust Matcher parameters int numKeyPoints = 2000; // number of detected keypoints float ratio = 0.70f; // ratio test bool fast_match = true; // fastRobustMatch() or robustMatch() // RANSAC parameters int iterationsCount = 500; // number of Ransac iterations. int reprojectionError = 2.0; // maximum allowed distance to consider it an inlier. float confidence = 0.95; // ransac successful confidence. // Kalman Filter parameters int minInliersKalman = 30; // Kalman threshold updating

视频见YouTube： YouTube.

总结

以上是生活随笔为你收集整理的纹理对象的实时姿态估计的全部内容，希望文章能够帮你解决所遇到的问题。

如果觉得生活随笔网站内容还不错，欢迎将生活随笔推荐给好友。