Preface
This article documents how to run the FoundationPose demo that takes a CAD model as input, outputs pose information, and visualizes the results.
It then covers training the NeRF object-reconstruction part, as well as the demo that takes RGBD images as input.

1. Setting up the environment
Option 1: Docker image (recommended)
First, download the open-source code: https://github.com/NVlabs/FoundationPose
Then run the following commands to pull the image and build the container environment:
cd docker/
docker pull wenbowen123/foundationpose && docker tag wenbowen123/foundationpose foundationpose
bash docker/run_container.sh
bash build_all.sh
Once the build finishes, you can enter the running container with docker exec.
Option 2: Conda (more cumbersome)

First, install Eigen 3:
cd $HOME && wget -q https://gitlab.com/libeigen/eigen/-/archive/3.4.0/eigen-3.4.0.tar.gz && \
tar -xzf eigen-3.4.0.tar.gz && \
cd eigen-3.4.0 && mkdir build && cd build
cmake .. -Wno-dev -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS=-std=c++14 ..
sudo make install
cd $HOME && rm -rf eigen-3.4.0 eigen-3.4.0.tar.gz
Then create the conda environment with the following commands:
# create conda environment
conda create -n foundationpose python=3.9

# activate conda environment
conda activate foundationpose

# install dependencies
python -m pip install -r requirements.txt

# Install NVDiffRast
python -m pip install --quiet --no-cache-dir git+https://github.com/NVlabs/nvdiffrast.git

# Kaolin (Optional, needed if running model-free setup)
python -m pip install --quiet --no-cache-dir kaolin==0.15.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.0.0_cu118.html

# PyTorch3D
python -m pip install --quiet --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py39_cu118_pyt200/download.html

# Build extensions
CMAKE_PREFIX_PATH=$CONDA_PREFIX/lib/python3.9/site-packages/pybind11/share/cmake/pybind11 bash build_all_conda.sh
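After the build, a quick import check can confirm that the main dependencies installed correctly. This is only a sanity-check sketch using the packages installed above; it is not part of the official setup steps:

```python
import torch
import nvdiffrast.torch as dr
import kaolin
import pytorch3d

print('torch', torch.__version__, 'CUDA available:', torch.cuda.is_available())
print('kaolin', kaolin.__version__)
print('pytorch3d', pytorch3d.__version__)

# nvdiffrast may not expose a __version__, so just check that the CUDA rasterizer context builds.
glctx = dr.RasterizeCudaContext()
print('nvdiffrast RasterizeCudaContext OK')
```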
2. Demo with a CAD model as input

First, download the model weights (they include two folders): click to download the model weights.
Create weights/ in the project directory and put the two folders inside. Then download the test data (it includes two archives): click to download the demo data.
Create demo_data/ in the project directory, extract the archives, and put the two extracted files inside. Run run_demo.py to run the demo with a CAD model as input:
python run_demo.py --debug 2
If you are running on a server without a display, you need to comment out two lines of code:

if debug>=1:
    center_pose = pose@np.linalg.inv(to_origin)
    vis = draw_posed_3d_box(reader.K, img=color, ob_in_cam=center_pose, bbox=bbox)
    vis = draw_xyz_axis(color, ob_in_cam=center_pose, scale=0.1, K=reader.K, thickness=3, transparency=0, is_input_rgb=True)
    # cv2.imshow('1', vis[...,::-1])
    # cv2.waitKey(1)

You will then see that demo_data/mustard0/ contains the generated ob_in_cam and track_vis folders.
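If you still want to inspect the overlay on a headless machine, one option is to write vis to disk instead of showing it with cv2. This is only a sketch meant to be dropped into the run_demo.py loop in place of the cv2.imshow lines; it reuses the names from that loop (reader, i, vis), the output folder name here is made up, and the repo itself already saves similar images under track_vis when --debug is 2 or higher:

```python
# os and imageio are already available in run_demo.py via its helper imports;
# shown explicitly for completeness.
import os
import imageio

# Save the RGB overlay produced by draw_posed_3d_box / draw_xyz_axis
# instead of displaying it with cv2.imshow.
out_dir = f'{reader.video_dir}/headless_vis'   # hypothetical folder, not part of the official layout
os.makedirs(out_dir, exist_ok=True)
imageio.imwrite(f'{out_dir}/{reader.id_strs[i]}.png', vis)
```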
ob_in_cam holds the pose estimation results, stored as txt files. Example file:
6.073544621467590332e-01 -2.560715079307556152e-01 7.520291209220886230e-01 -4.481770694255828857e-01
-7.755840420722961426e-01 -3.960975110530853271e-01 4.915038347244262695e-01 1.187708452343940735e-01
1.720167100429534912e-01 -8.817789554595947266e-01 -4.391765296459197998e-01 8.016449213027954102e-01
0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00 1.000000000000000000e+00

track_vis holds the visualization results; you can see multiple rendered frames there.
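Each txt file is a single 4x4 homogeneous object-to-camera transform like the one above. A minimal sketch of reading one back with numpy and splitting it into rotation and translation; the glob path assumes the mustard0 demo layout described earlier, and the translation should be in meters, matching the depth units:

```python
import glob
import numpy as np

# Load the first saved pose: a 4x4 row-major homogeneous transform (object frame -> camera frame).
pose_file = sorted(glob.glob('demo_data/mustard0/ob_in_cam/*.txt'))[0]
pose = np.loadtxt(pose_file).reshape(4, 4)

R = pose[:3, :3]   # 3x3 rotation
t = pose[:3, 3]    # translation (presumably meters, camera frame)
print(pose_file)
print('R =\n', R)
print('t =', t)
```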
3. NeRF object reconstruction training

Download the training data: reference-view examples for the two public datasets Linemod and YCB-V.
Click to download the RGBD reference data.

Example 1: training on the Linemod dataset
Modify bundlesdf/run_nerf.py to set use_refined_mask=False, i.e. line 98: mesh = run_one_ob(base_dir=base_dir, cfg=cfg, use_refined_mask=False). Then run:
python bundlesdf/run_nerf.py --ref_view_dir /DATASET/lm_ref_views --dataset linemod

If you are running on a server without a display, install xvfb:
sudo apt-get update
sudo apt-get install -y xvfb

Then run:
xvfb-run -s "-screen 0 1024x768x24" python bundlesdf/run_nerf.py --ref_view_dir model_free_ref_views/lm_ref_views --dataset linemod
NeRF training requires rendering, so xvfb is used to provide a virtual display.
You should see output like this:
bundlesdf/run_nerf.py:61: DeprecationWarning: Starting with ImageIO v3 the behavior of this function will switch to that of iio.v3.imread. To keep the current behavior (and make this warning disappear) use import imageio.v2 as imageio or call imageio.v2.imread directly.
  rgb = imageio.imread(color_file)
[compute_scene_bounds()] compute_scene_bounds_worker start
[compute_scene_bounds()] compute_scene_bounds_worker done
[compute_scene_bounds()] merge pcd
[compute_scene_bounds()] compute_translation_scales done
translation_cvcam=[0.00024226 0.00356217 0.00056694], sc_factor=19.274929219577043
[build_octree()] Octree voxel dilate_radius:1
[__init__()] level:0, vox_pts:torch.Size([1, 3]), corner_pts:torch.Size([8, 3])
[__init__()] level:1, vox_pts:torch.Size([8, 3]), corner_pts:torch.Size([27, 3])
[__init__()] level:2, vox_pts:torch.Size([64, 3]), corner_pts:torch.Size([125, 3])
[draw()] level:2
[draw()] level:2
level 0, resolution: 32
level 1, resolution: 37
level 2, resolution: 43
level 3, resolution: 49
level 4, resolution: 56
level 5, resolution: 64
level 6, resolution: 74
level 7, resolution: 85
level 8, resolution: 98
level 9, resolution: 112
level 10, resolution: 128
level 11, resolution: 148
level 12, resolution: 169
level 13, resolution: 195
level 14, resolution: 223
level 15, resolution: 256
GridEncoder: input_dim=3 n_levels=16 level_dim=2 resolution=32 -> 256 per_level_scale=1.1487 params=(26463840, 2) gridtype=hash align_corners=False
sc_factor 19.274929219577043
translation [0.00024226 0.00356217 0.00056694]
[__init__()] denoise cloud
[__init__()] Denoising rays based on octree cloud
[__init__()] bad_mask#3
rays torch.Size([128387, 12])
[train()] train progress 0/1001
[train_loop()] Iter: 0, valid_samples: 524161/524288, valid_rays: 2048/2048, loss: 309.0942383, rgb_loss: 0.0216732, rgb0_loss: 0.0000000, fs_rgb_loss: 0.0000000, depth_loss: 0.0000000, depth_loss0: 0.0000000, fs_loss: 301.6735840, point_cloud_loss: 0.0000000, point_cloud_normal_loss: 0.0000000, sdf_loss: 7.2143111, eikonal_loss: 0.0000000, variation_loss: 0.0000000, truncation(meter): 0.0100000, pose_reg: 0.0000000, reg_features: 0.1152707,
[train()] train progress 100/1001
[train()] train progress 200/1001
[train()] train progress 300/1001
[train()] train progress 400/1001
[train()] train progress 500/1001
Saved checkpoints at model_free_ref_views/lm_ref_views/ob_0000001/nerf/model_latest.pth
[train_loop()] Iter: 500, valid_samples: 518554/524288, valid_rays: 2026/2048, loss: 1.0530750, rgb_loss: 0.0009063, rgb0_loss: 0.0000000, fs_rgb_loss: 0.0000000, depth_loss: 0.0000000, depth_loss0: 0.0000000, fs_loss: 0.2142579, point_cloud_loss: 0.0000000, point_cloud_normal_loss: 0.0000000, sdf_loss: 0.8360301, eikonal_loss: 0.0000000, variation_loss: 0.0000000, truncation(meter): 0.0100000, pose_reg: 0.0000000, reg_features: 0.0008409,
[extract_mesh()] query_pts:torch.Size([42875, 3]), valid:42875
[extract_mesh()] Running Marching Cubes
[extract_mesh()] done V:(4536, 3), F:(8986, 3)
[train()] train progress 600/1001
[train()] train progress 700/1001
[train()] train progress 800/1001
[train()] train progress 900/1001
[train()] train progress 1000/1001
Saved checkpoints at model_free_ref_views/lm_ref_views/ob_0000001/nerf/model_latest.pth
[train_loop()] Iter: 1000, valid_samples: 519351/524288, valid_rays: 2029/2048, loss: 0.4827633, rgb_loss: 0.0006563, rgb0_loss: 0.0000000, fs_rgb_loss: 0.0000000, depth_loss: 0.0000000, depth_loss0: 0.0000000, fs_loss: 0.0935674, point_cloud_loss: 0.0000000, point_cloud_normal_loss: 0.0000000, sdf_loss: 0.3876466, eikonal_loss: 0.0000000, variation_loss: 0.0000000, truncation(meter): 0.0100000, pose_reg: 0.0000000, reg_features: 0.0001022,
[extract_mesh()] query_pts:torch.Size([42875, 3]), valid:42875
[extract_mesh()] Running Marching Cubes
[extract_mesh()] done V:(5265, 3), F:(10328, 3)
[extract_mesh()] query_pts:torch.Size([42875, 3]), valid:42875
[extract_mesh()] Running Marching Cubes
[extract_mesh()] done V:(5265, 3), F:(10328, 3)
[module()] OpenGL_accelerate module loaded
[module()] Using accelerated ArrayDatatype
[mesh_texture_from_train_images()] Texture: Texture map computation
project train_images 0/16
project train_images 1/16
project train_images 2/16
project train_images 3/16
project train_images 4/16
project train_images 5/16
project train_images 6/16
project train_images 7/16
project train_images 8/16
project train_images 9/16
project train_images 10/16
project train_images 11/16
project train_images 12/16
project train_images 13/16
project train_images 14/16
project train_images 15/16

Pay particular attention to how the loss changes during training (compare the Iter 0 and Iter 500 log lines above). The default is 1000 training iterations, and training is fairly fast.
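If you want to track the total loss rather than eyeball it, a small sketch that parses the "loss:" values out of a saved console log; run_nerf.log is a hypothetical file name, assuming you redirected the console output (e.g. with tee) when launching training:

```python
import re

# Extract (iteration, total loss) pairs from the training log lines shown above.
losses = []
with open('run_nerf.log') as f:          # hypothetical: console output saved to a file
    for line in f:
        m = re.search(r'Iter: (\d+).*?loss: ([\d.]+)', line)
        if m:
            losses.append((int(m.group(1)), float(m.group(2))))

for it, loss in losses:
    print(f'iter {it}: total loss {loss}')
```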
A nerf folder is generated in lm_ref_views/ob_0000001/ (this is where the checkpoints above were saved). A model.obj is generated in lm_ref_views/ob_0000001/model/; later model inference and demos use it directly.
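Before feeding model.obj into the later demos, a quick sanity check of the reconstructed mesh can be worthwhile. A small sketch using trimesh (already one of the repo's dependencies); the path assumes the Linemod run above:

```python
import trimesh

# Load the object mesh reconstructed by run_nerf.py.
mesh = trimesh.load('model_free_ref_views/lm_ref_views/ob_0000001/model/model.obj')

print('vertices:', mesh.vertices.shape)
print('faces:', mesh.faces.shape)
# The extents should be a plausible metric object size (roughly centimeters to tens of centimeters).
print('extents:', mesh.extents)
```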
Example 2: training on the YCB-V dataset

python bundlesdf/run_nerf.py --ref_view_dir /DATASET/ycbv/ref_views_16 --dataset ycbv
If you are running on a server without a display, install xvfb:
sudo apt-get update
sudo apt-get install -y xvfb

Then run:
xvfb-run -s "-screen 0 1024x768x24" python bundlesdf/run_nerf.py --ref_view_dir /DATASET/ycbv/ref_views_16 --dataset ycbv
NeRF training requires rendering, so xvfb is used to provide a virtual display.

4. Demo with RGBD images as input
This example uses the Linemod dataset. First, download the test dataset:
Click to download the test data. Then extract it and place it at FoundationPose-main/model_free_ref_views/lm_test_all.
The official code has issues here; you need to replace two files: datareader.py and run_linemod.py.
run_linemod.py:
# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.
from Utils import *
import json,uuid,joblib,os,sys
import scipy.spatial as spatial
from multiprocessing import Pool
import multiprocessing
from functools import partial
from itertools import repeat
import itertools
from datareader import *
from estimater import *
code_dir = os.path.dirname(os.path.realpath(__file__))
sys.path.append(f'{code_dir}/mycpp/build')
import yaml
import redef get_mask(reader, i_frame, ob_id, detect_type):if detect_typebox:mask reader.get_mask(i_frame, ob_id)H,W mask.shape[:2]vs,us np.where(mask0)umin us.min()umax us.max()vmin vs.min()vmax vs.max()valid np.zeros((H,W), dtypebool)valid[vmin:vmax,umin:umax] 1elif detect_typemask:mask reader.get_mask(i_frame, ob_id)if mask is None:return Nonevalid mask0elif detect_typedetected:mask cv2.imread(reader.color_files[i_frame].replace(rgb,mask_cosypose), -1)valid maskob_idelse:raise RuntimeErrorreturn validdef run_pose_estimation_worker(reader, i_frames, est:FoundationPoseNone, debug0, ob_idNone, devicecuda:0):torch.cuda.set_device(device)est.to_device(device)est.glctx dr.RasterizeCudaContext(devicedevice)result NestDict()for i, i_frame in enumerate(i_frames):logging.info(f{i}/{len(i_frames)}, i_frame:{i_frame}, ob_id:{ob_id})print(\n### , f{i}/{len(i_frames)}, i_frame:{i_frame}, ob_id:{ob_id})video_id reader.get_video_id()color reader.get_color(i_frame)depth reader.get_depth(i_frame)id_str reader.id_strs[i_frame]H,W color.shape[:2]debug_dir est.debug_dirob_mask get_mask(reader, i_frame, ob_id, detect_typedetect_type)if ob_mask is None:logging.info(ob_mask not found, skip)result[video_id][id_str][ob_id] np.eye(4)return resultest.gt_pose reader.get_gt_pose(i_frame, ob_id)pose est.register(Kreader.K, rgbcolor, depthdepth, ob_maskob_mask, ob_idob_id)logging.info(fpose:\n{pose})if debug3:m est.mesh_ori.copy()tmp m.copy()tmp.apply_transform(pose)tmp.export(f{debug_dir}/model_tf.obj)result[video_id][id_str][ob_id] posereturn result, posedef run_pose_estimation():wp.force_load(devicecuda)reader_tmp LinemodReader(opt.linemod_dir, splitNone)print(## opt.linemod_dir:, opt.linemod_dir)debug opt.debuguse_reconstructed_mesh opt.use_reconstructed_meshdebug_dir opt.debug_dirres NestDict()glctx dr.RasterizeCudaContext()mesh_tmp trimesh.primitives.Box(extentsnp.ones((3)), transformnp.eye(4)).to_mesh()est FoundationPose(model_ptsmesh_tmp.vertices.copy(), model_normalsmesh_tmp.vertex_normals.copy(), symmetry_tfsNone, meshmesh_tmp, scorerNone, refinerNone, glctxglctx, debug_dirdebug_dir, debugdebug)# ob_idmatch re.search(r\d$, opt.linemod_dir)if match:last_number match.group()ob_id int(last_number)else:print(No digits found at the end of the string)# for ob_id in reader_tmp.ob_ids:if ob_id:if use_reconstructed_mesh:print(## ob_id:, ob_id)print(## opt.linemod_dir:, opt.linemod_dir)print(## opt.ref_view_dir:, opt.ref_view_dir)mesh reader_tmp.get_reconstructed_mesh(ref_view_diropt.ref_view_dir)else:mesh reader_tmp.get_gt_mesh(ob_id)# symmetry_tfs reader_tmp.symmetry_tfs[ob_id] # !!!!!!!!!!!!!!!!args []reader LinemodReader(opt.linemod_dir, splitNone)video_id reader.get_video_id()# est.reset_object(model_ptsmesh.vertices.copy(), model_normalsmesh.vertex_normals.copy(), symmetry_tfssymmetry_tfs, meshmesh) # rawest.reset_object(model_ptsmesh.vertices.copy(), model_normalsmesh.vertex_normals.copy(), meshmesh) # !!!!!!!!!!!!!!!!print(### len(reader.color_files):, len(reader.color_files))for i in range(len(reader.color_files)):args.append((reader, [i], est, debug, ob_id, cuda:0))# vis Datato_origin, extents trimesh.bounds.oriented_bounds(mesh)bbox np.stack([-extents/2, extents/2], axis0).reshape(2,3)os.makedirs(f{opt.linemod_dir}/track_vis, exist_okTrue)outs []i 0for arg in args[:200]:print(### num:, i)out, pose run_pose_estimation_worker(*arg)outs.append(out)center_pose posenp.linalg.inv(to_origin)img_color reader.get_color(i)vis draw_posed_3d_box(reader.K, imgimg_color, ob_in_camcenter_pose, bboxbbox)vis 
draw_xyz_axis(img_color, ob_in_camcenter_pose, scale0.1, Kreader.K, thickness3, transparency0, is_input_rgbTrue)imageio.imwrite(f{opt.linemod_dir}/track_vis/{reader.id_strs[i]}.png, vis)i i 1for out in outs:for video_id in out:for id_str in out[video_id]:for ob_id in out[video_id][id_str]:res[video_id][id_str][ob_id] out[video_id][id_str][ob_id]with open(f{opt.debug_dir}/linemod_res.yml,w) as ff:yaml.safe_dump(make_yaml_dumpable(res), ff)print(Save linemod_res.yml OK !!!)if __name____main__:parser argparse.ArgumentParser()code_dir os.path.dirname(os.path.realpath(__file__))parser.add_argument(--linemod_dir, typestr, default/guopu/FoundationPose-main/model_free_ref_views/lm_test_all/000015, helplinemod root dir) # lm_test_all lm_testparser.add_argument(--use_reconstructed_mesh, typeint, default1)parser.add_argument(--ref_view_dir, typestr, default/guopu/FoundationPose-main/model_free_ref_views/lm_ref_views/ob_0000015)parser.add_argument(--debug, typeint, default0)parser.add_argument(--debug_dir, typestr, defaultf/guopu/FoundationPose-main/model_free_ref_views/lm_test_all/debug) # lm_test_all lm_testopt parser.parse_args()set_seed(0)detect_type mask # mask / box / detectedrun_pose_estimation()
datareader.py:
# Copyright (c) 2023, NVIDIA CORPORATION. All rights reserved.
#
# NVIDIA CORPORATION and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto. Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA CORPORATION is strictly prohibited.
from Utils import *
import json,os,sys

BOP_LIST = ['lmo','tless','ycbv','hb','tudl','icbin','itodd']
BOP_DIR os.getenv(BOP_DIR)def get_bop_reader(video_dir, zfarnp.inf):if ycbv in video_dir or YCB in video_dir:return YcbVideoReader(video_dir, zfarzfar)if lmo in video_dir or LINEMOD-O in video_dir:return LinemodOcclusionReader(video_dir, zfarzfar)if tless in video_dir or TLESS in video_dir:return TlessReader(video_dir, zfarzfar)if hb in video_dir:return HomebrewedReader(video_dir, zfarzfar)if tudl in video_dir:return TudlReader(video_dir, zfarzfar)if icbin in video_dir:return IcbinReader(video_dir, zfarzfar)if itodd in video_dir:return ItoddReader(video_dir, zfarzfar)else:raise RuntimeErrordef get_bop_video_dirs(dataset):if datasetycbv:video_dirs sorted(glob.glob(f{BOP_DIR}/ycbv/test/*))elif datasetlmo:video_dirs sorted(glob.glob(f{BOP_DIR}/lmo/lmo_test_bop19/test/*))elif datasettless:video_dirs sorted(glob.glob(f{BOP_DIR}/tless/tless_test_primesense_bop19/test_primesense/*))elif datasethb:video_dirs sorted(glob.glob(f{BOP_DIR}/hb/hb_test_primesense_bop19/test_primesense/*))elif datasettudl:video_dirs sorted(glob.glob(f{BOP_DIR}/tudl/tudl_test_bop19/test/*))elif dataseticbin:video_dirs sorted(glob.glob(f{BOP_DIR}/icbin/icbin_test_bop19/test/*))elif datasetitodd:video_dirs sorted(glob.glob(f{BOP_DIR}/itodd/itodd_test_bop19/test/*))else:raise RuntimeErrorreturn video_dirsclass YcbineoatReader:def __init__(self,video_dir, downscale1, shorter_sideNone, zfarnp.inf):self.video_dir video_dirself.downscale downscaleself.zfar zfarself.color_files sorted(glob.glob(f{self.video_dir}/rgb/*.png))self.K np.loadtxt(f{video_dir}/cam_K.txt).reshape(3,3)self.id_strs []for color_file in self.color_files:id_str os.path.basename(color_file).replace(.png,)self.id_strs.append(id_str)self.H,self.W cv2.imread(self.color_files[0]).shape[:2]if shorter_side is not None:self.downscale shorter_side/min(self.H, self.W)self.H int(self.H*self.downscale)self.W int(self.W*self.downscale)self.K[:2] * self.downscaleself.gt_pose_files sorted(glob.glob(f{self.video_dir}/annotated_poses/*))self.videoname_to_object {bleach0: 021_bleach_cleanser,bleach_hard_00_03_chaitanya: 021_bleach_cleanser,cracker_box_reorient: 003_cracker_box,cracker_box_yalehand0: 003_cracker_box,mustard0: 006_mustard_bottle,mustard_easy_00_02: 006_mustard_bottle,sugar_box1: 004_sugar_box,sugar_box_yalehand0: 004_sugar_box,tomato_soup_can_yalehand0: 005_tomato_soup_can,}def get_video_name(self):return self.video_dir.split(/)[-1]def __len__(self):return len(self.color_files)def get_gt_pose(self,i):try:pose np.loadtxt(self.gt_pose_files[i]).reshape(4,4)return poseexcept:logging.info(GT pose not found, return None)return Nonedef get_color(self,i):color imageio.imread(self.color_files[i])[...,:3]color cv2.resize(color, (self.W,self.H), interpolationcv2.INTER_NEAREST)return colordef get_mask(self,i):mask cv2.imread(self.color_files[i].replace(rgb,masks),-1)if len(mask.shape)3:for c in range(3):if mask[...,c].sum()0:mask mask[...,c]breakmask cv2.resize(mask, (self.W,self.H), interpolationcv2.INTER_NEAREST).astype(bool).astype(np.uint8)return maskdef get_depth(self,i):depth cv2.imread(self.color_files[i].replace(rgb,depth),-1)/1e3depth cv2.resize(depth, (self.W,self.H), interpolationcv2.INTER_NEAREST)depth[(depth0.1) | (depthself.zfar)] 0return depthdef get_xyz_map(self,i):depth self.get_depth(i)xyz_map depth2xyzmap(depth, self.K)return xyz_mapdef get_occ_mask(self,i):hand_mask_file self.color_files[i].replace(rgb,masks_hand)occ_mask np.zeros((self.H,self.W), dtypebool)if os.path.exists(hand_mask_file):occ_mask occ_mask | 
(cv2.imread(hand_mask_file,-1)0)right_hand_mask_file self.color_files[i].replace(rgb,masks_hand_right)if os.path.exists(right_hand_mask_file):occ_mask occ_mask | (cv2.imread(right_hand_mask_file,-1)0)occ_mask cv2.resize(occ_mask, (self.W,self.H), interpolationcv2.INTER_NEAREST)return occ_mask.astype(np.uint8)def get_gt_mesh(self):ob_name self.videoname_to_object[self.get_video_name()]YCB_VIDEO_DIR os.getenv(YCB_VIDEO_DIR)mesh trimesh.load(f{YCB_VIDEO_DIR}/models/{ob_name}/textured_simple.obj)return meshclass BopBaseReader:def __init__(self, base_dir, zfarnp.inf, resize1):self.base_dir base_dirself.resize resizeself.dataset_name Noneself.color_files sorted(glob.glob(f{self.base_dir}/rgb/*))if len(self.color_files)0:self.color_files sorted(glob.glob(f{self.base_dir}/gray/*))self.zfar zfarself.K_table {}with open(f{self.base_dir}/scene_camera.json,r) as ff:info json.load(ff)for k in info:self.K_table[f{int(k):06d}] np.array(info[k][cam_K]).reshape(3,3)self.bop_depth_scale info[k][depth_scale]if os.path.exists(f{self.base_dir}/scene_gt.json):with open(f{self.base_dir}/scene_gt.json,r) as ff:self.scene_gt json.load(ff)self.scene_gt copy.deepcopy(self.scene_gt) # Release file handle to be pickle-able by joblibassert len(self.scene_gt)len(self.color_files)else:self.scene_gt Noneself.make_id_strs()def make_scene_ob_ids_dict(self):with open(f{BOP_DIR}/{self.dataset_name}/test_targets_bop19.json,r) as ff:self.scene_ob_ids_dict {}data json.load(ff)for d in data:if d[scene_id]self.get_video_id():id_str f{d[im_id]:06d}if id_str not in self.scene_ob_ids_dict:self.scene_ob_ids_dict[id_str] []self.scene_ob_ids_dict[id_str] [d[obj_id]]*d[inst_count]def get_K(self, i_frame):K self.K_table[self.id_strs[i_frame]]if self.resize!1:K[:2,:2] * self.resizereturn Kdef get_video_dir(self):video_id int(self.base_dir.rstrip(/).split(/)[-1])return video_iddef make_id_strs(self):self.id_strs []for i in range(len(self.color_files)):name os.path.basename(self.color_files[i]).split(.)[0]self.id_strs.append(name)def get_instance_ids_in_image(self, i_frame:int):ob_ids []if self.scene_gt is not None:name int(os.path.basename(self.color_files[i_frame]).split(.)[0])for k in self.scene_gt[str(name)]:ob_ids.append(k[obj_id])elif self.scene_ob_ids_dict is not None:return np.array(self.scene_ob_ids_dict[self.id_strs[i_frame]])else:mask_dir os.path.dirname(self.color_files[0]).replace(rgb,mask_visib)id_str self.id_strs[i_frame]mask_files sorted(glob.glob(f{mask_dir}/{id_str}_*.png))ob_ids []for mask_file in mask_files:ob_id int(os.path.basename(mask_file).split(.)[0].split(_)[1])ob_ids.append(ob_id)ob_ids np.asarray(ob_ids)return ob_idsdef get_gt_mesh_file(self, ob_id):raise RuntimeError(You should override this)def get_color(self,i):color imageio.imread(self.color_files[i])if len(color.shape)2:color np.tile(color[...,None], (1,1,3)) # Gray to RGBif self.resize!1:color cv2.resize(color, fxself.resize, fyself.resize, dsizeNone)return colordef get_depth(self,i, filledFalse):if filled:depth_file self.color_files[i].replace(rgb,depth_filled)depth_file f{os.path.dirname(depth_file)}/0{os.path.basename(depth_file)}depth cv2.imread(depth_file,-1)/1e3else:depth_file self.color_files[i].replace(rgb,depth).replace(gray,depth)depth cv2.imread(depth_file,-1)*1e-3*self.bop_depth_scaleif self.resize!1:depth cv2.resize(depth, fxself.resize, fyself.resize, dsizeNone, interpolationcv2.INTER_NEAREST)depth[depth0.1] 0depth[depthself.zfar] 0return depthdef get_xyz_map(self,i):depth self.get_depth(i)xyz_map depth2xyzmap(depth, self.get_K(i))return 
xyz_mapdef get_mask(self, i_frame:int, ob_id:int, typemask_visib):type: mask_visib (only visible part) / mask (projected mask from whole model)pos 0name int(os.path.basename(self.color_files[i_frame]).split(.)[0])if self.scene_gt is not None:for k in self.scene_gt[str(name)]:if k[obj_id]ob_id:breakpos 1mask_file f{self.base_dir}/{type}/{name:06d}_{pos:06d}.pngif not os.path.exists(mask_file):logging.info(f{mask_file} not found)return Noneelse:# mask_dir os.path.dirname(self.color_files[0]).replace(rgb,type)# mask_file f{mask_dir}/{self.id_strs[i_frame]}_{ob_id:06d}.pngraise RuntimeErrormask cv2.imread(mask_file, -1)if self.resize!1:mask cv2.resize(mask, fxself.resize, fyself.resize, dsizeNone, interpolationcv2.INTER_NEAREST)return mask0def get_gt_mesh(self, ob_id:int):mesh_file self.get_gt_mesh_file(ob_id)mesh trimesh.load(mesh_file)mesh.vertices * 1e-3return meshdef get_model_diameter(self, ob_id):dir os.path.dirname(self.get_gt_mesh_file(self.ob_ids[0]))info_file f{dir}/models_info.jsonwith open(info_file,r) as ff:info json.load(ff)return info[str(ob_id)][diameter]/1e3def get_gt_poses(self, i_frame, ob_id):gt_poses []name int(self.id_strs[i_frame])for i_k, k in enumerate(self.scene_gt[str(name)]):if k[obj_id]ob_id:cur np.eye(4)cur[:3,:3] np.array(k[cam_R_m2c]).reshape(3,3)cur[:3,3] np.array(k[cam_t_m2c])/1e3gt_poses.append(cur)return np.asarray(gt_poses).reshape(-1,4,4)def get_gt_pose(self, i_frame:int, ob_id, maskNone, use_my_correctionFalse):ob_in_cam np.eye(4)best_iou -np.infbest_gt_mask Nonename int(self.id_strs[i_frame])for i_k, k in enumerate(self.scene_gt[str(name)]):if k[obj_id]ob_id:cur np.eye(4)cur[:3,:3] np.array(k[cam_R_m2c]).reshape(3,3)cur[:3,3] np.array(k[cam_t_m2c])/1e3if mask is not None: # When multi-instance exists, use mask to determine which onegt_mask cv2.imread(f{self.base_dir}/mask_visib/{self.id_strs[i_frame]}_{i_k:06d}.png, -1).astype(bool)intersect (gt_mask*mask).astype(bool)union (gt_maskmask).astype(bool)iou float(intersect.sum())/union.sum()if ioubest_iou:best_iou ioubest_gt_mask gt_maskob_in_cam curelse:ob_in_cam curbreakif use_my_correction:if ycb in self.base_dir.lower() and train_real in self.color_files[i_frame]:video_id self.get_video_id()if ob_id1:if video_id in [12,13,14,17,24]:ob_in_cam ob_in_camself.symmetry_tfs[ob_id][1]return ob_in_camdef load_symmetry_tfs(self):dir os.path.dirname(self.get_gt_mesh_file(self.ob_ids[0]))info_file f{dir}/models_info.jsonwith open(info_file,r) as ff:info json.load(ff)self.symmetry_tfs {}self.symmetry_info_table {}for ob_id in self.ob_ids:self.symmetry_info_table[ob_id] info[str(ob_id)]self.symmetry_tfs[ob_id] symmetry_tfs_from_info(info[str(ob_id)], rot_angle_discrete5)self.geometry_symmetry_info_table copy.deepcopy(self.symmetry_info_table)def get_video_id(self):return int(self.base_dir.split(/)[-1])class LinemodOcclusionReader(BopBaseReader):def __init__(self,base_dir/mnt/9a72c439-d0a7-45e8-8d20-d7a235d02763/DATASET/LINEMOD-O/lmo_test_all/test/000002, zfarnp.inf):super().__init__(base_dir, zfarzfar)self.dataset_name lmoself.K list(self.K_table.values())[0]self.ob_ids [1,5,6,8,9,10,11,12]self.ob_id_to_names {1: ape,2: benchvise,3: bowl,4: camera,5: water_pour,6: cat,7: cup,8: driller,9: duck,10: eggbox,11: glue,12: holepuncher,13: iron,14: lamp,15: phone,}# self.load_symmetry_tfs()def get_gt_mesh_file(self, ob_id):mesh_dir f{BOP_DIR}/{self.dataset_name}/models/obj_{ob_id:06d}.plyreturn mesh_dirclass LinemodReader(LinemodOcclusionReader):def __init__(self, 
base_dir/mnt/9a72c439-d0a7-45e8-8d20-d7a235d02763/DATASET/LINEMOD/lm_test_all/test/000001, zfarnp.inf, splitNone):super().__init__(base_dir, zfarzfar)self.dataset_name lmif split is not None: # train/testprint(## split is not None)with open(f/mnt/9a72c439-d0a7-45e8-8d20-d7a235d02763/DATASET/LINEMOD/Linemod_preprocessed/data/{self.get_video_id():02d}/{split}.txt,r) as ff:lines ff.read().splitlines()self.color_files []for line in lines:id int(line)self.color_files.append(f{self.base_dir}/rgb/{id:06d}.png)self.make_id_strs()self.ob_ids np.setdiff1d(np.arange(1,16), np.array([7,3])).tolist() # Exclude bowl and mug# self.load_symmetry_tfs()def get_gt_mesh_file(self, ob_id):root self.base_dirprint(f{root}/../)print(f{root}/lm_models)print(f{root}/lm_models/models/obj_{ob_id:06d}.ply)while 1:if os.path.exists(f{root}/lm_models):mesh_dir f{root}/lm_models/models/obj_{ob_id:06d}.plybreakelse:root os.path.abspath(f{root}/../)mesh_dir f{root}/lm_models/models/obj_{ob_id:06d}.plybreakreturn mesh_dirdef get_reconstructed_mesh(self, ref_view_dir):mesh trimesh.load(os.path.abspath(f{ref_view_dir}/model/model.obj))return meshclass YcbVideoReader(BopBaseReader):def __init__(self, base_dir, zfarnp.inf):super().__init__(base_dir, zfarzfar)self.dataset_name ycbvself.K list(self.K_table.values())[0]self.make_id_strs()self.ob_ids np.arange(1,22).astype(int).tolist()YCB_VIDEO_DIR os.getenv(YCB_VIDEO_DIR)self.ob_id_to_names {}self.name_to_ob_id {}# names sorted(os.listdir(f{YCB_VIDEO_DIR}/models/))if os.path.exists(f{YCB_VIDEO_DIR}/models/):names sorted(os.listdir(f{YCB_VIDEO_DIR}/models/))for i,ob_id in enumerate(self.ob_ids):self.ob_id_to_names[ob_id] names[i]self.name_to_ob_id[names[i]] ob_idelse:names []if 0:# if BOP not in self.base_dir:with open(f{self.base_dir}/../../keyframe.txt,r) as ff:self.keyframe_lines ff.read().splitlines()# self.load_symmetry_tfs()for ob_id in self.ob_ids:if ob_id in [1,4,6,18]: # Cylinderself.geometry_symmetry_info_table[ob_id] {symmetries_continuous: [{axis:[0,0,1], offset:[0,0,0]},],symmetries_discrete: euler_matrix(0, np.pi, 0).reshape(1,4,4).tolist(),}elif ob_id in [13]:self.geometry_symmetry_info_table[ob_id] {symmetries_continuous: [{axis:[0,0,1], offset:[0,0,0]},],}elif ob_id in [2,3,9,21]: # Rectangle boxtfs []for rz in [0, np.pi]:for rx in [0,np.pi]:for ry in [0,np.pi]:tfs.append(euler_matrix(rx, ry, rz))self.geometry_symmetry_info_table[ob_id] {symmetries_discrete: np.asarray(tfs).reshape(-1,4,4).tolist(),}else:passdef get_gt_mesh_file(self, ob_id):if BOP in self.base_dir:mesh_file os.path.abspath(f{self.base_dir}/../../ycbv_models/models/obj_{ob_id:06d}.ply)else:mesh_file f{self.base_dir}/../../ycbv_models/models/obj_{ob_id:06d}.plyreturn mesh_filedef get_gt_mesh(self, ob_id:int, get_posecnn_versionFalse):if get_posecnn_version:YCB_VIDEO_DIR os.getenv(YCB_VIDEO_DIR)mesh trimesh.load(f{YCB_VIDEO_DIR}/models/{self.ob_id_to_names[ob_id]}/textured_simple.obj)return meshmesh_file self.get_gt_mesh_file(ob_id)mesh trimesh.load(mesh_file, processFalse)mesh.vertices * 1e-3tex_file mesh_file.replace(.ply,.png)if os.path.exists(tex_file):from PIL import Imageim Image.open(tex_file)uv mesh.visual.uvmaterial trimesh.visual.texture.SimpleMaterial(imageim)color_visuals trimesh.visual.TextureVisuals(uvuv, imageim, materialmaterial)mesh.visual color_visualsreturn meshdef get_reconstructed_mesh(self, ob_id, ref_view_dir):mesh trimesh.load(os.path.abspath(f{ref_view_dir}/ob_{ob_id:07d}/model/model.obj))return meshdef get_transform_reconstructed_to_gt_model(self, ob_id):out np.eye(4)return 
outdef get_visible_cloud(self, ob_id):file os.path.abspath(f{self.base_dir}/../../models/{self.ob_id_to_names[ob_id]}/visible_cloud.ply)pcd o3d.io.read_point_cloud(file)return pcddef is_keyframe(self, i):color_file self.color_files[i]video_id self.get_video_id()frame_id int(os.path.basename(color_file).split(.)[0])key f{video_id:04d}/{frame_id:06d}return (key in self.keyframe_lines)class TlessReader(BopBaseReader):def __init__(self, base_dir, zfarnp.inf):super().__init__(base_dir, zfarzfar)self.dataset_name tlessself.ob_ids np.arange(1,31).astype(int).tolist()self.load_symmetry_tfs()def get_gt_mesh_file(self, ob_id):mesh_file f{self.base_dir}/../../../models_cad/obj_{ob_id:06d}.plyreturn mesh_filedef get_gt_mesh(self, ob_id):mesh trimesh.load(self.get_gt_mesh_file(ob_id))mesh.vertices * 1e-3mesh trimesh_add_pure_colored_texture(mesh, colornp.ones((3))*200)return meshclass HomebrewedReader(BopBaseReader):def __init__(self, base_dir, zfarnp.inf):super().__init__(base_dir, zfarzfar)self.dataset_name hbself.ob_ids np.arange(1,34).astype(int).tolist()self.load_symmetry_tfs()self.make_scene_ob_ids_dict()def get_gt_mesh_file(self, ob_id):mesh_file f{self.base_dir}/../../../hb_models/models/obj_{ob_id:06d}.plyreturn mesh_filedef get_gt_pose(self, i_frame:int, ob_id, use_my_correctionFalse):logging.info(WARN HomeBrewed doesnt have GT pose)return np.eye(4)class ItoddReader(BopBaseReader):def __init__(self, base_dir, zfarnp.inf):super().__init__(base_dir, zfarzfar)self.dataset_name itoddself.make_id_strs()self.ob_ids np.arange(1,29).astype(int).tolist()self.load_symmetry_tfs()self.make_scene_ob_ids_dict()def get_gt_mesh_file(self, ob_id):mesh_file f{self.base_dir}/../../../itodd_models/models/obj_{ob_id:06d}.plyreturn mesh_fileclass IcbinReader(BopBaseReader):def __init__(self, base_dir, zfarnp.inf):super().__init__(base_dir, zfarzfar)self.dataset_name icbinself.ob_ids np.arange(1,3).astype(int).tolist()self.load_symmetry_tfs()def get_gt_mesh_file(self, ob_id):mesh_file f{self.base_dir}/../../../icbin_models/models/obj_{ob_id:06d}.plyreturn mesh_fileclass TudlReader(BopBaseReader):def __init__(self, base_dir, zfarnp.inf):super().__init__(base_dir, zfarzfar)self.dataset_name tudlself.ob_ids np.arange(1,4).astype(int).tolist()self.load_symmetry_tfs()def get_gt_mesh_file(self, ob_id):mesh_file f{self.base_dir}/../../../tudl_models/models/obj_{ob_id:06d}.plyreturn mesh_file
Run run_linemod.py:
python run_linemod.py
You should then see the folder model_free_ref_views/lm_test_all/000015/track_vis/, which contains the visualization results.
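In addition to the images, run_linemod.py also dumps every estimated pose to linemod_res.yml in the debug directory (see the listing above). A small sketch of reading it back, assuming the poses are stored as nested lists as produced by the yaml dump; the path follows the default debug_dir in that listing, so adjust it to your own setup:

```python
import numpy as np
import yaml

# linemod_res.yml maps video_id -> frame id string -> object id -> 4x4 pose.
with open('model_free_ref_views/lm_test_all/debug/linemod_res.yml') as f:
    res = yaml.safe_load(f)

for video_id, frames in res.items():
    for frame_id, objects in frames.items():
        for ob_id, pose in objects.items():
            pose = np.asarray(pose, dtype=float).reshape(4, 4)
            print(video_id, frame_id, ob_id)
            print(pose)
```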
That wraps up this share. This article stops here; later posts will cover other datasets, algorithms, code, and concrete application examples for 6D pose estimation.