
Retrospective: Deploying YOLOv5 on a Jetson AGX Xavier with a ROS Car for Object Detection and Tracking

Date: 2020-02-27 12:53:02

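Note: this series covers questions that interviewers asked me and that I could not answer at the time. I am reviewing them afterwards, studying and summarizing carefully, to be ready for whatever comes up in the future.

Written tests and interviews at large internet companies all need careful preparation:

(1) Know your research experience, your research topics, and the related domain knowledge inside out.

(2) Know your internship experience, what you worked on, and the domain knowledge you picked up inside out.

(3) Beyond research and internships, do not neglect the cutting-edge topics you follow in your own time. Study them properly: interviewers care a lot about whether you like to follow new technology and keep up with new concepts and techniques.

(4) Prepare data structures and algorithms. At large companies with a written test, the first hurdle is writing code by hand for algorithm problems.

In interviews, data structures and algorithms are something you prepare just in case, so that you can face any algorithm question with confidence. Often your research and internship experience gives you enough to talk about and the interviewer will not test algorithms at all. Many large companies do ask algorithm questions in interviews, though, so whether for the written test or the interview, you must know data structures and algorithms thoroughly.

Many large companies skip the written test in the early batch of autumn recruitment and go straight to interviews. Whether you can skip the written test depends on how strong your résumé is.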


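Deploy the yolov5 object detection algorithm on a Jetson AGX Xavier, mount the Xavier on a car or drone, detect and track a single target, and use the tracking result to control the motion of the car or drone.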

Hardware:

Jetson AGX Xavier

ROS car or DJI drone

The car must carry a camera.

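Wireless module

Bought on Taobao:

The wireless module is the officially recommended Intel 8265 wireless card with an M.2 interface.

Complete kits that include the antennas are sold on Taobao, so you can buy one yourself.

One drawback of the Xavier is that the antennas are hard to fix in place; I simply held them down with black tape.

People abroad have designed Xavier cases that hold the antennas securely, and I hope someone in China designs one too.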

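Signal cable

The signal cable carries the communication between the Xavier and the car's embedded chassis controller, usually over a serial port.

It is used to send control commands.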

Software:

YOLOv5 algorithm

Real-time detection with TensorRT-accelerated YOLOv5 on the Jetson AGX Xavier

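ROS installation

Robot Operating System, ROS for short.

ROS is used widely in robot control and car control.

ROS provides C++ and Python programming interfaces, which makes it more practical to combine AI with ROS to control a robot's motion.

The task here is target tracking: a simple person follower.

Detection uses the yolov5 algorithm, and the Jetson communicates with the car through ROS to control it.

Install ROS by following the article

Installing ROS on Jetson AGX Xavier

After that you should be able to drive the ROS car from the keyboard. (Cars differ, so configure yours according to its own documentation, which is usually thorough.)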

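Remote control

By following the article

jetson agx xavier: using NoMachine remote desktop control

you can control the Xavier through a remote desktop. (A moving car can hardly carry a monitor around.)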

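Implementation

Overall idea

Take a person as the car's tracking target, and turn the person bounding box that YOLOv5 returns

into simple commands: turn left, turn right, go straight, or spin in place.

How should that command then be sent to the car?

I am not very familiar with controlling a car through ROS, so my idea was:

take ROS's existing keyboard control code, teleop_twist_keyboard.py, and replace the key input with these control commands. That is all.

(Of course, I came up with this only because I really did not know how to write ROS car control code; ROS experts can treat it as a joke.)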
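The mapping from bounding box to motion command can be sketched on its own, without ROS or TensorRT. The pixel thresholds (900, 300, 650) and the (x, y, z, th) tuples are taken from the Follow.py listing in this article; the function names `decide_command` and `to_twist`, and the use of `None` for "no box", are hypothetical choices for this sketch (Follow.py uses `[0, 0, 0, 0]` for "never seen" and keeps the state in globals).

```python
"""Sketch: turn a person bounding box into a teleop-style motion command."""

# Pixel thresholds copied from Follow.py; they depend on the camera resolution.
RIGHT_EDGE = 900   # box center at or beyond this x: steer right
LEFT_EDGE = 300    # box center at or below this x: steer left
LOST_SPLIT = 650   # when the target is lost: spin toward its last known side


def decide_command(box, last_box):
    """Return (x, y, z, th) for a box [x1, y1, x2, y2], or None if no person.

    last_box is the most recent box in which a person was seen, or None
    if no person has been seen yet (an assumption of this sketch).
    """
    if box is not None:
        center = (box[0] + box[2]) // 2
        if center >= RIGHT_EDGE:
            return (1, 0, 0, -1)   # drive forward while turning right
        if center <= LEFT_EDGE:
            return (1, 0, 0, 1)    # drive forward while turning left
        return (1, 0, 0, 0)        # target roughly centered: go straight
    if last_box is None:
        return (0, 0, 0, 0)        # never saw a target: stop
    last_center = (last_box[0] + last_box[2]) // 2
    if last_center >= LOST_SPLIT:
        return (0, 0, 0, -1)       # spin right in place to search
    return (0, 0, 0, 1)            # spin left in place to search


def to_twist(command, speed=0.5, turn=1.0):
    """Scale a command the way teleop_twist_keyboard does before publishing."""
    x, y, z, th = command
    return {
        "linear": (x * speed, y * speed, z * speed),
        "angular": (0.0, 0.0, th * turn),
    }


if __name__ == "__main__":
    command = decide_command([1000, 120, 1100, 400], None)
    print(command, to_twist(command))
```

Keeping the decision in a pure function like this makes the thresholds easy to test at a desk before the car moves; the defaults 0.5 and 1.0 mirror the `~speed` and `~turn` parameter defaults in the listing.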

Implementation steps

Run roscore

roscore

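Run Base_control

This launch file sets up the communication between the Xavier and the car chassis and handles sending and receiving messages.

Run it according to your own car's documentation; if you can already drive the car from the keyboard, you will know which file this is.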

roslaunch ./xxx/Base_control.launch

Target tracking program

Create Follow.py in the tensorrtx/yolov5 folder and copy in the code below (the code is rough and for reference only).

Mind the paths of the two files libmyplugins.so and yolov5s.engine.

"""An example that uses TensorRT's Python api to make inferences."""
from __future__ import print_function

import threading

import roslib; roslib.load_manifest('teleop_twist_keyboard')
import rospy
from geometry_msgs.msg import Twist

import sys, select, termios, tty
import ctypes
import os
import random
import time

import cv2
import numpy as np
import pycuda.autoinit
import pycuda.driver as cuda
import tensorrt as trt
import torch
import torchvision

INPUT_W = 608  # may need to be changed to 640 if 608 does not run
INPUT_H = 608  # may need to be changed to 640 if 608 does not run
CONF_THRESH = 0.25
IOU_THRESHOLD = 0.45

# Shared state: written by YoLov5TRT.infer(), read by the main loop.
msg = ""
x = 0
y = 0
z = 0
th = 0
status = 0
int_box = [0, 0, 0, 0]  # last box in which a person was seen


def plot_one_box(x, img, color=None, label=None, line_thickness=None):
    """
    description: Plots one bounding box on image img,
                 this function comes from YoLov5 project.
    param:
        x:              a box likes [x1,y1,x2,y2]
        img:            a opencv image object
        color:          color to draw rectangle, such as (0,255,0)
        label:          str
        line_thickness: int
    return:
        no return
    """
    tl = (
        line_thickness or round(0.002 * (img.shape[0] + img.shape[1]) / 2) + 1
    )  # line/font thickness
    color = color or [random.randint(0, 255) for _ in range(3)]
    c1, c2 = (int(x[0]), int(x[1])), (int(x[2]), int(x[3]))
    cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA)
    if label:
        tf = max(tl - 1, 1)  # font thickness
        t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
        c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3
        cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA)  # filled
        cv2.putText(
            img,
            label,
            (c1[0], c1[1] - 2),
            0,
            tl / 3,
            [225, 255, 255],
            thickness=tf,
            lineType=cv2.LINE_AA,
        )


class YoLov5TRT(object):
    """
    description: A YOLOv5 class that warps TensorRT ops, preprocess and postprocess ops.
    """

    def __init__(self, engine_file_path):
        # Create a Context on this device,
        self.cfx = cuda.Device(0).make_context()
        stream = cuda.Stream()
        TRT_LOGGER = trt.Logger(trt.Logger.INFO)
        runtime = trt.Runtime(TRT_LOGGER)

        # Deserialize the engine from file
        with open(engine_file_path, "rb") as f:
            engine = runtime.deserialize_cuda_engine(f.read())
        context = engine.create_execution_context()

        host_inputs = []
        cuda_inputs = []
        host_outputs = []
        cuda_outputs = []
        bindings = []

        for binding in engine:
            size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size
            dtype = trt.nptype(engine.get_binding_dtype(binding))
            # Allocate host and device buffers
            host_mem = cuda.pagelocked_empty(size, dtype)
            cuda_mem = cuda.mem_alloc(host_mem.nbytes)
            # Append the device buffer to device bindings.
            bindings.append(int(cuda_mem))
            # Append to the appropriate list.
            if engine.binding_is_input(binding):
                host_inputs.append(host_mem)
                cuda_inputs.append(cuda_mem)
            else:
                host_outputs.append(host_mem)
                cuda_outputs.append(cuda_mem)

        # Store
        self.stream = stream
        self.context = context
        self.engine = engine
        self.host_inputs = host_inputs
        self.cuda_inputs = cuda_inputs
        self.host_outputs = host_outputs
        self.cuda_outputs = cuda_outputs
        self.bindings = bindings

    def infer(self, input_image_path):
        # Note: despite its name, input_image_path is a BGR frame (numpy array),
        # not a file path.
        # threading.Thread.__init__(self)
        global int_box, x, y, z, th
        # Make self the active context, pushing it on top of the context stack.
        self.cfx.push()
        # Restore
        stream = self.stream
        context = self.context
        engine = self.engine
        host_inputs = self.host_inputs
        cuda_inputs = self.cuda_inputs
        host_outputs = self.host_outputs
        cuda_outputs = self.cuda_outputs
        bindings = self.bindings
        # Do image preprocess
        input_image, image_raw, origin_h, origin_w = self.preprocess_image(
            input_image_path
        )
        # Copy input image to host buffer
        np.copyto(host_inputs[0], input_image.ravel())
        # Transfer input data to the GPU.
        cuda.memcpy_htod_async(cuda_inputs[0], host_inputs[0], stream)
        # Run inference.
        context.execute_async(bindings=bindings, stream_handle=stream.handle)
        # Transfer predictions back from the GPU.
        cuda.memcpy_dtoh_async(host_outputs[0], cuda_outputs[0], stream)
        # Synchronize the stream
        stream.synchronize()
        # Remove any context from the top of the context stack, deactivating it.
        self.cfx.pop()
        # Here we use the first row of output in that batch_size = 1
        output = host_outputs[0]
        # Do postprocess
        result_boxes, result_scores, result_classid = self.post_process(
            output, origin_h, origin_w
        )

        if torch.tensor(0) in result_classid:
            # At least one detection of class 0 ("person").
            for i in range(len(result_boxes)):
                box = result_boxes[i]
                if int(result_classid[i]) == 0 and result_scores[i] >= 0.5:
                    int_box = list(map(int, box.numpy().tolist()))  # tensor to int
                    center = int((int_box[0] + int_box[2]) / 2)
                    print("center:", center)
                    print('result:', 1)
                    # Hard-coded pixel thresholds on the box center x;
                    # tune them for your camera resolution.
                    if center >= 900:
                        print("right")
                        x, y, z, th = 1, 0, 0, -1
                    elif center <= 300:
                        print("left")
                        x, y, z, th = 1, 0, 0, 1
                    else:
                        print("go")
                        x, y, z, th = 1, 0, 0, 0
                    plot_one_box(
                        box,
                        image_raw,
                        label="{}:{:.2f}".format(
                            categories[int(result_classid[i])], result_scores[i]
                        ),
                    )
            return image_raw
        else:
            # No person in this frame: spin toward where one was last seen.
            prebox = int_box
            precenter = int((prebox[0] + prebox[2]) / 2)
            print('result:', 2)
            if int_box != [0, 0, 0, 0]:
                if precenter >= 650:
                    print("turn right")
                    x, y, z, th = 0, 0, 0, -1
                else:
                    print("turn left")
                    x, y, z, th = 0, 0, 0, 1
            else:
                print("stop!")
                x, y, z, th = 0, 0, 0, 0
            print("precenter", precenter)
            return input_image_path
        # parent, filename = os.path.split(input_image_path)
        # save_name = os.path.join(parent, "output_" + filename)
        # # Save image
        # cv2.imwrite(save_name, image_raw)

    def destroy(self):
        # Remove any context from the top of the context stack, deactivating it.
        self.cfx.pop()

    def preprocess_image(self, input_image_path):
        """
        description: Take a BGR frame, convert it to RGB,
                     resize and pad it to target size, normalize to [0,1],
                     transform to NCHW format.
        param:
            input_image_path: the frame as a numpy array (not a path, despite the name)
        return:
            image:     the processed image
            image_raw: the original image
            h:         original height
            w:         original width
        """
        image_raw = input_image_path
        h, w, c = image_raw.shape
        image = cv2.cvtColor(image_raw, cv2.COLOR_BGR2RGB)
        # Calculate width and height and paddings
        r_w = INPUT_W / w
        r_h = INPUT_H / h
        if r_h > r_w:
            tw = INPUT_W
            th = int(r_w * h)
            tx1 = tx2 = 0
            ty1 = int((INPUT_H - th) / 2)
            ty2 = INPUT_H - th - ty1
        else:
            tw = int(r_h * w)
            th = INPUT_H
            tx1 = int((INPUT_W - tw) / 2)
            tx2 = INPUT_W - tw - tx1
            ty1 = ty2 = 0
        # Resize the image with long side while maintaining ratio
        image = cv2.resize(image, (tw, th))
        # Pad the short side with (128,128,128)
        image = cv2.copyMakeBorder(
            image, ty1, ty2, tx1, tx2, cv2.BORDER_CONSTANT, (128, 128, 128)
        )
        image = image.astype(np.float32)
        # Normalize to [0,1]
        image /= 255.0
        # HWC to CHW format:
        image = np.transpose(image, [2, 0, 1])
        # CHW to NCHW format
        image = np.expand_dims(image, axis=0)
        # Convert the image to row-major order, also known as "C order":
        image = np.ascontiguousarray(image)
        return image, image_raw, h, w

    def xywh2xyxy(self, origin_h, origin_w, x):
        """
        description: Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2]
                     where xy1=top-left, xy2=bottom-right
        param:
            origin_h: height of original image
            origin_w: width of original image
            x:        A boxes tensor, each row is a box [center_x, center_y, w, h]
        return:
            y:        A boxes tensor, each row is a box [x1, y1, x2, y2]
        """
        y = torch.zeros_like(x) if isinstance(x, torch.Tensor) else np.zeros_like(x)
        r_w = INPUT_W / origin_w
        r_h = INPUT_H / origin_h
        if r_h > r_w:
            y[:, 0] = x[:, 0] - x[:, 2] / 2
            y[:, 2] = x[:, 0] + x[:, 2] / 2
            y[:, 1] = x[:, 1] - x[:, 3] / 2 - (INPUT_H - r_w * origin_h) / 2
            y[:, 3] = x[:, 1] + x[:, 3] / 2 - (INPUT_H - r_w * origin_h) / 2
            y /= r_w
        else:
            y[:, 0] = x[:, 0] - x[:, 2] / 2 - (INPUT_W - r_h * origin_w) / 2
            y[:, 2] = x[:, 0] + x[:, 2] / 2 - (INPUT_W - r_h * origin_w) / 2
            y[:, 1] = x[:, 1] - x[:, 3] / 2
            y[:, 3] = x[:, 1] + x[:, 3] / 2
            y /= r_h
        return y

    def post_process(self, output, origin_h, origin_w):
        """
        description: postprocess the prediction
        param:
            output:   A tensor likes [num_boxes,cx,cy,w,h,conf,cls_id, cx,cy,w,h,conf,cls_id, ...]
            origin_h: height of original image
            origin_w: width of original image
        return:
            result_boxes:   final boxes, a boxes tensor, each row is a box [x1, y1, x2, y2]
            result_scores:  final scores, a tensor, each element is the score corresponding to box
            result_classid: final classid, a tensor, each element is the classid corresponding to box
        """
        # Get the num of boxes detected
        num = int(output[0])
        # Reshape to a two dimensional ndarray
        pred = np.reshape(output[1:], (-1, 6))[:num, :]
        # to a torch Tensor
        pred = torch.Tensor(pred).cuda()
        # Get the boxes
        boxes = pred[:, :4]
        # Get the scores
        scores = pred[:, 4]
        # Get the classid
        classid = pred[:, 5]
        # Choose those boxes that score > CONF_THRESH
        si = scores > CONF_THRESH
        boxes = boxes[si, :]
        scores = scores[si]
        classid = classid[si]
        # Transform bbox from [center_x, center_y, w, h] to [x1, y1, x2, y2]
        boxes = self.xywh2xyxy(origin_h, origin_w, boxes)
        # Do nms
        indices = torchvision.ops.nms(boxes, scores, iou_threshold=IOU_THRESHOLD).cpu()
        result_boxes = boxes[indices, :].cpu()
        result_scores = scores[indices].cpu()
        result_classid = classid[indices].cpu()
        return result_boxes, result_scores, result_classid


class myThread(threading.Thread):
    def __init__(self, func, args):
        threading.Thread.__init__(self)
        self.func = func
        self.args = args

    def run(self):
        self.func(*self.args)


class PublishThread(threading.Thread):
    def __init__(self, rate):
        super(PublishThread, self).__init__()
        self.publisher = rospy.Publisher('cmd_vel', Twist, queue_size=1)
        self.x = 0.0
        self.y = 0.0
        self.z = 0.0
        self.th = 0.0
        self.speed = 0.0
        self.turn = 0.0
        self.condition = threading.Condition()
        self.done = False

        # Set timeout to None if rate is 0 (causes new_message to wait forever
        # for new data to publish)
        if rate != 0.0:
            self.timeout = 1.0 / rate
        else:
            self.timeout = None

        self.start()

    def wait_for_subscribers(self):
        i = 0
        while not rospy.is_shutdown() and self.publisher.get_num_connections() == 0:
            if i == 4:
                print("Waiting for subscriber to connect to {}".format(self.publisher.name))
            rospy.sleep(0.5)
            i += 1
            i = i % 5
        if rospy.is_shutdown():
            raise Exception("Got shutdown request before subscribers connected")

    def update(self, x, y, z, th, speed, turn):
        self.condition.acquire()
        self.x = x
        self.y = y
        self.z = z
        self.th = th
        self.speed = speed
        self.turn = turn
        # Notify publish thread that we have a new message.
        self.condition.notify()
        self.condition.release()

    def stop(self):
        self.done = True
        self.update(0, 0, 0, 0, 0, 0)
        self.join()

    def run(self):
        twist = Twist()
        while not self.done:
            self.condition.acquire()
            # Wait for a new message or timeout.
            self.condition.wait(self.timeout)

            # Copy state into twist message.
            twist.linear.x = self.x * self.speed
            twist.linear.y = self.y * self.speed
            twist.linear.z = self.z * self.speed
            twist.angular.x = 0
            twist.angular.y = 0
            twist.angular.z = self.th * self.turn

            self.condition.release()

            # Publish.
            self.publisher.publish(twist)

        # Publish stop message when thread exits.
        twist.linear.x = 0
        twist.linear.y = 0
        twist.linear.z = 0
        twist.angular.x = 0
        twist.angular.y = 0
        twist.angular.z = 0
        self.publisher.publish(twist)


def getKey(key_timeout):
    tty.setraw(sys.stdin.fileno())
    rlist, _, _ = select.select([sys.stdin], [], [], key_timeout)
    if rlist:
        key = sys.stdin.read(1)
    else:
        key = ''
    termios.tcsetattr(sys.stdin, termios.TCSADRAIN, settings)
    return key


def vels(speed, turn):
    return "currently:\tspeed %s\tturn %s " % (speed, turn)


if __name__ == "__main__":
    # load custom plugins
    PLUGIN_LIBRARY = "build/libmyplugins.so"
    ctypes.CDLL(PLUGIN_LIBRARY)
    engine_file_path = "build/yolov5s.engine"

    settings = termios.tcgetattr(sys.stdin)
    rospy.init_node('my_code2')
    speed = rospy.get_param("~speed", 0.5)
    turn = rospy.get_param("~turn", 1.0)
    repeat = rospy.get_param("~repeat_rate", 0.0)
    # key_timeout = rospy.get_param("~key_timeout", 0.0)
    # if key_timeout == 0.0:
    #     key_timeout = None
    pub_thread = PublishThread(repeat)

    # load coco labels
    categories = ["person", "bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck", "boat", "traffic light",
                  "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow",
                  "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee",
                  "skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard",
                  "tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple",
                  "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "couch",
                  "potted plant", "bed", "dining table", "toilet", "tv", "laptop", "mouse", "remote", "keyboard", "cell phone",
                  "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear",
                  "hair drier", "toothbrush"]

    # a YoLov5TRT instance
    yolov5_wrapper = YoLov5TRT(engine_file_path)

    cap = cv2.VideoCapture(0)
    fourcc = cv2.VideoWriter_fourcc(*'XVID')  # output video codec
    fps = cap.get(cv2.CAP_PROP_FPS)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    save_path = "save/video_out.avi"
    vid_writer = cv2.VideoWriter(save_path, fourcc, fps, (w, h))
    _, image = cap.read()
    i = 0
    try:
        pub_thread.wait_for_subscribers()
        pub_thread.update(x, y, z, th, speed, turn)
        print(msg)
        print(vels(speed, turn))
        while 1:
            _, image = cap.read()
            # infer() updates the global x, y, z, th as a side effect.
            img = yolov5_wrapper.infer(image)
            cv2.imshow("result", img)
            if cv2.waitKey(1) & 0XFF == ord('q'):  # 1 millisecond
                break
            pub_thread.update(x, y, z, th, speed, turn)
            i = i + 1
            if i <= 100:
                # vid_writer.write(img)
                pass
    except Exception as e:
        print(e)
    finally:
        cap.release()
        vid_writer.release()
        cv2.destroyAllWindows()
        yolov5_wrapper.destroy()
        pub_thread.stop()
        termios.tcsetattr(sys.stdin, termios.TCSADRAIN, settings)

conda activate yolov5env

python Follow.py
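Follow.py loads libmyplugins.so and yolov5s.engine through paths relative to the working directory (build/...), so launching it from the wrong folder is an easy mistake. A small pre-flight check along the following lines may help; the helper name `missing_files` is hypothetical and not part of tensorrtx.

```python
import os


def missing_files(base_dir, names):
    """Return the paths under base_dir that do not exist as regular files."""
    paths = [os.path.join(base_dir, name) for name in names]
    return [path for path in paths if not os.path.isfile(path)]


if __name__ == "__main__":
    # "build" and the two file names are the relative paths used in Follow.py.
    for path in missing_files("build", ["libmyplugins.so", "yolov5s.engine"]):
        print("missing:", path)
```

Running this from the tensorrtx/yolov5 folder should print nothing once both files have been built.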

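Results

The overall control result is shown in the figure. It achieves simple person following, and apart from slightly stiff control it works reasonably well.

Unfortunately the radar on the car went unused. I had planned to combine it with the tracker for obstacle avoidance,

but I have not yet studied SLAM (I am still a beginner there), so the finished result is simple and rough.

For other projects, you can train dedicated weights for your own tracking target.

Because this article uses the official weight file directly, detection speed is affected to some extent.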
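One possible way to soften the stiff control mentioned above is to make the turn value proportional to how far the box center is from the image center, instead of switching between -1, 0 and 1. This is not what Follow.py does; it is a minimal sketch of a plain proportional rule, and the function name, `gain`, and `deadband` are hypothetical values that would need tuning on the real car.

```python
def proportional_turn(center_x, frame_width, gain=1.0, deadband=0.05):
    """Map the box center x (pixels) to a turn value in [-1, 1].

    Negative means turn right and positive means turn left, matching the
    sign convention of th in Follow.py.
    """
    half = frame_width / 2.0
    offset = (center_x - half) / half   # -1 at the left edge, +1 at the right edge
    if abs(offset) < deadband:
        return 0.0                      # close enough to centered: do not steer
    turn = -gain * offset
    return max(-1.0, min(1.0, turn))    # clamp to the range the publisher expects


if __name__ == "__main__":
    for cx in (0, 320, 640, 960, 1280):
        print(cx, proportional_turn(cx, 1280))
```

The returned value could stand in for th before it is passed to pub_thread.update, so the car turns gently for small offsets and sharply only when the target is near the edge of the frame.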