Posts

Simplest Demo for Displaying Image Processing Results using PySide6

Deep learning applications, image processing algorithms in particular, almost always need their results displayed visually. Raw OpenCV imshow calls are simple but not enough: they fall short on extensibility and maintainability. Qt, on the other hand, is a mature UI framework that originated in C++ and is now available in Python. The PyQt4/PyQt5 packages come from the community, while the PySide2/PySide6 packages are the official Qt bindings. We will use PySide6 here because it is newer and official. Image data is almost always wrapped in an OpenCV Mat. Here is the workflow: images are captured from camera devices or from video streams/files and decoded into OpenCV Mat objects one by one, then passed to the algorithm for further processing. Finally, the UI framework, Qt, takes over displaying the results.

Wednesday, March 9, 2022 Read

Using MMOCR: Installation and Training with my own Data

What is OCR? According to Wikipedia, OCR (optical character recognition, or optical character reader) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text. Why MMOCR? It is built on PyTorch, it is maintained by OpenMMLab, it is easy to train on your own data, and it can be deployed in C++ code. PyTorch is sweet: it seems to have absorbed most of the advantages of current deep learning frameworks. MMOCR is maintained by OpenMMLab, which belongs to SenseTime, so the repo is likely to live long and keep being updated. MMOCR has a clear structure, just like MMDetection, and it ships experimental deployment code too.

Friday, March 4, 2022 Read

Setup GitLab Service on Ubuntu 16.04 in Local LAN

GitLab is as powerful as GitHub. We decided to switch our version control system from SVN to Git and to run the GitLab server on our local LAN. The blog posts I found on installing GitLab were all too long to read and too complicated for running GitLab on a local LAN with no domain name, so I wrote this one. Before installing GitLab, please update the Ubuntu package repositories.

Tuesday, February 8, 2022 Read

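Bounds Checking with the & Operator when Cropping Images in OpenCV

Cropping with a given ROI: suppose an image has to be cropped to a predefined ROI. With the ROI given as a cv::Rect, the crop can be written like this:

    cv::Mat image = cv::imread("/path/to/image.jpg");
    cv::Rect roi = cv::Rect(x, y, width, height);
    cv::Mat crop = image(roi);

Clamping to the image bounds: if the ROI coordinates fall outside the valid area of the image, a runtime error is raised and the program crashes. The usual remedy is to check and normalize the bounds beforehand, for example:

    if (roi.x < 0) roi.x = 0;
    if (roi.y < 0) roi.y = 0;
    if (roi.x + roi.width >= image.cols) roi.width = image.cols - roi.x;
    if (roi.y + roi.height >= image.rows) roi.height = image.rows - roi.y;

Code written this way is not very intuitive and is rather verbose, let alone elegant or readable. Or like this:

    int w = image.cols;
    int h = image.rows;
    int x0 = std::max<int>(0, roi.tl().x);
    int y0 = std::max<int>(0, roi.tl().y);
    int x1 = std::min<int>(w, roi.br().x);
    int y1 = std::min<int>(h, roi.br().y);
    roi = cv::Rect(cv::Point(x0, y0), cv::Point(x1, y1));

This is somewhat more readable, especially if you are used to bounds checking with the STL max/min functions, but it is still verbose and not elegant. What is wrong with verbose? Generally speaking, verbose code is harder to maintain and less readable. In the implementations above, moreover, the same variable is used over and over just to perform similar operations on its different members, which makes careless mistakes very easy.

Operator & : Get Intersection of cv::Rect. The & operator is fairly intuitive. In C/C++ syntax, & is a bitwise operator that performs bitwise AND. cv::Rect overloads it, and as you might guess, the overload returns the intersection of two rectangles. So to clamp an ROI given as a cv::Rect, simply intersect the ROI with the bounding box that represents the image area. The code looks like this:

    cv::Rect bbox(0, 0, image.cols, image.rows);
    roi = roi & bbox;  // that's all

The clamping is done in essentially one statement.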
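
To make the intersection idea concrete outside of OpenCV, here is a minimal pure-Python sketch. The `Rect` class and the `clamp_roi` helper are hypothetical names invented for this illustration; they only mimic, as far as I understand it, what `cv::Rect`'s `&` does, including returning an empty rectangle when there is no overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Hypothetical stand-in for cv::Rect: top-left corner plus size."""
    x: int
    y: int
    width: int
    height: int

    def __and__(self, other: "Rect") -> "Rect":
        # Intersection = largest top-left corner, smallest bottom-right corner.
        x0 = max(self.x, other.x)
        y0 = max(self.y, other.y)
        x1 = min(self.x + self.width, other.x + other.width)
        y1 = min(self.y + self.height, other.y + other.height)
        if x1 <= x0 or y1 <= y0:
            return Rect(0, 0, 0, 0)  # no overlap: empty rectangle
        return Rect(x0, y0, x1 - x0, y1 - y0)


def clamp_roi(roi: Rect, image_width: int, image_height: int) -> Rect:
    """Clamp an ROI to the image area by intersecting it with the image bbox."""
    return roi & Rect(0, 0, image_width, image_height)


if __name__ == "__main__":
    # An ROI that sticks out past the top-left corner of a 100x80 image.
    print(clamp_roi(Rect(-10, -5, 50, 40), 100, 80))
```

Note that an ROI lying entirely outside the image comes back with zero width and height, so a caller should still check for an empty result before cropping.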

Thursday, August 26, 2021 Read

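Using lr_scheduler in PyTorch

Some common LR schedulers: StepLR, MultiStepLR, ExponentialLR. For a basic training loop, an LR scheduler may not be necessary, but a complete training pipeline needs one. The scheduler works after the optimizer: it keeps a reference to the optimizer and, once the optimizer has performed its update, checks whether the preset condition is met and updates the learning rate accordingly. Take StepLR as an example: every N epochs, StepLR changes the learning rate to lr = lr * gamma.

    import torch.optim.lr_scheduler as lr_scheduler
    from torch.optim import SGD

    optimizer = SGD(model.parameters(), lr=0.1)
    scheduler = lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)
    for epoch in range(20):
        for input, target in dataset:
            optimizer.zero_grad()
            output = model(input)
            loss = loss_fn(output, target)
            loss.backward()
            optimizer.step()
        scheduler.step()  # once per epoch, after the optimizer updates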
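
The StepLR rule can also be written in closed form, which makes it easy to check what learning rate to expect at a given epoch. The sketch below is plain Python with no PyTorch dependency; `step_lr` is a hypothetical helper name, and the values 0.1, 0.1 and 10 are simply the ones used in the snippet above.

```python
def step_lr(base_lr: float, gamma: float, step_size: int, epoch: int) -> float:
    """Learning rate at `epoch` when lr is multiplied by gamma every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)


if __name__ == "__main__":
    # Same settings as the training loop: lr=0.1, step_size=10, gamma=0.1.
    for epoch in (0, 9, 10, 19):
        print(epoch, step_lr(0.1, 0.1, 10, epoch))
```

With these settings the learning rate stays at 0.1 for epochs 0 to 9 and drops to roughly 0.01 for epochs 10 to 19, up to floating-point rounding.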

Tuesday, August 24, 2021 Read

Using MNIST-ROT Dataset in PyTorch

Dataset download link: MNIST-ROT. Introduction: as described in the PDO-eConvs paper, MNIST-rot-12k is the most commonly used dataset for validating rotation-equivariant algorithms. So if you are interested in research on rotation-equivariant algorithms, this dataset may be helpful. Turn to DataLoader Code.

Tuesday, August 24, 2021 Read

OpenCV DNN Batch Inference in C++

OpenCV has a DNN module, which is powerful, efficient, and easy to use. To implement a DNN inference application, we only need to call a handful of APIs offered by the OpenCV DNN module. The basic routine for implementing DNN inference with OpenCV is as follows.

1. Initialization. Create the cv::dnn::Net object by reading in the network weights (caffe/onnx…).
2. Preprocessing. Determine the shape of the network's input data and reshape the raw input image(s) to match it. This step is usually combined with other operations such as normalization.
3. Inference. Call the inference method on the created cv::dnn::Net object.
4. Postprocessing. Decode the output data and do any further wrangling.

In my opinion, since the network weights are already fixed, the most important parts of deployment are pre- and postprocessing. You need to figure out exactly what the input shape is and what the normalization method is (mean/std values). Postprocessing can be much more complicated. Some tasks are easy to implement, classification for instance. Others are much harder, such as object detection or segmentation. You need to do a lot of work to crack the data wrangling problems, and sometimes you may have to rewrite some operations from scratch, simply because C++ has no counterpart to a method used in the original Python implementation.
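
As a rough illustration of the "easy" end of pre- and postprocessing, here is a dependency-free Python sketch of per-channel normalization and of decoding a classification output. It does not call OpenCV at all; `normalize_pixel`, `softmax` and `decode_classification` are hypothetical helper names, and the mean/std values and labels are made up for the example. A real deployment has to use whatever the model was trained with.

```python
import math
from typing import List, Sequence, Tuple


def normalize_pixel(pixel: Sequence[float], mean: Sequence[float],
                    std: Sequence[float]) -> List[float]:
    """Preprocessing: per-channel (value - mean) / std for one pixel."""
    return [(value - m) / s for value, m, s in zip(pixel, mean, std)]


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn raw network outputs into probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_classification(logits: Sequence[float],
                          labels: Sequence[str]) -> Tuple[str, float]:
    """Postprocessing: pick the most probable class and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


if __name__ == "__main__":
    # Assumed normalization constants, for illustration only.
    print(normalize_pixel([127.5, 0.0, 255.0], [127.5] * 3, [127.5] * 3))
    print(decode_classification([1.0, 3.0, 2.0], ["cat", "dog", "bird"]))
```

Detection and segmentation outputs need considerably more decoding than this (anchor or grid decoding, non-maximum suppression, mask resizing), which is where most of the porting effort tends to go.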

Monday, August 23, 2021 Read

QtHttpServer Demo and a POST Client

Client HTTP POST: suppose http://127.0.0.1:8888/post/ is a path that accepts POST requests and we want to submit a piece of JSON data to it. With Qt this can be done as follows:

    QCoreApplication app(argc, argv);
    QNetworkAccessManager *mgr = new QNetworkAccessManager;
    const QUrl url(QStringLiteral("http://127.0.0.1:8888/post/"));
    QNetworkRequest request(url);
    request.setHeader(QNetworkRequest::ContentTypeHeader, "application/json; charset=utf-8");
    QJsonObject obj;
    obj["key1"] = "value1";
    obj["key2"] = "value2";
    QJsonDocument doc(obj);
    QByteArray data = doc.toJson();
    QNetworkReply *reply = mgr->post(request, data);
    QObject::connect(reply, &QNetworkReply::finished, [=](){
        if(reply->error() == QNetworkReply::NoError){
            QString contents = QString::fromUtf8(reply->readAll());
            qDebug() << contents;
        }
        else{
            QString err = reply->errorString();
            qDebug() << err;
        }
        reply->deleteLater();
        mgr->deleteLater();
    });

Http Server: the local server can be implemented easily with QtHttpServer, too.
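
For readers who want to try the request/response round trip without a Qt build, the same exchange can be sketched with Python's standard library alone. This is not QtHttpServer: `EchoHandler`, `post_json` and `demo` are hypothetical names, the echo behaviour of the server is an assumption made for the demo, and the port is chosen by the OS rather than fixed at 8888.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Toy server side: accept a JSON POST and echo it back wrapped in an object."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length).decode("utf-8"))
        body = json.dumps({"received": payload}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def post_json(url: str, obj: dict) -> dict:
    """Client side: POST `obj` as a JSON body and parse the JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(obj).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )
    # Ignore any proxy settings so the loopback request goes direct.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


def demo() -> dict:
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: OS picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/post/"
        return post_json(url, {"key1": "value1", "key2": "value2"})
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

A throwaway server like this is also handy for testing the Qt client above: point the QUrl at whatever port the Python server reports.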

Saturday, August 21, 2021 Read

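Some Notes on Managing C/C++ Projects with CMake

Nearly all of my own C/C++ projects are now managed with CMake. CMake has a concise syntax and is powerful, and most mainstream C/C++ libraries ship with built-in CMake support. The representative libraries I use most at work are:

OpenCV: an open-source library originally developed by Intel, indispensable for image processing.
Boost: the most important extension library for C++, providing extensions to the standard library, new features not yet supported by standard compilers, and some syntactic sugar.
Qt: a powerful UI library.
CUDA: parallel acceleration on NVIDIA GPUs.

Starting from the simplest possible Hello CMake program, the rest of this post walks through some of the ways CMake is used in practice.

Nearly Empty C/C++ Project: for the simplest CMakeLists file, see hello-cmake, which is very concise; I quote it here with slight modifications. Suppose the project contains just this one source file:

    #include <iostream>

    int main(int argc, char *argv[])
    {
        std::cout << "Hello CMake!" << std::endl;
        return 0;
    }

There are no external dependencies and no extra logic. This is basically the C++ HelloWorld program, except that it prints Hello CMake. In the project directory, add a CMakeLists.txt file with the following content:

    cmake_minimum_required(VERSION 3.5)
    # Set the project name
    project(hello_cmake)
    # Add an executable
    add_executable(hello_cmake main.cpp)

Run the following commands in the project directory to build:

    mkdir build && cd build
    cmake ..
    make

Running ./hello_cmake prints Hello CMake, so the build succeeded.

CMake with OpenCV: what if the project needs to call OpenCV? Write a simple program, cv-test.cpp, that uses OpenCV to read and display an image:

    #include <iostream>
    #include <opencv2/opencv.hpp>
    #include <opencv2/highgui.hpp>

    int main(int argc, char** argv)
    {
        if (argc != 2) {
            std::cout << "usage: DisplayImage.out <Image_Path>" << std::endl;
            return -1;
        }
        cv::Mat image;
        image = cv::imread(argv[1]);
        if (image.empty()) {
            std::cout << "No image data" << std::endl;
            return -1;
        }
        cv::imshow("Display Image", image);
        cv::waitKey(0);
        return 0;
    }

The corresponding CMakeLists.txt file: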

Friday, August 20, 2021 Read

Calling Qt Components without a GUI from a Shared Library and Exposing a C-Style API

Building the QtHttpServer module: first fetch the code:

    git clone https://github.com/qt-labs/qthttpserver.git
    cd qthttpserver
    git checkout 5.15
    git submodule update --init --recursive

Then open the project in Qt Creator and build it.

The simplest Qt code: the simplest Qt program you can build is probably a HelloWorld that has no UI and runs in the console. Its code looks roughly like this:

    #include <QtCore>

    int main(int argc, char **argv)
    {
        QCoreApplication app(argc, argv);
        // qDebug()<<"Hello World";
        return app.exec();
    }

Using QtHttpServer: QtHttpServer is not yet part of the main Qt libraries; it is reportedly going to be added officially in Qt 6. So you have to download and build it yourself.

Building QtHttpServer: I am using Qt 5, so I check out the 5.15 branch and build it myself with Qt Creator:

    git clone https://github.com/qt-labs/qthttpserver.git
    cd qthttpserver
    git checkout 5.15

After a successful build, place the headers and shared libraries into the corresponding directories of the Qt installation in use:

    cd build-qthttpserver-Desktop_Qt_5_12_6_GCC_64bit-Release/
    mv ./* /opt/Qt5.12.6/5.12.6/gcc_64/lib/
    cd cmake/
    mv ./* /opt/Qt5.12.6/5.12.6/gcc_64/lib/cmake/
    cd ..
    cd pkgconfig/
    mv ./* /opt/Qt5.12.6/5.12.6/gcc_64/lib/pkgconfig/
    cd ..
    cd ..
    cd include/
    mv ./* /opt/Qt5.12.6/5.12.6/gcc_64/include/
    cd ..
    cd mkspecs
    cd modules
    mv ./* /opt/Qt5.12.6/5.12.6/gcc_64/mkspecs/modules/

Starting QtHttpServer: the official Qt blog post introducing-qt-http-server shows the most basic usage of QHttpServer:

Wednesday, August 18, 2021 Read

Build OpenCV 4.5.2 with CUDA support | CMake+VS2019+Win10+CUDA

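OpenCV is one of the most commonly used libraries for computer image processing development in C++. I have been using OpenCV since roughly version 2.4, all the way to the current release, 4.5.2 [as of 2021-06-08]. OpenCV is feature-rich: it covers the vast majority of classic algorithms in traditional image processing, its basic matrix type Mat is simple and powerful, and it now also includes the DNN module for deep learning model inference. Each of these two parts deserves an in-depth discussion of its own.

I recently decided to use the DNN module for algorithm deployment, which requires GPU acceleration. The official prebuilt packages no longer meet that need, so I decided to build OpenCV myself and tailor the library to my own requirements. For my day-to-day development work this is an investment that pays for itself many times over. Because of business requirements, this walkthrough is done on Windows 10. Building on Linux is more convenient and less error-prone, and the steps are basically the same. Fortunately, Visual Studio's CMake support on Windows keeps getting better.

Following my usual habit, I created a folder named opencv-github under my work directory and then pulled the latest opencv and opencv-contrib from git. opencv-contrib generally contains non-release components and nonfree components.

    git clone https://github.com/opencv/opencv.git
    git clone https://github.com/opencv/opencv_contrib.git

If Visual Studio 2019 is installed, cmake-gui can most likely be launched directly from the cmd console. After opening cmake-gui, set the source path and the build path. My settings are:

    Source path: D:/WORK/opencv-github/opencv
    Build path:  D:/WORK/opencv-github/opencv/build

Click Configure and choose VS2019 x64. On the first Configure, the ippicv dependency failed to download; I ran Configure again and this time it succeeded. Search for the OPENCV_EXTRA_MODULES_PATH option and fill in the path to OpenCV Contrib:

    D:/WORK/opencv-github/opencv_contrib/modules

Search for cuda and tick OPENCV_DNN_CUDA and WITH_CUDA. Do not tick BUILD_CUDA_STUBS. CUDA must be installed beforehand, and cuDNN should preferably be installed as well; make sure the CUDA and cuDNN versions match. Click Configure again.

Tick the BUILD_opencv_world option. This is mainly for convenience: one library to rule them all, so you do not have to link the modules one by one. Of course, linking only the modules you need gives better control over the size of the released program, so there are pros and cons. Tick the OPENCV_ENABLE_NONFREE option as well. It compiles in some algorithms that cannot be used commercially for free, which is fine for research and experiments. The best-known of the former nonfree algorithms is SIFT, but SIFT is free now.

Now click Generate. Once generation finishes, click Open Project, which launches the "best IDE in the universe", Visual Studio 2019, and the build can begin. Expand CMakeTargets, select ALL_BUILD, and choose Build from the context menu or simply press F7 to start building. After ALL_BUILD finishes, select INSTALL and build it; the resulting headers, lib files, dll files and cmake files then end up in the install directory under the build directory mentioned at the start. ALL_BUILD and INSTALL here correspond to the familiar make and make install on Linux. With the headers, lib files, dll files and cmake files in place, you can happily start developing.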

Monday, August 16, 2021 Read

Importing tensorflow-io in a Kaggle Notebook Fails with undefined symbol

I was recently experimenting in a Kaggle IPython notebook, and while trying out a new approach the code ended up importing tensorflow-io:

    import tensorflow_io

This raised an error, whose core is this part:

    libtensorflow_io.so undefined symbol: _ZN10tensorflow8OpKernel11TraceStringEPNS_15OpKernelContextEb

I am not very familiar with tensorflow-io, so I went straight to Google. For such a simple error, the search turned up a pile of answers that did not look very reliable. For example, under the question unable-to-open-file-libtensorflow-io-so-caused-by-undefined-symbol, several answers say to uninstall tensorflow and tensorflow-io and reinstall tensorflow-gpu and tensorflow-io. But WHY??

The real cause is simple: the tensorflow-io and tensorflow versions have to match, in other words be compatible. The official tensorflow-io git repo provides a table of corresponding version numbers. Checking against the table, I found that the default tensorflow version in the Kaggle notebook environment was 2.12, while the installed tensorflow-io was 0.31.0. According to the table, I should install tensorflow-io 0.32.0. Running the install command in the notebook

    !pip install tensorflow-io==0.32.0

made the error go away. It seems the official notebook environment is not that reliable either. Table link: tensorflow-version-compatibility

    TensorFlow I/O Version   TensorFlow Compatibility   Release Date
    0.32.0                   2.12.x                     Mar 28, 2023
    0.31.0                   2.11.x                     Feb 25, 2023
    0.30.0                   2.11.x                     Jan 20, 2023
    0.29.0                   2.11.x                     Dec 18, 2022
    0.28.0                   2.11.x                     Nov 21, 2022
    0.27.0                   2.10.x                     Sep 08, 2022
    0.26.0                   2.9.x                      May 17, 2022
    0.25.0                   2.8.x                      Apr 19, 2022
    0.24.0                   2.8.x                      Feb 04, 2022
    0.23.1                   2.7.x                      Dec 15, 2021
    0.23.0                   2.7.x                      Dec 14, 2021
    0.22.0                   2.7.x                      Nov 10, 2021
    0.21.0                   2.6.x                      Sep 12, 2021
    0.20.0                   2.6.x                      Aug 11, 2021
    0.19.1                   2.5.x                      Jul 25, 2021
    0.19.0                   2.5.x                      Jun 25, 2021
    0.18.0                   2.5.x                      May 13, 2021
    0.17.1                   2.4.x                      Apr 16, 2021
    0.17.0                   2.4.x                      Dec 14, 2020
    0.16.0                   2.3.x                      Oct 23, 2020
    0.15.0                   2.3.x                      Aug 03, 2020
    0.14.0                   2.2.x                      Jul 08, 2020
    0.13.0                   2.2.x                      May 10, 2020
    0.12.0                   2.1.x                      Feb 28, 2020
    0.11.0                   2.1.x                      Jan 10, 2020
    0.10.0                   2.0.x                      Dec 05, 2019
    0.9.1                    2.0.x                      Nov 15, 2019
    0.9.0                    2.0.x                      Oct 18, 2019
    0.8.1                    1.15.x                     Nov 15, 2019
    0.8.0                    1.15.x                     Oct 17, 2019
    0.7.2                    1.14.x                     Nov 15, 2019
    0.7.1                    1.14.x                     Oct 18, 2019
    0.7.0                    1.14.x                     Jul 14, 2019
    0.6.0                    1.13.x                     May 29, 2019
    0.5.0                    1.13.x                     Apr 12, 2019
    0.4.0                    1.13.x                     Mar 01, 2019
    0.3.0                    1.12.0                     Feb 15, 2019
    0.2.0                    1.12.0                     Jan 29, 2019
    0.1.0                    1.12.0                     Dec 16, 2018
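
The lookup done by hand above is easy to automate. The sketch below hard-codes a few rows of the table (the newest tensorflow-io release listed for each TensorFlow minor version) and builds the matching pip command. `COMPATIBILITY`, `matching_tensorflow_io` and `pip_command` are hypothetical names for this illustration, and the mapping only covers what the table shows, so it will go stale as new releases appear.

```python
# Newest tensorflow-io release listed in the table for each TensorFlow minor version.
COMPATIBILITY = {
    "2.12": "0.32.0",
    "2.11": "0.31.0",
    "2.10": "0.27.0",
    "2.9": "0.26.0",
    "2.8": "0.25.0",
    "2.7": "0.23.1",
    "2.6": "0.21.0",
    "2.5": "0.19.1",
}


def matching_tensorflow_io(tf_version: str) -> str:
    """Return the tensorflow-io version listed for a TensorFlow version such as '2.12.0'."""
    major_minor = ".".join(tf_version.split(".")[:2])
    if major_minor not in COMPATIBILITY:
        raise ValueError(f"TensorFlow {tf_version} is not in the hard-coded table")
    return COMPATIBILITY[major_minor]


def pip_command(tf_version: str) -> str:
    """Build the pip command that installs the matching tensorflow-io release."""
    return f"pip install tensorflow-io=={matching_tensorflow_io(tf_version)}"


if __name__ == "__main__":
    # In a notebook, tf_version would come from tensorflow.__version__.
    print(pip_command("2.12.0"))
```

Failing loudly for versions outside the table is deliberate: guessing a "closest" release is exactly how a mismatched pair ends up installed.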

Monday, January 1, 1 Read