An updated guide to getting hardware-accelerated OpenCV on the BeagleBone Black

If you're interested in the BBB and OpenCV, you probably already know Michael Darling's guide. Seriously, if you have not read it, please go read it to appreciate the hard work people have done to make our life with the BBB much, much easier. These two blog posts about the BBB and OpenCV are also very interesting: [1] & [2].

That guide is dated September 24, 2013. Since then many things have changed and, fortunately, things are much easier now. I will point out the steps that need to change from the guide (because of changed links, changed sources, ...). I will also include some steps that I hope will make things easier for newcomers.

My steps assume that your main operating system is Windows.

This post is intended to be used side by side with Michael Darling's guide. He wrote a really excellent guide; I just want to update it and add some steps.

You have to install Ubuntu onto your BBB. Go here; I recommend getting Ubuntu Precise 12.04. I also recommend flashing it directly to your eMMC, as the eMMC is much faster than any uSD card. You can do that as follows:
  1. Download ubuntu-precise-12.04.3-armhf-3.8.13-bone30.img.xz to your desktop
  2. Extract it (you need 7-zip, a free and very powerful archive manager)
  3. Copy it to the root of your uSD card
  4. Go here, download it, and follow the instructions to perform the flashing

We need to install dependencies, tons of them. Making an "install.sh" file with the following content is the easiest and most automatic way to do so:

sudo apt-get -y install build-essential checkinstall cmake cmake-curses-gui pkg-config yasm
sudo apt-get -y install libtiff4-dev libjpeg-dev libjasper-dev
sudo apt-get -y install libavcodec-dev libavformat-dev libswscale-dev libxine-dev libgstreamer0.10-dev libgstreamer-plugins-base0.10-dev libv4l-dev
sudo apt-get -y install python-dev python-numpy
sudo apt-get -y install libfaac-dev libmp3lame-dev libopencore-amrnb-dev libopencore-amrwb-dev libtheora-dev libvorbis-dev libxvidcore-dev
sudo apt-get -y install x264 v4l-utils ffmpeg
sudo apt-get -y install libgtk2.0-dev

One tip for you: running "sudo apt-get clean" after installing the packages clears the download cache and frees up disk space (it can free 200-400MB, which matters a lot on the BBB).
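
To see how much space the cleanup actually frees, one can compare the free space before and after it. This is only a minimal sketch, assuming a POSIX df and that the root filesystem "/" is the one that fills up; the helper name free_kb is made up for illustration.

```shell
#!/bin/sh
# free_kb (hypothetical helper): print the free space, in KB, of the
# filesystem that holds the given path (default: /).
free_kb() {
    # -P forces one line per filesystem, -k reports 1024-byte blocks;
    # column 4 of the second line is the "Available" figure.
    df -Pk "${1:-/}" | awk 'NR == 2 { print $4 }'
}

before=$(free_kb /)
# sudo apt-get clean      # uncomment on the BBB to empty the package cache
after=$(free_kb /)
echo "Freed $(( (after - before) / 1024 )) MB"
```

The same helper is handy during the build itself, since object files pile up quickly.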

Recently, libjpeg-turbo has been chosen as the default libjpeg-dev package for the armhf distro, so you don't have to build it from source as in the guide.

I highly recommend setting up distcc as described in the guide. Building OpenCV on a single-core ARM board takes 3-4 hours and will make your little ARM CPU hot as hell. With distcc set up and OpenCV configured correctly, we can build it in 15 minutes, and the BBB CPU does not even get warm.

As for the compiler, you can use the one in the guide (version 4.8-2013.08) or the updated one here.

The new commands for installing distcc are:

wget https://distcc.googlecode.com/files/distcc-3.1.tar.bz2
tar xjf distcc-3.1.tar.bz2
cd distcc-3.1
./configure --with-gtk --disable-Werror
make
sudo make install

Get a USB flash drive and use MiniTool Partition Wizard Home Edition (free) to format it as Ext2. Remember to give it a name when doing so; the name must not contain spaces, for example "myusb". Plug it into your BBB.

The new commands for building OpenCV:

su
cd /media/myusb
wget http://downloads.sourceforge.net/project/opencvlibrary/opencv-unix/2.4.8/opencv-2.4.8.zip
unzip opencv-2.4.8.zip
cd opencv-2.4.8
mkdir build
cd build
cmake -D CMAKE_C_FLAGS='-O3 -mfpu=neon -mfloat-abi=hard' -D CMAKE_CXX_FLAGS='-O3 -mfpu=neon -mfloat-abi=hard' -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -DENABLE_VFPV3=ON -DENABLE_NEON=ON ..
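
Before relying on these flags, it may be worth confirming that the CPU actually reports NEON and VFPv3 support. A small sketch, assuming a Linux kernel that lists CPU features on a "Features" line in /proc/cpuinfo (as ARM kernels usually do); has_cpu_feature is a hypothetical helper name.

```shell
#!/bin/sh
# has_cpu_feature (hypothetical helper): succeed if the named feature
# appears as a whole word on the "Features" line of a cpuinfo file.
# $1 = feature name, $2 = cpuinfo path (default: /proc/cpuinfo)
has_cpu_feature() {
    grep -i '^features' "${2:-/proc/cpuinfo}" 2>/dev/null | grep -qw "$1"
}

for feature in neon vfpv3; do
    if has_cpu_feature "$feature"; then
        echo "$feature: supported"
    else
        echo "$feature: not reported"
    fi
done
```

On a machine that is not an ARM board (the distcc host, for example) both lines are expected to say "not reported".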

We need to disable some parts of the build to speed it up:

ccmake .

Remember the dot "." after the ccmake command. Check that ENABLE_NEON and ENABLE_VFPV3 are ON. Now disable BUILD_PERF_TESTS, BUILD_TESTS, and ENABLE_PRECOMPILED_HEADERS. Some explanations:

  • ENABLE_NEON and ENABLE_VFPV3: these enable hardware acceleration
  • BUILD_PERF_TESTS and BUILD_TESTS: set these 2 to OFF to skip building OpenCV's 2 test suites. Why not build the tests? Building them takes a lot of time, and I'm sure you don't have enough patience to run them. There is an individual test executable for each OpenCV module; I tried running the one for opencv_core myself, only to interrupt it after waiting for ~20 minutes.
  • ENABLE_PRECOMPILED_HEADERS: in theory, precompiled headers speed up compilation, but not in this case. The BBB CPU has to compile those headers alone, as distcc can't do this, and that takes the BBB CPU a lot of time. Here the precompiled headers would only speed up compilation on the host machine, which is already powerful. Less work for the BBB = faster overall compilation :).
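
The same three switches can also be set without the interactive screen by passing them to cmake with -D, which is handy when the configuration has to be repeated. A sketch, assuming it is pointed at the build directory created earlier (the one that already holds CMakeCache.txt from the first cmake run); the function name disable_slow_parts is made up for illustration.

```shell
#!/bin/sh
# disable_slow_parts (hypothetical helper): re-run cmake in an existing
# build directory with the test suites and precompiled headers switched off.
# $1 = build directory (default: current directory)
disable_slow_parts() {
    build_dir=${1:-.}
    # cmake keeps the earlier settings only if the directory was configured before.
    if [ ! -f "$build_dir/CMakeCache.txt" ]; then
        echo "no CMakeCache.txt in $build_dir; run the first cmake command there" >&2
        return 1
    fi
    ( cd "$build_dir" && cmake \
        -D BUILD_PERF_TESTS=OFF \
        -D BUILD_TESTS=OFF \
        -D ENABLE_PRECOMPILED_HEADERS=OFF \
        . )
}

# Usage on the BBB: disable_slow_parts /media/myusb/opencv-2.4.8/build
```

Options set this way are stored in the cache, so opening ccmake afterwards should show the three entries as OFF.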

Follow the on-screen instructions, press "c" to reconfigure and then "g" to regenerate. Finally, we can begin compilation:

make -j6

The "-jX" after the make command allows parallelization; set X to the number of jobs your host can handle.

My distcc host is a virtual machine running Ubuntu 12.04. My desktop is powered by an AMD Phenom II X4 and 4GB RAM so I give my virtual Ubuntu 3 cores and 1.5GB RAM. Therefore the number of jobs is 6. My setup built OpenCV in just 15 minutes.
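
The job count used here works out to two jobs per core given to the distcc host. That ratio is only inferred from this setup (3 cores, 6 jobs) and is not a fixed rule; a quick sketch of the arithmetic, with jobs_for_cores as a hypothetical helper:

```shell
#!/bin/sh
# jobs_for_cores (hypothetical helper): two make jobs per host core,
# the ratio implied by the setup described above (3 cores -> -j6).
jobs_for_cores() {
    echo $(( $1 * 2 ))
}

host_cores=3   # cores given to the distcc host (assumption: edit to match yours)
echo "make -j$(jobs_for_cores "$host_cores")"   # prints: make -j6
```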

Some final notes:

  • Environment variables set with export in a Unix shell are only temporary; they are gone after a reboot or even when the session ends. So if you plan to keep using distcc, you will have to edit the ".profile" file in your home folder. Just paste the export commands at the end of that file. The variables will then be created every time you log in with that username.
  • Don't forget to tell your system where to look for the OpenCV runtime libraries. Edit /etc/ld.so.conf and add "/usr/local/lib" at the end of it. Then run "sudo ldconfig".
  • The easiest way to edit those files from Windows is WinSCP (free). With WinSCP, you just navigate to the file and double-click it; a text editor opens it, and when you save, WinSCP puts the updated version in the right place. Nice and clean. Log in as your own user to edit the ".profile" file, and as root to edit /etc/ld.so.conf.
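
The first two notes both come down to appending a line to a text file, which can also be done from a shell on the BBB instead of WinSCP. A minimal sketch: append_once is a hypothetical helper, and the export line in the usage comments is only a placeholder, since the actual variables come from Michael Darling's guide.

```shell
#!/bin/sh
# append_once (hypothetical helper): add a line to a file unless an
# identical line is already there, so re-running it is harmless.
# $1 = file, $2 = line to add
append_once() {
    file=$1
    line=$2
    touch "$file" || return 1
    # -F: fixed string, -x: match the whole line, -q: quiet
    grep -Fxq -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage (the export line is a placeholder, not a real distcc setting):
#   as your user: append_once "$HOME/.profile" 'export SOME_VARIABLE=value'
#   as root:      append_once /etc/ld.so.conf /usr/local/lib && ldconfig
```

Checking for the line first means running the setup twice does not leave duplicate entries in either file.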

This "guide" leaves out a lot of stuff. That's intentional, so you have to read it in combination with Michael Darling's.

Feel free to leave suggestions and/or questions :).

Update: Watch the Final test run of my Thesis on YouTube

17 comments:

kannan said...

I want to know where exactly you're compiling OpenCV. Is it on the BeagleBone Black or on the PC? Please make it clear.

Unknown said...

Please read it again: I am using a Distributed Cross Compiler. The compilation is distributed between the BBB and the PC, with the heavy work done on the PC.

Unknown said...

Very nice! Congratulations! I am using opencv in BEAGLEBONE black, but unfortunately without NEON and without cross-compile :(
I will try to follow in your footsteps this week!
But I'm wanting to keep the newly released Debian 7.7 in my BBB, do you think I can get in trouble for that?

Unknown said...

Well, I don't know whether there will be problems with your Debian. Just keep in mind that you definitely need the libjpeg-turbo library.

And the cross-compile is actually a "distributed cross-compile", which is much easier to set up. Having the burden of compiling handled by a powerful PC saves you time when installing OpenCV and building your program.

Unknown said...

I just finished my tests using Debian 7.7 and OpenCV 2.4.10, and everything went very well. Thank you so much.

Unknown said...

It's good to hear about your success. Good luck, my friend.

Unknown said...

First of all, a perfect post for such a hardware-sensitive topic. I am motivated to learn about cross compilation. This can't be better. And a shout-out to Michael Darling too.
Second. I encountered a problem in the build process.
The following log :
//log begins
c++: internal compiler error: Killed (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See for instructions.
make[2]: *** [modules/core/CMakeFiles/opencv_core.dir/src/matmul.cpp.o] Error 4
make[2]: *** Waiting for unfinished jobs....
^Cmake[2]: *** [modules/core/CMakeFiles/opencv_core.dir/src/rand.cpp.o] Interrupt
make[2]: *** [modules/core/CMakeFiles/opencv_core.dir/src/stat.cpp.o] Interrupt
make[2]: *** [modules/core/CMakeFiles/opencv_core.dir/src/conjugate_gradient.cpp.o] Interrupt
make[2]: *** wait: No child processes. Stop.
make[1]: *** [modules/core/CMakeFiles/opencv_core.dir/all] Error 2
make: *** [all] Interrupt

//log ends

Please enlighten me.
PS : I havent' used an external SD Card.

Regards.

Unknown said...

From my humble experience (and provided you did not use a SD card), it is likely that you ran out of space.

You know, when compiling, lots of *.o files (object files) are made, and they occupy tons of space.

If I remember right, I encountered similar errors once when I did not use an SD card ;).

Unknown said...

Hi Tung Vu,

Excellent guide on the updated instructions for achieving 30 fps on the BBB.
I have some questions to ask.

Currently, I have successfully built OpenCV with libjpeg-turbo and NEON enabled on Debian Jessie 8.3 in the BBB.
I have set up cross-compilation for OpenCV from my Host PC (x86) running uBuntu 14.04 LTS. The toolchain I am using is gcc/g++ Linaro 4.8

1. Do I need to build libjpeg-turbo in my host PC as well when I compile the program in my host and transfer it to my target?

2. Also, when building libjpeg-turbo in the PC, do I need to specify the options
../configure CPPFLAGS=`-O3 -pipe -fPIC -mfpu=neon -mfloat-abi=hard`
as stated in the guide? My PC certainly won't support NEON SIMD instructions but I was wondering whether it is needed for the cross-compile process.

Thank you in advance!

Unknown said...

Hi Joshua Wong,

I think you misunderstood my compilation process.

I did not "cross-compile", as that is very complicated to set up. As you said yourself, some instructions are not supported on either side.

I use something called a "Distributed Cross-Compiler". You can get more information here: http://distcc.googlecode.com/svn/tags/distcc-3.0rc3/doc/web/index.html

Basically, there are 3 stages in the "compilation" process (the correct word is "building"). The code is preprocessed, then compiled to create object files, which are then linked to create the final executable. The preprocessing and linking stages are light on the CPU but need full library support (all the needed headers as well as the run-time libraries), while the compile stage is CPU-heavy but needs only a cross-compiler.

Therefore, the distributed cross-compiler does the preprocessing on the BBB, sends the preprocessed code to the (much more powerful) computer to compile, then takes the object files back to link on the BBB.

I hope this clears up your concerns. Feel free to contact me if there is anything else.

Unknown said...

Hi Tung Vu,

Thank you for your reply. Yes I understand the cross-compilation process better now. It's just that I thought I could set it up on my laptop and compile the binary for the BBB.

So from my understanding, after building OpenCV on the BBB, I also have to compile the code directly on the BBB and distcc will take care of the distributed cross-compilation right?

Thank you.

Best regards,
Joshua

Unknown said...

Hi Joshua Wong,

Yes, your understanding is now correct. According to my own experience, once setup, distcc is quite reliable.

Good luck, my friend.

Unknown said...

Thank you very much Tung Vu. I will try it out and see how it goes =)

Anuradha (Andre) said...

@Tung Vu : Hi, thanks for your great article and supportive information. I was able to create symlinks and install compilers as described in Michael Darling's guide. The updates have been also verified. However after the distcc daemon command is run, the host machine hangs at following point :

listening on 0.0.0.0:3632
distccd[2648] (dcc_defer_accept) TCP_DEFER_ACCEPT turned on
distccd[2648] (dcc_standalone_server) 1 CPU online on this server
distccd[2648] (dcc_standalone_server) allowing up to 4 active jobs
distccd[2648] (dcc_standalone_server) not detaching
distccd[2648] (dcc_new_pgrp) already a process group leader
distccd[2648] (dcc_log_daemon_started) preforking daemon started (3.1 i686-pc-linux-gnu, built Jul 11 2016 17:18:42)
distccd[2648] (dcc_create_kids) up to 1 children
distccd[2648] (dcc_create_kids) up to 2 children
distccd[2648] (dcc_create_kids) up to 3 children
distccd[2648] (dcc_create_kids) up to 4 children

after this point, no activities are observed. BB's OS is Debian Jessie and the cross compilation is done on an Ubuntu 12.04 virtual machine. The opencv version I am using is v3 rc1. What could stop the distributed compiling at host side ?

Unknown said...

Hi Anu,

I honestly don't think that your host machine is hanging.

The whole idea is that we start the daemon, and then the host machine stands by and waits for jobs from the BBB.

The compilation needs to be started from the BBB side. Once it has started, you can see jobs being sent to the host machine.

Good luck, my friend.

Unknown said...

Hi Tung Vu,

This is a very user friendly post, thank you for your efforts. I have built opencv with NEON and libjpeg-turbo. The flags you mentioned out are all correct in the opencv build directory (checked this using ccmake .) and first of all i set the frequency using the command: cpufreq-set -f 1000MHz and this is how I am compiling my program:

g++ -I/usr/local/include/opencv -I/usr/local/include/opencv2 -L/usr/local/lib/ -g -o swaraj_webcam swaraj_webcam.cpp all_my_libraries, where "all_my_libraries" is -lopencv_core and so on. Then I recorded 30 frames and it turns out my frame rate is still ~7 to 8 fps. Though i'm new to this, I feel i'm making a minor mistake but unable to find it out for now, what's your take on this issue? , am I compiling my program correctly?, anything wrong with libjpeg-turbo , neon may be?. I'll continue to find the problem till then and thank you once again for getting so many people this far :)

Regards,
Swaraj Dube.