

## **DEVELOP SMART COMPUTER VISION SOLUTIONS FASTER**

# WITH INTEL® COMPUTER VISION SDK & OTHER ADVANCED SOFTWARE TOOLS

Hosted by the Embedded Vision Alliance

Presented by Intel Corporation

## AGENDA

1. Trends Driving a Need for Computer Vision
2. Computer Vision & Deep Learning Value Together
3. Optimize Your Applications with the Right Tools
   - Intel<sup>®</sup> System Studio
   - Intel<sup>®</sup> Computer Vision SDK
   - Intel<sup>®</sup> Media SDK
   - Intel<sup>®</sup> SDK for OpenCL<sup>™</sup> Applications

### Goals

- Show how integrating computer vision can bring smart capabilities to great solutions
- Provide a technical introduction to each tool so you can get started

#### **Optimization Notice**



## **VIDEO: THE NEW FRONTIER**

Multiple sources: IHS, Markets & Markets, Strategy Analytics, Intel research

- Central Management, Archive and Analytics
- Video Gateways, Servers & Recorders
- Security & Surveillance
- Public Safety
- Traffic Control
- ADAS
- Smart Home & Building
- Manufacturing
- Robotics
- Retail Analytics
- Healthcare
- Infrastructure




## **CONNECTED DEVICES ARE EVERYWHERE**

### And Video Use is Increasing



### **Developers need tools that...**

- Are comprehensive and easy to use
- Quickly help resolve defects in complex systems
- Offer insight into sources of excess power consumption
- Enable and accelerate performance-demanding, unique, competitive use cases

### ...and take full advantage of Intel hardware accelerators




## **DEEP LEARNING BREAKTHROUGH**







## **END-TO-END DISTRIBUTED INTELLIGENCE**



## **ACCELERATE & DIFFERENTIATE WITH INTEL SOFTWARE TOOLS**





## **END-TO-END INTELLIGENCE**




## **END-TO-END ARTIFICIAL INTELLIGENCE FOR AUTOMATED DRIVING**




Copyright © 2017, Intel Corporation. All rights reserved. \*Other names and brands may be claimed as the property of others.



## **INTEL COMPUTER VISION PORTFOLIO**






## **INTEL HARDWARE IS HETEROGENEOUS**

### Skylake



### 6<sup>th</sup> Generation Core





### 7<sup>th</sup> Generation Core

### Apollo Lake



- E3950, E3940, E3930


## INTEL® SYSTEM STUDIO + HETEROGENEOUS SDKS

Deep System-wide Insight - Unlock Performance for System, Embedded & IoT Developers



### Better Together: A Portfolio to use Full Processor Capabilities




## **UNLOCK HARDWARE CAPABILITIES**



### NEW! INTEGRATE VISUAL UNDERSTANDING

Intel® Computer Vision SDK Beta

Linux\*/Yocto\* version available

### HETEROGENEOUS CUSTOM DEVELOPMENT

Intel® SDK for OpenCL™ Application Development

### ACCELERATE VIDEO PROCESSING

Intel® Media SDK for Embedded Linux\*, Windows\* & Open Source




## INTEL® C++ COMPILERS DELIVER IMPRESSIVE PERFORMANCE ON EMBEDDED APPLICATIONS POWERED BY INTEL ATOM PROCESSORS

**Coremark Pro\* benchmarks running on Intel Atom processors** 

#### 32-bit mode

#### 64-bit mode





Configuration: Intel<sup>®</sup> Atom<sup>™</sup> CPU C2750 @ 2.41GHz, 32GB RAM. Software: Intel<sup>®</sup> C++ Compiler 17.0, Intel<sup>®</sup> C++ Compiler 16.0, GCC 6.1.0, Clang/LLVM 3.8. Linux OS: Red Hat Enterprise Linux\* 7.0, Kernel 3.10.0-123.el7.x86\_64. Coremark Pro\* Benchmark (www.eembc.org). Compiler flags: Intel C++ 17.0 and Intel C++ 16.0: -O3 -ipo -no-prec-div -ansi-alias -xATOM\_SSE4.2 -static; GCC 6.1.0: -Ofast -mfpmath=sse -flto -march=native -funroll-loops -static. GCC and Clang/LLVM 3.8: compilers have additional flag -m32. Benchmark Source: Intel Corporation.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information & performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

**Optimization Notice**: Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets & other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804.


## YOUR BUILDING BLOCKS FOR IMAGE, SIGNAL & DATA PROCESSING APPS

Intel<sup>®</sup> Integrated Performance Primitives (Intel<sup>®</sup> IPP)

### What is Intel<sup>®</sup> IPP?

Provides developers with ready-to-use, processor-optimized functions to accelerate Image, Signal, Data Processing & Cryptography computation tasks

### Why use Intel<sup>®</sup> IPP?

- High Performance
- Easy-to-use APIs
- Faster Time To Market
- Production Ready

### How to get Intel<sup>®</sup> IPP

- Intel System Studio
- Intel Parallel Studio XE
- Free Tools Program



### **Image Processing**

- Medical Imaging
- Computer Vision
- Digital Surveillance
- Biometric Identification
- Automated Sorting
- ADAS
- Visual Search

### **Signal Processing**

- Games (sophisticated audio content or effects)
- Echo cancellation
- Telecommunications
- Energy

### Data Compression & Cryptography

- Data centers
- Enterprise data management
- ID verification
- Smart Cards/wallets
- Electronic Signature
- Information security/cybersecurity

Find out more at: software.intel.com/intel-ipp

Contact us: software.intel.com/en-us/forums/intel-integrated-performance-primitives

## MULTI-THREADING & HETEROGENEOUS COMPUTING MADE EASY

Intel<sup>®</sup> Threading Building Blocks (Intel<sup>®</sup> TBB)

### What is Intel<sup>®</sup> TBB?

A highly templatized C++ library designed to simplify adding parallelism to your application by taking advantage of all the CPUs on a single device or across multiple devices (heterogeneity).



### How to get Intel<sup>®</sup> TBB

- Intel System Studio
- Intel Parallel Studio XE
- Free Tools Program
- Open Source site

### Applications

- Artificial Intelligence & Automation
- Image processing
- Any solution needing sophisticated threading

### Why use Intel<sup>®</sup> TBB?

- High Performance
- Easy-to-use APIs
- Faster Time To Market
- Production Ready


## FASTER, SCALABLE CODE, FASTER

Intel<sup>®</sup> VTune<sup>™</sup> Amplifier Performance Profiler

### Get Faster Code Faster with Accurate Data & Meaningful Analysis

- Accurate CPU, GPU & threading data
- Memory access & storage analysis
- Powerful data analysis & filtering
- Data displayed on the source code
- Easy set-up, no special compiles

"Last week, Intel<sup>®</sup> VTune<sup>™</sup> Amplifier helped us find almost **3X performance improvement**. This week it helped us improve the performance another 3X."

Claire Cates, Principal Developer, SAS Institute Inc.



Learn More: intel.ly/vtune-amplifier-xe




## **A NEW SDK FOR COMPUTER VISION**

## INTEL® COMPUTER VISION SDK BETA








### Model Optimizer Bridges Train and Deploy

- Generate OpenVX code
- Generate Intermediate Representation (IR)
- Optimize the network
  - Node fusion
  - Node merging
  - Batch normalization
- Calculate and dump the normalized and converted weights/biases (normalization factor can be supplied by user if learning phase is skipped)
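
The slide does not say how the Model Optimizer implements its batch normalization step. As a rough illustration of what such a step commonly does, the sketch below folds a batch-normalization layer into the weights and biases of the layer in front of it, so inference runs one node instead of two. The `Affine` and `BatchNorm` structs and the `foldBatchNorm` function are hypothetical names made up for this example, and each channel is reduced to a single scalar weight to keep it short.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-channel linear layer: y = w * x + b.
struct Affine {
    std::vector<float> w;
    std::vector<float> b;
};

// Hypothetical per-channel batch-norm parameters:
// y = gamma * (x - mean) / sqrt(var + eps) + beta.
struct BatchNorm {
    std::vector<float> gamma;
    std::vector<float> beta;
    std::vector<float> mean;
    std::vector<float> var;
    float eps;
};

// Returns a single layer equivalent to applying `layer` and then `bn`.
Affine foldBatchNorm(const Affine& layer, const BatchNorm& bn) {
    assert(layer.w.size() == layer.b.size());
    assert(layer.w.size() == bn.gamma.size());
    Affine fused = layer;
    for (std::size_t c = 0; c < layer.w.size(); ++c) {
        // Batch norm is itself affine, so its scale can be pushed into w and b.
        const float scale = bn.gamma[c] / std::sqrt(bn.var[c] + bn.eps);
        fused.w[c] = layer.w[c] * scale;
        fused.b[c] = (layer.b[c] - bn.mean[c]) * scale + bn.beta[c];
    }
    return fused;
}
```

Because the fused parameters are computed once, ahead of time, the deployed network does no per-frame normalization work for that layer.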




## **RUN MODEL OPTIMIZER**

```
$ cd /opt/intel/computer_vision_sdk_2017.0.090
$ source bin/setupvars.sh
$ cd mo/bin
$ export FRAMEWORK_HOME=/home/user/Desktop/MO_LAB/caffe/build/lib/
$ ./ModelOptimizer --target APLK -i \
    -d /home/user/Downloads/caffe-master/models/bvlc_reference_caffenet/deploy.prototxt \
    -w /home/user/Downloads/caffe-master/models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel \
    -f 1 \
    -p FP16 \
    -o artifacts
```

Console output:

```
Start working...
Framework plugin: CAFFE
Target type: APLK
Network type: CLASSIFICATION
Batch size: 8
Precision: FP16
Layer fusion: true
Output directory: artifacts
Custom kernels directory:
Network input normalization: 1
Writing binary data to: artifacts/CaffeNet/CaffeNet.bin
```



## MODEL OPTIMIZER GENERATED OPENVX CODE

[Screenshot: graph viewer window (menus: File, Edit, Analytics, OpenVX, Help) with GoogleNet.graphml open]




1. Train
2. Prepare model
3. Inference

## **SETUP CODE**






## **SIMPLE CLASSIFICATION CODE**



Output is an array of category possibilities.

```
// get top classifier label
int blobsize = output->size();
float *data = output->data();
float max = 0;
int maxidx = 0;
for (int i1 = 0; i1 < blobsize; i1++) {
    if (data[i1] > max) {
        max = data[i1];
        maxidx = i1;
    }
}

// do something with classification

imshow( "frame", frame2 );
if (waitKey(30) >= 0) break;
```
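
The label lookup above can also be written with the standard library. The sketch below is not SDK code: `argmax` and `topK` are hypothetical helper names, and the scores are assumed to have already been copied into a `std::vector<float>`. Unlike the slide's loop, which starts from `max = 0`, `std::max_element` still returns the right index when every score is negative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <vector>

// Index of the highest score (the "top classifier label").
std::size_t argmax(const std::vector<float>& scores) {
    assert(!scores.empty());
    return static_cast<std::size_t>(
        std::distance(scores.begin(),
                      std::max_element(scores.begin(), scores.end())));
}

// Indices of the k highest scores, best first.
std::vector<std::size_t> topK(const std::vector<float>& scores, std::size_t k) {
    k = std::min(k, scores.size());
    std::vector<std::size_t> idx(scores.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    // Only the first k positions need to be ordered.
    std::partial_sort(idx.begin(),
                      idx.begin() + static_cast<std::ptrdiff_t>(k),
                      idx.end(),
                      [&scores](std::size_t a, std::size_t b) {
                          return scores[a] > scores[b];
                      });
    idx.resize(k);
    return idx;
}
```

Reporting the top few labels rather than a single one is a common choice for classification demos, since the runner-up classes show how confident the network is.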

OpenVX and the OpenVX logo are trademarks of the Khronos Group Inc. OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos



## **OPENCV VS. OPENVX**





**OpenCV**

- Industry standard
  - 47K user community
  - >14M downloads
- Community driven open source
- >2500 algorithms
- CPU C++, growing list of OpenCL/CUDA implementations
- Standard scheduling, no automatic tiling across functions, etc.

**OpenVX**

- Emerging standard
- Created for power optimized heterogeneous HW development
- Vendor driven, all or partial closed source
- ~50 algorithms
- Designed for fixed function, may be implemented in C++, OpenCL, etc.
- Automatic graph level optimizations (tiling, etc.)




## **DEVELOPMENT FLOW OPTIONS**

### **Vision Algorithm Designer**

- Build Pipelines
- Debug
- Performance feedback



### OpenVX<sup>™</sup> C/C++ API

- Use with familiar IDEs
- Interoperable with other libraries, SDKs & programming models

```
vx_context context = vxCreateContext();
vx_image input = vxCreateImage( context, 640, 480, VX_DF_IMAGE_U8 );
vx_image output = vxCreateImage( context, 640, 480, VX_DF_IMAGE_U8 );

vx_graph graph = vxCreateGraph( context );
vx_image intermediate = vxCreateVirtualImage( graph, 640, 480, VX_DF_IMAGE_U8 );
vx_node F1 = vxF1Node( graph, input, intermediate );
vx_node F2 = vxF2Node( graph, intermediate, output );

vxVerifyGraph( graph );
vxProcessGraph( graph ); // run in a loop
```




## THEORY OF OPERATION: INTEL® MEDIA SDK / INTEL® MEDIA SERVER STUDIO



Out of scope: \* audio, containers, networking...

- Media accelerator framework
- Codec based
- High level/parameter interface
- 3 operations

### Good option for:

- Accelerated video encode, decode
- (and short list of frame processing)

### More Information

- Media Server Studio
- Media SDK
- Intel Media Code Samples



## **CODECS + FRAME PROCESSING USE FIXED FUNCTION + EUS**

### **Video Encoding**

- ENC = EU+VDBox VME (MB type, motion vectors, bit budget/BRC)
- PAK = VDBox (residue packing & entropy coding)
- VDENC = low power encode (6<sup>th</sup> Generation Core<sup>®</sup> & forward)





### VPP

### **VPHal** Video Processing Hardware Acceleration Layer

#### VEBox

- Deinterlacing
- Denoise (Luma/Chroma)
- Frame Rate Conversion
- Color space conversions
- Composition/alpha blending
- Scaling




## **BASIC DECODE FLOW**



### Expected Return Codes for DecodeFrameAsync

- **MFX\_ERR\_MORE\_SURFACE**: A new surface is required to proceed; this is where decode will write its output.
- **MFX\_ERR\_MORE\_DATA**: More input bitstream data is required to proceed.
- **MFX\_WRN\_DEVICE\_BUSY**: Hardware device is unable to respond. This is an expected output for normal operation and should clear after a short wait. However, if this state persists more than a few milliseconds this may indicate a problem.
- **MFX\_WRN\_VIDEO\_PARAM\_CHANGED**: The SDK decoder parsed a new sequence header. Decoding can continue with existing frame buffers. The application can optionally retrieve new video parameters by calling MFXVideoDECODE\_GetVideoParam.
- **Other**: Other error codes may be bugs. Please contact an Intel support representative for more info.




## DECODE

```
do {
    if (still_reading_file) { // main loop
        sts = mfxDEC.DecodeFrameAsync(&mfxBS, pmfxSurfaces[nIndex], &pmfxOutSurface, &syncp);
    } else { // drain loop
        sts = mfxDEC.DecodeFrameAsync(NULL, pmfxSurfaces[nIndex], &pmfxOutSurface, &syncp);
        if (sts == MFX_ERR_MORE_DATA) break;
    }

    switch (sts) {
    case MFX_WRN_DEVICE_BUSY:
        MSDK_SLEEP(1); // Wait if device is busy, then repeat
        break;
    case MFX_ERR_MORE_SURFACE:
        nIndex = GetFreeSurfaceIndex(pmfxSurfaces, numSurfaces); // Find free frame surface
        break;
    case MFX_ERR_MORE_DATA:
        // read more bitstream data here, storing the result in readsts
        if (readsts != MFX_ERR_NONE) still_reading_file = 0;
        break;
    }

    if (MFX_ERR_NONE != sts) continue;

    sts = session.SyncOperation(syncp, 60000); // Wait until decode finished
    // frame data can be used by application now
    // Add VPP resize and classification code here
} while (true);
```
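
The control flow of this loop can be tried without Intel hardware. The sketch below is a stand-in only: `Status`, `MockDecoder`, `LoopStats` and `runDecodeLoop` are hypothetical names invented for this example, not Media SDK types, and the mock simply replays a scripted list of status values. It mirrors the slide's structure: keep calling the decoder, react to each status, count a frame only on success, and stop once the input is exhausted and the decoder asks for more data.

```cpp
#include <cassert>
#include <deque>

// Hypothetical stand-ins for the MFX_* status codes on the slide.
enum class Status { Ok, MoreData, MoreSurface, DeviceBusy };

// Hypothetical mock decoder: replays a script, then asks for more data forever.
struct MockDecoder {
    std::deque<Status> script;
    Status decodeFrameAsync() {
        if (script.empty()) return Status::MoreData;
        Status s = script.front();
        script.pop_front();
        return s;
    }
};

struct LoopStats {
    int frames = 0;        // successfully decoded frames
    int busyWaits = 0;     // times the device reported busy
    int surfaceSwaps = 0;  // times a new output surface was requested
    int reads = 0;         // times more input was supplied
};

// chunksAvailable models how many more reads of the input file will succeed.
LoopStats runDecodeLoop(MockDecoder& dec, int chunksAvailable) {
    assert(chunksAvailable >= 0);
    LoopStats stats;
    bool stillReading = true;
    for (;;) {
        const Status sts = dec.decodeFrameAsync();
        // Drain phase: no input left and the decoder has nothing buffered.
        if (!stillReading && sts == Status::MoreData) break;
        switch (sts) {
        case Status::DeviceBusy:
            ++stats.busyWaits;      // real code would sleep briefly here
            break;
        case Status::MoreSurface:
            ++stats.surfaceSwaps;   // real code would pick a free surface here
            break;
        case Status::MoreData:
            if (chunksAvailable > 0) {
                --chunksAvailable;
                ++stats.reads;
            } else {
                stillReading = false;  // switch to the drain phase
            }
            break;
        case Status::Ok:
            break;
        }
        if (sts != Status::Ok) continue;
        ++stats.frames;  // frame is ready; processing would go here
    }
    return stats;
}
```

Keeping the status handling in one `switch` means each warning or "need more" code has a single recovery action, and only a clean status falls through to per-frame work.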




## INTEL<sup>®</sup> SDK FOR OPENCL<sup>™</sup> APPLICATIONS

### **SDK Tools**



- Kernel analyzer
- Kernel debugger
- Offline compiler
- IDE integration




### Download from software.intel.com/intel-opencl




## **EXTENSIONS MAP**



For more info: software.intel.com/articles/opencl-intel-graphics-extensions


## **SHARING APIS IN ACTION**





For more info: <u>software.intel.com/articles/tutorial-opencl-interoperability-with-video-acceleration-api-on-linux-os</u>

- Interop example code (in this tutorial)
- Intel<sup>®</sup> Media SDK/Intel<sup>®</sup> Media Server Studio samples (sample\_multi\_transcode, sample\_encode)




## HARDWARE VIDEO MOTION ESTIMATION



### VME is part of the Media Sampler

- Programmable through EUs
- Operates on 16x16 macroblocks
- 1 per sub-slice
  - 2 sub-units (co-issuable)
- Implements key motion estimation operations
  - Inter Motion Estimation
  - Sub-pixel refinement
  - Intra Prediction
  - Many more...
- Programmable general purpose operations
- Optimized for memory bandwidth
- Provides configurable raw compute
- Smarts in the hands of the programmer
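
For readers new to motion estimation, the sketch below shows in plain C++ what an inter motion search over a 16x16 macroblock computes: the displacement into a reference frame that minimizes the sum of absolute differences (SAD). It is a software illustration only and says nothing about how the VME hardware is implemented or programmed. `Frame`, `MotionVector`, `blockSad` and `searchBlock` are hypothetical names, and the exhaustive integer-pixel search is an assumption chosen for clarity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <vector>

// Hypothetical 8-bit grayscale frame, stored row by row.
struct Frame {
    int width;
    int height;
    std::vector<std::uint8_t> pix;
    std::uint8_t at(int x, int y) const {
        return pix[static_cast<std::size_t>(y) * static_cast<std::size_t>(width)
                   + static_cast<std::size_t>(x)];
    }
};

struct MotionVector {
    int dx;
    int dy;
    long sad;
};

// SAD between the n x n block at (bx, by) in cur and the block displaced
// by (dx, dy) in ref.
long blockSad(const Frame& cur, const Frame& ref,
              int bx, int by, int dx, int dy, int n) {
    long sad = 0;
    for (int y = 0; y < n; ++y) {
        for (int x = 0; x < n; ++x) {
            sad += std::abs(static_cast<int>(cur.at(bx + x, by + y)) -
                            static_cast<int>(ref.at(bx + x + dx, by + y + dy)));
        }
    }
    return sad;
}

// Exhaustive search over displacements in [-range, range] on both axes.
MotionVector searchBlock(const Frame& cur, const Frame& ref,
                         int bx, int by, int range, int n = 16) {
    assert(cur.width == ref.width && cur.height == ref.height);
    assert(bx >= 0 && by >= 0 && bx + n <= cur.width && by + n <= cur.height);
    MotionVector best{0, 0, std::numeric_limits<long>::max()};
    for (int dy = -range; dy <= range; ++dy) {
        for (int dx = -range; dx <= range; ++dx) {
            // Skip candidates that fall outside the reference frame.
            if (bx + dx < 0 || by + dy < 0 ||
                bx + dx + n > ref.width || by + dy + n > ref.height) {
                continue;
            }
            const long sad = blockSad(cur, ref, bx, by, dx, dy, n);
            if (sad < best.sad) best = MotionVector{dx, dy, sad};
        }
    }
    return best;
}
```

Even this small search evaluates up to (2 * range + 1)^2 candidates of 256 pixel differences each per macroblock, which gives a sense of why the operation benefits from dedicated hardware.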




## **INTRODUCING VEBOX**







### Total Color Correction (TCC)

### A configurable pipeline of common video processing operations


[Figure: Media SDK architecture diagram. Pipeline stages: Input Source (file source and others), Decode, Video Processor, Encode, and outputs (Renderer, File Sink), shown alongside a Computer Vision SDK OpenVX/CNN graph.]

## MORE RESOURCES - DOWNLOAD SOFTWARE TO GET STARTED






# **THANK YOU!**

## **Empowering Product Creators to Harness Embedded Vision**

The Embedded Vision Alliance (<u>www.Embedded-Vision.com</u>) is a partnership of 60+ leading embedded vision technology and services suppliers

Mission: Inspire and empower product creators to incorporate visual intelligence into their products

The Alliance provides low-cost, high-quality technical educational resources for product developers

### Register for updates at <u>www.Embedded-Vision.com</u>

The Alliance enables vision technology providers to grow their businesses through leads, ecosystem partnerships, and insights

For membership, email us: membership@Embedded-Vision.com









## Legal Notices and Disclaimers

Intel technologies' features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer.

No computer system can be absolutely secure.

Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit <u>http://www.intel.com/performance</u>.

Cost reduction scenarios described are intended as examples of how a given Intel<sup>®</sup>-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.

This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.

Intel, the Intel logo, the Intel Inside logo, Intel Atom, Quark and Mashery are trademarks of Intel Corporation in the U.S. and/or other countries.

\*Other names and brands may be claimed as the property of others.



## Legal Disclaimer & Optimization Notice

INFORMATION IN THIS DOCUMENT IS PROVIDED "AS IS". NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

Copyright © 2017, Intel Corporation. All rights reserved. Intel, Pentium, Xeon, Xeon Phi, Core, VTune, Cilk, and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries.

#### **Optimization Notice**

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804