转载

arXiv Paper Daily: Tue, 6 Dec 2016

Neural and Evolutionary Computing

BrainFrame: A heterogeneous accelerator platform for neuron simulations

Georgios Smaragdos , Georgios Chatzikonstantis , Rahul Kukreja , Harrys Sidiropoulos , Dimitrios Rodopoulos , Ioannis Sourdis , Zaid Al-Ars , Christoforos Kachris , Dimitrios Soudris , Chris I. De Zeeuw , Christos Strydis

Comments: 17 pages, 19 figures, 4 tables

Subjects

Neural and Evolutionary Computing (cs.NE)

; Distributed, Parallel, and Cluster Computing (cs.DC)

Objective: The advent of High-Performance Computing (HPC) in recent years has

led to its increasing use in brain study through computational models. The

scale and complexity of such models are constantly increasing, leading to

challenging computational requirements. Even though modern HPC platforms can

often deal with such challenges, the vast diversity of the modeling field does

not permit for a single acceleration (or homogeneous) platform to effectively

address the complete array of modeling requirements. Approach: In this paper we

propose and build BrainFrame, a heterogeneous acceleration platform,

incorporating three distinct acceleration technologies, a Dataflow Engine, a

Xeon Phi and a GP-GPU. The PyNN framework is also integrated into the platform.

As a challenging proof of concept, we analyze the performance of BrainFrame on

different instances of a state-of-the-art neuron model, modeling the Inferior-

Olivary Nucleus using a biophysically-meaningful, extended Hodgkin-Huxley

representation. The model instances take into account not only the neuronal-

network dimensions but also different network-connectivity circumstances that

can drastically change application workload characteristics. Main results: The

synthetic approach of three HPC technologies demonstrated that BrainFrame is

better able to cope with the modeling diversity encountered. Our performance

analysis shows clearly that the model directly affect performance and all three

technologies are required to cope with all the model use cases.

Enabling Bio-Plausible Multi-level STDP using CMOS Neurons with Dendrites and Bistable RRAMs

Xinyu Wu , Vishal Saxena Subjects : Emerging Technologies (cs.ET) ; Neural and Evolutionary Computing (cs.NE)

Large-scale integration of emerging nanoscale non-volatile memory devices,

e.g. resistive random-access memory (RRAM), can enable a new generation of

neuromorphic computers that can solve a wide range of machine learning

problems. Such hybrid CMOS-RRAM neuromorphic architectures will result in

several orders of magnitude reduction in energy consumption at a very small

form factor, and herald autonomous learning machines capable of self-adapting

to their environment. However, the progress in this area has been impeded from

the realization that the actual memory devices fall well short of their

expected behavior. In this work, we discuss the challenges associated with

these memory devices and their use in neuromorphic computing circuits, and

propose pathways to overcome these limitations by introducing ‘dendritic

learning’.

Message Passing Multi-Agent GANs

Arnab Ghosh , Viveka Kulharia , Vinay Namboodiri

Comments: The first 2 authors contributed equally for this work

Subjects

Computer Vision and Pattern Recognition (cs.CV)

; Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

Communicating and sharing intelligence among agents is an important facet of

achieving Artificial General Intelligence. As a first step towards this

challenge, we introduce a novel framework for image generation: Message Passing

Multi-Agent Generative Adversarial Networks (MPM GANs). While GANs have

recently been shown to be very effective for image generation and other tasks,

these networks have been limited to mostly single generator-discriminator

networks. We show that we can obtain multi-agent GANs that communicate through

message passing to achieve better image generation. The objectives of the

individual agents in this framework are two fold: a co-operation objective and

a competing objective. The co-operation objective ensures that the message

sharing mechanism guides the other generator to generate better than itself

while the competing objective encourages each generator to generate better than

its counterpart. We analyze and visualize the messages that these GANs share

among themselves in various scenarios. We quantitatively show that the message

sharing formulation serves as a regularizer for the adversarial training.

Qualitatively, we show that the different generators capture different traits

of the underlying data distribution.

Known Unknowns: Uncertainty Quality in Bayesian Neural Networks

Ramon Oliveira , Pedro Tabacof , Eduardo Valle

Comments: Workshop on Bayesian Deep Learning, NIPS 2016, Barcelona, Spain

Subjects

Machine Learning (stat.ML)

; Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

We evaluate the uncertainty quality in neural networks using anomaly

detection. We extract uncertainty measures (e.g. entropy) from the predictions

of candidate models, use those measures as features for an anomaly detector,

and gauge how well the detector differentiates known from unknown classes. We

assign higher uncertainty quality to candidate models that lead to better

detectors. We also propose a novel method for sampling a variational

approximation of a Bayesian neural network, called One-Sample Bayesian

Approximation (OSBA). We experiment on two datasets, MNIST and CIFAR10. We

compare the following candidate neural network models: Maximum Likelihood,

Bayesian Dropout, OSBA, and — for MNIST — the standard variational

approximation. We show that Bayesian Dropout and OSBA provide better

uncertainty information than Maximum Likelihood, and are essentially equivalent

to the standard variational approximation, but much faster.

Semi-supervised learning of deep metrics for stereo reconstruction

Stepan Tulyakov , Anton Ivanov , Francois Fleuret

Comments: 11 pages, 3 figures

Subjects

Computer Vision and Pattern Recognition (cs.CV)

; Neural and Evolutionary Computing (cs.NE)

Deep-learning metrics have recently demonstrated extremely good performance

to match image patches for stereo reconstruction. However, training such

metrics requires large amount of labeled stereo images, which can be difficult

or costly to collect for certain applications. The main contribution of our

work is a new semi-supervised method for learning deep metrics from unlabeled

stereo images, given coarse information about the scenes and the optical

system. Our method alternatively optimizes the metric with a standard

stochastic gradient descent, and applies stereo constraints to regularize its

prediction. Experiments on reference data-sets show that, for a given network

architecture, training with this new method without ground-truth produces a

metric with performance as good as state-of-the-art baselines trained with the

said ground-truth. This work has three practical implications. Firstly, it

helps to overcome limitations of training sets, in particular noisy ground

truth. Secondly it allows to use much more training data during learning.

Thirdly, it allows to tune deep metric for a particular stereo system, even if

ground truth is not available.

Positive blood culture detection in time series data using a BiLSTM network

Leen De Baets , Joeri Ruyssinck , Thomas Peiffer , Johan Decruyenaere , Filip De Turck , Femke Ongenae , Tom Dhaene Subjects : Learning (cs.LG) ; Neural and Evolutionary Computing (cs.NE); Quantitative Methods (q-bio.QM); Machine Learning (stat.ML)

The presence of bacteria or fungi in the bloodstream of patients is abnormal

and can lead to life-threatening conditions. A computational model based on a

bidirectional long short-term memory artificial neural network, is explored to

assist doctors in the intensive care unit to predict whether examination of

blood cultures of patients will return positive. As input it uses nine

monitored clinical parameters, presented as time series data, collected from

2177 ICU admissions at the Ghent University Hospital. Our main goal is to

determine if general machine learning methods and more specific, temporal

models, can be used to create an early detection system. This preliminary

research obtains an area of 71.95% under the precision recall curve, proving

the potential of temporal neural networks in this context.

Parameter Compression of Recurrent Neural Networks and Degredation of Short-term Memory

Jonathan A. Cox

Comments: Submitted to IJCNN 2017

Subjects

Computer Vision and Pattern Recognition (cs.CV)

; Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

The significant computational costs of deploying neural networks in

large-scale or resource constrained environments, such as data centers and

mobile devices, has spurred interest in model compression, which can achieve a

reduction in both arithmetic operations and storage memory. Several techniques

have been proposed for reducing or compressing the parameters for feed-forward

and convolutional neural networks, but less is understood about the effect of

parameter compression on recurrent neural networks (RNN). In particular, the

extent to which the recurrent parameters can be compressed and the impact on

short-term memory performance, is not well understood. In this paper, we study

the effect of complexity reduction, through singular value decomposition rank

reduction, on RNN and minimal gated recurrent unit (MGRU) networks for several

tasks. We show that considerable rank reduction is possible when compressing

recurrent weights, even without fine tuning. Furthermore, we propose a

perturbation model for the effect of general perturbations, such as a

compression, on the recurrent parameters of RNNs. The model is tested against a

noiseless memorization experiment that elucidates the short-term memory

performance. In this way, we demonstrate that the effect of compression of

recurrent parameters is dependent on the degree of temporal coherence present

in the data and task. This work can guide on-the-fly RNN compression for novel

environments or tasks, and provides insight for applying RNN compression in

low-power devices, such as hearing aids.

Computer Vision and Pattern Recognition

ROAM: a Rich Object Appearance Model with Application to Rotoscoping

Ondrej Miksik , Juan-Manuel Pérez-Rúa , Philip H. S. Torr , Patrick Pérez Subjects : Computer Vision and Pattern Recognition (cs.CV)

Rotoscoping, the detailed delineation of scene elements through a video shot,

is a painstaking task of tremendous importance in professional post-production

pipelines. While pixel-wise segmentation techniques can help for this task,

professional rotoscoping tools rely on parametric curves that offer the artists

a much better interactive control on the definition, editing and manipulation

of the segments of interest. Sticking to this prevalent rotoscoping paradigm,

we propose a novel framework to capture and track the visual aspect of an

arbitrary object in a scene, given a first closed outline of this object. This

model combines a collection of local foreground/background appearance models

spread along the outline, a global appearance model of the enclosed object and

a set of distinctive foreground landmarks. The structure of this rich

appearance model allows simple initialization, efficient iterative optimization

with exact minimization at each step, and on-line adaptation in videos. We

demonstrate qualitatively and quantitatively the merit of this framework

through comparisons with tools based on either dynamic segmentation with a

closed curve or pixel-wise binary labelling.

Authoring image decompositions with generative models

Jason Rock , Theerasit Issaranon , Aditya Deshpande , David Forsyth Subjects : Computer Vision and Pattern Recognition (cs.CV)

We show how to extend traditional intrinsic image decompositions to

incorporate further layers above albedo and shading. It is hard to obtain data

to learn a multi-layer decomposition. Instead, we can learn to decompose an

image into layers that are “like this” by authoring generative models for each

layer using proxy examples that capture the Platonic ideal (Mondrian images for

albedo; rendered 3D primitives for shading; material swatches for shading

detail). Our method then generates image layers, one from each model, that

explain the image. Our approach rests on innovation in generative models for

images. We introduce a Convolutional Variational Auto Encoder (conv-VAE), a

novel VAE architecture that can reconstruct high fidelity images. The approach

is general, and does not require that layers admit a physical interpretation.

Articulated Multi-person Tracking in the Wild

Eldar Insafutdinov , Mykhaylo Andriluka , Leonid Pishchulin , Siyu Tang , Bjoern Andres , Bernt Schiele Subjects : Computer Vision and Pattern Recognition (cs.CV)

In this paper we propose an approach for articulated tracking of multiple

people in unconstrained videos. Our starting point is a model that resembles

existing architectures for single-frame pose estimation but is several orders

of magnitude faster. We achieve this in two ways: (1) by simplifying and

sparsifying the body-part relationship graph and leveraging recent methods for

faster inference, and (2) by offloading a substantial share of computation onto

a feed-forward convolutional architecture that is able to detect and associate

body joints of the same person even in clutter. We use this model to generate

proposals for body joint locations and formulate articulated tracking as

spatio-temporal grouping of such proposals. This allows to jointly solve the

association problem for all people in the scene by propagating evidence from

strong detections through time and enforcing constraints that each proposal can

be assigned to one person only. We report results on a public MPII Human Pose

benchmark and on a new dataset of videos with multiple people. We demonstrate

that our model achieves state-of-the-art results while using only a fraction of

time and is able to leverage temporal information to improve state-of-the-art

for crowded scenes.

ImageNet pre-trained models with batch normalization

Marcel Simon , Erik Rodner , Joachim Denzler Subjects : Computer Vision and Pattern Recognition (cs.CV)

Convolutional neural networks (CNN) pre-trained on ImageNet are the backbone

of most state-of-the-art approaches. In this paper, we present a new set of

pre-trained models with popular state-of-the-art architectures for the Caffe

framework. The first release includes Residual Networks (ResNets) with

generation script as well as the batch-normalization-variants of AlexNet and

VGG19. All models outperform previous models with the same architecture. The

models and training code are available at

this http URL and

this https URL

A Distance Function for Comparing Straight-Edge Geometric Figures

Apoorva Honnegowda Roopa , Shrisha Rao

Comments: 29 pages, 12 figures including appendices

Subjects

Computer Vision and Pattern Recognition (cs.CV)

; Computational Geometry (cs.CG)

This paper defines a distance function that measures the dissimilarity

between planar geometric figures formed with straight lines. This function can

in turn be used in partial matching of different geometric figures. For a given

pair of geometric figures that are graphically isomorphic, one function

measures the angular dissimilarity and another function measures the edge

length disproportionality. The distance function is then defined as the convex

sum of these two functions. The novelty of the presented function is that it

satisfies all properties of a distance function and the computation of the same

is done by projecting appropriate features to a cartesian plane. To compute the

deviation from the angular similarity property, the Euclidean distance between

the given angular pairs and the corresponding points on the (y=x) line is

measured. Further while computing the deviation from the edge length

proportionality property, the best fit line, for the set of edge lengths, which

passes through the origin is found, and the Euclidean distance between the

given edge length pairs and the corresponding point on a (y=mx) line is

calculated. Iterative Proportional Fitting Procedure (IPFP) is used to find

this best fit line. We demonstrate the behavior of the defined function for

some sample pairs of figures.

From One-Trick Ponies to All-Rounders: On-Demand Learning for Image Restoration

Ruohan Gao , Kristen Grauman Subjects : Computer Vision and Pattern Recognition (cs.CV)

While machine learning approaches to image restoration offer great promise,

current methods risk training “one-trick ponies” that perform well only for

image corruption of a particular level of difficulty—such as a certain level

of noise or blur. First, we expose the weakness of today’s one-trick pony and

demonstrate that training general models equipped to handle arbitrary levels of

corruption is indeed non-trivial. Then, we propose an on-demand learning

algorithm for training image restoration models with deep convolutional neural

networks. The main idea is to exploit a feedback mechanism to self-generate

training instances where they are needed most, thereby learning models that can

generalize across difficulty levels. On four restoration tasks—image

inpainting, pixel interpolation, image deblurring, and image denoising—and

three diverse datasets, our approach consistently outperforms both the status

quo training procedure and curriculum learning alternatives.

Human-In-The-Loop Person Re-Identification

Hanxiao Wang , Shaogang Gong , Xiatian Zhu , Tao Xiang Subjects : Computer Vision and Pattern Recognition (cs.CV)

Current person re-identification (re-id) methods assume that (1) pre-labelled

training data are available for every camera pair, (2) the gallery size for

re-identification is moderate. Both assumptions scale poorly to real-world

applications when camera network size increases and gallery size becomes large.

Human verification of automatic model ranked re-id results becomes inevitable.

In this work, a novel human-in-the-loop re-id model based on Human Verification

Incremental Learning (HVIL) is formulated which does not require any

pre-labelled training data to learn a model, therefore readily scalable to new

camera pairs. This HVIL model learns cumulatively from human feedback to

provide instant improvement to re-id ranking of each probe on-the-fly enabling

the model scalable to large gallery sizes. We further formulate a Regularised

Metric Ensemble Learning (RMEL) model to combine a series of incrementally

learned HVIL models into a single ensemble model to be used when human feedback

becomes unavailable.

Highly Efficient Regression for Scalable Person Re-Identification

Hanxiao Wang , Shaogang Gong , Tao Xiang Subjects : Computer Vision and Pattern Recognition (cs.CV)

Existing person re-identification models are poor for scaling up to large

data required in real-world applications due to: (1) Complexity: They employ

complex models for optimal performance resulting in high computational cost for

training at a large scale; (2) Inadaptability: Once trained, they are

unsuitable for incremental update to incorporate any new data available. This

work proposes a truly scalable solution to re-id by addressing both problems.

Specifically, a Highly Efficient Regression (HER) model is formulated by

embedding the Fisher’s criterion to a ridge regression model for very fast

re-id model learning with scalable memory/storage usage. Importantly, this new

HER model supports faster than real-time incremental model updates therefore

making real-time active learning feasible in re-id with human-in-the-loop.

Extensive experiments show that such a simple and fast model not only

outperforms notably the state-of-the-art re-id methods, but also is more

scalable to large data with additional benefits to active learning for reducing

human labelling effort in re-id deployment.

Classification With an Edge: Improving Semantic Image Segmentation with Boundary Detection

Dimitrios Marmanis , Konrad Schindler , Jan Dirk Wegner , Silvano Galliani , Mihai Datcu , Uwe Stilla Subjects : Computer Vision and Pattern Recognition (cs.CV)

We present an end-to-end trainable deep convolutional neural network (DCNN)

for semantic segmentation with built-in awareness of semantically meaningful

boundaries. Semantic segmentation is a fundamental remote sensing task, and

most state-of-the-art methods rely on DCNNs as their workhorse. A major reason

for their success is that deep networks learn to accumulate contextual

information over very large windows (receptive fields). However, this success

comes at a cost, since the associated loss of effecive spatial resolution

washes out high-frequency details and leads to blurry object boundaries. Here,

we propose to counter this effect by combining semantic segmentation with

semantically informed edge detection, thus making class-boundaries explicit in

the model, First, we construct a comparatively simple, memory-efficient model

by adding boundary detection to the Segnet encoder-decoder architecture.

Second, we also include boundary detection in FCN-type models and set up a

high-end classifier ensemble. We show that boundary detection significantly

improves semantic segmentation with CNNs. Our high-end ensemble achieves > 90%

overall accuracy on the ISPRS Vaihingen benchmark.

Stereo image de-fencing using smartphones

Sankaraganesh Jonna , Sukla Satapathy , Rajiv R. Sahay

Comments: Under review as a conference paper

Subjects

Computer Vision and Pattern Recognition (cs.CV)

Conventional approaches to image de-fencing have limited themselves to using

only image data in adjacent frames of the captured video of an approximately

static scene. In this work, we present a method to harness disparity using a

stereo pair of fenced images in order to detect fence pixels. Tourists and

amateur photographers commonly carry smartphones/phablets which can be used to

capture a short video sequence of the fenced scene. We model the formation of

the occluded frames in the captured video. Furthermore, we propose an

optimization framework to estimate the de-fenced image using the total

variation prior to regularize the ill-posed problem.