Reposted

arXiv Paper Daily: Mon, 6 Feb 2017

Neural and Evolutionary Computing

Robust Particle Swarm Optimizer based on Chemomimicry

Casey Kneale, Karl S. Booksh

Comments: To be revised for formatting and submitted as a Letters style paper

Subjects: Neural and Evolutionary Computing (cs.NE)

A particle swarm optimizer (PSO) loosely based on the phenomena of

crystallization and a chaos factor which follows the complementary error

function is described. The method features three phases: diffusion, directed

motion, and nucleation. During the diffusion phase random walk is the only

contributor to particle motion. As the algorithm progresses the contribution

from chaos decreases and movement toward global best locations is pursued until

convergence has occurred. The algorithm was found to be more robust to local

minima in multimodal test functions than a standard PSO algorithm and is

designed for problems which feature experimental precision.
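
A minimal Python sketch of the idea described above, not the authors' implementation: a chaos weight that follows the complementary error function decays from about 1 toward 0 as iterations proceed, so early motion is dominated by a random walk and later motion by attraction to the global best. The schedule `chaos_weight`, its `steepness` parameter, the step sizes, and the sphere test function are all assumptions made for illustration, and the nucleation phase is not modeled.

```python
import math
import random


def chaos_weight(t, t_max, steepness=3.0):
    """Hypothetical schedule: 0.5 * erfc(z) falls from ~1 to ~0 as z goes from -s to +s."""
    z = steepness * (2.0 * t / t_max - 1.0)
    return 0.5 * math.erfc(z)


def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)


def optimize(f, dim=2, n_particles=20, t_max=200, bound=5.0, seed=0):
    """Blend a random walk with motion toward the global best (assumed update rule)."""
    rng = random.Random(seed)
    swarm = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(n_particles)]
    best = min(swarm, key=f)[:]
    for t in range(t_max):
        w = chaos_weight(t, t_max)
        for p in swarm:
            for d in range(dim):
                random_step = rng.gauss(0.0, 1.0)                # diffusion (random walk)
                directed_step = rng.random() * (best[d] - p[d])  # pull toward global best
                p[d] += w * random_step + (1.0 - w) * directed_step
            if f(p) < f(best):
                best = p[:]
    return best, f(best)


if __name__ == "__main__":
    position, value = optimize(sphere)
    print(position, value)
```

With this schedule the weight is close to 1 at the first iteration, exactly 0.5 at the midpoint, and close to 0 at the end, which mirrors the diffusion-to-directed-motion progression in the abstract.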

Eye-Movement behavior identification for AD diagnosis

Juan Biondi, Gerardo Fernandez, Silvia Castro, Osvaldo Agamenonni

Subjects: Neural and Evolutionary Computing (cs.NE); Neurons and Cognition (q-bio.NC)

In the present work, we develop a deep-learning approach for differentiating

the eye-movement behavior of people with neurodegenerative diseases from that of

healthy control subjects while reading well-defined sentences. We define an

information compaction of the eye-tracking data of subjects without and with

probable Alzheimer’s disease when reading a set of well-defined, previously

validated sentences, including high- and low-predictability sentences and proverbs.

Using this information we train a set of denoising sparse-autoencoders and

build a deep neural network with these and a softmax classifier. Our results

are very promising and show that these models may help to understand the

dynamics of eye movement behavior and its relationship with underlying

neuropsychological correlates.

Optimal Experimental Design of Field Trials using Differential Evolution

Vitaliy Feoktistov, Stephane Pietravalle, Nicolas Heslot

Comments: 7 pages, 5 figures

Subjects: Neural and Evolutionary Computing (cs.NE); Quantitative Methods (q-bio.QM)

When setting up field experiments to test and compare a range of genotypes

(e.g. maize hybrids), it is important to account for any possible field effect

that may otherwise bias performance estimates of genotypes. To do so, we

propose a model-based method aimed at optimizing the allocation of the tested

genotypes and checks between fields and placement within field, according to

their kinship. This task can be formulated as a combinatorial permutation-based

problem. We used the Differential Evolution concept to solve this problem. We then

present results of optimal strategies for between-field and within-field

placements of genotypes and compare them to existing optimization strategies,

both in terms of convergence time and result quality. The new algorithm gives

promising results in terms of convergence and search space exploration.
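
As a rough illustration of how Differential Evolution can be applied to a permutation problem, the Python sketch below uses random-key encoding: each individual is a vector of real numbers, and sorting the positions by key value yields a permutation. The encoding, the `toy_cost` objective, and all parameter values are assumptions for illustration only; the abstract does not state which encoding or field-effect model the authors use.

```python
import random


def keys_to_permutation(keys):
    """Random-key decoding: rank positions by key value to get a permutation."""
    return sorted(range(len(keys)), key=lambda i: keys[i])


def toy_cost(perm):
    """Hypothetical stand-in objective: total displacement from the identity layout."""
    return sum(abs(i - p) for i, p in enumerate(perm))


def differential_evolution(cost, n_items=8, pop_size=20, generations=100,
                           f_scale=0.5, cr=0.9, seed=0):
    """DE/rand/1/bin on real-valued keys, scored through the decoded permutation."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_items)] for _ in range(pop_size)]
    scores = [cost(keys_to_permutation(ind)) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(n_items)  # at least one gene comes from the mutant
            trial = [
                pop[a][k] + f_scale * (pop[b][k] - pop[c][k])
                if (rng.random() < cr or k == forced) else pop[i][k]
                for k in range(n_items)
            ]
            trial_score = cost(keys_to_permutation(trial))
            if trial_score <= scores[i]:  # greedy one-to-one selection
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=lambda j: scores[j])
    return keys_to_permutation(pop[best]), scores[best]


if __name__ == "__main__":
    print(differential_evolution(toy_cost))
```

In a real field-trial setting, `toy_cost` would be replaced by a model-based criterion that accounts for kinship and field effects; that criterion is outside the scope of this sketch.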

Structured Attention Networks

Yoon Kim, Carl Denton, Luong Hoang, Alexander M. Rush

Subjects: Computation and Language (cs.CL); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

Attention networks have proven to be an effective approach for embedding

categorical inference within a deep neural network. However, for many tasks we

may want to model richer structural dependencies without abandoning end-to-end

training. In this work, we experiment with incorporating richer structural

distributions, encoded using graphical models, within deep networks. We show

that these structured attention networks are simple extensions of the basic

attention procedure, and that they allow for extending attention beyond the

standard soft-selection approach, such as attending to partial segmentations or

to subtrees. We experiment with two different classes of structured attention

networks: a linear-chain conditional random field and a graph-based parsing

model, and describe how these models can be practically implemented as neural

network layers. Experiments show that this approach is effective for

incorporating structural biases, and structured attention networks outperform

baseline attention models on a variety of synthetic and real tasks: tree

transduction, neural machine translation, question answering, and natural

language inference. We further find that models trained in this way learn

interesting unsupervised hidden representations that generalize simple

attention.

Computer Vision and Pattern Recognition

Joint 2D-3D-Semantic Data for Indoor Scene Understanding

Iro Armeni, Sasha Sax, Amir R. Zamir, Silvio Savarese

Comments: The dataset is available at this http URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)

We present a dataset of large-scale indoor spaces that provides a variety of

mutually registered modalities from 2D, 2.5D and 3D domains, with

instance-level semantic and geometric annotations. The dataset covers over

6,000 m² and contains over 102,000 RGB images, along with the corresponding

depths, surface normals, semantic annotations, global XYZ images (all in forms

of both regular and 360° equirectangular images) as well as camera

information. It also includes registered raw and semantically annotated 3D

meshes and point clouds. The dataset enables development of joint and

cross-modal learning models and potentially unsupervised approaches utilizing

the regularities present in large-scale indoor spaces. The dataset is available

here: this http URL

Deep Learning with Low Precision by Half-wave Gaussian Quantization

Zhaowei Cai, Xiaodong He, Jian Sun, Nuno Vasconcelos

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Learning (cs.LG)

The problem of quantizing the activations of a deep neural network is

considered. An examination of the popular binary quantization approach shows

that this consists of approximating a classical non-linearity, the hyperbolic

tangent, by two functions: a piecewise constant sign function, which is used in

feedforward network computations, and a piecewise linear hard tanh function,

used in the backpropagation step during network learning. The problem of

approximating the ReLU non-linearity, widely used in the recent deep learning

literature, is then considered. A half-wave Gaussian quantizer (HWGQ) is

proposed for forward approximation and shown to have efficient implementation,

by exploiting the statistics of network activations and batch normalization

operations commonly used in the literature. To overcome the problem of gradient

mismatch, due to the use of different forward and backward approximations,

several piecewise backward approximators are then investigated. The

implementation of the resulting quantized network, denoted as HWGQ-Net, is

shown to achieve much closer performance to full precision networks, such as

AlexNet, ResNet, GoogLeNet and VGG-Net, than previously available low-precision

networks, with 1-bit binary weights and 2-bit quantized activations.
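
The binary quantization scheme examined at the start of this abstract can be sketched for a single scalar activation as follows. This Python snippet illustrates only the sign-forward, hard-tanh-backward pairing that the abstract describes, not the proposed HWGQ quantizer; mapping an input of exactly zero to +1 is an assumed convention, and the function names are hypothetical.

```python
def sign_forward(x):
    """Forward pass: piecewise constant sign function (binary activation)."""
    return 1.0 if x >= 0.0 else -1.0  # sign(0) := +1 is an assumed convention


def hard_tanh(x):
    """Piecewise linear surrogate for tanh: clip x to [-1, 1]."""
    return max(-1.0, min(1.0, x))


def hard_tanh_grad(x):
    """Derivative of hard tanh: 1 inside [-1, 1], 0 outside."""
    return 1.0 if -1.0 <= x <= 1.0 else 0.0


def binary_activation_backward(x, upstream_grad):
    """Backward pass: route the gradient through hard tanh instead of sign."""
    return upstream_grad * hard_tanh_grad(x)


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.5, 2.0):
        print(x, sign_forward(x), binary_activation_backward(x, 1.0))
```

Because the true derivative of the sign function is zero almost everywhere, the forward and backward passes use different functions; this is the gradient mismatch the abstract refers to.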

A method of limiting performance loss of CNNs in noisy environments

James R. Geraci, Parichay Kapoor

Subjects: Computer Vision and Pattern Recognition (cs.CV)

Convolutional Neural Network (CNN) recognition rates drop in the presence of

noise. We demonstrate a novel method of counteracting this drop in recognition

rate by adjusting the biases of the neurons in the convolutional layers

according to the noise conditions encountered at runtime. We compare our

technique to training one network for all possible noise levels, dehazing via

preprocessing a signal with a denoising autoencoder, and training a network

specifically for each noise level. Our system compares favorably in terms of

robustness, computational complexity and recognition rate.

FCSS: Fully Convolutional Self-Similarity for Dense Semantic Correspondence

Seungryong Kim, Dongbo Min, Bumsub Ham, Sangryul Jeon, Stephen Lin, Kwanghoon Sohn

Subjects: Computer Vision and Pattern Recognition (cs.CV)

We present a descriptor, called fully convolutional self-similarity (FCSS),

for dense semantic correspondence. To robustly match points among different

instances within the same object class, we formulate FCSS using local

self-similarity (LSS) within a fully convolutional network. In contrast to

existing CNN-based descriptors, FCSS is inherently insensitive to intra-class

appearance variations because of its LSS-based structure, while maintaining the

precise localization ability of deep neural networks. The sampling patterns of

local structure and the self-similarity measure are jointly learned within the

proposed network in an end-to-end and multi-scale manner. As training data for

semantic correspondence is rather limited, we propose to leverage object

candidate priors provided in existing image datasets and also correspondence

consistency between object pairs to enable weakly-supervised learning.

Experiments demonstrate that FCSS outperforms conventional handcrafted

descriptors and CNN-based descriptors on various benchmarks.

Seeded Laplacian: An Eigenfunction Solution for Scribble Based Interactive Image Segmentation

Ahmed Taha, Marwan Torki

Subjects: Computer Vision and Pattern Recognition (cs.CV)

In this paper, we cast the scribble-based interactive image segmentation as a

semi-supervised learning problem. Our novel approach alleviates the need to

solve an expensive generalized eigenvector problem by approximating the

eigenvectors using efficiently computed eigenfunctions. The smoothness operator

defined on feature densities, in the limit as n tends to infinity, recovers the

exact eigenvectors of the graph Laplacian, where n is the number of nodes in

the graph. To further reduce the computational complexity without sacrificing

our accuracy, we select pivot pixels from user annotations. In our

experiments, we evaluate our approach using both human scribble and “robot

user” annotations to guide the foreground/background segmentation. We developed

a new unbiased collection of five annotated image datasets to standardize the

evaluation procedure for any scribble-based segmentation method. We

experimented with several variations, including different feature vectors,

pivot count and the number of eigenvectors. Experiments are carried out on

datasets that contain a wide variety of natural images. We achieve better

qualitative and quantitative results compared to state-of-the-art interactive

segmentation algorithms.

Deep Learning For Video Saliency Detection

Wenguan Wang, Jianbing Shen, Ling Shao

Subjects: Computer Vision and Pattern Recognition (cs.CV)

This paper proposes a deep learning model to efficiently detect salient

regions in videos. It addresses two important issues: (1) deep video saliency

model training in the absence of sufficiently large and pixel-wise annotated

video data; and (2) fast video saliency training and detection. The proposed

deep video saliency network consists of two modules, for capturing the spatial

and temporal saliency stimuli, respectively. The dynamic saliency model,

explicitly incorporating saliency estimates from the static saliency model,

directly produces spatiotemporal saliency inference without time-consuming

optical flow computation. We further propose a novel data augmentation

technique that simulates video training data from existing annotated image

datasets, which enables our network to learn diverse saliency stimuli and

prevents overfitting with the limited number of training videos. Leveraging our

synthetic video data (150K video sequences) and real videos, our deep video

saliency model successfully learns both spatial and temporal saliency stimuli,

thus producing accurate spatiotemporal saliency estimates. We advance the

state-of-the-art on the DAVIS dataset (MAE of 0.06) and the FBMS dataset (MAE of

0.07), and do so with much improved speed (2 fps with all steps) on one GPU.
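
For readers unfamiliar with the MAE figures quoted above, mean absolute error between a predicted saliency map and a ground-truth mask is commonly computed as the average per-pixel absolute difference, with both maps scaled to [0, 1]. The Python sketch below shows that standard definition on nested lists; the paper's exact evaluation protocol (for example, how scores are averaged over frames) is not given here, so treat this as an assumption.

```python
def mean_absolute_error(pred, truth):
    """Average per-pixel absolute difference between two equally sized 2D maps."""
    if len(pred) != len(truth) or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("prediction and ground truth must have the same shape")
    total = 0.0
    count = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += abs(p - t)
            count += 1
    return total / count


if __name__ == "__main__":
    predicted = [[0.0, 1.0], [0.5, 0.5]]     # hypothetical 2x2 saliency map
    ground_truth = [[0.0, 1.0], [1.0, 0.0]]  # hypothetical binary mask
    print(mean_absolute_error(predicted, ground_truth))
```

Lower values are better, and a score of 0 means the prediction matches the mask exactly.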

YouTube-BoundingBoxes: A Large High-Precision Human-Annotated Data Set for Object Detection in Video

Esteban Real, Jonathon Shlens, Stefano Mazzocchi, Xin Pan, Vincent Vanhoucke

Subjects: Computer Vision and Pattern Recognition (cs.CV)

We introduce a new large-scale data set of video URLs with densely-sampled

object bounding box annotations called YouTube-BoundingBoxes (YT-BB). The data

set consists of approximately 380,000 video segments about 19s long,

automatically selected to feature objects in natural settings without editing

or post-processing, with a recording quality often akin to that of a hand-held

cell phone camera. The objects represent a subset of the MS COCO label set. All

video segments were human-annotated with high-precision classification labels

and bounding boxes at 1 frame per second. The use of a cascade of increasingly

precise human annotations ensures a label accuracy above 95% for every class

and tight bounding boxes. Finally, we train and evaluate well-known deep

network architectures and report baseline figures for per-frame classification

and localization to provide a point of comparison for future work. We also

demonstrate how the temporal contiguity of video can potentially be used to

improve such inferences. The data set can be found at

this https URL. We hope the availability of such a large

curated corpus will spur new advances in video object detection and tracking.

Intrinsic Grassmann Averages for Online Linear and Robust Subspace Learning

Rudrasis Chakraborty, Søren Hauberg, Baba C. Vemuri

Subjects: Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

Principal Component Analysis (PCA) is a fundamental method for estimating a

linear subspace approximation to high-dimensional data. Many algorithms exist

in the literature to achieve a statistically robust version of PCA called RPCA. In

this paper, we present a geometric framework for computing the principal linear

subspaces in both situations that amounts to computing the intrinsic average on

the space of all subspaces (the Grassmann manifold). Points on this manifold

are defined as the subspaces spanned by \(K\)-tuples of observations. We show

that the intrinsic Grassmann average of these subspaces coincides with the

principal components of the observations when they are drawn from a Gaussian

distribution. Similar results are also shown to hold for the RPCA. Further, we

propose an efficient online algorithm to do subspace averaging which is of

linear complexity in terms of number of samples and has a linear convergence

rate. When the data has outliers, our proposed online robust subspace averaging

algorithm shows significant performance (accuracy and computation time) gain

over recently published RPCA methods with publicly accessible code. We have

demonstrated competitive performance of our proposed online subspace algorithm

on one synthetic and two real data sets. Experimental results depicting

stability of our proposed method are also presented. Furthermore, on two real

outlier corrupted datasets, we present comparison experiments showing lower

reconstruction error using our online RPCA algorithm. In terms of

reconstruction error and time required, both our algorithms outperform the

competition.

Artificial Intelligence

The Value of Inferring the Internal State of Traffic Participants for Autonomous Freeway Driving

Zachary Sunberg, Christopher Ho, Mykel Kochenderfer

Subjects: Artificial Intelligence (cs.AI)

Safe interaction with human drivers is one of the primary challenges for

autonomous vehicles. In order to plan driving maneuvers effectively, the

vehicle’s control system must infer and predict how humans will behave based on

their latent internal state (e.g., intentions and aggressiveness). This

research uses a simple model for human behavior with unknown parameters that

make up the internal states of the traffic participants and presents a method

for quantifying the value of estimating these states and planning with their

uncertainty explicitly modeled. An upper performance bound is established by an

omniscient Monte Carlo Tree Search (MCTS) planner that has perfect knowledge of

the internal states. A baseline lower bound is established by planning with

MCTS assuming that all drivers have the same internal state. MCTS variants are

then used to solve a partially observable Markov decision process (POMDP) that

models the internal state uncertainty to determine whether inferring the

internal state offers an advantage over the baseline. Applying this method to a

freeway lane changing scenario reveals that there is a significant performance

gap between the upper bound and baseline. POMDP planning techniques come close

to closing this gap, especially when important hidden model parameters are

correlated with measurable parameters.

On Robustness in Multilayer Interdependent Network

Joydeep Banerjee, Chenyang Zhou, Arunabha Sen

Comments: CRITIS 2015

Subjects: Networking and Internet Architecture (cs.NI); Artificial Intelligence (cs.AI)

Critical Infrastructures like power and communication networks are highly

interdependent on each other for their full functionality. Much significant

research has been pursued to model the interdependency and failure analysis of

these interdependent networks. However, most of these models fail to capture

the complex interdependencies that might actually exist between the

infrastructures. The Implicative Interdependency Model, which utilizes

Boolean logic to capture complex interdependencies, was recently proposed to

overcome the limitations of the existing models. A number of problems were

studied based on this model. In this paper we study the Robustness

problem in Interdependent Power and Communication Network. The robustness is

defined with respect to two parameters \(K \in I^{+} \cup \{0\}\) and \(\rho \in (0,1]\).

We utilized the Implicative Interdependency Model to

capture the complex interdependency between the two networks. The model

classifies the interdependency relations into four cases. Computational

complexity of the problem is analyzed for each of these cases. A polynomial

time algorithm is designed for the first case that outputs the optimal

solution. All the other cases are proved to be NP-complete. An

inapproximability bound is provided for the third case. For the general case

we formulate an Integer Linear Program to get the optimal solution and a

polynomial time heuristic. The applicability of the heuristic is evaluated

using power and communication network data of Maricopa County, Arizona. The

experimental results showed that the heuristic almost always produced

near-optimal values of parameter \(K\) for \(\rho < 0.42\).
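
To make the idea of Boolean interdependency rules concrete, the Python sketch below propagates failures through a small two-layer network until no further entity fails. The rule format (each entity stays operational if at least one listed group of supporting entities is fully operational, that is, an OR of ANDs), the entity names, and the example rules are all assumptions for illustration; the abstract does not specify the model's exact form, and this is not the authors' algorithm or heuristic.

```python
def cascade(rules, entities, initial_failures):
    """Propagate failures until a fixed point is reached.

    rules maps an entity to a list of alternatives; each alternative is a list
    of entities that must all be operational (an OR of ANDs). Entities without
    a rule fail only if they are in initial_failures.
    """
    failed = set(initial_failures)
    changed = True
    while changed:
        changed = False
        for entity in entities:
            if entity in failed or entity not in rules:
                continue
            alive = any(all(dep not in failed for dep in term) for term in rules[entity])
            if not alive:
                failed.add(entity)
                changed = True
    return failed


# Hypothetical example: "a" entities are power nodes, "b" entities are communication nodes.
entities = ["a1", "a2", "b1", "b2", "b3"]
rules = {
    "a1": [["b1", "b2"], ["b3"]],  # a1 needs (b1 AND b2) OR b3
    "a2": [["b1"]],                # a2 needs b1
    "b1": [["a1"]],                # b1 needs a1
}

if __name__ == "__main__":
    print(sorted(cascade(rules, entities, {"b2", "b3"})))
```

In this toy network, failing only `b3` causes no further failures, while failing `b2` and `b3` together brings down every entity, which illustrates why the choice of initially failed entities matters.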