转载

arXiv Paper Daily: Tue, 6 Dec 2016

Neural and Evolutionary Computing

BrainFrame: A heterogeneous accelerator platform for neuron simulations

Georgios Smaragdos , Georgios Chatzikonstantis , Rahul Kukreja , Harrys Sidiropoulos , Dimitrios Rodopoulos , Ioannis Sourdis , Zaid Al-Ars , Christoforos Kachris , Dimitrios Soudris , Chris I. De Zeeuw , Christos Strydis

Comments: 17 pages, 19 figures, 4 tables

Subjects

:

Neural and Evolutionary Computing (cs.NE)

; Distributed, Parallel, and Cluster Computing (cs.DC)

Objective: The advent of High-Performance Computing (HPC) in recent years has

led to its increasing use in brain study through computational models. The

scale and complexity of such models are constantly increasing, leading to

challenging computational requirements. Even though modern HPC platforms can

often deal with such challenges, the vast diversity of the modeling field does

not permit for a single acceleration (or homogeneous) platform to effectively

address the complete array of modeling requirements. Approach: In this paper we

propose and build BrainFrame, a heterogeneous acceleration platform,

incorporating three distinct acceleration technologies, a Dataflow Engine, a

Xeon Phi and a GP-GPU. The PyNN framework is also integrated into the platform.

As a challenging proof of concept, we analyze the performance of BrainFrame on

different instances of a state-of-the-art neuron model, modeling the Inferior-

Olivary Nucleus using a biophysically-meaningful, extended Hodgkin-Huxley

representation. The model instances take into account not only the neuronal-

network dimensions but also different network-connectivity circumstances that

can drastically change application workload characteristics. Main results: The

synthetic approach of three HPC technologies demonstrated that BrainFrame is

better able to cope with the modeling diversity encountered. Our performance

analysis shows clearly that the model directly affect performance and all three

technologies are required to cope with all the model use cases.

Enabling Bio-Plausible Multi-level STDP using CMOS Neurons with Dendrites and Bistable RRAMs

Xinyu Wu , Vishal Saxena Subjects : Emerging Technologies (cs.ET) ; Neural and Evolutionary Computing (cs.NE)

Large-scale integration of emerging nanoscale non-volatile memory devices,

e.g. resistive random-access memory (RRAM), can enable a new generation of

neuromorphic computers that can solve a wide range of machine learning

problems. Such hybrid CMOS-RRAM neuromorphic architectures will result in

several orders of magnitude reduction in energy consumption at a very small

form factor, and herald autonomous learning machines capable of self-adapting

to their environment. However, the progress in this area has been impeded from

the realization that the actual memory devices fall well short of their

expected behavior. In this work, we discuss the challenges associated with

these memory devices and their use in neuromorphic computing circuits, and

propose pathways to overcome these limitations by introducing ‘dendritic

learning’.

Message Passing Multi-Agent GANs

Arnab Ghosh , Viveka Kulharia , Vinay Namboodiri

Comments: The first 2 authors contributed equally for this work

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

Communicating and sharing intelligence among agents is an important facet of

achieving Artificial General Intelligence. As a first step towards this

challenge, we introduce a novel framework for image generation: Message Passing

Multi-Agent Generative Adversarial Networks (MPM GANs). While GANs have

recently been shown to be very effective for image generation and other tasks,

these networks have been limited to mostly single generator-discriminator

networks. We show that we can obtain multi-agent GANs that communicate through

message passing to achieve better image generation. The objectives of the

individual agents in this framework are two fold: a co-operation objective and

a competing objective. The co-operation objective ensures that the message

sharing mechanism guides the other generator to generate better than itself

while the competing objective encourages each generator to generate better than

its counterpart. We analyze and visualize the messages that these GANs share

among themselves in various scenarios. We quantitatively show that the message

sharing formulation serves as a regularizer for the adversarial training.

Qualitatively, we show that the different generators capture different traits

of the underlying data distribution.

Known Unknowns: Uncertainty Quality in Bayesian Neural Networks

Ramon Oliveira , Pedro Tabacof , Eduardo Valle

Comments: Workshop on Bayesian Deep Learning, NIPS 2016, Barcelona, Spain

Subjects

:

Machine Learning (stat.ML)

; Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

We evaluate the uncertainty quality in neural networks using anomaly

detection. We extract uncertainty measures (e.g. entropy) from the predictions

of candidate models, use those measures as features for an anomaly detector,

and gauge how well the detector differentiates known from unknown classes. We

assign higher uncertainty quality to candidate models that lead to better

detectors. We also propose a novel method for sampling a variational

approximation of a Bayesian neural network, called One-Sample Bayesian

Approximation (OSBA). We experiment on two datasets, MNIST and CIFAR10. We

compare the following candidate neural network models: Maximum Likelihood,

Bayesian Dropout, OSBA, and — for MNIST — the standard variational

approximation. We show that Bayesian Dropout and OSBA provide better

uncertainty information than Maximum Likelihood, and are essentially equivalent

to the standard variational approximation, but much faster.

Semi-supervised learning of deep metrics for stereo reconstruction

Stepan Tulyakov , Anton Ivanov , Francois Fleuret

Comments: 11 pages, 3 figures

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Neural and Evolutionary Computing (cs.NE)

Deep-learning metrics have recently demonstrated extremely good performance

to match image patches for stereo reconstruction. However, training such

metrics requires large amount of labeled stereo images, which can be difficult

or costly to collect for certain applications. The main contribution of our

work is a new semi-supervised method for learning deep metrics from unlabeled

stereo images, given coarse information about the scenes and the optical

system. Our method alternatively optimizes the metric with a standard

stochastic gradient descent, and applies stereo constraints to regularize its

prediction. Experiments on reference data-sets show that, for a given network

architecture, training with this new method without ground-truth produces a

metric with performance as good as state-of-the-art baselines trained with the

said ground-truth. This work has three practical implications. Firstly, it

helps to overcome limitations of training sets, in particular noisy ground

truth. Secondly it allows to use much more training data during learning.

Thirdly, it allows to tune deep metric for a particular stereo system, even if

ground truth is not available.

Positive blood culture detection in time series data using a BiLSTM network

Leen De Baets , Joeri Ruyssinck , Thomas Peiffer , Johan Decruyenaere , Filip De Turck , Femke Ongenae , Tom Dhaene Subjects : Learning (cs.LG) ; Neural and Evolutionary Computing (cs.NE); Quantitative Methods (q-bio.QM); Machine Learning (stat.ML)

The presence of bacteria or fungi in the bloodstream of patients is abnormal

and can lead to life-threatening conditions. A computational model based on a

bidirectional long short-term memory artificial neural network, is explored to

assist doctors in the intensive care unit to predict whether examination of

blood cultures of patients will return positive. As input it uses nine

monitored clinical parameters, presented as time series data, collected from

2177 ICU admissions at the Ghent University Hospital. Our main goal is to

determine if general machine learning methods and more specific, temporal

models, can be used to create an early detection system. This preliminary

research obtains an area of 71.95% under the precision recall curve, proving

the potential of temporal neural networks in this context.

Parameter Compression of Recurrent Neural Networks and Degredation of Short-term Memory

Jonathan A. Cox

Comments: Submitted to IJCNN 2017

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

The significant computational costs of deploying neural networks in

large-scale or resource constrained environments, such as data centers and

mobile devices, has spurred interest in model compression, which can achieve a

reduction in both arithmetic operations and storage memory. Several techniques

have been proposed for reducing or compressing the parameters for feed-forward

and convolutional neural networks, but less is understood about the effect of

parameter compression on recurrent neural networks (RNN). In particular, the

extent to which the recurrent parameters can be compressed and the impact on

short-term memory performance, is not well understood. In this paper, we study

the effect of complexity reduction, through singular value decomposition rank

reduction, on RNN and minimal gated recurrent unit (MGRU) networks for several

tasks. We show that considerable rank reduction is possible when compressing

recurrent weights, even without fine tuning. Furthermore, we propose a

perturbation model for the effect of general perturbations, such as a

compression, on the recurrent parameters of RNNs. The model is tested against a

noiseless memorization experiment that elucidates the short-term memory

performance. In this way, we demonstrate that the effect of compression of

recurrent parameters is dependent on the degree of temporal coherence present

in the data and task. This work can guide on-the-fly RNN compression for novel

environments or tasks, and provides insight for applying RNN compression in

low-power devices, such as hearing aids.

Computer Vision and Pattern Recognition

ROAM: a Rich Object Appearance Model with Application to Rotoscoping

Ondrej Miksik , Juan-Manuel Pérez-Rúa , Philip H. S. Torr , Patrick Pérez Subjects : Computer Vision and Pattern Recognition (cs.CV)

Rotoscoping, the detailed delineation of scene elements through a video shot,

is a painstaking task of tremendous importance in professional post-production

pipelines. While pixel-wise segmentation techniques can help for this task,

professional rotoscoping tools rely on parametric curves that offer the artists

a much better interactive control on the definition, editing and manipulation

of the segments of interest. Sticking to this prevalent rotoscoping paradigm,

we propose a novel framework to capture and track the visual aspect of an

arbitrary object in a scene, given a first closed outline of this object. This

model combines a collection of local foreground/background appearance models

spread along the outline, a global appearance model of the enclosed object and

a set of distinctive foreground landmarks. The structure of this rich

appearance model allows simple initialization, efficient iterative optimization

with exact minimization at each step, and on-line adaptation in videos. We

demonstrate qualitatively and quantitatively the merit of this framework

through comparisons with tools based on either dynamic segmentation with a

closed curve or pixel-wise binary labelling.

Authoring image decompositions with generative models

Jason Rock , Theerasit Issaranon , Aditya Deshpande , David Forsyth Subjects : Computer Vision and Pattern Recognition (cs.CV)

We show how to extend traditional intrinsic image decompositions to

incorporate further layers above albedo and shading. It is hard to obtain data

to learn a multi-layer decomposition. Instead, we can learn to decompose an

image into layers that are “like this” by authoring generative models for each

layer using proxy examples that capture the Platonic ideal (Mondrian images for

albedo; rendered 3D primitives for shading; material swatches for shading

detail). Our method then generates image layers, one from each model, that

explain the image. Our approach rests on innovation in generative models for

images. We introduce a Convolutional Variational Auto Encoder (conv-VAE), a

novel VAE architecture that can reconstruct high fidelity images. The approach

is general, and does not require that layers admit a physical interpretation.

Articulated Multi-person Tracking in the Wild

Eldar Insafutdinov , Mykhaylo Andriluka , Leonid Pishchulin , Siyu Tang , Bjoern Andres , Bernt Schiele Subjects : Computer Vision and Pattern Recognition (cs.CV)

In this paper we propose an approach for articulated tracking of multiple

people in unconstrained videos. Our starting point is a model that resembles

existing architectures for single-frame pose estimation but is several orders

of magnitude faster. We achieve this in two ways: (1) by simplifying and

sparsifying the body-part relationship graph and leveraging recent methods for

faster inference, and (2) by offloading a substantial share of computation onto

a feed-forward convolutional architecture that is able to detect and associate

body joints of the same person even in clutter. We use this model to generate

proposals for body joint locations and formulate articulated tracking as

spatio-temporal grouping of such proposals. This allows to jointly solve the

association problem for all people in the scene by propagating evidence from

strong detections through time and enforcing constraints that each proposal can

be assigned to one person only. We report results on a public MPII Human Pose

benchmark and on a new dataset of videos with multiple people. We demonstrate

that our model achieves state-of-the-art results while using only a fraction of

time and is able to leverage temporal information to improve state-of-the-art

for crowded scenes.

ImageNet pre-trained models with batch normalization

Marcel Simon , Erik Rodner , Joachim Denzler Subjects : Computer Vision and Pattern Recognition (cs.CV)

Convolutional neural networks (CNN) pre-trained on ImageNet are the backbone

of most state-of-the-art approaches. In this paper, we present a new set of

pre-trained models with popular state-of-the-art architectures for the Caffe

framework. The first release includes Residual Networks (ResNets) with

generation script as well as the batch-normalization-variants of AlexNet and

VGG19. All models outperform previous models with the same architecture. The

models and training code are available at

this http URL and

this https URL

A Distance Function for Comparing Straight-Edge Geometric Figures

Apoorva Honnegowda Roopa , Shrisha Rao

Comments: 29 pages, 12 figures including appendices

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Computational Geometry (cs.CG)

This paper defines a distance function that measures the dissimilarity

between planar geometric figures formed with straight lines. This function can

in turn be used in partial matching of different geometric figures. For a given

pair of geometric figures that are graphically isomorphic, one function

measures the angular dissimilarity and another function measures the edge

length disproportionality. The distance function is then defined as the convex

sum of these two functions. The novelty of the presented function is that it

satisfies all properties of a distance function and the computation of the same

is done by projecting appropriate features to a cartesian plane. To compute the

deviation from the angular similarity property, the Euclidean distance between

the given angular pairs and the corresponding points on the (y=x) line is

measured. Further while computing the deviation from the edge length

proportionality property, the best fit line, for the set of edge lengths, which

passes through the origin is found, and the Euclidean distance between the

given edge length pairs and the corresponding point on a (y=mx) line is

calculated. Iterative Proportional Fitting Procedure (IPFP) is used to find

this best fit line. We demonstrate the behavior of the defined function for

some sample pairs of figures.

From One-Trick Ponies to All-Rounders: On-Demand Learning for Image Restoration

Ruohan Gao , Kristen Grauman Subjects : Computer Vision and Pattern Recognition (cs.CV)

While machine learning approaches to image restoration offer great promise,

current methods risk training “one-trick ponies” that perform well only for

image corruption of a particular level of difficulty—such as a certain level

of noise or blur. First, we expose the weakness of today’s one-trick pony and

demonstrate that training general models equipped to handle arbitrary levels of

corruption is indeed non-trivial. Then, we propose an on-demand learning

algorithm for training image restoration models with deep convolutional neural

networks. The main idea is to exploit a feedback mechanism to self-generate

training instances where they are needed most, thereby learning models that can

generalize across difficulty levels. On four restoration tasks—image

inpainting, pixel interpolation, image deblurring, and image denoising—and

three diverse datasets, our approach consistently outperforms both the status

quo training procedure and curriculum learning alternatives.

Human-In-The-Loop Person Re-Identification

Hanxiao Wang , Shaogang Gong , Xiatian Zhu , Tao Xiang Subjects : Computer Vision and Pattern Recognition (cs.CV)

Current person re-identification (re-id) methods assume that (1) pre-labelled

training data are available for every camera pair, (2) the gallery size for

re-identification is moderate. Both assumptions scale poorly to real-world

applications when camera network size increases and gallery size becomes large.

Human verification of automatic model ranked re-id results becomes inevitable.

In this work, a novel human-in-the-loop re-id model based on Human Verification

Incremental Learning (HVIL) is formulated which does not require any

pre-labelled training data to learn a model, therefore readily scalable to new

camera pairs. This HVIL model learns cumulatively from human feedback to

provide instant improvement to re-id ranking of each probe on-the-fly enabling

the model scalable to large gallery sizes. We further formulate a Regularised

Metric Ensemble Learning (RMEL) model to combine a series of incrementally

learned HVIL models into a single ensemble model to be used when human feedback

becomes unavailable.

Highly Efficient Regression for Scalable Person Re-Identification

Hanxiao Wang , Shaogang Gong , Tao Xiang Subjects : Computer Vision and Pattern Recognition (cs.CV)

Existing person re-identification models are poor for scaling up to large

data required in real-world applications due to: (1) Complexity: They employ

complex models for optimal performance resulting in high computational cost for

training at a large scale; (2) Inadaptability: Once trained, they are

unsuitable for incremental update to incorporate any new data available. This

work proposes a truly scalable solution to re-id by addressing both problems.

Specifically, a Highly Efficient Regression (HER) model is formulated by

embedding the Fisher’s criterion to a ridge regression model for very fast

re-id model learning with scalable memory/storage usage. Importantly, this new

HER model supports faster than real-time incremental model updates therefore

making real-time active learning feasible in re-id with human-in-the-loop.

Extensive experiments show that such a simple and fast model not only

outperforms notably the state-of-the-art re-id methods, but also is more

scalable to large data with additional benefits to active learning for reducing

human labelling effort in re-id deployment.

Classification With an Edge: Improving Semantic Image Segmentation with Boundary Detection

Dimitrios Marmanis , Konrad Schindler , Jan Dirk Wegner , Silvano Galliani , Mihai Datcu , Uwe Stilla Subjects : Computer Vision and Pattern Recognition (cs.CV)

We present an end-to-end trainable deep convolutional neural network (DCNN)

for semantic segmentation with built-in awareness of semantically meaningful

boundaries. Semantic segmentation is a fundamental remote sensing task, and

most state-of-the-art methods rely on DCNNs as their workhorse. A major reason

for their success is that deep networks learn to accumulate contextual

information over very large windows (receptive fields). However, this success

comes at a cost, since the associated loss of effecive spatial resolution

washes out high-frequency details and leads to blurry object boundaries. Here,

we propose to counter this effect by combining semantic segmentation with

semantically informed edge detection, thus making class-boundaries explicit in

the model, First, we construct a comparatively simple, memory-efficient model

by adding boundary detection to the Segnet encoder-decoder architecture.

Second, we also include boundary detection in FCN-type models and set up a

high-end classifier ensemble. We show that boundary detection significantly

improves semantic segmentation with CNNs. Our high-end ensemble achieves > 90%

overall accuracy on the ISPRS Vaihingen benchmark.

Stereo image de-fencing using smartphones

Sankaraganesh Jonna , Sukla Satapathy , Rajiv R. Sahay

Comments: Under review as a conference paper

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

Conventional approaches to image de-fencing have limited themselves to using

only image data in adjacent frames of the captured video of an approximately

static scene. In this work, we present a method to harness disparity using a

stereo pair of fenced images in order to detect fence pixels. Tourists and

amateur photographers commonly carry smartphones/phablets which can be used to

capture a short video sequence of the fenced scene. We model the formation of

the occluded frames in the captured video. Furthermore, we propose an

optimization framework to estimate the de-fenced image using the total

variation prior to regularize the ill-posed problem.

Message Passing Multi-Agent GANs

Arnab Ghosh , Viveka Kulharia , Vinay Namboodiri

Comments: The first 2 authors contributed equally for this work

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

Communicating and sharing intelligence among agents is an important facet of

achieving Artificial General Intelligence. As a first step towards this

challenge, we introduce a novel framework for image generation: Message Passing

Multi-Agent Generative Adversarial Networks (MPM GANs). While GANs have

recently been shown to be very effective for image generation and other tasks,

these networks have been limited to mostly single generator-discriminator

networks. We show that we can obtain multi-agent GANs that communicate through

message passing to achieve better image generation. The objectives of the

individual agents in this framework are two fold: a co-operation objective and

a competing objective. The co-operation objective ensures that the message

sharing mechanism guides the other generator to generate better than itself

while the competing objective encourages each generator to generate better than

its counterpart. We analyze and visualize the messages that these GANs share

among themselves in various scenarios. We quantitatively show that the message

sharing formulation serves as a regularizer for the adversarial training.

Qualitatively, we show that the different generators capture different traits

of the underlying data distribution.

Point Pair Feature based Object Detection for Random Bin Picking

Wim Abbeloos , Toon Goedemé Subjects : Computer Vision and Pattern Recognition (cs.CV)

Point pair features are a popular representation for free form 3D object

detection and pose estimation. In this paper, their performance in an

industrial random bin picking context is investigated. A new method to generate

representative synthetic datasets is proposed. This allows to investigate the

influence of a high degree of clutter and the presence of self similar

features, which are typical to our application. We provide an overview of

solutions proposed in literature and discuss their strengths and weaknesses. A

simple heuristic method to drastically reduce the computational complexity is

introduced, which results in improved robustness, speed and accuracy compared

to the naive approach.

Panoramic Structure from Motion via Geometric Relationship Detection

Satoshi Ikehata , Ivaylo Boyadzhiev , Qi Shan , Yasutaka Furukawa Subjects : Computer Vision and Pattern Recognition (cs.CV)

This paper addresses the problem of Structure from Motion (SfM) for indoor

panoramic image streams, extremely challenging even for the state-of-the-art

due to the lack of textures and minimal parallax. The key idea is the fusion of

single-view and multi-view reconstruction techniques via geometric relationship

detection (e.g., detecting 2D lines as coplanar in 3D). Rough geometry suffices

to perform such detection, and our approach utilizes rough surface normal

estimates from an image-to-normal deep network to discover geometric

relationships among lines. The detected relationships provide exact geometric

constraints in our line-based linear SfM formulation. A constrained linear

least squares is used to reconstruct a 3D model and camera motions, followed by

the bundle adjustment. We have validated our algorithm on challenging datasets,

outperforming various state-of-the-art reconstruction techniques.

Deep Image Category Discovery using a Transferred Similarity Function

Yen-Chang Hsu , Zhaoyang Lv , Zsolt Kira

Comments: 13 pages, 9 figures

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Learning (cs.LG)

Automatically discovering image categories in unlabeled natural images is one

of the important goals of unsupervised learning. However, the task is

challenging and even human beings define visual categories based on a large

amount of prior knowledge. In this paper, we similarly utilize prior knowledge

to facilitate the discovery of image categories. We present a novel end-to-end

network to map unlabeled images to categories as a clustering network. We

propose that this network can be learned with contrastive loss which is only

based on weak binary pair-wise constraints. Such binary constraints can be

learned from datasets in other domains as transferred similarity functions,

which mimic a simple knowledge transfer. We first evaluate our experiments on

the MNIST dataset as a proof of concept, based on predicted similarities

trained on Omniglot, showing a 99/% accuracy which significantly outperforms

clustering based approaches. Then we evaluate the discovery performance on

Cifar-10, STL-10, and ImageNet, which achieves both state-of-the-art accuracy

and shows it can be scalable to various large natural images.

Cancerous Nuclei Detection and Scoring in Breast Cancer Histopathological Images

Pegah Faridi , Habibollah Danyali , Mohammad Sadegh Helfroush , Mojgan Akbarzadeh Jahromi Subjects : Computer Vision and Pattern Recognition (cs.CV)

Early detection and prognosis of breast cancer are feasible by utilizing

histopathological grading of biopsy specimens. This research is focused on

detection and grading of nuclear pleomorphism in histopathological images of

breast cancer. The proposed method consists of three internal steps. First,

unmixing colors of H&E is used in the preprocessing step. Second, nuclei

boundaries are extracted incorporating the center of cancerous nuclei which are

detected by applying morphological operations and Difference of Gaussian filter

on the preprocessed image. Finally, segmented nuclei are scored to accomplish

one parameter of the Nottingham grading system for breast cancer. In this

approach, the nuclei area, chromatin density, contour regularity, and nucleoli

presence, are features for nuclear pleomorphism scoring. Experimental results

showed that the proposed algorithm, with an accuracy of 86.6%, made significant

advancement in detecting cancerous nuclei compared to existing methods in the

related literature.

Turning an Urban Scene Video into a Cinemagraph

Hang Yan , Yebin Liu , Yasutaka Furukawa Subjects : Computer Vision and Pattern Recognition (cs.CV)

This paper proposes an algorithm that turns a regular video capturing urban

scenes into a high-quality endless animation, known as a Cinemagraph. The

creation of a Cinemagraph usually requires a static camera in a carefully

configured scene. The task becomes challenging for a regular video with a

moving camera and objects. Our approach first warps an input video into the

viewpoint of a reference camera. Based on the warped video, we propose

effective temporal analysis algorithms to detect regions with static geometry

and dynamic appearance, where geometric modeling is reliable and visually

attractive animations can be created. Lastly, the algorithm applies a sequence

of video processing techniques to produce a Cinemagraph movie. We have tested

the proposed approach on numerous challenging real scenes. To our knowledge,

this work is the first to automatically generate Cinemagraph animations from

regular movies in the wild.

Multi-way Particle Swarm Fusion

Chen Liu , Hang Yan , Pushmeet Kohli , Yasutaka Furukawa Subjects : Computer Vision and Pattern Recognition (cs.CV)

This paper proposes a novel MAP inference framework for Markov Random Field

(MRF) in parallel computing environments. The inference framework, dubbed Swarm

Fusion, is a natural generalization of the Fusion Move method. Every thread (in

a case of multi-threading environments) maintains and updates a solution. At

each iteration, a thread can generate arbitrary number of solution proposals

and take arbitrary number of concurrent solutions from the other threads to

perform multi-way fusion in updating its solution. The framework is general,

making popular existing inference techniques such as alpha-expansion, fusion

move, parallel alpha-expansion, and hierarchical fusion, its special cases. We

have evaluated the effectiveness of our approach against competing methods on

three problems of varying difficulties, in particular, the stereo, the optical

flow, and the layered depthmap estimation problems.

Deep Pyramidal Residual Networks with Separated Stochastic Depth

Yoshihiro Yamada , Masakazu Iwamura , Koichi Kise Subjects : Computer Vision and Pattern Recognition (cs.CV)

On general object recognition, Deep Convolutional Neural Networks (DCNNs)

achieve high accuracy. In particular, ResNet and its improvements have broken

the lowest error rate records. In this paper, we propose a method to

successfully combine two ResNet improvements, ResDrop and PyramidNet. We

confirmed that the proposed network outperformed the conventional methods; on

CIFAR-100, the proposed network achieved an error rate of 16.18% in contrast to

PiramidNet achieving that of 18.29% and ResNeXt 17.31%.

Local Blur Mapping: Exploiting High-Level Semantics by Deep Neural Networks

Kede Ma , Huan Fu , Tongliang Liu , Zhou Wang , Dacheng Tao

Comments: Tech report

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

The human visual system excels at detecting local blur of visual images, but

the underlying mechanism is mysterious. Traditional views of blur such as

reduction in local or global high-frequency energy and loss of local phase

coherence have fundamental limitations. For example, they cannot well

discriminate flat regions from blurred ones. Here we argue that high-level

semantic information is critical in successfully detecting local blur.

Therefore, we resort to deep neural networks that are proficient in learning

high-level features and propose the first end-to-end local blur mapping

algorithm based on a fully convolutional network (FCN). We empirically show

that high-level features of deeper layers indeed play a more important role

than low-level features of shallower layers in resolving challenging

ambiguities for this task. We test the proposed method on a standard blur

detection benchmark and demonstrate that it significantly advances the

state-of-the-art (ODS F-score of 0.853). In addition, we explore the use of the

generated blur map in three applications, including blur region segmentation,

blur degree estimation, and blur magnification.

Deep Multi-Modal Image Correspondence Learning

Chen Liu , Jiajun Wu , Pushmeet Kohli , Yasutaka Furukawa Subjects : Computer Vision and Pattern Recognition (cs.CV)

Inference of correspondences between images from different modalities is an

extremely important perceptual ability that enables humans to understand and

recognize cross-modal concepts. In this paper, we consider an instance of this

problem that involves matching photographs of building interiors with their

corresponding floorplan. This is a particularly challenging problem because a

floorplan, as a stylized architectural drawing, is very different in appearance

from a color photograph. Furthermore, individual photographs by themselves

depict only a part of a floorplan (e.g., kitchen, bathroom, and living room).

We propose the use of a number of different neural network architectures for

this task, which are trained and evaluated on a novel large-scale dataset of 5

million floorplan images and 80 million associated photographs. Experimental

evaluation reveals that our neural network architectures are able to identify

visual cues that result in reliable matches across these two quite different

modalities. In fact, the trained networks are able to even outperform human

subjects in several challenging image matching problems. Our result implies

that neural networks are effective at perceptual tasks that require long

periods of reasoning even for humans to solve.

Learnable Structured Clustering Framework for Deep Metric Learning

Hyun Oh Song , Stefanie Jegelka , Vivek Rathod , Kevin Murphy Subjects : Computer Vision and Pattern Recognition (cs.CV) ; Learning (cs.LG)

Learning the representation and the similarity metric in an end-to-end

fashion with deep networks have demonstrated outstanding results for clustering

and retrieval. However, these recent approaches still suffer from the

performance degradation stemming from the local metric training procedure which

is unaware of the global structure of the embedding space.

We propose a global metric learning scheme for optimizing the deep metric

embedding with the learnable clustering function and the clustering metric

(NMI) in a novel structured prediction framework.

Our experiments on CUB200-2011, Cars196, and Stanford online products

datasets show state of the art performance both on the clustering and retrieval

tasks measured in the NMI and Recall@K evaluation metrics.

DenseReg: Fully Convolutional Dense Shape Regression In-the-Wild

Rıza Alp Güler , George Trigeorgis , Epameinondas Antonakos , Patrick Snape , Stefanos Zafeiriou , Iasonas Kokkinos Subjects : Computer Vision and Pattern Recognition (cs.CV)

In this paper we propose to learn a mapping from image pixels into a dense

template grid through a fully convolutional network. We formulate this task as

a regression problem and train our network by leveraging upon manually

annotated facial landmarks “in-the-wild”. We use such landmarks to establish a

dense correspondence field between a three-dimensional object template and the

input image, which then serves as the ground-truth for training our regression

system. We show that we can combine ideas from semantic segmentation with

regression networks, yielding a highly-accurate “quantized regression”

architecture.

Our system, called DenseReg, allows us to estimate dense image-to-template

correspondences in a fully convolutional manner. As such our network can

provide useful correspondence information as a stand-alone system, while when

used as an initialization for Statistical Deformable Models we obtain landmark

localization results that largely outperform the current state-of-the-art on

the challenging 300W benchmark. We thoroughly evaluate our method on a host of

facial analysis tasks, and also demonstrate its use for other correspondence

estimation tasks, such as modelling of the human ear. DenseReg code is made

available at this http URL along with supplementary

materials.

Online Localization and Prediction of Actions and Interactions

Khurram Soomro , Haroon Idrees , Mubarak Shah Subjects : Computer Vision and Pattern Recognition (cs.CV)

This paper proposes a person-centric and online approach to the challenging

problem of localization and prediction of actions and interactions in videos.

Typically, localization or recognition is performed in an offline manner where

all the frames in the video are processed together. This prevents timely

localization and prediction of actions and interactions – an important

consideration for many tasks including surveillance and human-machine

interaction.

In our approach, we estimate human poses at each frame and train

discriminative appearance models using the superpixels inside the pose bounding

boxes. Since the pose estimation per frame is inherently noisy, the conditional

probability of pose hypotheses at current time-step (frame) is computed using

pose estimations in the current frame and their consistency with poses in the

previous frames. Next, both the superpixel and pose-based foreground

likelihoods are used to infer the location of actors at each time through a

Conditional Random. The issue of visual drift is handled by updating the

appearance models, and refining poses using motion smoothness on joint

locations, in an online manner. For online prediction of action (interaction)

confidences, we propose an approach based on Structural SVM that operates on

short video segments, and is trained with the objective that confidence of an

action or interaction increases as time progresses. Lastly, we quantify the

performance of both detection and prediction together, and analyze how the

prediction accuracy varies as a time function of observed action (interaction)

at different levels of detection performance. Our experiments on several

datasets suggest that despite using only a few frames to localize actions

(interactions) at each time instant, we are able to obtain competitive results

to state-of-the-art offline methods.

Who is Mistaken?

Benjamin Eysenbach , Carl Vondrick , Antonio Torralba

Comments: See project website at: this http URL

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

Recognizing when people have false beliefs is crucial for understanding their

actions. We introduce the novel problem of identifying when people in abstract

scenes have incorrect beliefs. We present a dataset of scenes, each visually

depicting an 8-frame story in which a character has a mistaken belief. We then

create a representation of characters’ beliefs for two tasks in human action

understanding: predicting who is mistaken, and when they are mistaken.

Experiments suggest that our method for identifying mistaken characters

performs better on these tasks than simple baselines. Diagnostics on our model

suggest it learns important cues for recognizing mistaken beliefs, such as

gaze. We believe models of people’s beliefs will have many applications in

action understanding, robotics, and healthcare.

General models for rational cameras and the case of two-slit projections

Matthew Trager , Bernd Sturmfels , John Canny , Martial Hebert , Jean Ponce

Comments: 9 pages

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

The rational camera model recently introduced in [18] provides a general

methodology for studying abstract nonlinear imaging systems and their

multi-view geometry. This paper provides a concrete embedding of rational

cameras explicitly accounting for the mapping between physical visual rays and

image points, missing in the original model. This allows us to derive simple

but general analytical expressions for direct and inverse projections, and

define primitive rational cameras equivalent under the action of various

projective transformations, leading to a generalized notion of intrinsic

coordinates in this setting. The methodology is general, but it is illustrated

concretely by an in-depth study of two-slit cameras, which we describe using a

pair of linear projections. This simple analytical form allows us to describe

models for the corresponding primitive cameras, to introduce intrinsic

parameters with a clear geometric meaning, and to define an epipolar tensor

characterizing two-view correspondences. In turn, this leads to new algorithms

for structure from motion and self-calibration.

A method for the segmentation of images based on thresholding and applied to vesicular textures

Amelia Carolina Sparavigna

Comments: Keywords: Segmentation, Edge Detection, Image Analysis, 2D Textures, Texture Functions

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

In image processing, a segmentation is a process of partitioning an image

into multiple sets of pixels, that are defined as super-pixels. Each

super-pixel is characterized by a label or parameter. Here, we are proposing a

method for determining the super-pixels based on the thresholding of the image.

This approach is quite useful for studying the images showing vesicular

textures.

Pyramid Scene Parsing Network

Hengshuang Zhao , Jianping Shi , Xiaojuan Qi , Xiaogang Wang , Jiaya Jia Subjects : Computer Vision and Pattern Recognition (cs.CV)

Scene parsing is challenging for unrestricted open vocabulary and diverse

scenes. In this paper, we exploit the capability of global context information

by different-region-based context aggregation through our pyramid pooling

module together with the proposed pyramid scene parsing network (PSPNet). Our

global prior representation is effective to produce good quality results on the

scene parsing task, while PSPNet provides a superior framework for pixel-level

prediction tasks. The proposed approach achieves state-of-the-art performance

on various datasets. It came first in ImageNet scene parsing challenge 2016,

PASCAL VOC 2012 benchmark and Cityscapes benchmark. A single PSPNet yields new

record of mIoU score as 85.4% on PASCAL VOC 2012 and 80.2% on Cityscapes.

Multi-Label Image Classification with Regional Latent Semantic Dependencies

Junjie Zhang , Qi Wu , Chunhua Shen , Jian Zhang , Jianfeng Lu Subjects : Computer Vision and Pattern Recognition (cs.CV)

Recent state-of-the-art approaches of multi-label image classification

exploit the label dependencies at the global level, largely improving the

labeling ability. However, accurately predicting small objects and visual

concepts is still challenging due to the limited discrimination of the global

visual features. In this paper, we propose a Regional Latent Semantic

Dependencies model (RLSD) to address this problem. The utilized model includes

a fully convolutional localization architecture to localize the regions that

may contain multiple highly-dependent labels. The localized regions are further

sent to the recurrent neural networks (RNN) to characterize the latent semantic

dependencies at the regional level. Experimental results on several benchmark

datasets show that our proposed model achieves the best performance compared to

the state-of-the-art models, especially for predicting small objects occurred

in the images. In addition, we set up an upper bound model (RLSD+ft-RPN) using

bounding box coordinates during training, the experimental results also show

that our RLSD can approach the upper bound without using the bounding-box

annotations, which is more realistic in the real world.

End-to-end Learning of Driving Models from Large-scale Video Datasets

Huazhe Xu , Yang Gao , Fisher Yu , Trevor Darrell Subjects : Computer Vision and Pattern Recognition (cs.CV)

Robust perception-action models should be learned from training data with

diverse visual appearances and realistic behaviors, yet current approaches to

deep visuomotor policy learning have been generally limited to in-situ models

learned from a single vehicle or a simulation environment. We advocate learning

a generic vehicle motion model from large scale crowd-sourced video data, and

develop an end-to-end trainable architecture for learning to predict a

distribution over future vehicle egomotion from instantaneous monocular camera

observations and previous vehicle state. Our model incorporates a novel

FCN-LSTM architecture, which can be learned from large-scale crowd-sourced

vehicle action data, and leverages available scene segmentation side tasks to

improve performance under a privileged learning paradigm.

Joint Visual Denoising and Classification using Deep Learning

Gang Chen , Yawei Li , Sargur N. Srihari

Comments: 5 pages, 7 figures, ICIP 2016

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

Visual restoration and recognition are traditionally addressed in pipeline

fashion, i.e. denoising followed by classification. Instead, observing

correlations between the two tasks, for example clearer image will lead to

better categorization and vice visa, we propose a joint framework for visual

restoration and recognition for handwritten images, inspired by advances in

deep autoencoder and multi-modality learning. Our model is a 3-pathway deep

architecture with a hidden-layer representation which is shared by multi-inputs

and outputs, and each branch can be composed of a multi-layer deep model. Thus,

visual restoration and classification can be unified using shared

representation via non-linear mapping, and model parameters can be learnt via

backpropagation. Using MNIST and USPS data corrupted with structured noise, the

proposed framework performs at least 20/% better in classification than

separate pipelines, as well as clearer recovered images. The noise model and

the reproducible source code is available at

{url{ this https URL }}.

Skin Cancer Detection and Tracking using Data Synthesis and Deep Learning

Yunzhu Li , Andre Esteva , Brett Kuprel , Rob Novoa , Justin Ko , Sebastian Thrun

Comments: 4 pages, 5 figures, Yunzhu Li and Andre Esteva contributed equally to this work

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

Dense object detection and temporal tracking are needed across applications

domains ranging from people-tracking to analysis of satellite imagery over

time. The detection and tracking of malignant skin cancers and benign moles

poses a particularly challenging problem due to the general uniformity of large

skin patches, the fact that skin lesions vary little in their appearance, and

the relatively small amount of data available. Here we introduce a novel data

synthesis technique that merges images of individual skin lesions with

full-body images and heavily augments them to generate significant amounts of

data. We build a convolutional neural network (CNN) based system, trained on

this synthetic data, and demonstrate superior performance to traditional

detection and tracking techniques. Additionally, we compare our system to

humans trained with simple criteria. Our system is intended for potential

clinical use to augment the capabilities of healthcare providers. While

domain-specific, we believe the methods invoked in this work will be useful in

applying CNNs across domains that suffer from limited data availability.

Word Recognition with Deep Conditional Random Fields

Gang Chen , Yawei Li , Sargur N. Srihari

Comments: 5 pages, published in ICIP 2016. arXiv admin note: substantial text overlap with arXiv:1412.3397

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

Recognition of handwritten words continues to be an important problem in

document analysis and recognition. Existing approaches extract hand-engineered

features from word images–which can perform poorly with new data sets.

Recently, deep learning has attracted great attention because of the ability to

learn features from raw data. Moreover they have yielded state-of-the-art

results in classification tasks including character recognition and scene

recognition. On the other hand, word recognition is a sequential problem where

we need to model the correlation between characters. In this paper, we propose

using deep Conditional Random Fields (deep CRFs) for word recognition.

Basically, we combine CRFs with deep learning, in which deep features are

learned and sequences are labeled in a unified framework. We pre-train the deep

structure with stacked restricted Boltzmann machines (RBMs) for feature

learning and optimize the entire network with an online learning algorithm. The

proposed model was evaluated on two datasets, and seen to perform significantly

better than competitive baseline models. The source code is available at

this https URL

Learning to Segment Object Proposals via Recursive Neural Networks

Tianshui Chen , Liang Lin , Xian Wu , Xiaonan Luo

Comments: 12 pages

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

To avoid the exhaustive search over locations and scales, current

state-of-the-art object detection systems usually involve a crucial component

generating a batch of candidate object proposals from images. In this paper, we

present a simple yet effective approach for segmenting object proposals via a

deep architecture of recursive neural networks (RNNs), which hierarchically

groups regions for detecting object candidates over scales. Unlike traditional

methods that mainly adopt fixed similarity measures for merging regions or

finding object proposals, our approach adaptively learns the region merging

similarity and the objectness measure during the process of hierarchical region

grouping. Specifically, guided by a structured loss, the RNN model jointly

optimizes the cross-region similarity metric with the region merging process as

well as the objectness prediction. During inference of the object proposal

generation, we introduce randomness into the greedy search to cope with the

ambiguity of grouping regions. Extensive experiments on standard benchmarks,

e.g., PASCAL VOC and ImageNet, suggest that our approach is capable of

producing object proposals with high recall while well preserving the object

boundaries and outperforms other existing methods in both accuracy and

efficiency.

SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving

Bichen Wu , Forrest Iandola , Peter H. Jin , Kurt Keutzer

Comments: The supplementary material of this paper, which discusses the energy efficiency of SqueezeDet, is attached after the main paper

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

Object detection is a crucial task for autonomous driving. In addition to

requiring high accuracy to ensure safety, object detection for autonomous

driving also requires real-time inference speed to guarantee prompt vehicle

control, as well as small model size and energy efficiency to enable embedded

system deployment. In this work, we propose SqueezeDet, a fully convolutional

neural network for object detection that aims to simultaneously satisfy all of

the above constraints. In our network we use convolutional layers not only to

extract feature maps, but also as the output layer to compute bounding boxes

and class probabilities. The detection pipeline of our model only contains a

single forward pass of a neural network, thus it is extremely fast. Our model

is fully-convolutional, which leads to small model size and better energy

efficiency. Finally, our experiments show that our model is very accurate,

achieving state-of-the-art accuracy on the KITTI benchmark.

Semi-Automated Annotation of Discrete States in Large Video Datasets

Lex Fridman , Bryan Reimer

Comments: To be presented at AAAI 2017. arXiv admin note: text overlap with arXiv:1508.04028

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

We propose a framework for semi-automated annotation of video frames where

the video is of an object that at any point in time can be labeled as being in

one of a finite number of discrete states. A Hidden Markov Model (HMM) is used

to model (1) the behavior of the underlying object and (2) the noisy

observation of its state through an image processing algorithm. The key insight

of this approach is that the annotation of frame-by-frame video can be reduced

from a problem of labeling every single image to a problem of detecting a

transition between states of the underlying objected being recording on video.

The performance of the framework is evaluated on a driver gaze classification

dataset composed of 16,000,000 images that were fully annotated over 6,000

hours of direct manual annotation labor. On this dataset, we achieve a 13x

reduction in manual annotation for an average accuracy of 99.1% and a 84x

reduction for an average accuracy of 91.2%.

Areas of Attention for Image Captioning

Marco Pedersoli , Thomas Lucas , Cordelia Schmid , Jakob Verbeek

Comments: Submitted to CVPR 2017

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

We propose “Areas of Attention”, a novel attention-based model for automatic

image caption generation. Our approach models the interplay between the state

of the RNN, image region descriptors and word embedding vectors by three

pairwise interactions. It allows association of caption words with local visual

appearances rather than with descriptors of the entire scene. This enables

better generalization to complex scenes not seen during training. Our model is

agnostic to the type of attention areas, and we instantiate it using regions

based on CNN activation grids, object proposals, and spatial transformer

networks. Our results show that all components of our model contribute to

obtain state-of-the-art performance on the MSCOCO dataset. In addition, our

results indicate that attention areas are correctly associated to meaningful

latent semantic structure in the generated captions.

Short-term traffic flow forecasting with spatial-temporal correlation in a hybrid deep learning framework

Yuankai Wu , Huachun Tan

Comments: 14 pages

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

Deep learning approaches have reached a celebrity status in artificial

intelligence field, its success have mostly relied on Convolutional Networks

(CNN) and Recurrent Networks. By exploiting fundamental spatial properties of

images and videos, the CNN always achieves dominant performance on visual

tasks. And the Recurrent Networks (RNN) especially long short-term memory

methods (LSTM) can successfully characterize the temporal correlation, thus

exhibits superior capability for time series tasks. Traffic flow data have

plentiful characteristics on both time and space domain. However, applications

of CNN and LSTM approaches on traffic flow are limited. In this paper, we

propose a novel deep architecture combined CNN and LSTM to forecast future

traffic flow (CLTFP). An 1-dimension CNN is exploited to capture spatial

features of traffic flow, and two LSTMs are utilized to mine the short-term

variability and periodicities of traffic flow. Given those meaningful features,

the feature-level fusion is performed to achieve short-term forecasting. The

proposed CLTFP is compared with other popular forecasting methods on an open

datasets. Experimental results indicate that the CLTFP has considerable

advantages in traffic flow forecasting. in additional, the proposed CLTFP is

analyzed from the view of Granger Causality, and several interesting properties

of CLTFP are discovered and discussed .

A Non-Local Means Approach for Gaussian Noise Removal from Images using a Modified Weighting Kernel

Mojtaba Kazemi , Ehsan Mohammadi.P , Parichehr shahidi sadeghi , Mohamad B. Menhaj

Comments: 2017 25th Iranian Conference on Electrical Engineering (ICEE)

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

Gaussian noise removal is an interesting area in digital image processing not

only to improve the visual quality, but for its impact on other post-processing

algorithms like image registration or segmentation. Many presented

state-of-the-art denoising methods are based on the self-similarity or

patch-based image processing. Specifically, Non-Local Means (NLM) as a

patch-based filter has gained increasing attention in recent years.

Essentially, this filter tends to obtain the noise-less signal value by

computing the Gaussian-weighted Euclidean distance between the patch

under-processing and other patches inside the image. However, the NLM filter is

sensitive to the outliers (pixels that their intensity values are far away from

other pixels) inside the patch, meaning that the pixels with the symmetric

locations in the patch are assigned the same weight. This can lead to

sub-optimal denoising performance when the destructive nature of noise

generates some outliers inside patches. In this paper, we propose a new

weighting approach to modify the Gaussian kernel of the NLM filter. Our

approach employs the geometric distance between image intensities to come up

with new weights for each pixel of a patch, lowering the impact of outliers on

the denoising performance. Experiments on a set of standard images and

different noise levels show that our proposed method outperforms the other

compared denoising filters.

Mining Spatio-temporal Data on Industrialization from Historical Registries

David Berenbaum , Dwyer Deighan , Thomas Marlow , Ashley Lee , Scott Frickel , Mark Howison Subjects : Computer Vision and Pattern Recognition (cs.CV) ; Information Retrieval (cs.IR)

Despite the growing availability of big data in many fields, historical data

on socioevironmental phenomena are often not available due to a lack of

automated and scalable approaches for collecting, digitizing, and assembling

them. We have developed a data-mining method for extracting tabulated, geocoded

data from printed directories. While scanning and optical character recognition

(OCR) can digitize printed text, these methods alone do not capture the

structure of the underlying data. Our pipeline integrates both page layout

analysis and OCR to extract tabular, geocoded data from structured text. We

demonstrate the utility of this method by applying it to scanned manufacturing

registries from Rhode Island that record 41 years of industrial land use. The

resulting spatio-temporal data can be used for socioenvironmental analyses of

industrialization at a resolution that was not previously possible. In

particular, we find strong evidence for the dispersion of manufacturing from

the urban core of Providence, the state’s capital, along the Interstate 95

corridor to the north and south.

Ensembles of Generative Adversarial Networks

Yaxing Wang , Lichao Zhang , Joost van de Weijer

Comments: accepted NIPS 2016 Workshop on Adversarial Training

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

Ensembles are a popular way to improve results of discriminative CNNs. The

combination of several networks trained starting from different initializations

improves results significantly. In this paper we investigate the usage of

ensembles of GANs. The specific nature of GANs opens up several new ways to

construct ensembles. The first one is based on the fact that in the minimax

game which is played to optimize the GAN objective the generator network keeps

on changing even after the network can be considered optimal. As such ensembles

of GANs can be constructed based on the same network initialization but just

taking models which have different amount of iterations. These so-called self

ensembles are much faster to train than traditional ensembles. The second

method, called cascade GANs, redirects part of the training data which is badly

modeled by the first GAN to another GAN. In experiments on the CIFAR10 dataset

we show that ensembles of GANs obtain model probability distributions which

better model the data distribution. In addition, we show that these improved

results can be obtained at little additional computational cost.

Deep Learning with Energy-efficient Binary Gradient Cameras

Suren Jayasuriya , Orazio Gallo , Jinwei Gu , Jan Kautz Subjects : Computer Vision and Pattern Recognition (cs.CV)

Power consumption is a critical factor for the deployment of embedded

computer vision systems. We explore the use of computational cameras that

directly output binary gradient images to reduce the portion of the power

consumption allocated to image sensing. We survey the accuracy of binary

gradient cameras on a number of computer vision tasks using deep learning.

These include object recognition, head pose regression, face detection, and

gesture recognition. We show that, for certain applications, accuracy can be on

par or even better than what can be achieved on traditional images. We are also

the first to recover intensity information from binary spatial gradient

images–useful for applications with a human observer in the loop, such as

surveillance. Our results, which we validate with a prototype binary gradient

camera, point to the potential of gradient-based computer vision systems.

Food Image Recognition by Using Convolutional Neural Networks (CNNs)

Yuzhen Lu

Comments: 6 pages, 5 figures, 3 tables

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

Food image recognition is one of the promising applications of visual object

recognition in computer vision. In this study, a small-scale dataset consisting

of 5822 images of ten categories and a five-layer CNN was constructed to

recognize these images. The bag-of-features (BoF) model coupled with support

vector machine was first tested as comparison, resulting in an overall accuracy

of 56%, while the CNN performed much better with an overall accuracy of 74%.

Data expansion techniques were applied to increase the size of training images,

which achieved a significantly improved accuracy of more than 90% and prevent

the overfitting issue that occurred to the CNN without using data expansion.

Further improvement is within reach by collecting more images and optimizing

the network architecture and relevant hyper-parameters.

Semi-supervised learning of deep metrics for stereo reconstruction

Stepan Tulyakov , Anton Ivanov , Francois Fleuret

Comments: 11 pages, 3 figures

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Neural and Evolutionary Computing (cs.NE)

Deep-learning metrics have recently demonstrated extremely good performance

to match image patches for stereo reconstruction. However, training such

metrics requires large amount of labeled stereo images, which can be difficult

or costly to collect for certain applications. The main contribution of our

work is a new semi-supervised method for learning deep metrics from unlabeled

stereo images, given coarse information about the scenes and the optical

system. Our method alternatively optimizes the metric with a standard

stochastic gradient descent, and applies stereo constraints to regularize its

prediction. Experiments on reference data-sets show that, for a given network

architecture, training with this new method without ground-truth produces a

metric with performance as good as state-of-the-art baselines trained with the

said ground-truth. This work has three practical implications. Firstly, it

helps to overcome limitations of training sets, in particular noisy ground

truth. Secondly it allows to use much more training data during learning.

Thirdly, it allows to tune deep metric for a particular stereo system, even if

ground truth is not available.

End-to-end learning of brain tissue segmentation from imperfect labeling

Alex Fedorov , Jeremy Johnson , Eswar Damaraju , Alexei Ozerin , Vince Calhoun , Sergey Plis

Comments: 7 pages, 8 figures, 2 tables

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

Segmenting a structural magnetic resonance imaging (MRI) scan is an important

pre-processing step for analytic procedures and subsequent inferences about

longitudinal tissue changes. Manual segmentation defines the current gold

standard in quality but is prohibitively expensive. Automatic approaches are

computationally intensive, incredibly slow at scale, and error prone due to

usually involving many potentially faulty intermediate steps. In order to

streamline the segmentation, we introduce a deep learning model that is based

on volumetric dilated convolutions, subsequently reducing both processing time

and errors. Compared to its competitors, the model has a reduced set of

parameters and thus is easier to train and much faster to execute. The contrast

in performance between the dilated network and its competitors becomes obvious

when both are tested on a large dataset of unprocessed human brain volumes. The

dilated network consistently outperforms not only another state-of-the-art deep

learning approach, the up convolutional network, but also the ground truth on

which it was trained. Not only can the incredible speed of our model make large

scale analyses much easier but we also believe it has great potential in a

clinical setting where, with little to no substantial delay, a patient and

provider can go over test results.

Commonly Uncommon: Semantic Sparsity in Situation Recognition

Mark Yatskar , Vicente Ordonez , Luke Zettlemoyer , Ali Farhadi Subjects : Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)

Semantic sparsity is a common challenge in structured visual classification

problems; when the output space is complex, the vast majority of the possible

predictions are rarely, if ever, seen in the training set. This paper studies

semantic sparsity in situation recognition, the task of producing structured

summaries of what is happening in images, including activities, objects and the

roles objects play within the activity. For this problem, we find empirically

that most object-role combinations are rare, and current state-of-the-art

models significantly underperform in this sparse data regime. We avoid many

such errors by (1) introducing a novel tensor composition function that learns

to share examples across role-noun combinations and (2) semantically augmenting

our training data with automatically gathered examples of rarely observed

outputs using web data. When integrated within a complete CRF-based structured

prediction model, the tensor-based approach outperforms existing state of the

art by a relative improvement of 2.11% and 4.40% on top-5 verb and noun-role

accuracy, respectively. Adding 5 million images with our semantic augmentation

techniques gives further relative improvements of 6.23% and 9.57% on top-5 verb

and noun-role accuracy.

Parameter Compression of Recurrent Neural Networks and Degredation of Short-term Memory

Jonathan A. Cox

Comments: Submitted to IJCNN 2017

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

The significant computational costs of deploying neural networks in

large-scale or resource constrained environments, such as data centers and

mobile devices, has spurred interest in model compression, which can achieve a

reduction in both arithmetic operations and storage memory. Several techniques

have been proposed for reducing or compressing the parameters for feed-forward

and convolutional neural networks, but less is understood about the effect of

parameter compression on recurrent neural networks (RNN). In particular, the

extent to which the recurrent parameters can be compressed and the impact on

short-term memory performance, is not well understood. In this paper, we study

the effect of complexity reduction, through singular value decomposition rank

reduction, on RNN and minimal gated recurrent unit (MGRU) networks for several

tasks. We show that considerable rank reduction is possible when compressing

recurrent weights, even without fine tuning. Furthermore, we propose a

perturbation model for the effect of general perturbations, such as a

compression, on the recurrent parameters of RNNs. The model is tested against a

noiseless memorization experiment that elucidates the short-term memory

performance. In this way, we demonstrate that the effect of compression of

recurrent parameters is dependent on the degree of temporal coherence present

in the data and task. This work can guide on-the-fly RNN compression for novel

environments or tasks, and provides insight for applying RNN compression in

low-power devices, such as hearing aids.

Procedural Generation of Videos to Train Deep Action Recognition Networks

César Roberto de Souza , Adrien Gaidon , Yohann Cabon , Antonio Manuel López Peña

Comments: Under review at CVPR 2017. this https URL

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

Deep learning for human action recognition in videos is making significant

progress, but is slowed down by its dependency on expensive manual labeling of

large video collections. In this work, we investigate the generation of

synthetic training data for action recognition, as it has recently shown

promising results for a variety of other computer vision tasks. We propose an

interpretable parametric generative model of human action videos that relies on

procedural generation and other computer graphics techniques of modern game

engines.We generate a diverse, realistic, and physically plausible dataset of

human action videos, called PHAV for “Procedural Human Action Videos”. It

contains a total of 37,536 videos, more than 1,000 examples for each action of

35 categories. Our approach is not limited to existing motion capture

sequences, and we procedurally define 14 synthetic actions. We introduce a deep

multi-task representation learning architecture to mix synthetic and real

videos, even if the action categories differ. Our experiments on the UCF101 and

HMDB51 benchmarks suggest that combining our large set of synthetic videos with

small real-world datasets can boost recognition performance, significantly

outperforming fine-tuning state-of-the-art unsupervised generative models of

videos.

Multi-resolution Data Fusion for Super-Resolution Electron Microscopy

Suhas Sreehari , S. V. Venkatakrishnan , Katherine L. Bouman , Jeffrey P. Simmons , Lawrence F. Drummy , Charles A. Bouman Subjects : Computer Vision and Pattern Recognition (cs.CV)

Perhaps surprisingly, the total electron microscopy (EM) data collected to

date is less than a cubic millimeter. Consequently, there is an enormous demand

in the materials and biological sciences to image at greater speed and lower

dosage, while maintaining resolution. Traditional EM imaging based on

homogeneous raster-order scanning severely limits the volume of high-resolution

data that can be collected, and presents a fundamental limitation to

understanding physical processes such as material deformation, crack

propagation, and pyrolysis.

We introduce a novel multi-resolution data fusion (MDF) method for

super-resolution computational EM. Our method combines innovative data

acquisition with novel algorithmic techniques to dramatically improve the

resolution/volume/speed trade-off. The key to our approach is to collect the

entire sample at low resolution, while simultaneously collecting a small

fraction of data at high resolution. The high-resolution measurements are then

used to create a material-specific patch-library that is used within the

“plug-and-play” framework to dramatically improve super-resolution of the

low-resolution data. We present results using FEI electron microscope data that

demonstrate super-resolution factors of 4x, 8x, and 16x, while substantially

maintaining high image quality and reducing dosage.

Artificial Intelligence

The Complexity of Bayesian Networks Specified by Propositional and Relational Languages

Fabio Gagliardi Cozman , Denis Deratani Mauá Subjects : Artificial Intelligence (cs.AI)

We examine the complexity of inference in Bayesian networks specified by

logical languages. We consider representations that range from fragments of

propositional logic to function-free first-order logic with equality; in doing

so we cover a variety of plate models and of probabilistic relational models.

We study the complexity of inferences when network, query and domain are the

input (the inferential and the combined complexity), when the network is fixed

and query and domain are the input (the query/data complexity), and when the

network and query are fixed and the domain is the input (the domain

complexity). We draw connections with probabilistic databases and liftability

results, and obtain complexity classes that range from polynomial to

exponential levels.

Deep Learning of Robotic Tasks using Strong and Weak Human Supervision

Bar Hilleli , Ran El-Yaniv Subjects : Artificial Intelligence (cs.AI) ; Learning (cs.LG); Robotics (cs.RO)

We propose a scheme for training a computerized agent to perform complex

human tasks such as highway steering. The scheme resembles natural

teaching-learning processes used by humans to teach themselves and each other

complex tasks, and consists of the following four stages. In the first stage

the agent learns by itself an informative low-dimensional representations of

raw input signals in an unsupervised learning manner. In the second stage the

agent learns to mimic the human instructor using supervised learning so as to

reach a basic performance level; the third stage is devoted to learning an

instantaneous reward model. Here, the (human) instructor observes (possibly in

real time) the agent performing the task and provides reward feedback. During

this stage the agent monitors both itself and the instructor feedback and

learns a reward model using supervised learning. This stage terminates when the

reward model is sufficiently accurate. In the last stage a reinforcement

learning algorithm is deployed to optimize the agent policy. The guidance

reward signal in the reinforcement learning algorithm relies on the previously

learned reward model. As a proof of concept for the proposed scheme, we

designed a system consisting of deep convolutional neural networks, and applied

it to successfully learn a computerized agent capable of autonomous highway

steering over the well-known racing game Assetto Corsa.

Algorithmic Songwriting with ALYSIA

Margareta Ackerman , David Loker Subjects : Artificial Intelligence (cs.AI) ; Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD)

This paper introduces ALYSIA: Automated LYrical SongwrIting Application.

ALYSIA is based on a machine learning model using Random Forests, and we

discuss its success at pitch and rhythm prediction. Next, we show how ALYSIA

was used to create original pop songs that were subsequently recorded and

produced. Finally, we discuss our vision for the future of Automated

Songwriting for both co-creative and autonomous systems.

DeepBach: a Steerable Model for Bach chorales generation

Gaëtan Hadjeres , François Pachet

Comments: 20 pages, 11 figures

Subjects

:

Artificial Intelligence (cs.AI)

; Sound (cs.SD)

The composition of polyphonic chorale music in the style of J.S Bach has

represented a major challenge in automatic music composition over the last

decades. The art of Bach chorales composition involves combining four-part

harmony with characteristic rhythmic patterns and typical melodic movements to

produce musical phrases which begin, evolve and end (cadences) in a harmonious

way. To our knowledge, no model so far was able to solve all these problems

simultaneously using an agnostic machine-learning approach. This paper

introduces DeepBach, a statistical model aimed at modeling polyphonic music and

specifically four parts, hymn-like pieces. We claim that, after being trained

on the chorale harmonizations by Johann Sebastian Bach, our model is capable of

generating highly convincing chorales in the style of Bach. We evaluate how

indistinguishable our generated chorales are from existing Bach chorales with a

listening test. The results corroborate our claim. A key strength of DeepBach

is that it is agnostic and flexible. Users can constrain the generation by

imposing some notes, rhythms or cadences in the generated score. This allows

users to reharmonize user-defined melodies. DeepBach’s generation is fast,

making it usable for interactive music composition applications. Several

generation examples are provided and discussed from a musical point of view.

RecSys Challenge 2016: job recommendations based on preselection of offers and gradient boosting

Andrzej Pacuk , Piotr Sankowski , Karol Węgrzycki , Adam Witkowski , Piotr Wygocki

Comments: 6 pages, 1 figure, 2 tables, Description of 2nd place winning solution of RecSys 2016 Challange. To be published in RecSys’16 Challange Proceedings

Subjects

:

Artificial Intelligence (cs.AI)

; Information Retrieval (cs.IR); Software Engineering (cs.SE)

We present the Mim-Solution’s approach to the RecSys Challenge 2016, which

ranked 2nd. The goal of the competition was to prepare job recommendations for

the users of the website Xing.com.

Our two phase algorithm consists of candidate selection followed by the

candidate ranking. We ranked the candidates by the predicted probability that

the user will positively interact with the job offer. We have used Gradient

Boosting Decision Trees as the regression tool.

Using Discourse Signals for Robust Instructor Intervention Prediction

Muthu Kumar Chandrasekaran , Carrie Demmans Epp , Min-Yen Kan , Diane Litman

Comments: To appear in proceedings of the 31st AAAI Conference on Artificial Intelligence, San Francisco, USA

Subjects

:

Artificial Intelligence (cs.AI)

; Computation and Language (cs.CL); Computers and Society (cs.CY)

We tackle the prediction of instructor intervention in student posts from

discussion forums in Massive Open Online Courses (MOOCs). Our key finding is

that using automatically obtained discourse relations improves the prediction

of when instructors intervene in student discussions, when compared with a

state-of-the-art, feature-rich baseline. Our supervised classifier makes use of

an automatic discourse parser which outputs Penn Discourse Treebank (PDTB) tags

that represent in-post discourse features. We show PDTB relation-based features

increase the robustness of the classifier and complement baseline features in

recalling more diverse instructor intervention patterns. In comprehensive

experiments over 14 MOOC offerings from several disciplines, the PDTB discourse

features improve performance on average. The resultant models are less

dependent on domain-specific vocabulary, allowing them to better generalize to

new courses.

A Matrix Splitting Perspective on Planning with Options

Pierre-Luc Bacon , Doina Precup

Comments: Accepted at the Continual Learning and Deep Networks Workshop, NIPS 2016

Subjects

:

Artificial Intelligence (cs.AI)

We show that the Bellman operator underlying the options framework leads to a

matrix splitting, an approach traditionally used to speed up convergence of

iterative solvers for large linear systems of equations. Based on standard

comparison theo- rems for matrix splittings, we then show how the asymptotic

rate of convergence varies as a function of the inherent timescales of the

options. This new perspective highlights a trade-off between asymptotic

performance and the cost of computation associated with building a good set of

options.

N-gram Opcode Analysis for Android Malware Detection

BooJoong Kang , Suleiman Y. Yerima , Sakir Sezer , Kieran McLaughlin

Journal-ref: International Journal on Cyber Situational Awareness, Vol. 1, No.

1, pp231-255 (2016)

Subjects

:

Cryptography and Security (cs.CR)

; Artificial Intelligence (cs.AI)

Android malware has been on the rise in recent years due to the increasing

popularity of Android and the proliferation of third party application markets.

Emerging Android malware families are increasingly adopting sophisticated

detection avoidance techniques and this calls for more effective approaches for

Android malware detection. Hence, in this paper we present and evaluate an

n-gram opcode features based approach that utilizes machine learning to

identify and categorize Android malware. This approach enables automated

feature discovery without relying on prior expert or domain knowledge for

pre-determined features. Furthermore, by using a data segmentation technique

for feature selection, our analysis is able to scale up to 10-gram opcodes. Our

experiments on a dataset of 2520 samples showed an f-measure of 98% using the

n-gram opcode based approach. We also provide empirical findings that

illustrate factors that have probable impact on the overall n-gram opcodes

performance trends.

Proportional Rankings

Piotr Skowron , Martin Lackner , Markus Brill , Dominik Peters , Edith Elkind Subjects : Computer Science and Game Theory (cs.GT) ; Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)

In this paper we extend the principle of proportional representation to

rankings. We consider the setting where alternatives need to be ranked based on

approval preferences. In this setting, proportional representation requires

that cohesive groups of voters are represented proportionally in each initial

segment of the ranking. Proportional rankings are desirable in situations where

initial segments of different lengths may be relevant, e.g., hiring decisions

(if it is unclear how many positions are to be filled), the presentation of

competing proposals on a liquid democracy platform (if it is unclear how many

proposals participants are taking into consideration), or recommender systems

(if a ranking has to accommodate different user types). We study the

proportional representation provided by several ranking methods and prove

theoretical guarantees. Furthermore, we experimentally evaluate these methods

and present preliminary evidence as to which methods are most suitable for

producing proportional rankings.

A New Type-II Fuzzy Logic Based Controller for Non-linear Dynamical Systems with Application to a 3-PSP Parallel Robot

Hamid Reza Hassanzadeh

Comments: Master’s thesis

Subjects

:

Systems and Control (cs.SY)

; Artificial Intelligence (cs.AI)

The concept of uncertainty is posed in almost any complex system including

parallel robots as an outstanding instance of dynamical robotics systems. As

suggested by the name, uncertainty, is some missing information that is beyond

the knowledge of human thus we may tend to handle it properly to minimize the

side-effects through the control process.

Type-II fuzzy logic has shown its superiority over traditional fuzzy logic

when dealing with uncertainty. Type-II fuzzy logic controllers are however

newer and more promising approaches that have been recently applied to various

fields due to their significant contribution especially when noise (as an

important instance of uncertainty) emerges. During the design of Type-I fuzzy

logic systems, we presume that we are almost certain about the fuzzy membership

functions which is not true in many cases. Thus T2FLS as a more realistic

approach dealing with practical applications might have a lot to offer. Type-II

fuzzy logic takes into account a higher level of uncertainty, in other words,

the membership grade for a type-II fuzzy variable is no longer a crisp number

but rather is itself a type-I linguistic term. In this thesis the effects of

uncertainty in dynamic control of a parallel robot is considered. More

specifically, it is intended to incorporate the Type-II Fuzzy Logic paradigm

into a model based controller, the so-called computed torque control method,

and apply the result to a 3 degrees of freedom parallel manipulator.

Message Passing Multi-Agent GANs

Arnab Ghosh , Viveka Kulharia , Vinay Namboodiri

Comments: The first 2 authors contributed equally for this work

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

Communicating and sharing intelligence among agents is an important facet of

achieving Artificial General Intelligence. As a first step towards this

challenge, we introduce a novel framework for image generation: Message Passing

Multi-Agent Generative Adversarial Networks (MPM GANs). While GANs have

recently been shown to be very effective for image generation and other tasks,

these networks have been limited to mostly single generator-discriminator

networks. We show that we can obtain multi-agent GANs that communicate through

message passing to achieve better image generation. The objectives of the

individual agents in this framework are two fold: a co-operation objective and

a competing objective. The co-operation objective ensures that the message

sharing mechanism guides the other generator to generate better than itself

while the competing objective encourages each generator to generate better than

its counterpart. We analyze and visualize the messages that these GANs share

among themselves in various scenarios. We quantitatively show that the message

sharing formulation serves as a regularizer for the adversarial training.

Qualitatively, we show that the different generators capture different traits

of the underlying data distribution.

Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision (Short Version)

Chen Liang , Jonathan Berant , Quoc Le , Kenneth D. Forbus , Ni Lao

Comments: Published in NAMPI workshop at NIPS 2016. Short version of arXiv:1611.00020

Subjects

:

Computation and Language (cs.CL)

; Artificial Intelligence (cs.AI); Learning (cs.LG)

Extending the success of deep neural networks to natural language

understanding and symbolic reasoning requires complex operations and external

memory. Recent neural program induction approaches have attempted to address

this problem, but are typically limited to differentiable memory, and

consequently cannot scale beyond small synthetic tasks. In this work, we

propose the Manager-Programmer-Computer framework, which integrates neural

networks with non-differentiable memory to support abstract, scalable and

precise operations through a friendly neural computer interface. Specifically,

we introduce a Neural Symbolic Machine, which contains a sequence-to-sequence

neural “programmer”, and a non-differentiable “computer” that is a Lisp

interpreter with code assist. To successfully apply REINFORCE for training, we

augment it with approximate gold programs found by an iterative maximum

likelihood training process. NSM is able to learn a semantic parser from weak

supervision over a large knowledge base. It achieves new state-of-the-art

performance on WebQuestionsSP, a challenging semantic parsing dataset, with

weak supervision. Compared to previous approaches, NSM is end-to-end, therefore

does not rely on feature engineering or domain specific knowledge.

Representing Independence Models with Elementary Triplets

Jose M. Peña Subjects : Machine Learning (stat.ML) ; Artificial Intelligence (cs.AI)

In an independence model, the triplets that represent conditional

independences between singletons are called elementary. It is known that the

elementary triplets represent the independence model unambiguously under some

conditions. In this paper, we show how this representation helps performing

some operations with independence models, such as finding the dominant triplets

or a minimal independence map of an independence model, or computing the union

or intersection of a pair of independence models, or performing causal

reasoning. For the latter, we rephrase in terms of conditional independences

some of Pearl’s results for computing causal effects.

Enhancing Use Case Points Estimation Method Using Soft Computing Techniques

Ali Bou Nassif , Luiz Fernando Capretz , Danny Ho Subjects : Software Engineering (cs.SE) ; Artificial Intelligence (cs.AI); Learning (cs.LG)

Software estimation is a crucial task in software engineering. Software

estimation encompasses cost, effort, schedule, and size. The importance of

software estimation becomes critical in the early stages of the software life

cycle when the details of software have not been revealed yet. Several

commercial and non-commercial tools exist to estimate software in the early

stages. Most software effort estimation methods require software size as one of

the important metric inputs and consequently, software size estimation in the

early stages becomes essential. One of the approaches that has been used for

about two decades in the early size and effort estimation is called use case

points. Use case points method relies on the use case diagram to estimate the

size and effort of software projects. Although the use case points method has

been widely used, it has some limitations that might adversely affect the

accuracy of estimation. This paper presents some techniques using fuzzy logic

and neural networks to improve the accuracy of the use case points method.

Results showed that an improvement up to 22% can be obtained using the proposed

approach.

Commonly Uncommon: Semantic Sparsity in Situation Recognition

Mark Yatskar , Vicente Ordonez , Luke Zettlemoyer , Ali Farhadi Subjects : Computer Vision and Pattern Recognition (cs.CV) ; Artificial Intelligence (cs.AI)

Semantic sparsity is a common challenge in structured visual classification

problems; when the output space is complex, the vast majority of the possible

predictions are rarely, if ever, seen in the training set. This paper studies

semantic sparsity in situation recognition, the task of producing structured

summaries of what is happening in images, including activities, objects and the

roles objects play within the activity. For this problem, we find empirically

that most object-role combinations are rare, and current state-of-the-art

models significantly underperform in this sparse data regime. We avoid many

such errors by (1) introducing a novel tensor composition function that learns

to share examples across role-noun combinations and (2) semantically augmenting

our training data with automatically gathered examples of rarely observed

outputs using web data. When integrated within a complete CRF-based structured

prediction model, the tensor-based approach outperforms existing state of the

art by a relative improvement of 2.11% and 4.40% on top-5 verb and noun-role

accuracy, respectively. Adding 5 million images with our semantic augmentation

techniques gives further relative improvements of 6.23% and 9.57% on top-5 verb

and noun-role accuracy.

Toward Efficient Task Assignment and Motion Planning for Large Scale Underwater Mission

Somaiyeh Mahmoud Zadeh , David MW Powers , Karl Sammut , Amirmehdi Yazdani

Comments: arXiv admin note: substantial text overlap with arXiv:1604.03308

Subjects

:

Robotics (cs.RO)

; Artificial Intelligence (cs.AI)

An Autonomous Underwater Vehicle (AUV) needs to acquire a certain degree of

autonomy for any particular underwater mission to fulfill the mission

objectives successfully and ensure its safety in all stages of the mission in a

large scale operating filed. In this paper, a novel combinatorial

conflict-free-task assignment strategy consisting an interactive engagement of

a local path planner and an adaptive global route planner, is introduced. The

method is established upon the heuristic search potency of the Particle Swarm

Optimisation (PSO) algorithm to address the discrete nature of routing-task

assignment approach and the complexity of NP-hard path planning problem. The

proposed hybrid method is highly efficient for having a reactive guidance

framework that guarantees successful completion of missions specifically in

cluttered environments. To examine the performance of the method in a context

of mission productivity, mission time management and vehicle safety, a series

of simulation studies are undertaken. The results of simulations declare that

the proposed method is reliable and robust, particularly in dealing with

uncertainties, and it can significantly enhance the level of vehicle’s autonomy

by relying on its reactive nature and capability of providing fast feasible

solutions.

Information Retrieval

We used Neural Networks to Detect Clickbaits: You won't believe what happened Next!

Ankesh Anand , Tanmoy Chakraborty , Noseong Park

Comments: Accepted to the European Conference on Information Retrieval (ECIR), 2017

Subjects

:

Computation and Language (cs.CL)

; Information Retrieval (cs.IR)

Online content publishers often use catchy headlines for their articles in

order to attract users to their websites. These headlines, popularly known as

clickbaits, exploit a user’s curiosity gap and lure them to click on links that

often disappoint them. Existing methods for automatically detecting clickbaits

rely on heavy feature engineering and domain knowledge. Here, we introduce a

neural network architecture based on Recurrent Neural Networks for detecting

clickbaits. Our model relies on distributed word representations learned from a

large unannotated corpora, and character embeddings learned via Convolutional

Neural Networks. Experimental results on a dataset of news headlines show that

our model outperforms existing techniques for clickbait detection with an

accuracy of 0.98 with F1-score of 0.98 and ROC-AUC of 0.99.

Mining Spatio-temporal Data on Industrialization from Historical Registries

David Berenbaum , Dwyer Deighan , Thomas Marlow , Ashley Lee , Scott Frickel , Mark Howison Subjects : Computer Vision and Pattern Recognition (cs.CV) ; Information Retrieval (cs.IR)

Despite the growing availability of big data in many fields, historical data

on socioevironmental phenomena are often not available due to a lack of

automated and scalable approaches for collecting, digitizing, and assembling

them. We have developed a data-mining method for extracting tabulated, geocoded

data from printed directories. While scanning and optical character recognition

(OCR) can digitize printed text, these methods alone do not capture the

structure of the underlying data. Our pipeline integrates both page layout

analysis and OCR to extract tabular, geocoded data from structured text. We

demonstrate the utility of this method by applying it to scanned manufacturing

registries from Rhode Island that record 41 years of industrial land use. The

resulting spatio-temporal data can be used for socioenvironmental analyses of

industrialization at a resolution that was not previously possible. In

particular, we find strong evidence for the dispersion of manufacturing from

the urban core of Providence, the state’s capital, along the Interstate 95

corridor to the north and south.

RecSys Challenge 2016: job recommendations based on preselection of offers and gradient boosting

Andrzej Pacuk , Piotr Sankowski , Karol Węgrzycki , Adam Witkowski , Piotr Wygocki

Comments: 6 pages, 1 figure, 2 tables, Description of 2nd place winning solution of RecSys 2016 Challange. To be published in RecSys’16 Challange Proceedings

Subjects

:

Artificial Intelligence (cs.AI)

; Information Retrieval (cs.IR); Software Engineering (cs.SE)

We present the Mim-Solution’s approach to the RecSys Challenge 2016, which

ranked 2nd. The goal of the competition was to prepare job recommendations for

the users of the website Xing.com.

Our two phase algorithm consists of candidate selection followed by the

candidate ranking. We ranked the candidates by the predicted probability that

the user will positively interact with the job offer. We have used Gradient

Boosting Decision Trees as the regression tool.

Computation and Language

Mapping the Dialog Act Annotations of the LEGO Corpus into the Communicative Functions of ISO 24617-2

Eugénio Ribeiro , Ricardo Ribeiro , David Martins de Matos

Comments: 20 pages, 2 figures

Subjects

:

Computation and Language (cs.CL)

In this paper we present strategies for mapping the dialog act annotations of

the LEGO corpus into the communicative functions of the ISO 24617-2 standard.

Using these strategies, we obtained an additional 347 dialogs annotated

according to the standard. This is particularly important given the reduced

amount of existing data in those conditions due to the recency of the standard.

Furthermore, these are dialogs from a widely explored corpus for dialog related

tasks. However, its dialog annotations have been neglected due to their high

domain-dependency, which renders them unuseful outside the context of the

corpus. Thus, through our mapping process, we both obtain more data annotated

according to a recent standard and provide useful dialog act annotations for a

widely explored corpus in the context of dialog research.

We used Neural Networks to Detect Clickbaits: You won't believe what happened Next!

Ankesh Anand , Tanmoy Chakraborty , Noseong Park

Comments: Accepted to the European Conference on Information Retrieval (ECIR), 2017

Subjects

:

Computation and Language (cs.CL)

; Information Retrieval (cs.IR)

Online content publishers often use catchy headlines for their articles in

order to attract users to their websites. These headlines, popularly known as

clickbaits, exploit a user’s curiosity gap and lure them to click on links that

often disappoint them. Existing methods for automatically detecting clickbaits

rely on heavy feature engineering and domain knowledge. Here, we introduce a

neural network architecture based on Recurrent Neural Networks for detecting

clickbaits. Our model relies on distributed word representations learned from a

large unannotated corpora, and character embeddings learned via Convolutional

Neural Networks. Experimental results on a dataset of news headlines show that

our model outperforms existing techniques for clickbait detection with an

accuracy of 0.98 with F1-score of 0.98 and ROC-AUC of 0.99.

Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision (Short Version)

Chen Liang , Jonathan Berant , Quoc Le , Kenneth D. Forbus , Ni Lao

Comments: Published in NAMPI workshop at NIPS 2016. Short version of arXiv:1611.00020

Subjects

:

Computation and Language (cs.CL)

; Artificial Intelligence (cs.AI); Learning (cs.LG)

Extending the success of deep neural networks to natural language

understanding and symbolic reasoning requires complex operations and external

memory. Recent neural program induction approaches have attempted to address

this problem, but are typically limited to differentiable memory, and

consequently cannot scale beyond small synthetic tasks. In this work, we

propose the Manager-Programmer-Computer framework, which integrates neural

networks with non-differentiable memory to support abstract, scalable and

precise operations through a friendly neural computer interface. Specifically,

we introduce a Neural Symbolic Machine, which contains a sequence-to-sequence

neural “programmer”, and a non-differentiable “computer” that is a Lisp

interpreter with code assist. To successfully apply REINFORCE for training, we

augment it with approximate gold programs found by an iterative maximum

likelihood training process. NSM is able to learn a semantic parser from weak

supervision over a large knowledge base. It achieves new state-of-the-art

performance on WebQuestionsSP, a challenging semantic parsing dataset, with

weak supervision. Compared to previous approaches, NSM is end-to-end, therefore

does not rely on feature engineering or domain specific knowledge.

CER: Complementary Entity Recognition via Knowledge Expansion on Large Unlabeled Product Reviews

Hu Xu , Sihong Xie , Lei Shu , Philip S. Yu

Comments: 10 pages, 2 figures, IEEE BigData 2016

Subjects

:

Computation and Language (cs.CL)

Product reviews contain a lot of useful information about product features

and customer opinions. One important product feature is the complementary

entity (products) that may potentially work together with the reviewed product.

Knowing complementary entities of the reviewed product is very important

because customers want to buy compatible products and avoid incompatible ones.

In this paper, we address the problem of Complementary Entity Recognition

(CER). Since no existing method can solve this problem, we first propose a

novel unsupervised method to utilize syntactic dependency paths to recognize

complementary entities. Then we expand category-level domain knowledge about

complementary entities using only a few general seed verbs on a large amount of

unlabeled reviews. The domain knowledge helps the unsupervised method to adapt

to different products and greatly improves the precision of the CER task. The

advantage of the proposed method is that it does not require any labeled data

for training. We conducted experiments on 7 popular products with about 1200

reviews in total to demonstrate that the proposed approach is effective.

Unit Dependency Graph and its Application to Arithmetic Word Problem Solving

Subhro Roy , Dan Roth

Comments: AAAI 2017

Subjects

:

Computation and Language (cs.CL)

Math word problems provide a natural abstraction to a range of natural

language understanding problems that involve reasoning about quantities, such

as interpreting election results, news about casualties, and the financial

section of a newspaper. Units associated with the quantities often provide

information that is essential to support this reasoning. This paper proposes a

principled way to capture and reason about units and shows how it can benefit

an arithmetic word problem solver. This paper presents the concept of Unit

Dependency Graphs (UDGs), which provides a compact representation of the

dependencies between units of numbers mentioned in a given problem. Inducing

the UDG alleviates the brittleness of the unit extraction system and allows for

a natural way to leverage domain knowledge about unit compatibility, for word

problem solving. We introduce a decomposed model for inducing UDGs with minimal

additional annotations, and use it to augment the expressions used in the

arithmetic word problem solver of (Roy and Roth 2015) via a constrained

inference framework. We show that introduction of UDGs reduces the error of the

solver by over 10 %, surpassing all existing systems for solving arithmetic

word problems. In addition, it also makes the system more robust to adaptation

to new vocabulary and equation forms .

End-to-End Joint Learning of Natural Language Understanding and Dialogue Manager

Xuesong Yang , Yun-Nung Chen , Dilek Hakkani-Tur , Paul Crook , Xiujun Li , Jianfeng Gao , Li Deng Subjects : Computation and Language (cs.CL) ; Learning (cs.LG)

Natural language understanding and dialogue policy learning are both

essential in conversational systems that predict the next system actions in

response to a current user utterance. Conventional approaches aggregate

separate models of natural language understanding (NLU) and system action

prediction (SAP) as a pipeline that is sensitive to noisy outputs of

error-prone NLU. To address the issues, we propose an end-to-end deep recurrent

neural network with limited contextual dialogue memory by jointly training NLU

and SAP on DSTC4 multi-domain human-human dialogues. Experiments show that our

proposed model significantly outperforms the state-of-the-art pipeline models

for both NLU and SAP, which indicates that our joint model is capable of

mitigating the affects of noisy NLU outputs, and NLU model can be refined by

error flows backpropagating from the extra supervised signals of system

actions.

Creating a Real-Time, Reproducible Event Dataset

John Beieler Subjects : Computation and Language (cs.CL)

The generation of political event data has remained much the same since the

mid-1990s, both in terms of data acquisition and the process of coding text

into data. Since the 1990s, however, there have been significant improvements

in open-source natural language processing software and in the availability of

digitized news content. This paper presents a new, next-generation event

dataset, named Phoenix, that builds from these and other advances. This dataset

includes improvements in the underlying news collection process and event

coding software, along with the creation of a general processing pipeline

necessary to produce daily-updated data. This paper provides a face validity

checks by briefly examining the data for the conflict in Syria, and a

comparison between Phoenix and the Integrated Crisis Early Warning System data.

Using Discourse Signals for Robust Instructor Intervention Prediction

Muthu Kumar Chandrasekaran , Carrie Demmans Epp , Min-Yen Kan , Diane Litman

Comments: To appear in proceedings of the 31st AAAI Conference on Artificial Intelligence, San Francisco, USA

Subjects

:

Artificial Intelligence (cs.AI)

; Computation and Language (cs.CL); Computers and Society (cs.CY)

We tackle the prediction of instructor intervention in student posts from

discussion forums in Massive Open Online Courses (MOOCs). Our key finding is

that using automatically obtained discourse relations improves the prediction

of when instructors intervene in student discussions, when compared with a

state-of-the-art, feature-rich baseline. Our supervised classifier makes use of

an automatic discourse parser which outputs Penn Discourse Treebank (PDTB) tags

that represent in-post discourse features. We show PDTB relation-based features

increase the robustness of the classifier and complement baseline features in

recalling more diverse instructor intervention patterns. In comprehensive

experiments over 14 MOOC offerings from several disciplines, the PDTB discourse

features improve performance on average. The resultant models are less

dependent on domain-specific vocabulary, allowing them to better generalize to

new courses.

Distributed, Parallel, and Cluster Computing

A Randomized Concurrent Algorithm for Disjoint Set Union

Siddhartha V. Jayanti , Robert E. Tarjan Subjects : Distributed, Parallel, and Cluster Computing (cs.DC) ; Data Structures and Algorithms (cs.DS)

The disjoint set union problem is a basic problem in data structures with a

wide variety of applications. We extend a known efficient sequential algorithm

for this problem to obtain a simple and efficient concurrent wait-free

algorithm running on an asynchronous parallel random access machine (APRAM).

Crucial to our result is the use of randomization. Under a certain independence

assumption, for a problem instance in which there are n elements, m operations,

and p processes, our algorithm does Theta(m (alpha(n, m/(np)) + log(np/m + 1)))

expected work, where the expectation is over the random choices made by the

algorithm and alpha is a functional inverse of Ackermann’s function. In

addition, each operation takes O(log n) steps with high probability. Our

algorithm is significantly simpler and more efficient than previous algorithms

proposed by Anderson and Woll. Under our independence assumption, our algorithm

achieves almost-linear speed-up for applications in which all or most of the

processes can be kept busy.

Support vector regression model for BigData systems

Alessandro Maria Rizzi Subjects : Distributed, Parallel, and Cluster Computing (cs.DC) ; Learning (cs.LG); Performance (cs.PF)

Nowadays Big Data are becoming more and more important. Many sectors of our

economy are now guided by data-driven decision processes. Big Data and business

intelligence applications are facilitated by the MapReduce programming model

while, at infrastructural layer, cloud computing provides flexible and cost

effective solutions for allocating on demand large clusters. In such systems,

capacity allocation, which is the ability to optimally size minimal resources

for achieve a certain level of performance, is a key challenge to enhance

performance for MapReduce jobs and minimize cloud resource costs. In order to

do so, one of the biggest challenge is to build an accurate performance model

to estimate job execution time of MapReduce systems. Previous works applied

simulation based models for modeling such systems. Although this approach can

accurately describe the behavior of Big Data clusters, it is too

computationally expensive and does not scale to large system. We try to

overcome these issues by applying machine learning techniques. More precisely

we focus on Support Vector Regression (SVR) which is intrinsically more robust

w.r.t other techniques, like, e.g., neural networks, and less sensitive to

outliers in the training set. To better investigate these benefits, we compare

SVR to linear regression.

High-Performance Distributed Machine Learning using Apache SPARK

Celestine Dünner , Thomas Parnell , Kubilay Atasu , Manolis Sifalakis , Haralampos Pozidis Subjects : Distributed, Parallel, and Cluster Computing (cs.DC) ; Learning (cs.LG)

In this paper we compare the performance of distributed learning using Apache

SPARK and MPI by implementing a distributed linear learning algorithm from

scratch on the two programming frameworks. We then explore the performance gap

and show how SPARK learning can be accelerated, by reducing computational cost

as well as communication-related overheads, to reduce the relative loss in

performance versus MPI from 20x to 2x. With these different implementations at

hand, we will illustrate how the optimal parameters of the algorithm depend

strongly on the characteristics of the framework on which it is executed. We

will show that carefully tuning a distributed algorithm to trade-off

communication and computation can improve performance by orders of magnitude.

Hence, understanding system aspects of the framework and their implications,

and then correctly adapting the algorithm proves to be the key to performance.

The communication-hiding pipelined BiCGStab method for the efficient parallel solution of large unsymmetric linear systems

Siegfried Cools , Wim Vanroose

Comments: 21 pages, 5 figures, 4 tables

Subjects

:

Distributed, Parallel, and Cluster Computing (cs.DC)

A High Performance Computing alternative to traditional Krylov methods,

pipelined Krylov solvers offer better scalability in the strong scaling limit

compared to standard Krylov methods for large and sparse linear systems. The

traditional synchronization bottleneck is mitigated by overlapping

time-consuming global communication phases with local computations in the

algorithm. This paper describes a general framework for deriving the pipelined

variant of any Krylov algorithm. The proposed framework was implicitly used to

derive the pipelined Conjugate Gradient (p-CG) method in Hiding global

synchronization latency in the preconditioned Conjugate Gradient algorithm by

P. Ghysels and W. Vanroose, Parallel Computing, 40(7):224–238, 2014. The

pipelining framework is subsequently illustrated by formulating a pipelined

version of the BiCGStab method for the solution of large unsymmetric linear

systems on parallel hardware. A residual replacement strategy is proposed to

account for the possible loss of attainable accuracy and robustness by the

pipelined BiCGStab method. It is shown that the pipelined algorithm improves

scalability on distributed memory machines, leading to significant speedups

compared to standard preconditioned BiCGStab.

Adaptive Work-Efficient Connected Components on the GPU

Michael Sutton , Tal Ben-Nun , Amnon Barak , Sreepathi Pai , Keshav Pingali

Comments: 4 pages

Subjects

:

Distributed, Parallel, and Cluster Computing (cs.DC)

; Data Structures and Algorithms (cs.DS)

This report presents an adaptive work-efficient approach for implementing the

Connected Components algorithm on GPUs. The results show a considerable

increase in performance (up to 6.8( imes)) over current state-of-the-art

solutions.

QoS-based Computing Resources Partitioning between Virtual Machines in the Cloud Architecture

Evgeny Nikulchev , Evgeniy Pluzhnik , Oleg Lukyanchikov , Dmitry Biryukov , Elena Andrianova

Comments: 6 pages, International Journal of Advanced Computer Science and Applications (2016) 7

Subjects

:

Distributed, Parallel, and Cluster Computing (cs.DC)

Cloud services have been used very widely, but configuration of the

parameters, including the efficient allocation of resources, is an important

objective for the system architect. The article is devoted to solving the

problem of choosing the architecture of computers based on simulation and

developed program for monitoring computing resources. Techniques were developed

aimed at providing the required quality of service and efficient use of

resources. The article describes the monitoring program of computing resources

and time efficiency of the target application functions. On the basis of this

application the technique is shown and described in the experiment, designed to

ensure the requirements for quality of service, by isolating one process from

the others on different virtual machines inside the hypervisor.

On spreading rumor in heterogeneous systems

Jacek Cichoń , Zbigniew Gołeobbiewski , Marcin Kardas , Marek Klonowski , Filip Zagórski Subjects : Distributed, Parallel, and Cluster Computing (cs.DC)

In this paper we consider a model of spreading information in heterogeneous

systems wherein we have two kinds of objects. Some of them are active and

others are passive. Active objects can, if they possess information, share it

with an encountered passive object. We focus on a particular case such that

active objects communicate independently with randomly chosen passive objects.

Such model is motivated by two real-life scenarios. The first one is a very

dynamic system of mobile devices distributing information among stationary

devices. The second is an architecture wherein clients communicate with several

servers and can leave some information learnt from other servers. The main

question we investigate is how many rounds is needed to deliver the information

to all objects under the assumption that at the beginning exactly one object

has the information?

In this paper we provide mathematical models of such process and show rigid

and very precise mathematical analysis for some special cases important from

practical point of view. Some mathematical results are quite surprising — we

find relation of investigated process to both coupon collector’s problem as

well as the birthday paradox. Additionally, we present simulations for showing

behaviour for general parameters

BrainFrame: A heterogeneous accelerator platform for neuron simulations

Georgios Smaragdos , Georgios Chatzikonstantis , Rahul Kukreja , Harrys Sidiropoulos , Dimitrios Rodopoulos , Ioannis Sourdis , Zaid Al-Ars , Christoforos Kachris , Dimitrios Soudris , Chris I. De Zeeuw , Christos Strydis

Comments: 17 pages, 19 figures, 4 tables

Subjects

:

Neural and Evolutionary Computing (cs.NE)

; Distributed, Parallel, and Cluster Computing (cs.DC)

Objective: The advent of High-Performance Computing (HPC) in recent years has

led to its increasing use in brain study through computational models. The

scale and complexity of such models are constantly increasing, leading to

challenging computational requirements. Even though modern HPC platforms can

often deal with such challenges, the vast diversity of the modeling field does

not permit for a single acceleration (or homogeneous) platform to effectively

address the complete array of modeling requirements. Approach: In this paper we

propose and build BrainFrame, a heterogeneous acceleration platform,

incorporating three distinct acceleration technologies, a Dataflow Engine, a

Xeon Phi and a GP-GPU. The PyNN framework is also integrated into the platform.

As a challenging proof of concept, we analyze the performance of BrainFrame on

different instances of a state-of-the-art neuron model, modeling the Inferior-

Olivary Nucleus using a biophysically-meaningful, extended Hodgkin-Huxley

representation. The model instances take into account not only the neuronal-

network dimensions but also different network-connectivity circumstances that

can drastically change application workload characteristics. Main results: The

synthetic approach of three HPC technologies demonstrated that BrainFrame is

better able to cope with the modeling diversity encountered. Our performance

analysis shows clearly that the model directly affect performance and all three

technologies are required to cope with all the model use cases.

StreamNF: Performance and Correctness for Stateful Chained NFs

Junaid Khalid , Aditya Akella Subjects : Networking and Internet Architecture (cs.NI) ; Distributed, Parallel, and Cluster Computing (cs.DC)

Network functions virtualization (NFV) — deploying network functions in

software on commodity machines — allows operators to employ rich chains of NFs

to realize custom performance, security, and compliance policies, and ensure

high performance by dynamically adding instances and/or failing over. Because

NFs are stateful, it is important to carefully manage their state, especially

during such dynamic actions. Crucially, state management must: (1) offer good

performance to match the needs of modern networks; (2) ensure NF chain-wide

properties; and (3) not require the operator to manage low-level state

management details. We present StreamNF, an NFV framework that satisfies the

above requirements. To do so, StreamNF leverages an external state store with

novel caching strategies and offloading of state operations, and chain-level

logical packet clocks and packet logging/replay. Extensive evaluation of a

StreamNF prototype built atop Apache Storm shows that the significant benefits

of StreamNF in terms of state management performance and chain-wide properties

come at a modest per-packet latency cost.

Online Page Migration on Ring Networks in Uniform Model

Amanj Khorramian , Akira Matsubayashi

Comments: 8 pages, 5 figures

Subjects

:

Data Structures and Algorithms (cs.DS)

; Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT)

This paper explores the problem of page migration in ring networks. A ring

network is a connected graph, in which each node is connected with exactly two

other nodes. In this problem, one of the nodes in a given network holds a page

of size D. This node is called the server and the page is a non-duplicable data

in the network. Requests are issued by nodes to access the page one after

another. Every time a new request is issued, the server must serve the request

and may migrate to another node before the next request arrives. A service

costs the distance between the server and the requesting node, and the

migration costs the distance of the migration multiplied by D. The problem is

to minimize the total costs of services and migrations. We study this problem

in uniform model, for which the page has a unit size, i.e. D=1. A

3.326-competitive algorithm improving the current best upper bound is designed.

We show that this ratio is tight for our algorithm.

Expander Graph and Communication-Efficient Decentralized Optimization

Yat-Tin Chow , Wei Shi , Tianyu Wu , Wotao Yin

Comments: 2016 IEEE Asilomar Conference on Signals, Systems, and Computers

Subjects

:

Optimization and Control (math.OC)

; Distributed, Parallel, and Cluster Computing (cs.DC); Multiagent Systems (cs.MA)

In this paper, we discuss how to design the graph topology to reduce the

communication complexity of certain algorithms for decentralized optimization.

Our goal is to minimize the total communication needed to achieve a prescribed

accuracy. We discover that the so-called expander graphs are near-optimal

choices. We propose three approaches to construct expander graphs for different

numbers of nodes and node degrees. Our numerical results show that the

performance of decentralized optimization is significantly better on expander

graphs than other regular graphs.

Learning

Incomplete data representation for SVM classification

Łukasz Struski , Marek Śmieja , Jacek Tabor

Comments: 15 pages, 3 figures

Subjects

:

Learning (cs.LG)

; Machine Learning (stat.ML)

In this paper we propose two ways of incomplete data representation. The

first one is a generalization of a flag representation, where a vector with

missing attributes is filled with some values and joined with flag vectors

indicating missing components. Our generalization uses pointed affine

subspaces, which in addition to flag representation allows to perform various

affine transformations of data, as whitening or dimensionality reduction. We

show how to embed such affine subspaces into a vector space and how to define a

proper scalar product. In the second approach, we represent missing data points

by degenerated Gaussian densities, which additionally model the uncertainty

connected with missing features. This representation allows to construct an

analogue of RBF kernel on incomplete data space.

Zeroth-order Asynchronous Doubly Stochastic Algorithm with Variance Reduction

Bin Gu , Zhouyuan Huo , Heng Huang Subjects : Learning (cs.LG)

Zeroth-order (derivative-free) optimization attracts a lot of attention in

machine learning, because explicit gradient calculations may be computationally

expensive or infeasible. To handle large scale problems both in volume and

dimension, recently asynchronous doubly stochastic zeroth-order algorithms were

proposed. The convergence rate of existing asynchronous doubly stochastic

zeroth order algorithms is (O(frac{1}{sqrt{T}})) (also for the sequential

stochastic zeroth-order optimization algorithms). In this paper, we focus on

the finite sums of smooth but not necessarily convex functions, and propose an

asynchronous doubly stochastic zeroth-order optimization algorithm using the

accelerated technology of variance reduction (AsyDSZOVR). Rigorous theoretical

analysis show that the convergence rate can be improved from

(O(frac{1}{sqrt{T}})) the best result of existing algorithms to

(O(frac{1}{T})). Also our theoretical results is an improvement to the ones of

the sequential stochastic zeroth-order optimization algorithms.

Sparse Label Propagation

Alexander Jung Subjects : Learning (cs.LG) ; Machine Learning (stat.ML)

We consider massive heterogeneous datasets with intrinsic network structure,

i.e., big data over networks. These datasets can be modelled by graph signals,

which are defined over large-scale irregular graphs representing complex

networks. We show that (semi-supervised) learning of the entire underlying

graph signal based on incomplete information provided by few initial labels can

be reduced to a compressed sensing recovery problem within the cosparse

analysis model. This reduction provides two things: First, it allows to apply

highly developed compressed sensing methods to the learning problem. In

particular, by implementing a recent primal-dual method for convex

optimization, we obtain a sparse label propagation algorithm. Moreover, by

casting the learning problem within compressed sensing, we are able to derive

sufficient conditions on the graph structure and available label information,

such that sparse label propagation is accurate.

Learning Adversary-Resistant Deep Neural Networks

Qinglong Wang , Wenbo Guo , Kaixuan Zhang , Alexander G. Ororbia II , Xinyu Xing , C. Lee Giles , Xue Liu Subjects : Learning (cs.LG)

Deep neural networks (DNNs) have proven to be quite effective in a vast array

of machine learning tasks, with recent examples in cyber security and

autonomous vehicles. Despite the superior performance of DNNs in these

applications, it has been recently shown that these models are susceptible to a

particular type of attack that exploits a fundamental flaw in their design.

This attack consists of generating particular synthetic examples referred to as

adversarial samples. These samples are constructed by slightly manipulating

real data-points in order to “fool” the original DNN model, forcing it to

mis-classify previously correctly classified samples with high confidence.

Addressing this flaw in the model is essential if DNNs are to be used in

critical applications such as those in cyber security. Previous work has

provided various defense mechanisms by either augmenting the training set or

enhancing model complexity. However, after a thorough analysis, we discover

that DNNs protected by these defense mechanisms are still susceptible to

adversarial samples, indicating that there are no theoretical guarantees of

resistance provided by these mechanisms. To the best of our knowledge, we are

the first to investigate this issue shared across previous research work and to

propose a unifying framework for protecting DNN models by integrating a data

transformation module with the DNN. More importantly, we provide a theoretical

guarantee for protection under our proposed framework. We evaluate our method

and several other existing solutions on both MNIST, CIFAR-10, and a malware

dataset, to demonstrate the generality of our proposed method and its potential

for handling cyber security applications. The results show that our framework

provides better resistance compared to state-of-art solutions while

experiencing negligible degradation in accuracy.

Implicit Modeling — A Generalization of Discriminative and Generative Approaches

Dmitrij Schlesinger , Carsten Rother Subjects : Learning (cs.LG)

We propose a new modeling approach that is a generalization of generative and

discriminative models. The core idea is to use an implicit parameterization of

a joint probability distribution by specifying only the conditional

distributions. The proposed scheme combines the advantages of both worlds — it

can use powerful complex discriminative models as its parts, having at the same

time better generalization capabilities. We thoroughly evaluate the proposed

method for a simple classification task with artificial data and illustrate its

advantages for real-word scenarios on a semantic image segmentation problem.

A Nearly Optimal Contextual Bandit Algorithm

Mohammadreza Mohaghegh Neyshabouri , Kaan Gokcesu , Selami Ciftci , Suleyman S. Kozat Subjects : Learning (cs.LG)

We investigate the contextual multi-armed bandit problem in an adversarial

setting and introduce an online algorithm that asymptotically achieves the

performance of the best contextual bandit arm selection strategy under certain

conditions. We show that our algorithm is highly efficient and provides

significantly improved performance with a guaranteed performance upper bound in

a strong mathematical sense. We have no statistical assumptions on the context

vectors and the loss of the bandit arms, hence our results are guaranteed to

hold even in adversarial environments. We use a tree notion in order to

partition the space of context vectors in a nested structure. Using this tree,

we construct a large class of context dependent bandit arm selection strategies

and adaptively combine them to achieve the performance of the best strategy. We

use the hierarchical nature of introduced tree to implement this combination

with a significantly low computational complexity, thus our algorithm can be

efficiently used in applications involving big data. Through extensive set of

experiments involving synthetic and real data, we demonstrate significant

performance gains achieved by the proposed algorithm with respect to the

state-of-the-art adversarial bandit algorithms.

Diagnostic Prediction Using Discomfort Drawings

Cheng Zhang , Hedvig Kjellstrom , Bo C. Bertilson

Comments: NIPS 2016 Workshop on Machine Learning for Health

Subjects

:

Learning (cs.LG)

In this paper, we explore the possibility to apply machine learning to make

diagnostic predictions using discomfort drawings. A discomfort drawing is an

intuitive way for patients to express discomfort and pain related symptoms.

These drawings have proven to be an effective method to collect patient data

and make diagnostic decisions in real-life practice. A dataset from real-world

patient cases is collected for which medical experts provide diagnostic labels.

Next, we extend a factorized multimodal topic model, Inter-Battery Topic Model

(IBTM), to train a system that can make diagnostic predictions given an unseen

discomfort drawing. Experimental results show reasonable predictions of

diagnostic labels given an unseen discomfort drawing. The positive result

indicates a significant potential of machine learning to be used for parts of

the pain diagnostic process and to be a decision support system for physicians

and other health care personnel.

A One class Classifier based Framework using SVDD : Application to an Imbalanced Geological Dataset

Soumi Chaki , Akhilesh Kumar Verma , Aurobinda Routray , William K. Mohanty , Mamata Jenamani

Comments: presented at IEEE Students Technology Symposium (TechSym), 28 February to 2 March 2014, IIT Kharagpur, India. 6 pages, 7 figures, 2tables

Subjects

:

Learning (cs.LG)

; Applications (stat.AP); Machine Learning (stat.ML)

Evaluation of hydrocarbon reservoir requires classification of petrophysical

properties from available dataset. However, characterization of reservoir

attributes is difficult due to the nonlinear and heterogeneous nature of the

subsurface physical properties. In this context, present study proposes a

generalized one class classification framework based on Support Vector Data

Description (SVDD) to classify a reservoir characteristic water saturation into

two classes (Class high and Class low) from four logs namely gamma ray, neutron

porosity, bulk density, and P sonic using an imbalanced dataset. A comparison

is carried out among proposed framework and different supervised classification

algorithms in terms of g metric means and execution time. Experimental results

show that proposed framework has outperformed other classifiers in terms of

these performance evaluators. It is envisaged that the classification analysis

performed in this study will be useful in further reservoir modeling.

Cryptocurrency Portfolio Management with Deep Reinforcement Learning

Zhengyao Jiang , Jinjun Liang Subjects : Learning (cs.LG)

We present a convolutional neural network with historic prices of a set of

financial assets as its input, outputting portfolio weights of the set. The

network is trained to achieve maximum accumulative return. A back-test trading

experiment is conducted in a cryptocurrency market, achieving 10-fold returns

in 1.8 month’s periods, outperforming many traditional portfolio management

algorithms and benchmarks.

Deep Symbolic Representation Learning for Heterogeneous Time-series Classification

Shengdong Zhang , Soheil Bahrampour , Naveen Ramakrishnan , Mohak Shah Subjects : Learning (cs.LG) ; Machine Learning (stat.ML)

In this paper, we consider the problem of event classification with

multi-variate time series data consisting of heterogeneous (continuous and

categorical) variables. The complex temporal dependencies between the variables

combined with sparsity of the data makes the event classification problem

particularly challenging. Most state-of-art approaches address this either by

designing hand-engineered features or breaking up the problem over homogeneous

variates. In this work, we propose and compare three representation learning

algorithms over symbolized sequences which enables classification of

heterogeneous time-series data using a deep architecture. The proposed

representations are trained jointly along with the rest of the network

architecture in an end-to-end fashion that makes the learned features

discriminative for the given task. Experiments on three real-world datasets

demonstrate the effectiveness of the proposed approaches.

Robust nonparametric nearest neighbor random process clustering

Michael Tschannen , Helmut Bölcskei

Comments: 13 pages, 4 figures

Subjects

:

Learning (cs.LG)

; Information Theory (cs.IT); Machine Learning (stat.ML)

We consider the problem of clustering noisy finite-length observations of

stationary ergodic random processes according to their generative models

without prior knowledge of the model statistics and the number of generative

models. Two algorithms, both using the L1-distance between estimated power

spectral densities (PSDs) as a measure of dissimilarity, are analyzed. The

first one, termed nearest neighbor process clustering (NNPC) is new and relies

on partitioning the nearest neighbor graph of the observations via spectral

clustering. The second algorithm, simply referred to as k-means (KM), consists

of a single k-means iteration with farthest point initialization and was

considered before in the literature, albeit with a different dissimilarity

measure and with asymptotic performance results only. We prove that both

algorithms succeed with high probability in the presence of noise and missing

entries, and even when the generative process PSDs overlap significantly, all

provided that the observation length is sufficiently large. Our results

quantify the tradeoff between the overlap of the generative process PSDs, the

observation length, the fraction of missing entries, and the noise variance.

Furthermore, we prove that treating the finite-length observations of

stationary ergodic random processes as vectors in Euclidean space and

clustering them using the thresholding-based subspace clustering (TSC)

algorithm, the subspace clustering cousin of NNPC, results in performance

strictly inferior to that of NNPC. We argue that the underlying cause is to be

found in TSC employing spherical distance as dissimilarity measure, thereby

ignoring the stationary process structure of the observations. Finally, we

provide extensive numerical results for synthetic and real data and find that

NNPC outperforms state-of-the-art algorithms in human motion sequence

clustering.

Learning to superoptimize programs – Workshop Version

Rudy Bunel , Alban Desmaison , M. Pawan Kumar , Philip H.S.Torr , Pushmeet Kohli

Comments: Workshop version for the NIPS NAMPI Workshop. Extended version at arXiv:1611.01787

Subjects

:

Learning (cs.LG)

Superoptimization requires the estimation of the best program for a given

computational task. In order to deal with large programs, superoptimization

techniques perform a stochastic search. This involves proposing a modification

of the current program, which is accepted or rejected based on the improvement

achieved. The state of the art method uses uniform proposal distributions,

which fails to exploit the problem structure to the fullest. To alleviate this

deficiency, we learn a proposal distribution over possible modifications using

Reinforcement Learning. We provide convincing results on the superoptimization

of “Hacker’s Delight” programs.

Trained Ternary Quantization

Chenzhuo Zhu , Song Han , Huizi Mao , William J. Dally

Comments: Submitted to ICLR 17

Subjects

:

Learning (cs.LG)

Product reviews contain a lot of useful information about product features

and customer opinions. One important product feature is the complementary

entity (products) that may potentially work together with the reviewed product.

Knowing complementary entities of the reviewed product is very important

because customers want to buy compatible products and avoid incompatible ones.

In this paper, we address the problem of Complementary Entity Recognition

(CER). Since no existing method can solve this problem, we first propose a

novel unsupervised method to utilize syntactic dependency paths to recognize

complementary entities. Then we expand category-level domain knowledge about

complementary entities using only a few general seed verbs on a large amount of

unlabeled reviews. The domain knowledge helps the unsupervised method to adapt

to different products and greatly improves the precision of the CER task. The

advantage of the proposed method is that it does not require any labeled data

for training. We conducted experiments on 7 popular products with about 1200

reviews in total to demonstrate that the proposed approach is effective.

Positive blood culture detection in time series data using a BiLSTM network

Leen De Baets , Joeri Ruyssinck , Thomas Peiffer , Johan Decruyenaere , Filip De Turck , Femke Ongenae , Tom Dhaene Subjects : Learning (cs.LG) ; Neural and Evolutionary Computing (cs.NE); Quantitative Methods (q-bio.QM); Machine Learning (stat.ML)

The presence of bacteria or fungi in the bloodstream of patients is abnormal

and can lead to life-threatening conditions. A computational model based on a

bidirectional long short-term memory artificial neural network, is explored to

assist doctors in the intensive care unit to predict whether examination of

blood cultures of patients will return positive. As input it uses nine

monitored clinical parameters, presented as time series data, collected from

2177 ICU admissions at the Ghent University Hospital. Our main goal is to

determine if general machine learning methods and more specific, temporal

models, can be used to create an early detection system. This preliminary

research obtains an area of 71.95% under the precision recall curve, proving

the potential of temporal neural networks in this context.

Success Probability of Exploration: a Concrete Analysis of Learning Efficiency

Liangpeng Zhang , Ke Tang , Xin Yao Subjects : Learning (cs.LG)

Exploration has been a crucial part of reinforcement learning, yet several

important questions concerning exploration efficiency are still not answered

satisfactorily by existing analytical frameworks. These questions include

exploration parameter setting, situation analysis, and hardness of MDPs, all of

which are unavoidable for practitioners. To bridge the gap between the theory

and practice, we propose a new analytical framework called the success

probability of exploration. We show that those important questions of

exploration above can all be answered under our framework, and the answers

provided by our framework meet the needs of practitioners better than the

existing ones. More importantly, we introduce a concrete and practical approach

to evaluating the success probabilities in certain MDPs without the need of

actually running the learning algorithm. We then provide empirical results to

verify our approach, and demonstrate how the success probability of exploration

can be used to analyse and predict the behaviours and possible outcomes of

exploration, which are the keys to the answer of the important questions of

exploration.

A Novel Framework based on SVDD to Classify Water Saturation from Seismic Attributes

Soumi Chaki , Akhilesh Kumar Verma , Aurobinda Routray , William K. Mohanty , Mamata Jenamani

Comments: 6 pages, 8 figures, 2table Presented at Fourth International Conference on Emerging Applications of Information Technology (EAIT 2014), ISI Kolkata, India

Subjects

:

Learning (cs.LG)

; Machine Learning (stat.ML)

Water saturation is an important property in reservoir engineering domain.

Thus, satisfactory classification of water saturation from seismic attributes

is beneficial for reservoir characterization. However, diverse and non-linear

nature of subsurface attributes makes the classification task difficult. In

this context, this paper proposes a generalized Support Vector Data Description

(SVDD) based novel classification framework to classify water saturation into

two classes (Class high and Class low) from three seismic attributes seismic

impedance, amplitude envelop, and seismic sweetness. G-metric means and program

execution time are used to quantify the performance of the proposed framework

along with established supervised classifiers. The documented results imply

that the proposed framework is superior to existing classifiers. The present

study is envisioned to contribute in further reservoir modeling.

A novel multiclassSVM based framework to classify lithology from well logs: a real-world application

Soumi Chaki , Aurobinda Routray , William K. Mohanty , Mamata Jenamani

Comments: 5 pages, 5 figures, 4 tables Presented at INDICON 2015 at New Delhi, India

Subjects

:

Learning (cs.LG)

; Applications (stat.AP); Machine Learning (stat.ML)

Support vector machines (SVMs) have been recognized as a potential tool for

supervised classification analyses in different domains of research. In

essence, SVM is a binary classifier. Therefore, in case of a multiclass

problem, the problem is divided into a series of binary problems which are

solved by binary classifiers, and finally the classification results are

combined following either the one-against-one or one-against-all strategies. In

this paper, an attempt has been made to classify lithology using a multiclass

SVM based framework using well logs as predictor variables. Here, the lithology

is classified into four classes such as sand, shaly sand, sandy shale and shale

based on the relative values of sand and shale fractions as suggested by an

expert geologist. The available dataset consisting well logs (gamma ray,

neutron porosity, density, and P-sonic) and class information from four closely

spaced wells from an onshore hydrocarbon field is divided into training and

testing sets. We have used one-against-all strategy to combine the results of

multiple binary classifiers. The reported results established the superiority

of multiclass SVM compared to other classifiers in terms of classification

accuracy. The selection of kernel function and associated parameters has also

been investigated here. It can be envisaged from the results achieved in this

study that the proposed framework based on multiclass SVM can further be used

to solve classification problems. In future research endeavor, seismic

attributes can be introduced in the framework to classify the lithology

throughout a study area from seismic inputs.

A Nonparametric Latent Factor Model For Location-Aware Video Recommendations

Ehtsham Elahi

Comments: NIPS 2016 Workshop on Practical Bayesian Nonparametrics

Subjects

:

Machine Learning (stat.ML)

; Learning (cs.LG)

We are interested in learning customers’ video preferences from their

historic viewing patterns and geographical location. We consider a Bayesian

latent factor modeling approach for this task. In order to tune the complexity

of the model to best represent the data, we make use of Bayesian nonparameteric

techniques. We describe an inference technique that can scale to large

real-world data sets. Finally we show results obtained by applying the model to

a large internal Netflix data set, that illustrates that the model was able to

capture interesting relationships between viewing patterns and geographical

location.

Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles

Balaji Lakshminarayanan , Alexander Pritzel , Charles Blundell Subjects : Machine Learning (stat.ML) ; Learning (cs.LG)

Deep neural networks are powerful black box predictors that have recently

achieved impressive performance on a wide spectrum of tasks. Quantifying

predictive uncertainty in neural networks is a challenging and yet unsolved

problem. Bayesian neural networks, which learn a distribution over weights, are

currently the state-of-the-art for estimating predictive uncertainty; however

these require significant modifications to the training procedure and are

computationally expensive compared to standard (non-Bayesian) neural neural

networks. We propose an alternative to Bayesian neural networks, that is simple

to implement, readily parallelisable and yields high quality predictive

uncertainty estimates. Through a series of experiments on classification and

regression benchmarks, we demonstrate that our method produces well-calibrated

uncertainty estimates which are as good or better than approximate Bayesian

neural networks. Finally, we evaluate the predictive uncertainty on test

examples from known and unknown classes, and show that our method is able to

express higher degree of uncertainty on unknown classes, unlike existing

methods which make overconfident predictions even on unknown classes.

Support vector regression model for BigData systems

Alessandro Maria Rizzi Subjects : Distributed, Parallel, and Cluster Computing (cs.DC) ; Learning (cs.LG); Performance (cs.PF)

Nowadays Big Data are becoming more and more important. Many sectors of our

economy are now guided by data-driven decision processes. Big Data and business

intelligence applications are facilitated by the MapReduce programming model

while, at infrastructural layer, cloud computing provides flexible and cost

effective solutions for allocating on demand large clusters. In such systems,

capacity allocation, which is the ability to optimally size minimal resources

for achieve a certain level of performance, is a key challenge to enhance

performance for MapReduce jobs and minimize cloud resource costs. In order to

do so, one of the biggest challenge is to build an accurate performance model

to estimate job execution time of MapReduce systems. Previous works applied

simulation based models for modeling such systems. Although this approach can

accurately describe the behavior of Big Data clusters, it is too

computationally expensive and does not scale to large system. We try to

overcome these issues by applying machine learning techniques. More precisely

we focus on Support Vector Regression (SVR) which is intrinsically more robust

w.r.t other techniques, like, e.g., neural networks, and less sensitive to

outliers in the training set. To better investigate these benefits, we compare

SVR to linear regression.

High-Performance Distributed Machine Learning using Apache SPARK

Celestine Dünner , Thomas Parnell , Kubilay Atasu , Manolis Sifalakis , Haralampos Pozidis Subjects : Distributed, Parallel, and Cluster Computing (cs.DC) ; Learning (cs.LG)

In this paper we compare the performance of distributed learning using Apache

SPARK and MPI by implementing a distributed linear learning algorithm from

scratch on the two programming frameworks. We then explore the performance gap

and show how SPARK learning can be accelerated, by reducing computational cost

as well as communication-related overheads, to reduce the relative loss in

performance versus MPI from 20x to 2x. With these different implementations at

hand, we will illustrate how the optimal parameters of the algorithm depend

strongly on the characteristics of the framework on which it is executed. We

will show that carefully tuning a distributed algorithm to trade-off

communication and computation can improve performance by orders of magnitude.

Hence, understanding system aspects of the framework and their implications,

and then correctly adapting the algorithm proves to be the key to performance.

Extracting Implicit Social Relation for Social Recommendation Techniques in User Rating Prediction

Seyed Mohammad Taheri , Hamidreza Mahyar , Mohammad Firouzi , Ali Movaghar Subjects : Social and Information Networks (cs.SI) ; Learning (cs.LG)

Recommendation plays an increasingly important role in our daily lives.

Recommender systems automatically suggest items to users that might be

interesting for them. Recent studies illustrate that incorporating social trust

in Matrix Factorization methods demonstrably improves accuracy of rating

prediction. Such approaches mainly use the trust scores explicitly expressed by

users. However, it is often challenging to have users provide explicit trust

scores of each other. There exist quite a few works, which propose Trust

Metrics to compute and predict trust scores between users based on their

interactions. In this paper, first we present how social relation can be

extracted from users’ ratings to items by describing Hellinger distance between

users in recommender systems. Then, we propose to incorporate the predicted

trust scores into social matrix factorization models. By analyzing social

relation extraction from three well-known real-world datasets, which both:

trust and recommendation data available, we conclude that using the implicit

social relation in social recommendation techniques has almost the same

performance compared to the actual trust scores explicitly expressed by users.

Hence, we build our method, called Hell-TrustSVD, on top of the

state-of-the-art social recommendation technique to incorporate both the

extracted implicit social relations and ratings given by users on the

prediction of items for an active user. To the best of our knowledge, this is

the first work to extend TrustSVD with extracted social trust information. The

experimental results support the idea of employing implicit trust into matrix

factorization whenever explicit trust is not available, can perform much better

than the state-of-the-art approaches in user rating prediction.

Ranking Biomarkers Through Mutual Information

Konstantinos Sechidis , Emily Turner , Paul D. Metcalfe , James Weatherall , Gavin Brown

Comments: Accepted at NIPS 2016 Workshop on Machine Learning for Health

Subjects

:

Machine Learning (stat.ML)

; Learning (cs.LG); Applications (stat.AP)

We study information theoretic methods for ranking biomarkers. In clinical

trials there are two, closely related, types of biomarkers: predictive and

prognostic, and disentangling them is a key challenge. Our first step is to

phrase biomarker ranking in terms of optimizing an information theoretic

quantity. This formalization of the problem will enable us to derive rankings

of predictive/prognostic biomarkers, by estimating different, high dimensional,

conditional mutual information terms. To estimate these terms, we suggest

efficient low dimensional approximations, and we derive an empirical Bayes

estimator, which is suitable for small or sparse datasets. Finally, we

introduce a new visualisation tool that captures the prognostic and the

predictive strength of a set of biomarkers. We believe this representation will

prove to be a powerful tool in biomarker discovery.

Message Passing Multi-Agent GANs

Arnab Ghosh , Viveka Kulharia , Vinay Namboodiri

Comments: The first 2 authors contributed equally for this work

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Artificial Intelligence (cs.AI); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

Communicating and sharing intelligence among agents is an important facet of

achieving Artificial General Intelligence. As a first step towards this

challenge, we introduce a novel framework for image generation: Message Passing

Multi-Agent Generative Adversarial Networks (MPM GANs). While GANs have

recently been shown to be very effective for image generation and other tasks,

these networks have been limited to mostly single generator-discriminator

networks. We show that we can obtain multi-agent GANs that communicate through

message passing to achieve better image generation. The objectives of the

individual agents in this framework are two fold: a co-operation objective and

a competing objective. The co-operation objective ensures that the message

sharing mechanism guides the other generator to generate better than itself

while the competing objective encourages each generator to generate better than

its counterpart. We analyze and visualize the messages that these GANs share

among themselves in various scenarios. We quantitatively show that the message

sharing formulation serves as a regularizer for the adversarial training.

Qualitatively, we show that the different generators capture different traits

of the underlying data distribution.

Deep Image Category Discovery using a Transferred Similarity Function

Yen-Chang Hsu , Zhaoyang Lv , Zsolt Kira

Comments: 13 pages, 9 figures

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Learning (cs.LG)

Automatically discovering image categories in unlabeled natural images is one

of the important goals of unsupervised learning. However, the task is

challenging and even human beings define visual categories based on a large

amount of prior knowledge. In this paper, we similarly utilize prior knowledge

to facilitate the discovery of image categories. We present a novel end-to-end

network to map unlabeled images to categories as a clustering network. We

propose that this network can be learned with contrastive loss which is only

based on weak binary pair-wise constraints. Such binary constraints can be

learned from datasets in other domains as transferred similarity functions,

which mimic a simple knowledge transfer. We first evaluate our experiments on

the MNIST dataset as a proof of concept, based on predicted similarities

trained on Omniglot, showing a 99/% accuracy which significantly outperforms

clustering based approaches. Then we evaluate the discovery performance on

Cifar-10, STL-10, and ImageNet, which achieves both state-of-the-art accuracy

and shows it can be scalable to various large natural images.

Known Unknowns: Uncertainty Quality in Bayesian Neural Networks

Ramon Oliveira , Pedro Tabacof , Eduardo Valle

Comments: Workshop on Bayesian Deep Learning, NIPS 2016, Barcelona, Spain

Subjects

:

Machine Learning (stat.ML)

; Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

We evaluate the uncertainty quality in neural networks using anomaly

detection. We extract uncertainty measures (e.g. entropy) from the predictions

of candidate models, use those measures as features for an anomaly detector,

and gauge how well the detector differentiates known from unknown classes. We

assign higher uncertainty quality to candidate models that lead to better

detectors. We also propose a novel method for sampling a variational

approximation of a Bayesian neural network, called One-Sample Bayesian

Approximation (OSBA). We experiment on two datasets, MNIST and CIFAR10. We

compare the following candidate neural network models: Maximum Likelihood,

Bayesian Dropout, OSBA, and — for MNIST — the standard variational

approximation. We show that Bayesian Dropout and OSBA provide better

uncertainty information than Maximum Likelihood, and are essentially equivalent

to the standard variational approximation, but much faster.

Learnable Structured Clustering Framework for Deep Metric Learning

Hyun Oh Song , Stefanie Jegelka , Vivek Rathod , Kevin Murphy Subjects : Computer Vision and Pattern Recognition (cs.CV) ; Learning (cs.LG)

Learning the representation and the similarity metric in an end-to-end

fashion with deep networks have demonstrated outstanding results for clustering

and retrieval. However, these recent approaches still suffer from the

performance degradation stemming from the local metric training procedure which

is unaware of the global structure of the embedding space.

We propose a global metric learning scheme for optimizing the deep metric

embedding with the learnable clustering function and the clustering metric

(NMI) in a novel structured prediction framework.

Our experiments on CUB200-2011, Cars196, and Stanford online products

datasets show state of the art performance both on the clustering and retrieval

tasks measured in the NMI and Recall@K evaluation metrics.

Optimal and Adaptive Off-policy Evaluation in Contextual Bandits

Yu-Xiang Wang , Alekh Agarwal , Miroslav Dudik Subjects : Machine Learning (stat.ML) ; Learning (cs.LG)

We consider the problem of off-policy evaluation—estimating the value of a

target policy using data collected by another policy—under the contextual

bandit model. We establish a minimax lower bound on the mean squared error

(MSE), and show that it is matched up to constant factors by the inverse

propensity scoring (IPS) estimator. Since in the multi-armed bandit problem the

IPS is suboptimal (Li et. al, 2015), our result highlights the difficulty of

the contextual setting with non-degenerate context distributions. We further

consider improvements on this minimax MSE bound, given access to a reward

model. We show that the existing doubly robust approach, which utilizes such a

reward model, may continue to suffer from high variance even when the reward

model is perfect. We propose a new estimator called SWITCH which more

effectively uses the reward model and achieves a superior bias-variance

tradeoff compared with prior work. We prove an upper bound on its MSE and

demonstrate its benefits empirically on a diverse collection of datasets, often

seeing orders of magnitude improvements over a number of baselines.

Intra-day Activity Better Predicts Chronic Conditions

Tom Quisel , David C. Kale , Luca Foschini

Comments: Presented at the NIPS 2016 Workshop on Machine Learning for Health

Subjects

:

Machine Learning (stat.ML)

; Learning (cs.LG)

In this work we investigate intra-day patterns of activity on a population of

7,261 users of mobile health wearable devices and apps. We show that: (1) using

intra-day step and sleep data recorded from passive trackers significantly

improves classification performance on self-reported chronic conditions related

to mental health and nervous system disorders, (2) Convolutional Neural

Networks achieve top classification performance vs. baseline models when

trained directly on multivariate time series of activity data, and (3) jointly

predicting all condition classes via multi-task learning can be leveraged to

extract features that generalize across data sets and achieve the highest

classification performance.

Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision (Short Version)

Chen Liang , Jonathan Berant , Quoc Le , Kenneth D. Forbus , Ni Lao

Comments: Published in NAMPI workshop at NIPS 2016. Short version of arXiv:1611.00020

Subjects

:

Computation and Language (cs.CL)

; Artificial Intelligence (cs.AI); Learning (cs.LG)

Extending the success of deep neural networks to natural language

understanding and symbolic reasoning requires complex operations and external

memory. Recent neural program induction approaches have attempted to address

this problem, but are typically limited to differentiable memory, and

consequently cannot scale beyond small synthetic tasks. In this work, we

propose the Manager-Programmer-Computer framework, which integrates neural

networks with non-differentiable memory to support abstract, scalable and

precise operations through a friendly neural computer interface. Specifically,

we introduce a Neural Symbolic Machine, which contains a sequence-to-sequence

neural “programmer”, and a non-differentiable “computer” that is a Lisp

interpreter with code assist. To successfully apply REINFORCE for training, we

augment it with approximate gold programs found by an iterative maximum

likelihood training process. NSM is able to learn a semantic parser from weak

supervision over a large knowledge base. It achieves new state-of-the-art

performance on WebQuestionsSP, a challenging semantic parsing dataset, with

weak supervision. Compared to previous approaches, NSM is end-to-end, therefore

does not rely on feature engineering or domain specific knowledge.

On the propriety of restricted Boltzmann machines

Andee Kaplan , Daniel Nordman , Stephen Vardeman

Comments: 35 pages, 13 figures

Subjects

:

Machine Learning (stat.ML)

; Learning (cs.LG)

A restricted Boltzmann machine (RBM) is an undirected graphical model

constructed for discrete or continuous random variables, with two layers, one

hidden and one visible, and no conditional dependency within a layer. In recent

years, RBMs have risen to prominence due to their connection to deep learning.

By treating a hidden layer of one RBM as the visible layer in a second RBM, a

deep architecture can be created. RBMs are thought to thereby have the ability

to encode very complex and rich structures in data, making them attractive for

supervised learning. However, the generative behavior of RBMs is largely

unexplored. In this paper, we discuss the relationship between RBM parameter

specification in the binary case and the tendency to undesirable model

properties such as degeneracy, instability and uninterpretability. We also

describe the difficulties that arise in likelihood-based and Bayes fitting of

such (highly flexible) models, especially as Gibbs sampling (quasi-Bayes)

methods are often advocated for the RBM model structure.

Deep Learning of Robotic Tasks using Strong and Weak Human Supervision

Bar Hilleli , Ran El-Yaniv Subjects : Artificial Intelligence (cs.AI) ; Learning (cs.LG); Robotics (cs.RO)

We propose a scheme for training a computerized agent to perform complex

human tasks such as highway steering. The scheme resembles natural

teaching-learning processes used by humans to teach themselves and each other

complex tasks, and consists of the following four stages. In the first stage

the agent learns by itself an informative low-dimensional representations of

raw input signals in an unsupervised learning manner. In the second stage the

agent learns to mimic the human instructor using supervised learning so as to

reach a basic performance level; the third stage is devoted to learning an

instantaneous reward model. Here, the (human) instructor observes (possibly in

real time) the agent performing the task and provides reward feedback. During

this stage the agent monitors both itself and the instructor feedback and

learns a reward model using supervised learning. This stage terminates when the

reward model is sufficiently accurate. In the last stage a reinforcement

learning algorithm is deployed to optimize the agent policy. The guidance

reward signal in the reinforcement learning algorithm relies on the previously

learned reward model. As a proof of concept for the proposed scheme, we

designed a system consisting of deep convolutional neural networks, and applied

it to successfully learn a computerized agent capable of autonomous highway

steering over the well-known racing game Assetto Corsa.

Enhancing Use Case Points Estimation Method Using Soft Computing Techniques

Ali Bou Nassif , Luiz Fernando Capretz , Danny Ho Subjects : Software Engineering (cs.SE) ; Artificial Intelligence (cs.AI); Learning (cs.LG)

Software estimation is a crucial task in software engineering. Software

estimation encompasses cost, effort, schedule, and size. The importance of

software estimation becomes critical in the early stages of the software life

cycle when the details of software have not been revealed yet. Several

commercial and non-commercial tools exist to estimate software in the early

stages. Most software effort estimation methods require software size as one of

the important metric inputs and consequently, software size estimation in the

early stages becomes essential. One of the approaches that has been used for

about two decades in the early size and effort estimation is called use case

points. Use case points method relies on the use case diagram to estimate the

size and effort of software projects. Although the use case points method has

been widely used, it has some limitations that might adversely affect the

accuracy of estimation. This paper presents some techniques using fuzzy logic

and neural networks to improve the accuracy of the use case points method.

Results showed that an improvement up to 22% can be obtained using the proposed

approach.

Algorithmic Songwriting with ALYSIA

Margareta Ackerman , David Loker Subjects : Artificial Intelligence (cs.AI) ; Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD)

Support vector machines (SVMs) have been recognized as a potential tool for

supervised classification analyses in different domains of research. In

essence, SVM is a binary classifier. Therefore, in case of a multiclass

problem, the problem is divided into a series of binary problems which are

solved by binary classifiers, and finally the classification results are

combined following either the one-against-one or one-against-all strategies. In

this paper, an attempt has been made to classify lithology using a multiclass

SVM based framework using well logs as predictor variables. Here, the lithology

is classified into four classes such as sand, shaly sand, sandy shale and shale

based on the relative values of sand and shale fractions as suggested by an

expert geologist. The available dataset consisting well logs (gamma ray,

neutron porosity, density, and P-sonic) and class information from four closely

spaced wells from an onshore hydrocarbon field is divided into training and

testing sets. We have used one-against-all strategy to combine the results of

multiple binary classifiers. The reported results established the superiority

of multiclass SVM compared to other classifiers in terms of classification

accuracy. The selection of kernel function and associated parameters has also

been investigated here. It can be envisaged from the results achieved in this

study that the proposed framework based on multiclass SVM can further be used

to solve classification problems. In future research endeavor, seismic

attributes can be introduced in the framework to classify the lithology

throughout a study area from seismic inputs.

Modeling trajectories of mental health: challenges and opportunities

Lauren Erdman , Ekansh Sharma , Eva Unternahrer , Shantala Hari Dass , Kieran ODonnell , Sara Mostafavi , Rachel Edgar , Michael Kobor , Helene Gaudreau , Michael Meaney , Anna Goldenberg

Comments: extended abstract for ML4HC at NIPS 2016, 4 pages

Subjects

:

Machine Learning (stat.ML)

; Learning (cs.LG); Applications (stat.AP)

More than two thirds of mental health problems have their onset during

childhood or adolescence. Identifying children at risk for mental illness later

in life and predicting the type of illness is not easy. We set out to develop a

platform to define subtypes of childhood social-emotional development using

longitudinal, multifactorial trait-based measures. Subtypes discovered through

this study could ultimately advance psychiatric knowledge of the early

behavioural signs of mental illness. To this extent we have examined two types

of models: latent class mixture models and GP-based models. Our findings

indicate that while GP models come close in accuracy of predicting future

trajectories, LCMMs predict the trajectories as well in a fraction of the time.

Unfortunately, neither of the models are currently accurate enough to lead to

immediate clinical impact. The available data related to the development of

childhood mental health is often sparse with only a few time points measured

and require novel methods with improved efficiency and accuracy.

Large scale modeling of antimicrobial resistance with interpretable classifiers

Alexandre Drouin , Frédéric Raymond , Gaël Letarte St-Pierre , Mario Marchand , Jacques Corbeil , François Laviolette

Comments: Peer-reviewed and accepted for presentation at the Machine Learning for Health Workshop, NIPS 2016, Barcelona, Spain

Subjects

:

Genomics (q-bio.GN)

; Learning (cs.LG); Machine Learning (stat.ML)

Antimicrobial resistance is an important public health concern that has

implications in the practice of medicine worldwide. Accurately predicting

resistance phenotypes from genome sequences shows great promise in promoting

better use of antimicrobial agents, by determining which antibiotics are likely

to be effective in specific clinical cases. In healthcare, this would allow for

the design of treatment plans tailored for specific individuals, likely

resulting in better clinical outcomes for patients with bacterial infections.

In this work, we present the recent work of Drouin et al. (2016) on using Set

Covering Machines to learn highly interpretable models of antibiotic resistance

and complement it by providing a large scale application of their method to the

entire PATRIC database. We report prediction results for 36 new datasets and

present the Kover AMR platform, a new web-based tool allowing the visualization

and interpretation of the generated models.

Transformation Function Based Methods for Model Shift

Simon Shaolei Du , Jayanth Koushik , Aarti Singh , Barnabas Poczos Subjects : Machine Learning (stat.ML) ; Learning (cs.LG)

Transfer learning techniques are often used when one tries to adapt a model

learned from a source domain with abundant labeled samples to the target domain

with limited labeled samples. In this paper, we consider the regression problem

under model shift condition, i.e., regression functions are different but

related in the source and target domains. We approach this problem through the

use of transformation functions which characterize the relation between the

source and the target domain. These transformation functions are able to

transform the original problem of learning the complicated regression function

of target domain into a problem of learning a simple auxiliary function. This

transformation function based technique includes some previous works as special

cases, but the class we propose is significantly more general. In this work we

consider two widely used non-parametric estimators, Kernel Smoothing (KS) and

Kernel Ridge Regression (KRR) for this setting and show improved statistical

rates for excess risk than non-transfer learning. Through an (epsilon)-cover

technique, we show that we can find the best transformation function a function

class. Lastly, experiments on synthesized, robotics and neural imaging data

demonstrate the effectiveness of our framework.

Estimating latent feature-feature interactions in large feature-rich graphs

Corrado Monti , Paolo Boldi Subjects : Social and Information Networks (cs.SI) ; Learning (cs.LG); Machine Learning (stat.ML)

Real-world complex networks describe connections between objects; in reality,

those objects are often endowed with some kind of features. How does the

presence or absence of such features interplay with the network link structure?

Although the situation here described is truly ubiquitous, there is a limited

body of research dealing with large graphs of this kind. Many previous works

considered homophily as the only possible transmission mechanism translating

node features into links. Other authors, instead, developed more sophisticated

models, that are able to handle complex feature interactions, but are unfit to

scale to very large networks. We expand on the MGJ model, where interactions

between pairs of features can foster or discourage link formation. In this

work, we will investigate how to estimate the latent feature-feature

interactions in this model. We shall propose two solutions: the first one

assumes feature independence and it is essentially based on Naive Bayes; the

second one, which relaxes the independence assumption assumption, is based on

perceptrons. In fact, we show it is possible to cast the model equation in

order to see it as the prediction rule of a perceptron. We analyze how

classical results for the perceptrons can be interpreted in this context; then,

we define a fast and simple perceptron-like algorithm for this task, which can

process (10^8) links in minutes. We then compare these two techniques, first

with synthetic datasets that follows our model, gaining evidence that the Naive

independence assumptions are detrimental in practice. Secondly, we consider a

real, large-scale citation network where each node (i.e., paper) can be

described by different types of characteristics; there, our algorithm can

assess how well each set of features can explain the links, and thus finding

meaningful latent feature-feature interactions.

End-to-End Joint Learning of Natural Language Understanding and Dialogue Manager

Xuesong Yang , Yun-Nung Chen , Dilek Hakkani-Tur , Paul Crook , Xiujun Li , Jianfeng Gao , Li Deng Subjects : Computation and Language (cs.CL) ; Learning (cs.LG)

Natural language understanding and dialogue policy learning are both

essential in conversational systems that predict the next system actions in

response to a current user utterance. Conventional approaches aggregate

separate models of natural language understanding (NLU) and system action

prediction (SAP) as a pipeline that is sensitive to noisy outputs of

error-prone NLU. To address the issues, we propose an end-to-end deep recurrent

neural network with limited contextual dialogue memory by jointly training NLU

and SAP on DSTC4 multi-domain human-human dialogues. Experiments show that our

proposed model significantly outperforms the state-of-the-art pipeline models

for both NLU and SAP, which indicates that our joint model is capable of

mitigating the affects of noisy NLU outputs, and NLU model can be refined by

error flows backpropagating from the extra supervised signals of system

actions.

Parameter Compression of Recurrent Neural Networks and Degredation of Short-term Memory

Jonathan A. Cox

Comments: Submitted to IJCNN 2017

Subjects

:

Computer Vision and Pattern Recognition (cs.CV)

; Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)

The significant computational costs of deploying neural networks in

large-scale or resource constrained environments, such as data centers and

mobile devices, has spurred interest in model compression, which can achieve a

reduction in both arithmetic operations and storage memory. Several techniques

have been proposed for reducing or compressing the parameters for feed-forward

and convolutional neural networks, but less is understood about the effect of

parameter compression on recurrent neural networks (RNN). In particular, the

extent to which the recurrent parameters can be compressed and the impact on

short-term memory performance, is not well understood. In this paper, we study

the effect of complexity reduction, through singular value decomposition rank

reduction, on RNN and minimal gated recurrent unit (MGRU) networks for several

tasks. We show that considerable rank reduction is possible when compressing

recurrent weights, even without fine tuning. Furthermore, we propose a

perturbation model for the effect of general perturbations, such as a

compression, on the recurrent parameters of RNNs. The model is tested against a

noiseless memorization experiment that elucidates the short-term memory

performance. In this way, we demonstrate that the effect of compression of

recurrent parameters is dependent on the degree of temporal coherence present

in the data and task. This work can guide on-the-fly RNN compression for novel

environments or tasks, and provides insight for applying RNN compression in

low-power devices, such as hearing aids.

Information Theory

Capacity Regions of Two-Receiver Broadcast Erasure Channels with Feedback and Memory

Michael Heindlmaier , Shirin Saeedi Bidokhti Subjects : Information Theory (cs.IT)

The two-receiver broadcast packet erasure channel with feedback and memory is

studied. Memory is modeled using a finite- state Markov chain representing a

channel state. Two scenarios are considered: (i) when the transmitter has

causal knowledge of the channel state (i.e., the state is visible), and (ii)

when the channel state is unknown at the transmitter, but observations of it

are available at the transmitter through feedback (i.e., the state is hidden).

In both scenarios, matching outer and inner bounds are derived on the rates of

communication and the capacity region is determined. It is shown that similar

results carry over to channels with memory and delayed feedback, and memoryless

compound channels with feedback. When the state is visible, the capacity region

has a single-letter characterization and is in terms of a linear program. Two

optimal coding schemes are devised that use feedback to keep track of the

sent/received packets via a network of queues: a probabilistic scheme and a

deterministic backpressure-like algorithm. The former bases its decisions

solely on the past channel state information and the latter follows a

max-weight queue-based policy. The performance of the algorithms are analyzed

using the frameworks of rate-stability in networks of queues, max-flow min-cut

duality in networks, and finite-horizon Lyapunov drift analysis. When the state

is hidden, the capacity region does not have a single-letter characterization

and is, in this sense, uncomputable. Approximations of the capacity region are

provided and two optimal coding algorithms are outlined. The first algorithm is

a probabilistic coding scheme that bases its decisions on the past L feedback

sequences and its achievable rate-region approaches the capacity region

exponentially fast in L. The second algorithm is a backpressure-like algorithm

that performs optimally in the long run.

Approximate Support Recovery of Atomic Line Spectral Estimation: A Tale of Resolution and Precision

Qiuwei Li , Gongguo Tang Subjects : Information Theory (cs.IT) ; Optimization and Control (math.OC)

This work investigates the parameter estimation performance of

super-resolution line spectral estimation using atomic norm minimization. The

focus is on analyzing the algorithm’s accuracy of inferring the frequencies and

complex magnitudes from noisy observations. When the Signal-to-Noise Ratio is

reasonably high and the true frequencies are separated by (O(frac{1}{n})), the

atomic norm estimator is shown to localize the correct number of frequencies,

each within a neighborhood of size (O(sqrt{frac{log n}{n^3}} sigma)) of one

of the true frequencies. Here (n) is half the number of temporal samples and

(sigma^2) is the Gaussian noise variance. The analysis is based on a

primal-dual witness construction procedure. The obtained error bound matches

the Cram’er-Rao lower bound up to a logarithmic factor. The relationship

between resolution (separation of frequencies) and precision or accuracy of the

estimator is highlighted. Our analysis also reveals that the atomic norm

minimization can be viewed as a convex way to solve a (ell_1)-norm

regularized, nonlinear and nonconvex least-squares problem to global

optimality.

Repairing Reed-Solomon Codes With Multiple Erasures

Hoang Dau , Iwan Duursma , Han Mao Kiah , Olgica Milenkovic

Comments: 16 pages

Subjects

:

Information Theory (cs.IT)

Despite their exceptional error-correcting properties, Reed-Solomon (RS)

codes have been overlooked in distributed storage applications due to the

common belief that they have poor repair bandwidth: A naive repair approach

would require the whole file to be reconstructed in order to recover a single

erased codeword symbol. In a recent work, Guruswami and Wootters (STOC’16)

proposed a single-erasure repair method for RS codes that achieves the optimal

repair bandwidth amongst all linear encoding schemes. Their key idea is to

recover the erased symbol by collecting a sufficiently large number of its

traces, each of which can be constructed from a number of traces of other

symbols. As all traces belong to a subfield of the defining field of the RS

code and many of them are linearly dependent, the total repair bandwidth is

significantly reduced compared to that of the naive repair scheme. We extend

the trace collection technique to cope with multiple erasures.

Rate-Compatible Punctured Polar Codes: Optimal Construction Based on Polar Spectra

Kai Niu , Jincheng Dai , Kai Chen , Jiaru Lin , Q. T. Zhang , Athanasios V. Vasilakos

Comments: Submitted to IEEE Transactions on Information Theory

Subjects

:

Information Theory (cs.IT)

Polar codes are the first class of constructive channel codes achieving the

symmetric capacity of the binary-input discrete memoryless channels. But the

corresponding code length is limited to the power of two. In this paper, we

establish a systematic framework to design the rate-compatible punctured polar

(RCPP) codes with arbitrary code length. A new theoretic tool, called polar

spectra, is proposed to count the number of paths on the code tree with the

same number of zeros or ones respectively. Furthermore, a spectrum distance SD0

(SD1) and a joint spectrum distance (JSD) are presented as performance criteria

to optimize the puncturing tables. For the capacity-zero puncturing mode

(punctured bits are unknown to the decoder), we propose a quasi-uniform

puncturing algorithm, analyze the number of equivalent puncturings and prove

that this scheme can maximize SD1 and JSD. Similarly, for the capacity-one mode

(punctured bits are known to the decoder), we also devise a reversal

quasi-uniform puncturing scheme and prove that it has the maximum SD0 and JSD.

Both schemes have a universal puncturing table without any exhausted search.

These optimal RCPP codes outperform the performance of turbo codes in LTE

wireless communication systems.

On the Capacity of Discrete-Time Laguerre Optical Channels

Hossein Esmaeili , Jawad Salehi Subjects : Information Theory (cs.IT)

In this paper, new upper and lower bounds are proposed for the capacity of

discrete-time Laguerre channel. Laguerre behavior is used to model various

types of optical systems and networks such as optical amplifiers, short

distance visible light communication systems with direct detection and coherent

code division multiple access (CDMA) networks. Bounds are derived for short

distance visible light communication systems and coherent CDMA networks. These

bounds are separated in three main cases: when both average and peak power

constraints are imposed, when peak power constraint is inactive and when only

peak power constraint is active.

In-network Compression for Multiterminal Cascade MIMO Systems

Inaki Estella Aguerri , Abdellatif Zaidi

Comments: Submitted to IEEE Transactions on Communications

Subjects

:

Information Theory (cs.IT)

We study the problem of receive beamforming in uplink cascade multiple-input

multiple-output (MIMO) systems as an instance of that of cascade multiterminal

source coding for lossy function computation. Using this connection, we develop

two coding schemes for the second and show that their application leads to

efficient beamforming schemes for the first. In the first coding scheme, each

terminal in the cascade sends a description of the source that it observes; the

decoder reconstructs all sources, lossily, and then computes an estimate of the

desired function. This scheme improves upon standard routing in that every

terminal only compresses the innovation of its source w.r.t. the descriptions

that are sent by the previous terminals in the cascade. In the second scheme,

the desired function is computed gradually in the cascade network, and each

terminal sends a finer description of it. In the context of uplink cascade MIMO

systems, the application of these two schemes leads to efficient centralized

receive-beamforming and distributed receive-beamforming, respectively.

Towards a Tractable Delay Analysis in Large Wireless Networks

Yi Zhong , Martin Haenggi , Fu-Chun Zheng , Wenyi Zhang , Tony Q.S. Quek , Weili Nie

Comments: 17 pages, 4 figures, 1 table, submitted to IEEE Communications Magazine

Subjects

:

Information Theory (cs.IT)

; Networking and Internet Architecture (cs.NI)

Meeting the diverse delay requirements of next-generation wireless

communication networks is one of the most critical goals of wireless system

design. Though the delay of point-to-point communications has been well

investigated using classical queueing theory, the delay of multi-point to

multi-point communications has not been explored in depth. The main technical

difficulty lies in the interacting queues problem, in which the service rate is

coupled with the statuses of other queues. In this article, we elaborate on the

main challenges of delay analysis in large wireless networks. Several promising

approaches to bypass these difficulties are proposed and summarized to provide

useful guidance.

Cache-Enabled Physical-Layer Security for Video Streaming in Wireless Networks with Limited Backhaul

Lin Xiang , Derrick Wing Kwan Ng , Robert Schober , Vincent W.S. Wong

Comments: Accepted for presentation at IEEE Globecom 2016, Washington, DC, Dec. 2016

Subjects

:

Information Theory (cs.IT)

In this paper, we investigate for the first time the benefits of wireless

caching for the physical layer security (PLS) of wireless networks. In

particular, a caching scheme enabling power-efficient PLS is proposed for

cellular video streaming with constrained backhaul capacity. By sharing video

data across a subset of base stations (BSs) through both caching and backhaul

loading, secure cooperative transmission of several BSs is dynamically enabled

in accordance with the cache status, the channel conditions, and the backhaul

capacity. Thereby, caching reduces the data sharing overhead over the

capacity-constrained backhaul links. More importantly, caching introduces

additional secure degrees of freedom and enables a power-efficient design. We

investigate the optimal caching and transmission policies for minimizing the

total transmit power while providing quality of service (QoS) and guaranteeing

secrecy during video delivery. A two-stage non-convex mixed-integer

optimization problem is formulated, which optimizes the caching policy in an

offline video caching stage and the cooperative transmission policy in an

online video delivery stage. As the problem is NP-hard, suboptimal

polynomial-time algorithms are proposed for low-complexity cache training and

delivery control, respectively. Sufficient optimality conditions, under which

the proposed schemes attain global optimal solutions, are also provided.

Simulation results show that the proposed schemes achieve low secrecy outage

probability and high power efficiency simultaneously.

Vector Approximate Message Passing for the Generalized Linear Model

Philip Schniter , Sundeep Rangan , Alyson K. Fletcher Subjects : Information Theory (cs.IT)

The generalized linear model (GLM), where a random vector (oldsymbol{x}) is

observed through a noisy, possibly nonlinear, function of a linear transform

output (oldsymbol{z}=oldsymbol{Ax}), arises in a range of applications such

as robust regression, binary classification, quantized compressed sensing,

phase retrieval, photon-limited imaging, and inference from neural spike

trains. When (oldsymbol{A}) is large and i.i.d. Gaussian, the generalized

approximate message passing (GAMP) algorithm is an efficient means of MAP or

marginal inference, and its performance can be rigorously characterized by a

scalar state evolution. For general (oldsymbol{A}), though, GAMP can

misbehave. Damping and sequential-updating help to robustify GAMP, but their

effects are limited. Recently, a “vector AMP” (VAMP) algorithm was proposed for

additive white Gaussian noise channels. VAMP extends AMP’s guarantees from

i.i.d. Gaussian (oldsymbol{A}) to the larger class of rotationally invariant

(oldsymbol{A}). In this paper, we show how VAMP can be extended to the GLM.

Numerical experiments show that the proposed GLM-VAMP is much more robust to

ill-conditioning in (oldsymbol{A}) than damped GAMP.

Onsager-Corrected Deep Networks for Sparse Linear Inverse Problems

Mark Borgerding , Philip Schniter

Comments: arXiv admin note: text overlap with arXiv:1607.05966

Subjects

:

Information Theory (cs.IT)

Deep learning has gained great popularity due to its widespread success on

many inference problems. We consider the application of deep learning to the

sparse linear inverse problem encountered in compressive sensing, where one

seeks to recover a sparse signal from a few noisy linear measurements. In this

paper, we propose two novel neural-network architectures that decouple

prediction errors across layers in the same way that the approximate message

passing (AMP) algorithms decouple them across iterations: through Onsager

correction. We show numerically that our “learned AMP” network significantly

improves upon Gregor and LeCun’s “learned ISTA” when both use soft-thresholding

shrinkage. We then show that additional improvements result from jointly

learning the shrinkage functions together with the linear transforms. Finally,

we propose a network design inspired by an unfolding of the recently proposed

“vector AMP” (VAMP) algorithm, and show that it outperforms all previously

considered networks. Interestingly, the linear transforms and shrinkage

functions prescribed by VAMP coincide with the values learned through

backpropagation, yielding an intuitive explanation for the design of this deep

network.

Novel Delivery Schemes for Decentralized Coded Caching in the Finite File Size Regime

Kai Wan , Daniela Tuninetti , Pablo Piantanida

Comments: 6 pages, 1 figure, submitted to ICC 2017-WT10

Subjects

:

Information Theory (cs.IT)

This paper analyzes the achievable tradeoff between cache~size and

download~rate in decentralized caching systems with the uncoded cache placement

originally proposed by Maddah-Ali and Niesen. It proposes two novel delivery

schemes that take advantage of the multicasting opportunities that arise when a

file is demanded by multiple users. These delivery schemes are extensions of

known ones to the regime where the file size is finite. Numerical evaluations

for the case of file uniform popularity show that the proposed schemes

outperform previous ones for all value of the cache size.

On the Performance of Visible Light Communications Systems with Non-Orthogonal Multiple Access

Hanaa Marshoud , Paschalis C. Sofotasios , Sami Muhaidat , George K. Karagiannidis Subjects : Information Theory (cs.IT)

Visible light communications (VLC) have been recently proposed as a promising

and efficient solution to indoor ubiquitous broadband connectivity. In this

paper, non-orthogonal multiple access, which has been recently proposed as an

effective scheme for fifth generation (5G) wireless networks, is considered in

the context of VLC systems, under different channel uncertainty models. To this

end, we first derive a novel closed-form expression for the bit-error-rate

(BER) under perfect channel state information (CSI). Capitalizing on this, we

quantify the effect of noisy and outdated CSI by deriving a simple approximated

expression for the former and a tight upper bound for the latter. The offered

results are corroborated by respective results from extensive Monte Carlo

simulations and are used to provide useful insights on the effect of imperfect

CSI knowledge on the system performance. It was shown that, while noisy CSI

leads to slight degradation in the BER performance, outdated CSI can cause

detrimental performance degradation if the order of the users’ channel gains

change as a result of mobility

Linear Codes over (mathbb{F}_{q}[x]/(x^2)) and (GR(p^2,m)) Reaching the Griesmer Bound

Jin Li , Aixian Zhang , Keqin Feng Subjects : Information Theory (cs.IT)

We construct two series of linear codes over finite ring

(mathbb{F}_{q}[x]/(x^2)) and Galois ring (GR(p^2,m)) respectively reaching the

Griesmer bound. They derive two series of codes over finite field

(mathbb{F}_{q}) by Gray map. The first series of codes over (mathbb{F}_{q})

derived from (mathbb{F}_{q}[x]/(x^2)) are linear and also reach the Griesmer

bound in some cases. Many of linear codes over finite field we constructed have

two Hamming (non-zero) weights.

A Mathematical Proof of the Superiority of NOMA Compared to Conventional OMA

Zhiyong Chen , Zhiguo Ding , Xuchu Dai , Rui Zhang

Comments: 28 pages, 8 figures, submitted to IEEE Transactions on Signal Processing

Subjects

:

Information Theory (cs.IT)

While existing works about non-orthogonal multiple access (NOMA) have

indicated that NOMA can yield a significant performance gain over orthogonal

multiple access (OMA) with fixed resource allocation, it is not clear whether

such a performance gain will diminish when optimal resource

(Time/Frequency/Power) allocation is carried out. In this paper, the

performance comparison between NOMA and conventional OMA systems is

investigated, from an optimization point of view. Firstly, by using the idea of

power splitting, a closed-form expression for the optimum sum rate of NOMA

systems is derived. Then, with rigorous mathematical proofs, we reveal the fact

that NOMA can always outperform conventional OMA systems, even if both are

equipped with the optimal resource allocation policies. Finally, computer

simulations are conducted to validate the accuracy of the analytical results.

Placement Optimization of UAV-Mounted Mobile Base Stations

Jiangbin Lyu , Yong Zeng , Rui Zhang , Teng Joon Lim

Comments: 4 pages, 3 figures, 1 table, accepted for publications in IEEE Communications Letters

Subjects

:

Information Theory (cs.IT)

In terrestrial communication networks without fixed infrastructure, unmanned

aerial vehicle (UAV)-mounted mobile base stations (MBSs) provide an efficient

solution to achieve wireless connectivity. This letter aims to minimize the

number of MBSs needed to provide wireless coverage for a group of distributed

ground terminals (GTs), ensuring that each GT is within the communication range

of at least one MBS. We propose a polynomial-time algorithm with successive MBS

placement, where the MBSs are placed sequentially starting on the area

perimeter of the uncovered GTs along a spiral path towards the center, until

all GTs are covered. Each MBS is placed to cover as many uncovered GTs as

possible, with higher priority given to the GTs on the boundary to reduce the

occurrence of outlier GTs that each may require one dedicated MBS for its

coverage. Numerical results show that the proposed algorithm performs favorably

compared to other schemes in terms of the total number of required MBSs and/or

time complexity.

A Protocol for a Secure Remote Keyless Entry System Applicable in Vehicles using Symmetric-Key Cryptography

Tobias Glocker , Timo Mantere , Mohammed Elmusrati Subjects : Information Theory (cs.IT) ; Cryptography and Security (cs.CR)

In our modern society comfort became a standard. This comfort, especially in

cars can only be achieved by equipping the car with more electronic devices.

Some of the electronic devices must cooperate with each other and thus they

require a communication channel, which can be wired or wireless. In these days,

it would be hard to sell a new car operating with traditional keys. Almost all

modern cars can be locked or unlocked with a Remote Keyless System. A Remote

Keyless System consists of a key fob that communicates wirelessly with the car

transceiver that is responsible for locking and unlocking the car. However

there are several threats for wireless communication channels.

This paper describes the possible attacks against a Remote Keyless System and

introduces a secure protocol as well as a lightweight Symmetric Encryption

Algorithm for a Remote Keyless Entry System applicable in vehicles.

Two new families of two-weight codes

Minjia Shi , Yue Guan , Patrick Sole

Comments: 13 pages, submitted on 10 November

Subjects

:

Information Theory (cs.IT)

We construct two new infinite families of trace codes of dimension (2m), over

the ring (mathbb{F}_p+umathbb{F}_p,) when (p) is an odd prime. They have the

algebraic structure of abelian codes. Their Lee weight distribution is computed

by using Gauss sums. By Gray mapping, we obtain two infinite families of linear

(p)-ary codes of respective lengths ((p^m-1)^2) and (2(p^m-1)^2.) When (m) is

singly-even, the first family gives five-weight codes. When (m) is odd, and

(pequiv 3 pmod{4},) the first family yields (p)-ary two-weight codes, which

are shown to be optimal by application of the Griesmer bound. The second family

consists of two-weight codes that are shown to be optimal, by the Griesmer

bound, whenever (p=3) and (m ge 3,) or (pge 5) and (mge 4.) Applications to

secret sharing schemes are given.

Optimal two weight codes from trace codes over a chain ring

Minjia Shi , Rongsheng Wu , Patrick Sole

Comments: 12 pages, submitted on 4 October. arXiv admin note: text overlap with arXiv:1612.00123 , arXiv:1612.00128

Subjects

:

Information Theory (cs.IT)

In this paper, we construct an infinite family of two-weight codes for the

homogeneous metric over the ring (R=mathbb{F}_2+umathbb{F}_2+cdots

+u^{k-1}mathbb{F}_{2}), where (u^k=0.) These codes are defined as trace codes.

They have the algebraic structure of abelian codes. Their homogeneous weight

distribution is computed by using character sums. In particular, we give a

necessary and sufficient condition of optimality for the Gray image codes by

using the Griesmer bound. We have shown that if (k>3), these images are

projective. Their support structure is determined. An application to secret

sharing schemes is given.

Several Classes of (p)-ary Linear Codes with Few Weights

Rongsheng Wu , Minjia Shi , Patrick Sole

Comments: 11 pages

Subjects

:

Information Theory (cs.IT)

Few weights codes have been an important topic of coding theory for decades.

In this paper, a class of two-weight and three-weight codes for the homogeneous

metric over the chain ring (R=mathbb{F}_p+umathbb{F}_p+cdots

+u^{k-1}mathbb{F}_{p}) are constructed. These codes are defined as trace

codes. They are shown to be abelian. Their homogeneous weight distributions are

computed by using exponential sums. In particular, we give a necessary and

sufficient condition of the optimality for their Gray images by using the

Griesmer bound in the two-weight case, and the information about the dual

homogeneous distance is also given. In addition, the codewords of these codes

have been turn out to be minimal, it is significant to obtain secret sharing

schemes with interesting access structures.

Some ternary cubic two-weight codes

Minjia Shi , Daitao Huang , Patrick Sole

Comments: 11 pages, submitted on 2 December. arXiv admin note: text overlap with arXiv:1612.00118

Subjects

:

Information Theory (cs.IT)

We study trace codes with defining set (L,) a subgroup of the multiplicative

group of an extension of degree (m) of the alphabet ring

(mathbb{F}_3+umathbb{F}_3+u^{2}mathbb{F}_{3},) with (u^{3}=1.) These codes

are abelian, and their ternary images are quasi-cyclic of co-index three

(a.k.a. cubic codes). Their Lee weight distributions are computed by using

Gauss sums. These codes have three nonzero weights when (m) is singly-even and

(|L|=frac{3^{3m}-3^{2m}}{2}.) When (m) is odd, and

(|L|=frac{3^{3m}-3^{2m}}{2}), or (|L|={3^{3m}-3^{2m}}) and (m) is a positive

integer, we obtain two new infinite families of two-weight codes which are

optimal. Applications of the image codes to secret sharing schemes are also

given.

SNIPE for Memory-Limited PCA From Incomplete Data

Armin Eftekhari , Laura Balzano , Dehui Yang , Michael B. Wakin Subjects : Information Theory (cs.IT)

The linear subspace model is pervasive in science and engineering and

particularly in large datasets which are often incomplete due to missing

measurements and privacy issues. Therefore, a critical problem in modeling is

to develop algorithms for estimating a low-dimensional subspace model from

incomplete data efficiently in terms of both computational complexity and

memory storage. In this paper we study an algorithm that processes blocks of

incomplete data to estimate the underlying subspace model. Our algorithm has a

simple interpretation as optimizing the subspace to fit the observed data block

but remain close to the previous estimate. We prove a linear rate of

convergence for the algorithm and our rate holds with high probability.

Analysis of finite sample size quantum hypothesis testing via martingale concentration inequalities

Cambyse Rouze , Nilanjana Datta

Comments: 24 pages, 2 figures. arXiv admin note: text overlap with arXiv:1510.04682

Subjects

:

Quantum Physics (quant-ph)

; Information Theory (cs.IT); Statistics Theory (math.ST)

Martingale concentration inequalities constitute a powerful mathematical tool

in the analysis of problems in a wide variety of fields ranging from

probability and statistics to information theory and machine learning. Here we

apply such inequalities to finite sample size quantum hypothesis testing, which

is the problem of discriminating quantum states belonging to two different

sequences ({

ho_n}_n) and ({sigma_n}_n), for a finite (n). We obtain

upper bounds on the type II Stein- and Hoeffding errors, which, for

i.i.d.~states, are in general tighter than the corresponding bounds obtained by

Audenaert, Mosonyi and Verstrate. Moreover, our bounds are also valid for

sequences of non-i.i.d.~states which satisfy certain conditions. We also use a

martingale concentration inequality to obtain an upper bound on the second

order asymptotics of the type II error exponent in the setting of quantum

Stein’s lemma, for certain classes of states.

Robust nonparametric nearest neighbor random process clustering

Michael Tschannen , Helmut Bölcskei

Comments: 13 pages, 4 figures

Subjects

:

Learning (cs.LG)

; Information Theory (cs.IT); Machine Learning (stat.ML)

We consider the problem of clustering noisy finite-length observations of

stationary ergodic random processes according to their generative models

without prior knowledge of the model statistics and the number of generative

models. Two algorithms, both using the L1-distance between estimated power

spectral densities (PSDs) as a measure of dissimilarity, are analyzed. The

first one, termed nearest neighbor process clustering (NNPC) is new and relies

on partitioning the nearest neighbor graph of the observations via spectral

clustering. The second algorithm, simply referred to as k-means (KM), consists

of a single k-means iteration with farthest point initialization and was

considered before in the literature, albeit with a different dissimilarity

measure and with asymptotic performance results only. We prove that both

algorithms succeed with high probability in the presence of noise and missing

entries, and even when the generative process PSDs overlap significantly, all

provided that the observation length is sufficiently large. Our results

quantify the tradeoff between the overlap of the generative process PSDs, the

observation length, the fraction of missing entries, and the noise variance.

Furthermore, we prove that treating the finite-length observations of

stationary ergodic random processes as vectors in Euclidean space and

clustering them using the thresholding-based subspace clustering (TSC)

algorithm, the subspace clustering cousin of NNPC, results in performance

strictly inferior to that of NNPC. We argue that the underlying cause is to be

found in TSC employing spherical distance as dissimilarity measure, thereby

ignoring the stationary process structure of the observations. Finally, we

provide extensive numerical results for synthetic and real data and find that

NNPC outperforms state-of-the-art algorithms in human motion sequence

clustering.

The Optimality of Correlated Sampling

Mohammad Bavarian , Badih Ghazi , Elad Haramaty , Pritish Kamath , Ronald L. Rivest , Madhu Sudan Subjects : Computational Complexity (cs.CC) ; Information Theory (cs.IT)

In the “correlated sampling” problem, two players, say Alice and Bob, are

given two distributions, say (P) and (Q) respectively, over the same universe

and access to shared randomness. The two players are required to output two

elements, without any interaction, sampled according to their respective

distributions, while trying to minimize the probability that their outputs

disagree. A well-known protocol due to Holenstein, with close variants (for

similar problems) due to Broder, and to Kleinberg and Tardos, solves this task

with disagreement probability at most (2 delta/(1+delta)), where (delta) is

the total variation distance between (P) and (Q). This protocol has been used

in several different contexts including sketching algorithms, approximation

algorithms based on rounding linear programming relaxations, the study of

parallel repetition and cryptography.

In this note, we give a surprisingly simple proof that this protocol is in

fact tight. Specifically, for every (delta in (0,1)), we show that any

correlated sampling scheme should have disagreement probability at least

(2delta/(1+delta)). This partially answers a recent question of Rivest.

Our proof is based on studying a new problem we call “constrained agreement”.

Here, Alice is given a subset (A subseteq [n]) and is required to output an

element (i in A), Bob is given a subset (B subseteq [n]) and is required to

output an element (j in B), and the goal is to minimize the probability that

(i

eq j). We prove tight bounds on this question, which turn out to imply

tight bounds for correlated sampling. Though we settle basic questions about

the two problems, our formulation also leads to several questions that remain

open.

Online Page Migration on Ring Networks in Uniform Model

Amanj Khorramian , Akira Matsubayashi

Comments: 8 pages, 5 figures

Subjects

:

Data Structures and Algorithms (cs.DS)

; Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT)

This paper explores the problem of page migration in ring networks. A ring

network is a connected graph, in which each node is connected with exactly two

other nodes. In this problem, one of the nodes in a given network holds a page

of size D. This node is called the server and the page is a non-duplicable data

in the network. Requests are issued by nodes to access the page one after

another. Every time a new request is issued, the server must serve the request

and may migrate to another node before the next request arrives. A service

costs the distance between the server and the requesting node, and the

migration costs the distance of the migration multiplied by D. The problem is

to minimize the total costs of services and migrations. We study this problem

in uniform model, for which the page has a unit size, i.e. D=1. A

3.326-competitive algorithm improving the current best upper bound is designed.

We show that this ratio is tight for our algorithm.

On The Fundamental Energy Tradeoffs of Geographical Load Balancing

Abbas Kiani , Nirwan Ansari

Comments: to appear IEEE Communications Magazine

Subjects

:

Networking and Internet Architecture (cs.NI)

; Information Theory (cs.IT)

Geographical load balancing can optimize the utilization of green energy and

the cost of electricity by taking the advantages of green and price diversities

at geographical dispersed data centers. However, higher green energy

utilization or lower electricity cost may actually increase the total energy

consumption, and is not necessarily the best option. The achievable energy

tradeoffs can be captured by taking into consideration of a defined service

efficiency parameter for geo-dispersed data centers.

Asymptotically Good Convolutional Codes

Giuliano Gadioli La Guardia Subjects : Combinatorics (math.CO) ; Information Theory (cs.IT)

In this paper, we construct new sequences of asymptotically good

convolutional codes. These sequences are obtained from sequences of transitive,

self-orthogonal and self-dual block codes that attain the Tsfasman-Vladut-Zink

bound. Furthermore, by applying the techniques of expanding, extending,

puncturing, direct sum, the |u|u+v| construction and the product code

construction to these block codes, we construct more new sequences of

asymptotically good convolutional codes. Additionally, we show that the

proposed construction method presented here also works when applied for all

sequences of good block codes where lim kj/nj and lim dj/nj exist.

欢迎加入我爱机器学习QQ7群: 467165306

arXiv Paper Daily: Tue, 6 Dec 2016

微信扫一扫,关注我爱机器学习公众号

微博:我爱机器学习

原文  https://www.52ml.net/21362.html
正文到此结束
Loading...