转载

arXiv Paper Daily: Mon, 16 Jan 2017

Neural and Evolutionary Computing

LIDE: Language Identification from Text Documents

Priyank Mathur , Arkajyoti Misra , Emrah Budur Subjects : Computation and Language (cs.CL) ; Neural and Evolutionary Computing (cs.NE)

The increase in the use of microblogging came along with the rapid growth on

short linguistic data. On the other hand deep learning is considered to be the

new frontier to extract meaningful information out of large amount of raw data

in an automated manner. In this study, we engaged these two emerging fields to

come up with a robust language identifier on demand, namely Language

Identification Engine (LIDE). As a result, we achieved 95.12% accuracy in

Discriminating between Similar Languages (DSL) Shared Task 2015 dataset, which

is comparable to the maximum reported accuracy of 95.54% achieved so far.

Computer Vision and Pattern Recognition

Tumour Ellipsification in Ultrasound Images for Treatment Prediction in Breast Cancer

Mehrdad J. Gangeh , Hamid R. Tizhoosh , Kan Wu , Dun Huang , Hadi Tadayyon , Gregory J. Czarnota

Comments: Accepted at BHI 2017

Subjects

Computer Vision and Pattern Recognition (cs.CV)

Recent advances in using quantitative ultrasound (QUS) methods have provided

a promising framework to non-invasively and inexpensively monitor or predict

the effectiveness of therapeutic cancer responses. One of the earliest steps in

using QUS methods is contouring a region of interest (ROI) inside the tumour in

ultrasound B-mode images. While manual segmentation is a very time-consuming

and tedious task for human experts, auto-contouring is also an extremely

difficult task for computers due to the poor quality of ultrasound B-mode

images. However, for the purpose of cancer response prediction, a rough

boundary of the tumour as an ROI is only needed. In this research, a

semi-automated tumour localization approach is proposed for ROI estimation in

ultrasound B-mode images acquired from patients with locally advanced breast

cancer (LABC). The proposed approach comprised several modules, including 1)

feature extraction using keypoint descriptors, 2) augmenting the feature

descriptors with the distance of the keypoints to the user-input pixel as the

centre of the tumour, 3) supervised learning using a support vector machine

(SVM) to classify keypoints as “tumour” or “non-tumour”, and 4) computation of

an ellipse as an outline of the ROI representing the tumour. Experiments with

33 B-mode images from 10 LABC patients yielded promising results with an

accuracy of 76.7% based on the Dice coefficient performance measure. The

results demonstrated that the proposed method can potentially be used as the

first stage in a computer-assisted cancer response prediction system for

semi-automated contouring of breast tumours.

Real-Time Optical flow-based Video Stabilization for Unmanned Aerial Vehicles

Anli Lim , Bharath Ramesh , Yue Yang , Cheng Xiang , Zhi Gao , Feng Lin

Comments: Journal Paper

Subjects

Computer Vision and Pattern Recognition (cs.CV)

This paper describes the development of a novel algorithm to tackle the

problem of real-time video stabilization for unmanned aerial vehicles (UAVs).

There are two main components in the algorithm: (1) By designing a suitable

model for the global motion of UAV, the proposed algorithm avoids the necessity

of estimating the most general motion model, projective transformation, and

considers simpler motion models, such as rigid transformation and similarity

transformation. (2) To achieve a high processing speed, optical-flow based

tracking is employed in lieu of conventional tracking and matching methods used

by state-of-the-art algorithms. These two new ideas resulted in a real-time

stabilization algorithm, developed over two phases. Stage I considers

processing the whole sequence of frames in the video while achieving an average

processing speed of 50fps on several publicly available benchmark videos. Next,

Stage II undertakes the task of real-time video stabilization using a

multi-threading implementation of the algorithm designed in Stage I.

Active Self-Paced Learning for Cost-Effective and Progressive Face Identification

Liang Lin , Keze Wang , Deyu Meng , Wangmeng Zuo , Lei Zhang

Comments: To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence 2017

Subjects

Computer Vision and Pattern Recognition (cs.CV)

This paper aims to develop a novel cost-effective framework for face

identification, which progressively maintains a batch of classifiers with the

increasing face images of different individuals. By naturally combining two

recently rising techniques: active learning (AL) and self-paced learning (SPL),

our framework is capable of automatically annotating new instances and

incorporating them into training under weak expert re-certification. We first

initialize the classifier using a few annotated samples for each individual,

and extract image features using the convolutional neural nets. Then, a number

of candidates are selected from the unannotated samples for classifier

updating, in which we apply the current classifiers ranking the samples by the

prediction confidence. In particular, our approach utilizes the high-confidence

and low-confidence samples in the self-paced and the active user-query way,

respectively. The neural nets are later fine-tuned based on the updated

classifiers. Such heuristic implementation is formulated as solving a concise

active SPL optimization problem, which also advances the SPL development by

supplementing a rational dynamic curriculum constraint. The new model finely

accords with the “instructor-student-collaborative” learning mode in human

education. The advantages of this proposed framework are two-folds: i) The

required number of annotated samples is significantly decreased while the

comparable performance is guaranteed. A dramatic reduction of user effort is

also achieved over other state-of-the-art active learning techniques. ii) The

mixture of SPL and AL effectively improves not only the classifier accuracy

compared to existing AL/SPL methods but also the robustness against noisy data.

We evaluate our framework on two challenging datasets, and demonstrate very

promising results. ( this http URL )

Cost-Effective Active Learning for Deep Image Classification

Keze Wang , Dongyu Zhang , Ya Li , Ruimao Zhang , Liang Lin

Comments: Accepted by IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) 2016

Subjects

Computer Vision and Pattern Recognition (cs.CV)

Recent successes in learning-based image classification, however, heavily

rely on the large number of annotated training samples, which may require

considerable human efforts. In this paper, we propose a novel active learning

framework, which is capable of building a competitive classifier with optimal

feature representation via a limited amount of labeled training instances in an

incremental learning manner. Our approach advances the existing active learning

methods in two aspects. First, we incorporate deep convolutional neural

networks into active learning. Through the properly designed framework, the

feature representation and the classifier can be simultaneously updated with

progressively annotated informative samples. Second, we present a

cost-effective sample selection strategy to improve the classification

performance with less manual annotations. Unlike traditional methods focusing

on only the uncertain samples of low prediction confidence, we especially

discover the large amount of high confidence samples from the unlabeled set for

feature learning. Specifically, these high confidence samples are automatically

selected and iteratively assigned pseudo-labels. We thus call our framework

“Cost-Effective Active Learning” (CEAL) standing for the two

advantages.Extensive experiments demonstrate that the proposed CEAL framework

can achieve promising results on two challenging image classification datasets,

i.e., face recognition on CACD database [1] and object categorization on

Caltech-256 [2].

An OpenCL(TM) Deep Learning Accelerator on Arria 10

Utku Aydonat , Shane O'Connell , Davor Capalija , Andrew C. Ling , Gordon R. Chiu

Comments: To be published at FPGA 2017

Subjects

Distributed, Parallel, and Cluster Computing (cs.DC)

; Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV)

Convolutional neural nets (CNNs) have become a practical means to perform

vision tasks, particularly in the area of image classification. FPGAs are well

known to be able to perform convolutions efficiently, however, most recent

efforts to run CNNs on FPGAs have shown limited advantages over other devices

such as GPUs. Previous approaches on FPGAs have often been memory bound due to

the limited external memory bandwidth on the FPGA device. We show a novel

architecture written in OpenCL(TM), which we refer to as a Deep Learning

Accelerator (DLA), that maximizes data reuse and minimizes external memory

bandwidth. Furthermore, we show how we can use the Winograd transform to

significantly boost the performance of the FPGA. As a result, when running our

DLA on Intel’s Arria 10 device we can achieve a performance of 1020 img/s, or

23 img/s/W when running the AlexNet CNN benchmark. This comes to 1382 GFLOPs

and is 10x faster with 8.4x more GFLOPS and 5.8x better efficiency than the

state-of-the-art on FPGAs. Additionally, 23 img/s/W is competitive against the

best publicly known implementation of AlexNet on nVidia’s TitanX GPU.

Artificial Intelligence

On the links between argumentation-based reasoning and nonmonotonic reasoning

Zimi Li , Nir Oren , Simon Parsons Subjects : Artificial Intelligence (cs.AI)

In this paper we investigate the links between instantiated argumentation

systems and the axioms for non-monotonic reasoning described in [9] with the

aim of characterising the nature of argument based reasoning. In doing so, we

consider two possible interpretations of the consequence relation, and describe

which axioms are met by ASPIC+ under each of these interpretations. We then

consider the links between these axioms and the rationality postulates. Our

results indicate that argument based reasoning as characterised by ASPIC+ is –

according to the axioms of [9] – non-cumulative and non-monotonic, and

therefore weaker than the weakest non-monotonic reasoning systems they

considered possible. This weakness underpins ASPIC+’s success in modelling

other reasoning systems, and we conclude by considering the relationship

between ASPIC+ and other weak logical systems.

Fuzzy Clustering Data Given in the Ordinal Scale

Zhengbing Hu , Yevgeniy V. Bodyanskiy , Oleksii K. Tyshchenko , Viktoriia O. Samitova

Journal-ref: I.J. Intelligent Systems and Applications, 2017, Vol. 9, No. 1,

pp. 67-74

Subjects

Artificial Intelligence (cs.AI)

A fuzzy clustering algorithm for multidimensional data is proposed in this

article. The data is described by vectors whose components are linguistic

variables defined in an ordinal scale. The obtained results confirm the

efficiency of the proposed approach.

A Savage-Like Axiomatization for Nonstandard Expected Utility

Grant Molnar

Comments: 5 pages

Subjects

Artificial Intelligence (cs.AI)

Since Leonard Savage’s epoch-making memoir, Subjective Expected Utility

Theory has been the presumptive model for decision-making. Savage provided an

act-based axiomatization of standard expected utility theory. In this article,

we provide a Savage-like axiomatization of nonstandard expected utility theory.

It corresponds to a weakening of Savage’s (6^{th}) axiom.

Deep Probabilistic Programming

Dustin Tran , Matthew D. Hoffman , Rif A. Saurous , Eugene Brevdo , Kevin Murphy , David M. Blei Subjects : Machine Learning (stat.ML) ; Artificial Intelligence (cs.AI); Learning (cs.LG); Programming Languages (cs.PL); Computation (stat.CO)

We propose Edward, a Turing-complete probabilistic programming language.

Edward builds on two compositional representations—random variables and

inference. By treating inference as a first class citizen, on a par with

modeling, we show that probabilistic programming can be as flexible and

computationally efficient as traditional deep learning. For flexibility, Edward

makes it easy to fit the same model using a variety of composable inference

methods, ranging from point estimation, to variational inference, to MCMC. In

addition, Edward can reuse the modeling representation as part of inference,

facilitating the design of rich variational models and generative adversarial

networks. For efficiency, Edward is integrated into TensorFlow, providing

significant speedups over existing probabilistic systems. For example, on a

benchmark logistic regression task, Edward is at least 35x faster than Stan and

PyMC3.

Linear Disentangled Representation Learning for Facial Actions

Xiang Xiang , Trac D. Tran

Comments: Codes available at this https URL and this https URL arXiv admin note: text overlap with arXiv:1410.1606

Subjects

Computer Vision and Pattern Recognition (cs.CV)

; Artificial Intelligence (cs.AI); Learning (cs.LG); Machine Learning (stat.ML)

Limited annotated data available for the recognition of facial expression and

action units embarrasses the training of deep networks, which can learn

disentangled invariant features. However, a linear model with just several

parameters normally is not demanding in terms of training data. In this paper,

we propose an elegant linear model to untangle confounding factors in

challenging realistic multichannel signals such as 2D face videos. The simple

yet powerful model does not rely on huge training data and is natural for

recognizing facial actions without explicitly disentangling the identity. Base

on well-understood intuitive linear models such as Sparse Representation based

Classification (SRC), previous attempts require a prepossessing of explicit

decoupling which is practically inexact. Instead, we exploit the low-rank

property across frames to subtract the underlying neutral faces which are

modeled jointly with sparse representation on the action components with group

sparsity enforced. On the extended Cohn-Kanade dataset (CK+), our one-shot

automatic method on raw face videos performs as competitive as SRC applied on

manually prepared action components and performs even better than SRC in terms

of true positive rate. We apply the model to the even more challenging task of

facial action unit recognition, verified on the MPI Face Video Database

(MPI-VDB) achieving a decent performance. All the programs and data have been

made publicly available.

Information Retrieval

Scalable, Trie-based Approximate Entity Extraction for Real-Time Financial Transaction Screening

Emrah Budur Subjects : Computation and Language (cs.CL) ; Information Retrieval (cs.IR)

Financial institutions have to screen their transactions to ensure that they

are not affiliated with terrorism entities. Developing appropriate solutions to

detect such affiliations precisely while avoiding any kind of interruption to

large amount of legitimate transactions is essential. In this paper, we present

building blocks of a scalable solution that may help financial institutions to

build their own software to extract terrorism entities out of both structured

and unstructured financial messages in real time and with approximate

similarity matching approach.

Computation and Language