Fusion of Heterogeneous Data in Convolutional Networks for Urban Semantic Labeling (Invited Paper)
Comments: Joint Urban Remote Sensing Event (JURSE) 2017, Mar 2017, Dubai, United Arab Emirates
Subjects:
Neural and Evolutionary Computing (cs.NE)
; Computer Vision and Pattern Recognition (cs.CV)
In this work, we present a novel module to perform fusion of heterogeneous
data using fully convolutional networks for semantic labeling. We introduce
residual correction as a way to learn how to fuse predictions coming out of a
dual-stream architecture. In particular, we perform fusion of DSM and IRRG optical
data on the ISPRS Vaihingen dataset over an urban area and obtain new
state-of-the-art results.
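As a rough illustration of the idea (not the authors' exact architecture), a residual-correction fusion module can be sketched in Python/PyTorch as a small convolutional network that learns a correction term added to the naive average of the two streams' score maps; the layer sizes below are hypothetical:

import torch
import torch.nn as nn

class ResidualCorrectionFusion(nn.Module):
    """Fuse per-class score maps from two FCN streams (e.g., IRRG and
    DSM) by learning a residual correction on top of their average."""
    def __init__(self, num_classes: int):
        super().__init__()
        self.correction = nn.Sequential(
            nn.Conv2d(2 * num_classes, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, num_classes, kernel_size=3, padding=1),
        )

    def forward(self, scores_irrg, scores_dsm):
        avg = 0.5 * (scores_irrg + scores_dsm)   # naive average fusion
        corr = self.correction(torch.cat([scores_irrg, scores_dsm], dim=1))
        return avg + corr                        # residual-corrected fusion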
Comments: Link to the IEEE published version: this http URL
Subjects:
Neural and Evolutionary Computing (cs.NE)
The paper describes an approach to implementing genetic programming, which
uses the LLVM library to just-in-time compile/interpret the evolved abstract
syntax trees. The solution is described in some detail, including a parser
(based on FlexC++ and BisonC++) that can construct the trees from a simple toy
language with C-like syntax. The approach is compared with a previous
implementation (based on direct execution of trees using polymorphic functors)
in terms of execution speed.
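A minimal Python sketch of the baseline style being compared against, direct execution of an evolved expression tree through nested callables (the functor approach), may help fix the idea; the toy tree below is illustrative:

import operator

# Each node is a callable taking a variable environment; composing
# callables mimics direct tree execution via polymorphic functors.
def const(c):        return lambda env: c
def var(name):       return lambda env: env[name]
def binop(op, l, r): return lambda env: op(l(env), r(env))

# Evolved tree for the expression x * (y + 2)
tree = binop(operator.mul, var("x"), binop(operator.add, var("y"), const(2)))
print(tree({"x": 3, "y": 4}))  # 18

# The JIT approach instead emits native code for the tree once (via
# LLVM) and calls the compiled function on every fitness evaluation.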
Comments: Accepted for publication, ICASSP 2017
Subjects:
Computer Vision and Pattern Recognition (cs.CV)
; Computation and Language (cs.CL)
Traditional visual speech recognition systems consist of two stages, feature
extraction and classification. Recently, several deep learning approaches have
been presented which automatically extract features from the mouth images and
aim to replace the feature extraction stage. However, research on joint
learning of features and classification is very limited. In this work, we
present an end-to-end visual speech recognition system based on Long
Short-Term Memory (LSTM) networks. To the best of our knowledge, this is the first model
which simultaneously learns to extract features directly from the pixels and
perform classification and also achieves state-of-the-art performance in visual
speech classification. The model consists of two streams which extract features
directly from the mouth and difference images, respectively. The temporal
dynamics in each stream are modelled by an LSTM and the fusion of the two
streams takes place via a Bidirectional LSTM (BLSTM). An absolute improvement
of 9.7% over the baseline is reported on the OuluVS2 database, and 1.5% on the
CUAVE database when compared with other methods which use a similar visual
front-end.
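A schematic of the described architecture, two per-stream encoders and LSTMs fused by a BLSTM, can be sketched as follows; the encoder form and all sizes are our own simplification, not the paper's exact network:

import torch
import torch.nn as nn

class TwoStreamLipReader(nn.Module):
    """Sketch: encoder + LSTM per stream, fused by a bidirectional LSTM;
    dimensions are illustrative."""
    def __init__(self, pix_dim, hid=256, num_classes=10):
        super().__init__()
        self.enc_raw = nn.Linear(pix_dim, hid)    # mouth-image stream
        self.enc_diff = nn.Linear(pix_dim, hid)   # difference-image stream
        self.lstm_raw = nn.LSTM(hid, hid, batch_first=True)
        self.lstm_diff = nn.LSTM(hid, hid, batch_first=True)
        self.fusion = nn.LSTM(2 * hid, hid, batch_first=True, bidirectional=True)
        self.cls = nn.Linear(2 * hid, num_classes)

    def forward(self, frames, diffs):             # each: (B, T, pix_dim)
        h1, _ = self.lstm_raw(torch.relu(self.enc_raw(frames)))
        h2, _ = self.lstm_diff(torch.relu(self.enc_diff(diffs)))
        fused, _ = self.fusion(torch.cat([h1, h2], dim=-1))
        return self.cls(fused[:, -1])             # classify from last step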
Osman Tursun , Cemal Aker , Sinan Kalkan Subjects : Computer Vision and Pattern Recognition (cs.CV)
Trademark retrieval (TR) has become an important yet challenging problem due
to an ever-increasing trend in trademark applications and infringement
incidents. There have been many promising attempts at the TR problem which,
however, proved impractical since they were evaluated on limited and mostly
trivial datasets. In this paper, we provide a large-scale dataset with
benchmark queries with which different TR approaches can be evaluated
systematically. Moreover, we provide a baseline on this benchmark using the
widely-used methods applied to TR in the literature. Furthermore, we identify
and correct two important issues in TR approaches that were not addressed
before: reversal of contrast and the presence of irrelevant text in trademarks
both severely affect TR methods. Lastly, we applied deep learning, namely
several popular Convolutional Neural Network models, to the TR problem. To the
best of the authors' knowledge, this is the first attempt to do so.
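The contrast-reversal issue admits a very simple normalization before feature extraction; the heuristic below (invert when the trademark is mostly dark) is a plausible sketch only, not necessarily the paper's exact correction:

import numpy as np

def normalize_contrast(img: np.ndarray) -> np.ndarray:
    """Map a grayscale trademark to a canonical light-background polarity
    so that a logo and its contrast-reversed copy produce similar
    descriptors."""
    img = img.astype(np.float32) / 255.0
    if img.mean() < 0.5:   # mostly dark: assume reversed polarity
        img = 1.0 - img
    return img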
Filippo Basso , Emanuele Menegatti , Alberto Pretto Subjects : Computer Vision and Pattern Recognition (cs.CV) ; Robotics (cs.RO)
Color-depth cameras (RGB-D cameras) have become the primary sensors in most
robotics systems, from service robotics to industrial robotics applications.
Typical consumer-grade RGB-D cameras are provided with a coarse intrinsic and
extrinsic calibration that generally does not meet the accuracy requirements
needed by many robotics applications (e.g., high accuracy 3D environment
reconstruction and mapping, high precision object recognition and localization,
…). In this paper, we propose a human-friendly, reliable and accurate
calibration framework that enables easy estimation of both the intrinsic and
extrinsic parameters of a general color-depth sensor pair. Our approach is
based on a novel two-component measurement error model that unifies the
error sources of different RGB-D pairs based on different technologies, such as
structured-light 3D cameras and time-of-flight cameras. The proposed correction
model is implemented using two different parametric undistortion maps that
provide the calibrated readings by means of linear combinations of control
functions. Our method provides some important advantages compared to other
state-of-the-art systems: it is general (i.e., well suited for different types
of sensors), it is based on an easy and stable calibration protocol, it
provides a greater calibration accuracy, and it has been implemented within the
ROS robotics framework. We report detailed and comprehensive experimental
validations and performance comparisons to support our statements.
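As a generic sketch of the correction idea (the paper's control functions and fitting procedure are more specific), a parametric undistortion map can be written as a per-pixel correction that is a linear combination of fixed basis functions, with the weights estimated during calibration:

import numpy as np

def undistort_depth(depth, weights, basis):
    """corrected(u, v) = depth(u, v) + sum_k weights[k] * basis[k](u, v, depth).
    The basis functions play the role of control functions; the weights
    would be fitted (e.g., by least squares) during calibration."""
    u, v = np.meshgrid(np.arange(depth.shape[1]), np.arange(depth.shape[0]))
    correction = np.zeros(depth.shape, dtype=np.float64)
    for w_k, b_k in zip(weights, basis):
        correction += w_k * b_k(u, v, depth)
    return depth + correction

# Illustrative basis: constant offset, radial term, depth-quadratic term.
basis = [
    lambda u, v, z: np.ones_like(z, dtype=np.float64),
    lambda u, v, z: (u - u.mean()) ** 2 + (v - v.mean()) ** 2,
    lambda u, v, z: z.astype(np.float64) ** 2,
]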
Comments: 12 pages, 17 figures
Subjects:
Computer Vision and Pattern Recognition (cs.CV)
This paper addresses the automatic generation of a typographic font from a
subset of characters. Specifically, we use a subset of a typographic font to
extrapolate additional characters. Consequently, we obtain a complete font
containing a number of characters sufficient for daily use. The automated
generation of Japanese fonts is in high demand because a Japanese font requires
over 1,000 characters. Unfortunately, professional typographers create most
fonts, resulting in significant financial and time investments for font
generation. The proposed method can be a great aid for font creation because
designers do not need to create the majority of the characters for a new font.
The proposed method uses strokes from given samples for font generation. The
strokes, from which we construct characters, are extracted by exploiting a
character skeleton dataset. This study makes three main contributions: a novel
method of extracting strokes from characters, which is applicable to both
standard fonts and their variations; a fully automated approach for
constructing characters; and a selection method for sample characters. We
demonstrate our proposed method by generating 2,965 characters in 47 fonts.
Objective and subjective evaluations verify that the generated characters are
similar to handmade characters.
Comments: 9 pages including references, 9 figures, 2 tables
Subjects:
Computer Vision and Pattern Recognition (cs.CV)
We present a novel feature matching algorithm that systematically utilizes
the geometric properties of features such as position, scale, and orientation,
in addition to the conventional descriptor vectors. In challenging scenes with
the presence of repetitive patterns or with a large viewpoint change, it is
hard to find the correct correspondences using feature descriptors only, since
the descriptor distances of the correct matches may not be the least among the
candidates due to appearance changes. Assuming that the layout of the nearby
features does not change much, we propose the bidirectional transfer measure
to gauge the geometric consistency of a pair of feature correspondences. The
feature matching problem is formulated as a Markov random field (MRF) which
uses descriptor distances and relative geometric similarities together. The
unmatched features are explicitly modeled in the MRF to minimize its negative
impact. For speed and stability, instead of solving the MRF over all features
at once, we start with a small set of confident feature matches and then
progressively search the candidates among nearby features and expand the MRF
with them. Experimental comparisons show that the proposed algorithm finds
better feature correspondences, i.e., more matches with a higher inlier ratio,
in many challenging scenes at much lower computational cost than the
state-of-the-art algorithms.
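The geometric intuition behind the transfer measure can be sketched as follows: the similarity transform implied by one correspondence (relative position, scale and orientation) is used to transfer a nearby feature into the other image, and the transfer error is evaluated in both directions. The exact measure in the paper may differ; this is our own simplified version:

import numpy as np

def transfer_error(fa, fb, ga, gb):
    """Each feature is an array (x, y, scale, angle).  Apply the
    similarity transform implied by the match fa->fb to ga's position
    and compare with gb's position."""
    s = fb[2] / fa[2]                  # relative scale
    t = fb[3] - fa[3]                  # relative rotation
    c, si = np.cos(t), np.sin(t)
    R = np.array([[c, -si], [si, c]])
    pred = fb[:2] + s * (R @ (ga[:2] - fa[:2]))
    return np.linalg.norm(pred - gb[:2])

def bidirectional_transfer(fa, fb, ga, gb):
    # Symmetrize by transferring in both directions.
    return 0.5 * (transfer_error(fa, fb, ga, gb) +
                  transfer_error(fb, fa, gb, ga))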
Comments: submitted to ICME
Subjects:
Computer Vision and Pattern Recognition (cs.CV)
Image super-resolution (SR) methods essentially lead to a loss of some
high-frequency (HF) information when predicting high-resolution (HR)
images from low-resolution (LR) images without using external references. To
address that, we additionally utilize online retrieved data to facilitate image
SR in a unified deep framework. A novel dual high-frequency recovery network
(DHN) is proposed to predict an HR image with three parts: an LR image, an
internal inferred HF (IHF) map (HF missing part inferred solely from the LR
image) and an external extracted HF (EHF) map. In particular, we infer the HF
information based on both the LR image and similar HR references retrieved
online. For the EHF map, we align the references with an affine
transformation, and then part of the HF signals in the aligned references are
extracted by the proposed DHN to compensate for the HF loss. Extensive
experimental results demonstrate that our DHN achieves notably better
performance than state-of-the-art SR methods.
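In equation form (our notation, not the paper's), the decomposition reads (\hat{I}_{HR} = U(I_{LR}) + M_{IHF} + M_{EHF}), where (U) denotes upsampling of the LR input, (M_{IHF}) is the HF map inferred internally from the LR image alone, and (M_{EHF}) is the HF map extracted from the aligned online references.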
Holistic Interstitial Lung Disease Detection using Deep Convolutional Neural Networks: Multi-label Learning and Unordered Pooling
Mingchen Gao , Ziyue Xu , Le Lu , Adam P. Harrison , Ronald M. Summers , Daniel J. Mollura Subjects : Computer Vision and Pattern Recognition (cs.CV)
Accurately predicting and detecting interstitial lung disease (ILD) patterns
given any computed tomography (CT) slice without any pre-processing
prerequisites, such as manually delineated regions of interest (ROIs), is a
clinically desirable, yet challenging goal. The majority of existing work
relies on manually-provided ILD ROIs to extract sampled 2D image patches from
CT slices and, from there, performs patch-based ILD categorization. Acquiring
manual ROIs is labor intensive and serves as a bottleneck towards
fully-automated CT imaging ILD screening over large-scale populations.
Furthermore, despite the considerably high frequency of more than one ILD
pattern appearing on a single CT slice, previous works are designed to detect
only one ILD pattern per slice or patch.
To tackle these two critical challenges, we present multi-label deep
convolutional neural networks (CNNs) for detecting ILDs from holistic CT slices
(instead of ROIs or sub-images). Conventional single-labeled CNN models can be
augmented to cope with the possible presence of multiple ILD pattern labels,
via 1) continuous-valued deep regression based robust norm loss functions or 2)
a categorical objective as the sum of element-wise binary logistic losses. Our
methods are evaluated and validated using a publicly available database of 658
patient CT scans under five-fold cross-validation, achieving promising
performance on detecting four major ILD patterns: Ground Glass, Reticular,
Honeycomb, and Emphysema. We also investigate the effectiveness of a CNN
activation-based deep-feature encoding scheme using Fisher vector encoding,
which treats ILD detection as spatially-unordered deep texture classification.
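The second label-handling option, a categorical objective formed as the sum of element-wise binary logistic losses, is the standard multi-label cross-entropy; a minimal PyTorch illustration with hypothetical batch and label sizes:

import torch
import torch.nn as nn

num_patterns = 4   # e.g., Ground Glass, Reticular, Honeycomb, Emphysema
logits = torch.randn(8, num_patterns, requires_grad=True)   # CNN outputs
targets = torch.randint(0, 2, (8, num_patterns)).float()    # multi-hot labels

# Sum of element-wise binary logistic losses over the label vector.
loss = nn.BCEWithLogitsLoss(reduction="sum")(logits, targets)
loss.backward()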
Mohammad Reza Mahmoodi Subjects : Computer Vision and Pattern Recognition (cs.CV)
In this paper, an efficient skin detection system is proposed. The algorithm
is based on a very fast pre-processing step utilizing the concept of ternary
conversion to identify candidate windows, followed by a novel local two-stage
diffusion method which achieves an F-score of 0.5978 on the SDD dataset. The
pre-processing step boosts the speed of the system by eliminating 82% of an
image on average while keeping the true positive rate above 98%. In addition,
a novel segmentation algorithm is designed to process the candidate windows,
and it is quantitatively and qualitatively shown to be very efficient in terms
of accuracy. The algorithm has been implemented on an FPGA to obtain real-time
processing speed. The system is fully pipelined, and the inherent parallel
structure of the algorithm is fully exploited to maximize performance. The system is
implemented on a Spartan-6 LXT45 Xilinx FPGA and it is capable of processing 98
frames of 640*480 24-bit color images per second.
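The ternary pre-processing idea, roughly: map each pixel to surely-skin / uncertain / surely-non-skin with conservative color rules, then pass on only the windows that contain enough candidate pixels. The thresholds and window logic below are illustrative, not the paper's:

import numpy as np

def ternary_map(rgb):
    """0 = surely non-skin, 1 = uncertain, 2 = surely skin
    (illustrative RGB rules)."""
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    surely_skin = (r > 95) & (g > 40) & (b > 20) & (r > g) & (r > b)
    surely_not = (r < 40) | (g > r + 30)
    t = np.ones(rgb.shape[:2], dtype=np.uint8)
    t[surely_skin] = 2
    t[surely_not] = 0
    return t

def candidate_windows(t, win=32, min_frac=0.05):
    """Keep only windows with enough surely-skin pixels; skipping the
    rest of the image is where the speed-up comes from."""
    for y in range(0, t.shape[0] - win + 1, win):
        for x in range(0, t.shape[1] - win + 1, win):
            if (t[y:y + win, x:x + win] == 2).mean() >= min_frac:
                yield (y, x)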
Mohammad Reza Mahmoodi Subjects : Computer Vision and Pattern Recognition (cs.CV)
Skin Segmentation is widely used in biometric applications such as face
detection, face recognition, face tracking, and hand gesture recognition.
However, several challenges, such as nonlinear illumination, equipment effects,
personal interference, and ethnicity variations, are involved in the detection
process, resulting in the inefficiency of color-based methods. Even though
many ideas have already been proposed, the problem has not been satisfactorily
solved yet. This paper introduces a technique that addresses some limitations
of the previous works. The proposed algorithm consists of three main steps
including initial seed generation of skin map, Otsu segmentation in color
images, and finally a two-stage diffusion. The initial seed of skin pixels is
provided based on the idea of a ternary image, as certain pixels in images are
associated with human skin with very high probability. The Otsu segmentation
is performed on several color channels in order to identify homogeneous
regions. The result, together with the edge map of the image, is used in two
consecutive diffusion steps to annex initially unidentified skin pixels to the
seed. Both quantitative and qualitative results demonstrate the effectiveness
of the proposed system in comparison with the
state-of-the-art works.
Vinh Nguyen , Amit Sheth Subjects : Artificial Intelligence (cs.AI) ; Databases (cs.DB)
Logical inference, an integral feature of the Semantic Web, is the process of
deriving new triples by applying entailment rules on knowledge bases. The
entailment rules are determined by the model-theoretic semantics. Incorporating
context of an RDF triple (e.g., provenance, time, and location) into the
inferencing process requires the formal semantics to be capable of describing
the context of RDF triples also in the form of triples, or in other words, RDF
contextual triples about triples. The formal semantics should also provide the
rules that could entail new contextual triples about triples. In this paper, we
propose the first inferencing mechanism that allows context of RDF triples,
represented in the form of RDF triples about triples, to be first-class
citizens in the model-theoretic semantics and in the logical rules. Our
inference mechanism is well-formalized with all new concepts being captured in
the model-theoretic semantics. This formal semantics also allows us to derive a
new set of entailment rules that could entail new contextual triples about
triples. To demonstrate the feasibility and the scalability of the proposed
mechanism, we implement a new tool in which we transform the existing knowledge
bases to our representation of RDF triples about triples and provide the option
for this tool to compute the inferred triples for the proposed rules. We
evaluate the computation of the proposed rules on a large scale using various
real-world knowledge bases such as Bio2RDF NCBI Genes and DBpedia. The results
show that the computation of the inferred triples can be highly scalable. On
average, one billion inferred triples add 5-6 minutes to the overall
transformation process. NCBI Genes, with 20 billion triples in total, took only
232 minutes for the transformation of 12 billion triples and added 42 minutes
for inferring 8 billion triples to the overall process.
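A toy illustration of the representational idea (not the paper's formal semantics): a triple itself serves as the subject of contextual triples, and an entailment rule produces both a new triple and new contextual triples about it, here propagating provenance through a subClassOf-transitivity step:

kb = {
    ("ex:A", "rdfs:subClassOf", "ex:B"),
    ("ex:B", "rdfs:subClassOf", "ex:C"),
}
ctx = {t: {("prov:source", "ex:Ontology1")} for t in kb}  # triples about triples

def entail_transitivity(kb, ctx):
    new = set()
    for (s, p, o) in kb:
        for (s2, p2, o2) in kb:
            if p == p2 == "rdfs:subClassOf" and o == s2:
                t = (s, "rdfs:subClassOf", o2)
                if t not in kb:
                    new.add(t)
                    # Contextual triples entailed about the new triple:
                    ctx[t] = ctx[(s, p, o)] & ctx[(s2, p2, o2)]
    return new

kb |= entail_transitivity(kb, ctx)   # adds (ex:A, rdfs:subClassOf, ex:C)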
Comments: 23 pages, 9 figures
Subjects:
Information Retrieval (cs.IR)
Image retrieval is a complex task that differs according to the context and
the user requirements in any specific field, for example in a medical
environment. Search by text is often not possible or optimal and retrieval by
the visual content does not always succeed in modelling high-level concepts
that a user is looking for. Modern image retrieval techniques consist of
multiple steps and aim to retrieve information from large-scale datasets,
based not only on global image appearance but also on local features and, if
possible, on a connection between visual features and text or semantics. This paper presents
the Parallel Distributed Image Search Engine (ParaDISE), an image retrieval
system that combines visual search with text-based retrieval and that is
available as open source and free of charge. The main design concepts of
ParaDISE are flexibility, expandability, scalability and interoperability.
These concepts make the system usable both in real-world
applications and as an image retrieval research platform. Apart from the
architecture and the implementation of the system, two use cases are described,
an application of ParaDISE in retrieval of images from the medical literature
and a visual feature evaluation for medical image retrieval. Future steps
include the creation of an open source community that will contribute and
expand this platform based on the existing parts.
Saeedeh Shekarpour , Valerie Shalin , Krishnaprasad Thirunarayan , Amit P. Sheth Subjects : Computation and Language (cs.CL)
While the general analysis of named entities has received substantial
research attention, the analysis of relations over named entities has not. In
fact, a review of the literature on unstructured as well as structured data
revealed a deficiency in research on the abstract conceptualization required to
organize relations. We believe that such an abstract conceptualization can
benefit various communities and applications such as natural language
processing, information extraction, machine learning and ontology engineering.
In this paper, we present CEVO (i.e., a Comprehensive EVent Ontology) built on
Levin’s conceptual hierarchy of English verbs, which categorizes verbs by
shared meaning and syntactic behavior. We present the fundamental concepts and
requirements for this ontology. Furthermore, we present three use cases for
demonstrating the benefits of this ontology on annotation tasks: 1) annotating
relations in plain text, 2) annotating ontological properties and 3) linking
textual relations to ontological properties.
Comments: The SIGNLL Conference on Computational Natural Language Learning (CoNLL 2016)
Subjects:
Computation and Language (cs.CL)
Sentiments expressed in user-generated short text and sentences are nuanced
by subtleties at lexical, syntactic, semantic and pragmatic levels. To address
this, we propose to augment traditional features used for sentiment analysis
and sarcasm detection, with cognitive features derived from the eye-movement
patterns of readers. Statistical classification using our enhanced feature set
improves the performance (F-score) of polarity detection by a maximum of 3.7%
and 9.3% on two datasets, over the systems that use only traditional features.
We perform feature significance analysis, and experiment on a held-out dataset,
showing that cognitive features indeed empower sentiment analyzers to handle
complex constructs.
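Operationally, the augmentation amounts to concatenating gaze-derived features with the usual textual features before classification; a sketch with hypothetical feature arrays (the paper's actual features are derived from recorded eye-movement patterns):

import numpy as np
from sklearn.linear_model import LogisticRegression

X_text = np.random.rand(500, 300)   # lexical/syntactic/semantic features
X_gaze = np.random.rand(500, 20)    # e.g., fixation counts, saccade lengths
y = np.random.randint(0, 2, 500)    # polarity labels

X = np.hstack([X_text, X_gaze])     # the enhanced feature set
clf = LogisticRegression(max_iter=1000).fit(X, y)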
Comments: The 54th Annual Meeting of The Association for Computational Linguistics (ACL 2016)
Subjects:
Computation and Language (cs.CL)
In this paper, we propose a novel mechanism for enriching the feature vector,
for the task of sarcasm detection, with cognitive features extracted from
eye-movement patterns of human readers. Sarcasm detection has been a
challenging research problem, and its importance for NLP applications such as
review summarization, dialog systems and sentiment analysis is well recognized.
Sarcasm can often be traced to incongruity that becomes apparent as the full
sentence unfolds. This presence of incongruity, implicit or explicit, affects
the way readers’ eyes move through the text. We observe the difference in the
behaviour of the eye while reading sarcastic and non-sarcastic sentences.
Motivated by this observation, we augment traditional linguistic and stylistic
features for sarcasm detection with the cognitive features obtained from
readers’ eye-movement data. We perform statistical classification using the
enhanced feature set so obtained. The augmented cognitive features improve
sarcasm detection by 3.7% (in terms of F-score), over the performance of the
best reported system.
Git Blame Who?: Stylistic Authorship Attribution of Small, Incomplete Source Code Fragments
Edwin Dauber , Aylin Caliskan-Islam , Richard Harang , Rachel Greenstadt Subjects : Learning (cs.LG) ; Cryptography and Security (cs.CR)
Program authorship attribution has implications for the privacy of
programmers who wish to contribute code anonymously. While previous work has
shown that complete files that are individually authored can be attributed, we
show here for the first time that accounts belonging to open source
contributors containing short, incomplete, and typically uncompilable fragments
can also be effectively attributed.
We propose a technique for authorship attribution of contributor accounts
containing small source code samples, such as those that can be obtained from
version control systems or other direct comparisons of sequential versions. We
show that while application of previous methods to individual small source code
samples yields an accuracy of about 73% for 106 programmers as a baseline, by
ensembling and averaging the classification probabilities of a sufficiently
large set of samples belonging to the same author we achieve 99% accuracy for
assigning the set of samples to the correct author. Through these results, we
demonstrate that attribution is an important threat to privacy for programmers
even in real-world collaborative environments such as GitHub. Additionally, we
propose the use of calibration curves to identify samples by unknown and
previously unencountered authors in the open world setting. We show that we can
also use these calibration curves in the case that we do not have linking
information and thus are forced to classify individual samples directly. This
is because the calibration curves allow us to identify which samples are more
likely to have been correctly attributed. Using such a curve can help an
analyst choose a cut-off point which will prevent most misclassifications, at
the cost of causing the rejection of some of the more dubious correct
attributions.
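The core aggregation step, averaging per-fragment classifier probabilities over all fragments linked to one account, is easy to sketch (the classifier and array shapes here are hypothetical):

import numpy as np

def attribute_account(clf, fragments):
    """fragments: feature matrix of all code samples linked to one
    account.  Average per-fragment class probabilities, then pick the
    most probable author."""
    proba = clf.predict_proba(fragments)      # (n_fragments, n_authors)
    mean_proba = proba.mean(axis=0)
    return int(mean_proba.argmax()), float(mean_proba.max())

# The returned maximum averaged probability is the quantity that can be
# checked against a calibration curve to reject unknown authors in the
# open-world setting.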
Comments: 5 pages, 4 figures
Subjects:
Social and Information Networks (cs.SI)
; Learning (cs.LG); Physics and Society (physics.soc-ph); Machine Learning (stat.ML)
We study the inference of a model of temporal networks in which both
communities and links keep memory of the previous network state. By considering
maximum likelihood inference from single snapshot observation of the network,
we show that link persistence decreases the detectability threshold, preventing
the inference of communities even when they are in principle strong enough,
while community persistence tends to restore this possibility. Then we show
that the inferred communities share a maximum overlap with those of a specific
previous instant of time, corresponding to the maximum of a time-lagged
assortativity parameter, and therefore they can be closer to those of the past
than of the present. We analytically characterize these transitions as a
function of the memory parameters of the model.
Empirical Study of Drone Sound Detection in Real-Life Environment with Deep Neural Networks
Comments: IEEE 5 Pages, Submitted
Subjects:
Sound (cs.SD)
; Learning (cs.LG)
This work aims to investigate the use of deep neural networks to detect
commercial hobby drones in real-life environments by analyzing their sound
data. The purpose of this work is to contribute to a system for detecting drones
used for malicious purposes, such as for terrorism. Specifically, we present a
method capable of detecting the presence of commercial hobby drones as a binary
classification problem based on sound event detection. We recorded the sound
produced by a few popular commercial hobby drones, and then augmented this data
with diverse environmental sound data to remedy the scarcity of drone sound
data in diverse environments. We investigated the effectiveness of
state-of-the-art event sound classification methods, i.e., a Gaussian Mixture
Model (GMM), Convolutional Neural Network (CNN), and Recurrent Neural Network
(RNN), for drone sound detection. Our empirical results, which were obtained
with a testing dataset collected on an urban street, confirmed the
effectiveness of these models for operating in a real environment. In summary,
our RNN models showed the best detection performance with an F-Score of 0.8009
with 240 ms of input audio with a short processing time, indicating their
applicability to real-time detection systems.
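A minimal sketch of such a detector, an RNN over short-frame spectral features with a binary output, with illustrative feature and hidden sizes (not the paper's exact configuration):

import torch
import torch.nn as nn

class DroneSoundRNN(nn.Module):
    """Binary sound-event detector over a short window of audio frames."""
    def __init__(self, n_mels=40, hid=128):
        super().__init__()
        self.rnn = nn.GRU(n_mels, hid, batch_first=True)
        self.out = nn.Linear(hid, 1)

    def forward(self, x):      # x: (B, T, n_mels), e.g., ~240 ms of frames
        h, _ = self.rnn(x)
        return torch.sigmoid(self.out(h[:, -1]))   # P(drone present)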
A Novel Variable Selection Method based on Frequent Pattern Tree for Real-time Traffic Accident Risk Prediction
Comments: OPT-i 2014 – 1st International Conference on Engineering and Applied Sciences Optimization, Proceedings
Subjects:
Applications (stat.AP)
; Learning (cs.LG)
Traffic accident data are usually noisy, heterogeneous, and contain missing
values. How to select the most important variables to improve real-time
traffic accident risk prediction has become a concern of many recent studies.
This paper proposes a novel variable selection method based on the Frequent
Pattern tree (FP tree) algorithm. First, all the frequent patterns in the
traffic accident dataset are discovered. Then, for each frequent pattern, a new
criterion that we propose, called the Relative Object Purity Ratio (ROPR), is
calculated. The ROPR is added to the importance scores of the variables that
differentiate one frequent pattern from the others. To test the proposed
method, a dataset was compiled from the traffic accidents records detected by
only one detector on interstate highway I-64 in Virginia in 2005. This data set
was then linked to other variables such as real-time traffic information and
weather conditions. Both the proposed method based on the FP tree algorithm
and the widely utilized random forest method were then used to identify
the important variables for the Virginia data set. The results indicate that
there are some differences between the variables deemed important by the FP
tree and those selected by the random forest method. Following this, two
baseline models (i.e. a nearest neighbor (k-NN) method and a Bayesian network)
were developed to predict accident risk based on the variables identified by
both the FP tree method and the random forest method. The results show that the
models based on the variable selection using the FP tree performed better than
those based on the random forest method for several versions of the k-NN and
Bayesian network models. The best results were derived from a Bayesian network
model using variables from the FP tree. That model could predict 61.11% of
accidents accurately, while having a false alarm rate of 38.16%.
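The frequent-pattern-mining step can be reproduced with an off-the-shelf FP-growth implementation; the ROPR scoring itself is defined in the paper, so only the mining step is sketched here, with hypothetical discretized columns:

import pandas as pd
from mlxtend.frequent_patterns import fpgrowth

# One-hot encoded accident records (columns are discretized variables).
df = pd.DataFrame({
    "rain":        [1, 0, 1, 1, 0],
    "high_volume": [1, 1, 1, 0, 0],
    "night":       [0, 1, 1, 1, 0],
}).astype(bool)

patterns = fpgrowth(df, min_support=0.4, use_colnames=True)
print(patterns)   # each row: support + the variable set (frequent pattern)
# The paper then scores each pattern's variables via the proposed ROPR.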
Yong Cai , Yunlong Wang , Dong Dai Subjects : Machine Learning (stat.ML) ; Learning (cs.LG)
In rare disease physician targeting, a major challenge is how to identify
physicians who are treating diagnosed or underdiagnosed rare disease patients.
Rare diseases have extremely low incidence rates. For a given rare disease,
only a small number of patients are affected and only a fraction of physicians are
involved. The existing targeting methodologies, such as segmentation and
profiling, are developed under mass market assumption. They are not suitable
for rare disease market where the target classes are extremely imbalanced. The
authors propose a graphical model approach to predict targets by jointly
modeling physician and patient features from different data spaces and
utilizing the extra relational information. Through an empirical example with
medical claim and prescription data, the proposed approach demonstrates better
accuracy in finding target physicians. The graph representation also provides
visual interpretability of relationship among physicians and patients. The
model can be extended to incorporate more complex dependency structures. This
article contributes to the literature of exploring the benefit of utilizing
relational dependencies among entities in healthcare industry.
Comments: Appeared in the Proceedings of the 29th Advances in Neural Information Processing Systems (NIPS 2016)
Subjects:
Machine Learning (stat.ML)
; Learning (cs.LG)
We introduce a new dynamical system for sequentially observed multivariate
count data. This model is based on the gamma-Poisson construction, a natural
choice for count data, and relies on a novel Bayesian nonparametric prior that
ties and shrinks the model parameters, thus avoiding overfitting. We present an
efficient MCMC inference algorithm that advances recent work on augmentation
schemes for inference in negative binomial models. Finally, we demonstrate the
model’s inductive bias using a variety of real-world data sets, showing that it
exhibits superior predictive performance over other models and infers highly
interpretable latent structure.
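In generic form (our notation, not the paper's exact specification), a gamma-Poisson dynamical system chains the latent states through the gamma shape parameter: counts are drawn as (y_t \sim \mathrm{Poisson}(\Phi \theta_t)) with states (\theta_t \sim \mathrm{Gamma}(\tau \Pi \theta_{t-1}, \tau)), so that (\mathbb{E}[\theta_t \mid \theta_{t-1}] = \Pi \theta_{t-1}): the transition matrix (\Pi) steers the expected state while the Poisson likelihood natively handles count observations.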
Comments: A short version of this work will be submitted to the 2017 IEEE International Symposium on Information Theory (ISIT)
Subjects:
Information Theory (cs.IT)
We consider a system, containing a library of multiple files and a general
memoryless communication network through which a server is connected to
multiple users, each equipped with a local isolated cache of a certain size that
can be used to store part of the library. Each user will ask for one of the
files in the library, which needs to be delivered by the server through the
intermediate communication network. The objective is to design the cache
placement (without prior knowledge of users’ future requests) and the delivery
phase in order to minimize the (normalized) delivery delay. We assume that the
delivery phase consists of two steps: (1) generation of a set of multicast
messages at the server, one for each subset of users, and (2) delivery of the
multicast messages to the users.
In this setting, we show that there exists a universal scheme for cache
placement and multicast message generation, which is independent of the
underlying communication network between the server and the users, and achieves
the optimal delivery delay to within a constant factor for all memoryless
networks. We prove this result, even though the capacity region of the
underlying communication network is not known, even approximately. This result
demonstrates that in the aforementioned setting, a separation between caching
and multicast message generation on one hand, and delivering the multicast
messages to the users on the other hand is approximately optimal. This result
has the important practical implication that prefetching can be done
independently of the network structure in the upcoming delivery phase.
Mutual Information and Optimality of Approximate Message-Passing in Random Linear Estimation
Comments: Submitted to the IEEE Transactions in Information Theory
Subjects:
Information Theory (cs.IT)
; Disordered Systems and Neural Networks (cond-mat.dis-nn); Mathematical Physics (math-ph)
We consider the estimation of a signal from the knowledge of its noisy linear
random Gaussian projections. A few examples where this problem is relevant are
compressed sensing, sparse superposition codes, and code division multiple
access. There have been a number of works considering the mutual information for
this problem using the replica method from statistical physics. Here we put
these considerations on a firm rigorous basis. First, we show, using a
Guerra-Toninelli type interpolation, that the replica formula yields an upper
bound to the exact mutual information. Secondly, for many relevant practical
cases, we present a converse lower bound via a method that uses spatial
coupling, state evolution analysis and the I-MMSE theorem. This yields a single
letter formula for the mutual information and the minimal-mean-square error for
random Gaussian linear estimation of all discrete bounded signals. In addition,
we prove that the low complexity approximate message-passing algorithm is
optimal outside of the so-called hard phase, in the sense that it
asymptotically reaches the minimal-mean-square error. In this work spatial
coupling is used primarily as a proof technique. However our results also prove
two important features of spatially coupled noisy linear random Gaussian
estimation. First, there is no algorithmically hard phase. This means that for
such systems approximate message-passing always reaches the minimal-mean-square
error. Secondly, in a proper limit the mutual information associated to such
systems is the same as the one of uncoupled linear random Gaussian estimation.
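For reference, the I-MMSE theorem invoked above is the standard scalar relation of Guo, Shamai and Verdú: for (Y = \sqrt{\mathrm{snr}}\, X + Z) with (Z) standard Gaussian and independent of (X), (\frac{d}{d\mathrm{snr}} I(X;Y) = \frac{1}{2} \mathrm{mmse}(\mathrm{snr})), i.e., the derivative of the mutual information with respect to the signal-to-noise ratio equals half the minimum mean-square error.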
Low-complexity Receiver for Multi-Level Polar Coded Modulation in Non-Orthogonal Multiple Access
Beatrice Tomasi , Frédéric Gabry , Valerio Bioglio , Ingmar Land , Jean-Claude Belfiore Subjects : Information Theory (cs.IT)
Non-orthogonal multiple access (NOMA) schemes have been proved to increase
the multiple-access achievable rate with respect to orthogonal multiple access
(OMA). In this paper we propose a novel communication system that combines
multi-level coded modulation and polar codes in a NOMA scenario. The proposed
scheme reduces computational complexity with respect to state-of-the-art
solutions. We also highlight the trade-off between error rate performance and
computational complexity.
Comments: This is the author’s version of an article that has been accepted for publication in IEEE Communications Letters
Subjects:
Information Theory (cs.IT)
; Networking and Internet Architecture (cs.NI)
Proportional fair scheduling (PFS) has been adopted as a standard solution
for fair resource allocation in modern wireless cellular networks. With the
emergence of heterogeneous networks with widely varying user loads, it is of
great importance to characterize the performance of PFS under bursty traffic,
which is the case in most wireless streaming and data transfer services. In
this letter, we provide the first analytical solution to the performance of PFS
under bursty on-off traffic load. We use the Gaussian approximation model to
derive a closed-form expression of the achievable user data rates. In order to
further improve the accuracy of our baseline analytical solution for multi-cell
networks, we design a hybrid approximation by employing multi-interference
analysis. The simulation results verify that our model guarantees extremely low
data rate estimation error, which is further insensitive to changes in session
duration, traffic load and user density.
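For reference, the PFS rule under analysis serves, in each slot, the user maximizing the ratio of its instantaneous rate to its exponentially averaged throughput; a minimal sketch (parameter values illustrative, and with bursty on-off traffic only backlogged users would be eligible):

import numpy as np

def pfs_slot(inst_rates, avg_thpt, beta=0.01):
    """One PFS scheduling slot: pick argmax_i r_i / R_i, then update the
    exponentially weighted average throughputs R_i (assumed > 0)."""
    i = int(np.argmax(inst_rates / avg_thpt))
    served = np.zeros_like(inst_rates)
    served[i] = inst_rates[i]
    avg_thpt = (1 - beta) * avg_thpt + beta * served
    return i, avg_thpt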
Dor Itzhak , Yossef Steinberg Subjects : Information Theory (cs.IT)
As demonstrated in many recent studies, cooperation between users can greatly
improve the performance of communication systems. Most of the works in the
literature present models where all the users are aware of the resources
available for cooperation. However, the scenario where cooperation links are
sometimes unavailable, or where some users cannot be informed of whether the
cooperation links are present, is more realistic in today’s dynamic
ad-hoc communication systems. In such a case we need coding schemes that
exploit the cooperation links if they are present, and can still operate if
cooperation is not possible. In this work we study the general broadcast
channel model with degraded message sets and cooperation links that may be
absent, and derive its capacity region under such uncertainty conditions.
Physical Layer Security Jamming: Theoretical Limits and Practical Designs in Wireless Networks
Kanapathippillai Cumanan , Hong Xing , Peng Xu , Gan Zheng , Xuchu Dai , Arumugam Nallanathan , Zhiguo Ding , George K. Karagiannidis Subjects : Information Theory (cs.IT)
Physical layer security has been recently recognized as a promising new
design paradigm to provide security in wireless networks. In addition to the
existing conventional cryptographic methods, physical layer security exploits
the dynamics of fading channels to enhance secured wireless links. In this
approach, jamming plays a key role by generating noise signals to confuse the
potential eavesdroppers, and significantly improves quality and reliability of
secure communications between legitimate terminals. This article presents
theoretical limits and practical designs of jamming approaches for physical
layer security. In particular, the theoretical limits explore the achievable
secrecy rates of user-cooperation-based jamming, whilst centralized and
game-theoretic precoding techniques are reviewed for practical
implementations. In addition, the emerging wireless energy harvesting
techniques are exploited to harvest the required energy to transmit jamming
signals. Future directions of these approaches, and the associated research
challenges are also briefly outlined.
Comments: 30 pages, 7 figures, submitted to ICC, TWC
Subjects:
Information Theory (cs.IT)
Base station cooperation in heterogeneous wireless networks (HetNets) is a
promising approach to improve the network performance, but it also imposes a
significant challenge on backhaul. On the other hand, caching at small base
stations (SBSs) is considered as an efficient way to reduce backhaul load in
HetNets. In this paper, we jointly consider SBS caching and cooperation in a
downlink large-scale HetNet. We propose two SBS cooperative transmission schemes
under random caching at SBSs with the caching distribution as a design
parameter. Using tools from stochastic geometry and adopting appropriate
integral transformations, we first derive a tractable expression for the
successful transmission probability under each scheme. Then, under each scheme,
we consider the successful transmission probability maximization by optimizing
the caching distribution, which is a challenging optimization problem with a
non-convex objective function. By exploring optimality properties and using
optimization techniques, under each scheme, we obtain a local optimal solution
in the general case and global optimal solutions in some special cases.
Compared with some existing caching designs in the literature, e.g., the most
popular caching, the i.i.d. caching and the uniform caching, the optimal random
caching under each scheme achieves better successful transmission probability
performance. The analysis and optimization results provide valuable design
insights for practical HetNets.
Ahmed M. Bedewy , Yin Sun , Ness B. Shroff Subjects : Information Theory (cs.IT) ; Networking and Internet Architecture (cs.NI)
The problem of reducing the age-of-information has been extensively studied
in the single-hop setting. In this paper, we minimize the age-of-information in
general multihop networks. We first prove that for exponential packet
transmission times, a preemptive Last Generated First Served (LGFS) policy
results in smaller age processes at all nodes of the network (in a stochastic
ordering sense) than any other causal policy. We then show that for the class
of non-preemptive work-conserving policies, for any general packet transmission
time distributions, the non-preemptive LGFS policy minimizes the age processes
at all nodes of the network (again in a stochastic ordering sense). These
optimality results not only hold for the age process, but also for any
non-decreasing functional of the age process.
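The LGFS principle itself is simple to state in code: among queued packets, always serve (preempting, in the preemptive variant) the one with the latest generation time; a toy sketch:

import heapq
import itertools

class LGFSQueue:
    """Last-Generated-First-Served: a max-heap keyed on generation time;
    under preemption, a fresher arrival displaces the packet in service."""
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()    # tie-breaker for equal times

    def arrive(self, gen_time, packet):
        heapq.heappush(self._heap, (-gen_time, next(self._seq), packet))

    def peek(self):
        # The packet to (resume) serving: always the freshest one.
        return self._heap[0][2] if self._heap else None

    def pop(self):
        return heapq.heappop(self._heap)[2] if self._heap else None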
Antenna Deployment Method for MIMO Radar under the Situation of Multiple Interference Regions
Comments: 12 pages
Subjects:
Information Theory (cs.IT)
In this paper, considering multiple interference regions simultaneously, an
optimal antenna deployment problem for distributed Multi-Input Multi-Output
(MIMO) radar is investigated. The optimal antenna deployment problem is solved
by proposing an antenna deployment method based on Multi-Objective Particle
Swarm Optimization (MOPSO). Firstly, we construct a multi-objective
optimization problem for MIMO radar antenna deployment by choosing the
interference power densities of different regions as objective functions. Then,
to obtain the optimal deployment result without wasting time and computational
resources, an iteration convergence criterion based on interval distance is
proposed. The iteration convergence criterion can be used to stop the MOPSO
optimization process efficiently when the optimal antenna deployment algorithm
reaches the desired convergence level. Finally, numerical results are provided
to verify the validity of the proposed algorithm.
Lingxiao Huang , Yifei Jin , Jian Li , Haitao Wang Subjects : Information Theory (cs.IT) ; Data Structures and Algorithms (cs.DS)
It is known that certain structures of the signal in addition to the standard
notion of sparsity (called structured sparsity) can improve the sample
complexity in several compressive sensing applications. Recently, Hegde et al.
proposed a framework, called approximation-tolerant model-based compressive
sensing, for recovering signals with structured sparsity. Their framework
requires two oracles, the head- and the tail-approximation projection oracles.
The two oracles should return approximate solutions in the model which is
closest to the query signal. In this paper, we consider two structured sparsity
models and obtain improved projection algorithms. The first one is the tree
sparsity model, which captures the support structure in the wavelet
decomposition of piecewise-smooth signals. We propose a linear time
((1-\epsilon))-approximation algorithm for head-approximation projection and a
linear time ((1+\epsilon))-approximation algorithm for tail-approximation
projection. The best previous result is an (\tilde{O}(n \log n)) time
bicriterion approximation algorithm (meaning that their algorithm may return a
solution of sparsity larger than (k)) by Hegde et al. Our result provides an
affirmative answer to the open problem mentioned in the survey of Hegde and
Indyk. As a corollary, we can recover a constant approximate (k)-sparse signal.
The other is the Constrained Earth Mover Distance (CEMD) model, which is useful
to model the situation where the positions of the nonzero coefficients of a
signal do not change significantly as a function of spatial (or temporal)
locations. We obtain the first single criterion constant factor approximation
algorithm for the head-approximation projection. The previous best known
algorithm is a bicriterion approximation. Using this result, we can get a
faster constant approximation algorithm with fewer measurements for the
recovery problem in CEMD model.
Rigorous Dynamics of Expectation-Propagation-Based Signal Recovery from Unitarily Invariant Measurements
Comments: Submitted to ISIT2017. A short version of arXiv:1701.05284
Subjects:
Information Theory (cs.IT)
This paper investigates sparse signal recovery based on expectation
propagation (EP) from unitarily invariant measurements. A rigorous analysis is
presented for the state evolution (SE) of an EP-based message-passing algorithm
in the large system limit, where both input and output dimensions tend to
infinity at an identical speed. The main result is the justification of an SE
formula conjectured by Ma and Ping in 2016.
Comments: 30 pages, 7 figures, journal
Subjects:
Networking and Internet Architecture (cs.NI)
; Information Theory (cs.IT)
In this paper, we study an X-duplex relay system with one source, one
amplify-and-forward (AF) relay and one destination, where the relay is equipped
with a shared antenna and two radio frequency (RF) chains used for transmission
or reception. X-duplex relay can adaptively configure the connection between
its RF chains and antenna to operate in either HD or FD mode, according to the
instantaneous channel conditions. We first derive the distribution of the
signal to interference plus noise ratio (SINR), based on which we then analyze
the outage probability, average symbol error rate (SER), and average sum rate.
We also investigate the X-duplex relay with power allocation and derive the
lower bound and upper bound of the corresponding outage probability. Both
analytical and simulated results show that the X-duplex relay achieves a better
performance over pure FD and HD schemes in terms of SER, outage probability and
average sum rate, and the performance floor caused by the residual self
interference can be eliminated using flexible RF chain configurations.
Comments: 11 pages
Subjects:
Combinatorics (math.CO)
; Information Theory (cs.IT)
This paper presents a combinatorial construction of low-density parity-check
(LDPC) codes from difference covering arrays. While the original construction
by Gallager was by randomly allocating bits in a sparse parity-check matrix,
over the past 20 years researchers have used a variety of more structured
approaches to construct these codes, with the more recent constructions of
well-structured LDPC coming from balanced incomplete block designs (BIBDs) and
from Latin squares over finite fields. However these constructions have
suffered from the limited orders for which these designs exist. Here we present
a construction of LDPC codes of length (4n^2 - 2n) for all (n) using the cyclic
group of order (2n). These codes achieve a high information rate (greater than
0.8) for (n \geq 8), have girth at least 6, and have minimum distance 6 for (n)
odd.
Comments: 23 pages, 2 figures
Subjects:
Quantum Physics (quant-ph)
; Information Theory (cs.IT)
For any given channel (W) with classical inputs and possibly quantum outputs,
a dual classical-input channel (W^\perp) can be defined by embedding the
original into a channel (\mathcal{N}) with quantum inputs and outputs. Here we
give new uncertainty relations for a general class of entropies that lead to
very close relationships between the original channel and its dual. Moreover,
we show that channel duality can be combined with duality of linear codes,
whereupon the uncertainty relations imply that the performance of a given code
over a given channel is entirely characterized by the performance of the dual
code on the dual channel. This has several applications. In the context of
polar codes, it implies that the rates of polarization to ideal and useless
channels must be identical. Duality also relates the tasks of channel coding
and privacy amplification, implying that the finite blocklength performance of
extractors and codes is precisely linked, and that optimal rate extractors can
be transformed into capacity-achieving codes, and vice versa. Finally, duality
also extends to the EXIT function of any channel and code. Here it implies that
for any channel family, if the EXIT function for a fixed code has a sharp
transition, then it must be such that the rate of the code equals the capacity
at the transition. This may give a different route to proving a code family
achieves capacity by establishing sharp EXIT function transitions.